
What Is Scientific Truth—And Why Does It Keep Changing?

Lorraine Daston



Follow the science. But which science, whose science, today’s science or tomorrow’s? The SARS-CoV-2 pandemic turned virologists and epidemiologists into unwilling oracles, pressed by politicians, press, and public alike to provide stable guidance in unstable times. How did the virus spread, did masks work, were children at risk, was it safe to hug, did taking ibuprofen make symptoms better or worse, how many people would die, when would it all end? Disconcertingly, the scientists’ answers to all these questions seemed to change weekly. The oracle at Delphi never answered the same question twice and therefore never had to change its mind. The scientists, in contrast, were questioned – and questioned again – almost daily, and their answers changed almost as quickly. New observations by clinicians, new experiments in laboratories, new results of clinical trials corrected, contradicted, or simply confused the old answers. Following the science left everyone breathless, including the scientists themselves.


Because it thrust science into the limelight, the pandemic forced the lay public (and perhaps many scientists) to confront an apparent contradiction at the heart of the modern empirical sciences. Scientific knowledge is the most reliable knowledge we have, but it is not the eternal truth of the philosophers or the theologians. At precisely the moment when a branch of science is advancing by leaps and bounds, it is also leaving behind what we thought we once knew in a cloud of dust. The price of scientific progress is impermanence. Whatever scientific truth is, it is dynamic – more like the flowing river of Heraclitus than the eternal forms of Plato. Only if the latter sets the standard for truth do the impermanent truths of science appear contradictory.


Two Narratives of Scientific Progress

Progress is not a self-evident plot-line for history. Many historical narratives chart decline from some happier state, whether from a golden age or an Edenic paradise or simply the era of one’s childhood, nostalgically recollected in old age. By no means all of these wistful glances backwards are religiously inspired or politically conservative; there are secular versions that bemoan the downward slide of everything from education in the schools to civility in political debate. Given the never-ending struggle against the forces of entropy, such anti-progressive narratives chime with everyday intuitions: without constant shoring up, buildings crumble, institutions wane, norms lose their grip. Onward-and-upward narratives are therefore the exception rather than the rule, and for that reason are all the more noteworthy when they do occur.


Although there are isolated glimmers of such progressive narratives in the ancient world – for example, among Greek mathematicians – the idea that certain human activities might be steadily improving is quintessentially modern. In Europe, it appears first and most assertively in the sixteenth century in connection not with science or society but with the arts, understood broadly to include both the fine arts of painting and sculpture and what were then called the mechanical arts. These latter included everything from farming to military fortifications to carpentry to navigation to cooking – not only what we would call technology, but also the arts and crafts. The Italian artist and art historian Giorgio Vasari (1511-1574) argued that the history of the Italian fine arts since Giotto (1267-1337) had been one of stunning progress, and an impressive series of engravings printed in Antwerp circa 1600, the Nova reperta, documented all manner of “new inventions” in the mechanical arts, from oil paints to the printing press. But when the English statesman and natural philosopher Francis Bacon (1561-1626) took stock of the state of the sciences in 1620, he contrasted their centuries-long stagnation with the flourishing mechanical arts, which had advanced so spectacularly even in his own lifetime. Circa 1600, technology, even art, progressed; science did not.


Fast forward to 1750, and science had become the prototype of a progressive activity. Greatly impressed by the achievements of seventeenth-century science and mathematics, first and foremost Isaac Newton’s (1643-1727) magisterial synthesis of celestial and terrestrial mechanics, Enlightenment philosophers prophesied a new era of scientific progress in other domains. They looked forward to hailing the Newton of natural history, the Newton of chemistry, even the Newton of the moral sciences – the latter a title the Scottish philosopher David Hume (1711-1776) coveted for himself. The Enlightenment vision of scientific progress was expansionist but not revolutionary. New domains would be added to the Newtonian heartland, as each found its own Newton, but once secured, these territories would be forever secure, never again to be conquered by new would-be Newtons. According to this narrative of the history of science, immortalized in the Preliminary Discourse (1751) to the great Enlightenment Encyclopedia and later versions that endure to this day, all was going swimmingly for science in ancient Greece, but then waves of conquest – by Romans, Goths, Vandals, Arabs, and, worst of all, Christians – plunged Europe into the Dark Ages, from which it only emerged into the light, like the prisoners in Plato’s cave, in the seventeenth century. Sometime around 1650, so the story went, the best and the brightest realized that everything everyone had believed for centuries was most likely wrong and picked up where the Greeks had left off – hence the “middle” in the Middle Ages, viewed as a long, benighted interval between epochs of light. Yes, no doubt mistakes, terrible mistakes, had been made, but now science (and with it, civilization) was once again on the right track toward truth, to be deflected only by some disaster like the biblical Flood. Hume’s friend Adam Smith (1723-1790), himself in the running to become the Newton of political economy, reached for imperial metaphors to describe how new lands would be annexed to the invulnerable Newtonian citadel, permanent victories for scientific progress.


This was the vision of scientific progress that the British philosopher John Stuart Mill (1806-1873) described as “growth without change” and held up as an ideal for political progressives in the tumultuous 1830s, when memories of the blood-stained French Revolution still haunted governments throughout Europe and beyond. But although he was an arch-empiricist, Mill failed to register the price of scientific empiricism. The Enlightenment champions of scientific progress all agreed that systematic observation and experiment had propelled the great achievements of the era, from Newton’s experiments on the composition of white light to French chemist Antoine Lavoisier’s (1743-1794) experiments on the composition of air. True, empirical inquiry did throw up surprises from time to time, but these seemed for the most part to be tame surprises, eventually domesticated within reigning theories, as when apparently anomalous perturbations in the lunar orbit were squared with Newtonian predictions. True, empirical generalizations could never achieve the certainty of those based on demonstration, the ideal of medieval natural philosophy. But surely, Enlightenment thinkers countered, empirical generalizations constantly confirmed by the universal experience of humanity became so probable as to be almost certain? Even the famously skeptical Hume made this slide from empirical probability to moral certainty the basis of his argument for the impossibility of miracles. For a very long time, roughly 150 years, it looked as if science would not have to pay the reckoning for trading demonstrative certainty for empirical probability. Scientific progress could continue its imperial advance, always expanding, never retreating.


But eventually the bill came due. By circa 1900, the surprises had turned savage. The American historian Henry Adams (1838-1918), brought up in Mill’s faith that science was growing but not changing, was shocked when science seemed to descend into chaos in the wake of perplexing new discoveries like radioactivity, which Adams described as a “metaphysical bomb” hurled onto scientists’ desks by the Polish chemist Marie Curie (1867-1934). As a historian of American politics and the descendant of two American presidents, Adams used words like “bomb” and “anarchy” advisedly. He experienced the disarray of scientists unable to impose order and unity on new discoveries as a violent acceleration in history. Adams was admittedly not a scientist, but his outsider’s view was informed by a close reading of the works of prominent insiders like the British statistician Karl Pearson (1857-1936), the French mathematician Henri Poincaré (1854-1912), and the Austrian physicist Ernst Mach (1838-1916). These witnesses confronted a paradox: by every measure, science in the late nineteenth century was progressing with the force and speed of a powerful locomotive; yet with progress came change, rapid, disorienting change, and scientists no longer knew the locomotive’s destination. New discoveries and new theories to explain them multiplied, only to be ambushed by the next month’s novelties. Poincaré fully expected that his treatise on electrodynamics would become outdated between the day he delivered the proofs to the publisher and the day the book appeared in bookstores. He was caught up in the vertigo of scientific progress, in which no theory was safe from upheavals, not even the citadel of Newtonian mechanics.


This was the second, unsettling narrative of scientific progress – vertiginous progress, science in ceaseless flux. Science definitely progressed – more phenomena could be explained, predicted, and manipulated – but it also changed, and at a head-spinning rate. Instead of imperial expansion, the political metaphor became one of sudden and violent revolutions. No scientific truth could claim to be forever. The lesson Mach drew from the history of science was that “attempts to hold fast to the beautiful moment through textbooks have always been futile. One gradually accustoms oneself [to the fact] that science is incomplete, mutable.” Instead of a sure, steady advance toward an ever more complete, an ever more coherent truth, in which new discoveries obediently took their place alongside old, all marching to the drum of established theory, scientists accustomed themselves to earthquakes that transformed the landscape of theory and practice at irregular intervals. It became ever harder to reconcile the latter, vertiginous narrative of scientific progress with a vision of science as a repository of permanent truths. Like the rest of modern society, science was in the grip of relentless innovation – in technology, in politics, in economics, in culture – but innovation without a clear goal or even some aspiration to durability: “all that is solid melts into air,” as the Communist Manifesto (1848) diagnosed the predicament of modernity.


These two narratives of scientific progress, the one expansionist and enduring, the other vertiginous and ephemeral, became entangled in the minds of many scientists and therefore also in the minds of many members of the general public, including science journalists. Philosophers and occasionally some scientists did fret about what exactly science was progressing towards, if not eternal truths, and what survived the periodic convulsions of vertiginous scientific progress: Well-confirmed facts? Structural relationships among the facts, if not the things and forces posited by theory? Predictive accuracy, whatever its theoretical underpinnings? And scientists developed sophisticated ways of assessing the uncertainty of their empirical findings, from the method of least squares to confidence intervals. But for the most part, scientists and their lay publics resigned themselves to a confused but comfortable haziness about which narrative of progress they believed, and all concerned closed their eyes to the divergent implications for what scientific (as opposed to theological or philosophical) truth might mean. Scientific truths might be the most reliable kind we have, but how could those mutable truths be reconciled with ancient ideals of truth as immutable? Both sides of the philosophical debate over scientific realism and social constructionism still assumed this ancient ideal, the one side defending science’s claim to it and the other denying it. Science journalists compounded the confusion by erasing the error bars by which scientists communicated the uncertainty of their findings to each other when those same results were broadcast to the public. Is it any wonder that under the glare of non-stop reportage about the pandemic this muddle produced confusion at best, and downright skepticism at worst? How could today’s scientific truths be tomorrow’s errors?


What is to be done? Disentangling the two narratives of scientific progress and restoring assessments of uncertainty in reporting on science would be a start, but these measures alone would not resolve the core problem: how can scientific truth be reconciled with scientific progress? Is there any way out of this repeat match, replayed in every philosophical generation, between Plato and Heraclitus?


Science Taken at Tempo: Allegro, Andante, Largo

“Science” is one of those suitcase words that begs to be unpacked. First of all, it contains a multitude of different disciplines, each with its own subject matter, methods of inquiry, standards of proof, and criteria of success. These differences are not trivial. No one would dispute that astrophysics and evolutionary biology both deserve to be called sciences, and both are primarily observational rather than experimental sciences (although both also now have recourse to computer modelling in lieu of direct experimentation). Moreover, both are historical sciences that draw upon evidence from the deep past, from fossils to starlight originating in galaxies millions of light-years away. But there the analogies end. Evolutionary biologists do not aspire to predict the future of a particular species, although they can retrodict its history in remarkable detail. In contrast, astrophysicists predict the motions of planets in our solar system and even the behavior of black holes in remote galaxies with remarkable precision. Astrophysicists can assume a certain uniformity in the composition and life cycle of stars, but the dazzling variety of organic life on earth makes sweeping generalizations from one taxon to another a risky business.


The same goes for sweeping generalizations about science, including generalizations about what can be expected from scientific explanations and predictions. Even within the same science, there can be significant differences. Physicists who study elementary particles can predict their behavior with great precision, but their colleagues who deal with turbulence – for example, in the world climate system – face challenges of complexity that boggle even the most elaborate model and the mightiest supercomputer. These differences can lead to misunderstandings among scientists – for example, between disciplines that have regular recourse to statistical methods to sort out possible causes of observed effects (for example, demography) and those that use controlled laboratory experiments to the same ends (for example, chemistry). Such crucial differences in methods and standards must be kept in mind when public pronouncements about science-in-general are airily declaimed, whether pro or contra. In most contexts, science-in-general is an imaginary beast, like the griffin or the unicorn.


We’re not done unpacking the science suitcase. There are also important distinctions to be made about scientific progress, whether we imagine it as a steadily expanding empire or as a vertiginous ride on a locomotive bound for who-knows-where. What exactly is it that is changing, and how is it changing? Bearing in mind the anti-generalization that there is no science-in-general and that specific sciences may well deviate from any given model of scientific change, we might nonetheless approximate the ways the sciences have evolved over time by analogy with three musical tempi: allegro, andante, largo.


Science ticks according to three clocks. The fastest of these, running at allegro tempo, times the pace of empirical discoveries. From the first scientific journals of the mid-seventeenth century to the latest issues of Science and Nature, these novelties from the laboratory, the observatory, and the field succeed one another at breakneck speed. The second clock, progressing at a stately andante, tracks the emergence of significant new theoretical frameworks. As more and more scientists work on more and more subjects, this second clock is speeding up, but it cannot rival the breathless tempo of the first. Its innovations are measured in decades and even centuries, not weeks and months. The third clock is the slowest of all, inching forward at a glacial largo: it times the slow accumulation of ways of knowing so fundamental to science that they seem self-evident: practices like experimenting, observing, finding correlations, mining data. This is the basso continuo of science, which unfolds over centuries and millennia. It is on this scale that the ideals and practices of scientific rationality emerge: what it means to know and how to go about knowing.


At any given moment, a given science may be gripped by novelty at any one of these three levels of change. During the SARS-CoV-2 pandemic, for example, new empirical results in virology and immunology accelerated from allegro to prestissimo, to the point where even online preprint servers buckled under the volume of submissions. At the andante level, theoretical deliberations – about how to sift through all of these results, produced in haste and not all equally reliable, which inferences to draw from them, and how to make them cohere with each other and with what was previously known about coronaviruses – are still ongoing and likely to take years, if not decades. And at the slow, largo level, there is the immense challenge of squaring three ways of knowing in the biomedical sciences: one ancient (clinical observation, but this time conducted on a global scale), one about a century old (randomized clinical trials), and one brand new (data-mining in search of suggestive correlations). Attempts to integrate clinical observation and randomized clinical trials have been going on for decades and are still a work-in-progress; work has hardly begun on how to integrate data-mining with the other two.


It is precisely in situations like these that the two narratives of scientific progress collide. The locomotive model fits the breathless allegro of the latest empirical results, each hot-off-the-press, some apparently contradictory, and none digested into a theoretical scheme that can weed out likely artifacts or irrelevances and make sense of what remains. “Hot-off-the-press” is used advisedly: because the allegro tempo of empirical novelty matches the media’s own breakneck tempo and the public’s urgent desire to know anything and everything about a new disease that has brought life all over the globe to a standstill, this is the level of scientific change that snags attention. Scientists are not entirely innocent partners in this pas-de-deux with journalists: in countries in which most research is funded by the public purse, there are both good motives and bad to want to bask in the media spotlight. The journalists, for their part, hype their headlines by deleting the error bars and confidence intervals that signal uncertainty in scientific publications or even by trumpeting claims before the supporting evidence has been submitted to peer review. If attention remains fixated at the allegro level, the pell-mell pace both of the latest empirical results (each only a tiny piece of an immense puzzle, and perhaps not even a piece of the same puzzle) and of the short-lived practical measures based on them can be dizzying. Disoriented and desperate, many citizens begin to lose confidence in scientific pronouncements with a shelf-life shorter than that of unrefrigerated milk in summer.


But at the andante level of scientific change, the pieces of the puzzles are being mulled over, matched, and sometimes discarded. This is slow, painstaking work and is unlikely to attract a reporter to the lab. It is also a stumble-blunder process fraught with failure and controversy: one scientist’s promising pattern may be another’s fata morgana. This is a narrative that unfolds over many years, with innumerable dead ends and blind alleys, and which rarely concludes triumphantly with a Nobel Prize ceremony – in short, a narrative that only a historian of science could love. Yet when the puzzle-solving succeeds – and there is no guarantee that it will – the results are not only more durable than those splashed across the weekly covers of Science and Nature; they also act as a sieve for the pieces that turn out to belong to another puzzle – often one not even recognized as a puzzle until decades later. If attention were trained at this level, the overall impression would be one of greater durability, though not of eternal truths. Sooner or later, the bill for empiricism will once again come due.


Does the third, largo level of scientific change rescue those eternal truths from the uncertainty inherent in all empirical inquiry? Its results are certainly more cumulative than those at the allegro and andante levels: once acquired, a way of knowing is rarely abandoned, though it may be marginalized by a method of investigation deemed more reliable or efficient or universally applicable, as clinical observation has been increasingly marginalized by randomized clinical trials in medicine, or large-scale statistical surveys have edged out more time-consuming ethnographic fieldwork in some social sciences. Marginalized does not mean replaced. Without clinical observation to spot new syndromes, randomized trials would have nothing to test (as in the case of AIDS, in which doctors first noticed a strange new constellation of symptoms in some of their patients). Without ethnographic fieldwork, statistical surveys could not generate causal hypotheses to explain macroscopic patterns (as in the case of declining rates of teenage pregnancies in several countries). But a way of knowing, however long-lived, is not an eternal truth: it is about how to conduct inquiry, not inquiry’s end result. Nor is it a guarantee of the truth of the end result, only that at least some sources of possible errors have been eliminated.


There is a deeper reason why even the largo level of scientific change cannot deliver eternal truths, despite its impressive accumulation of durable ways of knowing. The accumulation is both its greatest strength and its greatest weakness. What the largo level of scientific change accumulates are systematic forms of empiricism, ingenious ways of finding out about different parts of the world, from volcanoes to slime molds, from ancient civilizations to migratory birds. There is by now a long list of such forms: observation; experiment; statistical surveys; archival research; chemical assays; collections; mathematical models; fieldwork; randomized clinical trials; computer simulations. Almost none of these forms of empirical research is the property of a single discipline, and many of them cut straight across the divisions among the natural and human sciences. For example, archival research is as essential to demographers as it is to historians; art historians rely on collections as much as entomologists do; geologists and ecologists conduct fieldwork along with anthropologists and archaeologists. Mathematical models and computer simulations are at home in too many disciplines to enumerate. In short, it is extremely rare for a discipline to have a monopoly on a form of empirical research – and even fewer disciplines depend on only one form of empirical research. This durable and diverse accumulation of systematic forms of empirical research is the great strength of the largo level of science.


But it is also its greatest weakness. If we imagined a timeline of the emergence of all these different forms of empirical research, we would see that they have very different histories. To take only a few examples: systematic astrometeorological observation emerged in ancient Mesopotamia and China by 3000 BCE; some of these ancient observations are still listed in NASA’s Five Millennium Canon of Solar Eclipses. Systematic experiment emerged much later, around 1600 CE, and systematic randomized controlled trials only in the 1920s, with the statistician Ronald Fisher’s agricultural experiments at the Rothamsted Experimental Station. (You may wonder why I keep repeating the word “systematic”: I’ll get to that in a moment.) And computer simulations, first used in connection with nuclear detonation models in the Manhattan Project in the 1940s, emerged within living memory.


Given that these different forms of empirical research emerged at different times, in different contexts, and for different purposes, it’s not surprising that we often have difficulty integrating or even comparing them. This is perhaps the chief reason why disciplines talk past one another, even if (especially if) they are investigating the same subject matter: how do you braid together, for example, evidence from archaeology, archival sources, historical philology, paleogenetics, and mathematical models to trace the longue durée development of languages? Even within a single discipline, the challenges can be daunting, as in the case of medicine. On the one hand, we know from surviving documents in ancient Egyptian, Greek, Sanskrit, and Chinese that clinical observation has been an essential form of empirical research for millennia. Randomized clinical trials, on the other hand, are barely a century old but by now indispensable in scientific medicine. But how to weave together these two invaluable sources of evidence remains a challenge – all the more so because the challenge often goes unrecognized. Too often different forms of empiricism are pitted against rather than linked with each other. Although the contradictions churned up in this fashion may attract only specialist attention, they may do more damage to inquiry than a scientific scandal ballyhooed on Twitter.


Finally, why do I keep repeating that word “systematic”? Almost all of the forms of empirical research I have mentioned developed from what might be called “vernacular” empiricism: the everyday practices of paying close attention to the particulars of the natural and human worlds that are found in every culture past and present. For example, the controlled experiment of the seventeenth century developed out of the medieval experimentum in the artisan’s workshop: the word originally meant “trial” or “proof” – for example, the trial of a new way to temper steel or dye wool or distill essential oils. The experimentum was successful if it worked, and in that respect resembles the systematic scientific experiment. But rarely was an experimentum designed to do what we want a scientific experiment to do: namely, to reveal the causes of effects. And although artisanal practices were certainly refined through experience, there was no interest in what we call experimental design – first and foremost, the systematic analysis of every conceivable source of error, from noise to bias to unrepresentative sampling to an instrument on the fritz to the experimenter’s bad cold.


The systematization of vernacular empiricism can be long, very long, in coming. For example, people everywhere and always have been observing clouds and basing weather predictions on their shape and appearance, as proverbs in many languages testify: “Mackerel sky, mackerel sky, never long wet, never long dry.” Yet only in the early nineteenth century was there any systematic attempt to classify cloud types, and only in 1896 did the first internationally coordinated and calibrated classification of clouds appear – when the mackerel sky (aka Schäfchenwolke [sheep cloud] in German, ciel pommelé [dappled sky] in French) officially became the cirrocumulus for cloud-watchers everywhere.


Diverse as the forms of empirical research are, there is none that does not conduct some strenuous form of error analysis, from the source criticism of the historians to the statistical tests of the psychologists and social scientists to the controls of the laboratory scientists. It is this relentless attempt to control for errors of all kinds, combined with an exquisite sense of just how much weight a particular form of empirical evidence will bear, that distinguishes vernacular from scientific empiricism. But the elimination of errors is not the same as the attainment of truths, much less the eternal truths that have long been assumed to be the telos of scientific progress.


Conclusion: Rethinking Progress and Truth

There is a good-news and a bad-news conclusion to this story. The good news is that there is a plausible version of progress at the allegro, andante, and largo levels of scientific change. We know ever more about many more things; we understand more about their causes and effects (and sometimes how to manipulate both to our advantage); and we are even inventing new ways of knowing. Depending on what level one focuses on, the narrative of progress looks more like the vertiginous version (allegro) or the expansionist one (largo), with andante somewhere in between, just as in music. And just as in music, scrambling the three levels creates cacophony, or worse.


The bad news is that none of the three levels produces certain, immutable truths. Knowledge that is reliable, in the sense that one can bank on it, and illuminating, in the sense that it deepens our understanding, is not necessarily the same thing as Platonic truth. The history of science and technology abounds with examples of knowledge sturdy enough to support workable technologies and insightful enough to connect apparently disparate phenomena – but knowledge eventually displaced, all the same. Just how durable scientific knowledge proves to be is highly variable, determined both by its level (allegro, andante, largo) and by historical contingency (for example, cultures willing to encourage and support research are a relative rarity, historically speaking). Progress may bring improvements, but it may also bring trade-offs. For example, machine learning programs applied to Big Data may yield more accurate predictions of some phenomena, but at the price of obscuring their underlying causes. What kind of improvement is valued most – in predictive accuracy, in explanatory depth and breadth, in practical applicability – will define the direction of scientific progress. But whatever the direction and whatever the successes, progress in and of itself cannot secure the immutable truths that have so long been the standard against which all knowledge has been judged, including scientific knowledge.


Does science really need such truths? The ideal of certain, eternal truths originated in philosophy (partly inspired by mathematics) and became entrenched in some versions of theology. This ideal is incompatible with systematic empirical inquiry, with both its intrinsic uncertainty and its progressive character. Yet the empirical, progressive knowledge produced by science is by all accounts the very best knowledge we have. If philosophical and theological ideals of truth can’t do it justice, so much the worse for those ideals. The conclusion to draw from the restless impermanence of scientific knowledge is not that progressive knowledge can’t be true knowledge but that we need a better way of thinking about truth.


 

Lorraine Daston is Director at the Max Planck Institute for the History of Science, Berlin, and a regular visiting professor in the Committee on Social Thought at the University of Chicago. Her work focuses on the history of rationality, especially but not exclusively scientific rationality. She has written on the history of wonder, objectivity, observation, the moral authority of nature, probability, Cold War rationality, and scientific modernity. Her current book projects are a history of the origins of the scientific community and a reflection on what science has to do with modernity. Her most recent book is Rules: A Short History of What We Live By (Princeton University Press, 2022).
