Scott McLennan is a Distinguished John S. Toll Professor in the Department of Geosciences, Stony Brook University (USA). He obtained his BSc (Honors) (1975) and MSc (1977) from the University of Western Ontario (Canada) and his PhD (1981) from the Research School of Earth Sciences, Australian National University. For the first half of his career, his research focused on the trace element and radiogenic isotope composition of sedimentary rocks to address questions related to crustal evolution, plate tectonic associations, and ancient climates. Since about the turn of the millennium, he has directed his attention to the geochemistry of Mars, using data returned from a variety of landed and orbital missions and laboratory experiments.

Harrison and Lenardic (2022; hereinafter HL22) argued in a Triple Point article entitled “Burke’s Law: Toward a Reasoned Discussion of Deep Time” (published in Elements vol. 18, no. 5) that special issues exist in the research of deep time and when trying to understand early Earth, for which the geological record is increasingly sparse as one descends into that abyss. They questioned the approaches, especially regarding the role of plate tectonics and nature of the earliest crust, taken by several workers including, prominently, Taylor and McLennan (2009). I have not worked in these areas for some time and, regrettably, my co-author is no longer with us (McLennan and Rudnick 2021), but some response seems appropriate. Relevant to my recent efforts in planetary science, HL22’s essay got me thinking more generally about evaluating natural phenomena on Earth or other planetary bodies (including exoplanets) for which it is difficult or even impossible to make key measurements relevant to the important questions, that under more favorable circumstances might be routine, or for which resources needed to make measurements are inadequate or unavailable. In other words, cases where it is not possible to undertake decisive experiments to test hypotheses (a hallmark of vigorous scientific advance; Platt 1964), regardless of whether or not they can be devised in principle. Accordingly, this contribution is not merely concerned with defending against pointed criticism, although there is some of that, but rather with pursuing HL22’s recommendation to “continue this discussion.”

HL22 balked at words like “myth” to characterize suggestions of extensive Hadean continental crust (although tagging ideas with a cacophonous acronym SANMLM, embedding a charge of being a “self-affirming … misconception,” feels equally egregious to me!). The irony seemed rich upon reading this, as one inspiration for this label comes from the most unrepentant proponent of early continental crust, the late Dick Armstrong, in his paper “The Persistent Myth of Crustal Growth” (Armstrong 1991). This word is rather common in scientific discourse: a Web of Science search for “myth” under science topics revealed >10,000 entries. But one can turn the question of rhetoric around with HL22’s promulgation of what they “informally” call “Burke’s Law.” Kevin Burke was indomitable in promoting plate tectonics, but “Burke’s Law” (i.e., assuming plate tectonics has operated since initial differentiation until there is evidence otherwise) is not really a law at all, as HL22 recognize, but rather an application of Ockham’s razor to a specific case. Indeed, a law—scientific or legal—has special weight that is broadly agreed upon. But, as the HL22 essay also notes, Ockham’s razor is not without controversy (nor is the cited Hume’s Law [i.e., the is–ought problem], which also carries the ominous moniker Hume’s guillotine). Asserting a new “law,” fully aware that it is not really a law, to support a point of view strikes me as also being an effective rhetorical device. (Has any paper been rejected for relying on the First Law of Thermodynamics?) Scientific literature is replete with extreme and perhaps no longer acceptable discourse; nevertheless, there remains a time and place for both neutral and precise wordsmithing and some measure of rhetorical flair—involving myths and misconceptions, laws and aphorisms, razors and guillotines. 
To this end, I direct readers to Ross Taylor’s counsel derived from writing 10 books (also applying to review papers): “Books should reflect the opinion of the authors. It is no service to readers to provide a list of ongoing controversies or of problems without making some assessment of a likely resolution or outcome. This is indeed not without hazard” (Taylor and McLennan 2009, p. xviii).

Few scientists would disagree with the famous antimetabole/aphorism that provides one leg supporting “Burke’s Law”: absence of evidence is not evidence of absence. However, arguments about Hadean crusts or timing of plate tectonics mostly are not quite so banal. A more charitable characterization might be that absence of evidence that should be present and observable, given the available geological record, may be evidence of absence (e.g., Sober 2015, pp. 252–253). In other words, such arguments at least attempt classic Popperian falsification based on failed predictions (the philosophy that theories, to be considered scientific, must in principle be testable and conceivably falsifiable). Taylor and McLennan (2009) did not argue that there simply was an absence of evidence for extensive Hadean continental crust and, therefore, it did not exist, but rather that extensive continents should produce Hadean-aged zircons that persisted during repeated cycles of cannibalistic sedimentary recycling and be more abundant than observed in Archean sedimentary rocks. Interpretations of the Hadean detrital zircon record certainly may evolve as more data are collected or simply may differ among workers, and so my point here is not to again argue the evidence, but simply to point out that it was indeed the evidence that was argued.

Similarly, the issue of Archean plate tectonics is more involved than opposing camps of pro- and anti-Archean plate tectonics. Many geologists have examined the Archean record (hopefully accumulating evidence in the “Burke’s Law” sense) and found modern plate tectonic models wanting. However, debates are more along the lines of what exactly constitutes “modern-style plate tectonics,” given that there is much room to maneuver between immobile stagnant lids and Earth’s current regime (e.g., unstable lids, squishy lids, sluggish lids, overturns, platelets, delamination, drip tectonics, flake tectonics, sagduction, hot subduction), and exactly when modern conditions began: apparently sometime between T0 and the Neoproterozoic (Palin and Santosh 2021). From my vantage, this range of views suggests a state of affairs more like multiple working hypotheses run amok! But even “Burke’s Law” recognizes some kind of transition from an earlier state to a plate tectonic regime: “we should assume (plate tectonics) was operating since global silicate differentiation…” (HL22, p. 354; emphasis added). If recent planetary exploration has taught us anything, it is that the nature of initial silicate differentiation on rocky planetary bodies, involving a variety of magma ocean and other complex igneous processes, is remarkably diverse (e.g., contrast the earliest anorthositic crust of the Moon, graphitic crust of Mercury, and eucritic [basaltic] crust of 4-Vesta) and provides little comfort for imposing a unitary assumption that Earth’s first crust resembled the current continental crust (McLennan 2022). As such, there seems to be some agreement that a transition from an original crust, related to initial differentiation, to modern-style plate tectonics occurred with ensuing debate essentially being haggling over the details.

Over the past two decades, I have been involved with Mars exploration, especially using rovers (e.g., McLennan et al. 2019). Mars rovers serve as “robotic field geologists” and operate under strict resource constraints involving time, data volume, and power. Prioritizing and obtaining observations are by design hypothesis-driven processes needed to justify resource allocations on both tactical and strategic timelines. Analytical capabilities are chosen well before landing site selection and therefore not fine-tuned to the specific geological problems encountered and, hence, often cannot carry out what on Earth would be routine measurements. These circumstances may lead to operationally unfalsifiable hypotheses—those that can be tested in principle but not in practice (i.e., spacecraft do not or cannot deploy appropriate instruments—synchrotrons come to mind). For example, in a recent review of Mars’ sedimentary geology, one of my co-authors observed that “many outstanding questions … could be resolved with a single thin section!” (McLennan et al. 2019, p. 93). In my experience, just like the nature of Archean plate tectonics, hypotheses tend to accumulate over time (sustained by the guise of multiple working hypotheses) but because it is rarely possible to make decisive measurements leading to falsification, it can be very difficult to reduce an ever-growing number of acceptable hypotheses, each of which may be considered deserving of precious resources. This, in turn, could pose a risk of resource-limited missions becoming bogged down.

There are many reasons—some scientifically objective and others less so—why some theories are more accepted than others (Hoffmann 2003). But in my judgement, embedded within these discussions are the competing roles of three fundamental concepts underpinning scientific discourse: (1) Karl Popper’s doctrine of falsification (Popper 1962), (2) the method of multiple working hypotheses (Chamberlin 1890), and (3) Ockham’s razor (principle of parsimony). The first two are widely understood (although some details perhaps less so), but among the difficulties with Ockham’s razor is that its meaning is largely in the eye of the beholder (Sober 2015). At one extreme, it is improperly thought to suggest that simpler theories are more likely correct. After all, nature abounds with complexity. HL22 boil it down to “simpler is better” (p. 354). Hoffmann et al. (1997) concluded that rather than “simpler is better,” a preferable expression is that simpler is pragmatically more useful (Fig. 1). Thus, “Ockham’s razor merely keeps science on the straightest path to the truth, crooked as it may be” (Kelly 2008, p. 350).

HL22 provide ten “different rules” (p. 355) to guide investigations of deep time, many being behavioral in nature. Earth science research in general is heavily burdened by weak (i.e., practical) underdetermination (Kleinhans et al. 2010) and, accordingly, it is probably not coincidental that the method of multiple working hypotheses was devised by a geologist. Hence, rather than formulating different rules to address specific problems (where might that end?), perhaps we should refresh our memories about long-standing thinking on the general nature of scientific investigation. Such problems may well fall in the shadow of Popper’s rather murky concept of the demarcation between science and metaphysics, which is founded on the ability to test and falsify hypotheses, or uncomfortably along the continuum between progressive and degenerating problem-shifts (Lakatos 1969). Being falsifiable in principle, Popper probably would consider many questions about early Earth or those arising during planetary exploration to be scientific in nature, but I think he might remind us that the intrinsic confirmability of any hypothesis is fundamentally limited by its testability (Popper 1962, p. 256). This is very much in line with Platt’s (1964) approach of strong inference. Platt embraced a version of multiple working hypotheses but tempered it with the necessity of having “crucial experiments” (experimentum crucis of Robert Hooke and Isaac Newton?) to test and, where possible, exclude hypotheses in order to make robust scientific progress. Indeed, for nearly a century prior to the lead up to the plate tectonic revolution, geology itself was largely moribund because it did not have the tools to implement relevant experiments to test its hypotheses (Menard 1971).

But in the absence of such crucial experiments, or until breakthroughs (likely technological or discovery-based) allow us to devise such experiments, what is a practical path forward? HL22 considered Ockham’s razor the weaker second leg supporting “Burke’s Law,” but I would instead argue that vigorous use of Ockham’s razor, when aptly framed, should indeed be a key tool in reducing the number of multiple working hypotheses, thus allowing us to focus our resources (and attention) on the most prospective subset of models. The late geophysicist Don Anderson (2002, p. 59) suggested “Occam’s razor can be used to improve, simplify, and discard theories, but is most useful when it is used to compare theories.” Hoffmann et al. (1997) argued that the razor is best used as a pragmatic tool serving as an operational principle and “is not a metaphysical statement about the way the universe is” (p. 14), a view I find compelling. It has even been suggested that we best reserve Ockham’s razor for trimming “Plato’s beard” only when it “is sufficiently tough, and tangled by many entities” (Popper 1972, p. 301). For cases such as early Earth that lacks an adequate geological record, planetary exploration that lacks adequate or appropriate resources, and no doubt many other underdetermined scientific problems (e.g., origin of life), perhaps the razor is also well used to help prune overgrowths of multiple working hypotheses for which no decisive experiments can be reasonably or practically devised to test their predictions.

I am grateful to John Grotzinger for reading and commenting on an early draft, Mark Harrison and Adrian Lenardic for helpful and constructive feedback, and Janne Blichert-Toft for able editorial assistance and strong support.
