Body size has a long history of study in paleobiology and underlies many important phenomena in macroevolution. Body-size patterns in the fossil record are often examined by utilizing size data alone, which hinders our ability to describe the biological meaning behind size change on macroevolutionary timescales. Without data reflecting the biological and geologic factors that drive size change, we cannot assess its mechanistic underpinnings.

Existing frameworks for studying ontogeny and phylogeny can remedy this problem, particularly the classic age–size–“shape” space originally developed for studies of heterochrony. When evaluated based on metrics for age, size, and phenotype in populations, proposed mechanisms for size change can be outlined theoretically and tested empirically in the record. Using this framework, we can compare ontogenetic trajectories within and between species and determine how changes in size emerge. Here, we outline ontogenetic mechanisms for evolutionary size change, such as heterochrony, as well as how geologic factors can drive apparent, non-biological size change (e.g., taphonomic size sorting).

To demonstrate the utility of this framework in actual paleobiological problems, we apply it to the Lilliput effect, a compelling and widely documented pattern of size decrease during extinction events. However, little is known about the mechanisms underlying this pattern. We provide a brief history of the Lilliput effect and refine its definition in a framework that can be mechanistically tested. Processes that likely produce Lilliput effects include allometric and sequence repatterning (including heterochrony) and evolutionary size-selective sorting. We describe these mechanisms and highlight relevant examples of the Lilliput effect for which feasible empirical tests are possible.

The body size of fossil organisms has been an important area of research for paleobiologists for well over a century, because body size can tell us much about widespread trends in the evolution of major groups of living things. However, paleobiologists often study body size by focusing only on size-related information collected from fossils. Without information about a fossil organism's biology and geologic history beyond its size, we cannot understand what is driving body-size change over evolutionary time in a meaningful way.

Luckily, the ways evolutionary biologists already think about growth and development (ontogeny) and evolutionary relationships among taxa (phylogeny) can help us resolve this issue. In particular, by looking at how a species’ body size, age, and other observable traits (phenotype) change over its growth and development, we can track how a species’ body size changes over the course of its time on Earth. Furthermore, we can compare these patterns between closely related species and identify the sources of body-size change in deep time.

To show how these ideas are a practical solution to problems in the fossil record, we applied them to a common pattern called the “Lilliput effect.” Named after the island of Lilliput and its tiny inhabitants in Gulliver's Travels, this pattern describes a sharp decrease in organism body size during extinctions in Earth history. Despite the Lilliput effect being very common, we understand little about how it occurs. Along with providing a stronger definition for the Lilliput effect, we use our framework to note some likely processes for the Lilliput effect (such as changes to development), and some famous cases where we could easily test these ideas.

Body size has long been of interest to paleobiologists. Many frequently discussed phenomena recognized in the fossil record involve body size, including Cope's rule (Cope 1869, 1887; Stanley 1973; Jablonski 1996; Gould and MacFadden 2004), Bergmann's rule (Bergmann 1847; Blackburn et al. 1999; Meiri and Dayan 2003), Foster's (island) rule (Foster 1963, 1964, 1965; Van Valen 1973; Heaney 1978; Lomolino 1985; Lomolino et al. 2012), and the Lilliput effect (Urbanek 1993; Twitchett 2007; Harries and Knorr 2009). However, despite this large body of work on body-size trends in the fossil record, there is a dearth of studies on the mechanistic underpinnings of these patterns. Body size is a complex trait that reflects many aspects of organismal biology, including phylogeny, morphology, physiology, ecology, and ontogeny (Calder 1984; Jablonski 1996; Cooper and Purvis 2010), and many studies utilize size data when evaluating macroevolutionary patterns involving these other factors. Yet body size alone can be a poor proxy for evaluating macroevolutionary patterns because of the myriad processes capable of driving size trends in the fossil record. Furthermore, apparent size trends can also reflect taphonomic processes, such as transport and sorting (Kidwell et al. 1986; Kidwell and Bosence 1991; Zuschin et al. 2005; Brayard et al. 2010), instead of biological ones. Mechanistic insight is needed to build a rigorous understanding of how body-size trends manifest over time and are preserved in the fossil record.

On their own, size measurements are agnostic to the mechanisms that drive body-size change over time, so additional data are needed to study these mechanisms in a biologically meaningful way. Ontogeny and phylogeny are particularly powerful lenses through which to interpret changes in body size, and existing frameworks for both can be broadly applied to the fossil record. Tools developed for studying heterochrony—particularly the multivariate age–size–phenotype space developed in the late twentieth century (Gould 1977; Alberch et al. 1979; McKinney and McNamara 1991; Klingenberg 1998)—are especially useful for understanding mechanisms of body-size change. By referencing how size covaries with other aspects of a species’ ontogenetic trajectory, we gain a better understanding of the mechanisms that may underlie changes in size, and whether/how those mechanisms might vary when comparing species that display seemingly similar changes in size. Furthermore, interspecific comparisons of ontogeny (e.g., when testing for heterochrony) can only be done properly in a rigorous phylogenetic framework (Alberch et al. 1979; Fink 1982). Phylogenetic patterns of body-size change provide necessary context to determine directionality and mechanisms of size change and clarify distortions brought on when considering stratigraphic patterns of size change alone.

In this paper, we briefly review multivariate heterochrony and describe its application to studies of body-size change. We begin with a general description of the framework, followed by a discussion of heterochrony and its equally important counterparts, allometric and sequence repatterning. To better demonstrate the practical applications of this model, we then explore a widely documented but poorly understood pattern of size decrease, the Lilliput effect (Urbanek 1993). Typically, the Lilliput effect has been studied by recognizing patterns of size decrease in a stratigraphic context (Twitchett 2007; Harries and Knorr 2009). With use of a heterochronic framework, however, we can identify or constrain proximal mechanisms for the Lilliput effect, such as ontogenetic shifts and size-selective sorting, and determine whether species that experienced simultaneous size decreases exhibit patterns consistent with these mechanisms. Application of this framework is not limited to the Lilliput effect, however, and the model can be applied to many cases of size change in the fossil record.

The heterochrony literature of the late twentieth century is rich with concepts relevant to change in various aspects of organismal phenotype. The classic age–size–“shape” space outlined by Alberch et al. (1979) and referred to as the “heterochronic trinity” by McKinney (1988) provides a framework both to quantify body size and other relevant metrics and to facilitate comparison of ontogenetic trajectories between populations and species (Gould 1977; Alberch et al. 1979; McKinney 1988; McKinney and McNamara 1991; Klingenberg 1998; Webster and Zelditch 2005). Throughout this paper, we refer to the classic “shape” axis as “phenotype” to be inclusive of other kinds of phenotypic data. In this framework, the age, size, and phenotype axes represent three distinct kinds of data that can be quantified in a developing organism or within a population; ontogenetic trajectories can be parameterized and modeled as a vector within this multivariate space (Fig. 1). Because lifelong ontogenetic data for individuals are typically not available for fossil and non-model organisms, we must usually infer average ontogenetic trajectories from cross-sectional or mixed cross-sectional data for a sample (Cock 1966; Alberch et al. 1979; Rice 1997). Unfortunately, the limited sample size available for many taxa in the fossil record restricts the degree to which ontogenetic trajectories can be reconstructed, and data that can be parameterized as age, size, or phenotype are not uniformly available for all taxa. However, with new methods and approaches, age, size, and phenotype can be quantified in a unified empirical framework (Barta et al. 2022). Although the age–size–phenotype framework is useful for those species that do preserve these data, it is also useful for recognizing when one or more types of data are missing, underscoring the limits of the fossil record when drawing conclusions on the mechanistic underpinnings of size trends.
Furthermore, although ontogenetic frameworks are useful for testing for mechanisms of size change, they are also useful in discriminating between ontogenetic and other mechanisms of body-size change in the fossil record (see discussion on size-selective sorting and the Lilliput effect in “Ontogenetic Mechanisms of Body-Size Change”). Here we briefly summarize the three axes of age–size–phenotype space and provide examples of data relevant to each axis.

Body Size

Body size of a fossil specimen can be defined as the size of preserved elements of a specimen at death. There is a vast literature exploring and utilizing various metrics for organismal size. The choice of “size” variable depends on the study system, available fossil material, and type of data being collected. Thus, size can be measured in a variety of ways, including the length of one or more skeletal elements, such as basal skull length (Huttenlocker and Botha-Brink 2013; Botha-Brink et al. 2016), trunk length (Motani et al. 2018), or limb-bone dimensions in vertebrates (Campione and Evans 2012) and cephalon length in trilobites (Trammer and Kaim 1997; Hunda and Hughes 2007). For bivalves, brachiopods, foraminifera, and other taxa with fewer, simpler skeletal elements, geometric means of shell or test length and width (Jablonski 1996; Chen et al. 2019), shell volume (Novack-Gottshall and Lanier 2008), and shell outline centroid size (Lockwood 2005) are all appropriate metrics. In some cases, body-mass estimates for fossil specimens could be appropriate size metrics. However, body-mass estimates should be applied only when reliable and where other size metrics are inappropriate, because most estimates of fossil body mass and volume are calculated using proxies such as element lengths and circumferences (Novack-Gottshall 2008; Campione and Evans 2012; Field et al. 2013). Size proxies should be independent of the data used to parameterize the age and phenotype axes. For instance, raw femoral lengths would be inappropriate as a size metric for geometric morphometrics of femora, but centroid size of the same elements would be appropriate. These examples are not exhaustive and are meant to demonstrate how varied size metrics can be, rather than prescribe a specific approach for a given study system.
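As a brief illustration of two of the size metrics named above, the following Python sketch computes a geometric mean of two linear dimensions and the centroid size of a landmark configuration. The function names and the example measurements are hypothetical and chosen only for illustration; centroid size is computed here under its standard definition (the square root of the summed squared distances of landmarks from their centroid), and no particular study's protocol is implied.

```python
import math


def geometric_mean_size(length, width):
    """Geometric mean of two linear dimensions (e.g., shell length and width)."""
    return math.sqrt(length * width)


def centroid_size(landmarks):
    """Centroid size of a landmark configuration.

    landmarks: list of coordinate tuples, all with the same number of dimensions.
    Returns the square root of the summed squared distances between each
    landmark and the centroid of the configuration.
    """
    n = len(landmarks)
    dims = len(landmarks[0])
    centroid = [sum(point[d] for point in landmarks) / n for d in range(dims)]
    summed_squares = sum(
        (point[d] - centroid[d]) ** 2 for point in landmarks for d in range(dims)
    )
    return math.sqrt(summed_squares)


# Hypothetical measurements (arbitrary units), for illustration only.
shell_size = geometric_mean_size(4.0, 9.0)
outline = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
outline_size = centroid_size(outline)
```

Because centroid size scales linearly with the coordinates, doubling every landmark coordinate doubles the returned value, which is the behavior expected of a size measure that is separable from shape.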

Age

Age is measured as absolute (chronological) time since birth or some other standard event early in ontogeny such as weaning, hatching, or metamorphosis. Although estimating the time since birth or absolute ontogenetic age of extinct taxa is often challenging, if not impossible, there are some useful proxies for estimating the relative age or life stage of an individual compared with other members of a fossil population. In bivalves, sclerochronology of shell growth increments serves as a proxy for ontogenetic age, especially when calibrated with geochemical data such as stable isotopes (Jones and Gould 1999; Schöne and Surge 2012). Similar sclerochronological techniques have also been applied to brachiopods, although to a much lesser extent (Hiller 1988; Brey et al. 1995; Angiolini et al. 2011; Gaspard et al. 2018), as have counts of external growth lines of brachiopod shells (Metcalfe et al. 2011). In many tetrapods, lines of arrested growth (LAGs) preserved in bone tissues represent annual slowing or cessation of growth, and both the number and spacing of these lines are used to estimate age (Castanet 1994; Castanet et al. 2004; Padian 2012; de Buffrénil et al. 2021). In fish, otoliths (ear bones) display annuli as well, with each annulus being shown to represent a true year in fisheries studies (Campana and Thorrold 2001; Wilson and Nieland 2001; Laidig et al. 2003). Even in taxa that do not lay down some kind of annulus in their skeletal elements, there are other ways to approximate relative “age.” For instance, instars can be an appropriate staging metric in arthropods such as trilobites (Hughes et al. 2006, 2021).

Many studies of heterochrony and ontogenetic trajectories in the fossil record consist of allometric studies of only size and phenotype because of the difficulty of collecting age data. Size is often treated as a proxy for age, but these two variables can decouple from one another (McKinney 1988; Metcalfe et al. 2011; Huttenlocker and Botha-Brink 2013; Botha-Brink et al. 2016), making size an unreliable proxy for the passage of time. Although age data and other stand-in metrics for life stage can be challenging to acquire from fossils, it is important to be aware of how an axis of time affects the interpretation of ontogenetic processes, as seen in “Ontogenetic Mechanisms of Body-Size Change.” In the absence of age data, care should be taken to avoid drawing conclusions that require a metric for time.

Phenotype

Phenotype is arguably the most complex axis to define and quantify. In classic heterochrony literature, the phenotype axis is often referred to as the “shape” axis, but “shape” has a precise meaning, especially in the context of geometric morphometrics, and only accounts for a subset of potentially important aspects of phenotype. Therefore, we refer to this axis as “phenotype” to accommodate both shape and non-shape measures of phenotype. The goal of a phenotype axis is to measure the morphological change experienced during ontogeny by members of a given taxon, independent of size and age, in a one- or multidimensional framework. Among extant taxa, this could include a wide range of features, including color, behavior, metabolic processes, and soft tissue morphology, but for our purposes in the fossil record, phenotypic data are largely restricted to the morphology of hard parts. Linear dimensions and geometric morphometric measures of phenotype are traditionally favored by paleobiologists, but they are not the only means to capture ontogenetically variable morphology. Some morphologies may be better captured as discrete-state (meristic) characters or “events” in a sequence. Such discrete data can also be collected from deformed and incomplete specimens where morphometric approaches might be unreliable. Furthermore, scored characters and meristic data can offer a degree of size independence that can be challenging to achieve in morphometrics. Discrete data can reference a wide range of morphologies, including the development of elements like crests, bosses, and muscle scars (Griffin and Nesbitt 2016a,b; Griffin 2018; Barta et al. 2022) or the number of elements such as spines or shell ribs (Guo et al. 2021).
For example, ontogenetic sequence analysis (Colbert and Rowe 2008) is a parsimony-based method used to reconstruct all possible ontogenetic sequences among ontogenetically variable specimens and has been used in studies of lepospondyl tetrapods (Olori 2013) and early dinosaurs and dinosauriform archosaurs (Griffin and Nesbitt 2016a,b; Griffin 2018; Barta et al. 2022).

Ontogenetic phenotype can be challenging to parameterize. Although the entirety of ontogenetic phenotype need not be exactly conserved between taxa, measured features (shape, characters, dimensions, etc.) should be homologous. Measures of phenotype can suffer from dimensionality bias (Webster and Zelditch 2005), both underparameterization and overparameterization, as seen in the morphometric size–shape allometry literature. An awareness of what the phenotype axis is intended to measure can help mitigate this problem. The goal of the phenotype axis is to measure a shared set or series of events, shape changes, or character changes, so the selected parameters should not overly simplify or complicate an organism's ontogeny. Furthermore, interpretation of ontogenetic modification must be limited to the kinds of data represented by the phenotype axis and not extrapolated to other unrepresented or unmeasured aspects of phenotype.
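To make the idea of an ontogenetic trajectory as a vector in age–size–phenotype space concrete, the following Python sketch estimates an average trajectory from cross-sectional data. It rests on strong simplifying assumptions that are ours and not part of the framework itself: size and a univariate phenotype score are each treated as linear functions of age, and the trajectory is summarized by ordinary least-squares slopes. The function names and the sample values are hypothetical.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def trajectory_vector(specimens):
    """Average trajectory direction per unit age for a cross-sectional sample.

    specimens: list of (age, size, phenotype) tuples, one per individual.
    Returns (1.0, change in size per unit age, change in phenotype per unit age).
    Assumes linear growth in both size and phenotype (a simplification).
    """
    ages, sizes, phenotypes = zip(*specimens)
    return (1.0, ols_slope(ages, sizes), ols_slope(ages, phenotypes))


# Hypothetical sample: (age in years, size in mm, phenotype score).
sample = [(1, 3.0, 0.5), (2, 5.0, 1.0), (3, 7.0, 1.5), (4, 9.0, 2.0)]
vector = trajectory_vector(sample)
```

Real ontogenies are rarely linear, and phenotype is often multivariate, so a sketch of this kind is best read as a starting point for comparing trajectories rather than as a recommended model.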

A common issue in studies of body-size change is a lack of phylogenetic context. Evolutionary patterns within clades cannot be evaluated without a robust phylogeny (Webster et al. 2001). Clade-level patterns of size decrease only make sense in an ancestor–descendant or stemward–crownward context (Gould and MacFadden 2004; Lockwood 2005; Harries and Knorr 2009). Interpretations of size change in a stratigraphic context alone are appealing but have the potential to miss important information. For example, in a solely stratigraphic context, it may appear as though size differences between species are directional. With the aid of a phylogeny, however, it may become clear that such size trends are inconsistent or follow a different pattern when ancestral character states and ghost lineages are accounted for (Adrain and Chatterton 1994; Gould and MacFadden 2004; Fig. 2). Fossil equids are an excellent empirical example of this, as they were once the classic illustration of Cope's rule (Matthew 1926; Simpson 1953). However, with increased fossil occurrences and a robust phylogeny, equid body-size evolution is now recognized to include varied episodes of size change rather than a single gradual increase in body size through time (MacFadden 1986; Gould and MacFadden 2004; Cantalapiedra et al. 2017).

Although phylogenetic concerns pose less of a problem to studies of size change in a single species or an anagenetic lineage, they become a more significant challenge when considering members of genera or higher taxonomic ranks. A robust, well-supported phylogeny and ample developmental information can reveal complex patterns of body-size change (Hanken and Wake 1993; Gould and MacFadden 2004; Angielczyk and Feldman 2013; Cordero 2021). Moreover, the larger the phylogeny and the longer the time spanned by it, the more variation is expected and the greater the challenge in determining mechanisms and modes in patterns of size change. Neontological examples highlight that size change can present in a variety of ways and be achieved through various mechanisms. As an example, turtles have relatively low taxonomic diversity compared with other vertebrates, yet they display a wide range of body sizes. Differing, complex modes of allometric and sequence repatterning leading to size decrease have been found among members of at least two groups of turtles (Angielczyk and Feldman 2013; Cordero 2021; Heston et al. 2022), in part revealed through the abundant developmental, morphological, and life-history information available in extant systems. As mentioned earlier in the section on phenotype, comparison of ontogenetic trajectories also requires a degree of conserved ontogeny among homologous structures or features. Such conservatism is less likely as phylogenetic scale increases, given that ontogenetic systems can often change with speciation (Webster and Zelditch 2005; see also “Allometric and Sequence Repatterning”). This does not mean that clade-level studies of body size are futile, but again prompts caution regarding what can be said of a given system with the data at hand.

As mentioned earlier, the age–size–phenotype framework was developed in the heterochrony and allometry literature. Due to the wide interest in heterochrony during the late twentieth century, extensive terminology for evaluating body-size phenomena in the fossil record has been developed (Gould 1977, 2000; Alberch et al. 1979; McKinney 1988; McKinney and McNamara 1991; Klingenberg 1998; Webster and Zelditch 2005). Here we review concepts and terminology relevant to the study of macroevolutionary body-size trends.

Heterochrony

Heterochrony refers to evolutionary change in mature phenotype resulting from a decoupling of a taxon's growth trajectory from that of its ancestor along one or more of the age–size–phenotype axes, producing a parallelism between ontogenetic and phylogenetic phenotypic change (McKinney and McNamara 1991; Zelditch and Fink 1996; Mitteroecker et al. 2005; Webster and Zelditch 2005). Identification of heterochrony requires a phylogenetically conserved axis of phenotypic change and manifests as modification to the rate at which change along that axis is achieved and/or the timing of events along that axis with respect to developmental time (age) and/or size (Alberch et al. 1979; Klingenberg 1998; Gould 2000; Mitteroecker et al. 2005; Webster and Zelditch 2005). Peramorphosis (the production of a “more mature” descendant phenotype) can result from either an increased rate of phenotypic development relative to size or time (acceleration) or a longer duration of ontogenetic change (by a delayed termination of phenotypic change [hypermorphosis] and/or an earlier start to change along the phenotypic development axis [predisplacement]). Conversely, paedomorphosis (the production of a “less mature” descendant phenotype) can result from a decreased rate of phenotypic development relative to size or time (deceleration or neoteny) and/or a shorter duration of ontogenetic change (by an earlier truncation of ontogeny [progenesis] and/or a delayed onset to change along the phenotypic development axis [postdisplacement]) (Alberch et al. 1979; Klingenberg 1998). A selection of examples of the possible changes to the age–size–phenotype axes is shown in Figure 3.

As noted earlier, size is often substituted for time and plotted against phenotype. However, dissociations between size and age are possible as ontogenies change (McKinney 1988). Although important to independently quantify, changes between size and age axes alone are not heterochronic without reference to phenotypic change. Heterochronic terms set forth by Gould (1977) and later refined (Alberch et al. 1979; McKinney and McNamara 1991; Klingenberg 1998) largely deal with outlining allometric change, which has posed problems for describing changes in phenotype with respect to age. In turn, this practice resulted in the misapplication of heterochronic terminology to changes in rate itself (McKinney 1988; Gould 2000), rather than to the results of changes in rate, timing, and onset in ontogeny (Gould 2000). To remedy this problem, we recommend specifying the axis or axes involved in a given mechanism of change specifically with respect to phenotype. In earlier work, similar recommendations have been made when comparing phenotype to age and/or size to age (McKinney and McNamara 1991). However, the assertion that heterochrony can be identified by comparing a measure of body size with respect to age has in part led to the widespread misuse of the term (Gould 2000; Webster and Zelditch 2005). Therefore, heterochrony should be identified by comparing trajectories along the axis of ontogenetic phenotype with respect to age and/or size, but not size against age. In some cases, ontogeny might be modified by decoupling one axis from the other two, as seen in the examples of progenesis and hypermorphosis in Figure 3A. 
In other cases, all three axes might be decoupled from each other as a result of multiple modifications to ontogeny, such as the production of the smaller (blue) paedomorphic descendant in Figure 3G, which resulted from progenesis with respect to size and neoteny with respect to age, or the production of the blue descendant in Figure 3I, which achieved a smaller body size but a peramorphic phenotype due to hypermorphosis with respect to age and acceleration with respect to size. Patterns resulting from such decoupling between all three axes were also discussed by Klingenberg (1998). It is important to note that all of these modifications to ontogeny are described in terms of phenotype with respect to size and/or age. Historically, the concept of heterochrony became so broad that it nearly became a “catchall” for any morphological change. Despite this misapplication of the term, heterochrony represents a distinct, rare case of morphological evolution that occurs along an ontogenetic line of least resistance (Webster et al. 2001; Webster and Zelditch 2005).

Allometric and Sequence Repatterning

Allometric repatterning, as defined by Webster and Zelditch (2005), describes evolutionary modification to the pattern of allometric ontogenetic shape change not produced by heterochrony. Likewise, for non-shape measures of phenotype, sequence repatterning describes an analogous modification to the type and order of events or characters not produced by heterochrony. Both kinds of repatterning describe dynamic (ontogenetic) phenotypic change. In this case, the phenotype axis is not shared between ancestor and descendant, and thus the parallelism of ontogeny and phylogeny is broken (Webster and Zelditch 2005). Allometric and sequence repatterning therefore pose an issue for the three-axis model: ancestor and descendant now have different phenotype axes and so exist in entirely separate age–size–phenotype spaces. However, even if the phenotype axis is not conserved (shared) between species, this does not mean that taxa cannot be compared in meaningful ways.

Allometric repatterning has been widely documented, including in examples of early Cambrian (Webster et al. 2001) and Late Ordovician (Hunda and Hughes 2007) trilobites, modern Arctic charr (Parsons et al. 2011), modern piranhas (Zelditch et al. 2000), modern rodents (Zelditch et al. 2003), modern Leptodactylus frogs (Ponssa and Candioti 2012), and Carboniferous (Stephen et al. 2002) and Jurassic (Gerber et al. 2007) ammonoids. The study by Hunda and Hughes (2007) is particularly relevant to the present paper, because it focused on an example of body-size change. In their study, Hunda and Hughes (2007) compared two subspecies of trilobite (Flexicalymene retrorsa retrorsa and Flexicalymene retrorsa minuens) that represent sister-taxa (and potentially an ancestor–descendant pair) from the Cincinnatian Series (Upper Ordovician). Using two-dimensional geometric morphometrics of the cephalon, they found that the subspecies differed in their patterns of ontogenetic shape change during the holaspid phase and thus that the trajectory of ontogenetic shape change had been evolutionarily modified: the evolution of the small-bodied F. retrorsa minuens did not result from pure heterochrony but involved allometric repatterning in addition to whatever process was responsible for the size decrease. The authors were careful to highlight the limitations of their data, namely that: (1) only the holaspid phase of ontogeny was sampled for both subspecies, so that their study was silent regarding the nature of any modifications to earlier portions of ontogeny; and (2) developmental age data were unavailable, and the analysis was thus unable to discern the mechanism by which the change in body size occurred. 
By rigorously quantifying ontogenetic shape change and carefully interpreting the implications of their data, the authors uncovered subtler, more informative results regarding the ontogenetic trajectories of these subspecies than would be understood from traditional, broad considerations of heterochrony.

The mechanisms that generate size increase or decrease involve decoupling body size from its ancestral relationship with age and/or phenotypic change. At the outset, one might expect that decreases in size correspond to “juvenilized” ontogenies and vice versa with size increase. These associations may hold true in some cases, such as the craniofacial evolutionary allometry relationship seen both interspecifically among mammals (Cardini and Polly 2013; Cardini et al. 2015; Cardini 2019; Rhoda et al. 2023) and intraspecifically during ontogeny (Cardini and Polly 2013; le Verger et al. 2020), where snout length proportionally increases with body size along a line of least resistance as a general rule. However, size decrease is not always associated with paedomorphic-like trends nor is size increase always associated with peramorphic-like trends. In other cases, we might see different ontogenetic mechanisms in response to other pressures. For example, in fisheries where overharvesting of “normal” adult phenotypes has led to smaller individuals reaching sexual maturity at a younger age (Krohn and Kerr 1997; Jørgensen et al. 2009; Frank et al. 2018; Morrongiello et al. 2019; Ayllón et al. 2021; Roy and Arlinghaus 2022; see sections below on the Lilliput effect), we instead see accelerated phenotypic change with respect to a reduced adult body size and age (Fig. 3C). There has not been enough investigation into ontogenetic phenotype in these systems to say whether this represents acceleration in the heterochronic sense (and thus peramorphosis) or an increased rate of growth toward the ancestral adult phenotype, but it nevertheless illustrates the above point. Any of several kinds of modification to ontogeny can produce an increase or decrease in size.

One of the simplest theoretical ways to decrease or increase body size is to truncate or extend the growth trajectory without modifying the ancestral relationship between phenotype, age, and size (i.e., progenesis or hypermorphosis, respectively, of both age and size; Fig. 3A). In the case of size increase with hypermorphosis, the descendant adult would appear more morphologically mature in phenotype and be older in age than the ancestor. In the case of size decrease with progenesis, the descendant adult would appear as a juvenilized version of the ancestor, being younger in age and morphologically immature in phenotype. If the relationships between phenotype, size, and age are known for the ancestor, then measurement of any one of those variables in the descendant allows prediction of the other two in that descendant.

The next simplest scenario is to decouple size from age and phenotype but retain the ancestral relationship between age and phenotype. In this case, ancestor and descendant would attain a given phenotype at the same age but different sizes (Fig. 3B). Knowledge of the ancestral ontogeny allows prediction of age from phenotype (or vice versa) in the descendant but does not allow prediction of size from either age or phenotype in that descendant.

It is also theoretically possible to decouple age from size and phenotype but retain the ancestral relationship between size and phenotype. In this case, ancestor and descendant would attain a given phenotype at the same size but different age. At a given developmental age, the descendant would have either a paedomorphic phenotype and smaller size, or a peramorphic phenotype and larger size, than the ancestor (Fig. 3C). Knowledge of the ancestral ontogeny allows prediction of phenotype from size (or vice versa) in the descendant but does not allow prediction of age from either size or phenotype in that descendant.

Another possibility would be to decouple both age and size from phenotype but retain the ancestral relationship between age and size. In this case, ancestor and descendant would attain a given phenotype at a different size and age (Fig. 3D). Knowledge of the ancestral ontogeny allows prediction of age from size (or vice versa) in the descendant but does not allow prediction of phenotype from either age or size in that descendant. It should be noted that the ancestral size is conserved in both of the above examples (Fig. 3C,D). Although neither example represents a case of ontogenetic size change compared with the others described here, these simple adjustments to ontogeny should be considered for both possible scenarios in the fossil record and the more complex scenarios described below (Fig. 3E–I).

More complicated scenarios arise if none of the ancestral relationships between age, size, and phenotype are retained in the descendant. A given evolutionary change in size could in principle be achieved via a peramorphic or paedomorphic shift in phenotype, and with an increase or decrease in the duration of growth, as outlined in Figure 3E–I. Knowledge of the ancestral ontogeny would not shed light on the relationships between age, size, and phenotype in the descendant. We outline these ontogenetic mechanisms in this non-exhaustive list to provide context for how changes in body size relate to corresponding, decoupled, and nonexistent changes in age and phenotypic maturity. Depending upon which ancestral relationships between age–size–phenotype are evolutionarily conserved, certain predictions can be made about the combinations of age, size, and phenotype values in the descendant. Hypotheses regarding how the ancestral ontogeny was modified are therefore testable. As discussed earlier, heterochrony is ultimately a multivariate framework, and disregarding one axis as “interchangeable” with another could easily lead one to miss important mechanistic underpinnings or to mischaracterize size change in the fossil record.
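The logic of testing which ancestral relationships are conserved can be sketched in code. The following is a minimal illustration, not a published method: it assumes each ontogenetic trajectory is a list of hypothetical (age, size, phenotype) tuples in which every axis increases strictly along the trajectory, and it uses an arbitrary tolerance (`tol`, a fraction of the ancestral range) to decide whether a descendant point falls on the ancestral curve. The function names and the example values are invented for illustration.

```python
AXES = {"age": 0, "size": 1, "phenotype": 2}

def interpolate(x, xs, ys):
    """Linearly interpolate y at x; xs must be strictly increasing."""
    if x < xs[0] or x > xs[-1]:
        return None  # outside the sampled ancestral trajectory
    for k in range(1, len(xs)):
        if x <= xs[k]:
            frac = (x - xs[k - 1]) / (xs[k] - xs[k - 1])
            return ys[k - 1] + frac * (ys[k] - ys[k - 1])
    return None

def is_conserved(ancestor, descendant, x_axis, y_axis, tol=0.05):
    """True if descendant points fall on the ancestral x-y curve (within tol)."""
    i, j = AXES[x_axis], AXES[y_axis]
    xs = [p[i] for p in ancestor]
    ys = [p[j] for p in ancestor]
    span = max(ys) - min(ys)
    compared = 0
    for point in descendant:
        expected = interpolate(point[i], xs, ys)
        if expected is None:
            continue  # cannot compare beyond the ancestral range
        compared += 1
        if abs(point[j] - expected) > tol * span:
            return False
    return compared > 0

def conserved_relationships(ancestor, descendant, tol=0.05):
    """Report which pairwise ancestral relationships the descendant retains."""
    pairs = [("age", "size"), ("age", "phenotype"), ("size", "phenotype")]
    return {f"{x}-{y}": is_conserved(ancestor, descendant, x, y, tol)
            for x, y in pairs}

# Hypothetical ancestor: size and phenotype both increase linearly with age.
ancestor = [(a, 2.0 * a, a / 10) for a in range(0, 11)]
# Truncated growth with all relationships retained (compare Fig. 3A).
progenetic = ancestor[:7]
# Size decoupled from age and phenotype; age-phenotype retained (compare Fig. 3B).
size_decoupled = [(a, 1.0 * a, a / 10) for a in range(0, 11)]

print(conserved_relationships(ancestor, progenetic))
print(conserved_relationships(ancestor, size_decoupled))
```

In this toy case, the truncated trajectory retains all three pairwise relationships, whereas the size-decoupled trajectory retains only the age–phenotype relationship. Real data would require uncertainty estimates on each axis and a statistically justified comparison rather than a fixed tolerance.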

Miniaturization and Gigantism

Cases of size decrease and increase are often referred to as “miniaturization” and “gigantism” in the literature, but we argue that these terms should be avoided as labels for any directional change in body size. Both terms indicate not only an extreme decrease or increase in size but also a shift in biological mode, meaning size change results in major changes to the physiology, anatomy, ecology, life history, or behavior of an organism (Hanken and Wake 1993). Yet these terms have been applied broadly to a variety of nonapplicable or unclear cases, including “miniaturization” and “gigantism” seen in insular size change, or the island rule (Stanley 1973; Raia and Meiri 2006), and the “miniaturized” faunas described in the Lilliput effect (He et al. 2007; Harries and Knorr 2009; Huang et al. 2010; Borths and Ausich 2011). Determining whether a taxon is truly miniaturized, meaning greatly reduced in size such that it occupies a different mode of life than its ancestor, is extremely challenging without the context of ontogeny and phylogeny. Many cases of true miniaturization and gigantism likely represent instances of allometric or sequence repatterning, as the magnitude of change in the organism's biology almost certainly requires deviating from the ancestral phenotypic trajectory throughout ontogeny. As such, miniaturization and gigantism are also “phylogenetic statements” and indicate dramatic shifts in body size and mode of life with reference to other members of a clade (Hanken and Wake 1993). Cases of miniaturization and gigantism are plentiful, especially among living taxa, including miniature threadsnakes of the West Indies (Martins et al. 2021); Brookesia chameleons, including the exceptionally small Brookesia nana (Glaw et al. 2012); kinosternine turtles (Cordero 2021); extinct varanid lizards (Erickson et al. 2003); baleen whales (Slater et al. 2017); and amphiumid salamanders (Bonett et al. 2009).

Taphonomic Biases and Juvenile Mortality

Although there are plentiful examples of ontogenetic shifts resulting in body-size change in the fossil record, some instances of apparent size change do not involve actual biological change as presented in a fossil population. Such apparent cases of size change include taphonomic size biases and high juvenile mortality in an assemblage. In the case of taphonomic bias, paleobiologists should be cautious of fossil assemblages that preserve only one size or age class of a given taxon. Although they can provide abundant fossil material, aggregations of particular size or age classes can strongly skew analyses of size and ontogeny without proper consideration of taphonomic setting and ecomorphology. Furthermore, size sorting, time averaging, and differential preservational potential of ontogenetic stages can produce misleading trends for those interested in ontogenetic information in fossil material (Kidwell and Bosence 1991; Aslan and Behrensmeyer 1996; Behrensmeyer et al. 2000; Britt et al. 2009; Gostling et al. 2009; Brayard et al. 2010; Chattopadhyay et al. 2013; Wosik and Evans 2022). Considering time averaging, temporally distinct populations can easily be admixed, leading to a composite distribution of phenotypes in the pooled sample (Kidwell and Bosence 1991; Aslan and Behrensmeyer 1996; Behrensmeyer et al. 2000). Transport-based size sorting and unequal preservation potential among ontogenetic stages of a population can restrict the portion of an ontogenetic trajectory available for study (Aslan and Behrensmeyer 1996; Chattopadhyay et al. 2013). Such taphonomic biases are not representative of size variation in the living population but can mimic biological phenomena such as the Lilliput effect (Brayard et al. 2010). Geologic setting and taphonomic processes are easily overlooked, but these processes should be embraced if we are to study ontogeny in the fossil record with fidelity. 
An understanding of the depositional setting and possible transport mechanisms of fossil material is just as crucial as an awareness of where specimens sit along a given ontogenetic trajectory.
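How time averaging and transport-based size sorting distort a size distribution can be shown with a small simulation. This is a sketch under stated assumptions only: the two populations, their mean sizes and standard deviations (in arbitrary units), the sample sizes, and the size window retained by transport are all hypothetical values chosen for illustration, and body size is assumed to be normally distributed.

```python
import random
import statistics

def time_averaged_sample(populations, n_per_population, rng):
    """Pool specimens from temporally distinct populations into one assemblage.

    populations: list of (mean_size, sd_size) pairs, one per population.
    """
    pooled = []
    for mean_size, sd_size in populations:
        pooled.extend(rng.gauss(mean_size, sd_size)
                      for _ in range(n_per_population))
    return pooled

def transport_sorted(sizes, min_size, max_size):
    """Keep only specimens inside the size window retained after transport."""
    return [s for s in sizes if min_size <= s <= max_size]

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Two hypothetical populations that never coexisted.
older = (20.0, 2.0)
younger = (30.0, 2.0)

pooled = time_averaged_sample([older, younger], 500, rng)
sorted_sample = transport_sorted(pooled, 15.0, 25.0)

print("pooled SD:", round(statistics.pstdev(pooled), 2))
print("pooled mean:", round(statistics.mean(pooled), 2))
print("size-sorted mean:", round(statistics.mean(sorted_sample), 2))
```

The pooled assemblage shows far more size variation than either source population, and the size-sorted assemblage has a smaller mean than the pooled one even though no individual in either population changed size. Neither pattern reflects a biological change in ontogeny.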

As another example, high juvenile mortality in a fossil population can appear similar to an evolutionary (or plastic) ontogenetic shift toward a morphologically immature phenotype. In the case of the Lilliput effect (see following section), sampling of smaller (and perhaps morphologically immature) specimens in the aftermath of extinctions could be the result of a real ontogenetic response to an unfavorable environment or the result of increased juvenile mortality (Twitchett 2007; Huang et al. 2010; Botha 2020; Huttenlocker et al. 2022). In the case of juvenile mortality, body-size distributions are skewed due to preferential die-off of particular ontogenetic stages, but without a significant change to the underlying ontogeny of the organism (Twitchett 2007). Often, the mass mortality of juveniles reflects behavioral responses to environmental stressors like drought (Weigelt 1989; Varricchio et al. 2008; Viglietti et al. 2013; Smith et al. 2022), and the resulting assemblages are thus not representative of the entire size–age structure of a population. Additionally, specific reproductive strategies (or changes in reproductive strategies) have been suggested to drive overrepresentation of juveniles in the fossil record (Stephen et al. 2012; Botha-Brink et al. 2016). Although juvenile mortality does promote a “size phenomenon” in the fossil record, we do not consider it to be an ontogenetic mechanism of body-size change. The perceived size change is a result of relatively few individuals reaching normal adult size, but it does not result in a smaller adult size. Because the size change reflects preferential die-off of abundant juveniles rather than an evolutionary trend in the entire population, juvenile mortality is a “false” size decrease. To avoid conflating the two, rigorous sampling across stratigraphy, localities, and ontogeny in multiple assemblages is necessary.
Even in a population with high juvenile mortality, population numbers have to be maintained by reproducing adults, and the presence of rare, large individuals can be an indicator of increased juvenile mortality.
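The contrast between a “false” size decrease from juvenile mortality and a true reduction in adult size can be illustrated numerically. The sketch below is hypothetical throughout: it assumes a simple saturating growth curve, invented counts of deaths per age class, and arbitrary size units. It is meant only to show why rare, large individuals are informative, not to model any particular taxon.

```python
import math
import statistics

def size_at_age(age, adult_size=100.0, k=0.5):
    """Assumed saturating growth curve approaching adult_size with age."""
    return adult_size * (1.0 - math.exp(-k * age))

def death_assemblage(deaths_per_age, adult_size=100.0):
    """Build a list of body sizes from {age at death: number of individuals}."""
    sizes = []
    for age, count in deaths_per_age.items():
        sizes.extend([size_at_age(age, adult_size)] * count)
    return sizes

# Background interval: most individuals die as older, larger animals.
background = death_assemblage({1: 10, 3: 20, 6: 30, 10: 40})
# Crisis with high juvenile mortality: same growth curve, few reach old age.
crisis = death_assemblage({1: 70, 3: 20, 6: 8, 10: 2})
# True ontogenetic size decrease: same ages at death, smaller adult size.
dwarfed = death_assemblage({1: 10, 3: 20, 6: 30, 10: 40}, adult_size=60.0)

for name, sample in [("background", background),
                     ("crisis", crisis),
                     ("dwarfed", dwarfed)]:
    print(name, round(statistics.mean(sample), 1), round(max(sample), 1))
```

Mean size falls in both the high-mortality and the dwarfed assemblages, but only the dwarfed assemblage loses its largest size class. Under these assumptions, the maximum size in the high-mortality assemblage matches the background maximum, which is the signal that rare, large individuals provide.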

Potential empirical applications of the age–size–phenotype framework to macroevolutionary body-size phenomena are plentiful in the fossil record and span a variety of clades and time periods. We use the Lilliput effect to introduce how this framework can clarify aspects of size evolution as well as candidate systems where it could be used. We chose the Lilliput effect, a pattern of body-size decrease during extinctions, because of the wide interest in size trends during extinctions. Body size has long been studied as a metric of life-history patterns (Metcalfe et al. 2011; Botha-Brink et al. 2016) and ecosystem recovery dynamics (Sallan and Galimberti 2015; Chen et al. 2019) during extinctions and their subsequent recoveries. In addition, the Lilliput effect is an ideal phenomenon to which to apply the age–size–phenotype framework due to the relative abundance of single species and genera that thrive in the aftermath of extinctions. These “disaster” taxa are often opportunistic generalists that frequently dominate during the immediate crisis and briefly into the following recovery (Benton and Newell 2014). Resilience to the adverse conditions during the crisis often comes with extreme costs, however, and disaster taxa often experience shifts in morphology and life history during extinctions. Many disaster taxa are relatively abundant and display variation in size, age, and phenotypic characteristics, making them good candidates for the age–size–phenotype framework. In the following sections, we review the history of the Lilliput effect, challenges in defining this pattern, process-based refinements to studying it using the age–size–phenotype framework, and relevant case studies of the phenomenon.

A History of the Lilliput Effect

The Lilliput effect was initially defined by Urbanek (1993) as a “post-event syndrome” that resulted in both size reduction and a “subnormal phenotype” from hindered phenotypic development. This definition attributed small size and associated phenotypic shifts to harsh postextinction environmental conditions and reduced competition in disturbed environments. Urbanek's definition outlines expectations for both the body size and phenotype of Lilliputian species in the aftermath of an extinction and restricts the phenomenon to a process that occurs temporarily within a single species. Since this initial work, the Lilliput effect has been reported in a diverse range of clades, including brachiopods (Chen et al. 2005; He et al. 2007, 2010; Huang et al. 2010; Metcalfe et al. 2011; Schaal et al. 2016; Chen et al. 2019), mollusks (Twitchett 2007; Atkinson and Wignall 2020), ostracodes (Forel et al. 2015), corals (Kaljo 1996), foraminifera (Keller and Abramovich 2009; Song et al. 2011), echinoderms (Jeffery 2001; Twitchett and Oji 2005; Borths and Ausich 2011; Brom et al. 2015), graptolites (Urbanek 1993), and vertebrates (Girard and Renaud 1996; Renaud and Girard 1999; Mutter and Neuman 2009; Huttenlocker and Botha-Brink 2013; Huttenlocker 2014; Sallan and Galimberti 2015; Botha-Brink et al. 2016; Berv and Field 2018; Botha 2020; Xinsong et al. 2020). These body-size reductions also occur during many extinction events and biotic crises, including the Ordovician–Silurian mass extinction (Kaljo 1996; Borths and Ausich 2011), the late Silurian extinction events (Urbanek 1993), the Frasnian–Famennian mass extinction (Girard and Renaud 1996; Renaud and Girard 1999; Xinsong et al. 2020), the end-Devonian mass extinction (Sallan and Galimberti 2015), the end-Permian mass extinction (EPME) (Chen et al. 2005; Twitchett and Oji 2005; He et al. 2007, 2010; Twitchett 2007; Mutter and Neuman 2009; Huang et al. 2010; Metcalfe et al. 2011; Song et al. 2011; Huttenlocker and Botha-Brink 2013; Huttenlocker 2014; Botha-Brink et al. 2016; Schaal et al. 2016; Chen et al. 2019; Botha 2020), the end-Triassic mass extinction (Atkinson and Wignall 2020), the Cretaceous marine anoxia events (Brom et al. 2015), and the Cretaceous–Paleogene mass extinction (Jeffery 2001; Keller and Abramovich 2009; Berv and Field 2018). The widespread documentation of this phenomenon suggests that the Lilliput effect is a common response to stressful environmental or ecological conditions during biotic crises. However, there is little consensus on what the “Lilliput effect” is in a biologically meaningful sense, because the mechanisms that underlie the Lilliput effect are poorly defined and understood.

Following the work of Urbanek, many cases of decreased body size following mass extinctions were recognized as Lilliput effects (Girard and Renaud 1996; Kaljo 1996; Renaud and Girard 1999; Harper and Jia-Yu 2001; Jeffery 2001; Chen et al. 2005; Twitchett and Oji 2005; He et al. 2007). As more and more examples of the Lilliput effect were observed across time periods and clades, the concept of the phenomenon expanded. Prior work seeking to categorize and define the Lilliput effect has focused on its faunal patterns. The phenomenon has been considered to involve (1) the temporary removal of large taxa (“faunal stunting” of Harries and Knorr [2009]), (2) the origination of smaller taxa, and (3) within-lineage size decrease (Twitchett 2007; Harries and Knorr 2009; Fig. 4). However, this broadly inclusive concept, though useful in documenting the general effects of extinctions, does not inform us of the nature of the Lilliput effect and its underlying mechanisms in specific cases. Indeed, there is considerable confusion concerning which cases of size decrease count as examples of the Lilliput effect due to the lack of a process-based definition. Discussion of whether or not the Lilliput effect is an “evolutionary” or “ecological” phenomenon (Harries and Knorr 2009) further complicates our understanding. Indeed, this question presents a false dichotomy of sorts, as ecology, evolution, and other factors almost certainly interact and/or vary in influence depending on the clade, environment, and time period in question. The typical resolution of the fossil record itself also hinders us from evaluating whether Lilliput patterns are driven by microevolutionary/ecological or macroevolutionary forces. Finally, relying on “ecology” versus “evolution” generates contingency issues that cannot be resolved in the fossil record. 
For example, some Lilliputian taxa went extinct before they had a chance to return to their original body size, as in the case of the dicynodont Lystrosaurus (Botha-Brink et al. 2016; Botha 2020; Viglietti et al. 2021).

What Is the Lilliput Effect?

We propose that a common problem in the current definitions of the Lilliput effect is that they primarily focus on the pattern of general size decrease during an extinction interval, as opposed to the process(es) by which the size decrease was achieved. Related to this issue, the importance and utility of phylogenetic data have largely been overlooked in studies of the Lilliput effect (Harries and Knorr 2009). To resolve the confusion around the Lilliput effect, we propose two refinements that more explicitly include phylogeny and apply the age–size–phenotype framework to better focus on the process of size reduction. First, the Lilliput effect should be evaluated in a phylogenetic context as described earlier: within a single species, an anagenetic ancestor–descendant pair, a sister-species pair, or a small clade with a robust phylogeny. Second, the Lilliput effect should be categorized by proximal mechanisms underlying size reduction and other changes to phenotype rather than only documenting the pattern of smaller individuals or taxa during/after an extinction event. Relevant mechanisms include ontogenetic shifts, such as allometric repatterning, sequence repatterning, and heterochrony; evolutionary size-selective sorting processes, which can arise from size-selective extinction and/or origination; and potentially other non-evolutionary phenomena such as the Lazarus effect (Flessa and Jablonski 1983; Jablonski 1986; Fara 2001) as discussed by Twitchett (2007) and Harries and Knorr (2009). These proximal mechanisms likely share common ultimate causes, which include environmental stress and limited resources (Twitchett 2001, 2007). Furthermore, we reject the requirement that the Lilliput effect be “temporary,” as previously posited (Urbanek 1993; Twitchett 2007), because this requirement creates contingency problems with the fossil record.
Our revisions clarify what cases of size decrease count as Lilliputs, as well as how specific examples of the Lilliput effect may differ from one another. Under our revised definition, the Lilliput effect is an instance of size decrease in a single species, anagenetic lineage, or small clade observed during a widespread extinction event, diversity crisis, or other period of high taxonomic turnover compared with background intervals.

Ontogenetic Change and the Lilliput Effect

Ontogeny is a particularly interesting framework with which to evaluate the Lilliput effect, given that several examples of size decrease across mass extinctions recognize “immature” phenotypes (MacLeod et al. 2000; Forel et al. 2015; Botha-Brink et al. 2016; Botha 2020), including the original example outlined by Urbanek (1993). However, some authors posit that this size decrease is due to “developmental plasticity” rather than the Lilliput effect. Given that the Lilliput effect has been categorized as a pattern rather than a process, there is no reason to disregard size reduction via ontogenetic change as cases of the Lilliput effect. Furthermore, “developmental plasticity” is often improperly used in the paleobiological literature, as it refers to a specific process by which varying conditions in the macroenvironment of an organism prompt distinct developmental responses from the same genotype (Webster 2019). Hypotheses regarding changes in macroenvironmental sensitivity of a genotype (and thus of environmental canalization) cannot be unambiguously tested in the fossil record (Webster 2019). However, hypotheses regarding modification to the relationship between age, size, and phenotype can be (see “Ontogenetic Mechanisms of Body-Size Change”).

Ontogenetic change is an important yet overlooked mechanism in producing the Lilliput effect, particularly because there are cases in extant systems where environmental pressure produces comparable alterations to ontogeny. Fisheries provide a particularly compelling example of this phenomenon, as artificial stress has prompted rapid changes in fish populations. Harvesting the largest, oldest, and most fecund individuals, frequent harvesting, and climate change have led to rapid growth rates, smaller body sizes, earlier onset of sexual maturity, and other ontogenetic shifts in a variety of fish species (Krohn and Kerr 1997; Jørgensen et al. 2009; Frank et al. 2018; Morrongiello et al. 2019; Ayllón et al. 2021; Roy and Arlinghaus 2022). Habitat disturbance and climate change have produced similar size-reduction effects in a variety of taxa, including various species of fishes (Audzijonyte et al. 2016), salamanders (Caruso et al. 2015), toads (Vogel and Pechmann 2010; Cogălniceanu et al. 2021), ants (Gibb et al. 2018), birds (Van Buskirk et al. 2010; Weeks et al. 2020), and mammals (Smith et al. 1995; Post et al. 1997; Ozgul et al. 2009). Given these neontological data and emerging patterns in the fossil record, it is likely that environmental stress during extinctions would similarly prompt paedomorphosis and related ontogenetic responses from Lilliputian taxa (Harries et al. 1996).

To evaluate ontogenetic shifts as mechanisms to generate the Lilliput effect, we can return to the age–size–phenotype framework and compare parameters of ontogenetic trajectories before, during, and after an extinction. As discussed earlier, the Lilliput effect only requires a decrease in body size during an interval of heightened extinction or faunal turnover in a species, anagenetic lineage, or small clade, but a variety of mechanisms can produce it. Any of the mechanisms involving size decrease discussed earlier will constitute an example of the Lilliput effect by ontogenetic shift (Fig. 3). Along with identifying when ontogenetic change is a mechanism of the Lilliput effect, this framework will also allow us to recognize whether there are common modes of modification to ontogeny by which Lilliputian taxa are produced.

The abundance of many proposed Lilliputian taxa offers paleobiologists many potential opportunities to investigate the mechanisms underlying the phenomenon. The preservation biases of different taxonomic groups and taphonomic settings can limit the amounts and kinds of data available for age, size, and phenotype respectively. Nevertheless, for most groups for which size is available, data can be gathered for at least two, if not all three, axes of the model. In fact, some studies have attempted to quantify ontogenetic change in size-reduced populations, whether under the label of the Lilliput effect or not, including Cretaceous–Paleogene, Paleocene–Eocene, and late Eocene foraminifera (MacLeod 1990; MacLeod and Kitchell 1990; MacLeod et al. 1990, 2000), lingulid brachiopods of the EPME (Metcalfe et al. 2011), and non-mammalian therapsids of the EPME (Huttenlocker and Botha-Brink 2013, 2014; Botha-Brink et al. 2016; Botha 2020).

In a synthesis paper, MacLeod et al. (2000) presented a strong working example for evaluating the role of ontogeny in size decrease across extinctions, albeit solely in an allometric framework, and summarized work investigating the size and “shape” of planktic and benthic foraminifera during the Cretaceous–Paleogene mass extinction, the Paleocene–Eocene thermal maximum, and the late Eocene cooling. The authors utilized robust stratigraphic frameworks and compared allometric patterns before and after these three events. Although not called the Lilliput effect, their ontogenetic framing of size change can be usefully applied to future studies of the Lilliput effect. It should be noted, however, that some of their data suffer from dimensionality bias, for example, the use of test chamber number as their metric for phenotype to test for paedomorphosis in one case. Where appropriate, higher-dimensionality data, such as geometric morphometric shape data (utilized in a separate analysis), would serve as a richer test of conservation of phenotype across ontogeny and between species (Webster and Zelditch 2005). They also did not discuss how lacking a parameter for age affects their conclusions, and they somewhat conflated age with their phenotype metrics. Chamber count may be a better metric for age than phenotype for this reason, and in fact was used as a means of developmental staging in later work (Shi and MacLeod 2016).

Metcalfe et al. (2011) also provided a useful example with lingulid brachiopod size and growth lines in the earliest Triassic, comparing age and size axes, albeit with no measure of phenotype. The authors posited that the higher density of growth lines in the immediate post-extinction ‘Lingula’ specimens represents slowed growth and high instances of growth cessation as drivers of small body size rather than juvenile mortality. Without phenotypic data, their interpretation of their age and size data could mirror the mechanisms seen in either Figure 3F or I, where we see smaller individuals with an extended life span. Further data on the phenotypic development of ‘Lingula’, the periodicity of brachiopod growth lines, and the taxonomic affinities and phylogenetic relationships of Permo-Triassic ‘Lingula’ species are still needed for the example to fulfill all parameters of our age–size–phenotype model.

Studies of the non-mammalian therapsid Moschorhinus kitchingi during the EPME in the Karoo Basin, South Africa (Huttenlocker and Botha-Brink 2013, 2014), are especially promising for this framework, and indicate that this system is worthy of further investigation. In this research, the authors evaluated size and age metrics, and described data that could apply well to phenotype. Moschorhinus kitchingi was a common therocephalian during the EPME interval in the Karoo Basin, and its comparatively rich fossil record reveals that Triassic cranial and postcranial specimens of the organism are considerably smaller than their preextinction Permian counterparts (Huttenlocker and Botha-Brink 2013; Huttenlocker 2014). Furthermore, histological sections show that Triassic specimens present considerably fewer growth marks and appear to have grown more rapidly (Huttenlocker and Botha-Brink 2013). These data, combined with other histological metrics relevant to phenotype, indicate that M. kitchingi could represent a Lilliput effect due to ontogenetic change. The mechanism for size decrease in M. kitchingi could be the global progenesis outlined in Figure 3A or the non-heterochronic size decrease with corresponding decrease in age found in Figure 3E. The authors did not relate their data to an ontogenetic framework in the manner that we suggest here, but this could be accomplished in future work by including additional data for phenotype. Because many tetrapod crania from the Karoo Basin are preserved with varying degrees of taphonomic deformation (Kammerer et al. 2020), morphometric data may be unreliable in this system. Instead, methods that summarize ontogenetic phenotypic change using discrete data, such as ontogenetic sequence analysis (OSA) (Colbert and Rowe 2008) and ordination techniques like nonmetric multidimensional scaling (NMDS) (Griffin and Nesbitt 2016a), are more appropriate.
Moschorhinus kitchingi likely could fit into the age–size–phenotype framework with information for all three of these axes, so future work on this taxon is strongly encouraged.

Size-Selective Sorting and the Lilliput Effect

Besides ontogenetic shifts, size-selective sorting represents another mechanism important to the Lilliput effect. We use size-selective sorting rather than size-selective extinction, because the diversification of smaller taxa in the aftermath of extinctions should also fall within this mechanism. Both selection for the extinction of large forms and preferential origination/survival of small forms represent responses to a common environmental pressure, rather than two distinct mechanisms, as portrayed in prior studies of the Lilliput effect. In either case of size-selective sorting, factors such as limited resources, shifting food webs, and environmental stress could impose a size filter on a given group, resulting in extinction, migration, and/or low population numbers, all of which could produce an apparent size reduction (Twitchett 2001). At the beginning of an extinction event, the initial imposition of an environmental size filter would likely result in the extinction of large-bodied taxa. As the extinction proceeds, the size filter would remain in place, selecting for small-bodied taxa among survivors and the species that originate in the earliest stages of extinction recovery (Fig. 5).

Even though size-selective sorting involves no ontogenetic response, the age–size–phenotype framework can aid in assessing whether or not size reduction has arisen via ontogeny or a size-selective process. Rather than observing an ontogenetic response, in this case, we would expect to see the disappearance of larger members of a clade as well as surviving smaller members with relatively unchanged ontogenies across the course of their stratigraphic record (Fig. 5). Under this kind of size filter, we would also expect to see small-bodied remaining clades diversifying with the opening of ecological space (Twitchett 2007). The patterns of “removal” of large taxa and “addition” of smaller taxa are treated as two separate kinds of Lilliput effect under traditional definitions of the phenomenon, but here we view them as two patterns resulting from the same process, a filter against large body size in a given taxonomic group.
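The expected signature of a size filter, as opposed to an ontogenetic response, can be made concrete with a small bookkeeping sketch. Everything here is hypothetical: the taxon labels, the adult body sizes (arbitrary units), and the 5% tolerance used to flag a within-lineage decrease are invented for illustration, and a real analysis would also need phylogenetic context to establish the polarity of any change.

```python
import statistics

def summarize_turnover(pre, post, tol=0.05):
    """Compare adult sizes of taxa before (pre) and after (post) a crisis.

    pre, post: dicts mapping taxon name -> representative adult body size.
    A survivor counts as a within-lineage decrease only if its adult size
    dropped by more than the fraction tol.
    """
    survivors = sorted(set(pre) & set(post))
    return {
        "lost": sorted(set(pre) - set(post)),
        "originated": sorted(set(post) - set(pre)),
        "within_lineage_decrease": [
            t for t in survivors if post[t] < pre[t] * (1.0 - tol)
        ],
        "mean_pre": statistics.mean(pre.values()),
        "mean_post": statistics.mean(post.values()),
    }

# Hypothetical clade: large taxa A and B disappear, small taxon E originates,
# and survivors C and D keep their ancestral adult size.
pre_crisis = {"A": 120.0, "B": 80.0, "C": 30.0, "D": 25.0}
post_crisis = {"C": 30.0, "D": 25.0, "E": 20.0}

summary = summarize_turnover(pre_crisis, post_crisis)
print(summary)
```

In this toy clade, mean adult size drops from 63.75 to 25.0 although no surviving lineage changed size, which is the pattern expected from size-selective sorting alone. A non-empty `within_lineage_decrease` list would instead point toward an ontogenetic response in the survivors, to be examined with the age–size–phenotype framework.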

Numerous reported cases of the Lilliput effect (sensu lato) involve the disappearance of larger taxa and/or origination of small taxa. Most of these studies do not utilize a phylogeny to confirm the evolutionary polarity of these trends, however. For many groups, especially extinct invertebrates, this is simply because robust phylogenies are not yet available. Without the context of phylogenetic relationships, it is easy to misinterpret the polarity of size trends in the Lilliput effect, as outlined in “Phylogenetics and Size-Change” and later in the section dealing with phylogeny and the Lilliput effect specifically. However, some groups with well-represented Lilliputian members have recently received rigorous phylogenetic treatments. An excellent example is found among rhynchonelliform brachiopods of the EPME. The Lilliput effect has been extensively documented in rhynchonelliforms (Chen et al. 2005; He et al. 2007, 2010; Huang et al. 2010; Schaal et al. 2016; Chen et al. 2019), largely utilizing size metrics in the absence of a phylogeny, but two recent phylogenetic studies of rhynchonellide and spiriferinid brachiopods during the EPME and recovery provide data that could clarify Lilliputian mechanisms among rhynchonelliforms (Guo et al. 2020, 2021). Along with generating a phylogenetic tree for members of Rhynchonellida and Spiriferinida during the late Permian through the Late Triassic, Guo and colleagues also assessed metrics of size and phenotype in both groups. In the rhynchonellides, a tip-dated Bayesian approach was used to generate the tree, and shell ornamentation index and shell size were plotted on branch tips and used to generate ancestral-state reconstructions (Guo et al. 2021). The ornamentation index, which measures the coarseness, distribution, and strength of shell ribs, is variable among taxa and during ontogeny.
The authors highlighted that in taxa at or near the Permo-Triassic boundary, shell size is smaller and ornamentation less complex than in earlier and later members of the clade. They suggested this as evidence of “paedomorphosis,” but without additional phenotypic and age data, this could be interpreted as a case of clade-wide size-selective sorting, given that ornamentation index varies with both phylogeny and ontogeny (Guo et al. 2021). This approach is applicable to similar systems, such as the dataset of South China brachiopods presented in Chen et al. (2019). Although the latter study boasts an impressive sample size (n = 3,316), it lacks a phylogeny and other metrics for phenotype aside from body size. Expanding the recently developed phylogeny to the rhynchonellids in their dataset (which would likely entail extensive taxonomic revision) in addition to comparative analysis of size and ornamentation index, or other measures of phenotype that better differentiate phylogenetic and ontogenetic patterns, would provide a more rigorous description of the Lilliputian patterns and the mechanisms that are driving these patterns.

An additional direction for future work on size sorting and the Lilliput effect includes the coincidence of Lilliputian taxa and Lazarus taxa, which has been noted since Urbanek (1993) but rarely formally studied. The Lazarus effect describes an apparent extinction and subsequent reappearance of a given taxon within a specified time interval (Flessa and Jablonski 1983; Jablonski 1986). Prior work has discussed the potential relationship of Lazarus taxa and the Lilliput effect as a case where large taxa go unsampled due to lower population numbers only to reappear in samples during ecosystem recovery (Twitchett 2007; Harries and Knorr 2009). In the context of mass extinctions, Lazarus taxa may represent poor sampling, low population sizes, and/or migration to more favorable, though unsampled, environments or refugia (Jablonski 1986; Harries et al. 1996; Fara 2001; Twitchett 2001). Furthermore, because the Lazarus effect reflects biases of the fossil record, it is not an evolutionary phenomenon. The relationship of the Lilliput effect to Lazarus taxa has not yet been empirically studied. However, a study by Payne (2005) exploring body size among Lazarus gastropods during the EPME indicates that Lazarus taxa were dominated by small species and not large-bodied forms. Nevertheless, given the interest in the theoretical relationship of the Lilliput effect and Lazarus taxa, this topic deserves further scrutiny of other taxonomic groups and time periods.

Because individual species are characterized by particular patterns of ontogenetic development, it is logical to discuss ontogenetic shifts and size-selective mechanisms as processes acting in a single species or ancestor–descendant context. However, it is expected that these mechanisms also operate in concert at larger phylogenetic scales (Fig. 5). Size decrease within an ancestor–descendant pair, sister-species pair, or small clade (e.g., congeneric species) is easier to evaluate due to simpler phylogenetic relationships, typically shorter stratigraphic and temporal ranges, and a higher likelihood of shared phenotype. In addition, as discussed in the earlier section on phylogeny, the kinds of information preserved in the fossil record often limit our ability to adequately describe ontogeny for a wide variety of fossil taxa in a single phylogeny. Regardless, phylogeny is key in studying mechanisms of size change in the fossil record. Without understanding the phylogenetic polarity of size trends, we cannot determine whether size change is an ontogenetic response, an instance of size sorting, or a result of biases in the fossil record. For the Lilliput effect, this means being able to identify whether small body size is the derived or ancestral condition and thus properly determine the mechanism of size change. Phylogeny is also important in accounting for stratigraphic incompleteness, including ghost lineages and Lazarus taxa. In the case of both biases, phylogenetic context can help indicate where size decrease occurs in a clade's history and whether it coincides with a particular biotic crisis. Particular care must be taken when evaluating Lilliput effects at the genus or higher clade levels to avoid mischaracterizing which mechanisms of the Lilliput effect are at work, and whether a Lilliput effect occurred at all.
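The dependence of polarity on tree topology can be illustrated with a minimal Python sketch. It applies the downpass of Fitch parsimony to a binary size class (large or small) on three alternative rooted topologies and reports whether small size at the clade root is ancestral, derived, or ambiguous. The taxa (`P1`, `P2`, `T1`, `T2`), their size classes, and the topologies are hypothetical and do not represent any published phylogeny. The rooting is assumed, and no outgroup information is used. Fitch parsimony is one simple choice among many ancestral-state methods.

```python
def fitch_downpass(node, tip_states):
    """Return (possible root states, minimum changes) for a subtree.

    Trees are nested 2-tuples; tips are strings keyed in `tip_states`.
    """
    if isinstance(node, str):
        return {tip_states[node]}, 0
    left, right = node
    left_set, left_changes = fitch_downpass(left, tip_states)
    right_set, right_changes = fitch_downpass(right, tip_states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_changes + right_changes
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_changes + right_changes + 1


def polarity_of_small(tree, tip_states):
    """Classify small size at the root as ancestral, derived, or ambiguous."""
    root_set, changes = fitch_downpass(tree, tip_states)
    if root_set == {"small"}:
        return "ancestral", changes
    if root_set == {"large"}:
        return "derived", changes
    return "ambiguous", changes


# Invented size classes: two large older taxa, two small younger taxa.
sizes = {"P1": "large", "P2": "large", "T1": "small", "T2": "small"}

topologies = {
    "small taxa nested": ("P1", ("P2", ("T1", "T2"))),
    "small taxa stemward": ("T1", ("T2", ("P1", "P2"))),
    "reciprocal sister clades": (("P1", "P2"), ("T1", "T2")),
}

for label, tree in topologies.items():
    verdict, changes = polarity_of_small(tree, sizes)
    print(f"{label}: small size is {verdict} ({changes} change)")
```

The same tip data yield three different polarity inferences depending on topology alone, which is why stratigraphic order by itself cannot distinguish within-lineage size decrease from size-selective sorting.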

A good illustration of the importance of phylogeny comes from the dicynodont therapsid genus Lystrosaurus in the Karoo Basin, South Africa, during the EPME. Lystrosaurus is known from an impressively large sample size for a terrestrial vertebrate (Smith et al. 2012), spanning a stratigraphically, geographically, and ontogenetically wide range of specimens that exhibit a clear size decrease during the EPME (Botha-Brink et al. 2016; Fig. 6). Four species of Lystrosaurus are currently recognized in the Karoo Basin: the primarily Permian Lystrosaurus maccaigi and Lystrosaurus curvatus, and the Triassic Lystrosaurus declivis and Lystrosaurus murrayi (Grine et al. 2006; Botha-Brink et al. 2016; Botha 2020; Viglietti et al. 2021). Recent histological studies show that, in addition to size decrease, Triassic Lystrosaurus long bones exhibit fewer growth marks (<2) than their Permian counterparts (>3) (Botha-Brink et al. 2016). Even the largest Triassic specimens only present a few LAGs and appear to lack the thick bone cortices seen in Permian specimens, considered indicative of later stages of phenotypic maturity (Botha 2020; Fig. 7). Previous researchers proposed that Triassic Lystrosaurus populations suffered high levels of juvenile mortality, and therefore were younger and smaller and grew more rapidly to achieve reproductive maturity and equivalent body sizes as compared with their Permian counterparts (Botha-Brink et al. 2016). Similar to data from Moschorhinus kitchingi, these data apply to the size and age axes of the ontogenetic framework, and discrete metrics, such as OSA and NMDS, could be used to quantify phenotype. Unlike M. kitchingi, however, Lystrosaurus is represented by multiple species, and thus inference of Lilliput mechanisms requires phylogenetic context.

Assessing the Lilliput effect in Lystrosaurus from a classic, stratigraphic perspective suggests ontogenetic shifts as the primary mechanism, as the older, larger species (L. maccaigi, L. curvatus) occur in the late Permian and the younger, smaller species (L. declivis, L. murrayi) occur in the Early Triassic (Fig. 8A). Current phylogenetic information offers little resolution, supporting either size selectivity (Fig. 8B,D) or ontogenetic shifts (Fig. 8C), depending on the study (e.g., Cox and Angielczyk 2015; Kammerer 2019; Angielczyk et al. 2021). Part of this uncertainty stems from the fact that phylogenetic relationships among Lystrosaurus species are recovered from phylogenetic studies of anomodont therapsids as a whole. These studies were intended to resolve broader issues of anomodont phylogeny and thus focus less on the specific relationships among Lystrosaurus and related dicynodontoids. Lystrosaurus species diagnoses are also contentious and suffer from subjectivity and inconsistency (Cluver 1971; Brink 1986; Ray 2005; Grine et al. 2006), so species boundaries themselves remain in question along with their relationships to one another. Furthermore, in the phylogenies that appear to support size selectivity (e.g., Cox and Angielczyk 2015; Angielczyk et al. 2021), the characters that place L. declivis and L. murrayi stemward and L. maccaigi and L. curvatus as later diverging appear to be ontogenetically variable within the genus. Interestingly, a likely new species of lystrosaurid from the Luangwa Basin of Zambia (Kammerer et al. 2020) is represented by a partial growth series in which adult crania resemble a more ancestral dicynodontoid morphotype, whereas the juveniles look very similar to the morphologically distinct Lystrosaurus, further pointing to potential ontogenetic mechanisms in operation through the evolution of Lystrosauridae. Given that interpretations of the Lilliput mechanism change greatly with different phylogenetic hypotheses for Lystrosaurus species, additional work on the alpha taxonomy and phylogenetic relationships of Lystrosaurus is crucial to properly characterize the Lilliput effect in this genus.

Despite the frequent invocation of the Lilliput effect in the literature, it continues to be poorly understood nearly 30 years after it was first discovered. Defining the effect in terms of stratigraphic faunal patterns and body size alone has led to confusion about what phenomena and taxa qualify as examples. As we reframe it, the Lilliput effect describes a pattern of size decrease observed in a single lineage or clade during periods of widespread extinction, diversity crisis, or high turnover. Framing studies of the Lilliput effect through the lens of ontogeny and phylogeny, utilizing the classic heterochrony framework, allows it to be studied in terms of mechanism and process and not body-size patterns alone. Although metrics for age, size, and phenotype, as well as robust phylogenies, are challenging to gather and construct, examples exist where all of these parameters are feasibly quantified. Approaching the Lilliput effect in this way offers the promise of much better insight into the processes by which environmental stress during biotic crises causes reductions in size, as well as the degree to which results from any particular case can be generalized to other extinctions, including modern anthropogenic extinctions. Where “complete” data are not available, this framework better highlights the limits of our knowledge and directions for future investigation.

Applications of heterochrony and other mechanistic approaches to body-size change do not stop at the Lilliput effect. The improved rigor they bring is also relevant to other body-size phenomena in the fossil record. These might include clade-level patterns such as Cope's rule, latitudinal trends such as Bergmann's rule, and biogeographic trends such as Foster's (island) rule. All of these phenomena are widely documented but subject to ongoing debate, and improved ontogenetic and phylogenetic context could provide new insights. This framework also need not be limited to the fossil record, and additional data and rigor could be gained by extending it to neontological studies of size phenomena. Rigorous application of the heterochrony framework to problems of change in body size has the potential to reinvigorate study of these macroevolutionary phenomena for a variety of clades and evolutionary scales.

We thank S. Kidwell, J. Botha, R. Ng, and the Angielczyk Lab members for their discussion and comments to improve the article. We would also like to thank J. Botha for providing histological images of Lystrosaurus long bones included in the figures. We thank the two reviewers for their helpful comments toward the improvement of this article. C.P.A. would like to thank I. Magallanes for moral support on this ontogenetic odyssey and V. Magallanes-Abbott for only trying to chew notes and books a few times.

The authors declare no competing financial, professional, or personal relationship interests in this article.