Recurrent hierarchical patterns and the fractal distribution of fossil localities

Understanding the spatial structure of fossil localities is critical for interpreting Earth system processes based on their geographic distribution. Coordinates of marine and terrestrial sites in the conterminous United States for 17 time bins were analyzed using point pattern statistics. Lacunarity analysis shows that the spatial distributions of sites are fractal for almost every studied interval, indicating that clumping of localities occurs at multiple scales. Random hierarchical multiplicative processes provide a theoretical null model for the distribution of collecting sites, consistent with their occurrence being a complex product of numerous biological, geological, and anthropogenic processes acting at many spatial and temporal scales. Mechanistic models for the formation, preservation, and exposure of fossil localities and other geologic entities can be tested using point pattern and related spatial statistics.


INTRODUCTION
Due to their importance in forming the framework for the interpretation of Earth system history, numerous attempts have been made to quantify and model temporal patterns of fossil and sedimentary deposit occurrence (Sadler, 2004;Peters, 2006;Smith and McGowan, 2011;Kemp, 2012;Miall, 2015;Holland, 2016).Many of these studies indicate that much of the stratigraphic record is self-similar (fractal) in time (Bailey and Smith, 2005;Schlager, 2010;Kemp, 2012).
Less attention has been paid to the spatial patterns of fossil sites (Peters, 2006;Peters and Heim, 2010), although the spatial location of a fossil collection is as critical a datum as its age for paleoclimatology and paleogeography and for interpreting macroecological and biogeographic aspects of evolutionary change in the fossil record (Alroy, 1999;Barnosky et al., 2011;Marcot, 2014).Most analyses of their distribution have focused on the relationships between the distribution of fossil localities and biodiversity estimates, at either biogeographic or regional scales (Dunhill et al., 2013;Vilhena and Smith, 2013;Foote, 2014).There is also related interest, going back at least to Raup (1972), of the impact of sampling bias on patterns of spatial distribution (Kiessling, 2005).Despite the growing conceptual understanding of the controls over fossil preservation (Noto, 2010;Block et al., 2016), there are currently no spatially explicit models for the formation and preservation of fossil localities, although steps have been taken in this direction (Kemp, 2012;Miall, 2015;Block et al., 2016;Holland, 2016).There are also no statistical models for the distribution of localities that can be used for data-model comparisons to test the validity of any emerging mechanistic models.
Here I use lacunarity, a spatial point pattern analysis (SPPA) method, to characterize the spatial distribution of fossil localities.SPPA methods have become a standard tool for ecologists to describe the distributions of organisms and to either derive or test hypotheses on underlying mechanisms controlling these patterns (Velázquez et al., 2016).For example, the observed patterns can be compared with the predictions of various stochastic (null) models, to determine whether the distribution of points differs from random expectation.Here I use lacunarity analysis to compare observed spatial distributions of localities with two stochastic models: complete spatial randomness, in which the fossil sites are located randomly and independently following a Poisson process, at a single scale within the mapped area (Young and Young, 1998;Velázquez et al., 2016), and a hierarchical (multifractal) model where the distribution is produced by the interaction of multiple processes acting randomly at different spatial scales (Plotnick and Sepkoski, 2001;Cheng, 2014).

METHODS
Latitudes and longitudes in degrees of fossil collections of all taxa from the conterminous United States in 17 chronostratigraphic bins were downloaded from the Paleobiology Database (https://paleodb.org;Table 1) and were parsed into terrestrial or marine depositional environments (Fig. DR1 in the GSA Data Repository 1 ).Restricting the material to this region assured that the mapped area was contiguous and well  sampled (Kiessling, 2005).Collections with duplicate coordinates were culled to remove the effect of multiple collections at the same coordinates, which might represent multiple stratigraphic units at the same location or an artifact of data entry within the Paleobiology Database (Peters and Heim, 2010).The remaining coordinates were then mapped on 0.1° resolution grids encompassing the longitude and latitude range of points.The resulting maps have an average of 94,000 grid cells with a mean of 684 cells occupied by at least one collection (Table 1).
The mapped grids of the localities were analyzed using lacunarity analysis (Plotnick et al., 1996).Lacunarity (L) measures the amount of clustering in a pattern exhaustively resampled within square boxes of linear size l, with higher values of L indicating larger gaps among the points.Different spatial patterns of clusters and gaps at multiple scales of l are represented by plots of log L versus log l.The horizontal position of the curve is determined by the overall density of points; denser patterns have lower L values.The shape of the curve is determined by the patterns of distribution of the points at multiple scales (Fig. 1).Complete spatial randomness produces a concave curve, whereas fractal and multifractal (self-similar) patterns produce a straight line with a negative slope (Plotnick et al., 1993;Cheng, 1997;Lennon et al., 2015).Lacunarity analyses of 50 randomized distribution of sites were determined for the marine Upper Cretaceous and the terrestrial Miocene.The majority of the analyses were conducted using PASSaGE v. 2.0 (Rosenberg and Anderson, 2011; cross-checks were done with software I wrote).Further details of the methods are in the Data Repository.

RESULTS
Log-log plots of lacunarity against distance plot exactly or nearly along straight lines, as indicated by linear regressions (Table 1) for all of the time intervals and for both marine and terrestrial localities (representative plots are in Fig. 1; the complete set is in Figs.DR2 and DR3 in the Data Repository).This is consistent with self-similar patterns of spatial distribution (Plotnick et al., 1993(Plotnick et al., , 1996)); that is, the distribution of the points forms random clumps at multiple scales (clumps of clumps of clumps).Randomized distributions of the same number of points (Fig. 1) produce hollow curves with little variation among them and well separated from the data curves.Complete spatial randomness can be rejected; this result is supported by separate analyses using other SPPA methods, in particular Ripley's L(d) second-order statistics (see the Data Repository).
It is remarkable that results did not differ between terrestrial and marine localities, despite their markedly different probabilities of preservation (Heim and Peters, 2011;Holland, 2016).The signal also appears to be robust against variation in the temporal durations of the bins and the number of localities in each.The lowest correlations and greatest deviations from strict linearity are for the marine Devonian and the terrestrial Pleistocene; the former probably represents the dense recording of sites from the Devonian of New York, whereas the latter may result from similar compact clustering of glacial margin localities.

DISCUSSION
The presence or absence of a fossil collection at a given locality is dependent on the intersection of a variety of generally independent phenomena acting at multiple spatial and temporal scales.These hierarchical controls include the formation of sedimentary basins, the presence of sufficient accommodation space and the proper environment for fossil formation in those basins (Holland, 2016), tectonic and erosional processes controlling the survival and exposure of the deposits (Peters and Heim, 2010), a range of taphonomic variables (Noto, 2010;Block et al., 2016), and human factors such as the presence of roadcuts or quarries (Dunhill, 2011;Dunhill et al., 2013).Each of these controls can be represented as a form of spatially dependent probability distribution of fossil formation and preservation.For a locality to exist, all of the probabilities acting in that region must be high.
A simple numerical model for this is based on the multiplicative hierarchical (multifractal) process (Figs.2A-2C) (Plotnick and Sepkoski, 2001;Cheng, 2014).A map grid is divided into four equal areas, each of which is randomly assigned a Gaussian probability of fossil formation.This is the control acting at the largest spatial scale.Each of these map areas is again divided into four subareas (total of 16), each of which is in turn assigned a random probability, representing a control acting at the next finer scale.This process is now repeated to the desired finest level of resolution.The probabilities from each scale are now multiplied together to produce the log-normal total probability of preservation at a site, which will have a multifractal spatial distribution.If we assume that there is a minimum level of probability that will produce a recorded fossil collection, the result is a random fractal point pattern.The lacunarity plots for this model are virtually the same as for the empirical data (Fig. 2D).This random model might thus provide an appropriate theoretical model for the distribution of fossil localities (Holland, 2016).
The hierarchical and possibly fractal distribution of habitats and organisms has long been recognized in ecology (Williamson and Lawton, 1991).Many of the constraints on habitat occurrence, such as the presence of different soil types related to underlying lithology, are under the same types of controls as the occurrence of fossil collections.These original fractal patterns of organism distribution might be preserved in the rock record.Similarly, the road network in the United States is fractal (Kalapala et al., 2006); work in progress suggests that proximity to roads is a key factor in locality distribution.The location of roads is also at least partly controlled by the underlying geology.There thus may be a form of common cause (Peters and Heim, 2011) for ecological, geological, and anthropogenic spatial patterns.

CONCLUSIONS
The location of a collecting site, whether modern or fossil, is the fundamental unit of data for many ecological, macroecological, and evolutionary studies (Beck et al., 2012) and provides the geometric support (Tipper, 2016) for almost all other spatially explicit types of data.The geographic ranges and area of both living and fossil organisms are abstractions based directly on observed occurrences (Gaston, 2003) and/or predicted from environmental controls on their distribution (Myers et al., 2015).These ranges, in turn, are key components of many analyses of evolutionary and extinction dynamics (Barnosky et al., 2011;Foote, 2014;Marcot, 2014).This study shows that for fossil organisms, these sites are clumped at all scales or, alternatively, there are gaps of all scales among the sites.In terms of geographic ranges, these results suggest that the preserved range is not a simple random sample of the original range; instead, certain parts of the range are heavily sampled and others are missing.For example, although beavers are found in every county in Virginia, their fossil remains are found only in cave sites in the western region of the state.The extent to which preserved spatial patterns reflect original biological structure, geological overprint, and anthropogenic factors or a combination of all three needs to be considered prior to testing spatial models of ecological and evolutionary processes.
In sum, despite the tremendous spatial and temporal variability in the nature of the rock and fossil record (Peters, 2006), there are clear commonalities in the spatial configuration of fossil localities from across the Phanerozoic.These patterns are readily modeled by assuming that the controls on their occurrence are multiscaled, hierarchical, and multiplicative.Previous studies have demonstrated the fractal nature of sedimentary processes in time (Kemp, 2012;Miall, 2015), with the resulting stratigraphic record characterized by gaps at all scales produced as an inevitable product of depositional and erosional processes (Tipper, 2016).The current study demonstrates for the first time that this pattern is also valid in space.The development of detailed mechanistic models to explain the spatial distribution of fossil sites is still in its early stages (e.g., Block et al., 2016) and should include the range of stratigraphic, tectonic, climatic, taphonomic, and even anthropogenic factors that could control the distribution of sites.
The output of these models will need to be compared, at some point, to the data to test their validity.Lacunarity and related spatial statistical methods should be an important approach to these data-model comparisons (Plotnick, 1999).

ACKNOWLEDGMENTS
Figure 1.Lacunarity curves of locality distribution for representative time periods.A: Marine.B: Terrestrial.Values are connected in case-wise order.The correlation coefficients from a linear regression for each curve are given.All curves are nearly straight, consistent with a random fractal distribution of localities.Also shown are representative lacunarity curves for randomized patterns over the same spatial extent and the same number of points as the marine Upper Cretaceous and the terrestrial Miocene data.The hollow curves are characteristic of random patterns made of independent points (complete spatial randomness).Remaining curves are in Figures DR1 and DR2 (see footnote 1).
Figure 2. Hierarchical multiplicative model produced by applying a threshold to a random hierarchical multiplicative cascade (Cheng, 2014; Plotnick and Sepkoski, 2001).A: Transects (11) divided into progressively smaller divisions.Each division is assigned a Gaussian random number representing probability of preservation produced by a process acting at that scale.B: Light blue line is the product of probabilities from A, representing the total probability of preservation resulting from all processes.An arbitrary cutoff is applied, produced a point distribution of fossil sites.C: Simulated map of collection site distribution, produced by two-dimensional version of same process (cf.Fig. DR1 [see footnote 1]).D: Lacunarity plot for the simulated map (cf.Fig. 1 and Figs.DR2 and DR3).

TABLE 1 . NUMBER OF 0.1° GRID CELLS, NUMBER OF SITES, AND PRODUCT-MOMENT CORRELATIONS BETWEEN LOG 10 -DISTANCE AND LOG 10 -LACUNARITY FOR SELECTED TIME BINS Time bin Marine Terrestrial Number of grid cells (0.1° square) Number of sites r log10-distance, log10-lacunarity Number of grid cells (0.1° square) Number of sites r log10-distance, log10-lacunarity
Note: Each site is a longitude-latitude pair of collections downloaded from the Paleobiology Database (https://paleobiodb.org/).All r values have p < 0.0001.