## Abstract

Over the past several decades, the development of highly efficient and low-cost techniques for the acquisition of U-Pb ages has led to the rapid expansion of detrital zircon analysis and interpretation. Multidimensional scaling (MDS) is a potentially powerful approach for visualizing and interpreting such age data. This study suggests that for any study applying MDS to detrital zircons: (1) the choice of representing age distributions as either probability density plots or kernel density estimations offers no particular advantage when subjected to many inter-sample comparison measures of grain ages; (2) among the various approaches for such sample pairwise comparisons, likeness, similarity, the Kuiper test statistic, and the Sircombe-Hazelton metric are more effective than the Kolmogorov-Smirnov test statistic or the cross-correlation coefficient; (3) variable proportions of both shared and unique age components have significant effects on MDS mapping; and (4) three-dimensional MDS projections are often superior to the two-dimensional projections commonly used in detrital provenance studies. We demonstrate these findings by representing age distributions of samples from platform, basin, and passive margin settings of North America as colored surfaces defined on the basis of sample depositional age, geographic position, and importance of different detrital zircon components, and then compare these with three-dimensional MDS maps. These approaches suggest that the application of MDS techniques to detrital zircon data affords easily attainable and significant advantages in the geologic interpretation of detrital grain ages.

## INTRODUCTION

Zircon is a common accessory mineral in clastic sedimentary deposits and is widely used for provenance analysis. Its durability, resistance to chemical and physical weathering, high initial concentrations of U and Th, and low initial concentrations of Pb make it a robust geochronometer and a useful mineral for sedimentary provenance studies. The widespread use of single-grain detrital zircon age data is largely due to the development of rapid and inexpensive age acquisition techniques (e.g., Gehrels, 2000; Gehrels et al., 2008) to determine ages for hundreds of zircon grains per day (e.g., Pullen et al., 2014). This volume of data, however, presents challenges for its management, visualization, and interpretation. Efficient, effective, and meaningful procedures to compare and evaluate zircon U-Pb age distributions have, therefore, garnered considerable attention (e.g., Sircombe and Hazelton, 2004; Vermeesch, 2013, 2017; Saylor and Sundell, 2016; Sundell and Saylor, 2017).

Differences in sample grain age frequency distributions are often qualitatively evaluated by visual comparison, but the increasing size of such data sets has resulted in a growing desire for more robust characterization of pairwise differences. Methodologies successfully employed thus far include quantification of inter-sample variations (Gehrels, 2000; Amidon et al., 2005a, 2005b; Saylor et al., 2012; Satkoski et al., 2013); mixture modeling (e.g., Sambridge and Compston, 1994; Licht et al., 2016; Sundell and Saylor, 2017; Sharman and Johnstone, 2017), hierarchical cluster analysis (e.g., Sircombe and Hazelton, 2004), principal component analysis (e.g., Sircombe, 2000; Fedo et al., 2003), and age spectrum deconvolution (Sambridge and Compston, 1994). Here, we focus on the multidimensional scaling (MDS) approach (e.g., Vermeesch, 2013).

The multivariable ordination method, or multidimensional scaling, is a quantitative comparison technique that enhances the capability to visualize variation among samples based on quantified pairwise comparisons of their zircon ages (Vermeesch, 2013). Here, using zircon ages from both synthetic-model and real-world data sets, we: (1) consider different representations of sample ages and metrics of intra-sample disparities; (2) examine strengths and limitations of these approaches by comparison of different metrics and by exploring their impact on MDS ordination; (3) assess the impact of variable proportions of shared and different age components on MDS; and (4) evaluate the relative merits of two-dimensional (2-D) and three-dimensional (3-D) MDS projections in detrital provenance studies. We then apply this methodology to detrital zircon age distributions of samples from platform, basin, and passive margin settings of North America (zircon U-Pb ages and sample locations can be found in Table DR1^{1}). Throughout, we employ the terms “age component” to represent a suite of grain ages comprising a single age mode, “age population” as a suite of grains encompassing one or more age components, and “sample” as ages determined for a limited number of grains drawn at random from some population of assumed infinite size. In this construct, an age component is analogous to a single Gaussian age mode, whereas age populations comprise one or more Gaussian components typically found a sand or sandstone sample.

## MULTIDIMENSIONAL SCALING

Multidimensional scaling (MDS) is a typically iterative ordination technique that graphically represents the relative dissimilarities among samples arrayed in N-dimensional space. It is not new to geologic applications (e.g., Doveton, 1976; Hayward and Smale, 1992; Hounslow and Morton, 2004; Honarkhah and Caers, 2010); however, more widespread application to detrital zircon data sets is relatively recent (e.g., Vermeesch, 2013; Spencer and Kirkland, 2015; Vermeesch and Garzanti, 2015; Arboit et al., 2016; Saylor et al., 2017; Xu et al., 2017). Although a detailed description of the approach is beyond the scope of this paper, we offer a brief overview.

### General Considerations

In detrital geochronology, MDS transforms a matrix of pairwise similarities among zircon age distributions into Cartesian coordinates in 2-D or 3-D space, with the distance between sample points representing the degree of dissimilarity to one another. Greater differences among inter-sample ages lead to a greater spread of points on an MDS “map.” MDS serves to reduce complex age spectra into relatively simple visualizations that describe the variation between sample age frequency distributions. Because *n* variables are typically collapsed into a 2-D or 3-D space, some distortion must occur; the degree of this distortion is typically measured by a goodness-of-fit statistic. Distances in MDS maps are unitless and orientations are arbitrary; only the relative proximities and distances between points have meaning.

*f*) into a disparity matrix of fitted Euclidean (straight-line) distances (

*d*) which, for two samples

*i*and

*j*, can be represented as:

*f*(δ

_{ij}) is a monotonically increasing function that transforms the dissimilarities to “disparities” or fitted distances. MDS uses these disparities to produce a configuration of points in two (e.g., Vermeesch, 2013) or three (Saylor and Sundell, 2016) dimensions. Two variants on the approach are nonmetric and metric MDS. In nonmetric MDS, the ranks of the dissimilarities (ordinal data, not the absolute dissimilarities) are used to determine the ordination, which is iteratively calculated (e.g., isotonic regression; Kruskal, 1964a, 1964b). In metric MDS, a superset of classical MDS (Torgerson, 1952), absolute values of the dissimilarities are used to determine MDS configurations. Metric MDS simultaneously calculates configurations of the disparities matrix and the fits of the transformation via Eigenanalysis (Torgerson, 1952; Borg and Groenen, 2005). If

*f*(δ

_{ij}) = δ

_{ij}and the dissimilarities are metric, then the configuration of points can be solved via linear algebra (Carroll and Arabie, 1980).

How well the MDS configuration preserves the original inter-sample dissimilarities is evaluated by a loss function. In nonmetric MDS, the loss function is minimized numerically to optimize the configuration solution. For metric MDS, the loss function is calculated simultaneously with the calculated configuration of *f*(δ); a commonly used loss function is stress (S; Kruskal, 1964a). In addition to loss functions, Shepard plots display inter-point distances (*d*) against the non-transformed dissimilarities (δ) and are often used to evaluate the goodness of an MDS fit. In a Shepard plot, the closer points fall along a 1:1 line for metric scaling or along the nonparametric function for nonmetric scaling, the better the MDS solution represents the differences among samples. A strength of MDS is that neither the choice of metric or nonmetric MDS nor the choice of the loss function drastically changes the resulting transformation for detrital sample dissimilarities. However, metric MDS provides a global solution that will not potentially collapse into a local optimum (stress minimum) that can complicate finding the optimal solution. For simplicity, we therefore focus on the application of metric MDS and utilize the stress (*S*) loss function.

### Variation in Age Frequency Distributions

With this as a general background, we first explore two aspects of detrital zircon data that potentially impact characterizing spatial and temporal differences in zircon populations using MDS. These are: (1) the implementation of one of several methodologies for representing the relative abundances of different age components in an individual sample, and (2) the use of one of several quantitative measures of similarity-dissimilarity between pairs of sample ages. Once an approach to the representation of relative age frequencies and an approach to quantify inter-sample similarities are adopted, MDS can be used to visualize the inter-sample variability.

#### Representation of Sample Ages

Intra-sample abundances of grain ages can be represented as age frequency distributions (Fig. 1A; e.g., Machado et al., 1996), probability density plots (PDPs; e.g., Dodson et al., 1988, Ludwig, 2003; Fig. 1B), or as kernel density estimations (KDEs; e.g., Silverman, 1986; Vermeesch, 2012; Fig. 1C). Both PDPs and KDEs strive to identify the proportional contribution of one or more age components (e.g., Fig. 1D) within a given sample, each presumably representing some time interval during which zircon crystallized. Each necessitates the imposition of certain assumptions and biases.

A PDP function specifies the probability that a random variable will occur within a particular range of values; it is nonnegative and is normalized to unity. Determination of PDPs involves the summed probabilities of measured grain ages and their associated Gaussian analytical uncertainties (errors). Implicit in the method is that calculated age abundances (e.g., probability density) are related to analytical error (Fig. 1B). A KDE is a nonparametric estimate of the probability density function of a random variable but discards uncertainties associated with data acquisition. The calculation of KDEs also involves the summed probabilities of measured grain ages but utilizes some specified “bandwidth” (e.g., Fortmann-Roe et al., 2012) rather than an analytical error (Fig. 1C). However, the selection of bandwidth is critical, as different choices may lead to strikingly dissimilar density estimates. Bandwidth can be constant; Botev (2007) presents a diffusion-based approach for its approximation that minimizes the asymptotic mean squared error of the estimate (“kernel density estimation”). Bandwidth can also vary (inversely) with data density over some local sample space; these are referred to as “locally adaptive variable bandwidth estimations” or “kernel density functions” if the bandwidth varies inversely to the data point analytical uncertainties (KDFs; e.g., Sircombe and Hazelton, 2004; Vermeesch, 2017).

### Calculation of Differences among Sample Ages

Vermeesch (2013) outlined the requirements for good measures of age population dissimilarity, positing that the chosen metric should be: (1) non-negative; (2) symmetric (δ_{ij} = δ_{ji}); and (3) exhibit triangular inequality (δ_{ik} ≤ δ_{ij} + δ_{ik}). A variety of approaches that meet these criteria to varying degrees have been employed to quantitatively represent the differences in grain age distributions among samples pairs. Several metrics have been employed to assess inter-sample similarities. This includes the cross-correlation coefficient, which is the coefficient of determination (R^{2}) of a cross-plot of age frequencies, PDPs, or KDEs of two samples (e.g., Saylor et al., 2012, 2013; Saylor and Sundell, 2016). It is similar to quantile-quantile or probability-probability plots (e.g., Wilk and Gnanadesikan, 1968), except PDPs or KDEs, rather than cumulative distribution functions, are typically used in the cross-plot. Identical age spectra yield an R^{2} of 1; those that share no age peaks yield an R^{2} of 0. An alternative metric is the nonparametric two-sample Kolmogorov-Smirnov (K-S) test. Used to evaluate the null hypothesis that variation between two age populations is within the expected variation assuming random sampling of a parent population (Guynn and Gehrels, 2010; Razali and Wah, 2011), the K-S test is based on the K-S statistic, which is the maximum difference between the empirical cumulative distribution functions of two samples. The Kuiper’s test statistic (Kuiper, 1960; Press et al., 2007; Saylor and Sundell, 2016) is closely related to the K-S statistic but represents the sum of the absolute sizes of the most positive and most negative differences between the two cumulative distribution functions being compared. Similarity is a measure of resemblance between two PDPs or KDEs (Gehrels, 2000) and is calculated by summing, over the time interval of interest and at some temporal scale of resolution, the square root of the product of each pair of common probabilities. It yields a value of 100% for identical curves and a value of 0% for those that share no age components. Likeness is a complement of the area mismatch metric of Amidon et al. (2005a, 2005b). It represents the percent of “sameness” between two unitized PDPs or KDEs (e.g., Satkoski et al., 2013) and is calculated as one minus the summation of absolute differences between all PDP or KDE values divided by two. Values range from 100 to 0%. The Sircombe-Hazelton dissimilarity measure assesses the “distance” between age distributions based on their PDPs or KDEs. The difference between two age distributions is measured as the integral of the squared difference of the two functions when plotted on the same scale (Sircombe and Hazelton, 2004; Vermeesch et al., 2016).

Given the nontrivial number of approaches that have been taken to quantify intra- and inter-sample age differences, one might anticipate arriving at some sort of consensus. That this is not the case is illustrated by two recent papers dealing specifically with quantifying comparison of large data sets (Saylor and Sundell, 2016) and dissimilarity measures (Vermeesch, 2017) in detrital geochronology. In a study examining the effects of increasing the number of ages selected from a single population, Saylor and Sundell (2016) suggest that metrics for differentiation of sample zircon ages should: (1) increase with increasing sample size; (2) maximize sensitivity by using the full range of possible coefficients; and (3) minimize artifacts resulting from sample-specific complexity. Based on these considerations, they conclude that cross-correlation coefficients among PDPs pass all three criteria, that likeness and similarity coefficients of PDPs (as well as K-S and Kuiper test D and V values) pass two of the criteria, and that all coefficients calculated from KDFs fail at least two of the criteria. In contrast, Vermeesch (2017) suggests that when likeness and cross-correlation coefficients are based on PDPs, they employ a too-narrow smoothing bandwidth to precise data and a too-wide bandwidth to imprecise data, such that results are “insensible.” He suggests that the Sircombe-Hazelton dissimilarity measure using a KDF (bandwidth is allowed to vary) produces more “functional” results.

#### Choices of Representation of Sample Ages and Metrics of Their Differences

Given the number of approaches to characterizing sample zircon ages and their differences, it is first necessary to attempt to arrive at a suitable measure before proceeding to their application in MDS. Disagreement about the most suitable similarity metric might derive from the fact that both Saylor and Sundell (2016) and Vermeesch (2017) rely heavily on model age distributions to make various points. A first step in this direction is to examine different descriptions of sample ages and measures of their differences based on actual real-world measured zircon ages. To this end, we consider detrital zircon ages from the Paleozoic (Gehrels et al., 2011) and Mesozoic (Dickinson and Gehrels, 2009, 2008) succession across the Colorado Plateau, the subject of MDS mapping later in the paper (Table DR1). These data are represented by 6173 zircon ages determined by laser ablation–multicollector–inductively coupled plasma–mass spectrometry among 65 samples (average grains per sample n = 95) extending from the Cambrian Tapeats Sandstone at the base section through the Jurassic Morrison Formation at the top. For each sample, we first calculated their PDP and fixed bandwidth KDE (e.g., Botev, 2007). A total of 65 samples comprise 2080 possible sample pairs; for each of these, we calculate the 6 similarity/dissimilarity metrics noted above (cross-correlation coefficient, K-S statistic, Kuiper’s statistic, similarity, likeness, and the Sircombe-Hazelton dissimilarity measure) at a temporal resolution of 1 Ma.

Several conclusions about the choice of age representation and metric of age difference can be drawn from this exercise. First, plots of similarity, likeness, and the Sircombe-Hazelton dissimilarity measure based on PDPs are virtually identical to those based on KDEs (Figs. 2A, 2B, and 2C). The standard deviation of each metric based on KDE pairs compared to that of each metric based on PDP pairs (standard deviation of residuals of the PDP-KDE regressions) is only 2.3%, 2.8%, and 1.6% for the similarity, likeness, and the Sircombe-Hazelton metrics, respectively.

Second, greatest agreement among these measures of pairwise sample differences exists between values of likeness and similarity (Fig. 2D, R^{2} = 0.94) and likeness and the Kuiper’s test statistic (Fig. 3A; R^{2} = 0.85). Values of likeness and the K-S statistic (Fig. 3B; R^{2} = 0.77) and likeness and the Sircombe-Hazelton dissimilarity measure (Fig. 3C; R^{2} = 0.77) are somewhat less similar, and agreement between values likeness and the cross-correlation coefficient (Fig. 3D; R^{2} = 0.67) are the least-well correlated. In other words, when applied to these real-world data, and when compared to likenesses as a measure of pairwise sample differences, similarity yields the greatest agreement, followed by the Kuiper’s test statistic and the Sircombe-Hazelton dissimilarity measure, and then the cross-correlation coefficient. In retrospect, such an outcome is perhaps not surprising in that likeness, similarity, and the Sircombe-Hazelton dissimilarity metrics are based on summations of many PDP or KDE differences at some designated scale of resolution (here, 1 Ma).

#### Evaluating Dissimilarity Measures in MDS

Based on consideration of these real-world data, it seems apparent that representation of sample ages as either PDPs or KDEs has little impact on the veracity of various measures of sample difference. Moreover, based on similarity of values, it seems that likeness and similarity may be the most efficient measures of pairwise sample differences followed by the Kuiper’s, the Sircombe-Hazelton, and the K-S statistics; cross-correlation coefficients are the least similar to other measures of sample differences. However, it is not immediately apparent how these measures directly translate into the MDS ordinations that best represent dissimilarity within a data set. Although a more thorough analysis of these various options is beyond the scope of this paper, it is informative to examine the impact of three of these metrics of dissimilarity on MDS mapping. To this end, we consider likeness, the K-S test statistic, and cross-correlation coefficients to evaluate which appears, on the basis of agreement with the other metrics, to represent the best, better, and the poorest measures of sample differences. To do this, we construct a simple, three-component model defined by Gaussian distributions of 100 ± 10, 200 ± 10, and 300 ± 10 Ma components, and mix these in increments of thirds, resulting in 10 populations (Fig. 4). The 2-D MDS map generated from such a model resembles a ternary diagram with each single-component sample, representing 100% of one component, positioned at the vertices and with all remaining sample points representing linearly correlated dissimilarities according to the percentage of each component (Fig. 4).

Using metric MDS and the stress (S) loss function, we produced MDS configurations based on three dissimilarities measures of likeness (Fig. 4A), PDP cross-correlation (Fig. 4B), and the K-S D statistics (Fig. 4C). Of these measures, likeness clearly results in the best ternary approximation of the modeled data set, with nearly uniform spacing between all points. The configurations for PDP cross-correlation and the K-S D statistics both define the components as vertices and have stresses comparable to likeness (S = 0.08); however, both are less accurate representations of the modeled data. For multi-modal populations, cross-correlation results in samples clustering toward the vertices containing the component with the highest proportions and results in unequal spacing among points. For the K-S D statistic, the end-member vertices are established, but the mixed data are clearly displaced toward the 200 Ma component (Fig. 4C). That likeness yields a better ternary approximation of the modeled data set than either cross-correlation or the K-S D statistic is in general agreement with the observation that values of likeness exhibit considerably less PDP-KDE scatter than either cross-correlation or the K-S statistic. They are also in accord with the observation that likeness values are closest to calculated values of similarity (Fig. 2D) and the Kuiper’s test statistic (Fig. 3A) and are least similar to coefficients of cross-correlation (Fig. 3D).

### Effect of Variable Proportions of Shared and Unique Components

Employing likeness as the measure of dissimilarity and stress (S) as the loss function in metric MDS mapping, we can now explore the influences of several variables common to detrital geochronology on MDS. These include the effects of variable proportions of shared and different age components among pairs of age frequency distributions. The effect of shared but unequal proportions of age components is one of the challenges of detrital geochronological data. To demonstrate how this impacts MDS, we generate five sets of population pairings from two age components (100 ± 10 Ma and 200 ± 10 Ma; Fig. 5). For simplicity, we fix one population at equal proportions of the two age components, while components in the second population vary in increments of 5% per MDS analysis. Three 100-age randomly drawn samples are then created for each population and subjected to MDS analysis (Fig. 5A–5E). Based on these five model sets, visual separation of samples in MDS space becomes significantly more apparent when likeness values exceed 0.20.

The sensitivity of MDS to the presence of unique components is also highly relevant to its use as a provenance tool. We consider three age components (100 ± 10 Ma, 200 ± 10 Ma, and 300 ± 10 Ma) and allow the 200 ± 10 Ma component to be shared in equal proportion between the two populations (Fig. 6). The average degree of differentiation among 10 iterations of each model set demonstrates that MDS differentiation only becomes consistently apparent as likeness values exceed ∼0.10 (mixtures B and C) and highlights its sensitivity to the inclusion of unique age components (Fig. 6).

### n-Dimensionality and Visualizations of MDS

MDS is often characterized as a useful tool by which to evaluate the generalities of large data sets containing multiple components. In practice, n-1 dimensions are required to project the data in MDS space without distortion. However, typical detrital zircon data often contain greater than four components and are never as simple as our synthetic examples. Here, we briefly show how a four-component model projected onto a 2-D MDS surface can result in poorly transformed configurations, and further show how this problem is significantly resolved when scaling to three dimensions. To do so, we use a mixture model based on fourths of the four age components (100 ± 10 Ma, 200 ± 10 Ma, 300 ± 10 Ma and 400 ± 10 Ma), and translate the dissimilarity matrix of this data set into 2-D (Fig. 7A) and 3-D (Fig. 7C) MDS configurations. Theoretically, a properly translated four-component data set plot should resemble a tetrahedron with equally spaced vertices. Distances between points should be some multiple of one-fourth the distance between any of the four vertices.

Two-dimensional MDS projects the tetrahedron into the two available dimensions. In this instance, nearest neighbor lines seem directed toward one of the tetrahedron’s corners, the projected peak of the pyramid (Fig. 7A). The poor representation of the four components in 2-D is apparent in the Shepard plot (Fig. 7B) and the elevated (poor) stress value (S = 0.23). While the vertices do indeed represent each of the four age components, the placement of samples relative to those endmembers is highly suspect, as the distances between points do not correlate well with dissimilarities (Fig. 7A). Conversely, a 3-D MDS translation of the same data reproduces the predicted pyramidal shape and uniform spacing of sample points. The stress value is reduced, as will always occur with increasing dimensionality (Kruskal and Wish, 1978), to S = 0.9. The Shepard plot (Fig. 7D) also shows the far better correlation between distances and dissimilarities, thus highlighting the more appropriate dimensionality for the data. What is clearly demonstrated here is that despite the simplicity of the data set, the 3-D projection quickly outpaces the 2-D projection most typically used in detrital provenance studies. If the desired result of a detrital zircon provenance study is to characterize a single shift in provenance (e.g., Vermeesch, 2013), increased dimensionality and low S values might not be necessary. In more complex cases, the consequences of distortion related to 2-D or 3-D projection of n-dimensional data require careful consideration.

## PHANEROZOIC EXAMPLES FROM NORTH AMERICA

While modeled data provide a valuable baseline against which to visualize and evaluate MDS mapping, we conclude by considering MDS analyses of three real-world data sets. Our intent here is to demonstrate the utility of the approach in highlighting sample age variation in both time and space. However, we do not intend to expand upon the previously published data sets nor published conclusions, merely present how visual representations can provide clarity within detrital data. The particulars of such differences would, of course, then serve as the basis for further investigation. In each instance, we represent zircon age data as a color-shaded age frequency “heat map” developed on an interpolated surface comprising PDPs ordered by stratigraphic position (e.g., the Colorado Plateau), relative age (e.g., the Bighorn Basin), or geographic position (e.g., the Gulf Coast). Such representations allow for easy visualization of changes in different age components in time and space. In addition, we ordinate the MDS maps to best represent the relative sense (in time or space) of inter-sample differences.

### The Colorado Plateau

Remarkable exposures of Paleozoic and Mesozoic strata across the Colorado Plateau offer a nearly unparalleled opportunity to examine changes in detrital zircon provenance of Cambrian through Jurassic sandstones across of the southwestern United States. Recent LA-ICP-MS determinations of 2529 zircon U-Pb grain ages from 26 Paleozoic samples (Gehrels et al., 2011) and 3444-grain ages from 36 Triassic and Jurassic samples (Dickinson and Gehrels, 2008, 2009) afford an excellent opportunity to compare more traditional representations of grain ages (Fig. 8A) with MDS plots (Fig. 8B). Detrital zircons from Cambrian through Devonian strata yield mainly 1.7 and 1.4 Ga ages derived from basement rocks of the Yavapai-Mazatzal Province and the Amarillo-Wichita uplift, respectively (Gehrels et al. 2011). Mississippian through Permian rocks contain predominantly 1.1 Ga and younger Paleozoic grains derived from the Grenville and Appalachian orogen, respectively. Mesozoic sands record a greater number of age components but exhibit little directional upsection stratigraphic/secular variation (Dickinson and Gehrels, 2008, 2009).

With multiple age components throughout the data set that vary in their presence and proportions upsection, as well as the relatively large number of samples, a 3-D MDS plot affords an opportunity to view clear changes in provenance within the stratigraphic sequence in a single plot. MDS mapping of the 65 samples reveals that differences among samples, as revealed on axis one, largely correlate with stratigraphic position (Fig. 8B). Sample ages also define three general clusters, which correspond closely with age groups that are discerned qualitatively from sample age abundances. In particular, the three clusters in the MDS visualization (Fig. 8B) are almost exactly the same as the groupings of Cambrian through Devonian, Mississippian through Permian, and Mesozoic that are apparent in the PDP heat map (Fig. 8A; Gehrels et al. 2011; Dickinson and Gehrels, 2008, 2009). Moreover, the samples that exhibit somewhat anomalous distances in MDS mapping are also quickly identifiable in the sample PDPs. Among these are an early Jurassic (Moenave Formation, Utah) sample which contains an anomalously high concentration (40 of 93 grains, 43%) spanning only 40 Ma of the late Paleozoic (202–242 Ma; labeled 1 in Fig. 8B), and two Late Cretaceous samples (northeastern New Mexico and southwestern Utah) that are dominated by ca. 1.7 Ga Yavapai-Mazatzal ages and are the only two Mesozoic samples in this compilation containing significant pre-Grenville age grains (labeled 2 in Fig. 8B). While the specifics of the geologic history of these localities have surely served to give rise to these deviations, it is their clear deviance in MDS that serves to bring their differences into sharper view.

### The Bighorn Basin

Dissimilarities in abundances of different age components in any population of zircon grains potentially reflect changes in the tectonic history of basins or of variations in climate and sea level, all of which possibly impact paleogeography and sediment dispersal patterns. May et al. (2013) compiled and qualitatively interpreted zircon age data from Phanerozoic units exposed in the Bighorn Basin of Wyoming to better understand the tectonostratigraphic evolution of that region. Based on 4104 U-Pb detrital zircon ages from 44 samples, May et al. subdivided Upper Mississippian through Paleocene units into four “tectonostratigraphic assemblages” based on perceived patterns in sample zircon age frequency distributions: (1) a Paleozoic–Triassic proximal continental margin assemblage; (2) a Jurassic–early Cretaceous (early Albian) assemblage associated with organization of Cordilleran orogeny; (3) a late Cretaceous (late Albian through Maastrichtian) interior seaway foreland basin assemblage; and (4) a Paleogene assemblage accumulated during structural segmentation during the Laramide orogeny (Fig. 9A). Because their interpretations are based on qualitative inspection of probability density functions, it serves as an excellent case study to which to apply MDS. A 3-D MDS plot should prove adequate in resolving such a four-component system.

Quantitative evaluation by MDS largely confirms May et al.’s (2013) visual examination of sample PDPs with three of the four proposed assemblages being well-resolved (Fig. 9B). However, the posited fourth assemblage, comprising three Paleogene samples, does not separate as a distinct group or cluster. Each of the resolved assemblages contains either a unique component or component with anomalously high proportions. The samples of the fourth assemblage, however, share components with each of the stratigraphically older assemblages, likely contributing to its poor differentiation. In addition, MDS mapping reveals that the population differences differentiating the first (Paleozoic) and second (lower Mesozoic) assemblages are smaller than those distinguishing the second from the third assemblage (upper Cretaceous). Qualitative differences among sample PDPs support this conclusion; abundant Grenvillian (ca. 1.1 Ga) grains persist upsection through Paleozoic and lower Mesozoic units but are largely absent above the early Albian Greybull Sandstone at the top of association (sample 22, Fig. 9A).

In addition to largely verifying the tectonostratigraphic assemblages identified by May et al. (2013), MDS provides additional insights into the interpretation of detrital zircon data. It serves to identify those samples within any assemblage that are least dissimilar to those in any other, and it allows for a visualization of the relative degrees of difference between groups of samples; both provide motivation for further examination of spatial and temporal attributes that might not be apparent in the absence of MDS.

### The Gulf Coast

In addition to better understanding secular differences among various zircon samples, such as those upsection in the Colorado Plateau and Bighorn Basin, MDS should be equally useful in the discrimination of lateral heterogeneities within stratal units of similar age. This approach has been explicitly taken by Xu et al. (2017) with respect to early Miocene sediment supply to the Gulf of Mexico Basin. Here, we conclude by examining age data from samples of the Paleocene Wilcox Formation collected along ∼2000 km of outcrop belt, extending northwest from western Alabama to the head of the Mississippi embayment south of St. Louis, Missouri, and then southwest toward San Antonio, Texas (Blum and Pecha, 2014).

Cenozoic-scale persistence of several major fluvial axes supplying sediment to the northwestern Gulf of Mexico coastal plain (e.g., Galloway et al., 2011) served to impart prominent along-strike variation in the dominance of the same major age components observed comprising grains in the Colorado Plateau and Bighorn Basin (Fig. 10A). Blum and Pecha (2014) report 2564 ages from 27 along-strike samples of the Wilcox Formation. Dominant age components consist of ca. 1.7 Ga Yavapai-Mazatzal, ca. 1.4 Ga Amarillo-Wichita, and younger Western Cordillera (ca. 170 Ma) and Laramide (ca. 60–75 Ma) province grains along the western third of the belt, and ca. 1.1 Ga Grenvillian grains along the eastern two-thirds of the section (Fig. 10A). Once again, the multiple components and numerous samples provide an excellent application of 3-D MDS configurations; 2-D will not adequately represent the component variance of the detrital data set. MDS mapping of these age data results in an excellent agreement between qualitative assessments of lateral variations based on PDP inspection and variations based on quantified differences (Fig. 10B). Variation in the MDS distribution of these 27 samples corresponds closely with position along the outcrop belt. Eastern samples are dominated by Grenville grains and are more closely clustered than other samples; more western samples are noticeably more dispersed, an aspect reflecting the incorporation of a larger number of components (Fig. 10A).

## CONCLUSIONS

MDS has the potential to be a valuable resource for detrital geochronology. Like any tool, proper recognition of the strengths and limitations of the technique is necessary if valid conclusions are to be drawn from MDS data configurations. Our analyses suggest the following. (1) In a context of input representations of age frequencies for the derivation of “sameness” metrics, both PDPs and KDEs serve equally well. Although the latter may afford some better representation on means and durations of zircon generation (crystallization) episodes, it affords no advantage over the former insofar as quantifying pairwise sample age differences. (2) Among the more commonly used metrics of dissimilarity, the likeness and similarity yield the most consistent values when calculated from probability and kernel density estimates; values of the Kuiper’s test statistic and the Sircombe-Hazelton dissimilarity measure are more variable; the Kolmogorov-Smirnov statistics and cross-correlation coefficients exhibit the least agreement with other sameness metrics. (3) Among likeness, the K-S statistic, and cross-correlation coefficients, use of the latter two as dissimilarity measures can significantly distort MDS configurations while the use of likeness results in a strong correlation between percentages of age component incorporation into various populations and their sample-to-sample dissimilarity. (4) MDS is more sensitive to the presence of unique age components than to variable proportions of shared components with differences in as little as 5% leading to appreciable MDS differentiation in simple low component samples. MDS is also relatively sensitive to degrees of overlap of different age modes. (5) More than two dimensions may be required to properly transform the dissimilarity matrix. More sources and more components require increasing dimensions to effectively resolve variation in MDS, and insufficient dimensions can lead to oversimplified solutions where inter-sample distances do not accurately represent inter-sample dissimilarities. (6) The application of MDS, in conjunction with other analytical tools, serves to improve the accuracy of interpretations and conclusions of detrital zircon data.

Application of MDS mapping to the three data sets representing sediment accumulation in the North American platform, intermontane basin, and passive margin settings reveals that this approach provides informative visualizations of relative differences among sample zircon ages and the degree to which these differences can be related to their disposition in time and space. In each application, MDS configurations in three dimensions improved the resolving power due to the complexities of the detrital samples. MDS affords an important tool for understanding the nature of the distributions of zircon ages and ultimately, to the geologic processes and histories reflected in these data sets.

## ACKNOWLEDGMENTS

Early versions of this work benefited greatly from critical reviews by Elena Belousova, James Brower, Aaron Cavosie, William Griffin, Linda Ivany, Christa Kelleher, Damian Nance, Joel Saylor, and Pieter Vermeesch; to these people, we offer our sincere thanks. We gratefully acknowledge our Syracuse University colleagues Mariana Bonich-Wissink, Scott Samson, and Pedro Val for their insights during the formulation of many of the ideas presented herein. We also commend Pieter Vermeesch, who first widely advocated the utility of MDS mapping in detrital zircon geochronometric studies. This research was in part supported by National Science Foundation Grant EAR-1019427 to G.D. Hoke.

^{1}GSA Data Repository Item 2018142, Table DR1: Zircon U-Pb ages and spatial and temporal information from the Colorado Plateau, Bighorn Basin, and Gulf Coast data sets, is available at http://www.geosociety.org/datarepository/2018, or on request from editing@geosociety.org.