Pairwise sample comparisons and multidimensional scaling of detrital zircon ages with examples from the North American platform , basin , and passive margin settings

Over the past several decades, the development of highly efficient and low-cost techniques for the acquisition of U-Pb ages has led to the rapid expansion of detrital zircon analysis and interpretation. Multidimensional scaling (MDS) is a potentially powerful approach for visualizing and interpreting such age data. This study suggests that for any study applying MDS to detrital zircons: (1) the choice of representing age distributions as either probability density plots or kernel density estimations offers no particular advantage when subjected to many inter-sample comparison measures of grain ages; (2) among the various approaches for such sample pairwise comparisons, likeness, similarity, the Kuiper test statistic, and the Sircombe-Hazelton metric are more effective than the Kolmogorov-Smirnov test statistic or the cross-correlation coefficient; (3) variable proportions of both shared and unique age components have significant effects on MDS mapping; and (4) three-dimensional MDS projections are often superior to the two-dimensional projections commonly used in detrital provenance studies. We demonstrate these findings by representing age distributions of samples from platform, basin, and passive margin settings of North America as colored surfaces defined on the basis of sample depositional age, geographic position, and importance of different detrital zircon components, and then compare these with three-dimensional MDS maps. These approaches suggest that the application of MDS techniques to detrital zircon data affords easily attainable and significant advantages in the geologic interpretation of detrital grain ages. LITHOSPHERE; v. 10; no. 3; p. 478–491; GSA Data Repository Item 2018142 | Published online 23 March 2018 https://doi.org/10.1130/L700.1


INTRODUCTION
Zircon is a common accessory mineral in clastic sedimentary deposits and is widely used for provenance analysis.Its durability, resistance to chemical and physical weathering, high initial concentrations of U and Th, and low initial concentrations of Pb make it a robust geochronometer and a useful mineral for sedimentary provenance studies.The widespread use of single-grain detrital zircon age data is largely due to the development of rapid and inexpensive age acquisition techniques (e.g., Gehrels, 2000;Gehrels et al., 2008) to determine ages for hundreds of zircon grains per day (e.g., Pullen et al., 2014).This volume of data, however, presents challenges for its management, visualization, and interpretation.Efficient, effective, and meaningful procedures to compare and evaluate zircon U-Pb age distributions have, therefore, garnered considerable attention (e.g., Sircombe and Hazelton, 2004;Vermeesch, 2013Vermeesch, , 2017;;Saylor and Sundell, 2016;Sundell and Saylor, 2017).
The multivariable ordination method, or multidimensional scaling, is a quantitative comparison technique that enhances the capability to visualize variation among samples based on quantified pairwise comparisons of their zircon ages (Vermeesch, 2013).Here, using zircon ages from both synthetic-model and real-world data sets, we: (1) consider different representations of sample ages and metrics of intra-sample disparities; (2) examine strengths and limitations of these approaches by comparison of different metrics and by exploring their impact on MDS ordination; (3) assess the impact of variable proportions of shared and different age components on MDS; and (4) evaluate the relative merits of two-dimensional (2-D) and three-dimensional (3-D) MDS projections in detrital provenance studies.We then apply this methodology to detrital zircon age distributions of samples from platform, basin, and passive margin settings of North America (zircon U-Pb ages and sample locations can be found in Table DR1 1 ).Throughout, we employ the terms "age component" to represent a suite of grain ages comprising a single age mode, "age population" as a suite of grains encompassing one or more age components, and "sample" as ages determined for a limited number of grains drawn at random from some population of assumed infinite size.In this construct, an age component is analogous to a single Gaussian age mode, whereas age populations comprise one or more Gaussian components typically found a sand or sandstone sample.

MULTIDIMENSIONAL SCALING
Multidimensional scaling (MDS) is a typically iterative ordination technique that graphically represents the relative dissimilarities among samples arrayed in N-dimensional space.It is not new to geologic applications (e.g., Doveton, 1976;Hayward and Smale, 1992;Hounslow and Morton, 2004;Honarkhah and Caers, 2010); however, more widespread application to detrital zircon data sets is relatively recent (e.g., Vermeesch, 2013;Spencer and Kirkland, 2015;Vermeesch and Garzanti, 2015;Arboit et al., 2016;Saylor et al., 2017;Xu et al., 2017).Although a detailed description of the approach is beyond the scope of this paper, we offer a brief overview.

General Considerations
In detrital geochronology, MDS transforms a matrix of pairwise similarities among zircon age distributions into Cartesian coordinates in 2-D or 3-D space, with the distance between sample points representing the degree of dissimilarity to one another.Greater differences among intersample ages lead to a greater spread of points on an MDS "map."MDS serves to reduce complex age spectra into relatively simple visualizations that describe the variation between sample age frequency distributions.Because n variables are typically collapsed into a 2-D or 3-D space, some distortion must occur; the degree of this distortion is typically measured by a goodness-of-fit statistic.Distances in MDS maps are unitless and orientations are arbitrary; only the relative proximities and distances between points have meaning.
In MDS, a matrix of inter-sample dissimilarities (δ) is transformed by a function (f) into a disparity matrix of fitted Euclidean (straight-line) distances (d) which, for two samples i and j, can be represented as: where f(δ ij ) is a monotonically increasing function that transforms the dissimilarities to "disparities" or fitted distances.MDS uses these disparities to produce a configuration of points in two (e.g., Vermeesch, 2013) or three (Saylor and Sundell, 2016) dimensions.Two variants on the approach are nonmetric and metric MDS.In nonmetric MDS, the ranks of the dissimilarities (ordinal data, not the absolute dissimilarities) are used to determine the ordination, which is iteratively calculated (e.g., isotonic regression; Kruskal, 1964aKruskal, , 1964b)).In metric MDS, a superset of classical MDS (Torgerson, 1952), absolute values of the dissimilarities are used to determine MDS configurations.Metric MDS simultaneously calculates configurations of the disparities matrix and the fits of the transformation via Eigenanalysis (Torgerson, 1952;Borg and Groenen, 2005).If f(δ ij ) = δ ij and the dissimilarities are metric, then the configuration of points can be solved via linear algebra (Carroll and Arabie, 1980).How well the MDS configuration preserves the original inter-sample dissimilarities is evaluated by a loss function.In nonmetric MDS, the loss function is minimized numerically to optimize the configuration solution.For metric MDS, the loss function is calculated simultaneously with the calculated configuration of f(δ); a commonly used loss function is stress (S; Kruskal, 1964a).In addition to loss functions, Shepard plots display inter-point distances (d) against the non-transformed dissimilarities (δ) and are often used to evaluate the goodness of an MDS fit.In a Shepard plot, the closer points fall along a 1:1 line for metric scaling or along the nonparametric function for nonmetric scaling, the better the MDS solution represents the differences among samples.A strength of MDS is that neither the choice of metric or nonmetric MDS nor the choice of the loss function drastically changes the resulting transformation for detrital sample dissimilarities.However, metric MDS provides a global solution that will not potentially collapse into a local optimum (stress minimum) that can complicate finding the optimal solution.For simplicity, we therefore focus on the application of metric MDS and utilize the stress (S) loss function.

Variation in Age Frequency Distributions
With this as a general background, we first explore two aspects of detrital zircon data that potentially impact characterizing spatial and temporal differences in zircon populations using MDS.These are: (1) the implementation of one of several methodologies for representing the relative abundances of different age components in an individual sample, and (2) the use of one of several quantitative measures of similaritydissimilarity between pairs of sample ages.Once an approach to the representation of relative age frequencies and an approach to quantify inter-sample similarities are adopted, MDS can be used to visualize the inter-sample variability.
A PDP function specifies the probability that a random variable will occur within a particular range of values; it is nonnegative and is normalized to unity.Determination of PDPs involves the summed probabilities of measured grain ages and their associated Gaussian analytical uncertainties (errors).Implicit in the method is that calculated age abundances (e.g., probability density) are related to analytical error (Fig. 1B).A KDE is a nonparametric estimate of the probability density function of a random variable but discards uncertainties associated with data acquisition.The calculation of KDEs also involves the summed probabilities of measured grain ages but utilizes some specified "bandwidth" (e.g., Fortmann-Roe et al., 2012) rather than an analytical error (Fig. 1C).However, the selection of bandwidth is critical, as different choices may lead to strikingly dissimilar density estimates.Bandwidth can be constant; Botev (2007) presents a diffusion-based approach for its approximation that minimizes the asymptotic mean squared error of the estimate ("kernel density estimation").Bandwidth can also vary (inversely) with data density over some local sample space; these are referred to as "locally adaptive variable bandwidth estimations" or "kernel density functions" if the bandwidth RESEARCH varies inversely to the data point analytical uncertainties (KDFs; e.g., Sircombe and Hazelton, 2004;Vermeesch, 2017).Vermeesch (2013) outlined the requirements for good measures of age population dissimilarity, positing that the chosen metric should be: (1) non-negative; (2) symmetric (δ ij = δ ji ); and (3) exhibit triangular inequality (δ ik ≤ δ ij + δ ik ).A variety of approaches that meet these criteria to varying degrees have been employed to quantitatively represent the differences in grain age distributions among samples pairs.Several metrics have been employed to assess inter-sample similarities.This includes the crosscorrelation coefficient, which is the coefficient of determination (R 2 ) of a cross-plot of age frequencies, PDPs, or KDEs of two samples (e.g., Saylor et al., 2012Saylor et al., , 2013;;Saylor and Sundell, 2016).It is similar to quantilequantile or probability-probability plots (e.g., Wilk and Gnanadesikan, 1968), except PDPs or KDEs, rather than cumulative distribution functions, are typically used in the cross-plot.Identical age spectra yield an R 2 of 1; those that share no age peaks yield an R 2 of 0. An alternative metric is the nonparametric two-sample Kolmogorov-Smirnov (K-S) test.Used to evaluate the null hypothesis that variation between two age populations is within the expected variation assuming random sampling of a parent population (Guynn and Gehrels, 2010;Razali and Wah, 2011), the K-S test is based on the K-S statistic, which is the maximum difference between the empirical cumulative distribution functions of two samples.The Kuiper's test statistic (Kuiper, 1960;Press et al., 2007;Saylor and Sundell, 2016) is closely related to the K-S statistic but represents the sum of the absolute sizes of the most positive and most negative differences between the two cumulative distribution functions being compared.Similarity is a measure of resemblance between two PDPs or KDEs (Gehrels, 2000) and is calculated by summing, over the time interval of interest and at some temporal scale of resolution, the square root of the product of each pair of common probabilities.It yields a value of 100% for identical curves and a value of 0% for those that share no age components.Likeness is a complement of the area mismatch metric of Amidon et al. (2005aAmidon et al. ( , 2005b)).It represents the percent of "sameness" between two unitized PDPs or KDEs (e.g., Satkoski et al., 2013) and is calculated as one minus the summation of absolute differences between all PDP or KDE values divided by two.Values range from 100 to 0%.The Sircombe-Hazelton dissimilarity measure assesses the "distance" between age distributions based on their PDPs or KDEs.The difference between two age distributions is measured as the integral of the squared difference of the two functions when plotted on the same scale (Sircombe and Hazelton, 2004;Vermeesch et al., 2016).

Calculation of Differences among Sample Ages
Given the nontrivial number of approaches that have been taken to quantify intra-and inter-sample age differences, one might anticipate arriving at some sort of consensus.That this is not the case is illustrated by two recent papers dealing specifically with quantifying comparison of large data sets (Saylor and Sundell, 2016) and dissimilarity measures (Vermeesch, 2017) in detrital geochronology.In a study examining the effects of increasing the number of ages selected from a single population, Saylor and Sundell (2016) suggest that metrics for differentiation of sample zircon ages should: (1) increase with increasing sample size; (2) maximize sensitivity by using the full range of possible coefficients; and (3) minimize artifacts resulting from sample-specific complexity.Based on these considerations, they conclude that cross-correlation coefficients among PDPs pass all three criteria, that likeness and similarity coefficients of PDPs (as well as K-S and Kuiper test D and V values) pass two of the criteria, and that all coefficients calculated from KDFs  fail at least two of the criteria.In contrast, Vermeesch (2017) suggests that when likeness and cross-correlation coefficients are based on PDPs, they employ a too-narrow smoothing bandwidth to precise data and a too-wide bandwidth to imprecise data, such that results are "insensible."He suggests that the Sircombe-Hazelton dissimilarity measure using a KDF (bandwidth is allowed to vary) produces more "functional" results.

Choices of Representation of Sample Ages and Metrics of Their Differences
Given the number of approaches to characterizing sample zircon ages and their differences, it is first necessary to attempt to arrive at a suitable measure before proceeding to their application in MDS.Disagreement about the most suitable similarity metric might derive from the fact that both Saylor and Sundell (2016) and Vermeesch (2017) rely heavily on model age distributions to make various points.A first step in this direction is to examine different descriptions of sample ages and measures of their differences based on actual real-world measured zircon ages.To this end, we consider detrital zircon ages from the Paleozoic (Gehrels et al., 2011) and Mesozoic (Dickinson andGehrels, 2009, 2008) succession across the Colorado Plateau, the subject of MDS mapping later in the paper (Table DR1).These data are represented by 6173 zircon ages determined by laser ablation-multicollector-inductively coupled plasma-mass spectrometry among 65 samples (average grains per sample n = 95) extending from the Cambrian Tapeats Sandstone at the base section through the Jurassic Morrison Formation at the top.For each sample, we first calculated their PDP and fixed bandwidth KDE (e.g., Botev, 2007).A total of 65 samples comprise 2080 possible sample pairs; for each of these, we calculate the 6 similarity/dissimilarity metrics noted above (cross-correlation coefficient, K-S statistic, Kuiper's statistic, similarity, likeness, and the Sircombe-Hazelton dissimilarity measure) at a temporal resolution of 1 Ma.
Several conclusions about the choice of age representation and metric of age difference can be drawn from this exercise.First, plots of similarity, likeness, and the Sircombe-Hazelton dissimilarity measure based on PDPs are virtually identical to those based on KDEs (Figs. 2A, 2B, and  2C).The standard deviation of each metric based on KDE pairs compared to that of each metric based on PDP pairs (standard deviation of residuals of the PDP-KDE regressions) is only 2.3%, 2.8%, and 1.6% for the similarity, likeness, and the Sircombe-Hazelton metrics, respectively.
Second, greatest agreement among these measures of pairwise sample differences exists between values of likeness and similarity (Fig. 2D, R 2 = 0.94) and likeness and the Kuiper's test statistic (Fig. 3A; R 2 = 0.85).Values of likeness and the K-S statistic (Fig. 3B; R 2 = 0.77) and likeness and the Sircombe-Hazelton dissimilarity measure (Fig. 3C; R   the cross-correlation coefficient (Fig. 3D; R 2 = 0.67) are the least-well correlated.In other words, when applied to these real-world data, and when compared to likenesses as a measure of pairwise sample differences, similarity yields the greatest agreement, followed by the Kuiper's test statistic and the Sircombe-Hazelton dissimilarity measure, and then the cross-correlation coefficient.In retrospect, such an outcome is perhaps not surprising in that likeness, similarity, and the Sircombe-Hazelton dissimilarity metrics are based on summations of many PDP or KDE differences at some designated scale of resolution (here, 1 Ma).

Evaluating Dissimilarity Measures in MDS
Based on consideration of these real-world data, it seems apparent that representation of sample ages as either PDPs or KDEs has little impact on the veracity of various measures of sample difference.Moreover, based on similarity of values, it seems that likeness and similarity may be the most efficient measures of pairwise sample differences followed by the Kuiper's, the Sircombe-Hazelton, and the K-S statistics; cross-correlation coefficients are the least similar to other measures of sample differences.However, it is not immediately apparent how these measures directly translate into the MDS ordinations that best represent dissimilarity within a data set.Although a more thorough analysis of these various options is beyond the scope of this paper, it is informative to examine the impact of three of these metrics of dissimilarity on MDS mapping.To this end, we consider likeness, the K-S test statistic, and cross-correlation coefficients to evaluate which appears, on the basis of agreement with the other metrics, to represent the best, better, and the poorest measures of sample differences.To do this, we construct a simple, three-component model defined by Gaussian distributions of 100 ± 10, 200 ± 10, and 300 ± 10 Ma components, and mix these in increments of thirds, resulting in 10 populations (Fig. 4).The 2-D MDS map generated from such a model resembles a ternary diagram with each single-component sample, representing 100% of one component, positioned at the vertices and with all remaining sample points representing linearly correlated dissimilarities according to the percentage of each component (Fig. 4).
Using metric MDS and the stress (S) loss function, we produced MDS configurations based on three dissimilarities measures of likeness (Fig. 4A), PDP cross-correlation (Fig. 4B), and the K-S D statistics (Fig. 4C).Of these measures, likeness clearly results in the best ternary approximation of the modeled data set, with nearly uniform spacing between all points.The configurations for PDP cross-correlation and the K-S D statistics both define the components as vertices and have stresses comparable to likeness (S = 0.08); however, both are less accurate representations of the modeled data.For multi-modal populations, cross-correlation results in samples clustering toward the vertices containing the component with the highest proportions and results in unequal spacing among points.For the K-S D statistic, the end-member vertices are established, but the mixed data are clearly displaced toward the 200 Ma component (Fig. 4C).That likeness yields a better ternary approximation of the modeled data set than either cross-correlation or the K-S D statistic is in general agreement with the observation that values of likeness exhibit considerably less PDP-KDE

RESEARCH
scatter than either cross-correlation or the K-S statistic.They are also in accord with the observation that likeness values are closest to calculated values of similarity (Fig. 2D) and the Kuiper's test statistic (Fig. 3A) and are least similar to coefficients of cross-correlation (Fig. 3D).

Effect of Variable Proportions of Shared and Unique Components
Employing likeness as the measure of dissimilarity and stress (S) as the loss function in metric MDS mapping, we can now explore the influences of several variables common to detrital geochronology on MDS.These include the effects of variable proportions of shared and different age components among pairs of age frequency distributions.The effect of shared but unequal proportions of age components is one of the challenges of detrital geochronological data.To demonstrate how this impacts MDS, we generate five sets of population pairings from two age components (100 ± 10 Ma and 200 ± 10 Ma; Fig. 5).For simplicity, we fix one population at equal proportions of the two age components, while components in the second population vary in increments of 5% per MDS analysis.Three 100-age randomly drawn samples are then created for each population and subjected to MDS analysis (Fig. 5A-5E).Based on these five model sets, visual separation of samples in MDS space becomes significantly more apparent when likeness values exceed 0.20.
The sensitivity of MDS to the presence of unique components is also highly relevant to its use as a provenance tool.We consider three age components (100 ± 10 Ma, 200 ± 10 Ma, and 300 ± 10 Ma) and allow the 200 ± 10 Ma component to be shared in equal proportion between the two populations (Fig. 6).The average degree of differentiation among 10 iterations of each model set demonstrates that MDS differentiation only becomes consistently apparent as likeness values exceed ~0.10 (mixtures B and C) and highlights its sensitivity to the inclusion of unique age components (Fig. 6).

n-Dimensionality and Visualizations of MDS
MDS is often characterized as a useful tool by which to evaluate the generalities of large data sets containing multiple components.In practice, n-1 dimensions are required to project the data in MDS space without distortion.However, typical detrital zircon data often contain greater than four components and are never as simple as our synthetic examples.Here, we briefly show how a four-component model projected onto a 2-D MDS surface can result in poorly transformed configurations, and further show how this problem is significantly resolved when scaling to three dimensions.To do so, we use a mixture model based on fourths of the four age components (100 ± 10 Ma, 200 ± 10 Ma, 300 ± 10 Ma and 400 ± 10 Ma), and translate the dissimilarity matrix of this data set into 2-D (Fig. 7A) and 3-D (Fig. 7C) MDS configurations.Theoretically, a properly translated four-component data set plot should resemble a tetrahedron with equally spaced vertices.Distances between points should be some multiple of one-fourth the distance between any of the four vertices.
Two-dimensional MDS projects the tetrahedron into the two available dimensions.In this instance, nearest neighbor lines seem directed toward one of the tetrahedron's corners, the projected peak of the pyramid (Fig. 7A).The poor representation of the four components in 2-D is apparent in the Shepard plot (Fig. 7B) and the elevated (poor) stress value (S = 0.23).While the vertices do indeed represent each of the four age components, the placement of samples relative to those endmembers is highly suspect, as the distances between points do not correlate well with dissimilarities

RESEARCH
(Fig. 7A).Conversely, a 3-D MDS translation of the same data reproduces the predicted pyramidal shape and uniform spacing of sample points.The stress value is reduced, as will always occur with increasing dimensionality (Kruskal and Wish, 1978), to S = 0.9.The Shepard plot (Fig. 7D) also shows the far better correlation between distances and dissimilarities, thus highlighting the more appropriate dimensionality for the data.What is clearly demonstrated here is that despite the simplicity of the data set, the 3-D projection quickly outpaces the 2-D projection most typically used in detrital provenance studies.If the desired result of a detrital zircon provenance study is to characterize a single shift in provenance (e.g., Vermeesch, 2013), increased dimensionality and low S values might not be necessary.In more complex cases, the consequences of distortion related to 2-D or 3-D projection of n-dimensional data require careful consideration.

PHANEROZOIC EXAMPLES FROM NORTH AMERICA
While modeled data provide a valuable baseline against which to visualize and evaluate MDS mapping, we conclude by considering MDS analyses of three real-world data sets.Our intent here is to demonstrate the utility of the approach in highlighting sample age variation in both time and space.However, we do not intend to expand upon the previously published data sets nor published conclusions, merely present how visual representations can provide clarity within detrital data.The particulars of such differences would, of course, then serve as the basis for further investigation.In each instance, we represent zircon age data as a colorshaded age frequency "heat map" developed on an interpolated surface comprising PDPs ordered by stratigraphic position (e.g., the Colorado Plateau), relative age (e.g., the Bighorn Basin), or geographic position (e.g., the Gulf Coast).Such representations allow for easy visualization of changes in different age components in time and space.In addition, we ordinate the MDS maps to best represent the relative sense (in time or space) of inter-sample differences.

The Colorado Plateau
Remarkable exposures of Paleozoic and Mesozoic strata across the Colorado Plateau offer a nearly unparalleled opportunity to examine changes in detrital zircon provenance of Cambrian through Jurassic sandstones across of the southwestern United States.Recent LA-ICP-MS determinations of 2529 zircon U-Pb grain ages from 26 Paleozoic samples (Gehrels et al., 2011) and 3444-grain ages from 36 Triassic and Jurassic samples (Dickinson andGehrels, 2008, 2009) afford an excellent opportunity to compare more traditional representations of grain ages (Fig. 8A) with MDS plots (Fig. 8B).Detrital zircons from Cambrian through Devonian strata yield mainly 1.7 and 1.4 Ga ages derived from basement rocks of  the Yavapai-Mazatzal Province and the Amarillo-Wichita uplift, respectively (Gehrels et al. 2011).Mississippian through Permian rocks contain predominantly 1.1 Ga and younger Paleozoic grains derived from the Grenville and Appalachian orogen, respectively.Mesozoic sands record a greater number of age components but exhibit little directional upsection stratigraphic/secular variation (Dickinson andGehrels, 2008, 2009).With multiple age components throughout the data set that vary in their presence and proportions upsection, as well as the relatively large number of samples, a 3-D MDS plot affords an opportunity to view clear changes in provenance within the stratigraphic sequence in a single plot.MDS mapping of the 65 samples reveals that differences among samples, as revealed on axis one, largely correlate with stratigraphic position (Fig. 8B).Sample ages also define three general clusters, which correspond closely with age groups that are discerned qualitatively from sample age abundances.In particular, the three clusters in the MDS visualization (Fig. 8B) are almost exactly the same as the groupings of Cambrian through Devonian, Mississippian through Permian, and Mesozoic that are apparent in the PDP heat map (Fig. 8A; Gehrels et al. 2011;Dickinson andGehrels, 2008, 2009).Moreover, the samples that exhibit somewhat anomalous distances in MDS mapping are also quickly identifiable in the sample PDPs.Among these are an early Jurassic (Moenave Formation, Utah) sample which contains an anomalously high concentration (40 of 93 grains, 43%) spanning only 40 Ma of the late Paleozoic (202-242 Ma; labeled 1 in Fig. 8B), and two Late Cretaceous samples (northeastern New Mexico and southwestern Utah) that are dominated by ca.1.7 Ga Yavapai-Mazatzal ages and are the only two Mesozoic samples in this compilation containing significant pre-Grenville age grains (labeled 2 in Fig. 8B).While the specifics of the geologic history of these localities have surely served to give rise to these deviations, it is their clear deviance in MDS that serves to bring their differences into sharper view.

The Bighorn Basin
Dissimilarities in abundances of different age components in any population of zircon grains potentially reflect changes in the tectonic history of basins or of variations in climate and sea level, all of which possibly impact paleogeography and sediment dispersal patterns.May et al. (2013) compiled and qualitatively interpreted zircon age data from Phanerozoic units exposed in the Bighorn Basin of Wyoming to better understand the tectonostratigraphic evolution of that region.Based on 4104 U-Pb detrital zircon ages from 44 samples, May et al. subdivided Upper Mississippian through Paleocene units into four "tectonostratigraphic assemblages" based on perceived patterns in sample zircon age frequency distributions: (1) a Paleozoic-Triassic proximal continental margin assemblage; (2) a Jurassic-early Cretaceous (early Albian) assemblage associated with organization of Cordilleran orogeny; (3) a late Cretaceous (late Albian through Maastrichtian) interior seaway foreland basin assemblage; and (4) a Paleogene assemblage accumulated during structural segmentation during the Laramide orogeny (Fig. 9A).Because their interpretations are based on qualitative inspection of probability density functions, it serves as an excellent case study to which to apply MDS.A 3-D MDS plot should prove adequate in resolving such a four-component system.
Quantitative evaluation by MDS largely confirms May et al.'s (2013) visual examination of sample PDPs with three of the four proposed assemblages being well-resolved (Fig. 9B).However, the posited fourth assemblage, comprising three Paleogene samples, does not separate as a distinct group or cluster.Each of the resolved assemblages contains either a unique component or component with anomalously high proportions.The samples of the fourth assemblage, however, share components with each of the stratigraphically older assemblages, likely contributing to its poor differentiation.In addition, MDS mapping reveals that the population differences differentiating the first (Paleozoic) and second (lower Mesozoic) assemblages are smaller than those distinguishing the second from the third assemblage (upper Cretaceous).Qualitative differences among sample PDPs support this conclusion; abundant Grenvillian (ca.1.1 Ga) grains persist upsection through Paleozoic and lower Mesozoic units but are largely absent above the early Albian Greybull Sandstone at the top of association (sample 22, Fig. 9A).
In addition to largely verifying the tectonostratigraphic assemblages identified by May et al. (2013), MDS provides additional insights into the interpretation of detrital zircon data.It serves to identify those samples within any assemblage that are least dissimilar to those in any other, and it allows for a visualization of the relative degrees of difference between groups of samples; both provide motivation for further examination of spatial and temporal attributes that might not be apparent in the absence of MDS.

The Gulf Coast
In addition to better understanding secular differences among various zircon samples, such as those upsection in the Colorado Plateau and Bighorn Basin, MDS should be equally useful in the discrimination of lateral heterogeneities within stratal units of similar age.This approach has been explicitly taken by Xu et al. (2017) with respect to early Miocene sediment supply to the Gulf of Mexico Basin.Here, we conclude by examining age data from samples of the Paleocene Wilcox Formation collected along ~2000 km of outcrop belt, extending northwest from western Alabama to the head of the Mississippi embayment south of St. Louis, Missouri, and then southwest toward San Antonio, Texas (Blum and Pecha, 2014).
Cenozoic-scale persistence of several major fluvial axes supplying sediment to the northwestern Gulf of Mexico coastal plain (e.g., Galloway et al., 2011) served to impart prominent along-strike variation in the dominance of the same major age components observed comprising grains in the Colorado Plateau and Bighorn Basin (Fig. 10A).Blum and Pecha (2014) report 2564 ages from 27 along-strike samples of the Wilcox Formation.Dominant age components consist of ca.1.7 Ga Yavapai-Mazatzal, ca.1.4 Ga Amarillo-Wichita, and younger Western Cordillera (ca.170 Ma) and Laramide (ca.60-75 Ma) province grains along the western third of the belt, and ca.1.1 Ga Grenvillian grains along the eastern twothirds of the section (Fig. 10A).Once again, the multiple components and numerous samples provide an excellent application of 3-D MDS configurations; 2-D will not adequately represent the component variance of the detrital data set.MDS mapping of these age data results in an excellent agreement between qualitative assessments of lateral variations based on PDP inspection and variations based on quantified differences (Fig. 10B).Variation in the MDS distribution of these 27 samples corresponds closely with position along the outcrop belt.Eastern samples are dominated by Grenville grains and are more closely clustered than other samples; more western samples are noticeably more dispersed, an aspect reflecting the incorporation of a larger number of components (Fig. 10A).

CONCLUSIONS
MDS has the potential to be a valuable resource for detrital geochronology.Like any tool, proper recognition of the strengths and limitations of the technique is necessary if valid conclusions are to be drawn from MDS data configurations.Our analyses suggest the following.(1) In a context of input representations of age frequencies for the derivation of "sameness" metrics, both PDPs and KDEs serve equally well.Although the latter may afford some better representation on means and durations

RESEARCH
of zircon generation (crystallization) episodes, it affords no advantage over the former insofar as quantifying pairwise sample age differences.
(2) Among the more commonly used metrics of dissimilarity, the likeness and similarity yield the most consistent values when calculated from probability and kernel density estimates; values of the Kuiper's test statistic and the Sircombe-Hazelton dissimilarity measure are more variable; the Kolmogorov-Smirnov statistics and cross-correlation coefficients exhibit the least agreement with other sameness metrics.(3) Among likeness, the K-S statistic, and cross-correlation coefficients, use of the latter two as dissimilarity measures can significantly distort MDS configurations while the use of likeness results in a strong correlation between percentages of age component incorporation into various populations and their sample-tosample dissimilarity.(4) MDS is more sensitive to the presence of unique age components than to variable proportions of shared components with differences in as little as 5% leading to appreciable MDS differentiation in simple low component samples.MDS is also relatively sensitive to degrees of overlap of different age modes.( 5) More than two dimensions may be required to properly transform the dissimilarity matrix.More sources and more components require increasing dimensions to effectively resolve variation in MDS, and insufficient dimensions can lead to oversimplified solutions where inter-sample distances do not accurately represent intersample dissimilarities.( 6) The application of MDS, in conjunction with other analytical tools, serves to improve the accuracy of interpretations and conclusions of detrital zircon data.
Application of MDS mapping to the three data sets representing sediment accumulation in the North American platform, intermontane basin, and passive margin settings reveals that this approach provides informative visualizations of relative differences among sample zircon ages and the degree to which these differences can be related to their disposition in time and space.In each application, MDS configurations in three dimensions improved the resolving power due to the complexities of the detrital samples.MDS affords an important tool for understanding the nature of the distributions of zircon ages and ultimately, to the geologic processes and histories reflected in these data sets.

Figure 1 .
Figure 1.Four representations of intra-sample variations in detrital zircon ages, in this case within a sample from the top of the Devonian Temple Butte Formation (n = 100) exposed in the Grand Canyon (Gehrels et al., 2011).(A) Age frequency distribution binned at 5 Ma.(B) Probability density plot (PDP), where peaks (e.g., ca.1430 and 1725 Ma) reflect abundant (age component) grains presumably derived from different geologic provinces (e.g., Amarillo-Wichita and Yavapai-Mazatzal).(C) Kernel density estimation (KDE) derived using a fixed bandwidth of 40 Ma (Botev, 2007).(D) Best-fit Gaussian distributions to four of the apparent age components (1722, 1426, 1913, and 509 Ma), which account for 99% (88.3%, 3.4%, 4.6%, and 0.3%, respectively) of the observed variance in zircon ages for this sample.

Figure 2
Figure 2. (A, B, and C) Comparison of similarity, likeness, and the Sircombe-Hazelton dissimilarity metric among detrital zircon age data determined for sample probability density plots (PDPs; x-axes) versus sample kernel density estimations (KDEs; y-axes).Points are measures of difference between each of 2080 possible pairs of 65 samples from the Paleozoic (Gehrels et al., 2011) and Mesozoic (Dickinson and Gehrels, 2009, 2008) succession extending across the Colorado Plateau.Note that for each of these three metrics, similarity/dissimilarity values are virtually identical for PDP and KDE representations of sample ages.(D) Comparisons of values of PDP likeness and of PDP similarity among the 65 Colorado Plateau samples (R 2 = 0.94).
Figure 4. Two-dimensional metric multidimensional scaling (MDS) images employing different measures of dissimilarity (δ) based on variable mixing of three components (100 ± 10, 200 ± 10, and 300 ± 10 Ma) at 33.3% intervals.Rectangular insets are probability curves representing population ages for the nearest point in the MDS plot.Nearest neighbor lines (with respect to calculated dissimilarity) are solid back; second nearest neighbors are connected with dashed gray lines.S = stress value.(A) The central ternary shows the MDS map using likeness (LK).(B) The same model data set and parameters but using the crosscorrelation coefficient (CC) as the measure of dissimilarity.(C) As A and B, but using the Kolmogorov-Smirnov maximum difference between cumulative distribution functions (K-S D statistic) as the measure of dissimilarity.Note that among these three metrics, only likeness (A) produces the theoretically correct (and desired) configuration.

Figure 6 .
Figure 6.Models incorporating variable proportions of three age components.Probability curves affording three possible mixtures (columns) of two (upper and middle rows) bimodal age distributions with variable contributions of 100 ± 10, 200 ± 10 Ma, and 300 ± 10 Ma.LK = likeness values between top and middle row frequencies.Averages of 10 metric multidimensional scaling (MDS) results (bottom row), each comprising three random, 100-age draws from each of the two distributions in each of the five sets of mixtures; S = Stress.Note that actual sample differences manifest as MDS separation only becomes apparent as likeness approaches a value of ~0.1 (mixtures B and C).

DFigure 7 .
Figure 7. Two-versus three-dimensional multidimensional scaling (MDS) for four component mixtures.(A) Two-dimensional MDS for a four-component (100 ± 10, 200 ± 10, 300 ± 10, and 400 ± 10 Ma) mixture model (solid and dashed gray lines = nearest and next-nearest neighbors).(B) Shepard plot for two-dimensional MDS.The poor linearity between dissimilarities and distances indicates poor translation of the dissimilarity matrix.(C) Three-dimensional MDS for the four-component mixture model (circle sizes are scaled to reflect distance of the circle from point of view of viewer; smaller is more distant).(D) Shepard plot for three-dimensional MDS.The improved fit demonstrates the value of the added dimension in resolving sample differences.

Figure 8 .
Figure 8. Detrital zircon ages from the Paleozoic (Gehrels et al., 2011) and Mesozoic (Dickinson and Gehrels, 2009, 2008) succession across the Colorado Plateau.(A) Age probability densities (PDP) of 65 samples (gray horizontal lines) ranging from the Cambrian Tapeats Sandstone at the base through the Jurassic Morrison Formation at the top.Predominant modal age components as vertical tan lines.(B) Multidimensional scaling (MDS) plot of sample age likenesses colored by stratigraphic height.Sample points are scaled for distance into the page.Solid and dashed gray lines indicate nearest and next-nearest neighbors.Three sample clusters are: (stratigaphic base) dominated by ca.1.7 Ga Yavapai-Mazatzal and ca.1.4 Ga Amarillo-Wichita province grains (below ~420 m); (stratigraphic center) dominated by ca.1.7 Ga Yavapai-Mazatzal and ca.1.1 Ga Grenvillian grains (between ~420 and 1100 m); and (stratigraphic top) dominated by ca.1.1 Ga Grenvillian and younger province grains (above ~1100 m).Note several obvious anomalies in both probability density plot (A) and MDS (B) visualizations.These are an early Jurassic Moenave sample that contains numerous late Paleozoic grains ("1" and blue arrow in A and B), and two Late Cretaceous samples that are dominated by ca.1.7 Ga Yavapai-Mazatzal ages ("2" and red arrows in A and B).(C) Shepard plot for the three-dimensional MDS.

Figure 9 .Figure 10 .
Figure 9. Upper Mississippian to Paleocene detrital zircon ages from the Bighorn Basin, Wyoming (May et al., 2013).(A) Probability densities (PDP) of 44 samples (black horizontal lines).Dominant age components as vertical tan lines.(B) Multidimensional scaling (MDS) plot of sample age likeness colored by tectonostratigraphic assemblage membership as inferred by May et al. (2013).As in previous figure, sample points are scaled to distance into the page and nearest neighbor lines are represented as solid and dashed gray lines.Three of the four assemblages are well-resolved by MDS: (1) Upper Mississippian-Triassic samples (blue bar in A and points in B), (2) Jurassic-lower Albian samples (green bar in A and points in B), and (3) upper Albian-Cretaceous samples (yellow bar in A and points in B).A posited fourth assemblage comprising three Paleogene samples (red bar in A and points in B) does not well segregate as a distinct population.The persistence of ca.1.1 Ga Grenvillian grains visible in A creates a closer association between the upper Mississippian to Triassic (blue) and Jurassic to lower Albian (green) in MDS than the upper Albian to end Cretaceous.(C) Shepard plot for the three-dimensional MDS.
2 = 0.77) are somewhat less similar, and agreement between values likeness and