The (U-Th)/He dating technique is an essential tool in Earth science research with diverse thermochronologic, geochronologic, and detrital applications. It is now used in a wide range of tectonic, structural, petrological, sedimentary, geomorphic, volcanological, and planetary studies. While in some circumstances the interpretation of (U-Th)/He data is relatively straightforward, in other cases it is less so. In some geologic contexts, individual analyses of the same mineral from a single sample are expected to yield dates that differ well beyond their analytical uncertainty owing to variable He diffusion kinetics. Although much potential exists to exploit this phenomenon to decipher more detailed thermal history information, distinguishing interpretable intra-sample data variation caused by kinetic differences between crystals from uninterpretable overdispersion caused by other factors can be challenging. Nor is it always simple to determine under what circumstances it is appropriate to integrate multiple individual analyses using a summary statistic such as a mean sample date or to decide on the best approach for incorporating data into the interpretive process of thermal history modeling. Here we offer some suggestions for evaluating data, attempt to summarize the current state of thinking on the statistical characterization of data sets, and describe the practical choices (e.g., model structure, path complexity, data input, weighting of different geologic and chronologic information) that must be made when setting up thermal history models. We emphasize that there are no hard and fast rules in any of these realms, which continue to be an important focus of improvement and community discussion, and no single interpretational and modeling philosophy should be forced on data sets. 
The guiding principle behind all suggestions made here is for transparency in reporting the steps and assumptions associated with evaluating, integrating, and interpreting data, which will promote the continued development of (U-Th)/He chronology.

Over the last quarter-century, the field of (U-Th)/He chronology has been transformed by improved understanding of the fundamentals of the technique, the development of standard analytical workflows that are relatively straightforward to implement in labs, and the availability of tools to help interpret the significance of data. Together, these advances have enabled an explosion of applications. Depending on the circumstance, the (U-Th)/He method can be applied as a thermochronometer to decipher the thermal history of a rock, as a geochronometer to constrain mineral crystallization and the age of distinct geologic events, or in detrital studies to characterize the thermal history of sedimentary basins and source regions (Fig. 1). Conventional and emerging directions include constraining paleotopography, landscape evolution, tectonic exhumation along normal faults, and erosional exhumation at the local and orogenic scale (refer to Reiners et al., 2018, for examples), as well as detrital studies to decipher sediment provenance (e.g., Stockli and Najman, 2020), Fe oxide investigations to address fault zone processes and weathering histories (e.g., Shuster et al., 2005; Cooperdock and Ault, 2020; dos Santos Albuquerque et al., 2020), extraterrestrial material studies to infer impact histories (e.g., Kelly et al., 2018; Tremblay and Cassata, 2020), dating of tephras and other volcanic products to quantify the ages of volcanic eruptions (e.g., Danišík et al., 2012, 2021), and deep-time applications to decipher near-surface histories over hundreds of millions of years (e.g., McDannell and Flowers, 2020; Peak et al., 2021).

This rapid progress has generated strong needs for data integration, representation, statistical characterization, interpretation, and modeling approaches that are flexible and suitable for a diverse array of data sets. These methodologies are under continuous development. Challenges include how to evaluate and combine multiple analyses from the same material. Reproducible (U-Th)/He dates are expected for many samples, such as in geochronologic applications and thermochronologic studies in quickly cooled settings. However, in other cases, individual aliquots of the same mineral from the same sample (or multiple samples that underwent the same thermal history) are expected to yield dates that differ beyond the analytical uncertainties due to factors such as radiation damage and crystal size that cause crystals to have different He diffusion kinetics (e.g., Reiners and Farley, 2001; Fitzgerald et al., 2006; Flowers and Kelley, 2011; Brown et al., 2013; McDannell et al., 2018). Although this type of intra-sample dispersion (referred to here as “date variation”) is potentially beneficial for data interpretation if kinetic variability is sufficiently well understood, it also can be obscured by other factors that are unnoticed, unmeasured, or currently unquantifiable and cause systematic bias or scatter in the data (referred to here as “overdispersion” or “scatter”). This possible complexity in data distributions depending on time-temperature (tT) path and/or crystal characteristics can complicate the statistical representation of data sets. For example, it is not always appropriate to combine analyses from a sample using some measure of central tendency (such as a mean sample date), because these summary statistics assume that the population is normally distributed, which is not true for samples characterized by substantial date variation.
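One common way to screen whether aliquot dates from a sample agree within their analytical uncertainties, before deciding if a mean sample date is meaningful, is to compute an inverse-variance weighted mean together with the mean square of weighted deviates (MSWD). The sketch below is a minimal illustration only. The choice of MSWD as the screening statistic is an assumption made here rather than a recommendation of this paper, and the function name and all numerical values are hypothetical.

```python
import math

def weighted_mean_mswd(dates, sigmas):
    """Return (weighted mean, 1-sigma uncertainty of the mean, MSWD).

    dates  : individual aliquot dates (any consistent unit, e.g., Ma)
    sigmas : 1-sigma analytical uncertainties in the same unit
    """
    if len(dates) != len(sigmas) or len(dates) < 2:
        raise ValueError("need at least two dates with matching uncertainties")
    weights = [1.0 / s ** 2 for s in sigmas]          # inverse-variance weights
    wsum = sum(weights)
    mean = sum(w * d for w, d in zip(weights, dates)) / wsum
    mean_sigma = math.sqrt(1.0 / wsum)                # uncertainty of the weighted mean
    chi2 = sum(((d - mean) / s) ** 2 for d, s in zip(dates, sigmas))
    mswd = chi2 / (len(dates) - 1)                    # reduced chi-square, n - 1 degrees of freedom
    return mean, mean_sigma, mswd

# Hypothetical data: a reproducible sample and a sample with large date variation.
reproducible = weighted_mean_mswd([20.1, 19.8, 20.4, 20.0], [0.4, 0.4, 0.4, 0.4])
dispersed = weighted_mean_mswd([35.0, 62.0, 110.0, 148.0], [1.0, 1.5, 2.5, 3.0])
```

An MSWD near 1 is consistent with scatter explained by the analytical uncertainties alone, whereas a value far above 1 indicates that the dates differ beyond those uncertainties. The statistic cannot by itself distinguish interpretable date variation from overdispersion.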

In addition, differing thermal history modeling philosophies have been developed. Modeling is an interpretational process, and the application of different strategies can yield different types of insights into data from different geologic settings. When such modeling is used to help decipher thermochronologic data sets, interpretive choices must be made. These include how to assimilate data and their uncertainties, how to weight and input different types of chronologic and geologic information, how to permit a geologically reasonable level of complexity for tT path solutions, and how to define the model structure based on the sample context as well as the questions to be explored and hypotheses to be tested with the results. These decisions directly impact the model outcomes and the implications inferred from them. It is possible for different conclusions to reasonably be reached from the same data set depending on the approach and decisions made while setting up models.

Our goal here is not to resolve all of these complex issues, but rather to capture the current state of thinking on these topics, to outline the considerations when evaluating and integrating individual aliquot data and when assimilating data into thermal history modeling, to advocate against forcing any specific interpretational and modeling philosophy on data, and to instead promote transparency in reporting these practices so that these methods can continue to progress in tandem with the needs of different types of data sets. We first review sources of interpretable intra-sample date variation as well as causes of overdispersion (section 2); provide some suggestions for evaluating results in thermochronometer, geochronometer, and detrital studies (section 3); discuss considerations and statistical approaches when integrating analyses (section 4); summarize the decisions to be made when setting up thermal history models (section 5); and conclude by describing future directions (section 6). Most (U-Th)/He data have been acquired for apatite and zircon using the conventional single crystal methodology, so we accordingly focus this manuscript on these data types. The companion paper provides essential background for this contribution by describing the fundamentals of the method; dateable minerals; how individual aliquot data are acquired; the process and choices associated with data reduction; and recommended reporting practices for individual aliquot (U-Th)/He, kinetic, 4He/3He, and continuous ramped heating data (Flowers et al., 2022). This manuscript covers considerations associated with the subsequent steps along the path from individual analyses to data interpretation. We refer the reader to recent reviews that provide numerous examples of applications (Reiners et al., 2018; Ault et al., 2019; Tremblay et al., 2020). 
Our primary aim here is to provide the reader with practical knowledge to help decipher different types of (U-Th)/He data sets and to enhance the clear presentation of the interpretational steps in published products.

Multiple factors can cause (U-Th)/He dates from a sample to vary beyond their analytical uncertainties. This section, Figure 2, and Table 1 summarize possible contributors to data dispersion; their effects on the data; whether the magnitude of dispersion is affected by the nature of the thermal history; and whether and how the causes can be detected, exploited, or avoided. These factors include kinetic effects (e.g., due to radiation damage, grain size; section 2.1), U-Th zonation (section 2.2), grain fragmentation and abrasion (section 2.3), He implantation (section 2.4), and mineral and fluid inclusions (section 2.5). We then highlight the role of tT paths in data dispersion (section 2.6). We especially emphasize how to diagnose whether a given factor influences a (U-Th)/He data set and whether the factor can be leveraged in thermal history interpretation.

2.1. Date Variation Caused by He Diffusion Kinetic Variability

Date variation can be caused by effects that introduce differences in He diffusion kinetics (and thus closure temperature, or TC value) among grains from a single sample or from multiple samples that underwent the same thermal history. These effects can expand the thermal history information accessible by the data. Their influence can be substantial for samples that underwent protracted tT paths but is minimized for samples in geochronometer studies and from young, fast-cooled settings (section 2.6). Figures 3–4 illustrate how radiation damage and grain size influence apatite and zircon (U-Th)/He dates for tT paths characterized by fast cooling, protracted cooling through the He partial retention zone (HePRZ), and reheating into the HePRZ. See section 3 of the companion paper (Flowers et al., 2022) for a more complete summary of how each factor is thought to mechanistically control He diffusion; the main focus here is on the data patterns generated by each effect.
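To give a rough sense of how kinetic differences translate into closure temperature differences, the sketch below evaluates the classical Dodson (1973) closure temperature expression for a spherical diffusion domain by fixed-point iteration. This is a simplified stand-in and not one of the damage-based kinetic models discussed in this paper: it assumes monotonic cooling at a constant rate and fixed Arrhenius parameters. The activation energy, frequency factor, radii, and cooling rate are illustrative assumptions only, and the function name is hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618                      # J mol^-1 K^-1
SECONDS_PER_MYR = 1.0e6 * 365.25 * 24 * 3600    # seconds in one million years

def dodson_closure_temperature(ea_j_mol, d0_m2_s, radius_m, cooling_k_per_myr,
                               geometry_factor=55.0, tol=1e-6, max_iter=100):
    """Closure temperature (K) from Dodson's equation, solved iteratively.

    geometry_factor = 55 is Dodson's value for a sphere.
    """
    rate = cooling_k_per_myr / SECONDS_PER_MYR  # cooling rate in K/s
    tc = 400.0                                  # starting guess in K
    for _ in range(max_iter):
        tau = GAS_CONSTANT * tc ** 2 / (ea_j_mol * rate)
        arg = geometry_factor * tau * d0_m2_s / radius_m ** 2
        new_tc = ea_j_mol / (GAS_CONSTANT * math.log(arg))
        if abs(new_tc - tc) < tol:
            return new_tc
        tc = new_tc
    raise RuntimeError("closure temperature iteration did not converge")

# Illustrative (assumed) apatite-like parameters: Ea = 138 kJ/mol, D0 = 5e-3 m^2/s,
# cooling at 10 K/Myr, for a small and a large equivalent spherical radius.
tc_small = dodson_closure_temperature(138e3, 5e-3, 40e-6, 10.0)
tc_large = dodson_closure_temperature(138e3, 5e-3, 150e-6, 10.0)
print(f"Tc (40 um):  {tc_small - 273.15:.1f} C")
print(f"Tc (150 um): {tc_large - 273.15:.1f} C")
```

Under these assumed parameters the larger domain closes at a higher temperature than the smaller one. Because the expression ignores damage accumulation and annealing, it should not be used to interpret samples with protracted or reheating tT paths.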

2.1.1. Radiation Damage

A variety of work has demonstrated that the accumulation of radiation damage due to radioactive decay has a substantial effect on the He diffusivity of apatite and zircon (Fig. 2A; see section 3.3 of companion paper for a summary, Flowers et al., 2022; e.g., Shuster et al., 2006; Guenthner et al., 2013). Plots of date versus eU provide a means to evaluate the influence of radiation damage on data from the same sample (or from samples with the same tT path), because eU is a proxy for damage for grains with the same thermal history (Figs. 3–4, plots in the middle column; Shuster et al., 2006; Flowers et al., 2007). This is because total accumulated radiation damage is due to both eU (representing the mineral’s alpha productivity) and the time and conditions of damage accumulation and annealing. For minerals that underwent the same damage accumulation and annealing conditions (and with the same damage accumulation and annealing behavior), eU becomes a useful damage proxy. In some cases, radiation damage variation across a mineral suite with the same tT path can generate positive (apatite and zircon; Figs. 3B–3C and 4B) and/or negative (zircon; Fig. 4C) correlations between date and eU. Date-eU plots have become a standard approach for representing data graphically, both because, of the factors that influence diffusion kinetics, radiation damage has the greatest potential leverage (with tens to >100 °C variation in single-mineral closure temperatures possible), and because this factor can be exploited in thermal history interpretation.
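As a minimal sketch of how a date-eU relationship might be screened numerically, the code below computes eU from U and Th concentrations and then a Spearman rank correlation between date and eU. The Th weighting of 0.235 is one commonly used convention and is an assumption here (other coefficients, and formulations that include Sm, are also in use). The function names and the aliquot values are hypothetical.

```python
import math

def effective_uranium(u_ppm, th_ppm, th_weight=0.235):
    """eU in ppm; th_weight is an assumed convention, not prescribed by this paper."""
    return u_ppm + th_weight * th_ppm

def _average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation (Pearson correlation of the ranks)."""
    rx, ry = _average_ranks(x), _average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)   # undefined if all values in x or y are identical

# Hypothetical aliquots from one sample: U and Th in ppm, dates in Ma.
u = [5.0, 12.0, 25.0, 40.0, 60.0]
th = [10.0, 20.0, 30.0, 80.0, 100.0]
dates = [30.0, 45.0, 70.0, 95.0, 120.0]
eu = [effective_uranium(a, b) for a, b in zip(u, th)]
rho = spearman(eu, dates)
```

A rank correlation is used because date-eU relationships are generally not expected to be linear. A strong correlation is only suggestive, since the interpretation still depends on the tT path and on whether the grains share the same thermal history.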

Radiation damage and its associated effect on He diffusivity are evolving properties of crystals. Longer intervals of damage accumulation more strongly impact an individual mineral’s He retentivity and allow for greater divergence of He retentivities across minerals characterized by variable eU and thus variable damage generation rates. These effects are also magnified by higher eU (one reason why radiation damage affects zircon more profoundly than apatite) and greater eU variation in the crystal suite. Together, these factors increase the probability that a date-eU correlation will be observed among grains from a sample that experienced a given tT path. Note that detrital samples that did not undergo full post-depositional resetting will likely have additional scatter on date-eU plots because of differing predepositional thermal histories (e.g., Flowers et al., 2007; Guenthner et al., 2015).

He diffusion kinetic models that track the evolution of He diffusion as a function of the accumulation and annealing of radiation damage are available for both apatite and zircon (e.g., Flowers et al., 2009; Gautheron et al., 2009; Guenthner et al., 2013; Gerin et al., 2017; Willett et al., 2017). Their improvement is an important focus of ongoing work. These models enable the effects of radiation damage on (U-Th)/He data sets to be simulated quantitatively (see section 5), and allow for thermal history information to be extracted from date-eU relationships. The use of these kinetic models requires the input of the parent isotope concentrations for the simulated minerals, so that the rates of radiation damage accumulation can be tracked.

2.1.2. Crystal Size

He loss is controlled by the domain size, which for apatite and zircon is usually considered to be the size of the physical crystal itself (Fig. 2B; see section 3.2 of companion paper, Flowers et al., 2022; e.g., Farley, 2000; Reiners and Farley, 2001). Crystal size is typically represented by the equivalent spherical radius (RS), which is the radius of a sphere with the same surface-area-to-volume ratio or the same alpha-ejection (FT) correction as the simulated grain. Plots of date versus RS (Figs. 3–4, plots in the right column) can be used to assess if crystal size variation influences data dispersion for samples in which whole crystals were analyzed and the crystal represents the diffusion domain (this is not necessarily true for fragments or detrital minerals). This effect can generate positive correlations between date and RS for protracted tT paths (Figs. 3B–3C and 4B–4C) and can cause closure temperatures to vary by ~10 °C for common grain size differences. Unlike radiation damage, crystal size is not an evolving property of crystals, with detrital grains being possible exceptions.
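The surface-area-to-volume definition of RS can be written explicitly: a sphere of radius R has a surface-area-to-volume ratio of 3/R, so the equivalent radius of a grain with volume V and surface area SA is RS = 3V/SA. The sketch below applies this to a grain idealized as a right circular cylinder. The cylindrical geometry is a simplifying assumption made here (real apatite and zircon crystals are prisms, commonly with terminations), the FT-equivalent definition of RS is not shown, and the function names are hypothetical.

```python
import math

def equivalent_spherical_radius(volume, surface_area):
    """Radius of the sphere with the same surface-area-to-volume ratio: RS = 3V/SA."""
    if volume <= 0 or surface_area <= 0:
        raise ValueError("volume and surface area must be positive")
    return 3.0 * volume / surface_area

def cylinder_rs(radius, length):
    """RS for a grain idealized as a right circular cylinder (assumed geometry)."""
    volume = math.pi * radius ** 2 * length
    surface_area = 2.0 * math.pi * radius * length + 2.0 * math.pi * radius ** 2
    return equivalent_spherical_radius(volume, surface_area)

# Hypothetical grain: 50 um cylindrical radius, 200 um long (result in um).
rs_example = cylinder_rs(50.0, 200.0)
```

For a cylinder the expression reduces to RS = 3rL / [2(L + r)], so elongating a grain of fixed width increases RS only modestly, and RS approaches 1.5r for very long grains.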

Because radiation damage can exert much greater control on He diffusivity than size, date-eU correlations may be present in the data while positive date-crystal size patterns are not (e.g., Weisberg et al., 2018a). For zircon, variation in crystal aspect ratio may introduce dispersion on date-size plots because of anisotropic diffusion (see companion paper, Flowers et al., 2022). Negative date-size correlations can even occur owing to eU-size relationships among the dated grains. For example, negative eU-size correlations in apatite (e.g., Reiners et al., 2018) and positive eU-size correlations in zircon (e.g., Baughman and Flowers, 2020) may generate negative date-size patterns. In some cases, it is possible for thermal histories to create size-date correlations in the absence of date-eU patterns if crystal size varies widely across a mineral suite but damage does not, due to a limited eU range and/or short damage accumulation time (e.g., Reiners et al., 2018). For detrital grains with pre-depositional tT paths and abrasional rounding histories that differ, distinct patterns may not emerge on date-size plots. Like radiation damage, the effect of crystal size on He diffusion is included in kinetic model frameworks used for thermal history interpretation.

2.1.3. Major Element Chemistry

Chemical composition has the potential to influence diffusional characteristics by modifying the mineral lattice structure and annealing properties (see section 3.4 of companion paper, Flowers et al., 2022; e.g., Gautheron et al., 2013; Djimbi et al., 2015). However, the importance of this factor for He diffusion in different minerals is not yet well constrained. The effect of chemistry on annealing kinetics is best-understood for apatite, where the magnitude depends on apatite chemistry, chemical variability across the apatite suite, and the tT path. The crystal chemistry effect on annealing properties is included in apatite kinetic models (e.g., Flowers et al., 2009; Gautheron et al., 2009) and thus can be incorporated in thermal history models if apatite chemistry is measured.

2.1.4. Potential for Trapping in Coarser Defects

Crystal imperfections such as microfractures, dislocations, void spaces, and vacancies have the potential to trap He in the crystal and bias results to older dates (see section 3.5 of companion paper, Flowers et al., 2022; e.g., Danišík et al., 2017; Gerin et al., 2017; Zeitler et al., 2017). These features are not easily detectable with techniques that are currently part of the workflow of most (U-Th)/He labs, and are not included or interpretable within current kinetic model frameworks. However, identification of anomalous diffusivity behavior due to crystal defects in individual grains is possible using continuous ramped heating approaches (e.g., Idleman et al., 2018; McDannell et al., 2018), which hold opportunity for more routine use to fingerprint and discard such crystals from data sets. Avoiding the analysis of minerals from highly deformed rock samples may help minimize the probability of this trapping effect. The degree to which the trapping effect influences (U-Th)/He dates is an area of active research.

2.2. Dispersion Caused by U-Th-Sm Zonation

Parent nuclides (U, Th, and Sm) are generally assumed to be homogeneously distributed in crystals dated by (U-Th)/He, but many grains do not conform to this simplified assumption (Fig. 2C). Zonation of parent nuclides has the potential to affect (U-Th)/He dates by complicating the FT correction (e.g., Meesters and Dunai, 2002a, 2002b; Hourigan et al., 2005), modifying the shape of the He diffusion profile, and creating intracrystalline domains with varying radiation damage fractions and He diffusivities, thus affecting the bulk crystal retentivity (e.g., Farley et al., 2011; Danišík et al., 2017). Typically, grains with rims enriched in eU will yield dates younger than those of unzoned crystals, while grains with enriched cores will yield dates older than those of their unzoned counterparts.

The scale of the different zonation effects on individual dates and on intra-sample data dispersion depends on zonation magnitude, zonation pattern, and the variability of zonation magnitude and pattern among the grains dated from a sample. Thermal history also plays a role in whether zonation induces data dispersion via diffusion profile modification and heterogeneous intracrystalline radiation damage effects (see section 2.6). Parent nuclide zonation is generally a significant problem for FT corrections only when zonation causes most eU to be concentrated either within 15 μm of the rim or more than 15 μm from the rim. In apatite, severe systematic zonation of this type is uncommon, and typical zonation magnitudes cause FT correction inaccuracies of <2%–5% (Farley et al., 2011; Ault and Flowers, 2012). In zircon, however, substantial zonation is more common. While fine-scale oscillatory zonation is less problematic, other zonation patterns can cause larger inaccuracies, particularly in the case of metamorphic overgrowths (Hourigan et al., 2005; Orme et al., 2015). If zonation patterns among grains from a sample are not systematic, but instead vary in pattern and magnitude between grains, this is likely to increase the overall dispersion of the data by biasing individual dates to both older and younger than the unzoned case rather than skewing the dates unidirectionally (e.g., Farley et al., 2011; Ault and Flowers, 2012).

Although parent isotope zonation in crystals is only rarely characterized quantitatively prior to (U-Th)/He analysis, moving forward there would be value in its more routine integration in the dating workflow, especially for tT histories and minerals where zonation is likely to be most problematic. A qualitative sense of zonation pattern and magnitude can be obtained by cathodoluminescence, backscatter, or Raman imaging, as well as by fission-track etching (for U). However, quantitative data are needed to account for zonation effects. Such data are typically acquired via laser ablation–inductively coupled plasma–mass spectrometry (LA-ICP-MS) mapping or drilling techniques (e.g., Hourigan et al., 2005; Farley et al., 2011; Johnstone et al., 2013; Danišík et al., 2017). Once parent isotope zonation is characterized, (U-Th)/He dates can be corrected with zoned FT values (e.g., Hourigan et al., 2005; Farley et al., 2011), and thermal history modeling can use the observed zonation to properly account for the evolution of He diffusion profiles and intracrystalline damage for a given tT path (e.g., Meesters and Dunai, 2002a, 2002b; Ketcham, 2005).

2.3. Dispersion Caused by Fragmentation and Abrasion

2.3.1. Grain Fragmentation

While the ideal practice is to analyze only euhedral and intact crystals, sometimes studies include the analysis of crystal fragments (i.e., fragments of larger crystals that were broken after development of the diffusion profile, such as during mineral separation; Fig. 2D, Brown et al., 2013; Beucher et al., 2013). For grains without parent isotope zonation and that have cooled rapidly, the distribution of He should be uniform within the crystal (except near the grain edge due to ejection). Consequently, there is no core-to-rim variation in the corrected (U-Th)/He date, and dates will not vary between fragments (if it is possible to make appropriate FT corrections, which is not always straightforward). However, for grains that have experienced a protracted thermal history such that a rounded diffusion profile developed, dating fragments of the crystal that capture the grain edge will yield different dates than fragments from the grain interior (Fig. 2D; even for crystals without parent isotope zonation and even if appropriate FT corrections are made). This arises because the grain has an intracrystalline gradient in core-to-rim date that is younger toward the grain edge. The influence of this “fragment effect” can be substantial and is dependent on the size, geometry, and eU distribution within the initial crystal and also, importantly, on the nature of the thermal history experienced. The more protracted the thermal history, the more rounded the diffusion profile becomes, which will intensify this effect. The date variation from fragmentation alone may vary from ~7% even for rapid (~10 °C/Ma), monotonic cooling to over 50% for protracted, complex histories that cause significant diffusional loss of He. The magnitude of dispersion arising from fragmentation scales with the grain’s cylindrical radius and is of a similar magnitude to dispersion expected from differences in absolute grain size alone (with RS values varying from 40 μm to 150 μm).

The fragment effect on date variation can be usefully harnessed in terms of thermal history reconstruction because it is sensitive to the thermal history experienced, and its effect can be calculated explicitly. It can therefore be included in thermal history optimization algorithms. Because of the sensitivity to the unknown thermal history of the sample, unlike the FT correction, which is entirely geometrical, this effect cannot be determined a priori and used to “correct” dates for dated fragments. This raises some important issues in terms of integrating dates determined on fragments into a quantitative interpretation. If care is taken to select, carefully measure, and fully characterize fragments properly (e.g., dimensions, number of terminations, geometry), then the date variation caused by the fragment effect can be exploited and used to constrain thermal history information. The corollary, though, is that dates determined on fragments cannot be interpreted robustly without recourse to some form of thermal history modeling that does explicitly include the fragmentation effect. Therefore, in many situations, it may be simpler to avoid analysis of fragments altogether.

2.3.2. Natural Grain Abrasion in Detrital Samples

Detrital grains are often abraded during fluvial or aeolian transport and therefore can lose some portion of the grain margins affected by alpha ejection and diffusive He loss (Fig. 2E). In the case of detrital samples that underwent complete He loss and damage annealing after deposition during burial reheating, grain abrasion effects become irrelevant, and the grains can be treated like those from a bedrock sample (see section 3.3). However, for detrital samples that experienced only partial or no resetting after deposition (section 3.3), the grain abrasion effect can significantly influence date dispersion and accuracy, as different grains had varying amounts of the grain exteriors removed during transport.

It is generally difficult to quantify the abrasion effect in unreset or partially reset samples due to uncertainties in the amount of grain abrasion and the time when abrasion occurred. However, the geologic context, depositional setting, sediment provenance, U-Pb dates for double-dated crystals, and grain shapes may provide some insights into the extent of date dispersion due to abrasion. In general, abrasion causes the preferential loss of terminations, facets, edges, and corners from the crystal and causes the crystal surfaces to become pitted and appear dull and less transparent. This, in turn, obscures cracks, inclusions, and other impurities during grain selection, which makes high-quality crystal selection more challenging. Apatite is more affected by abrasion than zircon because apatite is less mechanically and chemically durable. This can result in detrital mineral separates containing both well-rounded apatite and clear euhedral zircon. For the same reason, detrital samples sourced from regions of sedimentary rocks can lack apatite while still yielding large amounts of zircon.

Detrital grains collected from fluvial/glacial systems may not be abraded much if the catchment size is relatively small and transport distance short (e.g., Stock et al., 2006; Ehlers et al., 2015; Enkelmann et al., 2015). In this case, a standard FT correction can be applied to (U-Th)/He dates, because little of the alpha-depleted grain margins would be removed by abrasion. Similarly, coarse-grained, immature clastic sedimentary rocks may contain euhedral clear crystals suggesting negligible abrasion or shielding from abrasion due to inclusion in larger clasts or mineral grains.

In contrast, accessory detrital grains that have been transported farther and are contained in texturally more mature sediment and traveled as isolated grains will show varying degrees of abrasion that will have removed differing amounts of the grain edge affected by alpha-ejection and diffusive He loss. In this case, for unreset or partially reset detrital samples, applying a standard FT correction is less appropriate and can result in overestimation of the true date to unknown degrees. One way to address this problem can be selection of only the least abraded grains (with terminations and edges) (e.g., Pujols et al., 2020), although this might introduce other biases in terms of provenance (e.g., the closest source may have the least abraded grains). Another possible solution is the use of in situ laser ablation techniques that allow dating of the grain interiors only, thus avoiding the margins affected by alpha ejection (e.g., Tripathy-Lang et al., 2013; Pickering et al., 2020), although—depending on interpretive strategy—this may introduce a thermal history bias (grain rims are also where diffusional loss occurs).
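The size of the potential overestimate can be bounded with a simple calculation. The sketch below uses the standard alpha-retention expression for a sphere with homogeneously distributed parent nuclides, FT = 1 − 3S/(4R) + S³/(16R³), where S is the alpha stopping distance and R is the sphere radius. It then considers an end-member case, assumed here for illustration, in which abrasion removed the entire alpha-depleted margin after essentially all He had accumulated, so that the measured date needs no ejection correction. The stopping distance, grain radius, and raw date are hypothetical values, and the function name is an illustrative choice.

```python
def ft_sphere(radius_um, stopping_um=20.0):
    """Fraction of alphas retained in a homogeneous sphere (valid for R >= S/2).

    stopping_um = 20 is an assumed, illustrative alpha stopping distance.
    """
    if radius_um < stopping_um / 2.0:
        raise ValueError("expression is only valid for radius >= stopping distance / 2")
    ratio = stopping_um / radius_um
    return 1.0 - 0.75 * ratio + ratio ** 3 / 16.0

# Hypothetical unreset detrital grain: 60 um radius, raw (uncorrected) date of 100 Ma.
radius = 60.0
raw_date = 100.0
ft = ft_sphere(radius)
standard_corrected = raw_date / ft      # date after a standard FT correction
# End-member assumption: the depleted margin was fully abraded away, so the raw
# date is already appropriate and the standard correction inflates it by 1/FT.
max_inflation = standard_corrected / raw_date
print(f"FT = {ft:.3f}, standard-corrected date = {standard_corrected:.1f} Ma")
```

Real grains fall between no abrasion (the standard correction is appropriate) and this end-member, and He produced after deposition is again subject to ejection, so the calculation gives only an upper limit on the inflation under the stated assumptions.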

2.4. Erroneously Old Dates Caused by He Implantation

In some circumstances, parentless He due to sources other than the mineral’s in situ radioactivity can bias results to erroneously old dates and add overdispersion to data sets. Just as He atoms generated within one alpha particle-stopping distance of the grain margin may be ejected out of the crystal, He atoms produced that same distance outside of the crystal may be injected into the grain, known as He implantation (Fig. 2F; e.g., Farley, 2002; Fitzgerald et al., 2006; Spiegel et al., 2009; Danišík et al., 2010; Gautheron et al., 2012; Murray et al., 2014). Although the magnitude of He implantation is usually insignificant relative to in situ-produced He, in some cases, “bad neighbor” U-Th–rich crystals (e.g., Gautheron et al., 2012) or secondary U-Th–rich rim phases (Murray et al., 2014) violate this assumption (Fig. 2F). Dates for smaller, lower eU, and lower He grains are most susceptible to bias to older results, because injected He has a proportionately larger influence on the total He budget (e.g., Janowski et al., 2017). However, this effect can also add overall scatter to the data by implanting varying amounts of He into different crystals and increasing their dates to different degrees. Note that unlike kinetic variability, the magnitude of the He implantation effect on the data is independent of the thermal history.

Grains separated and dated by the conventional method retain no record of their petrographic context, so He implantation is not easily quantified or corrected. However, crystals characterized by surficial discoloration may indicate the former presence of a U-Th–rich coating and should be avoided during grain selection (e.g., Murray et al., 2014). In addition, negative date-eU correlations at very low eU (<5 ppm) may arise from this phenomenon, providing a mechanism to diagnose this problem (e.g., Murray et al., 2014). Petrographic examination may also provide evidence for the likelihood or prevalence of He implantation effects.

2.5. Dispersion Caused by Mineral and Fluid Inclusions

U-Th–bearing mineral inclusions can cause spuriously old (U-Th)/He dates through complications to the FT correction from parent nuclide heterogeneity (section 2.2), creation of local He diffusivity variations due to variable radiation damage, and parentless He for analytical procedures that incompletely dissolve the included phases (e.g., apatite, Fig. 2G; Lippolt et al., 1994; Farley, 2002; Ehlers and Farley, 2003). For minerals like zircon, which commonly have higher eU and for which more aggressive dissolution procedures are used, inclusions are less likely to influence the total He budget and therefore the (U-Th)/He date. For phases like apatite, lower eU and less aggressive dissolution means that such inclusions are theoretically more problematic (Fig. 2G). However, included phases would need to be extremely large and greatly enriched in eU to cause the “too-old” dates that are sometimes attributed to smaller and subtler micro-inclusions (Vermeesch et al., 2007). Thus, inclusion problems should generally be avoidable during crystal selection. It may be possible to fingerprint dated crystals with undetected inclusions by inspection of rare earth element (REE) data (if analyzed via solution ICP-MS during acquisition of the parent isotope results) or U-Th data, because the chemical patterns of grains with inclusions may differ from those of inclusion-free grains.

Fluid inclusions also have the potential to impact the data (Fig. 2G; e.g., Lippolt et al., 1994; Stockli et al., 2000; Farley, 2002; Danišík et al., 2017; Zeitler et al., 2017). Inclusions with excess He would cause dates to be anomalously old, while those with excess eU would bias dates younger. However, as for mineral inclusions, the relatively large sizes and high excess concentrations needed for fluid inclusions to substantially influence the data means it should be generally possible to circumvent fluid inclusions during grain selection.

2.6. Role of Thermal History in Date Variation and Overdispersion

The thermal history plays a central role in determining if some types of data dispersion are manifested (e.g., Flowers et al., 2007, 2009; Flowers and Kelley, 2011; Cogné et al., 2012; Wildman et al., 2016). Figure 5 contains schematic date-eU plots that illustrate if and how the different sources of dispersion are likely to shift the date for endmember samples characterized by (1) a young crystallization age, rapid cooling history, and no reheating (like examples in Figs. 1A–1B, 3A, and 4A) or (2) an old crystallization age and tT path characterized by slow cooling through or residence in the HePRZ (like examples in Figs. 3B and 4B) or reheating and partial resetting in the HePRZ (like examples in Figs. 1D, 3C, and 4C).

Dispersion stemming from the thermal history is minimal for the young, quickly cooled sample (Fig. 5A). This scenario is applicable to geochronologic studies (e.g., in which the dated minerals crystallized at temperatures appropriate for full He retention and were not reheated) or thermochronologic studies on young, rapidly exhumed samples (e.g., Cenozoic intrusive samples emplaced in rapidly eroding orogenic belts or rapidly exhumed rocks due to normal faulting; e.g., Farley et al., 2001; Stockli et al., 2000; Ehlers et al., 2015). In this case, radiation damage-induced differences in He diffusivity among the dated crystals will be minimized. And even if individual crystals have variable diffusivities (owing to differences in radiation damage, crystal size, or other factors), fast and approximately synchronous cooling through their temperature sensitivity range would cause them to yield similar dates. Such a sample should yield roughly uniform dates across a broad eU span on a date-eU plot (Fig. 5A). The potential contributors to overdispersion are relatively few and include U-Th zonation effects on the FT correction, He implantation, and inclusions. In general, if these factors are operative, they tend either to cause a skew in the date population toward erroneously old dates (e.g., He implantation, mineral inclusions, excess He in fluid inclusions, crystal imperfections) or to symmetrically increase the data dispersion (e.g., if intra-sample zonation patterns among grains are not systematic but instead vary in pattern and magnitude among grains).

In contrast, dispersion is amplified for the older sample characterized by a protracted tT path, where all sources of date scatter have the potential to contribute (Fig. 5B). Endmember examples of this scenario occur in Precambrian basement from cratonic settings (Figs. 1D, 3C, and 4C; e.g., Flowers, 2009; Baughman and Flowers, 2020; Sturrock et al., 2021), but basement samples in Mesozoic and Cenozoic orogenic belts can also show these effects (Figs. 3B and 4B; e.g., Flowers et al., 2008; Enkelmann et al., 2014; McKeon et al., 2014). In this situation, individual crystals in a sample may have diffusivities (and therefore TC values) that vary substantially owing to radiation damage-induced divergence of their He retentivities, in addition to possible contributions from other kinetic effects (e.g., crystal size). These variable diffusivities can be strongly manifested in the data for tT histories characterized by slow cooling through the HePRZ or reheating and partial resetting, because the crystals will pass through their temperature sensitivity ranges at different times during cooling or have different magnitudes of He loss during reheating and therefore will record different dates. Such samples should yield a positive and/or negative correlation across a wide eU span on a date-eU plot (Fig. 5B). However, the date-eU patterns have the potential to be distorted or obscured by other factors that induce data dispersion, especially those that are also magnified by this type of protracted tT path. These other factors include additional influences on variable He diffusion kinetics, such as crystal size or heterogeneous intracrystalline radiation damage due to U-Th zonation, as well as factors that disrupt or affect the He diffusion profile, such as grain fragmentation or U-Th zonation. In addition, the factors that affect the quickly cooled sample dates may still play a role (e.g., U-Th zonation effects on the FT correction, He implantation, inclusions, crystal imperfections). Again, most unidentified sources of error lead to excess He, which biases the results to older dates.

Whereas the previous section was aimed at summarizing potential contributors to data dispersion within individual samples, the purpose of this section is to offer some practical guidelines for evaluating and presenting real (U-Th)/He data sets. A first-order consideration is whether the data are part of (1) a thermochronologic study that constrains the thermal history, (2) a geochronologic study that constrains the mineral crystallization age, or (3) a detrital study on sedimentary rocks or unconsolidated sediment. This section suggests a general workflow for systematically evaluating (U-Th)/He thermochronologic (section 3.1) and geochronologic data sets (section 3.2) and describes additional considerations associated with detrital studies (section 3.3). Our goal is to provide researchers with some strategies for evaluating data to help ensure that only reliable analyses are interpreted and that possible influences on the data are considered before proceeding to data interpretation, and, in some cases, thermal history modeling.

3.1. Evaluating (U-Th)/He Thermochronologic Data Sets

(U-Th)/He thermochronologic studies are those for samples that yield information about the thermal history. Many such samples yield data that are relatively straightforward to interpret, either because they yield reproducible dates or because they show date variation for which the first-order controls are understood, are accounted for in kinetic model frameworks, and can be used to advantage during interpretation (e.g., radiation damage, crystal size). However, a subset of samples yield data that are less straightforward and exhibit large dispersion for which the causes cannot be corrected for or are not fully understood. These latter samples should be subject to only limited (if any) first-order interpretation. Even samples with crystals that appear to be ideal when selected under the microscope can yield data that fall into this difficult-to-interpret category (e.g., due to He implantation or uncharacterized eU zonation).

One possible strategy for systematically evaluating apatite and zircon (U-Th)/He thermochronologic data sets of typical size consists of: (1) assessing individual analysis quality, (2) constructing date-eU and date-grain size plots and evaluating data patterns, (3) considering outliers, and (4) deciding if and how to combine individual analysis data with a statistical model (e.g., by reporting a mean sample date). Figure 6 illustrates this workflow, along with additional steps described in section 5 associated with interpreting data using thermal history modeling. This approach assumes that the crystals were selected for quality using generally accepted criteria under a binocular microscope with crossed polars, such that the analyzed grains surpass a minimum size, are fracture-free, and lack visible fluid inclusions (all minerals) and mineral inclusions (for apatite).

First, individual analyses should be evaluated to exclude those of doubtful quality from interpretation and to check that the most appropriate assumptions were used during data reduction (Fig. 6). Crystals with small FT values (<~0.5), and those with He and parent isotope amounts near blank values, are questionable. Labs may have threshold blank values based on how reproducible and well-characterized lab blanks are, below which the analysis is rejected as unreliable. For grains analyzed that are anhedral, the user should confirm that the most appropriate morphology was used to calculate the derived data (e.g., FT values, concentrations). For crystals that are fragments from which the alpha-depleted edge is thought to have been fully removed, it may be most appropriate to report and consider only the uncorrected (U-Th)/He date (unless fragmentation preceded the tT path of interest). If a single-grain analysis is characterized by anomalous Th, U, or Th/U and yields a wildly different date, then this may suggest the presence of inclusions that bias the date and be grounds for excluding the analysis.
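The first screening step can be expressed as a simple filter. The following is a minimal sketch only: the FT cutoff of 0.5 follows the approximate value given above, whereas the record fields, the He units, and the `blank_factor` multiplier are hypothetical placeholders that each lab would replace with its own blank-based rejection criterion.

```python
from dataclasses import dataclass


@dataclass
class Analysis:
    """One single-grain analysis (field names are illustrative assumptions)."""
    name: str
    ft: float             # alpha-ejection correction factor (FT)
    he_fmol: float        # measured He amount, hypothetical units
    he_blank_fmol: float  # lab He blank in the same units


def flag_questionable(a, ft_min=0.5, blank_factor=5.0):
    """Return the reasons an analysis is questionable (empty list if none).

    ft_min follows the ~0.5 guideline in the text; blank_factor is an
    assumed, lab-specific multiple of the blank, not a published value.
    """
    reasons = []
    if a.ft < ft_min:
        reasons.append("small FT")
    if a.he_fmol < blank_factor * a.he_blank_fmol:
        reasons.append("He near blank")
    return reasons


# Hypothetical analyses, for illustration only.
grains = [
    Analysis("a1", ft=0.72, he_fmol=10.0, he_blank_fmol=0.1),
    Analysis("a2", ft=0.45, he_fmol=10.0, he_blank_fmol=0.1),
    Analysis("a3", ft=0.80, he_fmol=0.3, he_blank_fmol=0.1),
]
flags = {g.name: flag_questionable(g) for g in grains}
```

A flagged analysis is a candidate for closer inspection rather than automatic rejection, and whichever thresholds are used should be stated when the data are reported.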

Second, date-eU and date-RS plots may be constructed to determine if correlations are present (Figs. 3, 4, 6, and 7). Such plots are useful for seeking patterns in date-eU and date-crystal size space, evaluating secondary dispersion attributable to other effects, and for visually identifying analyses that may be outliers and merit additional detailed data inspection. Such plots are only valid for grains that experienced the same thermal history, so patterns for each sample should be evaluated independently, although similar patterns for different samples may suggest a similar tT path (note that for unreset and partially reset detrital samples, grains may have additional scatter on date-eU plots because the pre-depositional history may differ among the crystals; see section 3.3). Grain fragments should generally be excluded from grain size plots, or fragment-date relationships should be considered separately, if fragmentation is thought to postdate the tT history of interest. In this circumstance, the fragments do not represent the diffusion domain, and the purpose of date-size plots is to evaluate possible date-diffusion domain relationships.
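In addition to visual inspection of the plots, a rank correlation coefficient offers one simple numerical check for a monotonic date-eU relationship. The sketch below assumes eU = U + 0.235 × Th, a commonly used weighting that is not defined in this section, and the concentrations and dates are hypothetical. It is a sketch of one possible check, not a prescribed test.

```python
def effective_uranium(u_ppm, th_ppm, th_weight=0.235):
    """eU in ppm; the Th weighting is an assumed, commonly used value."""
    return u_ppm + th_weight * th_ppm


def ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = mean_rank
        i = j + 1
    return r


def pearson(x, y):
    """Pearson correlation; undefined if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical single-sample data (ppm and Ma).
u = [10.0, 25.0, 40.0, 60.0, 90.0]
th = [20.0, 30.0, 10.0, 50.0, 40.0]
dates = [35.0, 48.0, 52.0, 70.0, 81.0]
eu = [effective_uranium(a, b) for a, b in zip(u, th)]
rho = spearman(eu, dates)  # positive rho suggests a positive date-eU trend
```

With only a handful of grains, a high coefficient can arise by chance, so such a value supports rather than replaces inspection of the plots. The same function can be applied to date versus grain size.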

Following construction of the data plots, the data patterns should be evaluated and their significance considered (Fig. 6). Examples of real data sets that display different pattern types are in Figure 7. Possible endmember types of patterns that have differing implications are:

  • Uniform dates for crystals of similar eU and size (Fig. 7A). In this case, the He diffusivities of the grains dated are not thought to vary substantially. More limited information resides in the data than if crystals of variable eU and size (and therefore possibly variable kinetics and TC) were dated, because these grains should yield the same date regardless of thermal history, and thus they are consistent with either fast cooling or more protracted tT paths.

  • Uniform dates across a wide eU and size span (Fig. 7B). In this circumstance, the similarity of the dates implies rapid cooling and approximately synchronous passage of all grains through their temperature sensitivity ranges. The TC variability of the minerals dated depends on their eU and grain size range and on the thermal history prior to rapid cooling (longer damage accumulation times allow for divergent damage accumulation amounts and therefore greater variability in He retentivities across the mineral suite; see section 2.1.1). This example is analogous to the interpretation of fast cooling when different thermochronometers yield overlapping dates. For instance, if biotite 40Ar/39Ar, zircon fission-track, and apatite (U-Th)/He dates all overlap at 10 Ma, rapid cooling at 10 Ma is inferred.

  • Systematic positive date-eU correlations (Figs. 7C–7D for apatite, Fig. 7H for zircon), negative date-eU correlations (Figs. 7I–7J for zircon), or positive date-crystal size correlations (Fig. 7E). These relationships suggest that the grains have variable retentivities and record more protracted tT paths. This might reflect slow cooling through the HePRZ, during which the grains began retaining He at different times, or reheating into the HePRZ, which caused variable magnitudes of He loss across the crystal suite.

  • Excessive scatter uncorrelated with eU (Figs. 7F–7G) and size. This “overdispersion” has a number of possible causes (Table 1; Fig. 5). Anomalously old dates for very low eU (<5 ppm) crystals suggest a problem with He implantation (Fig. 7F; section 2.4), which merits omission of these analyses or the entire sample from data interpretation or motivates petrographic examination of the sample in thin section to assess the probability of this effect. If grain fragments were analyzed, then relationships between fragment size and date should be considered (section 2.3.1). Significant and variable U-Th zonation, unidentified U-Th or He-rich mineral or fluid inclusions, and/or kinetic variability induced by other factors could also contribute to excess dispersion (Fig. 7G). As noted previously, some of these factors can cause a skew toward older date distributions. Detrital samples also may exhibit scattered data due to the inheritance of variable amounts of pre-depositional He and radiation damage across the crystal suite (section 3.3). Interpreting inexplicably scattered data without understanding and correcting for the cause(s) should be done with caution. Such samples should either be rejected from thermal history modeling to avoid erroneous interpretations because the dates are strongly affected by mechanisms not included in the kinetic model framework (section 5), or additional information should be obtained (e.g., U-Th zonation data) to help decipher the cause(s) of the data pattern and move the sample into the interpretable realm.

In reality, some samples may yield data patterns that fall between these endmember types owing to multiple contributors to data variability (Table 1; Fig. 5). For example, a sample may yield generally uniform dates across a broad eU and size span but have older outlier dates for only the lowest eU crystals. This suggests bias from He implantation for these low eU analyses, such that excluding them from interpretation may be justified. Alternatively, a sample may yield a generally positive date-eU correlation and no date-grain size correlation but also be characterized by substantial additional scatter. Such a pattern suggests that the grains have variable retentivities owing to radiation damage differences across the crystal suite, but other factors such as zonation contribute to additional date scatter. The compatibility of the results with data from other nearby samples, and the resolution needed from the data to address the question asked, are considerations in deciding if and how to interpret the data.

Another step may be an attempt to identify outliers (Fig. 6). Sometimes, particularly in overdispersed sets of (U-Th)/He dates, analyses may pass the first quality control step described above but appear unusually or erroneously old without an evident connection to eU or grain size. Such grains are made apparent by comparison with other grains, particularly in large-N (tens of grains) data sets (Ketcham et al., 2018; Cooperdock et al., 2019; He et al., 2021); comparison among grains with diffusive loss during ramped heating that is normal versus irregular (see section 8.4 of companion paper, Flowers et al., 2022; Zeitler et al., 2017; McDannell et al., 2018); or comparison with independent data (e.g., dates older than the crystallization age of the unit or the crystal itself in the case of double-dating studies). This is perhaps to be expected, as many hypothesized but difficult-to-detect sources of error result in excess He, such as He implantation, inclusions, and He-retaining defects. Big data or machine learning algorithm approaches (Recanati et al., 2021), or checking for the compatibility of allowable thermal histories for each analysis during thermal history modeling (Sousa and Farley, 2020; section 5), may also help identify outliers. At this time, there is neither an agreed-upon definition of outliers in this context nor a consensus on the best approach to fingerprint them. Certainly, dates that violate solid geological or geochronological constraints, such as being older than crystallization ages, may be considered outliers and eligible for omission from interpretation and thermal history modeling.

Finally, whether or not a statistical model should be used to combine and report sample data may be considered (Fig. 6). Combining analyses using some measure of central tendency, such as a mean sample date, is only appropriate when individual analyses are expected to define a normally distributed population. This is discussed further in section 4.

3.2. Evaluating (U-Th)/He Geochronologic Data Sets

In (U-Th)/He geochronologic studies, the (U-Th)/He date constrains the age of mineral crystallization or a distinct geologic event, such as a volcanic eruption (example in Fig. 8). In this scenario, variable kinetic effects between grains do not contribute to data dispersion, and a normally distributed population of dates is expected, such that combining analyses using a central tendency statistic is generally appropriate (see section 4). Interpretation is therefore simpler than in some thermochronologic data sets. Assessing individual analyses to exclude those of doubtful quality (see previous section) and evaluating outliers that also may be justifiable for omission is still recommended. Again, if data are skewed, they will more likely be biased to older dates due to effects such as He implantation and mineral inclusions (e.g., Ketcham et al., 2018; Cooperdock et al., 2019; He et al., 2021).

3.3. Additional Considerations for Detrital Studies

The (U-Th)/He dates derived from sedimentary rocks typically yield wide dispersion due to (1) varying detrital provenance, derivation from different source areas, and thus varying He inheritance and radiation damage from differing pre-depositional thermal histories; (2) varying kinetic effects (e.g., eU, chemistry, crystal size, which may be more variable than for bedrock samples owing to disparate grain sources); (3) repeated and/or prolonged residence within the HePRZ; (4) generally lower crystal quality due to weathering and abrasion that hinder selection of crystals without cracks and inclusions; (5) the effects of variable amounts of grain abrasion that removed different amounts of the He-depleted grain exterior (Fig. 2E); and (6) generally smaller grain size resulting in larger FT corrections and greater effects of possible He implantation. All of these factors can cause additional scatter on date-eU and date-grain size plots. For detrital samples that were not fully reset after deposition, eU is compromised as a proxy for radiation damage because each grain may have had a thermal history that differed before grain deposition (e.g., Flowers et al., 2007; Guenthner et al., 2015; Fox et al., 2019).

For detrital data analysis, it is important to first establish to what degree the sample has been thermally reset after sediment deposition (Fig. 9A). To do so, it is crucial to know the stratigraphic age and leverage independent constraints on thermal maturation or maximum burial temperatures (e.g., vitrinite reflectance data).

An unreset detrital sample was not buried and heated at all, or it was buried and heated to temperatures below the HePRZ, such that the (U-Th)/He dates predate the time of deposition and/or are equal in the case of syn-depositional volcanic input (Fig. 9B). Unreset detrital samples include those of modern (unconsolidated) sedimentary deposits. In these cases, the (U-Th)/He date records the thermal history of the source region and can be used as a provenance tool or for geomorphic/erosion studies. For these studies, the analysis of many crystals (N > 50) is needed to obtain statistically meaningful date distributions (e.g., Vermeesch, 2004; Stock et al., 2006; Ehlers et al., 2015). Grain precharacterization via U-Pb and/or other methods may also be applied to strategically target a smaller number of analyses on one or more subpopulations (e.g., Campbell et al., 2005; Reiners et al., 2005). The date dispersion is expected to be large, because each crystal records an individual thermal history experienced in the source region. The appearance of each crystal analyzed should be evaluated for abrasion to guide the decision of whether the (U-Th)/He date should be FT corrected or not (see companion paper, Flowers et al., 2022). The recent development of laser ablation (U-Th)/He dating techniques will be beneficial for future detrital studies of unreset samples because the outer rim of the grain can be excluded from analysis and a high number of crystals can be analyzed in a time- and cost-effective manner (e.g., Tripathy-Lang et al., 2013; Pickering et al., 2020). This method will potentially bias results toward older core dates that might have experienced less diffusive loss in the source region, so this factor should be considered in data interpretation. Extra caution must be taken when collecting detrital samples in catchment regions affected by wildfires. Even short durations of fire have the potential to reset or partially reset apatite (U-Th)/He dates within the top centimeters (<3 cm) of exposed bedrock, which will erode and supply grains to the sediment system (e.g., Reiners et al., 2007; Mitchell and Reiners, 2003). In detrital studies where the thermal history of the sediment source region is of interest, the analysis of clast- and cobble-size material may be considered (e.g., Colgan et al., 2008). Individual clasts/cobbles can be treated and analyzed like a bedrock sample, allowing multi-grain analyses that will improve data uncertainty and allow thermal history modeling. Multi-method dating can be additionally conducted on each cobble to expand the tT history and other provenance information (e.g., Grabowski et al., 2013; Enkelmann and Garver, 2016; Falkowski et al., 2016; Willett et al., 2020). Double- and triple-dating methods on individual detrital grains (e.g., U-Pb-He or U-Pb-He and fission track) have also proven to be a powerful approach to defining source area thermal histories by refining the detrital provenance (e.g., Rahl et al., 2003; Reiners et al., 2005; Thomson et al., 2017; Kirkland et al., 2020; Stockli and Najman, 2020).

A fully reset detrital sample that was heated to temperatures above the HePRZ results in the complete loss of He and (U-Th)/He dates that are younger than the depositional age (Fig. 9C). In this case, the (U-Th)/He data can constrain peak burial heating temperatures and subsequent cooling/exhumation histories. Heating to temperatures above the HePRZ may not result in the full annealing of accumulated radiation damage, which affects the He diffusion kinetics and can contribute to date dispersion. Evaluating the data for a fully reset detrital sample follows the recommendations for evaluating a bedrock thermochronologic data set as outlined above (section 3.1). In reality, many reset sedimentary rocks show significant dispersion, such that calculating and reporting mean sample dates may be inappropriate (e.g., McKay et al., 2021), owing to a combination of the reasons described at the beginning of this section as well as possible effects of variable damage annealing during burial. This dispersion appears to be particularly common for pre-Cenozoic sedimentary strata because kinetic effects are amplified over long timescales (section 2.6; Fig. 5B). Forward tT-path modeling can be a useful tool for exploring the possible range of (U-Th)/He date dispersion due to varying kinetic parameters (e.g., eU, grain size) and pre-depositional thermal histories (for samples with incomplete damage annealing during burial) (e.g., Flowers et al., 2007; Powell et al., 2016; Schwartz et al., 2017; Fox et al., 2019; McKay et al., 2021).

A partially reset detrital sample was heated to temperatures within the HePRZ and yields a distribution of (U-Th)/He dates that both predate and postdate the time of deposition (Fig. 9D). In these samples, date dispersion may be caused by the crystal’s individual He inheritance from the source region or different source regions, the effects of grain abrasion (such that there is an unknown amount of He loss from alpha ejection and the diffusion profile at the grain margin), the post-depositional He accumulation and diffusive loss during prolonged time within the HePRZ, as well as other effects described in section 2. Owing to large data dispersion, calculating a mean sample date and discarding older and younger grains as outliers is generally inappropriate. In a partially reset sample, detailed information about the tT path is limited, but the maximum burial can be constrained to the temperature window of the HePRZ (e.g., Enkelmann et al., 2010). The reporting of both the uncorrected and FT-corrected (U-Th)/He dates is recommended, because some of the measured He accumulated after deposition. Visualization tools such as radial plots and principal component analysis can be used to identify grain populations within partially reset samples, whereby the youngest age population may be attributed to the time of cooling after maximum burial heating (e.g., Vermeesch, 2009). U-Pb-He double dating of crystals can determine detrital provenance and narrow source terrane derivation; this can help discretize differential He inheritance and be useful for deriving and separately modeling the thermal histories of different detrital populations (e.g., Yonkee et al., 2019). Additionally, partially reset detrital data sets can be forward modeled together to inform post-depositional basin thermal histories (e.g., Fosdick et al., 2015).

Overall, it can be difficult to evaluate the degree of resetting in a detrital sample, and a higher number of analyses (N>>10) than typically acquired for bedrock samples may be needed to gain more confidence in the post-depositional maximum heating temperature. For that reason, it is highly recommended to review all existing geological and thermal information of the study region such as other thermochronologic data, metamorphic grades, and/or sediment maturation data to estimate the maximum burial temperatures and guide the choice of thermochronologic system used for addressing specific geologic questions.
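The three cases described above (unreset, fully reset, and partially reset) can be summarized as a first-pass comparison of single-grain dates with the depositional age. The sketch below is deliberately simplistic and rests on stated assumptions: it ignores date and stratigraphic age uncertainties, uses a hypothetical function name, and does not incorporate the independent thermal constraints (e.g., vitrinite reflectance) that the text recommends.

```python
def classify_reset(dates_ma, depositional_age_ma):
    """First-pass label for a detrital sample from date vs. depositional age.

    Dates equal to the depositional age are treated as unreset (e.g.,
    syn-depositional volcanic input). Uncertainties are ignored here,
    which is an assumption a real workflow would need to relax.
    """
    if not dates_ma:
        raise ValueError("no dates supplied")
    older = any(d > depositional_age_ma for d in dates_ma)
    younger = any(d < depositional_age_ma for d in dates_ma)
    if older and younger:
        return "partially reset"
    if younger:
        return "fully reset"
    return "unreset"


# Hypothetical samples with an 80 Ma depositional age.
labels = [
    classify_reset([120.0, 95.0, 210.0], 80.0),
    classify_reset([30.0, 42.0, 55.0], 80.0),
    classify_reset([30.0, 95.0, 150.0], 80.0),
]
```

Because a small number of grains may miss either the older or younger part of the date distribution, a label from this comparison is only as reliable as the number of analyses behind it.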

4.1. Considerations

How to properly represent and statistically characterize a suite of individual (U-Th)/He analyses from a sample is an important focus of ongoing study. Any summary statistic that is used to represent sample data assumes that the underlying distribution of dates for the sample population is known (e.g., He et al., 2021). Measures of central tendency, such as a mean sample date, assume that the population is normally distributed (defining a Gaussian distribution), where the width of the distribution represents the likelihood that another analysis will yield the same result.

Measures of central tendency are most appropriate for samples that yield generally uniform dates (either across a limited or wide eU and grain size range) and lack skewed date distributions or date variation due to kinetic effects (real data set example in Fig. 8). This scenario is most likely for samples in geochronologic studies and for samples in thermochronologic studies with young crystallization ages (and therefore limited time for radiation damage-induced divergence of He retentivities), little grain size variation, and rapid cooling histories (examples in Figs. 1A–1B, 3A, 4A, and 5A). However, even in this circumstance, samples may yield a skewed distribution of dates if factors that cause asymmetric date excursions are present. As mentioned previously, most sources of bias that induce asymmetric data distributions cause old-date excursions that may lead to a positive skewness in date distributions, either in large-N data sets or across samples at a given locality or within a particular lithology. In such cases, a central tendency statistic should not be reported unless the old-date signal can be isolated and/or removed. If sufficient data have been acquired to demonstrate that the date distribution is skewed to older dates, one recently proposed approach is to use the youngest statistically meaningful date population to estimate the time of cooling (He et al., 2021).

Using a measure of central tendency to represent sample data is generally inappropriate for samples characterized by substantial date dispersion due to kinetic variation (see real data set examples in Figs. 7C–7J). Kinetically caused data dispersion is common for samples with older crystallization ages and/or those characterized by protracted thermal histories (examples in Figs. 1D, 3B–3C, and 4B–4C) and can be manifested as systematic correlations between date and eU or date and grain size (see sections 2.1.1 and 2.1.2). In these circumstances, the date population is not expected to be normally distributed owing to variable He retentivities (and thus variable TC values) among the mineral suite dated (Fig. 5B), even if other effects that could contribute to data dispersion are absent (e.g., parentless He). At present, it is common not to use any summary statistic to represent the data distribution of these samples.

The low number of crystals (typically 5–10 analyses) dated in routine work for typical samples in thermochronologic and geochronologic studies can make it difficult to definitively determine whether or not the data define a normally distributed population. Similarly, a small number of grains makes spurious date-eU or date-size correlations more likely to occur (Ketcham et al., 2018). Looking forward, regular acquisition of more analyses for each sample would help address this problem (section 6). At present, it is not always straightforward to decide on whether to use a summary statistic to represent the data. One practical approach to this issue is to simply assign a threshold level of data dispersion below which a central tendency measure is reported and above which it is not, and clearly state the value used. For example, the standard deviation is a metric used to characterize the variability in the measurements of a population. So, one might report a mean and standard deviation for samples with dispersion lower than an assigned and clearly stated level (such as for samples with less than a 15% or 20% sample standard deviation) and not report a mean for samples with higher levels of data dispersion.
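The threshold approach described above can be sketched as follows. This is an illustration under stated assumptions: the 20% cutoff is one of the two example values mentioned in the text, the function name and returned fields are hypothetical, and the dates are invented for demonstration.

```python
from statistics import mean, stdev


def summarize_dates(dates_ma, max_rel_sd=0.20):
    """Report a mean only if the relative sample SD is below max_rel_sd.

    max_rel_sd is a user-chosen threshold that should be stated
    explicitly alongside the results; 0.20 is only an example value.
    """
    if len(dates_ma) < 2:
        raise ValueError("need at least two analyses")
    m = mean(dates_ma)
    sd = stdev(dates_ma)  # sample standard deviation (n - 1 denominator)
    rel_sd = sd / m
    report = rel_sd < max_rel_sd
    return {
        "rel_sd": rel_sd,
        "report_mean": report,
        "mean": m if report else None,
        "sd": sd if report else None,
    }


# Hypothetical samples: one reproducible, one strongly dispersed.
uniform = summarize_dates([10.0, 11.0, 9.0, 10.5, 9.5])
dispersed = summarize_dates([20.0, 45.0, 80.0, 60.0, 35.0])
```

In this sketch the first sample has a relative standard deviation of about 8% and a mean is reported, whereas the second has a relative standard deviation of about 48% and no mean is returned.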

Established statistical tests used for geochronometry, such as the mean squared weighted deviation (MSWD; Wendt and Carl, 1991), can also serve as an indicator of whether to report a mean. However, the fidelity of the MSWD relies on the assumption that uncertainties are well understood and quantified. Thus, in the context of (U-Th)/He, an MSWD is not always straightforward to use in its intended role of identifying potential outliers and testing for mixed populations; a high MSWD may result from a problem with the data, but it may alternatively reflect that uncertainties are underestimated (e.g., if FT uncertainties are poorly known and not included in the uncertainty computed for the individual aliquot corrected (U-Th)/He dates; see companion paper, Flowers et al., 2022). Nevertheless, for geochronologic applications of (U-Th)/He (section 3.2), we recommend that an MSWD or a similar test be considered, with a reasonable criterion being that it be below $1 + 2\sqrt{2/f}$, where $f$ is the number of degrees of freedom (Wendt and Carl, 1991). This will help align practice with other geochronologic techniques and incentivize further careful investigation into how best to quantify uncertainties. See Table 2 for how to calculate the MSWD for different central tendency statistics, which are discussed next.
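As an illustration of this criterion, the sketch below computes the MSWD about an inverse-variance weighted mean and compares it with the threshold. It assumes the standard form of the MSWD, the sum of squared uncertainty-normalized residuals divided by f = n − 1; the expressions in Table 2 for other central tendency statistics may differ. The dates and uncertainties are hypothetical.

```python
from math import sqrt


def weighted_mean(dates, sigmas):
    """Inverse-variance weighted mean."""
    w = [1.0 / s ** 2 for s in sigmas]
    return sum(wi * x for wi, x in zip(w, dates)) / sum(w)


def mswd(dates, sigmas):
    """MSWD about the inverse-variance weighted mean, with f = n - 1."""
    if len(dates) < 2:
        raise ValueError("need at least two analyses")
    xbar = weighted_mean(dates, sigmas)
    f = len(dates) - 1
    return sum(((x - xbar) / s) ** 2 for x, s in zip(dates, sigmas)) / f


def mswd_threshold(f):
    """Upper criterion 1 + 2*sqrt(2/f) for f degrees of freedom."""
    return 1.0 + 2.0 * sqrt(2.0 / f)


# Hypothetical dates (Ma) with uniform 0.3 Ma uncertainties.
dates = [10.0, 10.4, 9.8, 10.1, 9.7]
sigmas = [0.3] * 5
value = mswd(dates, sigmas)
limit = mswd_threshold(len(dates) - 1)
passes = value <= limit
```

For these five analyses f = 4, so the threshold is about 2.41, and the computed MSWD of about 0.83 falls below it. The same value would exceed the threshold if the assigned uncertainties were several times smaller, which illustrates how strongly the outcome depends on the uncertainty estimates.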

4.2. Approaches for Reporting a Central Tendency Statistic and Its Uncertainty When This Is Considered Appropriate

In cases where it is considered appropriate to integrate (U-Th)/He dates from multiple individual analyses using a measure of central tendency, a number of methods have been used or proposed. Table 2 lists the equations associated with several approaches, and Figure 8 illustrates their application to a geochronologic data set.

4.2.1. Unweighted and Weighted Means

The mean and weighted mean (Fitzgerald et al., 2006) are the most widely used measures of central tendency. Table 2 lists three approaches for reporting means (Equations 1, 6, and 11).

The regular (unweighted) mean does not weight the individual analyses by their uncertainties (Table 2, Equation 1). Some have favored this approach because individual analysis uncertainties are not fully characterized at present.

The weighted mean weights the individual analyses by their uncertainties. As uncertainty estimates are further improved, enabling more complete uncertainty characterization of the individual analyses, use of the weighted rather than the unweighted mean may be preferable. Using the weighted mean requires a decision regarding how to weight the individual analyses. One approach is to weight the mean using the inverse variance ($1/\sigma_i^2$) for the weights ($w_i$) (Table 2, Equation 6). However, a shortcoming of this weighting is that younger grains tend to be weighted more, which biases the combined outcome to a younger result (Peyton et al., 2012). This is particularly inappropriate when uncertainties are assigned to individual analyses as a uniform percentage based on the reproducibility of standards, as has been done in some past work. To avoid this bias toward younger results, an alternative approach is to weight instead by the squared relative deviation ($w_i = 1/[\sigma_i/x_i]^2$) (Table 2, Equation 11). If a uniform relative error on the individual analyses is assumed, this is equivalent to the unweighted mean, but otherwise this approach allows weighting based on differential analytical uncertainties.
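The effect of the weighting choice can be illustrated with a short Python sketch. It assumes the usual weighted-mean form, sum(w_i x_i) / sum(w_i), with the two weight definitions given above; the function names and the example dates with a uniform 10% uncertainty are hypothetical.

```python
def unweighted_mean(dates):
    return sum(dates) / len(dates)

def weighted_mean(dates, weights):
    return sum(w * x for w, x in zip(weights, dates)) / sum(weights)

def inverse_variance_mean(dates, sigmas):
    """Weights w_i = 1 / sigma_i^2."""
    return weighted_mean(dates, [1.0 / s ** 2 for s in sigmas])

def relative_weighted_mean(dates, sigmas):
    """Weights w_i = 1 / (sigma_i / x_i)^2."""
    return weighted_mean(dates, [1.0 / (s / x) ** 2 for x, s in zip(dates, sigmas)])

# Hypothetical dates (Ma) with a uniform 10% relative uncertainty
dates = [10.0, 20.0, 30.0]
sigmas = [0.10 * x for x in dates]

print(unweighted_mean(dates))                 # plain average
print(inverse_variance_mean(dates, sigmas))   # pulled toward the youngest date
print(relative_weighted_mean(dates, sigmas))  # matches the plain average here
```

With uniform relative uncertainties the absolute uncertainty scales with the date, so inverse-variance weights favor the youngest grain, whereas the relative weights are all equal and recover the unweighted mean.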

Other options such as the pooled age, the isochron age, and the central age have also been proposed (Vermeesch, 2008). However, the pooled age tends to be dominated by high-eU grains, and the isochron and central age may be better suited for larger datasets than are collected routinely at present.

4.2.2. Reporting Uncertainty in the Mean

Several approaches may be used to represent the uncertainty on the combined date (Table 2).

  1. The standard error based on the sample standard deviation is one approach, referred to here as the “SD-based SE.” In Table 2, Equations 2–3, 7–8, and 12–13 are the expressions for the sample standard deviation and the associated standard error for the unweighted, weighted, and relative weighted cases. For small sample sizes where relatively few grains are measured, as is typical at present, a potential issue with this approach is that the SD-based SE may yield a value that is too low if analyses are coincidentally similar.

  2. The standard error based on the individual analysis uncertainties, referred to here as the “individual uncertainty-based SE,” may be appropriate to use if the uncertainties of the individual analyses are well quantified and if the individual analysis uncertainties appear to properly represent the observed variability in the data. In Table 2, Equations 4, 9, and 14 are the expressions for the sample standard error based on the individual analysis uncertainties for the unweighted, weighted, and relative weighted cases. If individual analysis uncertainties are too low (for example, if $F_T$ uncertainties are not propagated into the individual analysis uncertainties), a shortcoming of this approach is that it will yield a standard error that is too small.

Given the limitations of each standard error method, one conservative approach for representing the uncertainty on the combined data is to report the higher of the two values: the maximum of either the SD-based, or individual uncertainty-based, SE.
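A minimal Python sketch of this conservative choice is given below for the unweighted case only. It assumes the standard forms s / sqrt(n) for the SD-based SE and sqrt(sum(sigma_i^2)) / n for the individual uncertainty-based SE; these are assumed to correspond to, but are not copied from, the Table 2 expressions, and the function names and example values are hypothetical.

```python
import math
import statistics

def sd_based_se(dates):
    """Standard error from the sample standard deviation: s / sqrt(n)."""
    return statistics.stdev(dates) / math.sqrt(len(dates))

def uncertainty_based_se(sigmas):
    """Standard error of an unweighted mean propagated from the individual
    1-sigma uncertainties: sqrt(sum(sigma_i^2)) / n."""
    return math.sqrt(sum(s ** 2 for s in sigmas)) / len(sigmas)

def conservative_se(dates, sigmas):
    """Report the larger of the two standard errors."""
    return max(sd_based_se(dates), uncertainty_based_se(sigmas))

# Hypothetical case 1: dates coincidentally identical, so the SD-based SE is zero
print(conservative_se([10.0, 10.0, 10.0, 10.0], [1.0, 1.0, 1.0, 1.0]))

# Hypothetical case 2: dates scatter far beyond their stated uncertainties
print(conservative_se([8.0, 12.0], [0.1, 0.1]))
```

The two cases mirror the two failure modes described above: in the first, the individual uncertainties control the reported value; in the second, the observed scatter does.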

For data sets in which individual analysis uncertainties clearly do not account for the observed variability in the data, the sample standard deviation may be a useful alternative or additional measure to report as a representation of the variability of the individual (U-Th)/He dates.

We emphasize again that it is only appropriate to combine individual analyses for a sample into a mean with an associated uncertainty if the analyses are believed to represent a normally distributed population. The potential utility of the MSWD for deciding whether to integrate analyses into a mean was discussed in the previous section. In Table 2, Equations 5, 10, and 15 are the expressions for the MSWD for the unweighted, weighted, and relative weighted cases. Again, many circumstances lead to populations of dates with substantial and expected variation due to kinetic effects (e.g., radiation damage, crystal size), or in other cases to populations that are positively skewed owing to undetected effects that in general tend to bias the results to older dates. In these cases, sample means and other central tendency statistics should not be reported.

4.3. Recommendations

There currently is no required and universally agreed-upon approach for how to statistically characterize a set of individual (U-Th)/He analyses from a sample. This will continue to be informed by improved understanding of the controls on He diffusion kinetics, new techniques that can identify and remove biased dates, and acquisition of larger data sets to better characterize the distribution of dates from a sample population (section 6). Reporting a central tendency statistic (e.g., mean sample date) is appropriate for samples with normally distributed dates (e.g., common in geochronologic studies), but it is not correct for samples with substantially skewed dates or kinetic variation. We recommend explaining the rationale for combining or not combining sample data using a summary statistic. If a summary statistic is used, then state how data are combined (e.g., unweighted mean, weighted mean and nature of weighting), how the uncertainty is calculated, what factors are included in the uncertainty, and the confidence interval (1σ or 2σ).

5.1. Overview

Thermal history modeling is an interpretive step that is commonly used to convert thermochronologic data into thermal histories from which geologic conclusions are derived (Fig. 6). This is done because (U-Th)/He dates are quantities that represent a mineral’s time-integrated thermal history, and a variety of tT paths can yield the same (U-Th)/He date (see sections 2.1 and 2.5 of companion paper, Flowers et al., 2022). Much can generally be learned from (U-Th)/He data sets even without tT modeling, but it can be powerful for testing specific hypotheses and exploring particular questions with a data set. This process involves using He diffusion kinetic model(s) for the mineral(s) of interest in a tT modeling program to determine the range of thermal histories that can explain the data while honoring other geologic and geochronologic constraints on the tT path.

Thermal history modeling can be done in both a forward and an inverse sense. Forward modeling involves choosing a given tT path to predict the dates (e.g., Ketcham, 2005). Different segments of the selected tT path (e.g., heating magnitude or duration, cooling timing or rate) can be varied by the user to gain a conceptual understanding of how different parts of the tT history influence the predicted date pattern and to determine which tT path segments the data are sensitive to and actually constrain. Forward modeling is therefore a recommended preliminary step before inverse modeling, so that the latter is not a “black box.” Inverse thermal history modeling involves generating a suite of forward tT paths with a defined set of characteristics from which dates are predicted, compared with the observed input data, and used to constrain the suite of viable tT histories that can account for the thermochronologic, as well as any independent geologic and geochronologic, data (e.g., Ketcham, 2005, 2012; Gallagher, 2012).
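The distinction between forward and inverse modeling can be illustrated with a deliberately simplified Python sketch. It is a toy under strong assumptions and does not represent the algorithm of HeFTy or QTQt: the He diffusion kinetic model is replaced by a fixed, hypothetical closure temperature, tT paths are restricted to monotonic cooling with a single intermediate node, and the start point, end point, acceptance rule, and all names are illustrative.

```python
import random

CLOSURE_T = 70.0  # deg C; hypothetical fixed value, a toy stand-in for a kinetic model

def forward_date(path, closure_t=CLOSURE_T):
    """Toy forward model: time (Ma) at which a cooling path crosses closure_t.

    path: list of (time_Ma, temp_C) nodes ordered from oldest to youngest.
    Returns None if the path never cools through closure_t.
    """
    for (t0, temp0), (t1, temp1) in zip(path, path[1:]):
        if temp0 >= closure_t >= temp1 and temp0 != temp1:
            frac = (temp0 - closure_t) / (temp0 - temp1)  # linear interpolation
            return t0 + frac * (t1 - t0)
    return None

def toy_inversion(obs_date, obs_sigma, n_trials=2000, seed=0):
    """Toy inverse model: draw random paths, keep those within 2 sigma of the date."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_trials):
        # Hypothetical fixed start (100 Ma, 200 C) and end (0 Ma, 10 C);
        # only the intermediate node is varied, which keeps cooling monotonic.
        mid = (rng.uniform(0.0, 100.0), rng.uniform(10.0, 200.0))
        path = [(100.0, 200.0), mid, (0.0, 10.0)]
        pred = forward_date(path)
        if pred is not None and abs(pred - obs_date) <= 2.0 * obs_sigma:
            accepted.append(path)
    return accepted

# Forward: one chosen path predicts one date
print(forward_date([(100.0, 200.0), (50.0, 120.0), (0.0, 10.0)]))

# Inverse: many different paths reproduce the same hypothetical observed date
paths = toy_inversion(obs_date=30.0, obs_sigma=3.0)
print(len(paths))
```

Even in this toy setting, many distinct paths are accepted for a single observed date, which is the non-uniqueness that motivates exploring path sensitivity with forward models before running an inversion.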

Thermal history modeling requires careful consideration of the input thermochronologic data, a solid understanding of independent geologic and tT constraints that may bear on the samples and how to reasonably and defensibly incorporate them into the models, deliberate implementation of a geologically plausible level of complexity in tT path solutions given the sample context, an understanding of the criteria used to determine solutions and how these criteria affect interpretation of model outcomes, complete representation of tT model outputs, and full explanation of all of the above aspects in published products (Table 3). Deciphering the complete, continuous thermal history of the sample is generally not feasible. More commonly, the goal is to resolve the tT path well in the tT range of interest for the problem being addressed. Being cognizant of which portions of the thermal history the data do and do not constrain is key for reliable data interpretation.

The two most commonly used thermal history modeling programs are HeFTy (Ketcham, 2005) and QTQt (Gallagher, 2012). Both programs allow for forward and inverse modeling. The differing strategies and philosophies that these programs use for generating tT paths during inverse modeling, for statistically determining which paths predict dates that acceptably reproduce the observed data, and for depicting inversion model results were recently discussed in a paper and associated comments and replies (Vermeesch and Tian, 2014; Gallagher and Ketcham, 2018; Gallagher and Ketcham, 2020; Vermeesch and Tian, 2020; Green and Duddy, 2020). Broadly comparing software approaches, HeFTy uses frequentist statistics, testing the null hypothesis that the data could be a sample from the set of possibilities implied by the model given measurement uncertainties. A Monte Carlo scheme is used to generate tT paths, with the user assigning a degree of permitted complexity to different segments of the history to help ensure a fuller mapping of the set of geologically reasonable paths that fit the data. QTQt uses a Bayesian methodology, which allows the user to set up a series of priors concerning the thermal history and optionally include other factors such as diffusion kinetics, allowing them to vary. The Markov Chain Monte Carlo (MCMC) method is used to search the solution space by learning from earlier-attempted paths and includes a penalty for complex paths (with many nodes); thus, it seeks and highlights the simplest set of solutions that best fit the data. In both programs, constraints or priors can be used to enforce external information concerning the thermal history, such as depositional events (i.e., stratigraphic ages).

To interpret the geologic significance of thermal histories derived from the modeling effort, additional assumptions, such as the geothermal gradient or other factors, are then required (Fig. 6). Thermal-kinematic modeling software such as PECUBE, which computes the spatial and temporal variability of upper crustal temperatures during topographic evolution and faulting, can also be used to constrain which geomorphic, structural, and geodynamic histories are most consistent with a (U-Th)/He data set (e.g., Braun, 2003; Ehlers and Farley, 2003; Ehlers, 2005; Braun et al., 2012).

We do not address all of these complex topics here but instead focus specifically on the practical choices and assumptions that must be made by the user during setup of inverse thermal history models. A critical point is that interpretive decisions are unavoidable when setting up models, including what data types and samples to simulate, how to input these data into the models, what modeling program to use, whether to explore and represent a wide range of feasible histories or to favor simpler tT paths in model outputs, what criteria are used to determine whether the data are replicated, and what independent geologic and geochronologic data are relevant to the samples and should be honored. These choices should be carefully considered and deliberately made, but they are not always clear cut, such that there may be multiple reasonable paths through this decision tree. It therefore is essential that published products articulate the choices made and logic used along this path and represent how well the preferred tT path solutions reproduce the data (section 5.4; Table 3). We first reiterate the critical concepts of “model structure” and hypothesis testing that are central to constructing inverse thermal history models (Gallagher and Ketcham, 2018; section 5.2), describe two common strategies for inputting data into inverse models (section 5.3), and conclude with some recommendations for model reporting (section 5.4; Table 3).

5.2. Designing a Model Structure, Testing Hypotheses, and Evaluating Model Outputs

Thermal history modeling is carried out within a deliberately designed model structure to test and explore specific hypotheses for a data set, such as the timing, magnitude, or rates of cooling/exhumation and/or heating/burial event(s) across a study area (e.g., Gallagher and Ketcham, 2018). The model structure is informed by the geologic and geochronologic context of the samples, by preliminary interpretations made from the data based on the temperature sensitivity(s) of the dated mineral(s), and perhaps by additional plots of sample date(s) versus elevation or distance along a transect that may aid in better understanding the spatial patterns of the data (Fig. 6).

The model structure is implemented in a number of ways. First, independent geologic and geochronologic knowledge is incorporated into the models by choosing whether and how to apply tT constraint boxes through which all tested tT paths must pass. This step is essential for designing geologically valid model frameworks. These constraints may be based on local geologic observations, such as an unconformity that requires the rocks to have been at the surface at a specific time, or based on the broader context such as knowledge that the study area is within a larger region that was undergoing burial within a certain interval. Characteristics of the intervening tT path segments are also specified (e.g., number of breakage points, cooling rates). In some cases, for example, when using QTQt, an additional decision is required about whether to explore and represent a wide range of tT path possibilities or to limit outcomes by penalizing more complex solutions with rate-variant trajectories even if they replicate the data as well as simpler solutions. It is common to iteratively carry out multiple models that assume different model frameworks and that vary different model aspects to fully understand the limits of a data set and extract the maximum information from it.

As an example of model structure, it may be possible that a history of continuous slow cooling/exhumation, or one of early rapid cooling/exhumation followed by one or more episodes of heating/burial and cooling/re-exhumation, can fit a thermochronologic data set equally well (Fig. 10A). In this case, the level of complexity that the authors infer is most geologically realistic will determine how the model is constructed (e.g., McClure Mountain syenite example; Anderson et al., 2017, 2018; Weisberg et al., 2018a, 2018b). Thus, in the continuous cooling inverse model, reheating would be precluded, and the only tT constraints imposed may be the high-temperature crystallization age and the modern surface temperature (Fig. 10B). In contrast, in the heating and cooling inverse model, reheating would be allowed, and additional constraint boxes would be defined, for example, based on unconformable relationships inferred to be relevant to the samples that constrain earlier intervals when the sample was at or near the surface (Fig. 10C). The outcomes of each model then support or refute the hypothesis tested by the model structure by either yielding or not yielding tT solutions that are considered to acceptably reproduce the data. This process assumes that the data are reliable, appropriate uncertainties are applied, and the kinetic models account for the mineral diffusion characteristics sufficiently well (e.g., Gallagher and Ketcham, 2020).

It is not possible to infer thermal histories from the thermochronologic data alone without adopting a model framework. Models that seek, favor, and highlight the simplest solutions embed model frameworks that test the hypothesis that rate- and direction-invariant thermal histories with limited inflections can explain the data (for example, this approach is sometimes adopted when using QTQt). The outcomes of these models do not falsify the possibility that more complex tT paths can statistically reproduce the data as well or better than more simplistic paths, because the models were designed to penalize and reject non-monotonic tT trajectories that change in rate and direction rather than explore them. Whether this approach yields the most geologically likely suite of tT paths, or yields overly simplistic and geologically unrealistic or illogical outcomes, depends entirely on the sample context, model design, and study objectives. It is important to be aware that the outcomes of models that highlight simple solutions may imply that the data restrict thermal histories in a temperature range that the results are not sensitive to and therefore cannot limit. For example, outcomes of inverse thermal history models of apatite (U-Th)/He data sets that penalize rate-variant paths may depict a narrow range of rate-invariant monotonically cooling tT paths at temperatures <30 °C, although in reality the apatite (U-Th)/He data allow any tT trajectory at these low temperatures; this may be key in a circumstance in which one is trying to determine the most recent phase of cooling that the data can constrain, such as associated with incision of a canyon. This again underscores the need to understand, clearly articulate, and justify embedded assumptions in thermal history models so that it is clear what the models do and do not test and/or resolve.

It is crucial to carefully evaluate the model outcomes. This evaluation includes confirming and conveying in the published product that the preferred solutions reproduce the data, for example via statements of the specified statistical fitting criteria (for HeFTy; note that QTQt inversion models will always yield solutions even if the fit quality is poor) and/or graphical depictions of observed and predicted data (Flowers et al., 2016; Gallagher et al., 2016). This assessment also includes inspecting the model outcomes to ensure that they are not invalidated by incontrovertible geologic and chronologic information that constrains the tT path, such as a time when the sample was at the surface based on the age of volcanic rocks directly overlying the sample or an interval of heating/burial based on a package of sedimentary rocks sitting above the sample. If such constraints are violated by the tT path solutions, the model should be discarded or reformulated to include them, because later portions of tT path solutions can be dependent on earlier segments that are in the temperature sensitivity window of the thermochronometer and vice versa (see section 2 of companion paper, Flowers et al., 2022). Sensible and reliable interpretations cannot be drawn from models that flatly contradict relevant geology.

A key point is that simply finding statistically probable tT paths within a given model framework does not mean that those are the most geologically probable thermal histories. Nor should the default be to assume that the simplest model structure and simplest possible tT path is always (or perhaps ever) the most geologically realistic, especially over longer timescales. Additional arguments are required to establish the most geologically valid model framework(s) for a given thermochronologic data set, and these should be communicated along with thermal history models (Fig. 10; section 5.4; Table 3).

5.3. Common Strategies for Data Input

An important consideration when deciding whether and how to include data in a modeling effort is the connection between the data and the theoretical and computational framework within which we are attempting to interpret and reproduce them. Current (U-Th)/He models account for a limited range of mechanisms: thermally activated volume diffusion as possibly modified by domain size and radiation damage, and long alpha stopping distances. Zoning in U-Th can also be incorporated, but it is typically not measured and requires assumptions to extrapolate to three dimensions. If dates are strongly influenced by a mechanism that is not included in the model framework (e.g., U-Th zonation, He implantation, unconstrained kinetic factors), there are no logical grounds for expecting the model to reproduce them; essentially, the only option the model has to fit the data is to distort the tT path to compensate for the physical attribute or process it is missing. If omitted factors have limited and symmetric effects on dates, as might be inferred from modest but non-skewed excess dispersion, then the data can probably be modeled safely, perhaps while including some means of accommodating the excess dispersion such as increasing the estimated uncertainty. If the omitted factors impart a strong positive skewness to the data distribution (e.g., old-date excursions that cannot be explained by eU or grain size), modeling becomes more dangerous and should be executed with extreme caution, if at all. Inverse thermal history modeling should only be undertaken for data that have been carefully evaluated, are considered reliable, and yield sensible and understandable data patterns believed to be accounted for by the model (section 3).

There are two principal options for data input: (1) inputting the individual analyses and their associated uncertainties or (2) using averaging into “synthetic” grains either for the entire sample or within kinetic subgroups.

  1. One approach is to input the individual analyses and their associated uncertainties. This can be considered the most holistic approach, as it allows the inversion procedure to sort through possibly competing kinetic influences from size, eU, and other factors such as composition or zoning if appropriate data have been gathered. However, it has a higher computational overhead and assumes that the uncertainties are fully characterized, which is generally not the case. Most reported uncertainties do not currently include those associated with the $F_T$ correction and cannot incorporate factors that are not characterized (e.g., He implantation, eU zonation). If uncertainties are underestimated, the result can be either no fits or an overly restricted swath of tT paths. To compensate, if excess dispersion appears symmetric, individual estimated uncertainties can be increased. However, if some dates partially result from omitted factors or mechanisms leading to a highly skewed distribution, including these dates can distort any joint solution or preclude finding one. QTQt software will search for thermal histories that reproduce the input data to the maximum extent possible and may simultaneously prefer simpler solutions. QTQt also allows uncertainties in U, Th, Sm, He, and grain size to be re-evaluated as a part of the fitting process. However, because of this flexibility, results must be checked carefully against the original data to ensure that any degree of misfit is acceptable.

    An additional consideration is the representation of (U-Th)/He data relative to other data being modeled simultaneously (e.g., fission track, vitrinite). In both HeFTy and QTQt, the more instances of a given type of data that are included, the more that data type influences the solution. Consequently, modeling, for example, five individual (U-Th)/He dates and one fission-track date may weight the solution toward the (U-Th)/He data in a manner not necessarily anticipated or desired by the researcher.

  2. Another approach is to use averaging into “synthetic” grains either for the entire sample or within kinetic subgroups. This approach allows one to apply an uncertainty model that does a better job of characterizing group dispersion than the individual estimated uncertainties, and it also decreases computational load and equalizes weighting among data types. For samples for which it is reasonable to report a mean (see section 4), one could use the weighted or unweighted mean and uncertainty reported for the sample, often without loss of information. This approach is functionally analogous to the standard procedure of combining fission-track single-grain dates into one date and uncertainty.

For samples for which it is not reasonable to report a mean, one can model the data within appropriate kinetic bins, usually eU, but potentially incorporating other factors such as size or composition, if they appear to organize the data (e.g., date-eU correlations or anticorrelations). Uncertainties for each bin can be estimated using the methods in section 4.2.
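The binning step can be sketched in Python as follows. This is a hedged illustration only: the bin edges, the use of an unweighted mean with an SD-based standard error for each bin, and the function name `synthetic_grains` are assumptions, and any of the approaches in section 4.2 could be substituted.

```python
import bisect
import math
import statistics

def synthetic_grains(grains, eu_edges):
    """Group (date_Ma, eU_ppm) pairs into eU bins and average each bin.

    eu_edges: ascending bin edges; bin k spans [eu_edges[k], eu_edges[k + 1]).
    Grains outside the edges are ignored. Returns one dict per non-empty bin.
    """
    bins = {}
    for date, eu in grains:
        k = bisect.bisect_right(eu_edges, eu) - 1
        if 0 <= k < len(eu_edges) - 1:
            bins.setdefault(k, []).append((date, eu))

    out = []
    for k in sorted(bins):
        dates = [d for d, _ in bins[k]]
        eus = [e for _, e in bins[k]]
        n = len(dates)
        # SD-based SE needs at least two grains; a single-grain bin gets None
        se = statistics.stdev(dates) / math.sqrt(n) if n > 1 else None
        out.append({
            "eu_range": (eu_edges[k], eu_edges[k + 1]),
            "n": n,
            "mean_date": statistics.mean(dates),
            "mean_eu": statistics.mean(eus),
            "se": se,
        })
    return out

# Hypothetical sample with a positive date-eU correlation
grains = [(20.0, 10.0), (22.0, 15.0), (40.0, 60.0), (44.0, 80.0), (70.0, 150.0)]
for synthetic in synthetic_grains(grains, eu_edges=[0.0, 50.0, 100.0, 200.0]):
    print(synthetic)
```

A bin holding a single grain has no SD-based standard error, so in that case the individual analysis uncertainty, or a larger assigned value, would have to be used instead.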

For both data input cases, when the data show skewed excess dispersion, the question can arise of whether one must use all data or may attempt to identify and omit outliers during the modeling process. In addition to the approaches to outlier identification discussed in section 3.1, another procedure involves evaluating single grains of a sample for compatibility by modeling them individually and overlaying their solutions to determine if one or more are inconsistent with the rest (Sousa and Farley, 2020). This approach provides a potentially concrete and non-biased way of screening grains, but it is not without weaknesses. Underestimated uncertainties on the individual analyses or an inappropriate model structure may yield results that suggest an analysis is inconsistent with the rest of the data when it is not, or a grain can be partially influenced by an omitted factor in such a way that it remains sufficiently consistent with other grains to be included but nevertheless distorts the joint solution. Integrating multiple approaches to outlier identification can be valuable and appropriate.

5.4. Recommendations for Reporting Thermal History Models

Thermal history modeling of (U-Th)/He data sets is an interpretational exercise. There are multiple reasonable approaches to this process, including those outlined above, with the strategy partly dependent on the data pattern, software used, and interpreted geologic context. It is not unusual for scientists to approach modeling of a single data set in different ways. This is no different from the interpretation of any data set in the Earth sciences, where scientists may reasonably reach divergent conclusions depending on the factors most weighted in the interpretational process. As long as the published product clearly explains the rationale for why and how interpretations were developed, and transparently conveys how well the preferred solutions reproduce the observed thermochronologic and geologic data, these differences can be healthy and help focus future work to best test and refine interpretations.

For these reasons, we emphasize that publications should clearly explain the rationale for the model setup and the path to the favored geologic interpretation. Table 3 is a checklist of minimum needed information to report for thermal history models. A semi-standardized format for presenting this information was proposed previously (Flowers et al., 2015). All information needed to assess and reproduce the conclusions, and for others to use the same data to develop alternative models and interpretations, must be provided. However, there is no thermal history modeling approach that is clearly superior to all others for all data sets, and a single modeling philosophy should not be forced on the interpretation of (U-Th)/He data sets.

Numerous opportunities exist to acquire information to improve decisions about how to interpret data, incorporate them into modeling, and infer geologic meaning from the results. Looking forward, we see opportunities in the following areas:

  1. Improved grain characterization. In some circumstances, the acquisition of additional information regarding parent isotope zonation, radiation damage, He distribution, and other factors via LA-ICP-MS, Raman spectroscopy, in situ He analysis (Danišík et al., 2017), and other methods may assist with data interpretation. Implementing efficient lab workflows that include the acquisition of such data as a more routine aspect of (U-Th)/He dating would be an important step in this direction.

  2. Diffusion kinetics. Improved understanding of He diffusion kinetics, and in particular how they are impacted by radiation damage accumulation and annealing, will improve our ability to properly interpret (U-Th)/He data sets, from appropriately attributing dispersion to deriving reliable thermal histories.

  3. Uncertainties and dispersion. Improved quantification of individual analysis uncertainties (see companion paper, Flowers et al., 2022) would be beneficial for determining whether individual analyses vary beyond what is expected from analytical and geometric effects and thus whether they are or are not truly “dispersed.” Appropriate uncertainties on eU, crystal size, and other parameters are similarly needed to effectively interrogate data for date-eU correlations, date-size relationships, and other data patterns.

  4. Representation and statistical characterization of dispersed data sets. Samples with (U-Th)/He dates that differ by more than the single-grain analytical uncertainties are relatively common. This may be due either to expected date variation from variable kinetics or to overdispersion from other causes. Additional approaches for representing such data sets and statistically characterizing them are needed.

  5. Outlier identification. The development and implementation of analytical and statistical methods for fingerprinting (U-Th)/He dates that are anomalous would increase confidence in the identification of biased analyses that should be banished from data sets. For example, continuous ramped heating methods hold promise in this arena (e.g., Idleman et al., 2018).

  6. Generation of larger data sets via cost-effective and higher throughput methods. Larger (U-Th)/He data sets would aid in identifying data patterns (e.g., date-eU relationships), characterizing the distribution of sample populations, and fingerprinting anomalous analyses. This would enhance data interpretation and decisions about how to statistically characterize sample data. More data are also required for detrital studies (e.g., Vermeesch, 2004), to decipher thermal histories in areas with complexly evolving thermal structures owing to topography and faulting (Gautheron and Zeitler, 2020), and to interpret orogenic-scale patterns of exhumation and relief change (e.g., Thomson et al., 2010; Ehlers et al., 2015). Developing more cost-effective methods that speed sample throughput will aid in achieving this. Laser ablation techniques (e.g., Boyce et al., 2006; Tripathy-Lang et al., 2013; Evans et al., 2015; Horne et al., 2016; Pickering et al., 2020) and more rapid dissolution methods for refractory phases like zircon are possible options.

  7. Data management. With the desire to increase the amount and types of analytical data associated with (U-Th)/He dates comes an associated need for effective data management tools. Efficient and fully integrated data reduction and management workflows in labs that harness the power of modern database systems would improve the traceability, recoverability, and organization of lab metadata and thereby increase the volume of high-quality data that can be produced and managed. Associated training of lab personnel in data management systems would enable these systems to be further adapted and customized in tandem with new analytical developments.

  8. Thermal history modeling software. Continued improvement in inverse thermal history modeling tools is needed to maximize the tT information extracted from different types of (U-Th)/He data sets. This includes optimizing their ability to efficiently simulate large and dispersed data sets, including deeper-time results for which inversion results are increasingly non-unique, large portions of tT space must be searched, and assuming the simplest monotonic cooling-only tT path framework is less likely to be correct. Promoting user understanding of these modeling programs, the pros and cons of each for different types of data sets, and how to set up, present, and defend the model structures used for the tT simulations, will enhance the accuracy and clarity of thermal history interpretations.

  9. Thermal and kinematic modeling software. Quantitative thermal and kinematic interpretation tools, ranging from 1-D and 2-D models of crustal thermal structures to 3-D numerical models that track the evolution of the crustal thermal field, have advanced in tandem with innovations in the (U-Th)/He technique (e.g., Braun, 2003; Ehlers and Farley, 2003; Braun et al., 2012; Mora et al., 2015). Such software enables modeling of the tT paths of rocks transiting through the crust to test geologic hypotheses for the data. It is important that these models are updated to include the most recent He diffusion kinetic models. As laboratory studies continue to constrain the factors that can contribute to a broad span of He diffusivity within single-mineral He thermochronometers, an ongoing challenge is how to honor and exploit this valuable intra-sample complexity when large multi-sample data sets are combined in thermokinematic models.

This manuscript summarizes key considerations associated with evaluating, integrating, and interpreting conventional individual aliquot (U-Th)/He data. The methods associated with the representation, statistical characterization, interpretation, and modeling of different types of (U-Th)/He data sets are under active development. Our goals are to help guide non-experts in the interpretational process, aid in transparent reporting and interpretational practices, and assist in decisions about whether and how to assimilate data into thermal history modeling.

  1. Intra-sample date dispersion (section 2; Table 1). A variety of factors can cause individual (U-Th)/He dates to vary beyond their analytical uncertainties, either due to interpretable kinetic effects (e.g., radiation damage, grain size) or other factors (e.g., U-Th zonation, grain fragmentation and abrasion, parentless He). Many of these effects are magnified by tT paths characterized by protracted cooling through the HePRZ or reheating and partial He loss in the HePRZ. Table 1 summarizes these possible influences on the (U-Th)/He date, whether they are affected by the character of the tT path, and whether they can be identified, exploited, and/or circumvented.

  2. Evaluating (U-Th)/He thermochronologic data sets (section 3.1). A reasonable workflow for evaluating (U-Th)/He data sets in thermochronometer studies includes: (1) evaluating individual analysis quality, (2) constructing date-eU and date-grain size plots, (3) assessing the significance of data patterns, (4) considering outliers, and (5) deciding whether it is appropriate to combine individual analyses from a sample using a summary statistic such as a mean date.

  3. Evaluating (U-Th)/He geochronologic data sets (section 3.2). When evaluating (U-Th)/He geochronologic data sets, assessing analysis quality and outliers before combining results is recommended. In this case, the data are expected to define a normally distributed population appropriate for reporting a mean sample date.

  4. Additional considerations in detrital studies (section 3.3). An important first step in detrital studies that affects the interpretational process is to establish whether detrital samples are (1) unreset, such that the (U-Th)/He dates are older than or equal to the time of deposition and record the thermal history of the source region; (2) fully reset, such that the dates are younger than the depositional age and record information about peak burial heating and subsequent cooling; or (3) partially reset, such that dates are both older and younger than deposition and record information about maximum burial temperatures. Detrital samples, even those that are partially or fully reset, typically yield wide dispersion owing not only to the factors that can affect bedrock samples but also to variable pre-depositional histories, variable amounts of grain abrasion, and generally lower grain quality.

  5. Integrating individual analyses (section 4; Table 2). Use of a summary statistic to represent sample data assumes that the underlying distribution of dates for the sample population is known (e.g., He et al., 2021). For samples with normally distributed single grain dates, reporting a central tendency statistic is appropriate, but such a statistic is inappropriate for samples characterized by substantial skew in the date distribution or by kinetic variation. The criteria for combining or not combining individual analyses into a summary statistic should be stated, although at present there is no community agreement on approach. If data are combined, then how they are combined (e.g., unweighted mean, weighted mean, nature of weighting), how the uncertainty is represented, what factors are included in the uncertainty, and the confidence interval (1σ or 2σ) should be reported.

  6. Interpreting data with thermal history models (section 5; Table 3). Thermal history modeling is a central tool used to convert (U-Th)/He data to thermal history interpretation and test hypotheses for the data based on the sample context. Which data are simulated, how they are input into the model (section 5.3), and what independent geologic and geochronologic constraints are considered reliable, relevant to the simulated samples, and therefore important to honor (section 5.2) are all interpretive decisions that must be made and should be carefully considered during model setup. Different modeling philosophies and approaches have been developed, and it can be entirely reasonable for divergent scientific conclusions to be reached depending on the choices made and weighting of different factors during the interpretational path. Consequently, when thermal history models are used to interpret data, it is imperative to clearly explain the rationale for the model framework and how geologic interpretations are developed from model results so that others can assess the reliability of the outcomes and how they can be further tested. Therefore, the published product should state and explain (also see section 5.4 and checklist in Table 3): (1) the rationale for the model structure, logic for the imposed tT constraints, and details of the model setup; (2) constraints on the character of tested tT paths or solutions; (3) what data are used in the modeling and why; (4) how data are input (i.e., as single grains, or as synthetic sample or kinetic subgroup averages); (5) the kinetic model(s) used; (6) all data and information needed to reproduce thermal history model results; (7) tT modeling program used and statistical fitting criteria; (8) clear representation and explanation of model outputs, including a representation of how well the tT path results fit the observed data; and (9) how model outputs were used to reach the final geologic interpretation.
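
As an illustration of the reporting choices listed in item 5 above (type of mean, nature of weighting, how uncertainty is represented), the following minimal Python sketch computes an unweighted mean with its standard error and an inverse-variance weighted mean with its 1σ uncertainty. It is not a recommended community procedure. The function name `summarize_dates`, the dictionary keys, and the example values are hypothetical. The mean square of weighted deviates (MSWD) is included as one commonly used measure of dispersion relative to analytical uncertainty; its use here is an assumption of this sketch, not a criterion taken from the text.

```python
import math


def summarize_dates(dates, sigmas):
    """Summarize single-grain dates (hypothetical helper, illustration only).

    dates  : individual dates, e.g. in Ma
    sigmas : 1-sigma analytical uncertainties in the same units
    """
    n = len(dates)
    if n < 2 or n != len(sigmas):
        raise ValueError("need at least two dates, each with an uncertainty")
    if any(s <= 0 for s in sigmas):
        raise ValueError("uncertainties must be positive")

    # Unweighted mean, sample standard deviation, and standard error.
    mean = sum(dates) / n
    std_dev = math.sqrt(sum((d - mean) ** 2 for d in dates) / (n - 1))
    std_err = std_dev / math.sqrt(n)

    # Inverse-variance weighted mean and its 1-sigma uncertainty.
    weights = [1.0 / s ** 2 for s in sigmas]
    weight_sum = sum(weights)
    w_mean = sum(w * d for w, d in zip(weights, dates)) / weight_sum
    w_sigma = math.sqrt(1.0 / weight_sum)

    # MSWD: values well above 1 indicate scatter beyond what the
    # analytical uncertainties alone would predict.
    mswd = sum(w * (d - w_mean) ** 2 for w, d in zip(weights, dates)) / (n - 1)

    return {
        "n": n,
        "mean": mean,
        "std_dev": std_dev,
        "std_err": std_err,
        "weighted_mean": w_mean,
        "weighted_sigma": w_sigma,
        "mswd": mswd,
    }


if __name__ == "__main__":
    # Hypothetical dates (Ma) and 1-sigma uncertainties, not real data.
    summary = summarize_dates([10.0, 12.0, 14.0], [1.0, 1.0, 1.0])
    for key, value in summary.items():
        print(f"{key}: {value}")
```

Returning both means and the dispersion measure side by side makes it easy to state which statistic was reported and why. Whether any summary statistic is appropriate for a given sample still depends on the judgment about the underlying date distribution described in item 5.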

Two anonymous reviewers and Kip Hodges provided helpful comments that improved this contribution. We thank Brad Singer for efficient editorial handling and Kerry Gallagher for feedback on an earlier version of this manuscript. U.S. National Science Foundation grants EAR-1822119, -1844182, and -1925489 to R.M. Flowers provided partial support for this work.

Science Editor: Brad S. Singer