## Abstract

Although Yosemite Valley, USA, catalyzed the modern environmental movement and fueled foundational debates in geomorphology, a century of investigation has failed to definitively determine when it formed. The non-depositional nature of the landscape and homogeneous bedrock have prevented direct geological assessments. Indirect assumptions about the age of downcutting have ranged from pre-Eocene to Pleistocene. Clarity on this issue would not only satisfy public interest but also provide a new constraint for contentious debates about the Cenozoic tectonic and geomorphologic history of the Sierra Nevada in California. Here we use thermochronometric analysis of radiogenic helium in apatite crystals, coupled with numerical models of crustal temperatures beneath evolving topography, to demonstrate significant late Cenozoic deepening of Tenaya Canyon, Yosemite’s northeastern branch. Approximately 40%–90% of the current relief has developed since 10 Ma and most likely since 5 Ma. This coincides with renewed regional tectonism, which is a long-hypothesized but much debated driver of Sierran canyon development. Pleistocene glaciation caused spatially variable incision and valley widening in Yosemite Valley, whereas little contemporaneous erosion occurred in the adjacent upper Tuolumne watershed. Such variations probably arise from glacial erosion’s dependence on topographic focusing of ice discharge into zones of rapid flow, and on the abundance of pre-existing fractures in the substrate. All available data, including those from our study, are consistent with a moderately high and slowly eroding mid-Cenozoic Sierra Nevada followed by significant late Cenozoic incision of some, but not all, west-side canyons. A likely driver of this event was range-crest uplift accompanied by fault-induced beheading of some major drainages, although other mechanisms such as drainage reorganization following volcanic deposition are plausible.

## INTRODUCTION

Much of the relief of California’s Yosemite Valley formed as a river canyon prior to mid-Pleistocene glaciation (Matthes, 1930), but the timing and age of incision remain poorly known. All prior claims are merely assumptions, either that incision coincided with tectonism or with a subset of the spatially-variable geologic events found elsewhere in the Sierra Nevada. In the canonical study, Matthes (1930) assumed that major canyon incision coincided with Pleistocene faulting at the Sierra Nevada’s eastern edge. South of Yosemite, 3.6 Ma extrusive volcanic rocks scattered in the San Joaquin Canyon indicate substantial incision prior to this time (Huber, 1981), which suggests a similar minimum age for Yosemite. (The locations of major river systems and other features noted in the text are displayed in a regional map in Appendix 1.) Remnant exposures of a 10 Ma lahar and lava flow on both sides of Tuolumne Valley, Yosemite’s northern neighbor, indicate that the Tuolumne Canyon either had not yet formed or was filled with debris at that time (Huber, 1990). Apatite (U-Th)/He data from the southern Sierra suggest canyons deepened after 30 Ma (McPhillips and Brandon, 2012; Clark et al., 2005). In the northern Sierra, however, inset gravel deposits provide incontrovertible evidence that some canyons deeply incised the granite bedrock prior to 40 Ma (Gabet and Miggins, 2020), which perhaps indicates a similar antiquity of Yosemite itself (Schaffer, 1997). However, no consensus exists about the occurrence or magnitude of late Cenozoic uplift of the Sierra Nevada range crest, which is a plausible driver for canyon incision (Wakabayashi, 2013; Martel et al., 2014; Wernicke et al., 1996; Mulch et al., 2006; Cassel et al., 2009; Gabet, 2014).

Here, we present direct constraints on Yosemite Valley’s formation from apatite (U-Th)/ He thermochronometry (Farley, 2002) using granitoid bedrock samples taken from in and around the valley and a technique not previously applied to canyon development in the Sierra Nevada: ^{4}He/^{3}He thermochronometry (Shuster and Farley, 2005). The diffusive loss of radiogenic ^{4}He from apatite crystals over geologic time increases strongly with temperatures above ~30 °C. Thus, both the total accumulated ^{4}He, measured as (U-Th)/He age (*A*_{He}), and its intracrystalline spatial distribution, measured by ^{4}He/^{3}He degassing spectra (*R*_{4/3}), record the low-temperature thermal history of rocks (Farley, 2002; Shuster and Farley, 2005). Thermal history, in turn, relates to depth below surface. Measured *A*_{He} values, which correspond to the time when a sample resided ~1.5–3 km below the surface, are ca. 40 Ma at the bottom of Yosemite Valley and 60 Ma on the surrounding uplands (House et al., 2001). This age difference is consistent with canyon deepening after the Eocene but, given the low regional erosion rates, must at least partially arise from larger thermal gradients beneath the canyon associated with topographic focusing of heat flux. For example, given a mid-range background thermal gradient of ~30 °C km^{-1}, rocks beneath flanking upland surfaces would pass through isotopic closure at ~0.8 km greater depth than beneath an adjacent canyon with the cross-sectional morphology of Yosemite, which implies a difference of more than 8 Ma in *A*_{He} for erosion rates <0.1 km Ma^{-}1.

Inferring topographic evolution from thermochronometry requires methods for analyzing two sequential couplings: how erosion and topographic evolution control the temperature history of rocks now exposed at the surface and how a mineral’s temperature history controls its isotopic composition (symbolically: Topography → Temperature → Isotopes). Our study uses a Monte Carlo inverse analysis (Mosegaard and Tarantola, 1995) of the Temperature → Isotopes coupling to identify temperature histories consistent with isotopic data at a specified level of performance and then uses a gradient-descent optimization of a crustal temperature model with evolving topography to invert the Topography → Temperature coupling. In the simple situation evaluated here, the deepening of a canyon relative to an adjacent upland, the Topography → Temperature calculation produces an essentially exact match, so the performance of a topographic model is equivalent to the performance of the temperature history governing its optimization. The performance scores for models are probabilistic relative likelihoods (Royall, 1997; Glover and Dixon, 2004), and they are evaluated in comparison to models that best match the isotopic data.

We note two distinctions between our analysis and the thermochronometric method applied previously in some other studies (e.g., McPhillips and Brandon, 2012), wherein topographic history is defined at the outset by a small set of parameters, whose values and ranges are then determined by inversion. Although it is a strategically useful method in many situations, the latter approach necessarily means that initial assumptions about the characteristics of topographic evolution restrict the range of temperature scenarios evaluated against isotopic data. There is no need to impose such limiting constraints in our study, and we invert the Temperature → Isotopes separately, subject only to an assumption of monotonic cooling while allowing more complex histories in sensitivity studies. Further, instead of parameter values and ranges, our inversion produces temporal functions— cooling histories and corresponding topographic scenarios—whose performance against isotopic data can be evaluated in comparison to the others, and which can be grouped into sets that contain the range of behaviors consistent with the data at a given level of confidence. Any desired parameters describing topographic evolution can be extracted from these sets.

Our manuscript proceeds through four sections: (1) summary of data acquisition, (2) explanation of methods and results for deriving cooling histories from individual crystals, (3) explanation of methods and results for topographic evolution, and (4) discussion of possible roles of tectonics and glaciation as driving factors. Appendix 1 provides a regional map. The study is motivated foremost by Yosemite Valley’s stature as one of the most famous topographic features on the planet, visited by some four million people each year and illustrated abundantly in works ranging from elementary geology textbooks to the fine art photographs of Ansel Adams and the vernacular art of calendars and screensavers. One central goal of the geosciences is to satisfy public curiosity about the history and genesis of Earth features. A second motivation arises from Yosemite’s character as a particularly prominent and narrow canyon on the Sierra Nevada’s western flank. As such, information about its formation may illuminate ongoing debates about this range’s tectonic and geomorphological history. The Sierra Nevada is a continental range that stands high despite the absence of a crustal root. No consensus exists as to whether its elevation increased or decreased in the late Cenozoic and how its prominent west-flank canyons manifest the range’s tectonic history as the margin of a high inland plateau, a volcanic arc, and the site of a crustal root delamination event (Zandt et al., 2004; Jones et al., 2004; Wakabayashi, 2013; Martel et al., 2014; Wernicke et al., 1996; Mulch et al., 2006; Gabet, 2014; Gabet and Miggins, 2020; Schaffer, 1997).

## DATA ACQUISITION

### Samples

We collected bedrock samples from 16 sites in and near Yosemite National Park in 2010 and 2012. Apatite was separated from samples at the Berkeley Geochronology Center (BGC) or at Apatite to Zircon, Inc., using crushing, sieving, magnetic, and heavy liquid mineral separation techniques. Twelve sites yielded apatites for analysis (Table 1) and nine of these, mapped in Figure 1A, produced *R*_{4/3} spectra.

### Apatite (U-Th)/He Ages

*A*_{He} were calculated using established radiogenic production equations and decay parameters (Farley, 2002; Shuster and Farley, 2005) and measured concentrations of U, Th, and Sm parent nuclides and ^{4}He. Parent nuclides were measured, as in previous work, with inductively coupled plasma-mass spectrometry (ICP-MS), either by isotope dilution for whole crystals or laser ablation mapping of cross-sections (House et al., 2000; Tremblay et al., 2015; Fox et al., 2017; Farley et al., 2011; Fox et al., 2014). ^{4}He was measured either by laser heating of crystals to yield gas for cryogenic purification and ICP-MS or as the cumulative release in degassing experiments for *R*_{4/3} (Shuster and Farley, 2005). In all cases, *A*_{He} was corrected for alpha ejection losses (Farley et al., 1996). Figure 1, Tables 1–2, and Supplemental Material Tables S1–S15^{1} provide data.

Site-averaged *A*_{He} values (Table 1) were determined from at least three aliquots (Table S1). For the individual crystals used in ^{4}He*/*^{3}He analysis, *A*_{He} was calculated by one of three methods, which are designated as A, B, or C in Table 2 and elaborated on further in Supplemental Material 1.2 (see ^{footnote 1}). (A) If the measured *R*_{4/3}(*F*) spectrum did not indicate parent nuclide zonation (Farley et al., 2010), the bulk U and Th content of the crystal was determined by isotope dilution and combined with the measured total ^{4}He release from the degassing experiment to calculate a bulk *A*_{He}. (B) When the *R*_{4/3}(*F*) pattern indicated the likely zonation of parent nuclides within the crystal, we obtained spatially resolved U and Th measurements on a planar section through the crystal via laser ablation–inductively coupled plasma–mass spectrometry (LA-ICP-MS) performed at the BGC. Data collection and reduction methods were reported previously (Fox et al., 2017). Two-dimensional concentration maps were converted to effective 1-D radial zonation profiles (Farley et al., 2011). Radial profiles are scaled to the bulk “effective Uranium concentration,” eU = U + 0.235 Th, for each crystal. For one case, crystal MV04b, zonation was treated as 3-D (Fox et al., 2014). (C) If a crystal was neither dissolved nor mapped due to loss, usually in transfer from tube to epoxy mount, we set the *A*_{He} equal to the site-averaged bulk age and calculated total eU.

*R*_{4/3} Analyses

Following Shuster and Farley (2005), the naturally occurring intracrystalline spatial distribution of radiogenic ^{4}He was inferred via controlled, sequential degassing analysis of crystals containing a spatially uniform, proton-induced distribution of ^{3}He. Apatite crystals were loaded into Sn foil capsules, placed within Teflon containers, and axially stacked in quartz tubes. At the Francis H. Burr Proton Therapy Center (Massachusetts General Hospital, Boston, MA), the tubes were bombarded with protons of ~220 MeV incident energy for ~5 h, with a total proton fluence of about ~10^{16} cm^{-2}. After irradiation, samples were returned to the BGC for analysis as detailed in Tremblay et al. (2015). Briefly, irradiated euhedral, inclusion-free crystals were photographed and individually loaded into Pt-Ir packets for sequential heating under ultra-high vacuum using a feedback-controlled 70W diode laser coupled with a calibrated optical pyrometer. For each step, the released gas was purified with a SAES GP-50 getter pump. The remaining gas was directed into a temperature-controlled cryogenic trap held at 11 K. The adsorbed gas was released from the trap at 33 K, and the ^{4}He/^{3}He ratio and the molar abundance of ^{3}He were measured using a MAP 215-50 sector field mass spectrometer. Blank measurements were obtained at every sixth heating step. ^{4}He/^{3}He data are normalized to their mean values (herein, *R*_{4/3} indicates the normalized ratio) and reported as a function of the fractional cumulative release of ^{3}He (S*F*^{3}He, herein shortened to *F*). Resulting *R*_{4/3}(*F*) spectra are shown in Figure 2. Comparison of these spectra with modeled *R*_{4/3}(*F*) calculated for the same crystal properties and specified time-temperature histories constitutes the primary basis for our study. Data from the step-heating experiments are provided in Tables S2–S15.

### Quality Control of *R*_{4/3} Spectra

The basis for *R*_{4/3} analysis is detection and modeling of the diffusive redistribution of the intracrystalline spatial profile of radiogenic He. *R*_{4/3} data cannot be used to interpret cooling history if the diffusive signal cannot be identified and modeled. Potential reasons for such problems include significant and unmeasured spatial variations of parent isotope concentration (“zonation”), incomplete characterization of controlling physics, such as zones of enhanced diffusivity, and edge effects produced during irradiation. We defined a rigorous but conservative criterion for whether the *R*_{4/3} spectrum for a given crystal displays enough of a coherent diffusive signal to constrain cooling history ( Appendix 2). Of our 19 measured *R*_{4/3} spectra, 14 passed this test and are used in Figure 1. In analyzing those 14, we also removed a small number of outlier points at the beginnings and ends of the spectra that either reflected very small quantities of gas or defined large changes in the slope *dR/dF* that cannot be explained by any diffusive physical model ( Appendix 2). At beginnings of spectra, the latter occur at values of *F* < 0.03 and thus represent a very small fraction of the total measurement. They cannot reflect the multi-million year cooling history, which must be detectable, in any case, from the remainder of the profile provided there are coherent data at *F* < 0.2, as in, for example, Figures 2A, 2D, 2H, and 2J.

## RECONSTRUCTING COOLING HISTORIES

To identify the characteristics of temperature-time (*T*(*t*)) cooling histories consistent with isotopic data, we used an exhaustive-search Monte Carlo method, following Schildgen et al. (2010). For each crystal we randomly generated a large number of *T*(*t*) and calculated model *A*_{He} and *R*_{4/3}. These calculations account for intracrystalline radiogenic ^{4}He ingrowth and diffusion, with a diffusivity dependent on temperature as well as the production and annealing of radiation damage (Shuster et al., 2006; Flowers et al., 2009). *T*(*t*) histories that match *A*_{He} to within uncertainty were assigned a relative likelihood score, which is defined in the present study according to their performance against measured *R*_{4/3}.

### Method: Modeling Intracrystalline ^{4}He and *R*_{4/3}

For numerical modeling of the production and diffusion of ^{4}He within apatite crystals, we used three components of prior work (Schildgen et al., 2010; Fox and Shuster, 2014; Fox et al., 2014):

Given a temperature history

*T*(*t*) and characteristics of an analyzed crystal (size and parent nuclide concentration), a “geological model” calculates the amount and spatial distribution of^{4}He over time by solving the equation for^{4}He production and diffusion, using the Crank-Nicolson finite difference scheme and a spherical approximation (Ketcham, 2005). Boundary conditions are zero^{4}He flux across the grain center and zero^{4}He concentration at the surface. The calculation accounts for alpha ejection, the evolution of diffusivity as a function of time and position within the crystal (Flowers et al., 2009), and zonation of U and Th parent nuclides. The latter is assumed to be uniform if no zonation data are available for the crystal, or either radially symmetric or 3-D as noted previously. Table 2 lists input parameters for intracrystalline modeling.A “degassing model” begins with a specified intracrystalline

^{4}He distribution and predicts the*R*_{4/3}(*F*) produced over the course of a given degassing experiment. The initial distribution of^{4}He is the end state of a calculation with the geological model for a specified*T*(*t*). The equations and numerical scheme are the same as in the first model, except that new He production is negligible and the model also tracks the evolution of^{3}He from an initially uniform distribution. Diffusivities of^{3}He and^{4}He are assumed to be equal.The randomly generated

*T*(*t*) histories were used as forcings on the geological model for each crystal. All of the*T*(*t*) are monotonic cooling histories, meaning that temperature could only remain constant or decrease forward in time; there is no evidence of deep burial or Cenozoic intrusion in the study region to indicate that reheating would be significant. We evaluate the implications of reheating scenarios later. The initial point for every*T*(*t*) is 150 °C at 90 Ma, but as a sensitivity test all analyses were repeated with a 70 Ma starting age. The final point,*T*(0), is set to the modern annual mean surface temperature at the sample’s location. The rest of the*T*(*t*) consists of six to 10 steps defined as pairs of randomly generated times and temperatures. Times are taken from a uniform distribution between the initial age and zero Ma. Temperatures are equal to their previous value minus a normal random variable truncated by setting all negative values to zero. Temperatures so calculated that fall below*T*(0) are re-assigned to equal*T*(0). We also examined a small set of prescribed*T*(*t*) designed to assess the absolute upper limit of cooling since 10 Ma permitted by the data for valley-bottom samples. For consistency with*A*_{He}, this upper limit requires a prior period lasting ~40 m.y. with no cooling at all. Such a scenario is physically implausible for upland interfluvial sites because it involves zero erosion, but in valley bottoms it is possible because sediment cover may halt incision.

Monthly weather data available from the National Park Service for Yosemite Valley (elevation 1200 m) and Tuolumne Meadows (elevation 2620 m) indicate mean annual temperatures of 12–13 °C and 3–4 °C, respectively, and hence a lapse rate of 6–6.5 °C km^{-1}. Thus, we approximated *T*(0) = 12 °C for the Tenaya Canyon sites and 5 °C for the Porcupine Flat upland site (Table 2).

The number of randomly generated *T*(*t*) for each crystal depended on how many realizations were needed to produce a set of dozens of high-performing models for use in statistical evaluations, and it ranged from 20 ´ 10^{3} to 120 ´ 10^{3} (Table 2). For each crystal, each *T*(*t*) was used to drive the geological model to calculate *A*_{He} and the initial condition for the degassing model. If the calculated *A*_{He} matched the measured value to within one standard deviation, or 5% of the age if a single crystal was measured, the degassing model was then used to calculate *R*_{4/3}(*F*) for comparison to the data. This comparison assigns a relative likelihood score to each *T*(*t*) meeting the *A*_{He} criterion, using the original method described next.

### Method: Relative Likelihood Statistic λ from *R*_{4/3} Data

The set of tens of thousands of *T*(*t*) randomly generated for evaluation against data so thoroughly covers the range of possibilities that one must be very close to the true cooling history. The most likely candidate, *T _{o}*(

*t*), is the one that minimizes the summed squared mismatch between modeled and measured

*R*

_{4/3}(

*F*). The likelihood of one of the other

*T*(

*t*) candidates being the true cooling history instead is smaller by a multiple of magnitude λ ∈ [0,1]. The following explains how we chose to calculate λ.

Consider two model-calculated functions, *M _{o}* and

*M*

_{1}, whose form and allowable values are determined by physics, as with our

*R*

_{4/3}(

*F*), and which compare to measured data as shown in Figure 3A.

*M*passes closely through the data and minimizes the summed squared data–model mismatch, while

_{o}*M*

_{1}is an alternative. It is obvious that

*M*

_{1}is less likely to be the true model, but by how much? One usual answer emerges by choosing an error model for the data (normal distribution, σ = uncertainty), resampling a large number of times, and recording how many times

*M*

_{1}performs better than

*M*compared to the number expected if both models were equally good. The ability of the data set to discriminate between the two models in this fashion increases with the number of measurements, with γ

_{o}^{-1}, and with the separation between

*M*

_{1}and

*M*at each point.

_{o}If systematic mismatches between data and models arise for reasons not accounted for in the physical models, as in the case illustrated by Figure 3B, the same calculation based on resampling would lead to gross over-confidence in the likelihood of *M _{o}* if the random variable’s means are assumed to be equal to the measured values. Nonetheless, such data are evidence in favor of the model to which they are closest, and their usefulness for discriminating between models increases with model separation (useful in Fig. 3B, for example, but not in Fig. 3C). Additionally, to avoid overly confident claims in such cases, γ used in resampling should be whichever is larger, measurement uncertainty or the mismatch between data and

*M*.

_{o}Based on the preceding discussion, for each crystal in our analysis, we define the likelihoods (λ) of each *T*(*t*) relative to *T _{o}*(

*t*) based on the power of the data set to discriminate between models, assuming normality of errors, as

for an *R*_{4/3}(*F*) spectrum with *N* measurements. Here, *δ _{i}* is the capacity for discrimination at individual measurement

*i*, a measure of distance between models as defined in Figure 3D. The maximum value of

*δ*is the distance between models at

_{i}*i*, while

*δ*= 0, meaning that measurement

_{i}*i*has no discriminatory power, if the two models intersect at

*i*or if the data point lies halfway between them. The indicator

*s*is +1 or -1 depending on whether the measurement is closer to the model for

_{i}*T*(

_{o}*t*) or

*T*

_{1}(

*t*), respectively. Parameter s is a generalized uncertainty that is equal to whichever is larger, the measurement uncertainty or the root mean square mismatch of data from the best model,

*M*(Fig. 2). Equation 1 approximates well a particular form of the previously mentioned counting method for determining relative likelihoods (Fig. 4A).

_{o}When individual parameters are inferred by inverse analysis, a probability density function (pdf) of its value can be constructed. The ratio of a value’s probability density relative to density at the mode in such a pdf is equivalent to λ in our case. For a normal distribution, the 1σ and 2σ positions (68.3% and 95.4% confidence levels) occur at ratios of 0.607 and 0.135. We define our category boundaries in the same way, an assignment that is necessarily appropriate because λ depends on the summed behavior of numerous random variables (Figs. 4B–4C). We also define an “optimal” category for which λ > 0.90, to highlight the small number of bestperforming models.

### Results: Cooling Histories

Figures 1B–1O display inferred cooling histories for individual crystals, and Figure 2 illustrates the corresponding model matches to *R*_{4/3} data. Compared to their counterparts on the adjacent northern uplands, rocks in Tenaya Canyon, the deep northeast branch of Yosemite Valley (Fig. 5), experienced significant cooling much closer to present and were 25–50 °C warmer in the early Miocene, 20 m.y. ago (Figs. 1F–1J versus Figs. 1B–1E). This difference can be explained only if considerably more overburden sat above the canyon samples than their upland counterparts and the canyon deepened more recently. Late Cenozoic cooling also occurred in the central main Yosemite Valley (Figs. 1K–1L), but the magnitude is not well-constrained. High-quality sample SS1b (Fig. 1K) indicates much more cooling than low-quality sample CW1b (Fig. 1L). The magnitude of cooling decreased westward (Figs. 1M–1N).

*R*_{4/3} analysis lacks sensitivity to temperatures of <~25 °C, but geological constraints provide confirmation of the upland sites’ recent cooling history. Cosmogenic isotope analyses indicate that gently sloped interfluve regions of the non-glaciated granitic western Sierra erode slowly, at 8–24 m Ma^{-1} (Stock et al., 2005; Callahan et al., 2019). Such erosion would produce 4–20 °C of cooling since 20 Ma for geothermal gradients of 20–40 °C km^{-1}, as indicated by black bars in Figures 1B, 1C, and 1O. That such low erosion rates apply to the multimillion year timescale is confirmed by the presence of Pliocene lava flow remnants on the uplands between Yosemite and the San Joaquin Canyon (Huber et al., 1989) and by the observation that granite benches adjacent to erosion-resistant 9 Ma lava flows lie at elevations only 100–200 m beneath them (for example, fig. 3 of Hildreth et al., 2022).

## TOPOGRAPHIC INTERPRETATION

### Strategy

To interpret cooling histories in terms of erosion and topographic development, a numerical thermal-kinematic model of crustal temperatures is required to account for transient responses and spatial focusing of heat flux in canyons. Our model (Fig. 6) provides 2-D, time-dependent solutions of the heat equation on a domain representing a vertical plane intersecting both the Tenaya Canyon sample site (red dot in Fig. 1, red arrow in Fig. 5) and the Porcupine Flat upland site (blue dot, blue arrow). The five crystals from Tenaya Canyon allow determination of a well-constrained mutual signal, which provides a focus for analysis and unusually high confidence. We validated our 2-D approach by comparison to limited 3-D calculations ( Appendix 3). No attempt was made to model the full 3-D topographic evolution because, of the valley sample sites, only Tenaya Canyon provided enough redundancy for a strong interpretation of *R*_{4/3} data.

The model was optimized by matching predicted cooling histories *T*(*t*) for the canyon and upland sites to reconstructed site-averaged *T*(*t*) for the same two locations (*T*_{C} and *T*_{U}). Each such “target pair” was assigned a probabilistic relative likelihood score (P_{CU}) based on its ability to match measured *A*_{He} and *R*_{4/3}, combining probabilities for all crystals and both sites. Upon optimization, the crustal temperature model very closely reproduces *T*_{C} and *T*_{U}. The score P_{CU} thus also applies to the corresponding optimized model, including its topographic evolution scenario.

### Site-Averaged Cooling Histories and Relative Likelihood Statistic P_{CU}

Any cooling history consistent with isotopic data at a site must necessarily be an average of the individual crystal *T*(*t*) values that match data well. We calculated a range of *T*_{C} and *T*_{U} (Fig. 7A) as averages for all crystals at each site, weighted according to uncertainty, and spanning the range of l and *T*(*t*) values found for individual crystals ( Appendix 4). Each history, *T*_{C} or *T*_{U}, was used to drive the intracrystalline He model for each crystal (*i*) at its site (*j*) to calculate a (U-Th)/He age (*A _{ij}*) and an

*R*

_{4/3}(

*F*) spectrum. From the latter, we calculated the corresponding single-crystal relative likelihood,

*λ*, as described previously. The P

_{ij}_{CU}for a canyon history

*j*= C paired with an upland history

*j*= U depends multiplicatively on the target pairs’ probabilistic matches to both

*A*

_{He}and

*R*

_{4/3}for both sites:

where *P*_{∗} is the maximum value of the numerator in the entire set of pairs, and

The normalization to *P*_{∗} renders P_{CU} a comparative assessment of pairs to one another, under the assumption that we have examined the range of possible *T*(*t*) thoroughly enough that one of the pairs must be close to the true one. Such normalization is conservative, in that it can only increase likelihoods and so admit more models into a given category. In Equation 4, *A _{mi}* is the measured

*A*

_{He}for crystal

*i*, and σ

*is its standard deviation. Index*

_{i}*i*ranges through all of the crystals used at a site (five for Tenaya Canyon and two for the Upland site). Following universal practice of weighting individual data inversely to their uncertainty, we assign each crystal a weight,

*w*, which represents a generalized inverse uncertainty. Coefficients and

_{i}*w*

_{i}^{∗}indicate, respectively,

*w*normalized to the site maximum and to

_{i}*w*summed for the site. Weight factors

_{i}*w*are products of four indicators of a measured crystal’s ability to constrain temperature history. They are larger if the bestfit scenario,

_{i}*T*(

_{o}*t*), matches the data closely, if results are insensitive to removal of points at the beginning of

*R*

_{4/3}spectra, if mean results do not shift between the “optimal” and “1s” categories, and if results are insensitive to the starting age of the random

*T*(

*t*). Figure 2 gives values for weights, while Appendix 5 presents the contributing indicators and formulae.

To illustrate, the thick red and blue curves in Figure 7A are the *T*_{C}, *T*_{U} pair of greatest likelihood, and Figure 8 shows their performance against measured *R*_{4/3} and *A*_{He}. Figure 7B displays the temperature differences between canyon and upland for all pairs, with coloring according to P_{CU}.

### Crustal Temperature Model

To calculate crustal temperature evolution in response to exhumation, our model uses the control volume method of Patankar to solve the heat balance equation (Patankar, 1980; Cuffey et al., 2016) on a grid representing the 2-D, 41 km northwest-to-southeast transect through Tenaya Canyon and upland sample sites (Fig. 6A). The grid consists of 120 nodes in both horizontal and vertical directions, with closer spacing near the surface and centered on Tenaya Canyon, where control volume sizes are ~90 m ´ 90 m (Fig. 6B).

We impose a spatially uniform but time-varying background exhumation rate, represented in the model by an upward vertical velocity, and a sequence of surface topographies over time. The grid point at the topographic surface and its control volume are reconfigured at each time step as topography evolves, to ensure exact placement of the air-rock boundary and associated thermal boundary condition.

The Kelvin temperature, *T*, evolves with time, *t*, and elevation above grid base, *z*, according to:

where ρ is density, *c* the heat capacity, *k* the thermal conductivity, *Ṡ* is the source term (set to zero except in sensitivity tests), and *v* is the vertical velocity corresponding to background exhumation rate. Parameters *c* and *k* depend on temperature according to *k* = 2.34 - 0.00198[*T* + 273.15] W (mK)^{-1} and *c* = 680 + 1.70[*T* + 273.15] J (kg K)^{-1}, which are averages of typical granite with typical granodiorite (Miao et al., 2014; Robertson, 1988). Horizontal temperature gradients at left and right boundaries are set to zero, while the basal temperature is held constant. The latter is an adjustable parameter and proportional to the mean geothermal gradient. Temperature at the topographic surface is prescribed and varies with elevation according to a lapse rate of 6.16 °C km^{-1} and a temperature at the current bottom of Tenaya Canyon of 12 °C. Temperatures at nodes in the overlying atmosphere are likewise prescribed and have no effect on crustal temperatures. Simulations begin with a steady-state solution at 55 Ma and run forward to present, with 120 temporal nodes distributed to provide higher resolution for recent millennia. The first 5 m.y. of the simulations are discarded. Figure 6C shows one example of the calculated modern temperature field for a portion of the grid.

Differences in exhumation rates from place to place are controlled by evolving the surface topography through a sequence of forms based on simplified transformations of the modern topography, with varying degrees of elevation difference between the upland and canyon sites (Fig. 9). In general, the surface elevation of a landscape *z _{s}*(

*x,t*) evolves as the difference between rock uplift rate (

*x,t*) and exhumation rate

*ė(x,t)*. Here, we assume uniform uplift rate (

*t*) and spatially variable exhumation rate (

*t*) +

*ṡ(x,t)*, where

*ṡ*is the rate of change between the surfaces shown in Figure 9. The previously mentioned background exhumation rate equals (

*t*). In our models, consequently,

*ż*= –

_{s}(x,t)*ṡ*(

*x,t*), and

*z*(

_{s}*x,t*) simply equals the elevations in Figure 9. In reality, there would be additional uniform elevation changes from background exhumation not equaling isostatic or tectonic uplift. These changes are unknown. We are justified in ignoring them because any associated lapse rate surface temperature changes would have been identical at the different sample sites and in any case would be small in magnitude compared to exhumation cooling. The base of the model grid is fixed relative to the absolute elevations in Figure 9. Model results are optimized against

*T*(

*t*) histories for only the two sample sites, making the precise character of these topographic forms unimportant.

### Model Optimization

To produce the set of topographic evolution results, we optimized the crustal temperature model against each pair of canyon and upland cooling histories (Fig. 7A), *T*_{C} and *T*_{U}. Except for sensitivity tests, no attempt was made in this study to model the period of rapid cooling prior to 50 Ma. Optimizations were achieved by first choosing a starting topographic stage from the set in Figure 9, then directly calculating a constant geothermal gradient and a time-varying background exhumation rate (*v* in Equation 5) to match the 45 Ma upland-to-canyon temperature difference and the entire *T*_{U}, respectively (for example, Figs. 10A–10D). We then matched *T*_{C} by using a linear gradient-descent algorithm to assign ages to the topographic stages of Figure 9, starting from a nominal, standard progression. The time-varying background exhumation rate, combined with the time- and space-varying surface topography, offer sufficient degrees of freedom to always match the *T*_{C} and *T*_{U} histories closely, to within a thermochronometrically indistinguishable tolerance of 3 °C prior to 40 Ma (the approximate *A*_{He} for Tenaya Canyon samples) and usually to within 1 °C after (Figs. 7A and 10A–10B). Most optimizations used the sequence of topographic forms defined by the green and red curves in Figure 9. In some cases, prior to the late Cenozoic period of rapid canyon cooling, the upland cools more rapidly than the canyon, requiring use of the blue curves before the red ones.

Basal boundary temperature at 20 km depth is held constant and determines the mean geothermal gradient. The true time-averaged value for this gradient is unknown, but its inferred value scales inversely with net canyon incision between 50 Ma and present, and hence it depends on the starting topography chosen. The optimization for a *T*(*t*) pair was repeated using several different initial topographic stages, which produced a suite of results for different geothermal gradients that shared the same P_{CU} (e.g., Fig. 10E). Results for specified arbitrary geothermal gradients were then calculated by interpolation.

### Results

Our results (Fig. 11A) indicate substantial deepening of Tenaya Canyon since 10 Ma, of magnitude 0.85–1.3 km or 0.55–1.0 km for geothermal gradients of 30°C km^{-}1 and 40 °C km^{-1}, respectively (2s uncertainty, P_{CU} > 0.13). In the most likely scenario (Figs. 7C–7D; magenta in 11A), most of this deepening occurred since 5 Ma. A deepening of 0.55–1.3 km represents ~40%–90% of the current relief from the summit of Half Dome to the Yosemite Valley floor (Fig. 5). Thus, after a long period of comparative quiescence, the eastern section of the Yosemite landscape was transformed in the late Cenozoic. Increased downcutting most likely started between 3 Ma and 7 Ma (Figs. 11B–11C) but maybe as long ago as 10 Ma (1s uncertainty) or even 20 Ma (2s). Although the magnitude of inferred erosion is inversely related to the geothermal gradient, the timing of canyon deepening is largely independent.

Typical continental geothermal gradients are 20–40 °C km^{-1}, though higher values occur in active volcanic arcs and lower values occur today in the western Sierra foothills (Saltus and Lachenbruch, 1991; Blackwell et al., 2011). In our results, gradients of <~30 °C km^{-1} imply a reconfiguration of topography such that elevations at the current canyon’s location exceeded those at the upland site in the Miocene. Because such a reconfiguration of topography is unlikely, and because 45 °C km^{-1} is excessive compared to comparable non-volcanic settings, we regard a 30–40 °C km^{-1} gradient as likely in the vicinity of Tenaya Canyon from the mid Tertiary through the time of incision.

Temporal changes in the geothermal gradient could obscure the true temporal progression of incision. The gradient may have increased during the middle Miocene to Pliocene episodes of volcanism and delamination that occurred both north and south of the Yosemite region (Busby and Putirka, 2009; Saleeby and Foster, 2004), though the absence of intrusions suggests no major increase. Such an event would not alter our conclusion of late Cenozoic incision, however, as the possible magnitude of sustained rock reheating is zero at the surface and increases with depth. For example, a significant reheating lasting from 15 Ma to 5 Ma, of a magnitude consistent with our He isotopic data, would require ~1 km of Tenaya Canyon incision since 5 Ma ( Appendix 6). Reheating scenarios generally move the timing of inferred incision toward the present.

Episodes of burial could also, in principle, obscure a more complex incision history. The thermochronometric data do not preclude significant downcutting of Tenaya Canyon during the Oligocene or Miocene, if followed soon by reburial in volcanic debris and then by re-excavation since 5 Ma. Evaluation of such a scenario must rely on geological arguments, in particular the paucity of sediment accumulation in the Central Valley between 40 Ma and 15 Ma (Wakabayashi, 2013) and the fact that neither remnants of deposits nor volcanic sources exist anywhere in the Merced River watershed, of which Yosemite is a part.

## DISCUSSION: ROLES OF GLACIATION, TECTONISM, AND BURIAL

### Glaciation

Our data cannot resolve the fraction of canyon incision exclusively resulting from Pleistocene glaciation, but they do place limits on net Pleistocene erosion of the high-elevation upper Tuolumne watershed northeast of Yosemite. This region, the site of the famous Tuolumne Meadows, transitions westward from a platform into a deep canyon. This morphology suggests the platform did not experience deep late Cenozoic erosion despite glaciation (Huber, 1990). Our analyses confirm this (Figs. 1D–1E). Exceptionally high-quality crystal LDP-NH limits cooling since 10 Ma at this site to ≤20 °C (Figs. 1D and 2J; Appendix 2). Glacial erosion was likely limited by the ice spreading laterally instead of funneling into a channel, and by the massive, unfractured bedrock (Dühnforth et al., 2010). The former reduces the rate of ice flow, which is an observed and theoretical correlate of subglacial erosion (Herman et al., 2015; Hallet, 1979), while the latter inhibits subglacial quarrying dependent on the linkage of fractures driven by differential stresses imposed by variable ice load (Hallet, 1996; Hooyer et al., 2012). Indeed, cosmogenic isotope studies have demonstrated that the unfractured bedrock zones of this region eroded less than 2 m in the most recent 0.1 Ma (Dühnforth et al., 2010).

Portions of Tenaya Canyon upstream of our sample site exhibit V-shaped cross-sections, which suggests little influence from flowing ice, and this also corresponds to a zone of remarkably massive rock (Matthes, 1930). The lowest part of this section lies at an elevation of only ~0.3 km above our sample site. Downstream, however, in the eastern sector of Yosemite Valley proper (yellow star in Fig. 5), glaciers excavated a 0.6-km-deep basin that is now filled with sediment (Gutenberg et al., 1956). Our high-quality sample from this region (Fig. 1K; Appendix 2) does indicate significant late Cenozoic valley deepening, similar to Tenaya Canyon, again without constraining the Pleistocene fraction.

Late Cenozoic canyon downcutting diminished westward (Figs. 1M–1N), though some of the smaller cooling westward may reflect reduced geothermal gradients (Blackwell et al., 2011). The reduced incision and old *A*_{He} of the western canyon downstream of Yosemite (>70 Ma: Fig. 1, Table 1, sites MV06 and MC02) compared to Tenaya Canyon raise the possibility that Pleistocene glaciations were primarily responsible for the latter’s late Cenozoic deepening, as glaciation did not extend to the western sites and, in addition, could have prevented incision downstream by mantling the valley bottom with sediment. In some of the scenarios that best match isotopic data (Fig. 7), the majority of Tenaya Canyon incision occurred during the Pleistocene, so this hypothesis is viable. The strongest counterargument is the apparently non-glacial cross-section of upstream Tenaya Canyon.

### Tectonism

Although uncertainty about the Sierra crest’s elevation history prevents the identification of a specific driver for incision, the century-old hypothesis that Yosemite Valley downcutting relates to late Cenozoic tectonism remains probable given our results and accumulated insights about tectonic history. Cenozoic subduction beneath western North America terminated in stages, with the residual downgoing plate diminished enough by 15 Ma to allow westward trench retreat, which initiated the main phase of inland extensional deformation (Schellart et al., 2010), including separation of the Sierra Nevada microplate from the highlands to the east (Wernicke et al., 1988). Ongoing transtensional deformation from ca. 10 Ma to present faulted the eastern front of the range, beheaded valleys, and produced episodic volcanism (Busby and Putirka, 2009). Directly east of Yosemite, active faulting continues, with vertical slip between the Sierra Nevada and Mono Basin averaging ~1 km Ma^{-1} since 0.2 Ma (Bursik and Sieh, 1989). Total displacement of the eastern frontal faults may allow for only a few hundred meters of range crest uplift at this latitude (Martel et al., 2014), but concurrent thermal isostatic uplift could have contributed. Regardless, a small or moderate magnitude of rock uplift, combined with low erosion rates, would lead to surface uplift of the range crest. The associated westward tilting of the topography would only need to flush alluvium out of canyon bottoms to renew incision and produce headward (eastward) propagation of preexisting canyons of sufficient longitudinal slope. This would not occur in all canyons, however. Some rivers with large watersheds, including the San Joaquin and several northern Sierran drainages (Schweickert, 2009), originated east of the faulting zone and likely were already deeply incised. Beheading of these rivers would have reduced discharge and valley incision rates; development of such canyons is not a reliable guide to the development of Yosemite. A scenario in which overall relief of the Sierra Nevada, from Central Valley to range crest, was 2–3 km in mid-Cenozoic time and subsequently increased on the order of 1 km, causing differential responses of canyons according to their prior states, would explain all available unambiguous evidence. This includes the aforementioned geological constraints on canyon ages; the paleorelief of granite surfaces, which is known to be at least 2 km (Wakabayashi and Sawyer, 2001); Central Valley sedimentation rates, which were very low through most of the Cenozoic but increased after 15 Ma (Wakabayashi, 2013); water-isotopic signatures of a lee-side precipitation shadow, which require a mid-Cenozoic mean Sierran crestal elevation of at least 2 km but not necessarily comparable to modern heights (Mulch et al., 2006; Wheeler et al., 2016); and thermochronometric indicators reported here and previously (House et al., 2001; McPhillips and Brandon, 2012). The latter study, a crustal temperature model-based synthesis of *A*_{He} data from throughout the Sierra, concluded at 1σ confidence that crestal uplift started between 30 Ma and 10 Ma and caused ~1–3 km of elevation increase.

Tectonic drivers do not, of course, exclude a role for Pleistocene glaciation in the formation of Yosemite. The 0.6 km bedrock basin depth of the eastern main valley represents nearly 40% of the total bedrock relief at that location, entailing 1 km of exposed relief plus 0.6 km beneath sediment, and provides a minimum value for glacial downcutting here. Valley widening and undercutting to form near-vertical cliffs would inevitably accompany such glacial deepening (Harbor et al., 1988).

### Burial

Gabet and Miggins (2020) argued that late Cenozoic canyon incision in the northern Sierra Nevada did not require range-crest uplift as a driving process but instead occurred in response to the cessation of volcanic aggradation that was active prior to the early Pliocene. By analyzing topography and deposits in this region, they demonstrated that in some cases incision re-exposed prior canyons in the underlying Mesozoic pluton and in other cases carved new canyons. Though characteristic of the northern Sierra Nevada, extensive Tertiary volcanic deposits associated with these events do not extend southeastward from the Sonora Pass region (Fig. A1); they terminate at ~50 km northwest of Yosemite Valley. Although the Tuolumne and San Joaquin drainages, which are immediately north and south of the Merced watershed, conveyed some lahar and lava flows from eastward sources, there is no direct evidence that volcanic or alluvial debris covered the Merced watershed or its divide regions. Absent a way to prove this, however, a few important questions about hypothetical debris cover deserve attention.

Is it possible that Yosemite formed as a deep canyon prior to the Eocene, coincident with rapid erosion of the surrounding uplands (where *A*_{He} is ca. 60 Ma), and then had its *A*_{He} reset to the observed ca. 40 Ma age by deep burial during Miocene and Pliocene volcanism? Thermochronometric data negate this scenario. For example, a set of model calculations for Tenaya Canyon crystal TC01a, which starts with an uplandsite cooling history (with *A*_{He} = 63.15 Ma) and sequentially raises the temperature at 15–5 Ma by various specified amounts, shows that ~70 °C of reheating would be required to transform *A*_{He} to the observed value for Tenaya Canyon (Fig. 12B). But with such high temperatures, the calculated *R*_{4/3} profile is far too diffusive to match the data. In addition, we note that *A*_{He} at sites in the western Merced Canyon (MV06a and MC02c in Fig. 1 and Table 1) have not been reset to an age younger than the upland, an unlikely result if the canyon only 20 km upstream were sufficiently filled with debris to reheat by 70 °C (Fig. 1K).

What thickness of debris could have mantled the upland surfaces in our study region without violating the isotopic constraints? *A*_{He} values observed on upland surfaces throughout our study region and southeastward, paralleling the axis of the Sierra to beyond the Kings River, are all in the range 55 Ma to 70 Ma (Fig. 1, Table 1; House et al., 2001; transect T2). Such an old age and such consistency are very unlikely if ages have been reset by burial reheating. In particular, if resetting occurred most strongly at the northern end, closest to the known volcanic deposits, ages would increase southeastward, and this trend is not observed. We therefore assume that the upland samples have not been reset and perform model calculations for reheating scenarios that are again characterized by elevated temperatures between 15 Ma and 5 Ma. For the highest quality sample, LDP-NH (Fig. 2J), perturbing the best-match *T*(*t*) history in this fashion by various magnitudes indicates that up to ~25 °C of reheating would be undetectable (Fig. 12A). Depending on geothermal gradient, this corresponds roughly to a maximum thickness of 0.5–1 km. A similar reheating applied to the Tenaya Canyon samples’ cooling histories (Figs. 1F–1J) is untenable. As noted previously, however, the isotopic data do not preclude more complex histories such as canyon incision in the Oligocene or Miocene followed by burial of similar magnitude. Yet, such burial could not have propagated downstream with sufficient thickness to reset the isotopic signal at siteŝ20 km west of Yosemite Valley.

## CONCLUSIONS

Thermochronometric constraints demonstrate the validity of two long-argued hypotheses about the Yosemite region landscape: (1) that significant deepening of the canyon coincided with renewed tectonism in the later Cenozoic, and (2) that cumulative Pleistocene glacial erosion of the Tuolumne platform was insubstantial. Modern geothermal gradients in the western Sierra are low, and if also true for the Yosemite region during the last 10 m.y., then Tenaya Canyon deepened by more than 1 km in the same interval. A gradient as large as 40 °C km^{-1} implies at least 0.55 km of deepening. Our site with constrained deepening is situated in a transition zone between a canyon segment displaying little glacial erosion and one with substantial glacial erosion, according to geomorphological evidence. Our site’s elevation is onlŷ0.3 km below the former, which suggests that at least some of the inferred deepening is fluvial. Fluvial incision is plausibly a response to tectonic uplift and tilting, but the effects of spatially or temporally varying burial by volcanic debris merit investigation as an alternative (Gabet and Miggins, 2020), though constrained both by the absence of surviving deposits and by the preservation of old apatite Helium ages on upland surfaces and in the canyon downstream (westward) of Yosemite Valley. Beyond adding quantitative rigor to a foundational debate in geomorphology, and providing a new constraint for controversial arguments about the nature and drivers of Cenozoic evolution of the Sierra Nevada, our study presents a unique opportunity to provoke public interest in contemporary geomorphology and the evolution of a renowned landscape.

^{1}Supplemental Material. Samples and geochemical analysis: Data for bulk-crystal and site-averaged helium ages; ages for crystals used in

^{4}He/3He analysis. Data tables, step-heating: Tenaya Canyon and Yosemite Village; Northern Uplands; Western Canyon. All code available by request to the authors. Please visit https://doi.org/10.1130/GSAB.S.20427810 to access the supplemental material, and contact editing@geosociety.org with any questions.

## ACKNOWLEDGMENTS

We thank The Martin Family Foundation for ongoing support (to K.M. Cuffey), The Ann and Gordon Getty Foundation for ongoing support (to D.L. Shuster), J. Webb and N. Fylstra for field assistance, and Mark Brandon and Emmanuel Gabet for discussion. It was Gabet’s idea to consider debris accumulation as a viable mechanism for drainage reconfiguration. M. Fox was supported by Natural Environment Research Council (UK) award NE/N015479/1. We are grateful, in addition, to W. Guenther and C. Glotzbach for helpful reviews of the manuscript.

### APPENDIX 1. REGIONAL GEOGRAPHY

Figure A1 provides geographical context for some of the major features of central California discussed in the text.

### APPENDIX 2. QUALITY CONTROL OF *R*_{4/3} DATA

Bad data should not be used in analysis. “Bad” means that the process of measurement objectively failed to perform. For example, an out-of-focus image should not be used in a photogrammetric study. To do so may yield an incorrect result and would certainly obscure the precision achieved with better images in the same data set. In our case, the “process” is using the best available intracrystalline isotope model and a thorough exploration of possible multimillion-year cooling histories. In addition to normal measurement errors, a failure to reproduce systematic aspects of *R*_{4/3} data (an “out-of-focus” result) could result from significant and unmeasured spatial variations of parent isotope concentration (zonation, a particular problem if not radially symmetric), incomplete characterization of controlling physics (e.g., zones of enhanced diffusivity), edge effects produced during irradiation, and perhaps brief heating events accompanying fire or lightning strikes.

#### Rejection of Spectra

We compared each measured *R*_{4/3}(*F*) spectrum to its best-fit diffusive model. For the diffusive signal to be discernible, the model must capture the overall shape and value of the measurement. We quantified poor performance as follows. First, the data were smoothed with a spline to obviate random measurement errors. We compared this function, *R*_{4/3}^{D}, and its first derivative, , to the best diffusive model, *R*_{4/3}^{M}, and its first derivative, , using the metric

for a spectrum with *N* data points. We rejected measured spectra with *Q* > 1.0. The constant scale factors *S _{o}* and

*S*

_{1}are limits of acceptability. To compare mismatches between measured and modeled values, we set

*S*= 0.12, which is three times the mean 1σ uncertainty of all measurements. For comparing slopes, note that measured

_{o}*R*

_{4/3}typically ranges from ~0.3–1.3 over the range

*F*∈ [0,1], with a slope

*dR*

_{4/3}/

*dF*of about two at low

*F*values and zero at high

*F*values. If the model reproduces none of that convexity (i.e., is a line of slope one), the typical mismatch of measured and modeled slopes is approximately one. Thus, we set

*S*

_{1}= 1.

Fourteen of our 19 measured spectra passed the test for *Q* < 1.0 and were used in the manuscript: seven at the two sites used for inversion and seven more to illustrate the geographical pattern. Table A1 lists calculated *Q* values ranked from best to worst.

#### Removal of Limit Points

To achieve the clearest quantification of the geological diffusive signal in *R*_{4/3}(*F*) spectra, we removed data at the beginning and ends of the measurement sequences, where deviations from trends occur that cannot possibly be explained by diffusive models. Near the ends of the heating experiments, at *F* values close to 1, very little gas remains to measure, and the data can deviate strongly from the trend d*R*_{4/3}/d*F* ≈ 0 required by diffusive models. At the beginnings of the heating experiments, where *F* ≈ 0, some of the crystals from both canyon and up-land display a thin zone of depletion. The first one to three measurements lie below trend, and the slope d*R*_{4/3}/d*F* is considerably larger in the data than was calculated for any models. Figure A2 illustrates the most prominent example. As we are unsure of the origin of these depleted zones and therefore unable to model them, we have removed the associated measured steps from further analysis. The criterion for removal is a measured slope d*R*_{4/3}/d*F* greater than twice the slope of the corresponding best-fit model. Removal affects a very small fraction of the complete data sets. The maximum *F* values affected by removal—that is, the boundary between the last step removed and the first step retained—are all *F* < 0.015 for upland crystals and *F* < 0.033 for canyon crystals. Cases with more than one step defining the zone indicate that such depletion is only a concern for *F* ≤ 0.012, which is a trivial fraction of the total spectrum measured. As a sensitivity study, we repeated the Monte Carlo generation of cooling histories for all crystals both with and without these initial points. All canyon crystals still imply more than 30 °C of cooling since 15 Ma except for TC01b, for which this cooling drops to 18 °C. All upland crystals except MV05b still imply no significant recent cooling. For MV05b, the expected value of cooling since 15 Ma increases from 3 °C to 13 °C.

In principle, a crystal for which the inferred *T*(*t*) is strongly sensitive to the narrow initial zone should simply be omitted from analysis. For this reason, our definition of crystal weights used in analyses depends on how much inferred *T*(*t*) changes when adding or removing the initial measured step(s). The factor *q*_{2} defined in Appendix 5 fulfills this role.

### APPENDIX 3. COMPARISON OF 2-D AND 3-D MODELS

To assess the validity of a 2-D calculation and justify simplifying the 3-D problem, important issues given the complexity of the Yosemite region topography, we compared 2-D results to 3-D results calculated with the widely used thermo-kinematic model, Pecube (Braun, 2003). The latter uses a complete regional digital elevation model as the surface topography. Calculated differences of *T*(*t*) at our sample sites prove to be minor for the 2-D vs. 3-D cases. Figure A3 displays results from two scenarios, both of which use the modern topography for all times but vary the background exhumation rates over time. In this test, we used modern topography because it has maximal topographic relief, so that the topographic effects on the underlying temperature field are also maximal. Note that thermal parameters are constant in Pecube, so for this comparison we disabled the temperature dependencies of conductivity and capacity in the 2-D model.

### APPENDIX 4. SITE-AVERAGED COOLING HISTORIES

The sets of site-averaged cooling histories (Fig. 7A, gray curves), *T*_{C} and *T*_{U}, were calculated as follows. At either site, averaged histories are

where *i* indicates the crystal and *w _{i}* is the weight factor assigned to the crystal as described in Appendix 5. To reduce the thousands of randomly generated

*T*(

*t*) scenarios for each crystal to a manageable number, while nonetheless preserving the range of scenarios and matches to data, we first classified the individual crystal

*T*(

*t*) into three performance categories (

*j*= 1, 2, 3) for which the relative likelihood statistic λ ∈ [0.9, 1.0], [0.61, 0.9), or [0.13, 0.61). For a given crystal, we assembled the

*T*(

*t*) in each category into pdfs at 2 Ma intervals and calculated expected values. The progression of expected values over time measures the central tendency of the various

*T*(

*t*) in each category. To capture other aspects of the

*T*(

*t*) distributions, we calculated “grouped averages.” For a given crystal and category, we identified the age at which the range of

*T*values is greatest, typically around 2–4 Ma for Tenaya Canyon crystals and in the range of 25–40 Ma for upland crystals. At these ages, we divided the range of

*T*into four equal intervals, grouped together all of the

*T*(

*t*) within each interval, and averaged these

*T*(

*t*) for all times. Thus, a given crystal (

*i*) and category (

*j*) are described by five

*T*(

*t*) curves (

*k*= 1–5), comprising one expected value and four grouped averages. The advantage of using grouped averages instead of other statistics derived from pdfs, such as density maps, is the retention of information about covarying temperatures over time; for example, grouped averages will show whether the

*T*(

*t*) histories that are particularly cold or warm at 3 Ma are also particularly warm or cold at 10 Ma and 30 Ma or not. The entire process of calculating averages was performed separately for the set of individual crystal

*T*(

*t*) with start times of 90 Ma and 70 Ma.

### APPENDIX 5. WEIGHT FACTORS

Samples of excellent quality should contribute more than poor samples to the construction and assessment of site-averaged cooling histories and the calculation of performance metrics. “Quality” includes both goodness of fit to isotopic data and robustness with respect to choices of how to treat the data. For this reason, we defined the weight factor, *w*, for a crystal as the product of four quality factors in the range [0,1],

Table A2 lists the calculated *q* and *w* values for all analyzed crystals.

Factor *q*_{1} assesses the degree to which the *R*_{4/3} data can be explained by any *T*(*t*) model. Consider a spectrum *R*_{4/3}(*F*) for a given crystal, consisting of *N* steps. The degree of non-uniformity of the *R*_{4/3} data is given by , and the scaled mismatch of a *T*(*t*) model is

where *R*_{4/3}^{mi} represents the measured *R*_{4/3} values and *R*_{4/3}^{pi} represents the predicted values for the single best-performing model *T*(*t*) for the crystal, the one for which λ = 1.

Factor *q*_{2} assesses the robustness of the *T*(*t*) inferred from a crystal with respect to the inclusion of initial data points in the *R*_{4/3} spectra. As noted in Appendix 2, some of these measurements are of dubious quality and were omitted from analyses. If we reinstate them, the inferred *T*(*t*) changes, and such changes reduce confidence in the results. Define *E*_{0}(*t*) and *E*_{1}(*t*) as the expected values of the optimal *T*(*t*) distributions calculated without and with the initial *R*_{4/3} measurements. *q*_{2} decreases with the average absolute difference of *E*_{0} and *E*_{1} calculated across a specified age range (0–20 Ma for Tenaya Canyon crystals, and 0–40 Ma for upland crystals) and normalized to a temperature scale (*T*_{∗} = 30 °C):

where the brackets 〈〉 signify the average of the enclosed quantity over the specified time interval.

Factor *q*_{3} assesses the robustness a crystal’s inferred *T*(*t*) by comparing the mode of the optimal distributions versus the mode of the 1σ distributions. Ideally, as likelihood decreases from optimal fits to the outer bounds of the 1s category, the modal values of the *T*(*t*) distributions would not change, even as the distributions become wider. This is not the case for some of our crystals, as the *T*(*t*) models with lower likelihoods tend to be shifted to one direction relative to the optimal ones. *q*_{3} quantifies this effect using an equation identical to Equation 10 but with *E*_{1}(*t*) and *E*_{0}(*t*) referring to the expected values for optimal and 1s distributions.

Finally, factor *q*_{4} assesses the robustness of a crystal’s inferred *T*(*t*) with respect to the starting condition for the randomly generated individual *T*(*t*) paths used for intracrystalline models. In the nominal case, the starting condition is 150 °C at 90 Ma. As a sensitivity test, we shift the start time to 70 Ma. Again, *q*_{4} quantifies this effect using Equation 10 but with *E*_{1}(*t*) and *E*_{0}(*t*) referring to the expected values for the optimal distributions in the 90 Ma and 70 Ma start-time cases.

### APPENDIX 6. REHEATING SCENARIOS

For every monotonic cooling history, *T*(*t*), there exist alternative scenarios of identical *A*_{He} that involve enhanced early cooling followed by warming to temperatures above *T*(*t*) prior to final cooling. Depending on the data quality for particular crystals, these scenarios may also be acceptable with respect to *R*_{4/3} data. Such reheating is unlikely to be significant at Yosemite, given the absence of deep burial or magma injection during the Cenozoic. Nonetheless, some reheating of rocks at depth, such as those now exposed in Tenaya Canyon, may have occurred due to increased asthenospheric temperatures and corresponding enhanced geothermal gradients at times of late Cenozoic volcanism or crustal delamination episodes, as discussed, for example, in Busby and Putirka (2009) and Saleeby and Foster (2004). We examined two scenarios, one a period of enhanced gradient from 16 Ma to 5 Ma and the other a brief rise in gradient from 4 Ma to 2 Ma. The black curve in Figures A4A–A4B is one of the optimal histories for Tenaya Canyon rocks. The colored curves in Figures A4A–A4B illustrate several alternative histories of the same *A*_{He}. To identify these, we specified the timing and magnitudes of prior cooling, and then adjusted the magnitude of the reheating to match *A*_{He} using the He production and diffusion model of Shuster and Farley (2005) and the radiation damage and annealing model of Willett et al. (2017). The similarity of temperatures during the warm reheating periods despite the larger differences during the prior cold periods reflects the exponential increase of He diffusivity with temperature.

The increase of temperature at the onset of a reheating event for a Tenaya Canyon sample—that is, the temperature rise from the end of the cold period to the start of the warm period—depends multiplicatively on the increase of the gradient and the sample depth at that time. The latter also equals erosion of the canyon bottom from onset of reheating to present, and hence it approximates the net canyon incision since that time, given the slow erosion rate of the adjacent upland. Call the gradients immediately before and after the onset of reheating *G _{o}* and

*G*, respectively, and the sample temperatures at these times

_{r}*T*and

_{o}*T*. Given a contemporaneous surface temperature,

_{r}*T*, the net erosion of the canyon bottom since onset time is Δ

_{s}*z*= [

*T*]

_{r}- T_{s}*G*

_{r}

^{–1}and the ratio of gradients would have been

*G*= [

_{r}/ G_{o}*T*] / [

_{r}- T_{s}*T*]. Values for the gradients are unknown, but it is reasonable to assume that

_{o}- T_{s}*G*was in the range of 10–40 °C km

_{o}^{-1}and that

*G*was <60 °C km

_{r}^{-1}. With these stipulations, the six scenarios shown in Figures A4A–A4B imply net canyon erosion values Δ

*z*as displayed in Figure A4C (red and blue curves) for a range of values of the unknown

*G*. The black curves illustrate the no-reheating scenario corresponding to the original monotonic

_{r}*T*(

*t*). Scenarios with reheating generally imply larger net erosion close to the present and thus do not change the essential conclusions of our study.