Abstract
The shear‐wave velocity time averaged over the upper 30 m (VS30) is widely used as a proxy for site effects, forms the basis of seismic site class, and underpins site‐amplification factors in empirical ground‐motion models. Many earthquake simulations, therefore, require VS30. This presents a challenge at regional scale, given the infeasibility of subsurface testing over vast areas. Although various models for predicting VS30 have thus been proposed, the most popular U.S. national, or “background,” model is a regression equation based on just one variable. Given the growth of community data sets, satellite remote sensing, and algorithmic learning, more advanced and accurate solutions may be possible. Toward that end, we develop national VS30 models and maps using field data from over 7000 sites and machine learning (ML), wherein up to 17 geospatial parameters are used to predict subsurface conditions (i.e., VS30). Of the two models developed, that using geologic data performs marginally better, yet such data are not always available. Both models significantly outperform existing solutions in unbiased testing and are used to create new VS30 maps at ∼220 m resolution. These maps are updated in the vicinity of field measurements using regression kriging and cover the 50 U.S. states and Puerto Rico. Ultimately, as with any model, performance cannot be known where data are sparse. In this regard, alternative maps that use other models are proposed for steep slopes. More broadly, this study demonstrates the utility of ML for inferring below‐ground conditions from geospatial data, a technique that could be applied to other data and objectives.
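
For readers unfamiliar with the quantity, the time-averaged shear-wave velocity over the upper 30 m (commonly written V_S30) is conventionally computed from a layered velocity profile as the total depth divided by the vertical shear-wave travel time. The expression below is the standard definition, not a formula quoted from this study; the layer thicknesses h_i, the layer velocities V_{S,i}, and the two-layer numbers are illustrative assumptions.

```latex
% Standard definition: 30 m divided by the shear-wave travel time through the top 30 m
V_{S30} = \frac{30\ \text{m}}{\displaystyle\sum_{i=1}^{n} \frac{h_i}{V_{S,i}}},
\qquad \sum_{i=1}^{n} h_i = 30\ \text{m}

% Hypothetical two-layer profile: 10 m at 200 m/s over 20 m at 400 m/s
V_{S30} = \frac{30}{\dfrac{10}{200} + \dfrac{20}{400}}
        = \frac{30}{0.05 + 0.05}
        = 300\ \text{m/s}
```

Because the average is taken over travel time rather than thickness, slow near-surface layers weigh more heavily: the thickness-weighted mean of the same profile would be about 333 m/s.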