Landslides pose a major natural hazard, and heterogeneous conditions and limited data availability in the field make it difficult to connect mapped landslide inventories to the underlying mass-failure mechanics. To test and build predictive links between landslide observations and mechanics, we monitored 67.89 h of physical experiments in which an incising and laterally migrating river generated landslides by undercutting banks of moist sand. Using overhead photos (every 20 s) and 1-mm-resolution laser topographic scans (every 15–30 min), we quantified the area, width, length, depth, volume, and time of every visible landslide, as well as the scarp angles for those within 3 min prior to a topographic scan. Both the landslide area–frequency distribution and area–volume relationship are consistent with those from field data. Cohesive strength controlled the peak in landslide area–frequency distribution. These results provide experimental support for inverting landslide inventories to recover the mechanical properties of hillslopes, which can then be used to improve hazard predictions.

Each year, documented landslides kill ∼8200 people (Haque et al., 2019) and cause over U.S.$15 billion in direct damages to buildings and infrastructure (Kjekstad and Highland, 2009). Landslide hazards continue to rise as humans destabilize slopes through mining and construction activities (Froude and Petley, 2018) and ongoing climate change increases the probability of high-magnitude rainfall events (Pendergrass and Hartmann, 2014) that saturate the shallow subsurface and induce failure (Pelletier et al., 2015; Chae et al., 2017; Haque et al., 2019). Therefore, understanding landslide triggers and the links between observed landslide statistics and their mechanistic causes is essential to further our fundamental understanding of Earth’s dynamic surface and inform emergency-management decisions that can mitigate loss of life and property.

Efforts to link landscape-scale field observations to explicit physical landslide failure conditions and mechanisms are stymied by (1) heterogeneous and time-varying conditions in the natural environment (Pelletier et al., 1997; Chae et al., 2017) and (2) inconsistent and often insufficient data with which to uniquely resolve these failures and link them to a discrete cause (Marc and Hovius, 2015). Field inventories from before and after landslide-triggering events provide a basis for understanding landsliding statistics (Malamud et al., 2004; Larsen et al., 2010), but they often include only areal extent (Kirschbaum et al., 2015; Chae et al., 2017) and may be biased toward larger landslides (Stark and Hovius, 2001). In this study, we leveraged this readily available information to gain insights into the underlying mechanics of landslides and estimate material parameters relevant to elastic (i.e., Mohr-Coulomb) failure.

To reduce the number of unknowns while retaining key elements of the natural system, we turned to physical experiments (Paola et al., 2009; Prancevic et al., 2018). In these experiments, we investigated the mechanistic reason behind the shape of landslide area–frequency distributions (Malamud et al., 2004). Field observations of landslides spanning different environments, triggers, and failure mechanisms share a common humped distribution that includes a few very small landslides, a peak in landslide frequency at a relatively modest size, and a heavy tail of larger landslides (Noever, 1993; Pelletier et al., 1997; Dai and Lee, 2001; Stark and Hovius, 2001; Guzzetti et al., 2002; Malamud et al., 2004; Hurst et al., 2013). Hypotheses to explain the low occurrence of small landslides include (1) undersampling due to imagery resolution (Stark and Hovius, 2001; Guzzetti et al., 2002; Tanyaş et al., 2017); (2) undersampling due to “landscape annealing,” i.e., the natural regrading and revegetation of landslide scars that make them smaller and less visible over time (Hurst et al., 2013); (3) limits of cohesive and/or frictional forces acting on hillslopes and landslide blocks (Frattini and Crosta, 2013; Milledge et al., 2014; Jeandet et al., 2019), including their modulation by soil moisture (Pelletier et al., 1997); (4) self-arrest within the granular medium (Noever, 1993); and (5) limits in relief above channel heads and valley bottoms (Pelletier et al., 1997; Guzzetti et al., 2002; Jeandet et al., 2019). In the field, multiple conditions that could produce a rollover toward fewer small landslides commonly coexist, making it difficult to extract the underlying cause(s). When analyzing complete (i.e., fully sampled) inventories, researchers interpret the rollover to result from cohesion, friction, and/or topographic relief (Pelletier et al., 1997; Stark and Hovius, 2001; Guzzetti et al., 2002; Malamud et al., 2004; Jeandet et al., 2019).

Within a 3.9 × 2.4 × 0.4 m box filled with uniform 0.140 ± 0.04 mm sand (Tofelde et al., 2019; Savi et al., 2020), we allowed a braided river to incise the material from a water-and-sediment feed toward a user-defined sea level, carving into its valley walls and triggering autogenic landslides in the process (Fig. 1A; Figs. S1 and S2 in the Supplemental Material1). We applied standard remote-sensing approaches for landslide mapping (Larsen et al., 2010; Chae et al., 2017) to this mesoscale experiment using 0.89-mm-resolution overhead photos (every 20 s; Fig. 1B; Fig. S1) and 1-mm-resolution topographic scans (every 15–30 min; Figs. 1C and 1D) to identify and map failures (Figs. 1C–1E; Fig. S1). These data sources are analogous to high-resolution lidar digital elevation models (DEMs) and remotely sensed imagery used in field-based landslide studies. All data and analysis code are freely available via DOI-linked repositories; see Beaulieu et al. (2020), Wickert (2020), and Witte and Wickert (2020).

Visual tests indicated the minimum detectable landslide to be ∼100 mm2. We then defined a limit of consistent detections (Guzzetti et al., 2002) below which we excluded landslides from our area–frequency analysis (Fig. 2); this value was twice the minimum in each horizontal dimension (i.e., 400 mm2). Both our minimum-detection limit and limit of consistent detections were more conservative than pixel-to-landslide-size ratios used similarly in the field (Guns and Vanacker, 2014). Furthermore, the 20 s lag time between images allowed us to record every landslide before landscape annealing (through erosion or reworking of deposits; Hurst et al., 2013) or landslide amalgamation (Marc and Hovius, 2015) became significant sources of error.

Although the 15–30 min lag between topographic scans prevented us from directly observing topographic change between each individual landslide, we were able to measure individual scarp angles for landslides that occurred within 3 min prior to each scan (Figs. 1D and 1E). Undercutting acted as the dominant landslide trigger (Figs. 1B; Fig. S1), creating failure blocks with a trapezoidal cross section bounded by the plateau surface, free surface, undercut surface, and scarp (Fig. 3). For these, we defined failure planes by projecting scarps to the thalweg elevation (Fig. 1E). For blocks too narrow to construct a trapezoid at the given scarp angle (4% of the total inventory), we inferred a triangular geometry with a steeper scarp defined by landslide width and plateau-to-thalweg height. To estimate landslide volume, we multiplied this cross-sectional area by the down-valley length of each landslide (see the Supplemental Material). In addition, we solved the stress balance along the failure plane to compute the cohesion of the unsaturated sand at the time of failure (Fig. 3).

Landslides in each of our six experiments (Table S1) were triggered by slope instability as the river incised and widened its valley. We measured scarp geometries (76° ± 11°, one standard deviation) and failure mechanisms (2% topples; 98% landslides), indicating that failures occurred predominantly as steep translational slides (Fig. 3) as the river undercut and destabilized valley walls. From these landslides, we characterized area–frequency and area–volume relationships, analyzed waiting times between individual landslides, and established a relationship between landslide width and substrate cohesion. Our goal was to characterize these observables beyond what has been possible in the field and link them to stress conditions at failure (Gallen et al., 2015; Jeandet et al., 2019).

The first measure, area–frequency distribution, is the most common measure in field-based landslide inventories due to its amenability to remotely sensed imagery. The area–frequency distribution of landslides in the experiment closely matched an inverse-gamma (Γ) distribution (Figs. 2A and 2B) popularized by Malamud et al. (2004):

Here, f is the probability density, A is the landslide area, r+1 is the power-law exponent that describes the decay in frequency of medium and large landslides, a is the exponential decay constant that describes the decrease in frequency of small landslides, and s is the area of the most frequent (i.e., modal) landslide. When fit to our experimental data, the modal landslide area was 13 cm2, which is significantly greater than the limit for consistent detection (4 cm2), demonstrating that the rollover in the landslide area–frequency distribution is real and not an artifact of inventory resolution.

Measurements of landslide areas are much more common than those of volumes, which require field measurements (Larsen et al., 2010) and/or repeat elevation surveys (Dietze et al., 2017). To estimate landslide volumes, which are significant values for determining total erosion, sediment mass delivery to rivers, and links between physical erosion and atmospheric carbon dioxide consumption, researchers often use a power-law scaling relationship (Hovius et al., 2000; Malamud et al., 2004; Larsen et al., 2010; Marc and Hovius, 2015),

Here, V is landslide volume, α is the scaling coefficient, A is landslide area, and γ is the scaling exponent. The landslide area–volume relationship for all of our experiments followed Equation 2, with α = 0.34 and γ = 1.36 (Fig. 2C). This scaling exponent falls within the field-observed ranges for landslides in both soil (γ = 1.1 to γ = 1.4) and rock (γ = 1.3 to γ = 1.6) (Larsen et al., 2010).

Implementing physics-based approaches to mapping landslide susceptibility (Casadei et al., 2003; Bellugi et al., 2015; Chae et al., 2017) can be difficult due to (1) data resolution, (2) data quality, (3) spatial variability in geotechnical properties, and (4) temporal variability in water content (i.e., pore-fluid pressure) (Pelletier et al., 1997; Chae et al., 2017). Our experiments overcame most of these observational limitations, thus allowing us to directly compare landslide sizes with stress balances, which then could be applied to analyze hazard potential in a landscape (e.g., following Jeandet et al., 2019).

Using the measured mass-failure geometries, we computed the corresponding cohesion of unsaturated sand, σc, by applying a stress balance at incipient slip (Fig. 3, combined for slides) to each block (Equations S1 and S2 in the Supplemental Material). By comparing this with block-topple torque-balance calculations across the full data set (Equation S3), we found that assuming a slide-type failure overpredicted the failure strength of toppled blocks by 30%–35%. If 10% of failures occurred as topples, which is greater than the 2% observed and closer to the 11% predicted from linear Mohr-Coulomb theory, this would shift the peak in the predicted stress–frequency distribution by 3.0%–3.5%. This difference is insignificant compared to the range of experimentally determined (250–750 Pa) and landslide-inverted (∼100–2300 Pa) cohesion values (Fig. 4A), indicating that our results are robust despite the lack of a documented mechanism for each individual failure.

Because the exact moisture content of the unsaturated sand in our experiments was unknown and spatially variable, we computed both the weight of the landslide block (this changed little, see the Supplemental Material; our density calculations here assumed 10% of the pores were water-filled pores) and its cohesion as functions of moisture content. Experiments employing fine sands (Richefeu et al., 2006; Lu et al., 2009) and associated theory (Lu et al., 2009) predict a cohesive-strength range of 250–750 Pa. This range encompasses the peak in our cohesion–frequency distribution (Fig. 4A).

The overlap between the literature-based cohesion and the peak in calculated cohesion at failure from our landslide inventory provides experimental support for the idea that landslide inventories may be inverted to obtain the physical properties of the underlying substrate (Gallen et al., 2015; Jeandet et al., 2019). Landslide areas and slopes are observable from imagery and DEMs, while cohesive strength at failure is the key variable we hope to constrain. Fortunately, as landslide area increases, so does the shear stress at failure due to the increased thickness and—in the case of undercutting—unsupported width of the sliding block (Figs. 2B and 3). The tight range of failure angles within our experiment (76° ± 11°) suggests that the dominant relationship should be between landslide area and cohesion. An analysis of the landslide distribution (Fig. 4B) demonstrated that the area–frequency peak corresponds with failure at the expected cohesion for unsaturated fine sand (Richefeu et al., 2006; Lu et al., 2009).

Our experimental work provides a data-rich basis for inverting landslide distributions in the field to estimate cohesion. This inversion also requires knowledge of slope, slide thickness, pore-fluid pressure, and seismic accelerations. DEMs and seismic networks may provide slope and seismic accelerations. Topographic surveys or scaling arguments (Larsen et al., 2010; Marc and Hovius, 2015) can provide thickness. Pore-fluid pressure integrates heterogeneous rainfall patterns and substrate properties, making it a critical parameter to measure. Lacking this, one can bracket the solution using a range of plausible water pressures at failure. These cohesion estimates can then be applied to generate physics-based maps of landslide hazard.

Our physical experiments triggered landslides through river incision and lateral erosion. We mapped a complete inventory of these landslides. Their area–volume relationship is consistent with field inventories from landslides in both soil and bedrock. Their area–frequency distribution follows an inverse gamma function in which the peak is controlled by cohesion in moist, unsaturated, fine sand. This conveniently means that landslide statistics provide a measure of the Mohr-Coulomb cohesion that determines their occurrence.

Stefanie Tofelde designed the experimental basin with intellectual input from Chris Paola, Jean-Louis Grimaud, and Sara Savi, and constructed it with the support of Ben Erickson, Richard Christopher, Chris Ellis, Jim Mullin, Eric Steen, and Sara Savi. Eric Steen also provided valuable assistance with the laser scanner. We heartily thank two anonymous reviewers for their constructive comments; these significantly improved the form and rigor of the final article. Start-up funds awarded to Wickert by the University of Minnesota supported Beaulieu and the construction of the experimental basin. Beaulieu also received support from the Richard C. Dennis Fellowship awarded by the Department of Earth Sciences at the University of Minnesota. This material is based upon work supported by the National Science Foundation under grant EAR-1944782.

1Supplemental Material. Supplemental methods and figures for landslide experiment analysis. Please visit to access the supplemental material, and contact with any questions.
Gold Open Access: This paper is published under the terms of the CC-BY license.