Abstract
Fitting parametric seismological models to earthquake catalogs often comes with numerical challenges, especially when catalogs are small. An alternative way to quantify parameter values for a seismic region is by eliciting expert opinions on the seismological characteristics that each parameter corresponds to. For instance, expert beliefs on aftershock patterns can be formulated into prior distributions for aftershock parameters, for example, for the epidemic‐type aftershock sequence (ETAS) model. We illustrate such a method by not only eliciting priors for ETAS parameters for the Pacific Northwest (PNW), a subduction zone with a complex tectonic environment, but also a relatively small catalog. We compare these priors with those suggested by the ETAS literature for global subduction zones, discussing implications for aftershock forecasting for the PNW.
Introduction
Parametric models built on earthquake catalogs are ubiquitous across seismology, but estimating these model parameters can be complicated or impossible when the catalogs are small, for example, for regions with lower seismicity rates or on subcatalogs of interest. Statistical approaches exist for eliciting experts’ prior knowledge on model parameters, which can help infer parameters in such data‐limited contexts. We develop a prior elicitation approach and illustrate how it can be used to construct prior distributions for aftershock parameters for the Pacific Northwest (PNW)—a region with a sparse catalog but high seismic risk.
Aftershock models and parameters
Aftershocks are known to follow several empirical patterns, which have been formalized into seismological laws that are governed by parameters. These describe how many aftershocks can be expected and how aftershock rates should decay in time and space around their triggering mainshock. Statistical seismicity models built around these laws can be used to calculate aftershock forecasts following a large mainshock. The epidemic‐type aftershock sequence (ETAS) model is the leading statistical seismicity model (Ogata, 1998) (details in the ETAS Model section in the supplemental material) and has been found to outperform other models in both short‐term and long‐term earthquake forecasting (Schorlemmer et al., 2018). The values of ETAS parameters (described in Table 1) will determine the corresponding aftershock forecasts. For its aftershock forecasts, the U.S. Geological Survey (USGS) first draws parameters from distributions of values that fit well to previous sequences from similar tectonic settings around the world before updating to sequence‐specific parameter estimates (Page et al., 2016).
However, ETAS parameters are known to vary by region, even for the same tectonic environment (e.g., Mesimeri and Pankow, 2022). ETAS parameters are thus concise barometers to summarize and compare the tectonic features of seismic regions. For example, understanding whether a region’s p falls below or above 1, the canonical value, can be related to local stress transfer conditions (Seif et al., 2017). This also means that using broader parameter distributions that combine seismic regions across the globe may lead to inaccurate or imprecise aftershock forecasts. ETAS parameters should be informed by local conditions and ideally should be estimated from regional earthquake catalogs.
Though ETAS parameters conveniently describe aftershock patterns, there are challenges in estimating them from catalogs. Catalogs are often incomplete following a large earthquake, and traditional approaches like raising the cutoff magnitude can bias aftershock forecasts (Harte, 2018). Maximum‐likelihood estimation (MLE) is typically used to estimate parameters, but this is complicated by the correlations between parameters, which can bias parameter estimates and their uncertainties (Seif et al., 2017). The corresponding aftershock forecasts may substantially over‐ or underestimate the actual number of events. Indeed, the asymptotic statistical theory that guarantees well‐behaved likelihood‐based inference for ETAS parameters is only appropriate for large samples (Ogata, 1998), but aftershock forecasts are needed even in regions with sparse catalogs.
Earthquake catalogs are just one source of information about a region’s seismological phenomena, for example, aftershock patterns. The geological and geophysical knowledge that local experts have can also help infer parameters for models like ETAS. Such prior knowledge can be formulated into statistical prior distributions as a first step for understanding a region’s parameters and is then used for Bayesian parameter estimation. Here, prior distributions for the parameters are updated by the data (catalog) using Bayes’ rule, returning posterior parameter distributions. This is especially beneficial when catalogs are sparse, either due to naturally low seismicity rates or tight windows in space, time, or magnitude. If the data do not sufficiently support specific parameter values, then the parameter’s posterior distribution resembles its prior. Both the preexisting beliefs about a parameter and its best estimate from the data can be expressed with uncertainty using statistical probability distributions (Albert et al., 2012).
Statistical methods to capture prior information
Statistical prior elicitation is the process of collecting expert knowledge and expressing it using probability distributions. Many methods for prior elicitation exist, but they tend to follow a similar procedure in four steps:
- 1.
identify experts and quantities of interest for which to elicit priors;
- 2.
elicit priors by interviewing experts about the quantities of interest and aggregate elicited quantities across the experts;
- 3.
fit probability distributions on these elicited parameter values; and
- 4.
collect feedback from experts, and repeat steps 3 and 4 as necessary.
The first step requires recruiting experts that meet prespecified requirements; we refer the reader to the statistical literature, which provides criteria in making these decisions (e.g., Garthwaite et al., 2005; Genest and Zidek, 1986). In our example, we use just two experts to illustrate the method, but more than this are usually required. Furthermore, we may wish to elicit priors on quantities related to the parameter, rather than the parameter itself. For example, for those experts unfamiliar with the ETAS model, eliciting values for aftershock duration will be more accessible than for the p parameter it is linked to (see Table 1).
Elicitation methods for the second step include asking experts directly for summary statistics of the prior distribution (e.g., medians, modes), having experts draw probability distributions by hand or bet on the parameter value being in different intervals (Johnson et al., 2010). We focus on elicitation using summary statistics. Although means and variances might be the most straightforward way to specify common distributions, experts have difficulty expressing variances numerically, while elicitations of means have been found to be error‐prone for skewed distributions (Garthwaite et al., 2005). Another approach is the bisection or quartile method (Oakley, 2010), wherein the expert first specifies the probable bounds of the parameter range and values that bisect the probability distribution (e.g., median, 25th and 75th percentiles). Such interval elicitations may be more intuitive than eliciting means and variances, though experts have been found biased toward intervals that are too short, i.e., variance too small (Garthwaite et al., 2005). Once summary statistics have been elicited, they need to be aggregated across multiple experts; common approaches are averaging, pooling (an extension of weighted averaging), or hierarchical methods (Genest and Zidek, 1986; Albert et al., 2012), though a wider literature exists on this in Bayesian statistics. We refer the reader to textbooks (O’Hagan et al., 2006) and literature reviews (Stefan et al., 2022; Mikkola et al., 2023) on this topic to help select the method of prior aggregation.
These aggregated statistics can be used to specify a best‐fitting parametric distribution, for example, a normal or Gamma distribution. Distributions should be chosen that match the degree of uncertainty or skewness corresponding to the expert’s beliefs. Finally, the distributions can be updated through an iterative feedback process with the experts, which is crucial since numerous distributions can be specified to a given set of summary statistics.
Aftershock patterns in the PNW and other subduction zones
The PNW of North America has substantial seismic risk, with high populations near seismic zones; however, using the ETAS model to characterize its aftershock patterns poses challenges. First, the PNW hosts multiple tectonic regimes, which are hypothesized to have distinct aftershock patterns (Gomberg and Bodin, 2021). Although the Cascadia subduction zone can generate massive but infrequent megathrust earthquakes, the depths of the subducting slab can also produce large intraslab earthquakes, which are more frequent. Numerous shallow faults furthermore exist in the crust of the North American plate, which produces crustal earthquakes (Bostock et al., 2019). Another limitation is the scant size of its instrumental catalog: a recent catalog has less than 100 earthquakes above M 4.0 since 1975, most of them crustal (Schneider et al., 2023).
Previous work has examined aftershock dynamics in other subduction zones by fitting ETAS models or their components to subduction zone aftershock sequences around the world. While Shcherbakov et al. (2013) found that p values in subduction zones stay fairly close to the canonical p = 1, Page et al. (2016) and Tsapanos (1995) both found evidence for p < 1 for subduction aftershock sequences. On the other hand, Zhang et al. (2020) found much higher p values (p > 1), though with considerable variation as well as systematic biases caused by, for example, the model’s isotropic spatial kernel and correlated parameters. Aftershock productivity parameters for subduction zones have been found to be roughly similar, on average, to other seismic zones (Page et al., 2016; Dascher‐Cousineau et al., 2020), though these can vary greatly not just across region but also by sequence (Zhang et al., 2020). Mainshocks at deeper depths have also been observed to produce fewer aftershocks (Bilek and Lay, 2018). Only Zhang et al. (2020) estimated spatial parameters for subduction zone aftershocks, finding that estimates for q and d may depend on magnitude cutoffs, among other factors.
Recent evidence suggests that the PNW may follow patterns distinct from other subduction zones. The sole available study estimating aftershock productivity parameters in the PNW found that its mainshocks are half as productive as in other global subduction zones, except for crustal events (Gomberg and Bodin, 2021). Productivity was anticorrelated with mainshock depth, though this varied across the region. As of yet, no models have been published for the temporal or spatial patterns of aftershocks in the PNW. To avoid overpredicting aftershock productivity following mainshocks in the PNW (as Gomberg and Bodin, 2021 suggest the USGS forecasting system currently would), regional parameter distributions for the PNW are crucial.
We elicit priors for aftershock parameters for the PNW, considering the intraslab and crustal regimes separately. We also describe a second approach for constructing these priors using estimates from the literature on ETAS models for global subduction zones. These different priors were used when building Bayesian ETAS models for the PNW (Schneider, 2021). In this work, we compare these priors and discuss implications for aftershock forecasting and how PNW aftershock parameters may relate to those for other subduction zones.
Eliciting Priors for Aftershock Parameters
We next demonstrate how to elicit priors for aftershock parameters through the four‐step procedure outlined in the previous section. Note that this is intended only as a simplified illustration of this method, and we underscore where it may be expanded to be made more robust.
Step 1: Identifying experts and quantities of interest
We identified two leading researchers on PNW seismicity as experts. One is a former manager of the Pacific Northwest Seismic Network and the other is a seismologist at a federal agency who specializes on subduction zones, including the PNW.
We directly elicited values for the , c, and d parameters. For c and d, we asked for the expected time and distance, respectively, before the power‐law aftershock rate decay begins. For , we elicited beliefs about the proportion of triggered seismicity for a long‐term period of seismicity (which we then converted to background proportion and then background rate). In the PNW, 100 earthquakes of M 2.0+ occur roughly every seven years, on average. We thus phrased this question as: “For the next 100 earthquakes (seven years), what proportion of them do you expect to be triggered by previous seismicity?”
For the other parameters, we elicited values for their corresponding aftershock characteristics (see Table 1) for a reference M 6.0 mainshock and considering aftershocks of at least M 2.0, the commonly used PNW magnitude of completeness (Malone, 2019; Gomberg and Bodin, 2021; Schneider et al., 2023). We asked experts to consider aftershocks triggered directly from the reference mainshock, thus ignoring secondary triggering for simplicity. We derived equations to relate an aftershock characteristic to a corresponding ETAS parameter (see Prior Elicitation Equations section in the supplemental material). Note that we use a single reference mainshock to illustrate our method and that using multiple reference mainshocks with different magnitudes is recommended. We elicited values separately for the two dominant tectonic regimes since instrumental monitoring began (intraslab and crustal).
Step 2: Eliciting priors for quantities of interest
We used the bisection method to elicit priors for each aftershock characteristic (Oakley, 2010). We began by describing to each expert the ETAS parameters and their corresponding aftershock characteristics.
We then asked each expert to specify the probable lower and upper limits of the range, l and u, respectively, by asking “what values could be ruled out and what values could not?” to avoid anchoring on a particular value (Garthwaite et al., 2005). The expert was then asked to provide a number such that there is a 50/50 chance of the parameter being above or below that value. They were then asked to split the interval from l to by another value , such that there is an equal chance of the parameter being in [l, ) and in [, ). Analogous questions yielded the intervals [, ) and [, u]. We refer to the set for a given aftershock characteristic as its five‐number summary, from which probability distributions can be fit.
We attempted to follow this elicitation procedure with both experts. One expert provided five‐number summaries for several aftershock characteristics, though could only provide lower and upper limits for aftershock durations and zones. The expert expressed that crustal and intraslab mainshocks are not different in their aftershock durations and characteristic time before aftershock decay begins, and provided values that held for both regimes. For the other characteristics, they provided values specific to each tectonic regime (see Table 2).
The other expert could not provide PNW‐specific values due to the uncertainty and insufficient data on aftershock patterns in the region, even in light of recent studies (Gomberg and Bodin, 2021). They expressed that their prior beliefs on the aftershock parameters in the PNW would be based on plausible values, that is, from the ETAS literature for other subduction zones (see the Literature‐Based Priors for Subduction Zones section).
Given that only one expert gave PNW‐specific priors, we omit the aggregation of multiple priors and refer the reader to the literature reviewed in the Statistical methods to capture prior information section.
Step 3: Constructing priors by fitting probability distributions
We used the values elicited from the expert to specify priors with either the Gamma or Uniform distributions. For some quantities, the expert expressed which values were more or less probable, as well as different levels of uncertainty and skewness, which can be accommodated by the flexible two‐parameter Gamma distribution. For parameters where the expert could only provide lower and upper limits and had no beliefs about which values in between were more probable, we specified Uniform priors (note that other distributions could be specified, e.g., a Gaussian distribution with tails set to the upper/lower limits to allow for improbable values outside the limits).
We constructed Gamma priors for c, d, and directly from the elicited five‐number summary using an online tool for fitting distributions (Morris et al., 2014). For , we first converted the elicited background proportions into temporal background rates by multiplying by the number of PNW catalog earthquakes with M 2.0+ in that regime (using the catalog of Schneider et al., 2023) and dividing by the catalog’s period (17,897 days).
We constructed Gamma priors for the K parameter using the elicited five‐number summary for aftershock counts and a grid‐search method. We first fixed to prototypical values: for crustal events and for intraslab events, which corresponds to self‐similar and less explosive aftershock triggering, respectively (see the Prior Elicitation Equations section in the supplemental material), as found by Gomberg and Bodin (2021). We calculated the expected aftershock counts under the ETAS productivity law over a sequence of K values and found which K values resulted in the closest match to the expert’s five number summary for aftershock counts. This yielded a five‐number summary for the K parameter, to which we fit a Gamma prior, to reflect the expert’s belief about K’s skew.
We constructed priors for the parameter using a similar grid‐search approach. We used the productivity law to compute the expected productivity using minimum/maximum K prior values and a sequence of for the reference M 6.0 mainshock. We found the minimum/maximum values corresponding to the minimum/maximum elicited aftershock counts. We took the extrema of these values as limits of a Uniform prior because the expert could not provide any prior beliefs on magnitude scaling of productivity in the PNW.
For the p temporal decay parameter, we constructed Uniform priors because the expert could only provide a range of values on aftershock duration. We first computed hypothetical aftershock durations using equation 6 in the supplemental material, taking realistic values for the other parameters: we took K and to yield the expected number of aftershocks for a reference M 6.0 mainshock given in Gomberg and Bodin (2021) (fixing for simplicity K = 0.003 and for a crustal mainshock and K = 0.05 and for an intraslab mainshock) and minimum/maximum c and prior values. For each parameter combination, we computed aftershock durations with a sequence of p values and found which p values resulted in the closest match to the elicited aftershock durations. The extrema of these values were taken as limits of a Uniform prior on p.
We constructed Uniform priors for q in a similar way. We used equation 7 in the supplemental material to compute the expected area of an aftershock zone over a sequence of q values and realistic values for other parameters (combinations of minimum/maximum d values and with K, , and values as described in the preceding paragraph and in the supplemental material). We found the q values that resulted in the closest match to the elicited aftershock zones and took their extrema as limits of a Uniform prior on q.
Step 4: Feedback from experts
We sent the corresponding priors for the aftershock characteristics to the expert for feedback, who approved of the priors’ range of values and distributional shape. These priors are given in Figure 1.
Literature‐Based Priors for Subduction Zones
The other expert expressed that their prior beliefs about PNW aftershock patterns reflected what was known about other subduction zones and should be based on ETAS models published for those. To formulate such priors, we surveyed the literature on ETAS models for subduction zones around the globe. We considered English language studies published before 2021 that fit ETAS models to full catalogs for subduction zone regions, as studies on specific (stacked) sequences require catalog windowing that can bias ETAS parameter estimates (Seif et al., 2017).
We searched the following terms in Google Scholar and Web of Science in October 2021: [ETAS + “X”] or [Omori + “X”], in which X is a subduction zone country or region given in Table 3. We included larger regions (e.g., South America) to capture studies for their subregions that may not be in our search terms.
This search resulted in eight studies: three for Japan (Ogata, 2011; Zhuang, 2011, 2015), two for New Zealand (Console et al., 2010; Harte, 2013) and one each for Chile (Nicolis et al., 2015), south‐central Alaska (Wei et al., 2012), and the Mexican state of Guerrero (Yazdi et al., 2019). Ogata (2011) was removed because we could not extract values from their result plots due to scale label overplotting. The remaining seven studies and the catalogs used are summarized in the supplemental material (Tables S1 and S2). Nearly all studies used the spatiotemporal ETAS model with a magnitude‐dependent spatial kernel for the triggered intensity, though two studies had slightly different parameterizations. We collected the reported parameter estimates (MLEs) and uncertainties (when provided) from each study, giving lower/upper parameter estimates in the Table S2.
We specified subduction zone priors by taking Uniform distributions spanning the minimum lower bound and maximum upper bound for each parameter’s estimates across all studies. This reflected the expert’s belief that no parameter values were more or less probable across subduction regions or between crustal/intraslab regimes for the PNW (note that unlike the PNW, other regions may have many slab interface earthquakes with dissimilar aftershock patterns). These priors are shown in Figure 1.
Discussions
Fitting parametric models to sparse earthquake catalogs poses challenges that may lead to biased or uncertain parameter estimates. But experts’ geoscientific knowledge can inform beliefs on these parameters for a seismic region without necessarily appealing to its catalog. Using statistical methods, one can elicit beliefs and construct prior distributions for seismological parameters. We illustrated such a method by eliciting prior distributions for ETAS parameters for the PNW, considering separately two of its major tectonic environments. We compared these to priors constructed from regional ETAS models for other subduction zones.
The PNW‐specific and subduction‐zone‐wide priors differed across all parameters and priors also varied between intraslab and crustal regimes in the PNW, in ways that would substantially affect aftershock forecasting. The parameters related to aftershock productivity (K and ) take lower values for the intraslab than crustal regimes, corroborating observations that crustal mainshocks are more aftershock‐productive than intraslab mainshocks (Bilek and Lay, 2018; Bostock et al., 2019). The much wider subduction‐zone‐wide priors for K may be due to the PNW having unusually low productivities compared to other subduction zones (Gomberg and Bodin, 2021), though the oft‐reported variability across subduction zones may also explain this wide prior (Page et al., 2016; Zhang et al., 2020). Drawing K parameters from the subduction‐zone‐wide priors could result in aftershock forecasts orders of magnitude higher than the PNW‐specific priors.
The decay parameters also suggest differences between the PNW and other subduction zones. The p parameter range spans lower values for the PNW‐specific distributions than the literature‐sourced global distributions, taking even lower p values than estimated in Page et al. (2016) and Tsapanos (1995). This suggests that even with smaller aftershock counts, the PNW may have aftershock sequences that last longer than similar sequences in other subduction zones. Which priors p is drawn from will thus substantially affect the resulting aftershock forecasts. The q parameter range for the PNW is much broader (especially for the intraslab regime) than for the global subduction zones, even extending into nonphysical values. This suggests a substantial uncertainty in how large aftershock zones may get for PNW mainshocks. Indeed, aftershock decay parameters are notoriously difficult to estimate with low aftershock counts (Seif et al., 2017; Harte, 2018).
Our application to the PNW serves as an illustration of our prior elicitation method, which is generally applicable to other parametric models. This example is limited by its use of just two experts, and prior elicitation for seismological parameters should aggregate priors across more experts (Albert et al., 2012). We also considered a single reference M 6.0 mainshock when eliciting beliefs, and this should be expanded to reference mainshocks of different magnitudes. Furthermore, our literature review indicates a need for more (sub)regional ETAS models for subduction zones like the PNW, given its high seismic risk. To address this, we propose that a Bayesian ETAS model for the PNW using both sets of priors continue to be pursued, as has been investigated by Schneider (2021) and forthcoming publications.
Data and Resources
All data used in this article came from published sources listed in the references. The supplemental material includes additional information about the ETAS model and equations derived for prior elicitation, and tables describing the literature review and corresponding parameters drawn from it.
Declaration of Competing Interests
The authors acknowledge that there are no conflicts of interest recorded.
Acknowledgments
This research was supported by the U.S. Geological Survey under Grant Number G20AP00061. The authors gratefully acknowledge Joan Gomberg, Paul Bodin, and Jeanne Hardebeck and two anonymous reviewers for reviews that improved this article. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government. This work was largely carried out at the University of Washington, Department of Statistics.