The clearest statistical signal in aftershock locations is that most aftershocks occur close to their mainshocks. More precisely, aftershocks are triggered at distances following a power‐law decay in distance (Felzer and Brodsky, 2006). This distance decay kernel is used in epidemic‐type aftershock sequence (ETAS) modeling and is typically assumed to be isotropic, even though individual sequences show more clustered aftershock occurrence. The assumption of spatially isotropic triggering kernels can impact the estimation of ETAS parameters themselves, such as biasing the magnitude‐productivity term, alpha, and assigning too much weight to secondary rather than primary (direct) triggering. Here we show that aftershock locations in southern California, at all mainshock–aftershock distances, preferentially occur in the areas of previous seismicity. For a given sequence, the scaling between aftershock rates and the previous seismicity rate is approximately linear. However, the total number of aftershocks observed for a given sequence is independent of background rate. We explain both of these observations within the framework of rate‐and‐state friction (Dieterich, 1994).

Although individual earthquakes are not predictable, the statistical properties of aftershock occurrence in terms of location and timing are quite predictable. Aftershock rates decay with time from the mainshock (Omori, 1895; Utsu, 1957) and with space (Felzer and Brodsky, 2006) following inverse power‐laws. The clustering evident in these kernels, whereby most aftershocks occur close in space and time to their triggering mainshock, is further intensified by secondary triggering in which aftershocks go on to trigger their own aftershocks. This cascading process of earthquake triggering is what epidemic‐type aftershock sequence (ETAS) models attempt to capture (Ogata, 1988, 1998).

During a productive aftershock sequence, earthquake probabilities can temporarily increase above background levels by orders of magnitude (Field and Milner, 2018). This large probability gain, combined with the potential for large aftershocks to cause significant damage, has spurred research regarding where aftershocks are most likely to occur. Because of the power‐law decay in distance discussed earlier, the first‐order result is, of course, that aftershocks are most likely to occur near to the mainshock. In addition, earthquake triggering implies that early aftershock clusters reveal areas where further aftershocks are more likely (e.g., Wiemer, 2000). Conversely, areas with high mainshock slip are found to be deficient in aftershocks (Rubin, 2002; Wetzler et al., 2018), and, in particular, large aftershocks occur farther away from the mainshock rupture than smaller aftershocks (van der Elst and Shaw, 2015).

Including physical information, namely static Coulomb stresses, also has potential to improve aftershock location forecasts. Whereas earlier tests of Coulomb models found that statistical forecasts such as ETAS were superior in terms of predictive performance (Woessner et al., 2011; Segou et al., 2013), recently more complex Coulomb models have demonstrated comparable results to ETAS, and hybrid Coulomb‐ETAS forecasts have been shown to be more informative than ETAS alone (Cattania et al., 2018). Similarly, Hardebeck (2021) uses a series of synthetic catalogs with both clustering and physical triggering components to demonstrate that ETAS models would be expected to outperform Coulomb forecasts that neglect secondary triggering; this underscores the importance of identifying the variable background conditions that lead to heterogeneities in aftershock density, even in light of physical mechanisms such as static stress changes.

Indeed, a number of physical parameters have been already related to aftershock productivity. For example, regions of high heat flow have higher aftershock productivity and are characterized by more swarm‐like behavior (Enescu et al., 2009). Material contrasts can cause asymmetries in the number of aftershocks on different sides of a fault (Rubin and Gillard, 2000). Aftershocks are also observed to correlate with fault locations, such as a fault‐normal bias (more aftershocks occur in the direction of a nearby fault) and a strike‐parallel elongation (Powers and Jordan, 2007). In this work, we demonstrate that relative aftershock density can be predicted by background earthquakes occurring prior to a mainshock–aftershock sequence, in spite of total sequence productivity being independent of the background rate.

We identify isolated M 3–5 mainshocks in the center of the southern California network from 2008 to 2017 from the QTM catalog developed by Ross et al. (2019), shown in Figure 1a. These mainshocks have no larger events within 50 km for 90 days prior to and 1 hr following their occurrence. Mainshocks outside the central, the most complete part of the network, as defined by the probabilistic magnitude of completeness (Schorlemmer and Woessner, 2008) PMC = 1.8 boundary of Tormann et al. (2010) are also excluded, leaving a total of 342 mainshocks for our analysis.

We collect M ≥1 seismicity within 50 km of each mainshock for the preceding 90 days. This prior seismicity is then divided into distance‐azimuthal bins, relative to the mainshock location, and sorted for each sequence, with the highest‐rate azimuthal bins plotted at the top for each distance annulus, as shown in Figure 1b. These sorted distance‐azimuthal bin counts are stacked for all mainshocks. Seismicity within the first hour following the mainshocks is also stacked into distance‐azimuthal bins, but, importantly, using the mapping derived from the background earthquakes. That is, aftershocks for a given mainshock at each radial distance are sorted according to the level of background seismicity for each azimuthal bin. Stacked results for all mainshocks are shown in Figure 2.

In each column of the aftershock plot (Fig. 2b), the most active azimuths tend to be near the top of the plot. Because each column is sorted by activity rate prior to the mainshocks, this shows that the same areas that have high background seismicity rates also have high aftershock rates. This is true for both near‐ and far‐field aftershocks.

With this type of analysis, care must be taken to ensure that the aftershock signal is not contaminated by background earthquakes. However, the color scale in Figure 2 is logarithmic, and the reader can see that the aftershock rates are over an order of magnitude higher than background rates for productive azimuths at all the distances from the mainshock.

Figure 2c compares the prior seismicity and aftershock fraction, stacked over all mainshock–aftershock distances, versus azimuth, with the azimuths again sorted by background rate. Again, we can clearly see the elevated aftershock rates for azimuths with high background rates; however, aftershock density is not proportional to background rate, as predicted by rate‐and‐state friction (Dieterich, 1994). In this figure, approximately 40% of aftershocks occur proportional to background seismicity rates, while the remaining 60% occur at random azimuths. This is due to resolution of the method, which cannot perfectly sort all azimuthal‐distance bins, given that, for most mainshocks, many of these bins have zero background earthquakes. We can see that the relationship approaches linear scaling between background and aftershock density when we focus on the best‐resolved parts of the signal in the near field (Fig. 3), particularly when we make the azimuthal bins larger, which while it decreases spatial resolution, results in more accurate sorting.

One might worry that this observed signal is due to a lower magnitude of completeness in the areas of past seismicity due to a higher density of template events (e.g., Herrmann and Marzocchi, 2020). For this reason, we also test for the signal using a conventional catalog with a more conservative magnitude of completeness of 1.8, shown in Figure S1, available in the supplemental material to this article. Although there are fewer aftershocks detected with this catalog and parameters, the signal is still present and quite similar.

Given that more aftershocks occur in areas with previous seismicity, one might expect that mainshock sequences occurring in areas with a high background seismicity rate would therefore have higher aftershock productivity overall. Surprisingly, this is not what we observe.

For our analysis of total sequence aftershock productivity relative to background rate, we estimate ETAS Ogata (1998) parameters using the Hauksson et al. (2012) catalog. The QTM catalog used in the previous analysis produces a richer earthquake time series using template matching to identify smaller earthquakes in the vicinity of traditionally cataloged earthquakes. Although this results in a more detailed record in time, it introduces temporal correlations and irregularities in the catalog that prevent application of the ETAS model, and we were unable to obtain a stable ETAS solution using the QTM catalog. The Hauksson et al. (2012) catalog is relocated with more conventional double‐difference techniques, and while it contains fewer earthquakes than the QTM catalog, it retains the statistical uniformity of the traditional catalog, making it amenable to analysis with ETAS.

We show estimates of background rate and total aftershock productivity in Figure 4, which are derived assuming the following seismicity rate as a function of time t, as described by the ETAS model of Ogata (1998):

Here c and p are the Omori time‐decay terms; k, α, and Mref are aftershock productivity terms; and μ is a time‐independent background rate of spontaneous earthquakes. The summation is over previous earthquakes with magnitudes Mi occurring at times ti.

We first invert the entire catalog of Hauksson et al. (2012) (using all earthquakes down to a minimum magnitude of Mref=3) for the temporal parameters, p, c, and whole‐catalog estimates of k and α. We then invert subsets of the catalog on a 0.3° grid, using a 0.1° step size, keeping the temporal parameters and α fixed, but allowing μ and k to vary from their initial values. Using such a fine grid means that each bin may contain only a fraction of the aftershock zone of larger mainshocks within the bin, as well as part of the aftershock zone of mainshocks outside the bin. The inversion therefore considers all mainshocks within ± 1 degree and uses a spatial kernel to weight their contribution by the fraction of their aftershocks expected within the bin. This method is therefore an aftershock‐centered approach rather than a mainshock‐centered approach, although the triggering contribution within each site is usually dominated by mainshocks within that site.

The aftershock zone fraction within the bin is estimated using a power‐law kernel around each mainshock (van der Elst and Shaw, 2015). We use a spatial density of the form
which includes an inner length scale that is proportional to the mainshock rupture length. The free parameters here are a magnitude‐scaling parameter d, a distance‐decay exponent γ, and a reference magnitude M0, for which the inner length scale equals 1 km. In this work, the parameters are fixed to constant values from a previous study using the same catalog (d = 0.5, γ=3, and M0=4.56) (van der Elst and Shaw, 2015).

The correlation between our estimated values of μ and k is near zero (−0.01), although structure is visible (see Fig. 4c). For example, areas with the highest background rates rarely have low aftershock productivity. However, the areas with moderately low background rates (μ103) can have, surprisingly, either low or high aftershock productivities.

We have shown that intrasequence aftershock productivity in southern California is predicted by the background rate; that is, within a single aftershock sequence we see aftershock locations favoring previous seismicity locations with near linear scaling in rate. Nevertheless, as shown in the Overall Sequence Productivity section, no such scaling or correlation exists for total intersequence aftershock productivity. This is a bit of a puzzle. How is total aftershock productivity over an entire sequence not affected strongly by background rate if the number of aftershocks that do occur tend to occur in locations with high background rate?

In fact, it suggests some sort of counterintuitive coordination between aftershocks occurring in different areas, whereby aftershocks occur in areas where background rates are relatively higher but somehow conspire overall so that mainshocks surrounded by higher background areas do not have more aftershocks in total than mainshocks occurring in quiet areas.

Certainly total aftershock variability is affected by mainshock magnitude error and stress drop variability, which add to the scatter in Figure 4c. However, this scatter does not hide a linear scaling, and even the quietest areas premainshock can have the highest aftershock productivities.

We can understand both of these seemingly disparate observations by considering aftershock triggering within the empirical rate‐and‐state friction framework that describes frictional evolution in the lab (Dieterich, 1994). According to the earthquake constitutive law defined by Dieterich (1994), a region producing a background rate r in response to a background stressing rate τ˙r will respond to an instantaneous stress step τ with rate R=reτ/aσ. Over intermediate times, the rate goes as 1/t, and at longer times the total number of triggered earthquakes converges to N=(r/τ˙r)τ (Helmstetter and Shaw, 2009; Marsan and Helmstetter, 2017). The scaling of the instantaneous rate is clearly proportional to the background rate r, but the scaling of N in the final equation is somewhat less obvious.

The term r/τ˙r can be thought of as a productivity parameter controlling the number of earthquakes produced by a given stress change in the Dieterich (1994) model. Conceptually, this could represent the spatial density of faults capable of hosting aftershocks within the affected region.

Over small spatial scales, such as the scale of a single aftershock sequence, the background stressing rate τ˙r varies slowly enough that variability in background rate r can be mapped to variability in fault density or availability (Powers and Jordan, 2010). In this case, the number of triggered aftershocks will also be sensitive to the local fault density and hence scale with r. At all times, we expect more background earthquakes and aftershocks to occur in areas with high fault density.

On a more regional scale, we might expect some averaging of the fault density toward some steady state, such that regional variations in background rate are better mapped to regional variations in loading rate and rτ˙r. At this scale, the total number of aftershocks N should scale only with τ.

This framework also explains the considerable scatter seen in Figure 4c. Highly active regions are likely to have both high local fault density and high background loading rate. We should not expect to see earthquakes with very quiet aftershock sequences in areas of high background rate. On the other hand, a low background rate may only imply a low background stressing rate, regardless of the local fault density, allowing for productivities to range from very low to just as high as they are anywhere else.

Put another way, the local background rate is a poor proxy for the fraction of the volume surrounding a given mainshock that has seismogenic potential, and therefore background rate does not constrain the total productivity of an aftershock sequence. Previous seismicity does, however, illuminate which fraction of that volume is likely to have aftershocks. Thus detailed mapping of the background rate r can improve forecasting efforts by telling us where the aftershocks that do occur are likely to be.

Our results offer a simple way to improve aftershock forecasts by taking past seismicity locations into account. In particular, the Uniform California Earthquake Rupture Forecast (UCERF) model in California (Field et al., 2014), which is currently being updated, could make use of this result. A smoothed seismicity model is already calculated in current UCERF methodology and used for the ETAS spontaneous events (Field et al., 2017). Using this kernel to inform aftershock locations, assuming the linear scaling suggested by Figure 3 (while keeping total productivity independent of background) is a simple way to improve the forecasts. This would also make aftershocks, and therefore seismicity as a whole in the model, more clustered, which would result in UCERF catalogs that look more like the real earthquake catalog in California (Page and van der Elst, 2018).

In addition to improving forward ETAS modeling, these results could potentially impact ETAS parameter estimation as well. Many aftershocks are concentrated in areas of high background seismicity. If the ETAS parameterization assumes isotropic aftershock triggering, the only way for the model to match the observed spatial clustering of aftershocks is to place a higher proportion of the aftershocks in secondary sequences. This can lead to an underestimation of the alpha parameter in equation (1) (Helmstetter et al., 2005). This underestimation could be remedied by including background earthquake locations in the spatial ETAS kernel, allowing the model to match more of the observed clustering with primary aftershocks.

The QTM catalog of Ross et al. (2019) is available at (last accessed August 2020). The relocated catalog of Hauksson et al. (2012) is available at∼hauksson/catalogs/index.html (last accessed September 2021). An additional figure repeating the analysis of Figure 2 but using the conventional U.S. Geological Survey (USGS) ComCat catalog (, last accessed September 2021) and a more conservative magnitude of completeness is given in the supplemental material.

The authors acknowledge that there are no conflicts of interest recorded.

The authors thank Jeanne Hardebeck, who by sharing her manuscript draft helped us to realize that scaling between aftershock locations and background was not sublinear but probably linear. The authors thank Elizabeth Cochran for helpful discussions, Andrea Llenos for a thoughtful internal review, as well as the journal reviewers and editors for their comments.

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Supplementary data