Abstract
P‐arrival backazimuth estimates can be crucial in locating poorly constrained seismic events. Correlating short windows of the vertical waveform with corresponding windows of the radial rotation for different backazimuths can provide estimates, but these are often uncertain and biased due to skewness in the Z–R correlation functions. Assessing how well cosine curves centered on different backazimuths match the Z–R correlation functions provides more reliable estimates that depend less upon the time‐window used. Stacking best‐fit‐cosine curves from neighboring three‐component stations improves stability further in a form of array‐processing that does not require coherence between the waveforms themselves. We demonstrate for recordings of North Korean nuclear tests at the Pilbara 3C array in Australia that the biases in the Z–R correlation functions vary greatly between adjacent stations. This bias is reduced both by the cosine curve fitting and stacking operations. We advocate obtaining backazimuth estimates for all P arrivals at three‐component stations globally. This could improve phase association and event location, identify sensor orientation problems, and provide baseline backazimuth corrections and uncertainty estimates. We propose two benchmark datasets for developing, documenting, and comparing backazimuth estimation algorithms and codes. All the data and code used to generate the results presented here are open.
Introduction
The direction of arrival of a seismic signal can provide important information about the source location. Usually, arrival times alone are sufficient for accurate event location; direction estimates typically play a secondary role. For small events with few constraints, direction estimates can be crucial. This is especially true if a complicated source‐time function makes arrival picking difficult. For example, polarization analysis of three‐component seismic data was crucial for locating the Nordstream gas pipeline explosions in September 2022 (Staehler et al., 2022). For the ultimately sparse network, a single station on the surface of Mars, Zenhäusern et al. (2022) demonstrate how direction estimates using polarization analysis can enable the location of low‐frequency seismic events. Direction estimates are particularly important when the azimuthal gap is large (e.g., Kværna et al., 2023). Continuous polarization analysis for three‐component particle motion is not new (e.g., Vidale, 1986) and is readily extended to three‐component array data for signal detection and characterization (Wagner and Owens, 1996, 2002).
Seismic arrays form the monitoring basis of forensic seismology. Confident detection, location, and identification of explosions necessitate monitoring at great distances and locations using very few global observations (e.g., Douglas, 2002). Delay‐and‐stack beamforming on seismic arrays serves two purposes: (1) improving signal‐to‐noise ratio (SNR) through noise suppression, and (2) calculating the apparent velocity (or, equivalently, slowness) and direction (or backazimuth). The backazimuth is usually more important than the slowness of the event location. An array can usually classify the full wavetrain, including S phases, based on the phase velocity. Polarization analysis on three‐component data is more limited, especially regarding backazimuth estimation for S phases (e.g., Jurkevics, 1988). We limit our scope here to P arrivals: local, regional, and teleseismic. Kvaerna and Ringdal (1992) compared backazimuth estimation capabilities on a single three‐component station, a microarray (four sites), and a complete 25‐element array. The full array gave the best performance, but they found that P‐wave backazimuth estimates with the three‐component station were comparable to the microarray estimates.
There are two important reasons to invest significant effort in improving P‐wave backazimuth estimation using three‐component stations. The first is the relative ubiquity of three‐component stations. (Arrays are comparatively few, far between, and absent from many regions of the globe.) A small event requiring close analysis is far more likely to happen close to three‐component stations than a seismic array. The second is the fact that array processing is often limited by signal incoherence (e.g., Gibbons, 2014). Seismic arrays were traditionally vertical component only, designed to detect and classify teleseismic arrivals with optimal SNR on the vertical components. Some key seismic arrays have now been upgraded with additional three‐component sites, primarily for improved S‐phase detection and characterization. On the SPITS array, the SNR for S phases improved dramatically once beamforming on transverse components was possible (Gibbons et al., 2011). A later upgrade to the ARCES array resulted in only moderate SNR gains for S phases but a spectacular improvement in signal coherence (Gibbons et al., 2019). Here, we explore the potential for P‐wave backazimuth estimation on three‐component arrays without exploiting coherence between sites.
An arriving P‐wave projects onto both vertical and horizontal components, and the radial component at the backazimuth of arrival should most resemble the vertical component. In practice, the direction in which the radial rotation best resembles the vertical component can differ significantly from the true direction of arrival (e.g., Zenhäusern et al., 2022). The Z–R correlation curve (that maps how well the vertical component matches the radial for different backazimuths) often has a peak that is either skewed, biased, or both. The shape of the curve may suggest a different and more plausible direction than the location of its maximum. In the Formulation section, we visualize the issues and formalize a procedure for robust P‐wave direction estimation, both for single and multiple three‐component stations. In the Stability of Direction Estimates for Regional P Waves from Repeating Events section, we apply the approach to estimating the directions for repeating seismic events with ground‐truth locations on six different stations. In the Application to Three‐Component Seismic Arrays section, we extend the procedure to three‐component arrays and apply it to signals in Australia from underground nuclear tests in North Korea. Finally, we summarize observations from the case studies and make recommendations for enhancing P‐wave direction estimation both on single three‐component stations and three‐component arrays.
Formulation
We seek the backazimuth of P waves at single three‐component stations or three‐component arrays with sites close enough for (a) the signals to arrive at all sites within 1–2 s and (b) the receiver‐to‐source backazimuth for all sites to be identical to within the uncertainty of measurement. Regarding point (a), a teleseismic P wave covers the ground with an apparent velocity over 10 km/s, such that an array aperture of 20 km or less would see all signals arrive within two seconds or less. A regional or local P wave can cover the ground with apparent velocities down to 5 km/s, such that an array aperture of 10 km or less would suffice. Point (b) will be satisfied (for all but very local events) if point (a) is satisfied. So, we will assume that all the work presented here will hold for arrays of 10 km or less, and will likely work well at significantly larger apertures (with potential degradation in performance due to time‐delays as the aperture increases).
All operations are performed by the code m3csdirest (see Data and Resources) but summarized here for completeness. Figure 1 displays (in panels c and d) and (in panels e and f) for selected times t, close to the P arrival for two regional signals on single three‐component stations.
The first signals (Fig. 1a,c,e) are from an earthquake near the northern tip of Novaya Zemlya (Gibbons et al., 2016) at station IU.KEV (distance 1395 km). The high‐frequency far‐regional wavetrain has Pn and Sn arrivals followed by lengthy codas; Sn arrives before the Pn coda has subsided. The Pn arrival at 22:51:21.90 UTC is emergent, rising out of strong background noise. The highest values of come in windows starting a second or two after the analyst‐picked arrival time, as the signal starts to exceed the noise. We see in panel (c) that the curve maximum falls encouragingly close to the geographical backazimuth (42.3°) and is consistent for successive time windows. Panel (e) displays the curves for the functions in panel (c). These curves are less flattened, and the optimal azimuths are a little closer.
The second signals (Fig. 1b,d,f) are from the sixth and, so far, the largest underground nuclear test in North Korea, recorded in Japan at station IU.MAJO (distance 948 km). Here, the Pn coda diminishes more rapidly, and there is no clear Sn arrival. The SNR is greater than for the signal in panel (a). Panel (d) displays at the indicated times surrounding the Pn arrival. Despite the higher SNR, the Z–R correlation values are generally lower than for the Kevo signal, and several of the curves are skewed. The curves and optimal angles change from one window to the next. There is a large spread (over 40°) in the optimal backazimuths for the different , and it is clear that estimates based on Z–R coherence will be sensitive to the time‐window specification. Considering instead the curves, panel (f) reveals far more symmetric curves with optimal backazimuths that are far more consistent from window to window.
Figure 1 suggests that the best‐cosine‐fit curves may serve two purposes in helping us estimate the backazimuth: first, by making the curve far more symmetrical than the curves and, second, by making the estimate more stable from one‐time window to the next. In the subsequent sections, we explore the generality and applicability of these properties to different sets of signals. None of the subsequent figures will show explicitly the curves of and as displayed in Figure 1, but it should be understood that every subsequent backazimuth estimate is determined this way.
Stability of Direction Estimates for Regional P Waves from Repeating Events
and are continuous and periodic with respect to bazi. Their usefulness depends on their reliability as indicators of direction of arrival and how repeatable they are for subsequent P arrivals from the same direction. Gibbons et al. (2020) present a database of waveforms for 55 surface explosions in northern Finland. Waveforms for most of the explosions are provided for six stations at distances between 59 and 209 km. They are useful signals for us having exactly one P arrival per segment, generated by an event with a known location. Each data segment starts approximately 30 s before the origin time, such that the arrival of each event at a given station will come after around the same number of seconds along each trace.
Figure 2 displays backazimuth estimates using both (panel a) and (panel b) for each three‐component signal in the dataset. The estimates are made for a single, but identical, time for each event and station combination, as indicated. The performance varies from station to station, but, common for all stations, the repeatability from one event to the next is better for than for . This should come as no surprise, given the observations in Figure 1. We stress that the estimates in Figure 2 come from a completely automated process; the same time window was chosen regardless of noise conditions or interfering signals. An inaccurate value may simply be the result of unrelated signals or noise.
At ARE0, the optimal backazimuth for averages about 20° less than predicted with a large spread. The optimal backazimuth for has both a smaller bias and a smaller standard deviation. A handful of events show very anomalous estimates for both and that are likely the result of unrelated mining events from another direction. An even more impressive improvement from to is seen for KEV. The standard deviations for the KEV estimates (Fig. 2b) are the lowest in the whole dataset, and we can have confidence that the bias is real. Station LP61 shows by far the greatest variability, but even here, the optimal backazimuth shows lower variability than the optimal backazimuth. At SGF, the optimal backazimuth has greater bias but a lower standard deviation than the optimal backazimuth. For all stations in the Gibbons et al. (2020) dataset, both and provide qualitatively correct backazimuth estimates. However, only the optimal backazimuths show satisfactory standard deviations. In several cases, the standard deviation is significantly smaller than the apparent bias.
Application to Three‐Component Seismic Arrays
Much of our motivation is improved estimates on seismic arrays, especially in cases in which conventional processing fails due to signal incoherence. Most small‐to‐medium aperture arrays are vertical only (or with only very few three‐component sites). One fully three‐component array is PSAR (Pilbara) in Western Australia (Kennett et al., 2015), deployed primarily to enhance tsunami warning. Its location and geometry are displayed in Figure 3a,b. It is a so‐called spiral‐arm array with sensors in concentric rings. The central element and innermost two rings form a 7‐element array with an aperture around 4 km, designed for signal coherence up to 3–4 Hz. With the outer two rings, PSAR extends to ∼15 km to become a 10‐ or 13‐element array with enhanced slowness resolution for lower‐frequency teleseismic signals. The spacing and geometry were carefully chosen to optimize the array response function (Kennett et al., 2015).
PSAR recorded the nuclear tests in North Korea between 2013 and 2017. Knowing the event locations almost exactly, we can assess the accuracy and repeatability of direction estimates. The geographical backazimuth from the array center is 7.8°, although we remember that structure along the path may result in significant bias; the repeatability from event to event is most important. The toolbox of methods for direction estimation on arrays is vast (see Rost and Thomas, 2002, for an introduction). For full array processing, we apply a cross‐correlation‐based method between the vertical component signals at the different sites. We chose this method, developed at the University of Alaska, for infrasound processing, because the code is open source and generates pedagogical displays of how key parameters evolve with time (see Data and Resources, for source and references).
Figure 3 shows the PSAR waveforms for the nuclear tests as indicated, together with measures of signal coherence, apparent velocity, and backazimuth. Only the vertical waveforms are used for these estimates. Given the limited time‐bandwidth product, the coherence measure (the median cross‐correlation maximum [MdCCM]; see Iezzi et al., 2022) is frequently high, and a threshold of 0.6 is needed to exclude the majority of the background values. Only around the P arrival is MdCCM significantly above this threshold. The largest explosion (Fig. 3f) is unique in maintaining elevated coherence metrics well into the P‐wave coda, likely due to the longer period energy generated. For the smaller explosions, MdCCM is no greater after the P arrivals than before. The P‐arrival trace velocities lie between 17.5 and 18.5 km/s for all events and the backazimuth estimates between 8.4° and 10.8°. We emphasize that these are “point estimates.” Were we to offset all windows slightly or use a different window length, we would likely obtain somewhat different values. The same would apply if we were to filter in different frequency ranges (e.g., Kværna and Doornbos, 1991; Gibbons et al., 2010). Iezzi et al. (2022) present a narrowband extension to the array processing, which may reveal a frequency dependence, but that would also force the user to make additional choices.
Figure 4 shows and for the January 2016 event, both for individual array sites and for the array stack; this is the first time we have visualized the functions over an extended time window. Focusing on the P‐arrival time, we see, for these seven innermost stations, great variation both in the Z–R correlation values and in the symmetry with respect to the dashed line (the great‐circle backazimuth). Given that the maximum station separation is under 4 km, we conclude that the basis for the asymmetry is very local to each site. For some sites, the behavior is similar from one time window to the next; for others, it changes significantly with time. In both the preceding noise and in the post P‐arrival coda, the patterns of Z–R coherence at different sites appear almost unrelated. Most importantly in Figure 4 are the results of stacking: a strong and symmetric peak at the P arrival and strong suppression both before and after. The asymmetrical components at the P‐wave onset, some biased below and others above, cancel under the stacking operation. This improvement on the stack makes this a genuine form of array processing.
Figure 5 resembles Figure 2 in displaying optimal backazimuths for and at different stations for different events. The biggest difference is the addition of the stack. A stacking is possible for the case in Figure 5, because the intersite distances are very small compared with the distance between source and receiver (i.e., the great‐circle backazimuth is similar for all sites). This is not the case for the stations in Figure 2. Again, the optimal backazimuths (Fig. 2b) are better constrained than the corresponding backazimuths (Fig. 2a). The spread of the (stack) estimates is an improvement on that for the individual channels but is not clearly an improvement upon the individual channel estimates. The (stack) backazimuth estimates, on the other hand, show greatly reduced variability compared with the single‐channel estimates and are almost comparable to the estimates from conventional array processing. The array processing results are still superior (as we would hope), because these exploit time delays between the arrivals at the different sites, ignored in the current procedure. The advantage of conventional array‐processing results will diminish in situations with poor signal coherence.
Conclusions
Accurate backazimuth estimates for P arrivals on three‐component stations would improve phase association, identify sensor orientation errors, and improve seismic event location, especially given sparse observations. One method is to correlate short windows of the vertical waveform with different radial rotations. The Z–R correlation should be optimal for the arrival of backazimuth but, in practice, is subject to high variability and bias. We demonstrate that more stable and less biased estimates are obtained by finding the backazimuth for which a cosine best matches the Z–R correlation. It is straightforward to stack these functions at multiple closely spaced three‐component sites to further improve the estimates. This is a form of array processing that does not demand signal coherence between sites and makes the method promising for situations in which coherent array processing is challenging. This would require fully three‐component arrays, which is rarely the case.
The Z–R correlation bias varies significantly between closely spaced sites and is likely determined locally. This explains the great improvement in estimates through stacking. Under noise conditions, the Z–R correlations at different sites are uncorrelated with each other and are suppressed under stacking, making P arrivals easier to identify. We demonstrate for four DPRK nuclear tests recorded at PSAR that the new estimates are almost as good as those from conventional array processing. We emphasize that a very basic tool for measuring waveform similarity has been applied: time‐domain cross‐correlation at 1 s intervals in the data. Improvements may be possible by averaging correlation estimates over multiple time windows. Other methods, such as phase cross‐correlation (Schimmel, 1999) or multitaper coherence (Prieto et al., 2009; Prieto, 2022), may provide more reliable estimates. Our goals here have been to demonstrate the improvement resulting from (a) matching the Z–R correlation to a symmetrical function and (b) stacking over multiple three‐component stations. The gains are likely to apply to any underlying method for assessing waveform similarity.
The procedures described are applicable to arbitrary three‐component data, and we advocate continuous calculation of these functions on 3C data globally. This would associate all P arrivals with backazimuth estimates, which would enhance event location and, given sufficient observations, facilitate the calibration of backazimuth corrections. (Park et al., 2023; recently demonstrated how backazimuth estimation on multiple three‐component stations could contribute to continuous state‐of‐health monitoring.) For events such as the Nordstream explosions, the backazimuth uncertainty as applied by Staehler et al. (2022) could likely be significantly tightened given adequate calibration. We have considered two case studies: (1) 55 surface explosions in northern Finland and (2) 4 nuclear tests in North Korea. Both the datasets are open, and all event locations are known. We advocate adopting these as benchmark scenarios for developing and evaluating improved algorithms for direction estimation. A comprehensive and systematic comparison between different methods for backazimuth estimation, with regards to both accuracy and repeatability, is overdue. These benchmarks may help in addressing this need.
Data and Resources
The array processing examples were calculated using a slightly modified version of the code available at https://github.com/uafgeotools courtesy of the University of Alaska, Fairbanks. The basis of the algorithms is described in Bishop et al. (2020) and Szuberla and Olson (2004). The only changes made were to the ranges of apparent velocities scanned. Plots were created using the GMT software (Wessel et al., 2019) and the ObsPy software (Krischer et al., 2015). The waveforms for Figure 2 were obtained from Gibbons et al. (2020). Data obtained from the AU (DOI: 10.26186/144675) and IU (DOI: 10.7914/SN/IU) networks via Earthscope and from the supplemental material in Gibbons et al. (2020). All calculations in this article used the m3csdirest program available on https://github.com/stevenjgibbons/m3csdirest. All websites were last accessed in October 2023.
Declaration on Competing Interests
The author is not aware of any competing interest.
Acknowledgments
The author thanks Anthony Lomax, Brian Stump, and an anonymous reviewer for thorough reviews of this article, which helped to improve it significantly.