P‐arrival backazimuth estimates can be crucial in locating poorly constrained seismic events. Correlating short windows of the vertical waveform with corresponding windows of the radial rotation for different backazimuths can provide estimates, but these are often uncertain and biased due to skewness in the Z–R correlation functions. Assessing how well cosine curves centered on different backazimuths match the Z–R correlation functions provides more reliable estimates that depend less upon the time‐window used. Stacking best‐fit‐cosine curves from neighboring three‐component stations improves stability further in a form of array‐processing that does not require coherence between the waveforms themselves. We demonstrate for recordings of North Korean nuclear tests at the Pilbara 3C array in Australia that the biases in the Z–R correlation functions vary greatly between adjacent stations. This bias is reduced both by the cosine curve fitting and stacking operations. We advocate obtaining backazimuth estimates for all P arrivals at three‐component stations globally. This could improve phase association and event location, identify sensor orientation problems, and provide baseline backazimuth corrections and uncertainty estimates. We propose two benchmark datasets for developing, documenting, and comparing backazimuth estimation algorithms and codes. All the data and code used to generate the results presented here are open.

The direction of arrival of a seismic signal can provide important information about the source location. Usually, arrival times alone are sufficient for accurate event location; direction estimates typically play a secondary role. For small events with few constraints, direction estimates can be crucial. This is especially true if a complicated source‐time function makes arrival picking difficult. For example, polarization analysis of three‐component seismic data was crucial for locating the Nordstream gas pipeline explosions in September 2022 (Staehler et al., 2022). For the ultimately sparse network, a single station on the surface of Mars, Zenhäusern et al. (2022) demonstrate how direction estimates using polarization analysis can enable the location of low‐frequency seismic events. Direction estimates are particularly important when the azimuthal gap is large (e.g., Kværna et al., 2023). Continuous polarization analysis for three‐component particle motion is not new (e.g., Vidale, 1986) and is readily extended to three‐component array data for signal detection and characterization (Wagner and Owens, 1996, 2002).

Seismic arrays form the monitoring basis of forensic seismology. Confident detection, location, and identification of explosions necessitate monitoring at great distances and locations using very few global observations (e.g., Douglas, 2002). Delay‐and‐stack beamforming on seismic arrays serves two purposes: (1) improving signal‐to‐noise ratio (SNR) through noise suppression, and (2) calculating the apparent velocity (or, equivalently, slowness) and direction (or backazimuth). The backazimuth is usually more important than the slowness of the event location. An array can usually classify the full wavetrain, including S phases, based on the phase velocity. Polarization analysis on three‐component data is more limited, especially regarding backazimuth estimation for S phases (e.g., Jurkevics, 1988). We limit our scope here to P arrivals: local, regional, and teleseismic. Kvaerna and Ringdal (1992) compared backazimuth estimation capabilities on a single three‐component station, a microarray (four sites), and a complete 25‐element array. The full array gave the best performance, but they found that P‐wave backazimuth estimates with the three‐component station were comparable to the microarray estimates.

There are two important reasons to invest significant effort in improving P‐wave backazimuth estimation using three‐component stations. The first is the relative ubiquity of three‐component stations. (Arrays are comparatively few, far between, and absent from many regions of the globe.) A small event requiring close analysis is far more likely to happen close to three‐component stations than a seismic array. The second is the fact that array processing is often limited by signal incoherence (e.g., Gibbons, 2014). Seismic arrays were traditionally vertical component only, designed to detect and classify teleseismic arrivals with optimal SNR on the vertical components. Some key seismic arrays have now been upgraded with additional three‐component sites, primarily for improved S‐phase detection and characterization. On the SPITS array, the SNR for S phases improved dramatically once beamforming on transverse components was possible (Gibbons et al., 2011). A later upgrade to the ARCES array resulted in only moderate SNR gains for S phases but a spectacular improvement in signal coherence (Gibbons et al., 2019). Here, we explore the potential for P‐wave backazimuth estimation on three‐component arrays without exploiting coherence between sites.

An arriving P‐wave projects onto both vertical and horizontal components, and the radial component at the backazimuth of arrival should most resemble the vertical component. In practice, the direction in which the radial rotation best resembles the vertical component can differ significantly from the true direction of arrival (e.g., Zenhäusern et al., 2022). The Z–R correlation curve (that maps how well the vertical component matches the radial for different backazimuths) often has a peak that is either skewed, biased, or both. The shape of the curve may suggest a different and more plausible direction than the location of its maximum. In the Formulation section, we visualize the issues and formalize a procedure for robust P‐wave direction estimation, both for single and multiple three‐component stations. In the Stability of Direction Estimates for Regional P Waves from Repeating Events section, we apply the approach to estimating the directions for repeating seismic events with ground‐truth locations on six different stations. In the Application to Three‐Component Seismic Arrays section, we extend the procedure to three‐component arrays and apply it to signals in Australia from underground nuclear tests in North Korea. Finally, we summarize observations from the case studies and make recommendations for enhancing P‐wave direction estimation both on single three‐component stations and three‐component arrays.

We seek the backazimuth of P waves at single three‐component stations or three‐component arrays with sites close enough for (a) the signals to arrive at all sites within 1–2 s and (b) the receiver‐to‐source backazimuth for all sites to be identical to within the uncertainty of measurement. Regarding point (a), a teleseismic P wave covers the ground with an apparent velocity over 10 km/s, such that an array aperture of 20 km or less would see all signals arrive within two seconds or less. A regional or local P wave can cover the ground with apparent velocities down to 5 km/s, such that an array aperture of 10 km or less would suffice. Point (b) will be satisfied (for all but very local events) if point (a) is satisfied. So, we will assume that all the work presented here will hold for arrays of 10 km or less, and will likely work well at significantly larger apertures (with potential degradation in performance due to time‐delays as the aperture increases).

For each site i, or a single three‐component station, we have three channels: yZi(t), y1i(t), and y2i(t), in which y1i(t) and y2i(t) are the horizontal channels with azimuth values azi1i and azi2i (this is to say the directions relative to north). These may be 0° (north) and 90° (east), but we cannot assume this, and we specify the azimuths from the station metadata explicitly for all channels. We first calculate the north and east components using
and
(and verify that the transformation is trivial if our channels are already oriented south–north and west–east).
We seek a 2D function, CZRi(t,bazi), measuring the similarity between NCC consecutive times samples, beginning at time t, of the vertical component yZi(t) and radial component with backazimuth bazi, yRi(t,bazi). CZRi is a fully normalized correlation coefficient between the two time series in the specified interval, with
We assume that all channels have NPTS samples in total. For regional and teleseismic signals, we anticipate windows with a length of a few seconds (say 3–5). We probably do not need to evaluate the cross correlation at every sample; one correlation per second (or 0.5 s) appears to suffice. If NSKIP (≥1) is the number of samples we move between each correlation in our sliding window analysis, then we evaluate NWIN = (NPTS − NCC + NSKIP)/NSKIP elements per backazimuth. We evaluate an even number of backazimuths to exploit the symmetry relation:
and only evaluate CZRi for bazi in the interval [0°, 180°), filling in the remaining values by symmetry. We specify an integer NAZBY2, the number of evenly spaced angles in the interval [0°, 180°), such that our matrix has NWIN elements in time multiplied by NAZ = 2 × NAZBY2 elements in direction.
Given NS three‐component stations in our array, we can stack these matrices with
and find the angle for which this continuous function (either for a single station, i, or the array stack) is the greatest. The stacking should cancel the contributions from noise to give more accurate estimates.
CZRi(t,bazi) is always in the range [−1,1] and, for a given t, has only one global maximum and one global minimum (at backazimuth bazi − 180). CZRi(t,bazi) is a weighted sum of trigonometrical functions and may be skewed, falling more steeply on one side of the maximum than the other. For any reference angle, θ, we define a cosine curve:
and calculate a correlation function:
in which C(θ) is the vector of NAZ values C(θ,bazij), CZRi(t) is the vector of NAZ values CZRi(t,bazij), and xcorr is defined by
for vectors X and Y. The function fi(t,θ) is a fully normalized measure of how well the curve CZRi(t,bazi) matches a pure cosine curve centered on the angle θ.
We define a new array BCF (best cosine fit) for station i with the same dimensions (NWIN by NAZ) and values:
in which [CZRi]max(t) is the maximum of the CZRit vector for time t. (We scale thus to ensure the amplitude of BCF still reflects the similarity between vertical and radial waveforms.)
We define a “best‐cosine‐fit stack”:

All operations are performed by the code m3csdirest (see Data and Resources) but summarized here for completeness. Figure 1 displays CZRi(t,bazi) (in panels c and d) and BCFi(t,bazi) (in panels e and f) for selected times t, close to the P arrival for two regional signals on single three‐component stations.

The first signals (Fig. 1a,c,e) are from an earthquake near the northern tip of Novaya Zemlya (Gibbons et al., 2016) at station IU.KEV (distance 1395 km). The high‐frequency far‐regional wavetrain has Pn and Sn arrivals followed by lengthy codas; Sn arrives before the Pn coda has subsided. The Pn arrival at 22:51:21.90 UTC is emergent, rising out of strong background noise. The highest values of CZRi(t,bazi) come in windows starting a second or two after the analyst‐picked arrival time, as the signal starts to exceed the noise. We see in panel (c) that the CZRi(t,bazi) curve maximum falls encouragingly close to the geographical backazimuth (42.3°) and is consistent for successive time windows. Panel (e) displays the BCFi(t,bazi) curves for the functions in panel (c). These curves are less flattened, and the optimal azimuths are a little closer.

The second signals (Fig. 1b,d,f) are from the sixth and, so far, the largest underground nuclear test in North Korea, recorded in Japan at station IU.MAJO (distance 948 km). Here, the Pn coda diminishes more rapidly, and there is no clear Sn arrival. The SNR is greater than for the signal in panel (a). Panel (d) displays CZRi(t,bazi) at the indicated times surrounding the Pn arrival. Despite the higher SNR, the Z–R correlation values are generally lower than for the Kevo signal, and several of the curves are skewed. The curves and optimal angles change from one window to the next. There is a large spread (over 40°) in the optimal backazimuths for the different CZRi, and it is clear that estimates based on Z–R coherence will be sensitive to the time‐window specification. Considering instead the BCFi(t,bazi) curves, panel (f) reveals far more symmetric curves with optimal backazimuths that are far more consistent from window to window.

Figure 1 suggests that the best‐cosine‐fit curves may serve two purposes in helping us estimate the backazimuth: first, by making the curve far more symmetrical than the CZRi(t,bazi) curves and, second, by making the estimate more stable from one‐time window to the next. In the subsequent sections, we explore the generality and applicability of these properties to different sets of signals. None of the subsequent figures will show explicitly the curves of CZRi(t,bazi) and BCFi(t,bazi) as displayed in Figure 1, but it should be understood that every subsequent backazimuth estimate is determined this way.

CZR(t,bazi) and BCF(t,bazi) are continuous and periodic with respect to bazi. Their usefulness depends on their reliability as indicators of direction of arrival and how repeatable they are for subsequent P arrivals from the same direction. Gibbons et al. (2020) present a database of waveforms for 55 surface explosions in northern Finland. Waveforms for most of the explosions are provided for six stations at distances between 59 and 209 km. They are useful signals for us having exactly one P arrival per segment, generated by an event with a known location. Each data segment starts approximately 30 s before the origin time, such that the arrival of each event at a given station will come after around the same number of seconds along each trace.

Figure 2 displays backazimuth estimates using both CZR(t,bazi) (panel a) and BCF(t,bazi) (panel b) for each three‐component signal in the dataset. The estimates are made for a single, but identical, time for each event and station combination, as indicated. The performance varies from station to station, but, common for all stations, the repeatability from one event to the next is better for BCF(t,bazi) than for CZR(t,bazi). This should come as no surprise, given the observations in Figure 1. We stress that the estimates in Figure 2 come from a completely automated process; the same time window was chosen regardless of noise conditions or interfering signals. An inaccurate value may simply be the result of unrelated signals or noise.

At ARE0, the optimal backazimuth for CZR(t,bazi) averages about 20° less than predicted with a large spread. The optimal backazimuth for BCF(t,bazi) has both a smaller bias and a smaller standard deviation. A handful of events show very anomalous estimates for both CZR(t,bazi) and BCF(t,bazi) that are likely the result of unrelated mining events from another direction. An even more impressive improvement from CZR(t,bazi) to BCF(t,bazi) is seen for KEV. The standard deviations for the KEV BCF(t,bazi) estimates (Fig. 2b) are the lowest in the whole dataset, and we can have confidence that the bias is real. Station LP61 shows by far the greatest variability, but even here, the optimal BCF(t,bazi) backazimuth shows lower variability than the optimal CZR(t,bazi) backazimuth. At SGF, the optimal BCF(t,bazi) backazimuth has greater bias but a lower standard deviation than the optimal CZR(t,bazi) backazimuth. For all stations in the Gibbons et al. (2020) dataset, both CZR(t,bazi) and BCF(t,bazi) provide qualitatively correct backazimuth estimates. However, only the optimal BCF(t,bazi) backazimuths show satisfactory standard deviations. In several cases, the standard deviation is significantly smaller than the apparent bias.

Much of our motivation is improved estimates on seismic arrays, especially in cases in which conventional processing fails due to signal incoherence. Most small‐to‐medium aperture arrays are vertical only (or with only very few three‐component sites). One fully three‐component array is PSAR (Pilbara) in Western Australia (Kennett et al., 2015), deployed primarily to enhance tsunami warning. Its location and geometry are displayed in Figure 3a,b. It is a so‐called spiral‐arm array with sensors in concentric rings. The central element and innermost two rings form a 7‐element array with an aperture around 4 km, designed for signal coherence up to 3–4 Hz. With the outer two rings, PSAR extends to ∼15 km to become a 10‐ or 13‐element array with enhanced slowness resolution for lower‐frequency teleseismic signals. The spacing and geometry were carefully chosen to optimize the array response function (Kennett et al., 2015).

PSAR recorded the nuclear tests in North Korea between 2013 and 2017. Knowing the event locations almost exactly, we can assess the accuracy and repeatability of direction estimates. The geographical backazimuth from the array center is 7.8°, although we remember that structure along the path may result in significant bias; the repeatability from event to event is most important. The toolbox of methods for direction estimation on arrays is vast (see Rost and Thomas, 2002, for an introduction). For full array processing, we apply a cross‐correlation‐based method between the vertical component signals at the different sites. We chose this method, developed at the University of Alaska, for infrasound processing, because the code is open source and generates pedagogical displays of how key parameters evolve with time (see Data and Resources, for source and references).

Figure 3 shows the PSAR waveforms for the nuclear tests as indicated, together with measures of signal coherence, apparent velocity, and backazimuth. Only the vertical waveforms are used for these estimates. Given the limited time‐bandwidth product, the coherence measure (the median cross‐correlation maximum [MdCCM]; see Iezzi et al., 2022) is frequently high, and a threshold of 0.6 is needed to exclude the majority of the background values. Only around the P arrival is MdCCM significantly above this threshold. The largest explosion (Fig. 3f) is unique in maintaining elevated coherence metrics well into the P‐wave coda, likely due to the longer period energy generated. For the smaller explosions, MdCCM is no greater after the P arrivals than before. The P‐arrival trace velocities lie between 17.5 and 18.5 km/s for all events and the backazimuth estimates between 8.4° and 10.8°. We emphasize that these are “point estimates.” Were we to offset all windows slightly or use a different window length, we would likely obtain somewhat different values. The same would apply if we were to filter in different frequency ranges (e.g., Kværna and Doornbos, 1991; Gibbons et al., 2010). Iezzi et al. (2022) present a narrowband extension to the array processing, which may reveal a frequency dependence, but that would also force the user to make additional choices.

Figure 4 shows CZR(t,bazi) and BCF(t,bazi) for the January 2016 event, both for individual array sites and for the array stack; this is the first time we have visualized the functions over an extended time window. Focusing on the P‐arrival time, we see, for these seven innermost stations, great variation both in the Z–R correlation values and in the symmetry with respect to the dashed line (the great‐circle backazimuth). Given that the maximum station separation is under 4 km, we conclude that the basis for the asymmetry is very local to each site. For some sites, the behavior is similar from one time window to the next; for others, it changes significantly with time. In both the preceding noise and in the post P‐arrival coda, the patterns of Z–R coherence at different sites appear almost unrelated. Most importantly in Figure 4 are the results of stacking: a strong and symmetric peak at the P arrival and strong suppression both before and after. The asymmetrical components at the P‐wave onset, some biased below and others above, cancel under the stacking operation. This improvement on the stack makes this a genuine form of array processing.

Figure 5 resembles Figure 2 in displaying optimal backazimuths for CZRi(t,bazi) and BCFi(t,bazi) at different stations for different events. The biggest difference is the addition of the stack. A stacking is possible for the case in Figure 5, because the intersite distances are very small compared with the distance between source and receiver (i.e., the great‐circle backazimuth is similar for all sites). This is not the case for the stations in Figure 2. Again, the optimal BCFi(t,bazi) backazimuths (Fig. 2b) are better constrained than the corresponding CZRi(t,bazi) backazimuths (Fig. 2a). The spread of the CZR(t,bazi) (stack) estimates is an improvement on that for the individual channels but is not clearly an improvement upon the individual channel BCFi(t,bazi) estimates. The BCF(t,bazi) (stack) backazimuth estimates, on the other hand, show greatly reduced variability compared with the single‐channel BCFi(t,bazi) estimates and are almost comparable to the estimates from conventional array processing. The array processing results are still superior (as we would hope), because these exploit time delays between the arrivals at the different sites, ignored in the current procedure. The advantage of conventional array‐processing results will diminish in situations with poor signal coherence.

Accurate backazimuth estimates for P arrivals on three‐component stations would improve phase association, identify sensor orientation errors, and improve seismic event location, especially given sparse observations. One method is to correlate short windows of the vertical waveform with different radial rotations. The Z–R correlation should be optimal for the arrival of backazimuth but, in practice, is subject to high variability and bias. We demonstrate that more stable and less biased estimates are obtained by finding the backazimuth for which a cosine best matches the Z–R correlation. It is straightforward to stack these functions at multiple closely spaced three‐component sites to further improve the estimates. This is a form of array processing that does not demand signal coherence between sites and makes the method promising for situations in which coherent array processing is challenging. This would require fully three‐component arrays, which is rarely the case.

The Z–R correlation bias varies significantly between closely spaced sites and is likely determined locally. This explains the great improvement in estimates through stacking. Under noise conditions, the Z–R correlations at different sites are uncorrelated with each other and are suppressed under stacking, making P arrivals easier to identify. We demonstrate for four DPRK nuclear tests recorded at PSAR that the new estimates are almost as good as those from conventional array processing. We emphasize that a very basic tool for measuring waveform similarity has been applied: time‐domain cross‐correlation at 1 s intervals in the data. Improvements may be possible by averaging correlation estimates over multiple time windows. Other methods, such as phase cross‐correlation (Schimmel, 1999) or multitaper coherence (Prieto et al., 2009; Prieto, 2022), may provide more reliable estimates. Our goals here have been to demonstrate the improvement resulting from (a) matching the Z–R correlation to a symmetrical function and (b) stacking over multiple three‐component stations. The gains are likely to apply to any underlying method for assessing waveform similarity.

The procedures described are applicable to arbitrary three‐component data, and we advocate continuous calculation of these functions on 3C data globally. This would associate all P arrivals with backazimuth estimates, which would enhance event location and, given sufficient observations, facilitate the calibration of backazimuth corrections. (Park et al., 2023; recently demonstrated how backazimuth estimation on multiple three‐component stations could contribute to continuous state‐of‐health monitoring.) For events such as the Nordstream explosions, the backazimuth uncertainty as applied by Staehler et al. (2022) could likely be significantly tightened given adequate calibration. We have considered two case studies: (1) 55 surface explosions in northern Finland and (2) 4 nuclear tests in North Korea. Both the datasets are open, and all event locations are known. We advocate adopting these as benchmark scenarios for developing and evaluating improved algorithms for direction estimation. A comprehensive and systematic comparison between different methods for backazimuth estimation, with regards to both accuracy and repeatability, is overdue. These benchmarks may help in addressing this need.

The array processing examples were calculated using a slightly modified version of the code available at https://github.com/uafgeotools courtesy of the University of Alaska, Fairbanks. The basis of the algorithms is described in Bishop et al. (2020) and Szuberla and Olson (2004). The only changes made were to the ranges of apparent velocities scanned. Plots were created using the GMT software (Wessel et al., 2019) and the ObsPy software (Krischer et al., 2015). The waveforms for Figure 2 were obtained from Gibbons et al. (2020). Data obtained from the AU (DOI: 10.26186/144675) and IU (DOI: 10.7914/SN/IU) networks via Earthscope and from the supplemental material in Gibbons et al. (2020). All calculations in this article used the m3csdirest program available on https://github.com/stevenjgibbons/m3csdirest. All websites were last accessed in October 2023.

The author is not aware of any competing interest.

The author thanks Anthony Lomax, Brian Stump, and an anonymous reviewer for thorough reviews of this article, which helped to improve it significantly.