## ABSTRACT

Prediction-error filters (PEFs) are essential for seismic deconvolution and other geophysical estimation problems. We find that nonstationary multidimensional PEFs can be computed in a “streaming” manner, wherein the filter is updated incrementally by accepting one new data point at a time. The computational cost of estimating a streaming PEF is reduced to the cost of a single convolution. In other words, the cost of the PEF design while filtering is equivalent to the cost of applying the filter. Moreover, the nonlinear operation of finding and applying a streaming PEF is invertible at a similar cost, which enables a fast approach for missing data interpolation.

## INTRODUCTION

Prediction-error filters (PEFs) play an essential role in geophysical data analysis. They can be used directly in different forms of seismic deconvolution (Webster, 1978; Robinson and Osman, 1996). More importantly, PEFs are a practical approximation for the covariance operators used in geophysical estimation problems (Claerbout, 2014). As such, they provide a way to absorb and characterize statistical patterns (Claerbout and Brown, 1999).

Because most natural patterns are nonstationary, extending the classic stationary formulation to nonstationarity is vital. This extension can be done either by patching, that is, breaking data into overlapping windows, or by a smoothly nonstationary estimation with regularization (Crawley et al., 1999; Curry, 2003). Shaping the regularization provides a particularly effective method for constraining the filter variability (Fomel, 2009; Liu and Fomel, 2011; Liu et al., 2012). Both approaches have an additional computational expense, particularly in storing variable filter coefficients (Ruan et al., 2015).

This paper presents an efficient approach for computing and applying nonstationary PEFs without excessive storage requirements. Using an adaptive-filtering approach (Widrow and Stearns, 1985; Haykin, 2002), we show that a nonstationary PEF can be updated on the fly by accepting one data point at a time. The cost of computing and applying such a PEF is reduced to the cost of a single convolution. In other words, the cost of the PEF design while filtering is equivalent to the cost of applying the filter. Moreover, the nonlinear operation of finding and applying a PEF has an exact inverse, which finds an immediate application in missing data interpolation problems. We extend the filter from one dimension to multiple dimensions using the helix transform (Claerbout, 1998) and show its application to simple benchmark problems.

We have implemented the proposed algorithm in the Julia programming language (Bezanson et al., 2017). For clarity, the essential code is included in the paper. All presented examples are reproducible using the Jupyter Notebook (Kluyver et al., 2016; Granger and Pérez, 2021) provided with the paper.

### Theory

Suppose the filter is updated with each new data point $dn+1$. The additional constraint we can impose to control the variability of the filter is that the new filter $a$ stays close to the previous filter $a\xaf$.

#### Inversion

### Multiple dimensions

### Cost comparison

Table 1 compares the computational cost of different approaches for computing PEFs. The cost of a streaming PEF is minimal and reduces to the cost of a single convolution. Streaming PEF can capture the input data’s nonstationary character without storing multiple copies of the filter.

Moreover, all streaming computations are local, which makes them particularly suitable for modern hardware accelerators such as general-purpose graphics processing units (Sanders and Kandrot, 2010) or Intel Xeon Phi (Jeffers and Reinders, 2013).

### Missing data interpolation

The existence of the exact inversion for the application of a streaming PEF allows for a straightforward approach to missing data interpolation. When reading data in a streaming fashion, every time we meet a missing data point $dn+1$, we can replace its value with a value computed from the residual. The residual value $rn+1$, in this case, can be set to zero or a random number (white noise) with the expected variance of the residual. In the latter case, it is possible to generate multiple equiprobable distributions for the interpolation result (Clapp, 2000). The missing data interpolation is implemented in Program 4.

A simple 1D interpolation example using data from Figure 1 is shown in Figure 2. As evident from Figure 2, such interpolation, although capable of picking the nonstationary data pattern, remains one sided and may create discontinuities on the other side of the data gap. To help with this issue, we can rerun the streaming PEF using the opposite direction and stack the results.

Figure 3 shows a 2D missing data interpolation test using a synthetic pattern from Claerbout and Brown (1999). The input data for this test are shown in Figure 4b. The interpolation proceeds by running the streaming PEF twice in forward and backward directions (Figure 3a and 3b) and averaging the interpolation results (Figure 3c). The cost of such a procedure is the cost of two convolutions, as opposed to multiple iterations required in the conventional approach to missing data interpolation with PEFs (Naghizadeh and Sacchi, 2010; Liu and Fomel, 2011; Claerbout, 2014). A better interpolation result, shown in Figure 4c, is achieved by filling the residual inside the gap with small-variance white noise instead of zeros while reconstructing the data from the residual.

The random residual approach can generate multiple equiprobable realizations for the missing data interpolation problem in the spirit of geostatistical stochastic simulations (Clapp, 2000). Figure 5 shows three realizations created using different seeds for the pseudorandom number generator.

Figure 6 shows a similar interpolation test applied to 2D data with a nonstationary pattern. Although streaming PEF fails to achieve a perfect reconstruction in this case, it manages to capture most of the pattern and hide the location of the gap.

Figure 7 shows a similar test applied to 2D seismic data. The pattern made by reflection events with variable slopes is well captured.

Removing the dominant reflection events by PEF filtering leaves weaker hyperbolic diffraction events (Figure 8).

## DISCUSSION

A streaming application, wherein the adaptive filter is updated one data point at a time, is appropriate in a continuous stream of a large amount of data, such as in passive monitoring of carbon storage (Alumbaugh et al., 2024). In more typical scenarios, wherein the data are immediately accessible, more accurate results can be achieved with other forms of regularization, such as regularizing adaptive filter coefficients with smoothing shaping operators (Fomel, 2009; Liu and Fomel, 2011; Liu et al., 2012). For greater efficiency, a hybrid regularization strategy can be developed. Such strategies are discussed by Geng et al. (2024), who also extend the streaming approach to other applications of seismic attributes.

In applications such as missing data reconstruction, the streaming approach represents an extreme point of the accuracy-efficiency trade-off. To achieve better accuracy, some efficiency can be sacrificed at the expense of performing extra iterations. The presented approach is also limited to the situations of missing values in regularly sampled data and will need to be modified for situations of irregular data.

In multidimensional applications, the helical boundary conditions allow for easy invertibility (Claerbout, 1998) but are not always appropriate. In this case, extra accuracy can also be bought at the cost of sacrificing some of the efficiency.

We hope that providing reproducible benchmarks with this paper will invite the reader to make direct comparisons with more advanced methods. Such comparisons are beyond the scope of this paper because they would require a different codebase.

## CONCLUSION

We have presented an efficient approach for computing and applying nonstationary PEFs. Instead of storing multiple copies of varying filters, the streaming approach stores only one copy and updates the filter on the fly with every new data point. The cost of this procedure is equivalent to the cost of a single convolution and does not require multiple iterations. Moreover, the nonlinear operation of estimating and applying a streaming PEF has an exact inverse, which becomes helpful in missing data interpolation problems. A streaming approach for missing data interpolation does not require iterations and can be accomplished effectively at the cost of two convolutions. We anticipate many possible applications of the proposed technique in geophysical estimation problems.

All examples in this paper are reproducible using the attached Jupyter Notebook.

## ACKNOWLEDGMENTS

We thank R. Clapp, S. Levin, Y. Liu, and R. Sarkar for inspiring discussions about prediction-error filters. We thank the sponsors of the Texas Consortium for Computational Seismology (TCCS) and the Stanford Exploration Project (SEP) for their financial support.

## DATA AND MATERIALS AVAILABILITY

The data associated with this research are available and can be accessed via the following URL: https://doi.org/10.5281/zenodo.11099632.

Biographies and photographs of the authors are not available.