## Abstract

Horizon-based subsurface stratigraphic model building is a tedious process, especially in geologically complex areas where seismic data are contaminated with noise and thus are of weak and discontinuous reflectors. Seismic interpreters usually use stratal (proportional) slices to approximately inspect 3D seismic data along seismic reflectors yet to be interpreted. We introduce an artificial intelligence workflow consisting of three deep learning steps to provide a conditioned seismic image that is easier to interpret, a stratigraphic model that outlines major formations, and moreover a relative geologic time volume that allows us to automatically extract an infinite number of horizons along any seismic reflectors within a seismic cube. Depending on the availability of interpreters, the proposed workflow can either run fully unsupervised without human inputs or using sparse horizon interpretation as constraints to further improve the quality of extracted horizons, providing flexibility in both efficiency and quality. Starting from only seismic images and a few key horizons interpreted on very sparse seismic lines, we demonstrate the workflow to generate a stack of complete horizons covering the entire seismic volume from offshore Australia.

## Introduction

Originated in the 1970s, seismic stratigraphy interpretation is the method of choice for understanding the depositional system and mapping the strata, facies, and reservoir distribution beyond well control. It forms the basis of modern analysis of sedimentary rocks, especially for hydrocarbon exploration (Mitchum et al., 1977). There are two key branches under seismic stratigraphy analysis: sequence stratigraphy and seismic facies. A seismic sequence represents a package of sediments deposited under similar geologic conditions during a specific time interval (Posamentier and Vail, 1988). These sequences are bounded by unconformities, which are surfaces representing gaps in the geologic record due to sea level change, erosion, or nondeposition. By identifying seismic sequences and unconformities, geoscientists can reconstruct the geologic history of an area and understand the factors that influenced sediment deposition. Seismic facies analysis is another essential aspect of seismic stratigraphy. It involves the classification of seismic reflections based on their amplitude, frequency, continuity, and geometry (Mitchum et al., 1977). Different seismic facies correspond to distinct lithologies or depositional environments, allowing geoscientists to infer the nature of the subsurface sediments. Such information is critical for predicting the presence of potential reservoirs of natural resources, especially in oil and gas exploration.

For both sequence stratigraphy and facies analysis, geoscientists rely heavily on horizon-based seismic interpretation to construct the subsurface structural and/or stratigraphic models from seismic images. However, as it is probably the most time-consuming task in seismic interpretation, horizon interpretation is always restricted to a few key seismic events that delineate the overall structure of the region, as well as within a small interval of interest, usually around the reservoir layers. As it becomes impractical to manually interpret every peak/trough in a 3D seismic volume into an individual horizon, over the past decades, researchers have been developing automatic horizon extraction methods, aiming to extract as many horizons as possible from the seismic volume, while reducing human efforts. These automatic horizon extraction methods fall mainly into three categories. The first and simplest category is to use stratal slices (also called phantom horizons) to interpolate between two existing horizons (Zeng et al., 1998a, 1998b). This method relies on existing interpreted horizons as the upper and lower bonds for interpolation and is extremely easy to implement. However, the extracted horizons do not always follow the seismic events precisely, and the error will accumulate with increasing distance between the upper and lower existing interpretation. The second category of solutions is based on merging local horizon patches automatically extracted along peaks and troughs (Borgos et al., 2003; Monsen et al., 2007). The last category is based on seismic structural attributes, such as tracking along structure tensor or dip fields (Lomask et al., 2006; de Groot et al., 2010; Wu and Hale, 2015), and extracting as iso-surfaces from relative geologic time (RGT) volumes (Stark, 2004; Wu and Zhong, 2012). In this study, we propose a deep learning workflow that consists of three steps — seismic conditioning, key horizon extraction, and RGT estimation — to extract an infinite number of horizons from a 3D seismic volume. Depending on the availability of interpreters, the proposed workflow can either run fully unsupervised without any human inputs or use sparse horizon interpretation as constraints to further improve the quality of the extracted horizons, providing flexibility in both efficiency and quality.

In a traditional seismic interpretation workflow, seismic image conditioning is routinely used to improve the accuracy and efficiency of horizon interpretation when using traditional horizon picking tools. This is enabled by removing the noise and artifacts in the data, as well as improving the continuity of seismic events and clarity of faults. Similarly, conditioning the seismic image also provides better performance for downstream machine learning (ML) tasks, such as ML-based horizon picking and fault detection. Although there are abundant classical methods for seismic conditioning (Fehmers and Höcker, 2003; Hale, 2011; Zhang et al., 2016), it usually relies on great expertise to parameterize these methods and requires balancing among different conditioning objectives. To make seismic image conditioning a fully automated process, we use a deep learning model trained in a self-supervised fashion to prepare the seismic data for the next steps.

Deep learning methods have greatly advanced seismic stratigraphy interpretation in the past few years (Zhao, 2018; Peters et al., 2019; Wu et al., 2019; Di et al., 2020; Abubakar et al., 2022). These methods enable geoscientists to map the major stratigraphic layers or facies by training deep learning models with a few labeled examples. In this study, we use deep-learning-based seismic stratigraphy interpretation to generate complete coverage of key horizons in a 3D seismic volume, which serve as constraints when performing the fully automatic horizon extraction over the entire seismic volume.

The concept of RGT was introduced in Stark (2004), where the author used phase unwrapping to generate a 3D volume representing the estimated relative age of each seismic sample in the volume. With a sequence of sedimentary rocks, RGT indicates the relative age of a sedimentary rock layer compared to the other layers (Catuneanu, 2017). As younger layers sequentially deposit on older layers, RGT can be determined by tracking the vertical order of rock layers (Dawes and Dawes, 2013). With a determined RGT volume, one can conveniently identify the order of the sequence and extract horizons as iso-surfaces. We use a deep learning model to generate an RGT volume under the constraints of the key horizons interpreted from the previous step to improve the quality of the horizons in regions with complex structures.

In the rest of the paper, we use an example from the Stybarrow oil field in offshore Australia to demonstrate the proposed workflow, while also providing more technical details on the deep learning methods for each of the three steps.

## Data set

The Stybarrow oil field is located in a northeast–southwest-tilted fault block, bound to the east and west by faults and with structural closure to the southwest in Western Australia (Figure 1a). It was discovered in 2003 by BHP Billiton (now BHP), which drilled Stybarrow-1 to target the turbiditic Late Jurassic Macedon Sandstone in a fault-bound structural closure. After an 18.6 m net thickness oil column was discovered, the field was appraised with Stybarrow-2 to Stybarrow-4, and the Stybarrow 3D seismic survey was collected between 2003 and 2005. The subset used in this paper covers an area of 4.875 by 11.875 km and the depth from 800 to 2572 ms two-way traveltime. The seismic survey consists of 391 inlines, 951 crosslines, and 887 samples per trace. As shown in Figure 1b, the seismic data are imaged of good quality. One of the dominant features in the 3D seismic data is the pinch-out, above which the structures are relatively continuous and clearly imaged. However, the zones below the pinch-out contain numerous faults and relatively poor signal-to-noise ratio (S/N).

## Workflow description

Figure 2 illustrates the proposed workflow for generating a horizon volume corresponding to any 3D seismic data. The workflow consists of three components, detailed in the following.

** Seismic image conditioning.** We define conditioning as transformations to simultaneously accomplish the tasks of random noise attenuation, coherent migration artifact attenuation, smoothing and/or filling in broken horizons, and accentuating faults. The conditioning approach also must maintain resolution between finely spaced consecutive horizons, maintain fault integrity, and within reason, locally maintain relative amplitudes. Here, seismic image conditioning is conducted employing a single model built using a deep convolutional neural network. The network is trained in an unsupervised manner using multiple self-supervised tasks such as denoising, occlusion, distortion, etc. During training, the parameters related to the self-supervision tasks are continuously and randomly modified in a controlled manner to ensure that the model excludes inconsistent features in the survey and only learns meaningful consistent attributes of the data. Such an approach endows the model with capabilities to detect and extract widely varying noise/artifact characteristics and strengths. Furthermore, the nature of the training tasks also provides the model with the ability to improve horizon continuity in poor S/N zones while maintaining fault integrity. It is noteworthy that the model will also attenuate coherent noise and migration artifacts.

Figure 3 shows an example of how the imaging quality of the Stybarrow survey is improved by employing the conditioning ML model. Note that the method sharply attenuates migration artifacts, significantly boosts the horizon continuity in deeper zones, and effectively sharpens the fault planes. While a certain amount of signal change is observed locally at a short scale, such changes have minimal effect on structural interpretation tasks including the two described in the following.

** Key horizon extraction.** With the conditioned seismic volume, the next step is to extract a set of key horizons that outline the major formations and are to be used for constraining the following step of RGT estimation. In general, we formulate the task as a multiclass segmentation problem. To address the challenge of sparse horizons in a seismic image, the segmentation is targeted at the stratigraphic formations bounded by these target horizons. Figure 4 illustrates the corresponding workflow. More specifically, it starts with picking the key horizons in a few sections. Next, the horizons are converted into stratigraphic images with the horizons as the stratigraphic boundaries. With the converted stratigraphic images at the selected sections as training labels, the semisupervised learning workflow for stratigraphy interpretation by Di et al. (2020) is applied to generate the stratigraphy cube (Figure 5). Finally, we extract the stratigraphic boundaries throughout the survey to extract the horizons while correlating them with the corresponding seismic peak/trough events.

For the Stybarrow data set, six key horizons are identified. Manual picking of the six horizons is on eight inline and 10 crossline sections (Figure 6), and the corresponding stratigraphic image contains seven sequences. Figure 7 displays the predicted stratigraphy cube as well as the extracted six horizons. All of them match the seismic events closely and clearly capture the depositional features from the seafloor to the unconformity. Additionally, Figure 8 compares the stratigraphy prediction from original and conditioned seismic images, which verifies the added values of preconditioning seismic images in producing artifacts (denoted by circle and arrow). Although a few small pieces are missing as the machine prediction is rejected while snapping to the seismic events, it has minimal effect for ensuing interpretation workflows including the horizon-constrained RGT estimation described in the following.

** RGT estimation.** The workflow to generate the RGT volume is illustrated in Figure 9. In addition to the step of key horizon extraction noted earlier, a 3D seismic survey is fed into another processing branch — FlowNet inference, from which a flow field for each pair of seismic traces will be obtained. The following is loop-tie optimization, in which both the key horizons and flow fields will be fed into an optimization engine to iteratively improve the RGT volume until it satisfies both the flow field correlation and the key horizons constraints. In this workflow, the flow field between each pair of seismic traces is obtained by applying a pretrained FlowNet on the 3D seismic survey. The training and implementation details of FlowNet can be found in Li and Abubakar (2020a, 2020b).

Figure 10 illustrates the optimization process in detail. The RGT cube is assumed to be in a Cartesian grid, which consists of inline, crossline, and depth dimensions. The inline dimension is indexed by *i* = 1, 2, …, *n*, with *n* representing the number of grids along the inline dimension. Crossline and depth dimensions are indexed by *j* = 1, 2, …, *m* and *k* = 1, 2, …, *l*, respectively. *RGT*_{i,j,k} is used to indicate the RGT value at inline *i*, crossline *j*, and depth *k*. Flow fields involved in this study are oriented along both inline and crossline directions, annotated as *F*^{in} and *F*^{cr}, respectively. $Fi,jin$ indicates the flow from trace *RGT*_{i,j} to *RGT*_{i+1,j} along the inline direction, and $Fi,jcr$ indicates the flow from trace *RGT*_{i,j} to *RGT*_{i,j+1} along the crossline direction.

Starting from a flow field trace $Fi,jin$ along an inline direction between two RGT traces *RGT*_{i,j} and *RGT*_{i+1,j}, an accurate flow field will provide the pixel-wise vertical shift distance of seismic horizons from *RGT*_{i,j} and *RGT*_{i+1,j}. Specifically, the horizon located at depth *k* on *RGT*_{i,j} will move to depth *k* − $Fi,j,kin$ on *RGT*_{i+1,j}, where $Fi,j,kin$ is the shift displacement. Let $RGTi+1,j,k\u2212Fi,j,kin$ be the value of *RGT*_{i+1,j} sampled at depth *k* − $Fi,j,kin$. A good estimate of an RGT cube should satisfy the following equation:

As the depth *k* − $Fi,j,kin$ might not be at the exact grid location, linear sampling is applied to obtain an estimate of $RGTi+1,j,k\u2212Fi,j,kin$:

In this study, flow field is obtained by training and inferencing seismic FlowNet. Based on equation 1, the following mean square error loss functions can be defined by including all the trace pairs, along both inline and crossline dimensions:

In addition to the flow field-based loss functions in equations 3 and 4, loss functions based on seismic amplitude can be included in the optimization. Let *A*_{i,j,k} represent a seismic amplitude that shares the same grid location with *RGT*_{i,j,k}. For every trace, an RGT value should correspond to a seismic amplitude value. Let $A\u2032i,jin$ be the seismic amplitude values of trace *A*_{i+1,j} to which *RGT*_{i,j} corresponds. $A\u2032i,jin$ can be calculated be linear sampling *A*_{i+1,j} at all *RGT*_{i,j}. For a good RGT cube, $A\u2032i,jin$ should have maximum cross correlation with *A*_{i,j}. The following amplitude-based loss function can be defined:

where $Ci,j,kin$ is cross correlation, and *s* is the half size of the window for calculating cross correlation. The final loss function can be defined in the following, where *b* is a scaling factor to balance between flow-based loss and amplitude-based loss:

The principle of superposition in stratigraphy states that for a sequence of sedimentary layers, the older layer is at the base and layers above are progressively younger with ascending order (Dawes and Dawes, 2013). This requires that the RGT on a single trace is monotonically increasing as the depth increases. Furthermore, ensuring that each RGT trace is a monotonic function can significantly reduce the complexity of the algorithm to extract horizons from the RGT cube.

Given a seed point, horizon extraction from an RGT cube is a process of finding the iso-surface so that all the points on the surface have the same RGT value as the seed point. An RGT value search on a monotonic curve can be accelerated in different ways by taking advantage of the assumption of monotonicity. This will make horizon extraction much faster than on a nonmonotonic curve.

To ensure that the monotonic constraint is satisfied for each trace, we parameterize the RGT value as a running total of a sequence of positive numbers. Let a real number *w*_{i,j,k} ∈ ℝ be the parameter to be optimized at inline *i*, crossline *j*, and depth *k*. The RGT trace value at inline *i*, crossline *j* will be formulated as

For any value of *w*_{i,j,k} ∈ ℝ, *e*^{wi,j,k} is positive, and this guarantees that *RGT*_{i,j} is monotonically increasing along depth.

The final loss is differentiable with respect to each parameter *w*_{i,j,k}. Automatic differentiation of equation 7 with respect to *w*_{i,j,k} can be implemented in many deep learning frameworks as an industry standard. Once the loss function reaches a preset threshold, the optimization is completed and the RGT cube can be calculated using equations 6 and 7.

The flow field generated by seismic FlowNet (Li and Abubakar, 2020a) can vary in scale. The distance that a flow field can extend ranges from one grid interval to *n* − 1 grid intervals, where *n* is the number of grids along one of the two lateral dimensions. Selecting the flow field with shorter working distance requires more computer memory and runtime to complete the optimization, as more variables and computing steps are involved. On the other hand, using a flow field that is too short may cause the optimization to be trapped in local minima. Identifying the proper flow field range is important to balance convergence stability, computing time, and output resolution.

To achieve stable convergence and save computing resources, optimization in this study is performed on a decimated grid. Every 10th trace is kept for RGT optimization. This requires the input of a flow field with a working distance of 10 grid intervals. The flow field is derived by simply aggregating all the flow fields in between with distance of one trace interval. Another method to obtain such a flow field is to feed the seismic FlowNet with image pairs that are 10 grids apart and output the required flow field. Once the optimization is completed on the sparse grids, RGT with full resolution can be obtained using interpolation.

## Results and applications

Figure 11 demonstrates a 3D view of the generated horizon cube. It not only successfully tracks the continuous horizons in the top zone and clearly reflects the pinch-out, it also captures well the geologic complexities in the presence of faults in the deep zone.

The generated horizon cube can be used for other tasks of seismic interpretation, such as

Figure 12 demonstrates an example of tracking four horizons in the Stybarrow data set. All of them match the original seismic events with high accuracy, especially the two horizons in the middle where the seismic quality is relatively low.*Seeded horizon tracking.*A stratum allows detailed analysis of rock relationships within a chronostratigraphic framework and moreover the distribution of facies and lithology to identify potential stratigraphic traps. Figure 13 demonstrates the extracted stratum that clearly reflects the pinch-out and indicates the lateral variations in deposition.*Strata extraction.*By treating the horizon cube as a low-frequency model, incorporating it into ML-based static property modeling has led to significantly improved consistency of machine prediction, particularly in the absence of sufficient training wells (Di et al., 2022).*Structure-constrained ML-based static property modeling.*

## Conclusions

In this paper, we presented an integrated workflow for converting 3D seismic data into the corresponding horizon volume via deep-learning-accelerated seismic data conditioning, key horizon extraction, and RGT estimation. As tested over the public Stybarrow survey in Western Australia, it captures the structural complexity well in the presence of faults and generates a horizon volume that is of high quality and can further assist other seismic interpretation tasks such as seeded horizon tracking, strata extraction, and structure-constrained static property modeling.

## Acknowledgments

We would like to thank Chester J. Weiss, Mrinal K. Sen, and one anonymous reviewer for their insights and suggestions on improving the quality of this work. Thanks also go to Geoscience Australia for providing the Stybarrow seismic survey and SLB for granting permission to publish.

## Data and materials availability

Data associated with this research are available and can be obtained by contacting the corresponding author.