Machine-learning (ML) applications in seismic exploration are growing faster than in other industry fields, mainly due to the large volume of data acquired by the exploration industry. ML algorithms are being implemented for almost all the steps of the seismic processing and interpretation workflow, mainly for automation, processing time reduction, and efficiency, and in some cases for improving the results. We carry out a literature-based analysis of existing ML-based seismic processing and interpretation work published in the SEG and EAGE literature repositories and derive a detailed overview of the main ML thrusts in different seismic applications. For each publication, we extract various metadata about ML implementations and performances. The data indicate that current ML implementations in seismic exploration are focused on individual tasks rather than on a disruptive change in processing and interpretation workflows. The metadata indicate that the main targets of ML applications for seismic processing are denoising, velocity model building, and first-break picking, whereas, for seismic interpretation, they are fault detection, lithofacies classification, and geobody identification. From the metadata available in the publications, we obtain indices related to computational power efficiency, data preparation simplicity, the rate of testing the ML models on real data, and the diversity of ML methods, and we use them to approximate the efficiency, effectiveness, and applicability of current ML-based seismic processing and interpretation tasks. The indices for the ML-based processing tasks indicate that current ML-based denoising and frequency extrapolation have higher efficiency, whereas ML-based quality control is more effective and applicable compared with other processing tasks. Among the interpretation tasks, ML-based impedance inversion shows high efficiency, whereas fault detection shows high effectiveness. ML-based lithofacies classification, stratigraphic sequence identification, and petro/rock properties inversion exhibit high applicability compared with other interpretation tasks.

Seismic processing and interpretation rely on a workflow that uses a series of standard steps customized and calibrated by expert operators for each specific data set. The choice of an optimal processing workflow and the selection of the most appropriate interpretation strategy are a mixture of constantly evolving technologies, scientific knowledge, technical competence, talent, and intuition. The growing size of the data sets (Arrowsmith et al., 2022) and the need for reducing the acquisition-to-delivery time on one side, and the growing power of computational facilities on the other, have made the use of data-driven methods extremely attractive for industry and researchers. In this context, machine-learning (ML) methods have been the object of remarkable investments and developments witnessed by an exponentially growing number of publications.

ML methods rely on techniques in which the “machine” learns how to process and interpret the data by optimizing complex relationships between the input and the output data space, relationships that are often agnostic to the modeling of physical phenomena. This represents a real revolution with respect to the traditional methods, which are strongly based on the physics of the problem. However, is the goal of science to understand or to predict? Alkhalifah (2022) believes that, in the age of data science, we are moving back to predictive models that can resolve many limitations of the physical models. He defines ML as “a very extensive numerical tool, nothing more and nothing less, which is based on optimization.” This means that, instead of handcrafting transformations and representations such as the Fourier transform, the machine develops its own transformations through optimization calibrated for specific problems. Despite the controversies, many believe that the ML revolution is one that we cannot miss: it is very big, and it might fundamentally change the way we do things in the industry.

The main driver for moving to fully data-driven methods is the idea of mitigating some drawbacks of the current technologies and methods. The first aspect is efficiency. Currently, we use complex workflows that demand many parameters and solutions, which require an expert operator making decisions based on analyses, tests, and evaluations carried out with a Galilean approach (trial and error). This process requires time and expertise. It is very attractive to have technologies that can directly provide results through a computer-based process without the need to make decisions. The second aspect is bias. Different technologies and different operators will provide different results, which are biased in an uncontrolled way by the choices that are made during the process. The objective of an ML application would be to obtain results that are more repeatable and, in a sense, more “objective” because they come from a process that is related only to the real information present in the data. The third aspect is effectiveness. Every processing and interpretation technique has intrinsic limitations and, according to the method and choices, the results will allow us to “see” different things. The objective here is to have methods that automatically provide high-quality results, reflecting rich information about the subsurface.

All of these limitations are the reasons for investing in ML techniques. We look at these new developments as a potential holy grail of seismic exploration, where the raw data are given to a machine that automatically provides the best and most informative model in an efficient manner. This idea undoubtedly represents a revolution in the way we carry out seismic processing and interpretation, but many issues related to ML implementation must be addressed. The complexity of the relationship between the model and the data in seismic exploration is incomparable to most of the problems that are now routinely solved by ML techniques. For example, wave propagation inside the earth is much more complex than the wave propagation inside human organs that is used for medical imaging (Pratt, 2019; Jakobsen et al., 2023). As a result, in most cases, fine-tuning already trained ML models from other disciplines (also referred to as transfer learning) for seismic processing and interpretation tasks is useless. Moreover, training ML models from scratch strongly relies on abundant labeled data, which in seismic exploration is challenging because ground truth for real data does not exist. Even though synthetic data, which can be labeled, complement the real training data, they hardly replicate the complexity of real noisy field data. For these reasons, the current state of ML implementation is mostly devoted to making some processing or interpretation steps of the traditional workflow more efficient, effective, and unbiased. Therefore, the main question is: Where are we on our path in the search for the holy grail?

The use of ML techniques for seismic processing and interpretation has exponentially expanded in the past 10 years, even at a faster pace compared with other industries (Figure 1). However, are the existing ML-based seismic exploration applications effective and efficient? Do ML algorithms considerably change the traditional seismic exploration workflow and bypass traditional intermediate steps to provide end-to-end algorithms? These are the important questions that we want to answer in this paper by creating a clear picture of the existing ML-based applications in the context of seismic exploration.

We considered the major literature repositories in the field of seismic exploration (SEG Digital Library and EAGE EarthDoc) between 2010 and 2021 and analyzed more than 500 ML-based publications. Most of the publications were from the EAGE annual conference, the SEG technical program, and Geophysics (Figure 2). We carried out a literature-based analysis by harvesting metadata of the publications, focusing on the type of the addressed seismic processing/interpretation task, the dimensionality of the problem and solution (one dimension, two dimensions, or three dimensions), the characteristics of the implemented ML model and its architecture, the optimization method, data format and data conditioning, size of the input and output, number of training samples, attribute features, computational power requirement, generalization of the model to unseen data, type of training and testing data (synthetic and real), as well as basic publication information such as the type of paper, affiliation and company involvement, synthetic data simulation method and simulated noise, and data augmentation for training. We avoided duplications and ignored similar abstracts and journal papers from the same authors presented to EAGE and SEG.

We divide the publications into processing and interpretation applications. Already from this very general representation, it is possible to infer some research trends. More than half (53%) of the documents are related to interpretation applications (the pie chart in Figure 3), which is in contrast with the general (ML-based and non-ML-based) research focus within the same period. According to the EAGE and SEG repositories between 2010 and 2021, only 44% of total publications are devoted to interpretation (the pie chart in Figure 3). One of the main drivers for the implementation of ML-based interpretation is the similarity of the tasks to those in computer vision. In addition, interpretation is the most time-consuming aspect of seismic exploration in terms of human involvement. As a result, the automation of interpretation tasks is another motivation for the implementation of ML methods. In contrast, for the processing steps, the larger data size (raw prestack data) has led to the development of many data-driven quantitative measurements in the traditional processing workflow. Nevertheless, processing steps such as velocity model building (VMB) are usually computationally very demanding. The ML implementations for processing tasks usually aim at enhancing computational efficiency and at creating solutions with higher accuracy that are competitive with traditional ones. Another remarkable aspect is the large interest from exploration companies in the research and development of ML-based seismic exploration applications, which has been the main booster for these applications. Almost half (47%) of the ML-based processing applications are developed directly by or in collaboration with companies, whereas company involvement in ML-based interpretation applications rises to 66% (Figure 3).

In the following, we review and briefly describe the state-of-the-art ML approaches that are currently applied to seismic processing and interpretation. Because synthetic data play a major role in ML model training, we provide a short overview of such synthetic models and associated data that are being used for ML-based seismic exploration. The ML-based interpretation applications can be reinforced by using appropriate attributes as the input. We investigate the use of attributes in ML-based interpretation surveys and describe the essential dimensionality reduction methods used for attribute selection. In the “Discussion” section, based on the extractable statistical data and described applications, we define indices that aim at approximating the current efficiency, applicability, and effectiveness of ML-based processing and interpretation tasks. In general, very rich glossaries and online materials are available for ML and deep-learning terminologies and descriptions. We suggest Google Developer’s glossary (Google, 2023) for terminologies and short descriptions of ML and deep-learning algorithms and scikit-learn (Scikit-learn, 2023) for ML supervised and unsupervised algorithms. A detailed description of deep-learning methods can be found in Goodfellow et al. (2016) and Aggarwal (2018). A comprehensive description of ML methods can be found in Kroese et al. (2019) and Kelleher et al. (2020). In addition, a useful overview of ML methods in seismology is provided in Mousavi and Beroza (2022). From the implementation point of view for ML and deep-learning algorithms, we suggest the guidelines on Keras and scikit-learn platforms.

The use of synthetic seismic data sets for the training stage of ML for seismic processing and interpretation is rapidly growing, mainly due to the challenging task of labeling the real data, bias associated with labeled real data, and lack of open-access field data that are representative of different geologic settings. The synthetic data sets can be labeled automatically or semiautomatically in most of the applications such as denoising, frequency extrapolation, and fault detection. If we neglect the bias in the numerical modeling of the synthetic data, the synthetic labeled data can be considered as the ground truth and, as a result, bias-free data. This assumption is reasonable as numerical modeling is already a part of many traditional workflows (e.g., full-waveform inversion [FWI] and impedance inversion). Nevertheless, an ML model that is trained solely by synthetic labeled data can underperform when it is applied to a real data set due to a distribution mismatch between synthetic and real data. Alkhalifah et al. (2022) propose a strategy to bridge the gap between synthetic and real data using domain adaptation principles that can potentially resolve this issue and enhance the performance of the ML model on real data. For seismic processing problems, usually raw synthetic data from elastic/acoustic numerical solvers are considered, whereas the use of the 1D convolutional model is more common for interpretation steps because it is computationally much more efficient.

Many realistic synthetic models have been introduced in the literature and, for most, the open-access raw data are also available. Figure 4 shows that the Marmousi I, Marmousi II, and SEG Advanced Modeling Program (SEAM) models correspond to approximately 70% of the synthetic models that are used for ML-based seismic processing and interpretation applications.

According to the literature, there is a diverse use of ML in different processing steps and, although certain technical problems are routinely addressed with ML, for many others the application of ML is still limited. Hence, we analyzed the publications concerning ML’s application to various processing tasks and, according to the published examples, we divided seismic processing into four main categories, each containing several steps: preprocessing, processing, VMB, and passive surveys (Figure 5). Preprocessing and VMB account for 82% of the ML-based applications, which are mostly focused on denoising (22%), trace interpolation (11%), VMB from raw data (14%), frequency extrapolation for FWI (6%), first-break picking (9%), event separation (6%), near-surface velocity model estimation using groundroll (6%), and quality control (QC) (4%).

Another aspect that we considered was the choice of different ML methods, and we analyzed the literature to identify the most used methods, also in relation to specific applications. It is interesting to notice that, whereas for certain tasks there are examples of using many different ML methods (e.g., for surface wave processing), there are processing steps for which the use of methods other than convolutional neural networks (CNN) seems to be negligible (Figure 6). We analyzed all the published applications for processing and identified subcategories and their share. Among all the published application examples, we selected the most relevant ones in terms of technical innovation, efficiency, ML model selection, and significance of the results. For each processing step, we describe in the following the most important applications and technical issues.

Preprocessing

Quality control

Because many QC applications, such as anomalous trace detection, operate on the prestack raw data, an automatic and fast approach is of high importance. The ML-based QC applications (outer pie in Figure 5) are mainly focused on noise recognition (Farmani and Pedersen, 2020; Walpole et al., 2020), anomalous trace identification (Damianus et al., 2020), and erroneous first-break detection (Duan et al., 2018). By noise detection, we mean the step of recognizing the noise only, not the process of performing the denoising.

Noise detection is treated as either a classification or a regression problem. Martin et al. (2021) use a CNN U-Net architecture to classify 2D patches of the labeled seismic data into four classes (i.e., signal, signal and noise, noise, and mask). Conversely, Walpole et al. (2020) consider the CNN InceptionV3 architecture with a single output node that defines the noise level of the input trace. Anomalous trace detection is implemented using both unsupervised (Hou et al., 2019; Damianus et al., 2020) and supervised (Vishwakarma, 2021) algorithms.

Denoising

Despite the many advances in traditional denoising methods, they involve many parameter settings that, if poorly selected, can lead to an ineffective denoising application. In contrast, ML-based algorithms have a long history in nonseismic image denoising (Elad and Aharon, 2006; Vincent et al., 2010; Chen et al., 2014). Similar ML algorithms are usually adopted for ML-based seismic denoising, and they can be applied in a fully automatic manner after training. ML-based seismic denoising is mainly focused on random noise and groundroll attenuation (Figure 5).

Supervised CNN denoising models have been applied to random noise attenuation (Wu et al., 2019b; Yu et al., 2019), multiple removal (Wang and Nealon, 2019), groundroll attenuation (Jia et al., 2018; Li et al., 2018a; Yu et al., 2019), and seismic interference and swell noise removal for marine data (Slang et al., 2019; Brusova et al., 2021). Denoising CNN (DnCNN) algorithms are very common in image-denoising applications, in which the target is usually the noise (residual data) rather than the clean data (Zhang et al., 2017). Residual mapping greatly simplifies the implementation because the model is required to predict only the noise. Consequently, the details of the seismic events are well preserved when the predicted noise is subtracted from the noisy data. Several authors have implemented the DnCNN method for seismic random denoising (Liu et al., 2018; Zhang et al., 2018a). Generative adversarial network (GAN) models are also routinely used for random denoising (Alwon, 2018) and groundroll attenuation (Si et al., 2020) and are usually implemented in a semisupervised manner. Nevertheless, Ovcharenko and Hou (2020) show that, although a GAN model works well for trace interpolation, a CNN (U-Net architecture) performs better for random noise removal than a GAN (U-GAN architecture). Unsupervised denoising autoencoder (DAE) models are another common method for denoising seismic random noise (Liu et al., 2020a; Saad and Chen, 2020; Birnie et al., 2021; Gao et al., 2021). DAEs are specific types of autoencoder models in which the training data are intentionally corrupted with noise. The corrupted data are then encoded into useful features and decoded to reconstruct the clean data, removing the random noise (Liu et al., 2017b). Saad and Chen (2020) pretrain the DAE with a synthetic data set in a supervised manner, in which the loss function represents the misfit with respect to the true clean data; in other words, the DAE is treated as a normal encoder-decoder network in the first step. They then fine-tune the DAE in an unsupervised manner using the field data sets with a customized loss function that does not require labels. This approach is also referred to as self-supervised training. They show with four synthetic and two field tests that the DAE performs better in the denoising task than the f-x singular spectrum analysis (SSA) (Oropeza and Sacchi, 2011) and f-x deconvolution (deconv) (Canales, 1984) benchmark algorithms. Specifically, the DAE model is better at preserving useful signals during the denoising process than the other two methods. Figure 7 shows a comparison between the performance of the DAE and f-x deconv. Using a similar self-supervised approach, Birnie and Alkhalifah (2022) aim at damping field noise in the data rather than solely the random noise.
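
As an illustration of the residual-learning idea behind DnCNN-style denoisers, the following is a minimal Keras sketch, assuming 64 × 64 single-channel patches of noisy data paired with their known noise (e.g., from synthetic modeling); the depth, filter count, and array names are illustrative assumptions, not a published configuration.

```python
# Minimal DnCNN-style sketch: the network learns to predict the noise
# (residual), which is then subtracted from the noisy input.
import tensorflow as tf
from tensorflow.keras import layers

def build_dncnn(depth=8, filters=32, patch=64):
    inp = layers.Input(shape=(patch, patch, 1))
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(inp)
    for _ in range(depth - 2):
        x = layers.Conv2D(filters, 3, padding="same", use_bias=False)(x)
        x = layers.BatchNormalization()(x)
        x = layers.ReLU()(x)
    noise = layers.Conv2D(1, 3, padding="same")(x)   # output: predicted noise patch
    return tf.keras.Model(inp, noise)

model = build_dncnn()
model.compile(optimizer="adam", loss="mse")

# 'noisy_patches' and 'noise_patches' are hypothetical training arrays:
# model.fit(noisy_patches, noise_patches, epochs=50, batch_size=32)
# denoised = noisy_patches - model.predict(noisy_patches)  # subtract predicted noise
```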

Trace interpolation

Several interpolation methods currently exist that aim at recovering randomly and regularly missing traces, such as the sparse transform method (Duijndam and Schonewille, 1999), the frequency-space filter method (Spitz, 1991), and the rank reduction method (Trickett et al., 2010). Nevertheless, each of these methods is efficient only under certain assumptions, such as linearity, sparsity, or sampling regularity. The supervised ML implementation for data interpolation is much easier compared with other ML-based processing applications because the labeled data can be prepared automatically. Usually, traces are removed from the shot gathers, randomly or regularly and in an automatic manner, to create the input data, and the full shot gathers or patches of the full shot gathers are considered as the output.
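
The following minimal sketch illustrates how such input-output pairs can be generated automatically, assuming a 2D shot gather stored as a NumPy array; the 30% decimation rate and array names are illustrative.

```python
# Automatic creation of interpolation training pairs: zero out random traces
# to form the input, keep the complete gather as the target.
import numpy as np

def decimate_traces(gather, missing_fraction=0.3, rng=np.random.default_rng(0)):
    """gather: 2D array of shape (time samples, traces)."""
    n_traces = gather.shape[1]
    n_missing = int(missing_fraction * n_traces)
    missing = rng.choice(n_traces, size=n_missing, replace=False)
    corrupted = gather.copy()
    corrupted[:, missing] = 0.0          # simulate missing traces
    return corrupted, gather             # (input, target) training pair

# Example with a stand-in gather of 1000 samples by 120 traces
shot = np.random.randn(1000, 120)
x, y = decimate_traces(shot)
```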

CNN is by far the most common model in these applications (Mandelli et al., 2018; Wang et al., 2018a; Wang et al., 2019a; Zhang et al., 2020b). Mandelli et al. (2018) compare the results of a CNN-based interpolator with the benchmark multichannel SSA algorithm (Oropeza and Sacchi, 2011) applied to a field shot-gather data set with 10%, 30%, and 50% missing traces; the signal-to-noise ratio (S/N) of the ML-based results was, on average, more than 70% higher in all three cases. Another common model for seismic data reconstruction is the GAN (Alwon, 2018; Chang et al., 2018; Garg et al., 2019; Ovcharenko and Hou, 2020; Wei et al., 2021a), which is usually implemented in a semisupervised manner. The comparison by Ovcharenko and Hou (2020) of the CNN and GAN models for interpolation suggests a better performance of the GAN in reconstructing weak events in noisy data. Data interpolation is also conducted using support vector regression (SVR) (Jia and Ma, 2017), long short-term memory recurrent neural networks (LSTM-RNN) (Kuijpers et al., 2020; Yeeh et al., 2020), autoencoder (Wang et al., 2020), and transformer (Harsuko and Alkhalifah, 2022) methods. Jia and Ma (2017) combine a data-driven tight frame with the classical SVR algorithm to further improve the performance of the training and enhance the S/N of the reconstructed data. Harsuko and Alkhalifah (2022) create a transformer model that involves pretraining and fine-tuning procedures to process seismic data. They pretrain the model in a self-supervised manner to store useful features of the specific data needed for various processing tasks such as trace first-break picking and denoising. They use the masked language modeling concept from natural language processing to pretrain the ML model; in this context, the seismic sections are treated as sentences and the traces as individual words. The pretrained model can reconstruct the missing traces in addition to extracting useful features for downstream processing tasks (the fine-tuning stage).

Event separation

The ML-based seismic event separation task is focused on P- and S-wave separation, diffraction separation, and deblending (Figure 5). Traditional P- and S-wave separation algorithms require accurate velocity models to perform well at far offsets. In ML-based P- and S-wave separation, multichannel input and output are considered, mainly in the framework of CNN (Xiong et al., 2020) and GAN (Wei et al., 2021b) neural networks. The input channels include the horizontal and vertical components of the data, whereas the output channels correspond to the separated S and P waves.

Traditional diffraction separation methods rely on exploiting the different kinematic properties of the reflection and diffraction waves (e.g., Landa et al., 1987; Bansal and Imhof, 2005; Fomel et al., 2007; Moser and Howard, 2008) and attempt to destroy the reflection data. Nevertheless, the residual noise in the data can have a level similar to the diffraction data, obscuring the diffractions (Decker et al., 2013). In addition, these analytical methods can be computationally very expensive (Lowney et al., 2020). Recently, noticeable attention has been dedicated to the use of ML algorithms for recognizing the diffraction data in seismic gathers. Most applications use a supervised approach in the scheme of CNN (Lowney et al., 2019; Kim et al., 2020; Tschannen et al., 2020; Bauer et al., 2021) to separate the reflection and diffraction data. Semisupervised methods, such as GAN models (Durall et al., 2020; Lowney et al., 2020), are also commonly used for this task. In the training stage of almost all the applications, synthetic data with or without field data are considered, given the difficulty of obtaining precise diffraction data from real data using traditional methods. The methods can be applied at various stages of the data, such as raw, zero-offset, and prestack migrated data.

Traditional deblending involves many steps that should be optimized and that are computationally very expensive. All reviewed ML-based deblending applications consider CNN models trained in a supervised manner (Slang et al., 2019; Nakayama and Blacquière, 2020; Sun et al., 2020; Hou and Messud, 2021), using the raw shot gathers as the input and the deblended results as the output. Given that the synthetic simulation of blended data is complicated, most applications consider only real data for the training stage (Slang et al., 2019; Hou and Hoeber, 2020; Sun et al., 2020; Hou and Messud, 2021; Li et al., 2021b).

Processing

Seismic processing has been automated to a great degree and, consequently, few ML applications are available to address this stage of the seismic exploration workflow. In Figure 5, we show the fraction of ML-based applications focused on deconvolution, migration, and stacking.

Chen et al. (2019) and Lu et al. (2019) use multilayer perceptron (MLP) to estimate the seismic wavelet, whereas Xiao et al. (2020) consider CNN to perform sparse-spike deconvolution. Almost all ML-based migration applications focus on performing least-squares migration (Liu et al., 2020b) and least-squares reverse time migration (Huang and Huang, 2021; Torres and Sacchi, 2021; Vamaraju et al., 2021). Cheng et al. (2020) address the prerequisite of the migration step, and they use the CNN model to find the Fresnel location required for a successful migration.

Stacking is already a fully automated and computationally inexpensive process. Nevertheless, sometimes the number of shots being stacked is not sufficient to significantly increase the S/N and enhance the seismic image. To further increase the stacking capability, Aharchaou et al. (2021) develop a CNN model to find similar small patches of poststack data and stack these patches to enhance the seismic image. From another perspective, stacking can be viewed as a temporal resolution issue, and the seismic images can be enhanced by recovering the high frequencies of the data. Halpert (2018) and Zhang et al. (2019) develop GAN models, Choi et al. (2021) use a CNN model, and Yuan et al. (2021) consider a sequential CNN scheme to recover the high-frequency data. Although the former three consider the poststack migrated data as the input, Yuan et al. (2021) aim to recover the high frequencies of the raw data.

Velocity model building

Frequency extrapolation

Low-frequency data can significantly enhance the performance of FWI, mitigating cycle skipping (Bunks et al., 1995). Recently, significant research has been conducted to recover low-frequency data, which can be categorized into envelope calculation methods (Wu et al., 2014), a phase tracking method (Li and Demanet, 2016), and an exponential damping method (Choi and Alkhalifah, 2015). Still, these approaches do not exploit an intrinsic relationship between the high and low frequencies. ML-based applications have gained remarkable attention for the task of frequency extrapolation because the training data can be automatically generated using low-cut filters applied to full-bandwidth data.
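
A minimal sketch of this training-pair generation is given below, assuming full-bandwidth synthetic traces sampled at 4 ms; the 5 Hz low-cut corner and filter order are illustrative choices, not taken from any specific publication.

```python
# Generate a frequency-extrapolation training pair: a low-cut (high-pass)
# filter produces the band-limited input; the original trace is the target.
import numpy as np
from scipy.signal import butter, filtfilt

def make_training_pair(trace, dt=0.004, low_cut_hz=5.0):
    nyquist = 0.5 / dt
    b, a = butter(4, low_cut_hz / nyquist, btype="highpass")
    band_limited = filtfilt(b, a, trace)      # input: high frequencies only
    return band_limited, trace                # target: full bandwidth

trace = np.random.randn(2001)                 # stand-in for a modeled trace
x, y = make_training_pair(trace)
```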

Most ML-based frequency extrapolation applications consider CNN models (Kazei et al., 2019; Ovcharenko et al., 2019; Fang et al., 2020; Sun and Demanet, 2020), whereas other schemes such as RNN (Fabien-Ouellet, 2020) and physics-guided neural networks (Hu et al., 2020) are rarely used. The existing applications use various innovative representations of the input data to the ML architecture. Ovcharenko et al. (2019) treat frequency extrapolation as an estimation of the spectral values of the target frequencies. They consider the real and imaginary spectral values at 34 discrete frequencies as the input to estimate the spectral values of a single frequency lower than the input frequencies. They perform FWI on the extrapolated data, which shows that the low-frequency elements significantly help in correcting the large-scale errors in the initial model and in the convergence of the inversion. The issue with this approach is that a separate ML model must be trained for each recovered frequency. Sun and Demanet (2020) consider a single trace with only high-frequency elements and with full-band frequency elements as the input and output, respectively, of the CNN scheme. Fang et al. (2020) use 2D patches of the raw data with high-frequency elements as the input and full-frequency-band patches as the output. They test the frequency extrapolation and FWI on the synthetic SEG/EAGE overthrust model and on field data; in both cases, the FWI of the extrapolated data from the CNN model resulted in better continuity of the layers compared with the FWI when only the high-frequency data were used. Ovcharenko et al. (2022), using a similar approach, are able to estimate seismic data at frequencies as low as 2.5 Hz for real marine streamer data and perform FWI.

VMB from raw data

Recently, ML-based VMB from raw data has gained significant attention, with the aim of providing an ML model that can substitute for FWI. The input of these ML models is the raw seismic gather, and the output is the velocity model. Because ground-truth velocity models for real data sets do not exist, all supervised implementations use synthetic data sets for the training stage. In the most popular applications, CNN and MLP schemes are used to model the nonlinear relationship between the input raw data and the target velocity models (Lewis and Vigh, 2017; Araya-Polo et al., 2018; Yang and Ma, 2019; Kazei et al., 2020; Li et al., 2020). In these applications, the parameters of the ML models are updated iteratively by computing the loss between the target and estimated velocity models. Deep GAN algorithms, which can address the limited availability of abundant labeled data, are also applied to VMB tasks (Mosser et al., 2018). Araya-Polo et al. (2019) train a GAN model to generate arbitrary velocity models from a small number of variables and compute the corresponding seismic data using a finite-difference algorithm. A CNN model was then trained to map the relationship between the seismic data and the velocity models, and its application to the test data set shows good accuracy (Figure 8). Yao et al. (2023a) use adversarial neural networks to regularize anisotropic FWI, balancing the increased sensitivity of the inversion to anisotropy and constraining the updates at each iteration. In another notable application, Yao et al. (2023b) use a domain-translation CNN-GAN to translate acoustic data to elastic data during FWI, reducing the cost of simulating the elastic data. FWI is a strongly nonlinear optimization problem. Recently, the use of neural-network-assisted FWI (Sun and Alkhalifah, 2020) and regularized neural network FWI (Wu and McMechan, 2019; Zhang and Alkhalifah, 2022) has shown promising results.
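
The following is a minimal sketch of the supervised gather-to-velocity-model mapping described at the start of this paragraph, assuming shot gathers of 256 time samples × 64 traces and velocity models on a 64 × 64 grid; all layer choices and sizes are illustrative.

```python
# Supervised VMB sketch: a small CNN encoder maps a raw shot gather to a
# gridded velocity model, with an MSE loss between target and estimate.
import tensorflow as tf
from tensorflow.keras import layers

inp = layers.Input(shape=(256, 64, 1))            # (time, offset, channel)
x = layers.Conv2D(16, 3, strides=2, padding="same", activation="relu")(inp)
x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(x)
x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x)
x = layers.Flatten()(x)
x = layers.Dense(64 * 64)(x)
out = layers.Reshape((64, 64))(x)                 # (depth, distance) velocity grid
model = tf.keras.Model(inp, out)
model.compile(optimizer="adam", loss="mse")       # loss between target and estimated models

# 'synthetic_gathers' and 'synthetic_velocity_models' are hypothetical arrays:
# model.fit(synthetic_gathers, synthetic_velocity_models, epochs=100, batch_size=16)
```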

In other innovative approaches, the physics of the seismic wavefield is considered in the ML-based model for VMB (Costa Nogueira Junior et al., 2019; Xu et al., 2019; Jin et al., 2020; Sun et al., 2021). In contrast to model-based schemes, in these physics-based applications, the loss is computed between the simulated and true seismic data, reducing the dependence of the ML model on the training set by enforcing physical constraints on the model-data relationship. Within these types of applications, various ML methods with different terminologies (e.g., physics-guided, theory-guided, and physics-based neural networks) are considered. Sun et al. (2021) compare the performance of the CNN and a physics-guided recurrent neural network architecture and show that the physics-guided model better resolves the boundaries of velocity anomalies such as salt geobodies. Recently, the use of a specific type of physics-based neural network, the physics-informed neural network (PINN) (Raissi et al., 2019), is emerging in the context of VMB (Costa Nogueira Junior et al., 2019; Xu et al., 2019; Jin et al., 2020; Voytan and Sen, 2020; Rasht-Behesht et al., 2022). PINNs can approximate the partial differential equations that govern physical problems. Xu et al. (2019) apply a PINN to synthetic data and show that it can provide a higher accuracy velocity model compared with FWI.

First-break picking

First-break picking, often used for traveltime tomography, is a crucial step of the seismic exploration workflow to image the complex near surface and compute the corresponding statics. Many data-driven semiautomatic approaches have been introduced that consider various features of the raw trace to pick the first arrivals. Nevertheless, these methods often involve a hyperparameter setting based on the geologic properties of the site and the S/N of the data for a successful implementation.

First-break picking is inherently a binary problem. In most ML-based applications, patches of the seismic data, rather than a single trace, are considered as the input, and the output label is a 2D matrix with the same size as the input that represents a segmentation mask separating the noise (the recorded data before the first arrivals) from the data (e.g., Tsai et al., 2018; Xie et al., 2018; Yuan et al., 2019). In other applications, features of a single trace or a set of traces, such as the short-term average (STA), long-term average (LTA), and Fourier transform, are used as the input (e.g., Song et al., 2011; Maity et al., 2014; Mezyk and Malinowski, 2018; Luo and Zhu, 2020). In rare applications, small windows of single traces are considered as the input, and the output label is one if the window contains a first break and zero otherwise (Loginov et al., 2019).
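
For reference, the STA/LTA trace feature mentioned above can be computed as in the following minimal sketch; the window lengths (in samples) are illustrative.

```python
# STA/LTA ratio of a single trace, a classical first-break feature also used
# as input to ML pickers.
import numpy as np

def sta_lta(trace, n_sta=20, n_lta=200, eps=1e-10):
    energy = np.asarray(trace, dtype=float) ** 2
    sta = np.convolve(energy, np.ones(n_sta) / n_sta, mode="same")   # short-term average
    lta = np.convolve(energy, np.ones(n_lta) / n_lta, mode="same")   # long-term average
    return sta / (lta + eps)

# A simple pick is often associated with the maximum of the ratio:
# pick_sample = np.argmax(sta_lta(trace))
```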

Most ML-based first-break applications are implemented in the scheme of CNN models (Hollander et al., 2018; Loginov et al., 2019; Ma et al., 2019, 2020; Yuan et al., 2019; Zhang et al., 2020a). The CNN-based first-break picking models in Wu et al. (2019a) and Luo and Zhu (2020) show superior performance compared with traditional STA/LTA automatic algorithms. Cova et al. (2020) show that CNN can be very effective for first-break picking even in the presence of sharp elevation contrasts but can be challenging for the portion of the data with a low S/N. Other algorithms such as SVR (Yalcinoglu and Stotter, 2018) and LSTM-RNN (Kirschner et al., 2019) are rarely applied to first-break picking.

VMB from groundroll

Surface waves, also known as groundroll, are dominant in land seismic data and contain valuable information about the near surface. The dispersion curves of surface waves are manually picked in spectral domains, such as f-v, f-k, and τ-p, and these dispersion curves are inverted individually or simultaneously to obtain a near-surface S-wave velocity model and, in rare cases, a P-wave velocity model (Socco and Comina, 2017). Nevertheless, for large-scale field data, the manual picking of the dispersion curves can become impractical. In addition, given that dispersion curve inversion is a strongly nonlinear problem, the inversion may require a priori information and calibration of the hyperparameters to converge to the global minimum and lead to a realistic model.

Most ML-based dispersion-picking algorithms focus on the fine tuning of an automatically picked dispersion curve using unsupervised algorithms such as density-based spatial clustering of applications with noise (DBSCAN), K-means, principal component analysis (PCA), or combinations of these algorithms (Masclet et al., 2019; Kaul et al., 2020; Rovetta et al., 2020; Yao et al., 2021). In other more sophisticated ML implementations, the frequency-wavenumber representation of the data is used as the input and the mask representing the dispersion curve is considered as the output for the CNN model (Kaul et al., 2021b; Ren et al., 2021). Some rare but innovative approaches aim at bypassing the dispersion picking step and estimating the S-wave velocity model from the raw surface wave data in the frequency-wavenumber domain (Yablokov and Serdyukov, 2020; Aleardi and Stucchi, 2021).
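
As a minimal sketch of such unsupervised cleaning of automatically picked dispersion points, the following applies DBSCAN to synthetic frequency-velocity picks; the synthetic trend and the eps and min_samples values are illustrative assumptions.

```python
# Flag outliers among automatically picked dispersion points with DBSCAN:
# picks that do not belong to the dense dispersion-curve cluster are dropped.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

freq = np.linspace(5, 50, 200)
velocity = 800.0 - 8.0 * freq + np.random.randn(200) * 10.0   # fake dispersion trend
velocity[::25] += 300.0                                       # a few spurious picks

picks = StandardScaler().fit_transform(np.column_stack([freq, velocity]))
labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(picks)
clean = labels != -1          # keep picks inside the dense cluster, discard outliers
```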

Passive seismic data

With the growth of fiber-optic distributed acoustic sensing (DAS), the acquisition of passive monitoring data has been significantly boosted and has created the necessity to develop fully automatic data-driven signal detection and event location approaches. A comprehensive overview of the current ML-based passive seismic data tasks can be found in Anikiev et al. (2023). Most ML-based passive signal detection methods consider CNN models (Binder and Chakraborty, 2019; Stork et al., 2020; Rajeul, 2021). In these approaches, patches of the seismic data are usually created as the input, and a binary label is considered as the output to define whether the patch contains a microseismic signal or not. Binder and Chakraborty (2019) train a CNN model to detect time windows of signals using a combination of simulated and real patches of DAS data. The application of the trained model to real DAS data showed better results compared with the STA/LTA method. Stork et al. (2020) train the YOLOv3 model to detect signals in DAS data using synthetic samples. The trained model was tested on real DAS data and outperformed the STA/LTA method. Zhang et al. (2020a) show that a continuous wavelet transform CNN performs better than the MLP model, although the latter can be trained faster with lower computational requirements. Alternatively, Qu et al. (2018) consider features of the passive data that are selected by a random forest model as the input to the support vector machine (SVM) algorithm.

Another group of ML-based passive applications is focused on locating the passive events. Most of the ML models for locating microseismic events consider CNN models (Rodriguez, 2021; Wang and Alkhalifah, 2021; Wang et al., 2021). Wang and Alkhalifah (2021) use two CNN models, one for detecting the events and another for locating the seismic event. Using a similar model, Wang et al. (2021) apply the trained model to recorded data during the hydraulic fracturing process of a shale gas play. Their comparison of the results with traditional time-reversal imaging showed faster prediction and similar accuracy. Gu et al. (2019) consider Bayesian CNN and implement a stochastic regularized technique to quantify the uncertainty of the ML-based seismic location estimation.

Seismic attributes are obtained through the mathematical manipulation of seismic data, and they aim to highlight various physical, petrophysical, and geologic properties. Each attribute is usually defined to highlight a specific property in the seismic data. As a result, numerous seismic attributes have been defined over the years to improve various interpretation tasks (Chopra and Marfurt, 2007). Many ML-based methods, such as encoder-decoder architectures, follow a similar strategy. In the training stage of these methods, the encoder is trained to encode the input data (seismic data) into useful, informative, and compact intermediate features, and the decoder is trained to use these intermediate features to predict the results (the interpretation task). If suitable attributes are considered as the input data instead of the processed seismic data, the performance of the ML model can significantly improve and the training time can be reduced. We insist on the use of suitable attributes because: (1) not all attributes are rich with information about the target interpretation task, (2) the use of correlated attributes can significantly bias the ML model, and (3) the use of multiple attributes as input instead of seismic data significantly increases the memory requirements. Attribute selection algorithms reduce the dimensionality of the data space to a set of useful attributes with reduced correlation. Hence, the process of attribute selection plays a crucial role in attribute-based ML interpretation applications.

More than half of the current attribute-based ML seismic interpretation applications suffer from the lack of a criterion for selecting the attributes (Figure 9). PCA is the most common method for dimensionality reduction of the attribute space. Other methods, such as the random forest, probabilistic neural network (PNN), Gaussian mixture model (GMM), and maximal information coefficient, have also shown promising results in reducing the dimensionality of the attribute space without disregarding useful information. Zhao et al. (2015a) provide a comprehensive review and comparison of various unsupervised algorithms for attribute selection.
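
A minimal sketch of PCA-based reduction of an attribute space is shown below, assuming a samples × attributes matrix assembled from several attribute volumes; the number of attributes and the 95% retained-variance threshold are illustrative.

```python
# Reduce a correlated attribute space to a small set of decorrelated
# components before feeding it to an ML interpretation model.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

attributes = np.random.randn(10000, 12)              # stand-in for 12 attribute volumes
scaled = StandardScaler().fit_transform(attributes)  # attributes have very different units

pca = PCA(n_components=0.95)                         # keep 95% of the variance
reduced = pca.fit_transform(scaled)                  # decorrelated, lower-dimensional input
print(pca.n_components_, pca.explained_variance_ratio_)
```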

According to the literature, ML methods are heavily applied to seismic interpretation tasks, mainly due to the similarity of these tasks to computer vision problems, for which ML methods are readily adaptable, and also because of the necessity to automate these tasks. We analyzed the publications within the framework of three main categories, structural interpretation, lithologic interpretation, and petrophysical/rock properties estimation, each including many applications (Figure 10). The ML implementations are mainly focused on fault detection (23%) and lithofacies identification (22%).

The CNN algorithm is commonly used in both seismic processing and interpretation tasks (Figures 6 and 11). However, more diverse algorithms are exploited for seismic interpretation tasks (Figure 11). This is mainly observed in lithofacies classification applications, for which a broad range of supervised and unsupervised algorithms are considered. Nevertheless, CNN remains the most common model for most applications.

Structural interpretations

Structural interpretation is a highly subjective task, which considerably depends on the domain knowledge and experience of the interpreters. In this section, we focus on ML implementations for three main structural interpretation applications: fault detection, salt and geobody identification, and horizon picking.

Fault detection

Almost all ML-based implementations for fault detection are supervised, and only rarely are they semisupervised. Supervised CNN and MLP are the popular techniques for ML-based fault detection (Araya-Polo et al., 2017; Huang et al., 2017; Ma et al., 2018; Maniar et al., 2018; Wang et al., 2018c; Xiong et al., 2018; Zhao and Mukhopadhyay, 2018; Wu et al., 2019c; Zheng et al., 2019; Yang et al., 2020). Another common method for fault detection is SVM (Di et al., 2017, 2019; Guitton et al., 2017; Du et al., 2019). In other, less common applications, the GAN algorithm is used (Lu et al., 2018; Durall et al., 2021). Uncertainty quantification is an important task in interpretation applications such as fault detection. The uncertainty can be divided into aleatoric and epistemic uncertainties. The former refers to the uncertainty related to the randomness of the data, which is significantly highlighted when the developed model is applied to unseen data. The epistemic uncertainty captures the uncertainty related to the standard deviation of the network parameters. Unlike the aleatoric uncertainty, the epistemic one can be reduced by increasing the number of training samples. Bayesian CNN is a probabilistic ML model based on Bayesian principles that allows the quantification of both uncertainties. Feng et al. (2021) and Mosser et al. (2020) show the application of Bayesian CNN and the quantification of the uncertainties for fault detection.

Fault detection is always treated as a binary problem, in which the output is either fault or no fault. The input patches of stacked seismic data are introduced to the ML model in various manners. Maniar et al. (2018), Lu et al. (2018), and Zhang et al. (2014) consider 2D seismic data (patches) as the input, whereas Huang et al. (2017), Wu et al. (2019c), and Yang et al. (2020) use 3D (small cubes) input data. Alternatively, Ma et al. (2018) and Xiong et al. (2018) define the input as three-channel seismic sections in the x, y, and time directions, which significantly reduces the memory demand of the training process compared with the use of 3D cubes as the input. Xiong et al. (2018) consider small slices in the x, y, and time directions, each 24 × 24, within a CNN architecture to extract features, followed by two layers of a conventional neural network for classification. The application of an ML model to a big data set can still be time consuming. In the experiment of Xiong et al. (2018), the trained model was able to predict the fault probabilities of a 1000 × 655 × 1083 field seismic data set within 2.5 h using a 20-node computer cluster, with each node equipped with 20 CPU cores. The results showed higher resolution compared with the seismic coherence attribute, especially in detecting channels (Figure 12).
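
A minimal Keras sketch of a fault/no-fault classifier in the spirit of the three-channel, 24 × 24 slice input described above is given below; the layer sizes are illustrative and do not reproduce any published architecture.

```python
# Binary fault classifier: each sample is a 24 x 24 x 3 array holding the
# inline, crossline, and time slices centered on the voxel of interest.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(24, 24, 3)),
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),        # probability of a fault at the voxel
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```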

In approximately 15% of the ML applications for fault detection, seismic attributes with or without stacked seismic data are considered as the input. Some of these applications manually select the relevant attributes for the ML implementation. Huang et al. (2017) consider nine attributes that are known to carry useful information about the faults (e.g., fault likelihood attributes and the curvature shape index) and prepare the input data as nine-channel patches of 2D sections. Di et al. (2017) define 19 attributes of each seismic pixel as the input to the neural network. In other attribute-based fault detection methods, the attributes are selected according to a criterion that eliminates the correlation between them, which also reduces the data space and the computational requirements. Jiang and Norlund (2020) consider the random forest model to rank the importance of 30 seismic attributes in building a fault probability map, and among them, four attributes were selected for the CNN model training. Small 3D cubes of the attributes were assembled to create the multichannel input data for fault detection.

Horizon picking

Similar to many traditional horizon-picking algorithms, ML-based methods also consider seed points to track the horizons (Peters et al., 2019; Shi et al., 2020; Ferdinand Fernandez et al., 2021). In these applications, the target of the ML implementation is not to obtain a global model (GM) that can be applied to unseen data. Instead, reliable seed points (labels) are associated with the input traces and used as the training data. The rest of the data are then fed into the ML model to predict the horizon. In some applications (e.g., Wu and Zhang, 2019; Guillon et al., 2020), the training data are segmented to obtain the seed points for various horizons. Most ML-based horizon-picking applications use CNN models (Gramstad and Nickel, 2018; Wu and Zhang, 2019; Guillon et al., 2020). To quantify the probability of the horizon associated with the predictions, Siahkoohi et al. (2020) consider a Bayesian CNN. In an innovative application, Shi et al. (2020) consider an unsupervised autoencoder model to encode cropped short waveforms into a latent space. The waveform patches whose feature vectors are similar to those of the known seed points are then identified as the horizons.

Salt and geobody identification

In ML applications, the geobody detection problem is viewed either as a horizon-picking problem (Gramstad and Nickel, 2018; Kaul et al., 2021a) or as a segmentation problem (Waldeland and Solberg, 2017; Shi and Wu, 2019; Di and AlRegib, 2020). In the former applications, the principles explained in the previous section are used to pick the top and bottom horizons of the salt body. In contrast, the segmentation approach aims at classifying each pixel of the seismic data and/or seismic attributes into salt or no-salt categories. Most applications for geobody (salt) identification consider CNN networks (Gramstad and Nickel, 2018; Wang et al., 2018b). Di and AlRegib (2020) compare the efficiency of a CNN model with MLP in predicting salt bodies, and they conclude that the CNN model is much more efficient and can provide reliable results even without seismic attributes. Waldeland and Solberg (2017) consider small cubes of stacked seismic data (65 × 65 × 65) as the input. They consider three convolutional layers and an average pooling layer to extract 40 features (attributes) from the cube, followed by a set of conventional fully connected layers for classification (Figure 13a). In contrast to common neural network applications that consider the rectified linear unit (ReLU) operator to include nonlinearity in the problem, they use the exponential linear unit (ELU) operator, which can, in certain conditions, speed up the learning stage. In this survey, they train the model on a single section of the Norwegian continental shelf data set and use it to label the rest of the data. Figures 13b and 13c show the training section and an example of the delineated salt from the test set, respectively. To account for epistemic and aleatoric uncertainties, Mukhopadhyay and Mallick (2019) and Zhao and Chen (2020) consider the Bayesian CNN algorithm to identify the salt.
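
The following minimal sketch illustrates the cube-classification idea: 3D convolutions with ELU activations reduce a 65 × 65 × 65 amplitude cube to a compact feature vector that feeds fully connected layers for the salt/no-salt decision; the filter counts and strides are illustrative, not the published configuration.

```python
# Cube classifier sketch: 3D convolutions with ELU activations, global
# average pooling to ~40 learned features, then a small classifier head.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(65, 65, 65, 1)),
    layers.Conv3D(8, 5, strides=2, activation="elu"),
    layers.Conv3D(16, 3, strides=2, activation="elu"),
    layers.Conv3D(40, 3, strides=2, activation="elu"),
    layers.GlobalAveragePooling3D(),               # compact feature vector per cube
    layers.Dense(20, activation="elu"),
    layers.Dense(2, activation="softmax"),         # salt vs. no salt
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```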

Lithologic interpretations

In the following, we investigate in detail the ML-based lithofacies classification and stratigraphic sequence identification applications.

Lithofacies classification

Manual lithofacies interpreters are usually experts in detecting useful features that others cannot identify. These experts take advantage of the stacked seismic section and various attributes to identify the lithofacies. Still, the manual approach can be very expensive and time consuming when large data sets are considered. Alternatively, many unsupervised, supervised, and semisupervised ML-based algorithms have been introduced for lithofacies classification (Figure 14).

The use of attributes for ML-based lithofacies classification is very common. Almost 66% of these applications consider seismic attributes with or without seismic amplitude as the input. Thirty-eight percent of the attribute-based applications consider the already available attributes or manually select the appropriate attributes for the lithofacies classification (Zhao et al., 2015b; Sacrey and Roden, 2018a). The rest of the applications consider a criterion such as PCA (Roden et al., 2015; Abd-Elfattah and Fahmy, 2017; Sacrey and Roden, 2018b; Ha et al., 2021; Hussein et al., 2021), wrapper analysis (Kim et al., 2019), PNN (Lubo-Robles et al., 2019), genetic algorithm (Kuroda et al., 2016), maximal information coefficient (Liu et al., 2019b), step-wise regression method (Keynejad et al., 2020), and GMM (Qi et al., 2020) to select the most suitable attributes.

The ML-based models for seismic facies classification are usually trained specifically for a single data set because the lithology can differ greatly between data sets. The CNN algorithm is the most common supervised method (26% of the total) for lithofacies classification (Alaudah et al., 2019; Liu et al., 2019a; Pires de Lima et al., 2019), and it is sometimes also implemented in a probabilistic manner using Bayesian inference to obtain the uncertainties associated with the estimated facies (Mosser et al., 2019; Mukhopadhyay and Mallick, 2019; Xie et al., 2021). Zhang et al. (2021) compare the conventional CNN with the U-Net and DeepLabv3+ encoder-decoder architectures, which shows that the encoder-decoder architectures provide more consistent results and, among them, DeepLabv3+ is more accurate (Figure 15). In addition, DeepLabv3+ uses pointwise separable convolution instead of 2D convolution, which is computationally more efficient. The models were trained assuming nine possible classes according to the characteristics of the Netherlands F3 block data set.

Similar to the encoder-decoder architecture of the U-Net, Alaudah et al. (2019) develop an open-source model that considers patches of the stacked seismic data as the input and provides the facies as the output (with the same size as the input). Salvaris et al. (2020) develop ML-based seismic classification algorithms (so-called DeepSeismic) based on multiple architectures (U-Net, SEResnet, and HRNet) that are available online. Mosser et al. (2019) consider less than 1% of the Dutch NLOG data set as the labeled training set in a Bayesian CNN framework but are able to estimate the facies and associated uncertainty for the rest of the data set with good approximation. In other supervised applications for facies classification, SVM (Zhao et al., 2015a), MLP (Kuroda et al., 2016; Liu et al., 2017a), RNN (Lei et al., 2019; Grana et al., 2020), PNN (Abd-Elfattah and Fahmy, 2017; Lubo-Robles et al., 2019), and random forest (Kuhn et al., 2018; Zhang et al., 2018b) are considered. Zhao et al. (2015a) compare the application of SVM and artificial neural networks (ANN), which suggests a more accurate classification of the facies by SVM but with a much higher computational cost.

Unlike other seismic applications, ML-based lithofacies classifications are commonly implemented in an unsupervised manner (33%; Figure 14). Among them, the self-organizing map (SOM) (Roden et al., 2015; Chopra and Marfurt, 2018), K-means (Qian et al., 2018), and generative topographic map (GTM) (Qi et al., 2020) are commonly used. SOM is a data visualization technique that reduces a high-dimensional data space and classifies similar patterns in the data. The output of the SOM is defined as a 2D mesh (e.g., 8 × 8) in which each node stands for a cluster, and the clusters are correlated through a color scale. The SOM algorithm is a very powerful tool for lithofacies classification, and the model can usually be used for various data sets. Nevertheless, the output mesh size sometimes should be calibrated to suit the lithology of the site. Similar to SOM, GTM assumes that the attributes can be expressed by an N-dimensional Gaussian distribution, where N is the number of attributes for each sample. The algorithm transforms the original plane of the data into a 2D plane that best reflects the N-dimensional space. In an innovative approach, Qian et al. (2018) use an unsupervised autoencoder neural network to extract features of the data in a latent space, and they then use the unsupervised K-means algorithm to cluster the features from the autoencoder. Zhao et al. (2015a) compare the performance of four unsupervised algorithms (PCA, K-means, SOM, and GTM) and two supervised ones (ANN and SVM) for the estimation of the lithofacies. They conclude that K-means is the simplest and easiest ML-based algorithm to apply for noncomplex geology; SOM provides a color map that is much more interpreter friendly than K-means; GTM is complicated because it relies on a probability distribution, which is usually not accessible; and the supervised SVM and ANN algorithms require much more expertise to optimize the computational cost and perform better when constrained by prior results from unsupervised algorithms.
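
As a minimal sketch of the unsupervised workflow, the following clusters attribute vectors into facies with K-means; the number of clusters and the attribute matrix are illustrative placeholders.

```python
# Unsupervised facies clustering: one attribute vector per seismic sample,
# standardized and clustered with K-means.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

attribute_vectors = np.random.randn(50000, 6)        # stand-in: one row per seismic sample
scaled = StandardScaler().fit_transform(attribute_vectors)

facies = KMeans(n_clusters=6, n_init=10, random_state=0).fit_predict(scaled)
# 'facies' can be reshaped back to the survey geometry and displayed as a facies map
```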

Semisupervised facies classification schemes are rarely implemented, and most of them are based on GAN (Liu et al., 2019a; Kim and Byun, 2020; Singh et al., 2021). The comparison by Singh et al. (2021) of the performance of GAN and CNN suggests that, when labeled data are abundant, the CNN model provides more accurate results, whereas the GAN is preferable and more accurate when limited labeled data are available.

Stratigraphic sequence identification

Similar to ML-based lithofacies classification, ML-based stratigraphic sequence identification is usually implemented on a single data set, using a portion of the same data to train the model. Most ML-based stratigraphic sequence estimations use CNN models (Huot et al., 2019; Li et al., 2019; Di et al., 2020). Di et al. (2020) consider an ML network consisting of an unsupervised autoencoder and a supervised CNN network. The autoencoder extracts many features from the data that are the input to the supervised CNN model. They consider three scenarios for creating the output labels: (1) a 1D stratigraphy profile, (2) a 2D patch of the stratigraphy with the same size as the input, and (3) paint-brush labels that highlight the target seismic sequences. The paint-brushing approach gives the interpreter flexibility in annotating any zone of interest in a seismic data set. In other supervised implementations, Li et al. (2018b) and Kuroda et al. (2016) consider RNN and MLP algorithms, respectively. Among the unsupervised algorithms, DBSCAN (Corlay et al., 2020) and SOM (Laudon et al., 2019) are also used for stratigraphic sequence identification. Bugge et al. (2019) compute an attribute vector for each small cube of seismic data and use the DBSCAN algorithm to cluster them into stratigraphic sequences.

Petrophysics, rock physics, and inversion

We divide the ML-based inversion tasks into petrophysical and rock properties estimates, impedance estimates, and 4D data interpretations.

Petrophysical and rock properties

ML algorithms are used for various petrophysical and rock properties estimations (Figure 16). The ML applications mainly focus on the estimation of porosity (Kuroda et al., 2016; Yenwongfai et al., 2019; Feng et al., 2020), density (Alfarraj and AlRegib, 2018; Priezzhev and Stanislav, 2018; Biswas et al., 2019), brittleness (Zhao et al., 2015b; Mlella et al., 2020), VP/VS (Mosser et al., 2020; Li et al., 2021a), and Vshale/Vclay (Muradov and Shahtakhtinskiy, 2017). In many of the implementations, the same ML architecture is separately trained for the prediction of various petrophysical and rock properties (e.g., Das and Mukerji, 2020; Zhang et al., 2020c). In ML-based applications, the processed seismic data or the corresponding attributes are usually considered as the input, and well-log data are used as the desirable targets. Similar to lithofacies identification, most applications take advantage of manually or experimentally selected seismic attributes (Zhao et al., 2015b; Iturrarán-Viveros et al., 2018; Mlella et al., 2020). Alternatively, some publications select the attributes from an attribute space using various methods such as the genetic algorithm (Kuroda et al., 2016), multilinear regression (Feng et al., 2020), gradient boosting (Roy et al., 2020), and PCA (Abd-Elfattah and Fahmy, 2017).
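A minimal sketch of one such attribute-reduction-plus-regression workflow is given below; the attribute matrices, porosity targets, and the specific choice of PCA followed by gradient boosting are illustrative assumptions rather than a reproduction of any cited study:

```python
# Minimal sketch (hypothetical data): reduce a large seismic-attribute space with
# PCA and regress porosity at well locations, then predict away from the wells.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

attrs_at_wells = np.random.rand(300, 40)       # placeholder: 40 attributes at 300 well samples
porosity_logs = np.random.rand(300) * 0.3      # placeholder porosity targets from logs

model = make_pipeline(PCA(n_components=8),     # attribute-space reduction
                      GradientBoostingRegressor())
scores = cross_val_score(model, attrs_at_wells, porosity_logs, cv=5, scoring="r2")
model.fit(attrs_at_wells, porosity_logs)

attrs_full_volume = np.random.rand(10000, 40)  # placeholder attributes away from wells
porosity_pred = model.predict(attrs_full_volume)
```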

Most applications consider the CNN model to estimate the properties (Das and Mukerji, 2020; Downton et al., 2020; Jaglan et al., 2021). Unlike other CNN applications that are implemented in a supervised manner, Feng et al. (2020) develop an unsupervised CNN scheme in which the model aims at estimating a high-resolution porosity to be added to the low-frequency a priori porosity. The output porosity is used in the framework of convolutional 1D modeling to simulate the seismic data, and the loss is automatically computed between the real and synthetic seismic data. Choi et al. (2020) and Mosser et al. (2020) consider a Bayesian CNN that allows the quantification of the uncertainty for the estimation of density and VP/VS, respectively. Several examples of SVR (Jiang et al., 2020), MLP (Muradov and Shahtakhtinskiy, 2017), and PNN (Malik, 2019; Mohamed et al., 2020) for petrophysical and rock properties estimations also exist. Zhao et al. (2015b) consider five manually selected attributes to estimate the brittleness index in the framework of proximal support vector regression. Jiang et al. (2020) compare the performance of random forest, MLP, SVR, and PNN in predicting the porosity of synthetic data, showing that the first three methods provide higher accuracy than the PNN. In another experiment on real data, Ore and Gao (2021) compare the performance of MLP, SVR, and gradient boosting in estimating the brittleness; gradient boosting provides superior results compared with the other two.

Impedance and elastic parameters

The ML-based applications are more focused on acoustic impedance estimation (71%) than on elastic impedance estimation. Some of the ML-based applications focused on elastic properties aim at obtaining the elastic impedance as defined by Connolly (1999) (e.g., Alfarraj and AlRegib, 2019a), whereas the rest aim at estimating the elastic parameters, that is, the S-wave velocity, P-wave velocity, and density (e.g., Biswas et al., 2019; Choi et al., 2020).

Almost all ML-based impedance estimation methods consider pre- or poststack seismic data as the input rather than seismic attributes. The ML-based impedance estimation is usually coupled with petrophysical and rock properties estimations, either in a single ML network (e.g., Das and Mukerji, 2020) or in separate ones (e.g., Downton et al., 2020). The impedance inversion is usually implemented in the framework of CNN algorithms (Biswas et al., 2019; Das and Mukerji, 2020). Das and Mukerji (2020) consider cascaded CNNs with two CNN networks in a Bayesian framework. The first network takes the seismic data as the input and provides the acoustic impedance and VP/VS. The outputs of the first network are then used as inputs to the second CNN model that estimates porosity and shale volume. Biswas et al. (2019) compare the performance of a conventional CNN and a physics-guided CNN in estimating the elastic parameters. In both cases, they use seismic angle-stack gathers as the input to the network and the S- and P-wave velocities and density as the three channels of the output. In contrast to the conventional CNN that requires labeled data, the physics-guided framework is implemented in an unsupervised manner. The outputs of the model are automatically used to generate synthetic convolutional seismic data to be compared with the input data, based on which the update of the weights is performed. The comparison between the performance of the models on a synthetic data set shows similar accuracy for the S- and P-wave velocities but higher accuracy of the conventional CNN in predicting the density. Figure 17 shows an example of the comparison between the results of the two models and the difference compared with the true model in one of the sections from the 3D data set.
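The self-supervised loop can be sketched as follows: the predicted impedance is converted to reflectivity, convolved with a wavelet, and compared with the observed trace, so that no impedance labels are needed. The wavelet, array shapes, and padding choices below are assumptions for illustration, not the implementation of Biswas et al. (2019):

```python
# Minimal sketch (assumptions): differentiable 1D convolutional forward modeling
# so that a network's predicted acoustic impedance can be compared directly with
# the observed seismic trace (unsupervised, physics-guided training).
import torch
import torch.nn.functional as F

def ricker(f_peak, dt, n):
    """Ricker wavelet on n samples (assumed source wavelet)."""
    t = (torch.arange(n, dtype=torch.float32) - n // 2) * dt
    a = (torch.pi * f_peak * t) ** 2
    return (1.0 - 2.0 * a) * torch.exp(-a)

def physics_guided_loss(pred_impedance, observed_trace, wavelet):
    """Misfit between traces synthesized from the predicted impedance and the
    observed traces; both trace arguments are (batch, n_samples) tensors."""
    z = pred_impedance
    # normal-incidence reflectivity from impedance contrasts
    refl = (z[:, 1:] - z[:, :-1]) / (z[:, 1:] + z[:, :-1])
    # 1D convolutional forward modeling: reflectivity convolved with the wavelet
    synth = F.conv1d(refl.unsqueeze(1),
                     wavelet.flip(0).view(1, 1, -1),
                     padding=wavelet.numel() // 2).squeeze(1)
    n = min(synth.shape[-1], observed_trace.shape[-1])
    return F.mse_loss(synth[:, :n], observed_trace[:, :n])

# Hypothetical usage inside a training loop:
#   loss = physics_guided_loss(net(seismic), seismic, ricker(25.0, 0.002, 81))
#   loss.backward(); optimizer.step()
```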

In other CNN-based works, Das et al. (2019) and Choi et al. (2020) use Bayesian CNNs to quantify the uncertainty of the acoustic impedance and elastic properties estimations, respectively. Alfarraj and AlRegib (2019b) use a semisupervised CNN-RNN network that considers the seismic data and acoustic impedance as the input and output time series. The conditional GAN (cGAN) model has also been shown to be very effective in acoustic impedance estimation (Wang et al., 2019b; Cai et al., 2020). Synthetic tests of Cai et al. (2020) suggest that a cGAN with the Wasserstein loss function and a gradient penalty performs better than conventional cGANs.
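For reference, the gradient-penalty term used in Wasserstein GAN training can be written as in the generic sketch below; this is the standard WGAN-GP formulation, not the specific architecture or loss weights of Cai et al. (2020):

```python
# Minimal sketch: generic WGAN-GP term, penalizing the critic when the norm of
# its gradient on random real/fake interpolates deviates from 1.
import torch

def gradient_penalty(critic, real, fake, lam=10.0):
    batch = real.shape[0]
    # one interpolation coefficient per sample, broadcast over remaining dims
    eps = torch.rand(batch, *([1] * (real.dim() - 1)), device=real.device)
    interp = (eps * real + (1.0 - eps) * fake).requires_grad_(True)
    score = critic(interp)
    grads = torch.autograd.grad(outputs=score.sum(), inputs=interp,
                                create_graph=True)[0]
    grad_norm = grads.reshape(batch, -1).norm(2, dim=1)
    return lam * ((grad_norm - 1.0) ** 2).mean()

# In the critic update, this term is added to the Wasserstein loss, e.g.:
#   d_loss = critic(fake).mean() - critic(real).mean() + gradient_penalty(critic, real, fake)
```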

4D data

Some of the ML-based applications for processing the 4D data aim at estimating the pressure, water content, and gas content changes within the same ML architecture (Dramsch et al., 2019; Côrte et al., 2020; Alali et al., 2022). Other applications, such as the one in Xue et al. (2019), focus on the mapping of the water content changes only. Kaur et al. (2020) use the GAN model to monitor the CO2 saturation from the 4D data in the framework of carbon storage in the reservoir. Most of the ML-based 4D data processing applications consider deep neural networks (Côrte et al., 2020) and CNNs (Weinzierl and Wiese, 2020). Xue et al. (2018) consider 4D attributes, porosity, net-to-gross, and the water content baseline as the input to predict the water content changes. Their comparison between the performance of MLP and many other ML algorithms, such as random forest, decision tree, and SVR, suggests that the MLP and random forest provide the highest accuracy. In traditional FWI of time-lapse data, it is important to incorporate the well data into the inversion. Li et al. (2021c) develop an MLP-assisted regularization technique to enhance the resolution of FWI and increase its accuracy. Babalola (2019) considers a mixture density neural network to estimate the changes in water content and pressure.

We reviewed the implementation of ML algorithms for various seismic processing and interpretation tasks. To evaluate the efficiency, applicability, and effectiveness of the current ML implementations for each seismic task, we define certain indices based on parameters that can be extracted from the statistical data collected from the published material. We consider the data preparation simplicity (DPS) and the computational power requirements as indicators of the efficiency of ML-based applications. We also consider the fraction of publications with a real data test (RDT) and the fraction of publications providing a GM for any unseen data as indicators of applicability. Finally, we consider the diversity index (DI) as an indicator of effectiveness. In the following, we describe these indices in detail. To define the indices, we used only information that was available in the publications. This created a significant constraint on the analysis because some parameters that would have been ideal indices for efficiency, applicability, and effectiveness had to be ignored due to the lack of such information in the publications. For example, accuracy is an important index to evaluate the applicability of a proposed ML implementation. Nevertheless, many of the publications included only a qualitative analysis of accuracy, and the ones that performed a quantitative analysis used specific metrics, making it impossible to draw statistical information.

In general, depending on the application, raw data, prestack/poststack migrated data, and seismic attributes are used as the input. The migrated data can be simulated using the 1D convolutional model. As a result, we assign a DPS index value of three (3) (easiest) to applications using migrated data, which are easy to synthesize. Raw data are usually available in abundance, but synthetic simulation (i.e., finite difference and finite elements) of the raw data requires significant computational power. As a result, we assign a DPS index value of two (2) (medium) to the applications requiring raw (prestack) data. Attributes can be computed from the synthetic stacked section/cube. Nevertheless, as was mentioned, numerous attributes have been defined, and usually a criterion should be considered to reduce the attribute space. Consequently, we assign a DPS index value of one (1) (hardest) to the attribute-based applications that require an extra step for selecting the suitable attributes. Given n publications for each ML-based processing and interpretation task, we compute the average DPS index as
$$\overline{\mathrm{DPS}} = \frac{1}{n}\sum_{i=1}^{n}\mathrm{DPS}_i, \tag{1}$$
where $\mathrm{DPS}_i$ is the value assigned to the $i$th publication.
The approximation of the computational power requirement is very challenging because very limited information is disclosed in publications. An important indicator of the computational power requirement of an ML model is the number of trainable parameters. Nevertheless, the publications were not always forthcoming about the architecture of the models, and as a result, many publications lacked information on the number of trainable parameters. Instead, statistical data regarding the number of training samples and the dimensions of the input and output data were available. In general, these parameters correlate well with the number of trainable parameters and the computational requirements. We define the computational power efficiency (CPE) using these parameters as
(2)
where $\mathrm{dim}_{\mathrm{input},i}$ and $\mathrm{dim}_{\mathrm{output},i}$ are the dimensions (number of pixels) of the input and output data used in the ML model, respectively, and $N_i$ is the number of training samples considered in the application. Of course, the defined CPE, which is based on the metadata extractable from the publications, does not fully reflect the computational requirement of the applications, but it is an acceptable proxy for it.
We consider two indices to measure the applicability of the ML-based applications. For the first one, we compute the fraction of the publications that considered an RDT to evaluate the ML model. For the second one, we consider the GM ratio as the fraction of publications that aim at providing a GM capable of processing any unseen data set, as opposed to models that are trained separately for each real data set using a portion of that data as the training data:
$$\mathrm{RDT} = \frac{n_{\mathrm{RDT}}}{n}, \qquad \mathrm{GM} = \frac{n_{\mathrm{GM}}}{n}, \tag{3}$$
where $n_{\mathrm{RDT}}$ and $n_{\mathrm{GM}}$ are the numbers of publications that include an RDT and that provide a GM, respectively.
We take into consideration the DI of the implemented ML algorithms for a single seismic application as an indicator of its effectiveness. We consider the Simpson DI (Simpson, 1949) given by
$$\mathrm{DI} = 1 - \frac{\sum_{i} m_i\left(m_i - 1\right)}{n\left(n - 1\right)}, \tag{4}$$
where $i$ is the index of each algorithm used for the application (e.g., CNN and GAN), $m_i$ is the number of times (publications) that this algorithm is used, and $n$ is the total number of the considered publications for the seismic task. We want to stress that the DI generally must be analyzed together with accuracy measurements to provide a full view of the effectiveness. Nevertheless, the statistical data from the publications were not sufficient to be realistically used in this analysis.
We normalize all indices between zero and one using a range normalization while discriminating between the processing and interpretation tasks. For example, we normalize the DI of a processing task ($\mathrm{DI}_i$) as
$$\mathrm{DI}_{\mathrm{norm},i} = \frac{\mathrm{DI}_i - \min_j \mathrm{DI}_j}{\max_j \mathrm{DI}_j - \min_j \mathrm{DI}_j}, \tag{5}$$
where $j$ runs over all processing tasks.
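As a simple illustration of how the DI and its normalization are computed, the sketch below uses hypothetical per-task publication counts (not the values of our database) and assumes the Simpson index in the complement form of equation 4:

```python
# Minimal sketch (hypothetical counts): Simpson diversity index per task and
# range normalization across tasks of the same group (equations 4 and 5).
from collections import Counter

def simpson_di(counts):
    """Simpson (1949) diversity index from a mapping algorithm -> publication count."""
    n = sum(counts.values())
    return 1.0 - sum(m * (m - 1) for m in counts.values()) / (n * (n - 1))

# placeholder algorithm counts for three processing tasks
tasks = {
    "denoising": Counter({"CNN": 18, "GAN": 6, "SVM": 2}),
    "first_break_picking": Counter({"CNN": 10, "RNN": 3, "MLP": 3}),
    "quality_control": Counter({"CNN": 4, "RF": 4, "SVM": 3, "MLP": 3}),
}

di = {task: simpson_di(c) for task, c in tasks.items()}
lo, hi = min(di.values()), max(di.values())
di_norm = {task: (v - lo) / (hi - lo) for task, v in di.items()}  # range normalization
```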

In Tables 1 and 2, we report the computed indices for the processing and interpretation applications, respectively. Attributes or features are rarely used for seismic processing applications. For trace interpolation, frequency extrapolation, VMB from raw data, and first-break picking, only the raw data are considered. As a result, these applications have the lowest DPS index. In contrast, ML-based denoising applications are regularly applied to raw and migrated seismic data, which leads to the highest DPS among processing applications. Nevertheless, denoising requires high computational power (low CPE index). VMB from raw data is a very interesting application that bypasses many of the traditional processing steps to directly provide elastic properties from the raw data. Nevertheless, it is still at the theoretical stage, as most applications only consider synthetic tests (low RDT). QC is one of the most promising applications, with high RDT, GM, and DI indices, but it requires high computational power during the training stage.

Among the interpretation applications, ML-based horizon picking, impedance inversion, and fault detection mostly consider a seismic stacked section/cube, which results in a higher DPS compared with other ML-based interpretation applications, such as lithofacies classification or petrophysical and rock properties estimation, that rely on seismic attributes. Nevertheless, horizon picking and fault detection, with their low CPE, are computationally demanding ML-based interpretation tasks. In contrast, ML-based petrophysical and rock properties estimation, despite its low DPS, ranks very high on the other indicators, making it one of the most promising ML-based interpretation applications. However, it should be noted that the low CPE of these applications is also caused by the limited available log data (labeled data) for petrophysical and rock properties, and further evaluation of the accuracy is required for a more comprehensive analysis. In contrast, the low DI of fault detection is mainly due to the consolidation of the CNN model for fault detection, which has been evaluated by numerous quantitative and qualitative measurements. In addition, many CNN-based open-source models for fault detection are available and can be tested on unseen data.

ML algorithms are actively applied to almost all stages of seismic processing and interpretation. The current state of ML implementations shows significant achievements in the automation of individual processing and interpretation tasks, sometimes resulting in even better outcomes than classical methods. Except for a few attempts that aim at bypassing intermediate processes of the traditional seismic exploration workflow, most ML-based applications focus on improving the efficiency and effectiveness of individual processing and interpretation tasks aligned with the traditional exploration workflow. At this stage, ML-based seismic exploration has not yet reached its holy grail, whose ultimate target is to provide the raw data to the algorithm and obtain the subsurface models and petro/rock physical properties. Nevertheless, the evolution of ML implementations in other sectors, such as autonomous driving and natural language processing, has shown that the development of individual ML-based tasks is an essential step toward reaching end-to-end comprehensive ML models. We believe that the research on ML-based seismic exploration is only at the early stage of its development. Reaching the holy grail requires further research and, more importantly, abundant labeled data that could only be achieved through an open-access data campaign. The research on ML-based seismic exploration has increased exponentially in the past few decades and is expected to expand even more in the upcoming years, with more focus on developing comprehensive models, getting closer and closer to the holy grail.

We thank T. Alkhalifah and two anonymous reviewers for their useful suggestions. This study was carried out within the FAIR — Future Artificial Intelligence Research and received funding from the European Union Next-Generation EU (Piano Nazionale Di Ripresa e Resilienza [PNRR] — Missione 4 Componente 2, Investimento 1.3 — D.D. 1555 11/10/2022, PE00000013). This paper reflects only the authors’ views and opinions; neither the European Union nor the European Commission can be considered responsible for them. The authors also acknowledge the support from the SmartData@PoliTO Center for Big Data and Machine Learning.

The literature database, including all extracted meta-data for publications, is available at https://github.com/GeoPolitec/ML-based-seismic-exploration.git.
