## ABSTRACT

The quantitative reconstruction of subsurface earth properties from the propagation of waves follows an iterative minimization of a misfit functional. In marine seismic exploration, the observed data usually consist of measurements of the pressure field, but dual-sensor devices also provide the normal velocity. Consequently, a reciprocity-based misfit functional is specifically designed, and it defines the full reciprocity-gap waveform inversion (FRgWI) method. This misfit functional provides additional features compared to the more traditional least-squares approaches, in particular, in that the observational and computational acquisitions can be different. Therefore, the positions and wavelets of the sources from which the measurements are acquired are not needed in the reconstruction procedure and, in fact, the numerical acquisition (for the simulations) can be chosen arbitrarily. Based on 3D experiments, FRgWI is shown to behave better than full-waveform inversion in the same context. It allows for arbitrary numerical acquisitions in two ways: when few measurements are given, a dense numerical acquisition (compared to the observational one) can be used to compensate. However, with a dense observational acquisition, a sparse computational one is shown to be sufficient, for instance, with multiple-point sources, hence reducing the numerical cost. FRgWI displays accurate reconstructions in both situations and appears more robust with respect to crosstalk than least-squares shot stacking.

## INTRODUCTION

The full-waveform inversion (FWI) method has been developed extensively in the past few decades for quantitative recovery of subsurface earth media in seismic exploration. The concept of FWI is to minimize, with respect to the earth parameters, a misfit criterion defined from the simulations of wave propagation and the measured seismograms (i.e., the “full waveform”). The method was introduced originally by Bamberger et al. (1977, 1979) for the 1D wave equation, and was extended by Lailly (1983) and Tarantola (1984, 1987b). The method first was used with time-domain wave propagation, and the frequency-domain formulation of FWI, which requires a Fourier transform of the original time-dependent seismic traces, was established by Pratt et al. (1996, 1998).

In marine seismic, the data usually consist of measurements of the pressure field from hydrophones, but new devices, dual sensors, have been deployed recently and also give access to the vertical velocity (Carlson et al., 2007; Tenghamn et al., 2007). This additional information has shown advantages of reducing noise for image processing (see Whitmore et al., 2010; Rønholt et al., 2015), and it has motivated new analysis, such as in the work of Alessandrini et al. (2018, 2019), and a specific numerical methodology by Faucher (2017) and Zhong and Liu (2018) for the seismic inverse problem. In our work, we implement a misfit functional dedicated to dual-sensor data and demonstrate its efficiency for the recovery of subsurface physical parameters.

FWI relies on an iterative minimization of the misfit functional, which, in the traditional least-squares approach, is the $L2$ difference between the observations and the simulations. In addition to the numerical challenges induced by the large-scale domain, one main difficulty of FWI (which is a nonlinear and ill-posed inverse problem) is that the misfit functional suffers from local minima, in particular when low-frequency data are missing and/or in the absence of a priori information, as has been observed by, for example, Gauthier et al. (1986), Luo and Schuster (1991), Bunks et al. (1995), and Faucher et al. (2020a). To mitigate the “cycle-skipping” effect, selection of increasing frequency content in the data is commonly used (Bunks et al., 1995; Sirgue and Pratt, 2004). In our work, we follow the frequency-domain formulation, in which this approach is natural, and we further use increasing sequential frequency, as advocated by Faucher et al. (2020a).

Several alternatives to least-squares have been studied to enhance the convexity of the misfit functional, such as logarithmic function, mentioned by Tarantola (1987a), that is particularly appropriate in complex-frequency FWI (Shin and Min, 2006; Shin and Cha, 2008; Faucher, 2017). Comparisons of misfit using the phase and amplitude of signals are carried out in the work of Shin et al. (2007), Bednar et al. (2007), and Pyun et al. (2007). Fichtner et al. (2008) study the use of a misfit based upon the phase, correlation, or the envelope; the latter also is advocated by Bozdağ et al. (2011), and it is shown to be efficient for the reconstruction of attenuation properties in global earth seismology by Karaoğlu and Romanowicz (2017). Brossier et al. (2010) study the use of the $L1$ criterion. The optimal transport distance is considered in the work of Métivier et al. (2016) and Yang et al. (2018) and is shown to improve the misfit functional convexity. To avoid the local minima, one also can rely on a specific model parameterization such as the migration-based traveltime method that decomposes the velocity model into a (smooth) background profile and a reflectivity in the data space (Clément et al., 2001; Barucq et al., 2019). This approach is shown to increase the size of the attraction basins for background velocity reconstruction by Barucq et al. (2019).

In the case in which Cauchy data are available on the boundary, that is, a field and its normal derivative (e.g., the pressure and the normal derivative of the pressure, which actually relates to the normal velocity; see Appendix A) the reconstruction can use a combination of the Dirichlet and Neumann traces, as in the work of Kohn and Vogelius (1985) and Colton and Haddar (2005) in inverse scattering, or Alessandrini et al. (2018) for seismic. This approach is labeled as the “reciprocity-gap” in the literature (de Hoop and de Hoop, 2000; Colton and Haddar, 2005). Lipschitz stability can be obtained for the partial Cauchy data inverse problem, in particular in the seismic context with surface measurements, as proved by Alessandrini et al. (2018, 2019). This result further holds for piecewise-linear parameters, used for the numerical experiments of Faucher (2017) and Alessandrini et al. (2019). In the context of acoustic waves governed by the Euler’s equations, we shall see that Cauchy data are in fact equivalent to the pressure field and the normal velocity, that is, dual-sensor data. In Appendix A, we further connect the boundary measurements to the volume properties of the media. Hence, the minimization of the misfit functional defined from surface data is connected to the recovery of the volume medium parameters. In our work, we use the reciprocity-gap formulation and further combine all sources, defining the full reciprocity-gap waveform inversion (FRgWI) framework for seismic imaging using dual-sensor data.

In the time-harmonic formulation, the reciprocity-gap results in a misfit functional in which the observations and simulations are multiplied. Although our work is conducted in the frequency domain, it can similarly be carried out in the time domain where it relates to the family of correlation misfits. Correlation-based misfit functionals have been studied in seismic tomography, for instance, by Van Leeuwen and Mulder (2010) and in the work of Choi and Alkhalifah (2011) and Zhang et al. (2016), although using a single type of measurements (i.e., the acoustic pressure fields). The use of the velocity fields is studied in the time domain by Zhong and Liu (2019), assuming that all directional velocities are available and convolving the same fields. A fundamental difference with these earlier studies, and in particular with Zhong and Liu (2019), is that in our work, we correlate different fields (i.e., pressure with velocity). This is the essence of the reciprocity gap, and it is crucial for relating surface measurements to global model reconstruction using Green’s identity (Appendix A). In addition, our framework is naturally designed for the normal velocity, in accordance with the dual-sensor devices. The works of Menke and Levin (2003) and Bodin et al. (2014) in seismology also combine observations and simulations, but using a cross-convolution formula made of the vertical and horizontal components of the waves. More generally, approaches based upon a specific filtering of the data, such as in the work of Warner and Guasch (2016) and Guasch et al. (2019), also rely on a minimization in which the fields are not directly the quantities compared.

The main feature of FRgWI is to allow different observational and numerical acquisitions (Faucher, 2017; Alessandrini et al., 2019; Zhong and Liu, 2019). Minimal information regarding the observational sources is required: the source function and the source positions are not needed to conduct the reconstruction. This is due to the definition of the misfit functional, which does not compare an observation with a simulation directly, but products of an observation with a simulation. This flexibility in the choice of numerical probing sources opens up many perspectives and, in particular, to use denser or sparser computational acquisitions than the given observational one. The use of sparse acquisition relates to the shot-stacking approach for data decimation, which sums several single point-source data using the linearity of the wave equation. It has been used early in seismic applications to reduce the computational cost by Mora (1987). It is based upon the redundancy of information in the data, but it has a major drawback in that it is difficult to avoid crosstalk between the encoded sources (Krebs et al., 2009; Zhang et al., 2018). It has motivated several works to efficiently assemble the encoded sources in seismic, that is, source blending (Berkhout, 2008). We mention, for instance, the random combination of sources changing with iteration by Krebs et al. (2009), the approach based upon compressive sensing by Li et al. (2012), while wavelet encoding is used by Zhang et al. (2018). In our applications, we will see that FRgWI, by using arbitrary numerical sources, can work with multiple-point sources (for computational or observational acquisition) while being naturally robust to crosstalk. In fact, it is not exactly crosstalk in this context because the key of FRgWI is that the measurements are not modified; that is, they still are tested independently, one by one, with respect to the numerical simulations.

In this paper, we study the seismic inverse problem associated with time-harmonic waves using dual-sensor data. We first state the mathematical problem and define the two misfit functionals that we analyze for the iterative reconstruction: the traditional least-squares difference and the reciprocity-based functional. Next, we detail the features provided by the reciprocity-gap formulation. We carry out 3D reconstruction experiments: first with a layered medium, in which we compare the performance of reciprocity-gap and FWI. We further demonstrate the efficiency of FRgWI with simultaneous point sources to reduce the computational cost and its robustness compared to traditional shot stacking. Finally, we carry out a larger scale experiment including salt domes using the SEG Advanced Modeling Corporation (SEAM) benchmark. In particular, we highlight that FRgWI performs equally well in the case of acquisitions that are sparse for observations/dense for simulations and when acquisitions are dense for observations/sparse for simulations. Our experiments are carried out using data acquired in the time domain, and, even though our reconstruction algorithm is conducted in the frequency domain, which might present some scale limitations for field configurations, we believe that our method can be implemented with the time-domain wave equation. This can be more appropriate for larger scales, and it requires only a slight modification to our model. A discussion about it is carried out in the “Conclusion” section.

## TIME-HARMONIC SEISMIC INVERSE PROBLEM WITH DUAL-SENSOR DATA

We work with time-harmonic wave propagation for the identification of the physical parameters in a seismic context. The quantitative reconstruction is conducted using an iterative minimization of a misfit functional. We first give the misfit as the traditional $L2$ difference and further design the reciprocity-gap version of the functional, which combines pressure and normal velocity data.

### Acoustic wave equation, forward problem from dual sensors

_{v}denotes the normal derivative. The propagation is characterized by the two physical parameters of the medium: density $\rho $ and bulk modulus $\kappa $. In addition, the velocity $c$ is defined such that

For boundary conditions, we follow a geophysical context in which the boundary is separated into two: $\Gamma =\Gamma 1\u222a\Gamma 2$. We denote by $\Gamma 1$ the interface between the air and the acoustic medium, in which a free-surface boundary condition holds, equation 1c. On the other part of the boundary $\Gamma 2$, we implement an absorbing boundary condition (ABC), Engquist and Majda (1977), equation 1d, to ensure that waves reaching the boundary are not reflected back into the domain. It corresponds to the fact that the domain $\Omega $ is a numerical restriction of the earth.

### Misfit functionals

In the following, we omit the frequency index and space dependency for the sake of clarity (in the experiments, we use an increasing sequential frequency as suggested by Faucher et al., 2020a). We introduce two misfit functionals that evaluate the difference between the observations and simulations.

### Iterative reconstruction procedure

The reconstruction procedure follows an iterative minimization of the selected misfit functional. Least-squares formulations such as equation 7 are traditionally referred to as the FWI method in seismic (because one makes use of the full data seismograms); see the review of Virieux and Operto (2009). Consequently, we shall refer to the minimization of equation 8 as FRgWI. In any case, the iterative minimization follows successive updates of the physical models, such that, at iteration $l$, the new model is given by $ml+1=ml\u2212\alpha lsl$, where $\alpha $ is the scalar step size typically computed via a line-search method (Nocedal and Wright, 2006) and $s$ is the search direction.

The search direction depends on the gradient of the cost function, which is computed using the adjoint-state method. The method has its foundation in the body of work of Lions (1971) and is reviewed for seismic application by Plessix (2006). Application with complex fields is further described by Barucq et al. (2018, 2019). The adjoint-state method for the reciprocity-gap functional is briefly reviewed in Appendix B (Alessandrini et al., 2019). In our implementation, the search direction depends only on the gradient of the misfit functional, and we use the limited-BFGS algorithm (Nocedal, 1980; Nocedal and Wright, 2006). We review the steps for the iterative minimization in Algorithm 1.

### Discretization with the hybridizable discontinuous Galerkin method

In the numerical implementation, the pressure and velocity fields must be computed to feed the misfit functional. We discretize problem 1 using the hybridizable discontinuous Galerkin (HDG) method, which is specifically designed for first-order problems, and avoid oversize linear systems. Indeed, the global matrix using HDG is composed only of the degrees of freedom associated with the trace of the pressure field, that is, only the degrees of freedom on the faces of the elements of the discretization mesh (Cockburn et al., 2009; Griesmaier and Monk, 2011; Bonnasse-Gahot et al., 2017). Then, local (small) systems are solved to calculate the volume solutions of the pressure and the velocity fields, computed with similar accuracy. We refer to Faucher and Scherzer (2020) for the numerical implementation associated with problem 1.

With more traditional discretization methods such as continuous Galerkin, finite differences of internal penalty discontinuous Galerkin methods, one has to create a linear system whose size is the total number of degrees of freedom for all unknowns (i.e., the pressure and the three components of the velocity), possibly leading to large linear systems. Alternatively, one can solve only for the scalar pressure field and postprocess the solution to obtain the velocity, but the computed velocity loses one order of accuracy compared to the discretized pressure field due to the derivative in equation 1a. In the HDG method, we obtain the pressure and velocity fields with the same accuracy, whereas the global linear system contains only the degrees of freedom of a scalar unknown (Faucher and Scherzer, 2020). In addition, only the degrees of freedom on the faces of the elements are taken into account, hence removing all interior ones. Consequently, the HDG method has been shown to be more efficient (i.e., less memory consumption for the matrix factorization due to the smaller global matrix) compared to other approaches by Kirby et al. (2012) and Bonnasse-Gahot et al. (2017), for discretization.

In the context of large-scale seismic applications, the use of HDG is crucial (particularly in the frequency domain) to efficiently compute the pressure and the velocity because it eventually leads to linear systems with sizes that are not larger than for second-order wave problems. For the resolution of the subsequent linear system, we use a direct solver (Amestoy et al., 2001; Liu et al., 2020) for the multiple right-sides feature, in particular, the solver MUMPS, designed for sparse matrices (Amestoy et al., 2006). Therefore, the matrix factorization of the forward problem is reused for the backward one that serves to compute the gradient of the misfit functional depicted in Appendix B.

## FEATURES OF FRgWI AND DATA ACQUISITION

Following Faucher (2017) and Alessandrini et al. (2019), the fundamental feature of the reciprocity-gap misfit functional equation 8 compared to equation 7 is that the set of computational sources is separated from the observational ones (respectively, $gj$ and $fi$ in equation 8). It implies that (1) we do not have to know the observational source positions (the location of the $fi$ in equation 8) for the reconstruction algorithm. (2) We do not have to know the observational source signatures (wavelet). Consequently, and contrary to the least-squares misfit such as equation 7, minimal information is required regarding the observational acquisition (only the positions of the receivers). The recovery of the source wavelet in parallel to the iterative minimization procedure usually performs well in seismic exploration, using the method prescribed by Pratt (1999); see remark 1. However, incorrect knowledge of the position of the sources is a strong difficulty, which can lead to the failure of the reconstruction procedure. Namely, FRgWI increases robustness by being free of such considerations. In addition, (3) the set of computational sources can be arbitrarily taken; hence, they can differ from the observational one (respectively, $nsrcobs$ and $nsrcsim$ in equation 8).

Therefore, it provides high flexibility regarding the choice of computational sources. It is important to notice that whatever computational sources are selected, the observed data still are independently tested one by one against the simulations; that is, the measurements are not modified in any way.

### Data acquisition

The observational setup uses a fixed lattice of receivers and sources located slightly above, as illustrated in Figure 1a. Furthermore, the position of the receivers remains the same for all sources. This configuration is consistent with the TopSeis acquisition system for marine seismic developed by Compagnie Générale de Géophysique and Lundin Norwary AS. This principle has been recently deployed in the field, showing benefits for near-offset coverage. Namely, it consists in having two boats: one that moves and carries the sources and one that carries the receivers and remains fixed. Similarly, the use of ocean-bottom node networks is appropriate.

For the reconstruction using FRgWI, we can use arbitrary numerical sources, which do not have to coincide with the observational acquisition. Here, we investigate the two following configurations.

### Dense point-source computational acquisition for sparse multiple-point measurements

In the case when the observational acquisition is composed of few sources (e.g., to reduce the cost of field experiments), FRgWI can use a dense coverage of computational point sources to enhance the sensitivity. The sources for the measurements can be multiple points, for example, if several air guns are excited at the same time. In our experiments, the choice of positions for the multiple points follows a structured decomposition: adjacent sources are grouped, with a fixed distance in $x$ and $y$ between each (Figure 1b). It appears more natural to use such a structured decision; for example, the boat will carry the air-gun sources all together. In addition, we have observed in our experiments that this structured partition gives better performance than using a random combination of sources.

### Sparse multiple-point computational acquisition for dense point-source measurements

In the case when the observational acquisition is composed of many (point) sources (as is routine in exploration seismic), FRgWI can instead use multiple-point sources in the computational acquisition to reduce the computational cost. It consists of a sparsification of the observational acquisition.

Therefore, we will investigate in the following numerical sections different situations in which one of the acquisitions (observational or computational) is composed of point sources whereas the other is made of multiple-point sources. The main advantage of FRgWI is that it does not require any modification of the data, which are tested one by one against the computational acquisition in equation 8. However, the shot-stacking version of FWI requires the summation of data in equation 11, which results in the possible loss of information, that is, crosstalk (Zhang et al., 2018).

## NUMERICAL EXPERIMENT 1: LAYERED MEDIUM

In this section and the next, we carry out 3D experiments of geophysical reconstruction to study the performance of FRgWI. In this first test, we compare with the traditional least-squares functional, and we use multiple-point sources to probe the robustness of arbitrary source positions; we also test the combination of sparse and dense acquisitions.

### Velocity model

We consider a 3D acoustic medium (provided by Statoil) of $2.55\xd71.45\xd71.22\u2009\u2009km3$ in the $x$-, $y$-, and $z$-axes, respectively, (i.e., $z$ is depth). We assume a constant density $\rho =1000\u2009\u2009kg\u2009m\u22123$, and the subsurface velocity is pictured in Figure 2. It consists of different geophysical layers, with nonmonotone model variations, from 1500 to $5200\u2009ms\u22121$. The first 500 m are mostly constant with a velocity of approximately $1600\u2009\u2009m\u2009s\u22121$.

### Time-domain data with noise

We work with time-harmonic wave propagation, but we follow a seismic context where measurements are obtained in the time domain. Furthermore, we incorporate white Gaussian noise in the synthetic seismograms to have a more realistic setup. The observational acquisition consists of 160 point sources, excited one by one. They are positioned at the depth of $z=10\u2009\u2009m$, and form a regular array such that sources are every 160 m along the $x$-axis and every 150 m along the $y$-axis. There is a total of 1376 receivers, which are located at a fixed depth of 100 m, every 60 m along the $x$-axis, and every 50 m along the $y$-axis. Note that the receivers remain at the same position for all sources (Figure 1).

We generate the time-domain seismograms (we have used the parallel time-domain code Hou10ni [see https://team.inria.fr/magique3d/software/hou10ni/]; it relies on internal penalty discontinuous Galerkin discretization [whereas we use HDG]. Also, the meshes are different between the time-domain modeling and the harmonic inversion) and incorporate noise with a signal-to-noise ratio (S/N) of 10 dB. Then, we proceed with the discrete Fourier transform to feed algorithm 1. In Figure 4, we picture the noisy time-domain pressure trace for a single point source, and the corresponding Fourier-transform that we use for the reconstruction. We respect the seismic constraint that the low frequencies are not available from the time-domain data (because these are not contained in the active source), and we work only with data from 5 to 15 Hz frequency. In Figure 5, we picture 2D time-space sections of the traces for the pressure and normal velocity, and we illustrate the effect of the added noise (10 dB S/N in our experiments).

### Performance comparison of the mist functionals

We first compare the performance of both misfit functionals $JL2$ and $Jr$ in the same context: The numerical acquisition is taken to be the same as the observational one (which is anyway mandatory for $JL2$). Therefore, we take $nsrcsim=nsrcobs$ and the same set for $f$ and $g$ in equation 8. We use algorithm 1, using sequential frequency progression from 5 to 15 Hz; more precisely, we use ${5,6,7,8,9,10,12,15}\u2009\u2009Hz$ data and 30 minimization iterations per frequency. In Figures 6 and 7, we show the final reconstruction after 15 Hz iterations when minimizing $JL2$ and $Jr$, respectively. For the discretization, we use a mesh of approximately 75,000 tetrahedra and polynomials of order three to five (depending on the frequency) for accuracy (note that the mesh differs from the one used to generate the time-domain data).

#### Remark 1

For the minimization of $JL2$ equation 7, one has to use the same source wavelet for the observations and simulations. Because the observational source wavelet is not precisely known, we need to reconstruct the source during the iterative process as well. We use the update formula given by Pratt (1999) for the iterative point-source reconstruction. However, this can induce additional difficulty in cases in which the source characteristic is not well recovered or, more important, when the source positions are not precisely known.

In this experiment, we use the same acquisition context for the two choices of misfit functional, to observe the intrinsic differences between the two methods. We observe the following:

- 1)
Both approaches provide a good reconstruction, where the layers of velocity appear (see the vertical section), and the correct velocity values are retrieved.

- 2)
The FRgWI provides slightly better results, in particular for the parts that are near the boundaries (see the vertical and horizontal sections of Figures 6 and 7). This can be explained as FRgWI formulation equation 8 tests every simulation source with each observational one (the two sums in equation 8), and it somehow compensates for the limited illumination of the boundary zones.

- 3)
Regarding the computational time, both approaches are similar (the only difference is the two sums in the misfit functional equation 8, which is a computationally efficient operation compared to the matrix factorization and linear system resolution).

Therefore, in this first test case, we have demonstrated the efficiency of FRgWI, which produces a better reconstruction compared to the traditional approach, without incurring any increase of computational time. But more important, the reciprocity-gap offers flexibility for the choice of computational acquisition, which we now study.

### Comparison of multiple-point sources FRgWI with shot stacking

The main feature of FRgWI is to enable the use of arbitrary probing sources for the numerical simulations ($g$ in equation 8). Here, we investigate the use of multiple-point sources to reduce the computational cost. The observational acquisition is composed of 160 point sources; that is, each source function $fi(x)$ in equations 7 and 8 corresponds to a delta-Dirac function in $xi(f)$, according to equation 9. For the computational acquisition ($g$ in equation 8), we now consider $nstack$ multiple-point sources, each composed of $npt$ points (equation 10).

We perform the iterative reconstruction using $nstack=5$ multiple-point sources, each of them composed of $npt=32$ points. The reconstruction using the shot-stacking version of FWI equation 11 is shown Figure 8, and the result using FRgWI (i.e., where only the computational acquisition is changed) is shown Figure 9.

We observe that FRgWI performs better than the shot stacking in FWI, in this experiment of relatively small scale:

- 1)
The vertical and horizontal sections shown in Figure 9 are actually very close from the reconstruction obtained using the same computational and observational acquisition (Figure 7). The pattern of increasing and decreasing velocity is well captured with the appropriate values.

- 2)
However, the FWI with shot stacking loses accuracy, in particular the vertical section in Figure 8 where the nonmonotonic variation is not even recovered.

Testing each of the observational sources independently with the arbitrary computational ones, the FRgWI is robust with multiple-point sources and allows a sharp decrease in the number of numerical sources for the simulations.

#### Remark 2

The results using the shot stacking with FWI could be improved by using a more advanced strategy of shot blending as mentioned in the “Introduction” section (Krebs et al., 2009; Li et al., 2012). We illustrate here that even a naive approach (i.e., simple to implement numerically) of shot stacking is sufficient for the FRgWI to produce satisfactory results. Also, note that we have tried to use a random selection of positions for the multiple point, but it was not performing as well as the structured criterion. This would, however, require more testing to confirm.

We can further reduce the number of computational sources. The reconstruction using FRgWI with $nstack=2$, $npt=80$ is shown in Figure 10. In Figure 11, we show the reconstruction using only one source: $nstack=1$, $npt=160$. We see that FRgWI still behaves quite well, in particular for two sources the reconstruction carries similar accuracy than before: The vertical structures are discovered accurately, with pertinent variations of velocity. For the reconstruction using only one source, Figure 11, we see some deterioration in the recovered layers, but it remains more accurate than the shot-stacking FWI using five sources (Figure 8).

The essence of FRgWI is that it does not modify the observational acquisition and instead it probes every measured seismogram independently. It increases the robustness of the reconstruction procedure, and it allows for the design of arbitrary computational acquisitions to reduce the computational cost. In this experiment, we have used a naive approach of shot summation, which requires no effort for its implementation and already shows accurate reconstructions.

### Sparse observational acquisition

We now investigate a different situation: when the measurements are obtained from only a few set of multiple-point sources. This happens, for example, in marine seismic when an array of air guns are excited simultaneously. It consequently speeds up the field acquisition and reduces the amount of data obtained, hence reducing the cost of the field acquisition.

Hence, we now consider that the measurements are obtained from five multiple-point sources. We follow the configuration of the previous subsection but inverting the computational and observational acquisitions: it is now $f$ in equation 8 that is represented with a multiple-point sources equation 12 whereas the computational sources $g$ consist of 160 single point sources. It means that the observed measurements are acquired from the physical phenomenon corresponding with multiple-point sources. Note that from this type of measurement, the FWI reconstruction coincides with the shot-stacking result of Figure 8. In Figure 12, we show the reconstruction in which the observations are obtained from $nstack=5$ multiple-point sources and the computational acquisition consists of 160 single point sources. In Figure 13, we show the reconstruction in which the measured data result from one multiple point source, composed of 160 points (i.e., all sources are excited at the same time) whereas the numerical acquisition remains with 160 point sources.

We observe that the reconstructions obtained from a drastically reduced set of measurements, resulting from multiple-point sources, retrieve the appropriate velocity variation. It displays similar accuracy compared to the use of single-point source measurements with multiple-point source simulations (Figures 9 and 11 for five and one computational sources, respectively). Namely, it seems that FRgWI is insensitive to which acquisition is made sparse (i.e., with multiple-point sources).

## NUMERICAL EXPERIMENT 2: SALT-BODY SEAM MODEL

In this experiment, we consider a 3D velocity model encompassing salt domes. The velocity is extracted from the seismic SEAM benchmark, and is $7\xd76.5\xd72.1\u2009\u2009km3$. The velocity model is depicted in Figure 14, and it varies from 1500 to $4800\u2009\u2009m\u2009s\u22121$. In this experiment, the density is heterogeneous, and it is pictured in Figure 15. Compared to a previous test case, it is of a different nature (salt domes), it is larger, and it has a heterogeneous density. Therefore, we expect this numerical experiment to be more challenging.

### Time-domain data

Similarly to the previous experiment, we work with noisy time-domain data and compute their Fourier transform to generate the harmonic data used in the iterative reconstruction. There is a total of 272 point sources in the observational acquisition, which is located at 10 m in depth. The sources are placed on a 2D plane with 400 m between each source along the $x$- and $y$-axes. For multiple point acquisition, we follow the structured combination illustrated in Figure 1b. In this experiment, we generate the time-domain data and we use a 20 dB S/N. The resulting seismograms are illustrated in Figure 16 for a single source, where we show the pressure and vertical velocity measurements. The data are acquired by a total of 2805 receivers. For the reconstruction, we perform a Fourier transform of the time-domain noisy data, and we use frequency content from 2 to 7 Hz.

### Reconstruction using FRgWI

For the reconstruction, we start with initial guesses that correspond to the smooth version of the true models; they are shown in Figures 17 and 18 for the starting velocity and density, respectively. We actually only focus on the reconstruction of the velocity model, and we keep the density as its initial representation of Figure 18. The density is known to be more poorly resolved than the velocity because of the lack of sensitivity in the data (Jeong et al., 2012), but it should not prevent us from recovering the velocity (Faucher, 2017).

We use the FRgWI algorithm with the minimization of equation 8 using sequential frequency content from 2 to 7 Hz. We perform 30 iterations per frequency for a total of 180 iterations. We analyze the two following configurations.

- 1)
We first use the dense observational acquisition composed of 272 point sources and multiple-point sources for the computational acquisition ($g$ in equation 8). The multiple-point sources use $npt=8$ points, resulting in $nstack=34$ sources. The reconstructed velocity after the iterations at 7 Hz is shown in Figure 19.

- 2)
Next, we consider the opposite situation: we assume a sparse observational acquisition of $nstack=34$ sources each composed of $npt=8$ points. Here, the computational acquisition is made dense, with 272 point sources. The recovered velocity is shown in Figure 20.

The minimization performs well in both situations: The salt dome is appearing with appropriate values (approximately $4500\u2009ms\u22121$). The upper boundary of the salt is recovered accurately (compare the right sides of Figures 19 and 20), whereas the deepest parts are more difficult to obtain. It is possible that the high velocity contrast prevents us from imaging below the salt. The horizontal section (the bottom of Figures 19 and 20) also shows the appropriate pattern, in particular with the decrease of velocity in the middle, and only the parts near the boundary are slightly less accurate. It confirms the results of our first experiment that FRgWI is relatively insensitive to whether an observational or computational acquisition is taken as sparse or dense: the reconstructions display similar resolution for the two cases. In terms of numerical cost, sparse computational acquisition reduces the computational time but, in the frequency domain with the use of direct solver, this gain remains marginal. To further improve the reconstruction, we shall include a regularization parameter in the minimization, for example, with the total variation approach, which can indeed be supported by our misfit functional; see also Faucher et al. (2020b).

## ANALYSIS OF RESULTS AND PERSPECTIVES

We have implemented the iterative minimization method based upon the reciprocity-gap functional in the frequency domain, with the HDG discretization to efficiently address the first-order system. To probe the method, we have carried out two 3D experiments: one with a velocity made of layers and another with a medium containing a salt dome, extracted from the SEAM benchmark. First, our experiments have shown that the FRgWI method performs better than FWI in the same configuration, that is, when we keep the observational acquisition for the numerical one, in particular by improving the illumination on the sides. These promising results motivate the implementation of the method for larger, field benchmarks, to probe the scalability of FRgWI. In this case, the time-domain formulation should be considered to continue the performance analysis that we have started.

### Flexibility in the computational acquisition

The main feature of FRgWI is its robustness regarding the absence of information on the observational acquisition, and the consequent freedom in the computational one. This is due to the misfit functional equation 8 that works with a product of a datum with a simulation, and that relates to the correlation-based family of methods. For instance, the wavelet used for the computational source does not have to match the observational one, as long as there is an overlap in their frequency contents, and we have not observed any other requirement on the choice of the source function. The use of less traditional sources such as plane waves is theoretically possible, and it should be investigated in the future.

Our main result comes from the positions of the sources used for the simulations that can be chosen arbitrarily. This flexibility is in the core of the misfit functional, which does not directly compare an observation with a simulation but tests the product of any observation with any simulation, hence enforcing the domain illumination. It is crucial to note that the measurements are not modified and are independently tested one by one against each of the (chosen) computations. In our work, we stay with illumination from one side (i.e., simulations and observations are generated near the surface), motivated by Alessandrini et al. (2019), which demonstrates the stability of the method. This choice also is motivated from the derivation of the variational formulation that we give in Appendix A, to avoid the appearance of additional integrals (see the discussion at the end of Appendix A). The use of probing sources from another part of the domain (e.g., deeper) has to be carefully investigated, and it is part of our ongoing research.

### Using sparse acquisitions

The use of multiple-point sources allows for the reduction of the computational time or of the field acquisition process. Here, the key is that only one of the two acquisitions is reduced, whereas the other remains dense. Once again, because every datum in the dense acquisition is tested with respect to each datum of the sparse one, it appears to overall produce the proper illumination of the domain. This is possible due to the misfit formula, which offers high flexibility, contrary to more traditional approaches in which both acquisitions must be the same.

We have experimented with the two configurations (the sparse computational acquisition with the dense observational one or the opposite), and it results in similar accuracy in the reconstruction. Because FRgWI independently tests all combinations of an observation with a simulation, it further appears robust with respect to shot stacking. With FRgWI, we do not exactly observe the crosstalk effect because the measurements are not modified before entering the misfit; instead, we observe some inherent difficulties when the acquisition is too sparse. This is unavoidable, for instance, if the original field data have been obtained from a few sources only: we cannot artificially enhance the information contained in the measurements. However, the difficulties that come from the summation of data (i.e., having multiple-point sources in a sparse acquisition) and that results in crosstalk are diminished by the misfit functional, which takes the product with each datum in the dense acquisition.

In the context of shot stacking, there is a “reference” solution, which is the use of the nonsummed fields. Here, we raise the following question for future applications of the FRgWI method: How do we optimally select the numerical acquisition to make the best use of the observed measurements?

### Data coverage

It is important to remember that the reciprocity-gap misfit functional relates, in terms of receivers, to the discretization of a surface integral, as depicted in Appendix A. In our experiments, we have a dense array of receivers, according to the standards in exploration geophysics. In the case in which the surface is not properly covered, we certainly need to appropriately weight the records, using, for instance, existing techniques from earth seismology, such as the one prescribed by Montagner et al. (2012). Similarly, the use of specific weight (e.g., from quadrature rules) could give a first indication on how to select the computational acquisition to ensure the equal illumination of the domain with minimal points.

### Toward practical applications

Our reconstruction algorithm has been carried out in the frequency domain, in which it is difficult to address the scales encountered in field applications due to the operations of linear algebra (mainly the memory required for the matrix factorization). Although the efficiency of the frequency domain is improved by the use of HDG discretization (to reduce the size of linear systems) and by new techniques developed for the solver (Liu et al., 2020), time- or hybrid-domain inversion should be implemented to further probe the method in field studies. In this case, the multiplication of fields in the misfit functional equation 8 turns into convolution products, maintaining the same features regarding the flexibility of the numerical acquisition. We do not expect a difference in the performance of the method when implemented in the time domain for the situation we have studied here, especially if one follows the (usual) progression of increasing frequency content in the data, as introduced by Bunks et al. (1995). However, it is crucial to experiment on larger field configurations to further evaluate the method compared to the existing techniques. Here, the impact of the (limited) data coverage has to be carefully investigated.

## CONCLUSION

We have performed waveform inversion using a reciprocity-gap misfit functional, which is adapted to the dual-sensor devices that capture the pressure field and the normal velocity of the waves. In this case, the measurements can be seen as Cauchy data, motivating our choice of misfit from the point of view of inverse scattering theory. Our method enables the use of different acquisitions for the observations and the simulations, by relying on a product of data. We have investigated its performance with sparse acquisitions made of multiple-point sources and have demonstrated its efficiency compared to the traditional FWI. We have used a sparse acquisition either for the observational one (to reduce the cost of the field acquisition) or for the computational one (to reduce the numerical cost of the iterative minimization), while not altering the resolution of the reconstruction.

We have implemented the method in the frequency domain and carried out two experiments of different natures. These represent the first steps of the validation, while time- and hybrid-domain inversions now should be considered. The implementation of the misfit with time signals does not result in any technicality and should not lead to a difference in performance for the scales that we have studied with our frequency-domain setup. However, it is necessary to further probe the behavior of FRgWI in larger practical test cases and to analyze how it competes with state-of-the art methods.

The FRgWI method allows for arbitrary computational source positions that now need to be further analyzed. For instance, the definition of criteria for the design of the computational acquisition and the study of the source positions (e.g., not just close to the near surface area, with quadrature rules to ensure equal illumination) is the subject of our future research. Note also that the concept of FRgWI extends readily to elasticity.

## ACKNOWLEDGMENTS

The authors thank the reviewers and editor for the valuable comments that have helped improve the manuscript. The authors would also like to thank J. Virieux and A. Fichtner for the thoughtful and encouraging discussions. FF is funded by the Austrian Science Fund under the Lise Meitner fellowship M 2791-N. The work of GA and ES was performed under PRIN grant no. 201758MTR2-007. The research of HB has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska Curie grant agreement no. 777778. The research of HB and FF also is supported by the Inria–TOTAL strategic action DIP. MVdH was supported by the Simons Foundation under the MATH + X program, the National Science Foundation under grant DMS-1815143, and by the members of the Geo-Mathematical Imaging Group at Rice University. The work of RG was supported by the Higher Education Authority Government of Ireland International Academic Mobility Program 2019. ES also has been supported by the Individual Funding for Basic Research granted by MIUR. The numerical experiments have been performed on the CEA’s cluster Irene, as part of the GENCI allocation AP010411013 and on the cluster PlaFRIM (Plateforme Fédérative pour la Recherche en Informatique et Mathématiques, https://www.plafrim.fr/fr). The code used for the experiments, hawen, is developed by the first author and is available at https://ffaucher.gitlab.io/hawen-website/. The second benchmark is extracted from SEAM; see https://seg.org/News-Resources/Research-and-Data/SEAM.

## DATA AND MATERIALS AVAILABILITY

Data associated with this research are confidential and cannot be released.

### RECIPROCITY-GAP FORMULA WITH EULER’S EQUATIONS IN SEISMIC

In this appendix, we motivate the reciprocity-gap misfit, constructed from the variational formulation of problem 1. In the context of seismic applications, we assume that the data (i.e., pressure and normal velocity) are acquired on a line $\Sigma $, which is slightly underneath the surface $\Gamma 1$. The domain is decomposed into above and below the area from the line of receivers, such that $\Omega =\Omega +\u222a\Omega \u2212$, illustrated in Figure A-1.

### GRADIENT FORMULATION USING ADJOINT-STATE METHOD

The gradient of the misfit functional is computed using the adjoint-state method, which comes from the work of Lions in optimal control (Lions, 1971), with early applications in Chavent (1974). The method is popular in seismic applications because it avoids the computation of the Jacobian matrix, thus requiring limited numerical effort; it is reviewed in Plessix (2006).