## ABSTRACT

The calculation of the gradient in full-waveform inversion (FWI) usually involves crosscorrelating the forward-propagated source wavefield and the back-propagated data residual wavefield at each time step. In the real earth, propagating waves are typically attenuated due to the viscoelasticity, which results in an attenuated gradient for FWI. Replacing the attenuated true gradient with a $Q$-compensated gradient can accelerate the convergence rate of the inversion process. We have used a phase-dispersion and an amplitude-loss decoupled constant-$Q$ wave equation to formulate a viscoacoustic FWI. We used this wave equation to generate a $Q$-compensated gradient, which recovers amplitudes while preserving the correct kinematics. We construct an exact adjoint operator in a discretized form using the low-rank wave extrapolation technique, and we implement the gradient compensation by reversing the sign of the amplitude-loss term in the forward and adjoint operators. This leads to a $Q$-dependent gradient preconditioning method. Using numerical tests with synthetic data, we demonstrate that the proposed viscoacoustic FWI using a constant-$Q$ wave equation is capable of producing high-quality velocity models, and our $Q$-compensated gradient accelerates its convergence rate.

## INTRODUCTION

Full-waveform inversion (FWI) is a data-fitting procedure used to construct high-resolution quantitative subsurface models by exploiting the full information in the observed data, which has been extensively developed in recent years (Virieux and Operto, 2009). To solve FWI in the framework of the local optimization approach, a model update needs to be computed at each iterative step and each model update involves calculating the gradient of the misfit function with respect to the model parameters (Tarantola, 2005). Due to the nonlinearity of FWI in practice, FWI may converge slowly and cause the algorithm to get trapped in a local minimum. To accelerate the convergence rate of FWI, one can calculate a more accurate model update by applying an approximate Hessian inverse to its gradient (Ma and Hale, 2012; Qin et al., 2015; Hou and Symes, 2016). Another more common approach is to precondition the gradient, for instance, using the energy of the source wavefield to normalize the gradient (Gauthier et al., 1986), weighting the gradient by depth-dependent functions (Shipp and Singh, 2002; Wang and Rao, 2009), or smoothing/filtering the gradient in the spatial frequency or other transform domains (Guitton et al., 2012; Xue et al., 2017). Because seismic waves propagating in viscoacoustic media are attenuated, a natural gradient preconditioning approach is to compensate for the energy loss caused by attenuation. Here, we limit our focus on intrinsic attenuation, although seismic scattering attenuation caused by heterogeneity also complicates the amplitude analysis (Browaeys and Fomel, 2009).

The classic approach of superposition of several standard linear solids (SLSs) has been widely used in the time-domain finite-difference method to approximate the constant-$Q$ seismic wave propagation for a specific frequency band (Robertsson et al., 1994). However, compensating $Q$ effects with SLSs is challenging because reversing the sign of memory variables that account for attenuation does not correctly compensate for the phase (Zhu, 2014, 2016; Guo and McMechan, 2015). To overcome this issue, a better choice is to compensate for amplitude loss by using a modified wave equation, which has separate control over amplitude loss and phase dispersion effects (Zhu and Harris, 2014). This idea has been applied previously in reverse time migration (RTM) to get more balanced illumination in seismic images (Zhang et al., 2010; Suh et al., 2012; Bai et al., 2013; Zhu et al., 2014; Sun et al., 2015; Zhu and Sun, 2017), but its application to velocity model building remains to be investigated.

In this paper, the viscoacoustic wave equation proposed by Zhu and Harris (2014) is adopted to formulate a preconditioned approach to $Q$-compensated FWI. This equation involves fractional Laplacians and has separate control over phase dispersion and amplitude loss effects. In this way, compensating for the amplitude loss while preserving the correct kinematics can be done by reversing the sign of the amplitude-loss factor and keeping the sign of the dispersion factor unchanged (Zhu, 2016). This property was used recently by Sun et al. (2016b) to accelerate the convergence rate of least-squares RTM. We use it in this paper to accelerate the convergence rate of FWI. In the following, we first present the theory related to the forward operator, adjoint operator, FWI gradient, and $Q$-compensated gradient. Then, we demonstrate the effectiveness of our preconditioning strategy for FWI using a synthetic example.

## THEORY

### Forward operator

The function $P(x,t)$ indicates the pressure wavefield, $Q(x)$ is the quality factor, and $c0(x)$ stands for the acoustic velocity model defined at a reference frequency $\omega 0$. Note that $\gamma (x)$ defined in equation 2 ranges from 0 to $1/2$ and is a dimensionless parameter. The term starting with $\beta 1$ parameter in equation 1 characterizes the velocity dispersion, and the term starting with $\beta 2$ describes the amplitude loss. Thus, $\beta 1$ and $\beta 2$ take values of $\xb11$ and act like on/off switches to control phase dispersion and amplitude loss effects, respectively.

Equation 1 involves fractional Laplacians and has separate controls over phase dispersion and amplitude loss, which is a big advantage as opposed to other formulations of viscoacoustic wave equation. As shown by Zhu and Harris (2014), this equation is also valid for very low $Q$ values, such as 10–20. In our numerical example, we set the reference frequency $\omega 0$ to be very high and assume that the intrinsic attenuation will delay the propagation of the seismic wave.

### Adjoint operator

### FWI gradient

### $Q$-compensated gradient

The gradient construction in FWI requires the forward-propagated source wavefield and the backward-propagated data residual wavefield. Without compensation, the source wavefield will be attenuated, and the data residual wavefield will be attenuated once again, which means that the crosscorrelation of the source wavefield and data residual wavefield will suffer from the intrinsic attenuation along the entire wave propagation path twice.

## NUMERICAL EXAMPLE

We use a portion of a modified Marmousi velocity model (Figure ^{1}) and a built $Q$ model (Figure ^{1}) to test the $Q$-compensated FWI. The models include a water layer from the surface down to 200 m. Figure ^{1} shows the initial velocity model, which is obtained by using a Gaussian function to convolve the original velocity model with a radius of 450 m, but keeping the water layer unchanged. A fixed-spread acquisition survey is generated, which consists of 26 shots spaced every 200 m. The source wavelet for generating the synthetic data is a Ricker wavelet centered at 10 Hz. The record length of the synthetic data is 2.2 s with a time step of 2 ms. To test the forward operator used in FWI, we compare the acoustic shot record generated by a finite-difference method and the shot record generated by our viscoacoustic low-rank one-step operator with $Q=10,000$. The comparison in Figure ^{2} demonstrates that the low-rank one-step operator can simulate the same shot record even though it is based on a different formulation of the seismic wave equation.

We perform two FWI experiments: one using the conventional velocity gradient and the other one using the proposed $Q$-compensated gradient. The multiscale technique (Bunks et al., 1995) is used to avoid the cycle-skipping problem. The seismic data are filtered into three frequency groups of increasing frequency content: 2–5, 2–9, and 2–15 Hz. The optimization method is nonlinear conjugate gradients. We performed 20 iterations for each of the frequency groups successively. Without $Q$-compensation, source and data residual wavefields get attenuated. Figure ^{3} shows the (source and data residual) wavefield snapshot and the corresponding gradient comparisons without and with $Q$-compensation. These wavefields have the same traveltime of 1 s and are extracted from the same iteration of FWI. It is apparent that the wavefields with $Q$-compensation have a larger amplitude, especially for the deeper part, in which the accumulated attenuation effects are stronger. With the compensated wavefields, we can construct a $Q$-compensated gradient, which has the same phase as the compensation-free gradient, but a $Q$-dependent amplitude recovery.

Figure ^{4} shows the FWI results after the inversions with and without $Q$-compensation for the three frequency groups. Comparing Figure ^{4} and ^{4}, we observe the obvious improvement in the central and deep parts of the velocity model brought by the $Q$-compensated gradient. This phenomenon is easy to understand. As indicated in Figure ^{3}, $Q$-compensated gradient has a larger amplitude, especially for the deep part. In the subsequent two stages, $Q$-compensated FWI keeps recovering more accurate velocity models at the cost of the same iteration numbers. Figure ^{4} and ^{4} shows the final inverted velocities without and with using $Q$-compensated gradient. The rock cap (indicated by arrows) in the velocity obtained by $Q$-compensated FWI has a larger value, and it is closer to the true model. Figure ^{5} shows the data misfit convergence comparison of FWI with and without $Q$-compensation at the three stages. At the initial several iterations of the first stage, $Q$-compensated FWI has a faster convergence rate, and then it keeps the smaller misfit as the iterations continue. Comparing the misfit difference through the three stages, we can find that the relative misfit difference becomes bigger as the data frequencies increase. This implies that the attenuation effect is more obvious for high frequencies, and therefore the $Q$-compensated gradient can play a more significant role for high-frequency data inversion.

## DISCUSSION

We use the proposed method of preconditioning to accelerate the convergence rate of time-domain FWI for velocity reconstruction. The synthetic example shows that the $Q$-compensated gradient can effectively accelerate the rate of convergence, although the low-frequency components of seismic data, which are less sensitive to attenuation, are mainly used in FWI. Previous studies about time-domain FWI in the presence of attenuation have shown that considering attenuation as a smooth background modeling parameter can significantly improve the velocity reconstruction (Kurzmann et al., 2013; Bai et al., 2014). Compared with those studies, our method not only incorporates the attenuation effects in the wave simulation process but it also aims to construct an amplitude-compensated and phase-preserved gradient based on the attenuation model. Similarly, the attenuation model used in this study is assumed to be known and it is not an inversion parameter. FWI inversion strategies for simultaneous and hierarchical velocity and attenuation inversion have been investigated by Kamei and Pratt (2013). Yang et al. (2016) provide a theoretical review on the formulation of 3D viscoelastic multiparameter FWI. Qu et al. (2017) implement viscoacoustic anisotropic FWI and test their method’s sensitivity to velocity, $Q$, and anisotropic parameters. A background $Q$ model can be estimated using either data-domain or image-domain tomographic techniques (Brzostowski and McMechan, 1992; Quan and Harris, 1997; Hu et al., 2011; Zhou et al., 2011; Valenciano and Chemingui, 2012; Shen et al., 2014, 2015; Shen and Zhu, 2015; Dutta and Schuster, 2016).

## CONCLUSION

We have implemented a viscoacoustic FWI using phase and amplitude decoupled constant-$Q$ wave equation. By reversing the sign of amplitude-loss term in this equation, we have constructed the $Q$-compensated gradient that has the correct phase information and a $Q$-dependent amplitude recovery. The low-rank wave extrapolation technique is used to formulate the exact adjoint operator in the construction of the FWI gradient. Our testing results show that the FWI convergence rate can be notably accelerated by the $Q$-compensated gradient, which helps to rapidly construct the deep structures of the velocity model.

## ACKNOWLEDGMENTS

We thank the assistant editor D. Draganov, the anonymous associate editor, reviewer J. Hou, and one anonymous reviewer for providing valuable suggestions. We thank the sponsors of the Texas Consortium for Computational Seismology for financial support. Z. Xue and J. Sun were additionally supported by the Statoil Fellows Program at the University of Texas at Austion. We also thank the Texas Advanced Computing Center for providing the computational resources used in this study. All computations presented in this paper are reproducible using the Madagascar software package (Fomel et al., 2013a).