While much of the workflow for geophysical data processing and analysis is automated, final quality assurance typically requires the judgment of skilled human analysts and interpreters. The quality and reliability of geophysical inverse models depend directly on how effectively the influence of spurious data on the derived data products used in the inversion has been reduced. Failure to identify and mitigate bias in data products can lead to costly errors. By applying supervised machine learning (ML), a neural network can be trained to recognize features in data that a human domain expert would identify as characteristic of poor data quality. In this study, we use magnetotelluric (MT) data as an example of a geophysical data set appropriate for such a training exercise. Although MT data are used to estimate the resistivity structure of the subsurface, the concepts we discuss apply equally to seismic, potential-field, and other geophysical data sets. We train a neural network, pyMAGIQ (Python-based magnetotelluric impedance qualifier), with multiple hidden layers and demonstrate that it successfully generates the nonlinear mapping function required to assess the quality of MT data. The training set is a large database of frequency-domain MT impedance tensors from the National Science Foundation-funded EarthScope MT project, in which a human-assigned quality index is associated with each impedance. We apply pyMAGIQ to unrated MT data from the United States and Canada and confirm that the ML-assigned quality factors are consistent with those assigned by trained human operators. We also apply sensitivity analysis to the trained neural network. This reveals that both the human- and ML-assigned data quality indices depend on (1) the magnitude of the confidence limits on the phases and (2) the continuity of the apparent resistivities and phases with respect to frequency.
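The two quantities highlighted by the sensitivity analysis, apparent resistivity and phase as functions of frequency, follow from each impedance element through the standard relations ρ_a = |Z|² / (ωμ₀) and φ = arg(Z). The sketch below illustrates these relations in Python with NumPy. It is not taken from pyMAGIQ: the function names are hypothetical, the impedance is assumed to be in SI units (Z = E/H, in ohms) rather than the field units often used for MT transfer functions, and the `roughness` score is only one simple, assumed way to quantify continuity across evenly log-spaced frequencies.

```python
import numpy as np

MU0 = 4e-7 * np.pi  # magnetic permeability of free space (H/m)


def apparent_resistivity_and_phase(z, freq_hz):
    """Convert one impedance element Z = E/H (ohm, SI units assumed)
    into apparent resistivity (ohm m) and phase (degrees)."""
    z = np.asarray(z, dtype=complex)
    omega = 2.0 * np.pi * np.asarray(freq_hz, dtype=float)
    rho_a = np.abs(z) ** 2 / (omega * MU0)
    phase_deg = np.degrees(np.angle(z))
    return rho_a, phase_deg


def roughness(values):
    """Hypothetical continuity score: RMS of the second difference
    along the frequency axis (0 means a perfectly smooth curve)."""
    values = np.asarray(values, dtype=float)
    if values.size < 3:
        return 0.0
    return float(np.sqrt(np.mean(np.diff(values, n=2) ** 2)))


# Synthetic check: a uniform 100 ohm m half-space has Z = sqrt(i * omega * mu0 * rho),
# so the apparent resistivity equals rho and the phase is 45 degrees at every frequency.
freqs = np.logspace(-3, 1, 20)
z_halfspace = np.sqrt(1j * 2.0 * np.pi * freqs * MU0 * 100.0)
rho_a, phase = apparent_resistivity_and_phase(z_halfspace, freqs)

smooth_score = roughness(np.log10(rho_a))

# Corrupt a single frequency to mimic an outlier estimate.
z_noisy = z_halfspace.copy()
z_noisy[10] *= 5.0
rho_noisy, _ = apparent_resistivity_and_phase(z_noisy, freqs)
noisy_score = roughness(np.log10(rho_noisy))
```

In this synthetic case the smooth curve scores essentially zero while the curve with one corrupted frequency scores noticeably higher, which is the kind of frequency-to-frequency discontinuity the abstract identifies as a driver of the quality index.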
