Coal mining induces fissures in the overlying strata, which may extend to an overlying aquifer and cause water inrush. Objectively and accurately evaluating and predicting the water abundance of the Cenozoic bottom aquifer is of great significance for ensuring safe mining of shallow coal seams and for protecting water resources. The evaluation index data for water abundance are high-dimensional, nonlinear, and nonnormally distributed; therefore, evaluation results often fail to reflect the water abundance of aquifers objectively and accurately. To overcome these problems, a water abundance evaluation method based on the projection pursuit model is proposed. Taking the Cenozoic bottom aquifer in the Xutuan mine of the Huaibei mining area, China, as the research object, a water abundance evaluation index system is constructed based on the law of sedimentary characteristics controlling groundwater. This system consists of four factors: aquifer thickness, sand and gravel layer thickness, number of sand and gravel layers, and ratio of sand and gravel content to total aquifer content. The projection pursuit comprehensive evaluation method is introduced to the water abundance evaluation of the aquifer. The optimal projection direction is solved by optimization, and the comprehensive projection value is calculated along this direction. The size of the comprehensive projection value is then used as the basis for characterizing the degree of water abundance of the Cenozoic bottom aquifer in the study area. Moreover, the Jenks natural breaks classification method is used to grade the comprehensive projection value of water abundance. Finally, the evaluation results are verified against the unit water inflow values and the distribution of known water inrush points.
The results show that it is feasible to employ the projection pursuit model for water abundance evaluation of mine aquifers, and optimal evaluation results can be obtained. The model projects high-dimensional data onto a one-dimensional subspace for analysis and finds the optimal projection direction, that is, the direction that best identifies the characteristics of the data and extracts the most information from them. Furthermore, this model avoids interference from subjectivity and human factors and improves the scientific rigor and accuracy of the evaluation results. This study provides a new method and concept for the evaluation of aquifer water abundance involving multiple factors.

Coal resources are a vital part of China’s energy structure [1, 2]. In the coming decades, coal will remain the main energy source in China [3]. However, because of the complex hydrogeological conditions of mines in China, various water disasters can easily occur, resulting in significant economic losses and environmental problems [4–7]. The coal seams of many mines in North and East China are covered with thick Cenozoic loose layers, which have a confined aquifer at the bottom (called the Cenozoic bottom aquifer). This confined aquifer forms a potential threat to the safe mining of shallow coal seams [8–10]. Because of the influence of coal mining activities, mining-induced fractures may extend to the Cenozoic bottom aquifer and cause water inrush accidents. The aquifer is the material basis of roof water inrush, and its water abundance directly determines the amount and duration of water inrush [11]. Therefore, the objective and reasonable prediction of the water abundance of the Cenozoic bottom aquifer constitutes an important premise for the prevention and control of underground water disasters in coal mines. This is also of great significance for ensuring safe mining under the water body of shallow coal seams and for the protection of groundwater resources [12–14].

At present, considerable research has been conducted on the evaluation of water abundance in mine aquifers. Commonly employed evaluation methods fall into three categories: geophysical methods, hydrogeological tests, and multifactor comprehensive analysis methods [15–17]. Several of these methods have limitations. Geophysical prospecting involves a large workload and high cost; moreover, its results admit multiple solutions, and the interpretation depends on the experience and technical level of the interpreters [15, 18]. In actual production, there are generally few water-pumping test boreholes and their control range is limited, so hydrogeological tests cannot meet the requirements for characterizing the water abundance distribution of the aquifer [9, 15, 19]. In contrast, the multifactor comprehensive analysis method is simple and effective. It can comprehensively analyze the water yield of an aquifer by making full use of abundant existing geological exploration borehole data.

Therefore, it has been favored by Chinese scholars and has been employed in many studies [18, 20, 21]. These studies on multifactor comprehensive evaluation of aquifer water abundance mainly differ in the selection of evaluation indexes, the calculation of index weights, and the comprehensive calculation method used to evaluate water abundance [20, 22–24]. The selection of evaluation indicators mainly follows the principles of dominance, comprehensiveness, and availability, such as selecting corresponding evaluation indicators from the perspective of sedimentary water control law or structural water control law [25]. Regarding the calculation of index weights, the traditional approach determines the weights subjectively, as in the analytic hierarchy process, and the obtained results are subjective and arbitrary. Subsequently, methods that determine the weights objectively (e.g., entropy weight method, principal component analysis, grey theory method, BP neural network, and independence weight coefficient method) were introduced, and current research mainly considers the combined determination of subjective and objective weights [15, 26, 27]. In terms of comprehensive calculation methods, principal component analysis, the comprehensive index method, and geostatistical analysis were mainly used in the early stage. Recently, the neural network model, fuzzy identification model, and distance function model have been introduced to conduct comprehensive evaluation research on aquifer water abundance [28–30].

In summary, index weight calculation and comprehensive calculation methods are becoming more objective and quantitative, which gradually improves the objectivity and accuracy of comprehensive evaluations of aquifer water abundance. Because the water abundance of an aquifer and its distribution are controlled by many factors, and because the control mechanisms are complex and nonlinear [9, 31], most scholars have relied on methods that calculate index weights and a comprehensive value from the participating indexes. While this helps to establish the linear relationship between water abundance and the evaluation indexes, it disregards useful information in the nonlinear part of that relationship, which undermines the accuracy of the evaluation results. Moreover, the evaluation index data affecting water abundance are high-dimensional and nonlinear and do not necessarily conform to a normal distribution. Therefore, certain models and methods cannot be applied to the study and calculation of water abundance.

Projection pursuit comprehensive evaluation is an exploratory analysis method, which is particularly suitable for the analysis and processing of distributed evaluation index data with high dimensionality that is also nonlinear and nonnormally distributed [32, 33]. The model constructs the projection objective function based on the data, uses the optimization algorithm to identify the optimal projection direction, and establishes a one-to-one nonlinear relationship between the water rich sample value and the projection value. Thus, this model can better reflect the heterogeneity of water-rich properties and the structural characteristics of data and ensure the objectivity and scientificity of the obtained evaluation results [34, 35]. The model integrates feature extraction and data compression and offers the advantages of clear mathematical meaning, good model robustness, strong anti-interference ability, and high accuracy [35]. At present, it has been widely used in many fields, e.g., for slope stability evaluation [36, 37], debris flow risk evaluation [38], landslide risk evaluation [39], groundwater quality evaluation [40], and flood disaster evaluation [41], where it has achieved promising results. Based on this, the present study introduces the projection pursuit evaluation method into the evaluation of mine aquifer water abundance, to provide a new research method and concept for the comprehensive evaluation of aquifer water abundance involving multiple factors.

Taking the Cenozoic bottom aquifer of Xutuan mine in Huaibei mining area, China, as a research object, and based on the law of sedimentary characteristics controlling groundwater, this paper constructs a water abundance evaluation index system. This system is composed of the following four factors: aquifer thickness, sand and gravel layer thickness, the number of sand and gravel layers, and the ratio of sand and gravel content to total aquifer content. An aquifer water abundance evaluation model is put forward based on the projection pursuit model. This complex method is used to optimize and solve the optimal projection direction, according to which, the comprehensive projection value is calculated. The size of the comprehensive projection value is used as basis to characterize the water abundance degree of the Cenozoic bottom aquifer in the study area. Furthermore, the Jenks natural breaks classification method is used to grade and evaluate the comprehensive projection value of water abundance in the study area. Finally, the evaluation results of water abundance are confirmed by the value of unit water inflow and the distribution of known water inrush points. This study provides a scientific basis for the prevention and control of water damage in the Cenozoic bottom aquifer that helps to ensure the safe mining of shallow coal resources.

Xutuan mine is located in Mengcheng County, Anhui Province, China, as shown in Figure 1. It is adjacent to Tongzhuang mine field in the north, Banqiao fault in the South, and Jiegou coal mine in the northwest. The mine field is covered by an extremely thick layer of loose substrate, and the strata exposed by drilling include Ordovician, Carboniferous, Permian, Paleogene, Neogene, and Quaternary. A total of 11 coal seams can be mined in this mine, of which the main coal seam has (from top to bottom) No. 32 coal, No. 71 coal, No. 72 coal, No. 82 coal, and No. 10 coal. Among these seams, the No. 32 coal seam is currently being mined and has an average buried depth of about 500 m, identifying it as a shallow coal seam. The mine is located in the south of Linhuan mining area and at the south end of Tongting anticline. It has a monoclinal structure with near north-south strike and east tilt, and secondary folds have developed. Faults in the mine are relatively developed. Xutuan fault (reverse fault), located in the north of the mine, is one of the most important large faults in the study area. In the north of Xutuan fault, the stratum dip angle is generally 6–18°, in the south of Xutuan fault, the stratum dip angle is generally 8–24°, and the outline map of the mine structure is shown in Figure 1.

According to the water bearing conditions of formation lithology and the spatial distribution of water bearing occurrence, the groundwater system of Xutuan mine can be divided into a Cenozoic loose layer pore aquifer (including the first, second, third, and fourth aquifers from top to bottom), a Permian main mining coal seam sandstone fissure aquifer (including sandstone fissure aquifer between No. 3–4 coal seams, sandstone fissure aquifer between No. 5–8 coal seams, No. 10 coal roof, and floor sandstone fissure aquifer from top to bottom) and a limestone karst fissure aquifer of Carboniferous Taiyuan Formation and Ordovician limestone karst fissure, as shown in Figure 1. Because of its large thickness and stable distribution, the third water resisting layer in the study area effectively blocks both the surface water and groundwater of the first, second, and third aquifers of the Cenozoic loose layer. Therefore, these aquifers generally have no impact on mine water filling. The water filling source affecting the mining of shallow coal seams of this mine is mainly the fourth aquifer of the Cenozoic loose layer (i.e., the Cenozoic bottom aquifer). The lithology of the Cenozoic bottom aquifer is complex, its thickness changes greatly, and the sedimentary conditions are controlled by paleotopography. The sections of the western and southern edges have steep paleotopography because of fault uplift, and therefore, the sedimentary thickness is relatively thin, mostly eluvial diluvium. Its lithology is mainly gravel, clayey gravel, clay, or sandy clay mixed with gravel, followed by clay, sandy clay, and clayey sand. The gravel components are mostly limestone, quartzite, sandstone, and chert, with a gravel diameter of 1–10 cm and good roundness. Tongzhuang Syncline in the northwest forms a natural channel for water inflow into the mine. From the valley mouth to the middle and eastern part, proluvial alluvium is most prevalent. 
The lithology is sand gravel, medium fine sand, and clayey sand, mixed with a thin layer of brownish yellow clay or sandy clay. Blocked by the buried hill of the Tongting anticline in the north, it is the sediment of Huishui Bay, mainly brownish yellow gravel, and medium fine sand. The data of 11 hydrogeological boreholes in the mine show that the unit water inflow in the mine is between 0.00062 and 0.3358 L/(s·m), which roughly classifies the water yield of this aquifer as weak to medium. The Cenozoic bottom aquifer directly covers bedrock aquifers and is mainly supplied by regional interlayer runoff. In the natural state, it has a certain hydraulic connection with the underlying aquifers through outcrops. The valley mouth in the northwest is the main recharge area of the aquifer, and the tertiary ‘red bed’ denudation area in the southeast is the discharge area. The horizontal runoff and regional recharge of the Cenozoic bottom aquifer are weak. After the mining of No. 32 coal seam, the aquifer water infiltrates into the mine through the weathered fissure zone, goaf zone, and fissure zone for drainage. Therefore, the Cenozoic bottom aquifer is the main water supply source for mine water filling in the mining of 32 coal seam, which constitutes a potential threat to the safe mining of shallow coal resources.

3.1. Projection Pursuit Model

The projection pursuit method is a new statistical method for addressing multi index complex problems. It is a cross between statistics, applied mathematics, and computer technology and offers unique advantages in analyzing and processing data with high dimensionality that is also nonlinear and nonnormally distributed [43, 44]. Its main characteristic is that it can observe the characteristics of high-dimensional data from different angles if the weight coefficient is unknown, project such high-dimensional data onto the low-dimensional space, and study the characteristics of high-dimensional data by analyzing the projection characteristics of low-dimensional space. The goals are to find the optimal projection reflecting the characteristics of data structure and analyze the data structure in the low-dimensional space. This approach offers the advantages of good robustness, strong anti-interference ability, and high accuracy. The specific purpose is to obtain the optimal projection eigenvalue reflecting the characteristics of its comprehensive index through projection pursuit analysis. Then, the one-to-one corresponding functional relationship between the projection eigenvalue and the dependent variable is established, to complete the transformation from high-dimensional data to low-dimensional data. In this process, multiple evaluation indexes are integrated into a comprehensive evaluation index, which is then used for a more reasonable classification and evaluation of samples [32, 39, 45].

As a comprehensive evaluation method that is driven directly by sample data and suited to multi-index complex problems, projection pursuit seeks the best projection direction according to the data characteristics of the sample. This direction indicates the contribution of each evaluation index to the comprehensive evaluation goal, and the projection value is obtained as the linear projection of the evaluation indexes onto the best projection direction. The comprehensive evaluation of the research object is realized according to the size of the projection value [46]. The obtained evaluation results are highly consistent with actual data and have strong interdisciplinary universality [47]. Therefore, the projection pursuit model is used in this study for the multifactor comprehensive evaluation of the water abundance of the mine aquifer.

3.2. Modeling Steps of Projection Pursuit Model

According to the projection pursuit principle, the projection pursuit comprehensive evaluation model of aquifer water yield is established in six steps, and the specific algorithm steps are described in the following [32, 34, 48]:

  • (1)

    Establishing the evaluation index matrix

First, the main controlling factors that affect the water abundance of the aquifer in the mine, that is, the evaluation indicators for water abundance, must be selected. Suppose that there is a group of water abundance evaluation samples $S_1, S_2, \ldots, S_n$ and that each sample is composed of the same indicators $x_1, x_2, \ldots, x_p$, where $n$ is the number of samples and $p$ is the number of water abundance evaluation indicators. The value of the $j$th evaluation indicator of the $i$th sample is $x_{ij}$ ($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, p$); then, all indicator data of all samples to be evaluated can be represented by the matrix $X$:
$$X = (x_{ij})_{n \times p}. \tag{1}$$
  • (2)

    Normalization processing

Because the indicators of water abundance have different dimensions, the influence of these dimensions must be eliminated and the range of the index values unified, so the original data need to be normalized. Indicators for which a larger value is better and indicators for which a smaller value is better are normalized according to Formula (2) and Formula (3), respectively:
$$x_{ij} = \frac{x_{ij} - \min x_j}{\max x_j - \min x_j}, \tag{2}$$
$$x_{ij} = \frac{\min x_j - x_{ij}}{\max x_j - \min x_j}, \tag{3}$$
where $\max x_j$ and $\min x_j$ are the maximum and minimum values of the $j$th indicator, respectively, and the $x_{ij}$ on the left-hand side is the normalized data.
Thus, the dimensionless standardized evaluation matrix $X$ is obtained:
$$X = (x_{ij})_{n \times p}. \tag{4}$$
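The normalization step can be sketched in a few lines of Python. This is a minimal illustration and not the DPS implementation used in the study: the function name `normalize`, the `positive` flags, and the sample numbers are all hypothetical. The negative-indicator branch follows Formula (3) as written, which maps values to [-1, 0]; it differs from the common (max − x)/(max − min) variant only by a constant shift of 1 per column.

```python
from typing import List, Sequence


def normalize(X: Sequence[Sequence[float]],
              positive: Sequence[bool]) -> List[List[float]]:
    """Column-wise normalization following Formulas (2) and (3).

    positive[j] is True for a larger-is-better indicator (Formula (2))
    and False for a smaller-is-better indicator (Formula (3)).
    """
    n, p = len(X), len(X[0])
    out = [[0.0] * p for _ in range(n)]
    for j in range(p):
        column = [row[j] for row in X]
        lo, hi = min(column), max(column)
        span = hi - lo
        if span == 0:
            # Assumption: a constant column carries no information, keep 0.0.
            continue
        for i in range(n):
            if positive[j]:
                out[i][j] = (X[i][j] - lo) / span  # Formula (2): values in [0, 1]
            else:
                out[i][j] = (lo - X[i][j]) / span  # Formula (3): values in [-1, 0]
    return out


# Hypothetical boreholes: (aquifer thickness in m, number of sand and gravel layers)
raw = [[10.0, 3.0], [20.0, 1.0], [30.0, 2.0]]
normalized = normalize(raw, positive=[True, False])
```

Because the projection objective introduced later depends only on the spread of the projection values and on the distances between them, a constant shift of one column does not change the objective value.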
  • (3)

    Linear projection

Projection pursuit analysis seeks the projection direction that reflects the data characteristics and mines the data information to the greatest extent, thereby reducing the dimensionality of the data. The essence of this method is to synthesize the $p$-dimensional data $x_{ij}$ ($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, p$) into the one-dimensional projection value $z_i$ along the projection direction $a = (a_1, a_2, a_3, \ldots, a_p)$:
$$z_i = \sum_{j=1}^{p} a_j \cdot x_{ij}, \tag{5}$$
where $z_i$ is the projection value, which represents the comprehensive water abundance.

Like other nonparametric methods, projection pursuit can be used to solve certain nonlinear problems. Although the method is based on linear projections of the data, it searches for nonlinear structure within those projections.

  • (4)

    Constructing the projection objective function

According to the classification principle, the distribution of the projection values should meet two requirements: locally, the projection points should be as dense as possible, preferably condensed into several point clusters; globally, the point clusters should be scattered as much as possible. In other words, for the $p$-dimensional data projected onto the one-dimensional space, the distance between different classes $S_z$ and the density of the points within each class $D_z$ should be maximized at the same time. The projection objective function is therefore expressed as the product of $S_z$ and $D_z$:
$$Q(a) = S_z \cdot D_z, \tag{6}$$
where $S_z$ is the standard deviation of the projected eigenvalues $z_i$ (also known as the distance between different classes) and $D_z$ is the local density of the projected eigenvalues $z_i$ (also known as the density of the points within each class). $S_z$ and $D_z$ are calculated according to Formula (7) and Formula (8), respectively:
$$S_z = \sqrt{\frac{\sum_{i=1}^{n} (z_i - \bar{z})^2}{n - 1}}, \tag{7}$$
$$D_z = \sum_{i=1}^{n} \sum_{j=1}^{n} (R - r_{ij}) \times u(R - r_{ij}), \tag{8}$$
where $\bar{z}$ is the average value of the sequence $z_i$ ($i = 1, 2, 3, \ldots, n$) and $R$ is the density window radius. $R$ should be chosen so that the average number of projection points contained in the window is not too small, which avoids an excessively large sliding average deviation, and so that it does not increase excessively with increasing sample size. In practice, $R = \alpha \cdot S_z$ can be taken, where $\alpha$ is adjusted according to the distribution of the projection points $z_i$, for example 0.1, 0.01, or 0.001 (mostly 0.1). $r_{ij} = |z_i - z_j|$ is the distance between the sample projection values, and $u(t)$ is the unit step function of $t = R - r_{ij}$: $u(t) = 1$ when $t \ge 0$, and $u(t) = 0$ when $t < 0$.
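Formulas (5) to (8) can be sketched as follows. This is a minimal illustration under stated assumptions, not the DPS implementation: the function names `project` and `projection_index`, the default `alpha=0.1`, and the sample matrix are hypothetical, and the double sum in Formula (8) is taken over all ordered pairs of samples, including each sample with itself.

```python
import math
from typing import List, Sequence


def project(X: Sequence[Sequence[float]], a: Sequence[float]) -> List[float]:
    """Formula (5): z_i = sum_j a_j * x_ij for every sample i."""
    return [sum(a_j * x_ij for a_j, x_ij in zip(a, row)) for row in X]


def projection_index(z: Sequence[float], alpha: float = 0.1) -> float:
    """Formula (6): Q = S_z * D_z, with the window radius R = alpha * S_z."""
    n = len(z)
    mean = sum(z) / n
    # Formula (7): sample standard deviation of the projection values.
    s_z = math.sqrt(sum((z_i - mean) ** 2 for z_i in z) / (n - 1))
    radius = alpha * s_z
    # Formula (8): sum (R - r_ij) over all pairs whose distance is within R.
    d_z = 0.0
    for z_i in z:
        for z_k in z:
            t = radius - abs(z_i - z_k)
            if t >= 0:  # unit step function u(t)
                d_z += t
    return s_z * d_z


# Hypothetical normalized index matrix (3 samples, 2 indexes) and direction.
X = [[0.0, 1.0], [0.5, 0.0], [1.0, 0.5]]
a = [0.6, 0.8]  # unit length: 0.36 + 0.64 = 1
z = project(X, a)
q = projection_index(z)
```

The quadratic loop is adequate for a few hundred samples; for much larger data sets, sorting the projection values first would avoid comparing every pair.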
  • (5)

    Optimizing the projection objective function and determining the best projection direction

Different projection directions reflect different structural features of the data, and the best projection direction is the one that exposes a given feature structure of the high-dimensional data as much as possible. For a given set of sample index values, the projection objective function changes only with the projection direction. Therefore, the best projection direction can be estimated by maximizing the objective function under the following constraints:
$$\max Q(a) = S_z \cdot D_z, \quad \text{s.t.} \; \sum_{j=1}^{p} a_j^2 = 1, \; |a_j| \le 1. \tag{9}$$

Formula (9) is a complex nonlinear optimization problem with the components $a_j$ of the projection direction as the optimization variables. At present, the corresponding modules in mathematical software packages such as MATLAB and DPS can realize the optimization of the projection objective function [35]. For this study, DPS software was used to solve the optimal projection direction and projection value through a nonlinear optimization algorithm.
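To make the optimization problem in Formula (9) concrete, the sketch below uses a plain random search over unit vectors. This is a deliberately simple stand-in, not the nonlinear optimization algorithm in DPS used by the study; all function names, the trial count, the seed, and the sample data are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def project(X: Sequence[Sequence[float]], a: Sequence[float]) -> List[float]:
    """Formula (5): linear projection of every sample onto direction a."""
    return [sum(a_j * x_ij for a_j, x_ij in zip(a, row)) for row in X]


def projection_index(z: Sequence[float], alpha: float = 0.1) -> float:
    """Formula (6): Q = S_z * D_z with window radius R = alpha * S_z."""
    n = len(z)
    mean = sum(z) / n
    s_z = math.sqrt(sum((z_i - mean) ** 2 for z_i in z) / (n - 1))
    radius = alpha * s_z
    d_z = sum(radius - abs(z_i - z_k)
              for z_i in z for z_k in z
              if radius - abs(z_i - z_k) >= 0)
    return s_z * d_z


def random_unit_vector(p: int, rng: random.Random) -> List[float]:
    """Draw a direction satisfying the constraint sum_j a_j^2 = 1."""
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def search_best_direction(X: Sequence[Sequence[float]], n_trials: int = 2000,
                          seed: int = 0) -> Tuple[List[float], float]:
    """Random search (an assumed substitute for the DPS optimizer)."""
    rng = random.Random(seed)
    p = len(X[0])
    best_a: List[float] = []
    best_q = -math.inf
    for _ in range(n_trials):
        a = random_unit_vector(p, rng)
        q = projection_index(project(X, a))
        if q > best_q:
            best_a, best_q = a, q
    return best_a, best_q


# Hypothetical normalized data: 6 samples, 2 indexes.
X = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.3], [0.8, 0.9], [0.9, 0.7], [1.0, 1.0]]
best_a, best_q = search_best_direction(X, n_trials=500, seed=1)
```

Note that a direction and its negative give the same value of $Q(a)$, because neither the standard deviation nor the pairwise distances change when all projection values flip sign. A practical implementation therefore has to fix the sign so that a larger projection value corresponds to stronger water abundance.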

  • (6)

    Establishing the projection pursuit comprehensive evaluation model

According to Formula (9), the best projection direction $a_j$ can be obtained, and the values of $a_j$ can be ranked to obtain the degree of contribution of each evaluation index. By substituting $a_j$ into Formula (5), the best projection value $z_i$ of each sample can be obtained, and the original evaluation object can be comprehensively evaluated according to the distribution characteristics and size of the projection values $z_i$. The size of the comprehensive projection value reflects the degree of water abundance of the aquifer: the greater the comprehensive projection value, the stronger the water abundance.

4.1. Water Abundance Evaluation Index System Based on Sedimentary Characteristics

4.1.1. Analysis of the Sedimentary Characteristics of the Cenozoic Bottom Aquifer

Xutuan mine is a fully concealed Carboniferous-Permian coal field. After the formation of the Ordovician strata, the entire North China platform was uplifted, and after long-term weathering and denudation, it provided a sedimentary base onto which Carboniferous and Permian strata were continuously deposited (coal seams mainly formed in the Permian era). This was followed by geological structuring such as folds and faults, which formed a high and low paleotopography, i.e., the paleotopography before the deposition of the loose layers of the Cenozoic. On this basis, a set of current loose layer sediments of the Cenozoic were deposited [49]. The Cenozoic bottom aquifer has a greater impact on the mining of shallow coal seams. Based on the current understanding of the lithologic structure and the sedimentary characteristics of the Cenozoic bottom aquifer in North-China-type concealed coalfields, the water abundance of the Cenozoic bottom aquifer (which affects the mining of shallow coal seams in the mine) is rarely controlled by the fault structure over a large area. Moreover, the influence of the fault can be almost ignored in a specific coalfield, which is mainly related to the sedimentary characteristics and paleotopographic conditions of the Cenozoic bottom aquifer [9, 50].

The base elevation of the Neogene in the study area ranges from -269.40 m to -353.40 m (as shown in Figure 2). The paleotopography is generally high in the west and low in the east, as well as high in the north and low in the south. In the area north of the Xutuan fault (approximately bounded by the Xutuan fault), with an average basement elevation of -299.47 m, the terrain is generally higher than in the south. The average basement elevation of the southern part of the Xutuan fault is -317.47 m. There is a large difference in base elevation between the south and north of the mine. During the formation of Cenozoic bottom sediments, the study area was high in the north and low in the south, with large ground slope, which constitutes the main material source of Cenozoic bottom sediments other than in-situ bedrock. The Cenozoic bottom aquifer is widely distributed in the mine, but its thickness varies greatly. As shown in Figure 3, the sedimentary thickness is 0–66.35 m, with an average of 14.22 m, and a general tendency to gradual thinning from north to south. The undulation of bedrock topography and provenance flow direction are important factors controlling the sedimentary thickness distribution of the Cenozoic bottom aquifer. The sediments are thicker in the relatively low-lying part of the paleotopography and thicker near the source [50].

The material composition of the Cenozoic bottom aquifer in the study area is complex. According to its composition, it can be divided into three categories, namely, gravel (mainly including gravel and sand gravel), sand (mainly including coarse sand, medium sand, fine sand, silt, and clayey sand), and clay (mainly including gravelly clay, sandy clay, clay, and calcareous clay). The diameters of the sediments at the bottom of the Cenozoic differ widely, particles are poorly sorted, and different particle sizes are mixed. A layer of gravel sediment or a clay layer containing large gravel particles is common above the bedrock surface at the bottom. The gravel sediment can be divided into two types: gravel or sand gravel layers dominated by loose angular and subangular particles, and gravel or sand gravel layers dominated by loose subrounded and rounded particles. The clay layer containing large gravel consists of gravelly clay. Clay, sandy clay, clayey sand, silt sand, fine sand, and medium sand layers appear alternately above this layer. The entire aquifer has a composite structure in the vertical direction, and the lithology of each single layer is very unstable in its lateral extension. This set of sediments reflects a composite of alluvial fan facies, fluvial facies, and slope facies, indirectly indicating that the corresponding sedimentary environments at that time were a piedmont alluvial environment, a fluvial environment, and a slope environment, respectively [51]. Sediments of different sedimentary microfacies differ strongly in plane and profile. For example, alluvial and proluvial channel sediments have large thickness, good sorting, and good water abundance, whereas the sediments of the slope facies are mixed and have poor water abundance [49]. Regarding the water bearing conditions of the lithologic sediments, the water bearing degrees of gravel, sand, and clay sediments decrease in turn.

4.1.2. Constructing the Evaluation Index System for Water Abundance

The Cenozoic bottom aquifer holds a great threat potential to the safe mining of shallow coal seams; therefore, the prediction and evaluation of its water abundance is an important premise for the prevention and control of underground water disasters in coal mines. However, many factors affect the water abundance of the Cenozoic bottom aquifer. The reasonable and appropriate selection of the main controlling factors affecting the water abundance of the aquifer directly affects the establishment of the water abundance evaluation model and the accuracy of the evaluation results [20]. The sedimentary facies of the Cenozoic bottom aquifer control the sediment structure, which is closely related to the water abundance [9]. Based on the analysis of the sedimentary characteristics (e.g., the development thickness and lithologic structure of the Cenozoic bottom aquifer) in the study area and based on the law of sedimentary characteristics controlling groundwater, the aquifer thickness, the sand and gravel layer thickness, the number of sand and gravel layers (the total layers of gravel and sand), and the ratio of sand and gravel content to total aquifer content were selected as main controlling factors of the water abundance of the Cenozoic bottom aquifer. An evaluation index system composed of the above four main controlling factors was constructed to quantify the water abundance of the Cenozoic bottom aquifer. The relationship between the main controlling factors and the water abundance of the aquifer is explained in the following. The aquifer thickness, the sand and gravel layer thickness, and the ratio of sand and gravel content to total aquifer content are positive indicators of the water abundance of the aquifer. The number of sand and gravel layers is a negative indicator of the water abundance.

  • (1)

    Aquifer thickness

The aquifer thickness is an important factor affecting the water abundance of the aquifer, and it is also closely related to the occurrence of groundwater. When other factors are held constant, the water abundance of the aquifer is directly proportional to its thickness. The greater the thickness of the aquifer, the larger the water storage space, and the stronger the water abundance [26].

  • (2)

    Sand and gravel layer thickness

The thickness of the sand and gravel layer refers to the cumulative thickness of the two water bearing media, gravel and sand. The gravel and sand layers are the main storage space and runoff channel of groundwater, so their thickness is a prerequisite for determining the water abundance of an aquifer. In sections with large thickness, the storage space of groundwater per unit area is large and the water abundance is high [9].

  • (3)

    Number of sand and gravel layers

For a given aquifer thickness, a large number of sand and gravel layers indicates that the individual sand and gravel layers are thin, which weakens the hydraulic connection of groundwater in the aquifer and reduces the water yield of the aquifer. In contrast, if the number of sand and gravel layers is small, the thickness of each individual sand and gravel layer increases, and connected sand bodies easily form on the plane, thus increasing the water abundance of the aquifer [9].

  • (4)

    Ratio of sand and gravel content to total aquifer content

The ratio of sand and gravel content reflects the relative content of sand, gravel, and clay in the stratum. If the contents of sand and gravel are large, the aquifer has large water storage space, good permeability, and strong water abundance. If the opposite is found, the water abundance of the aquifer is low [52].
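The four indexes described above could be derived from a borehole lithology log roughly as follows. This is a sketch under stated assumptions: the log format (a list of lithology class and thickness pairs), the function name `borehole_indices`, and the sample values are hypothetical; every logged sand or gravel interval is counted as one layer; and the content ratio is approximated by the thickness ratio of sand and gravel to the whole aquifer. The source does not spell out these conventions.

```python
from typing import Dict, List, Tuple

# Assumed lithology classes, following the three categories in Section 4.1.1.
WATER_BEARING = ("gravel", "sand")


def borehole_indices(layers: List[Tuple[str, float]]) -> Dict[str, float]:
    """Compute the four evaluation indexes for one borehole.

    layers: (lithology class, thickness in m) pairs from top to bottom,
    covering only the Cenozoic bottom aquifer interval.
    """
    aquifer_thickness = sum(thickness for _, thickness in layers)
    sg_thicknesses = [t for lith, t in layers if lith in WATER_BEARING]
    sg_thickness = sum(sg_thicknesses)
    # Assumption: thickness ratio stands in for the content ratio.
    ratio = sg_thickness / aquifer_thickness if aquifer_thickness > 0 else 0.0
    return {
        "aquifer_thickness": aquifer_thickness,
        "sand_gravel_thickness": sg_thickness,
        "sand_gravel_layers": float(len(sg_thicknesses)),
        "sand_gravel_ratio": ratio,
    }


# Hypothetical borehole log (thickness in m).
log = [("gravel", 2.0), ("clay", 3.0), ("sand", 1.5), ("clay", 1.0), ("sand", 2.5)]
indices = borehole_indices(log)
```

Stacking these four values for every borehole, row by row, gives the index matrix $X$ of Formula (1).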

4.2. Evaluation Model of Water Abundance Based on Projection Pursuit Comprehensive Evaluation

4.2.1. Establishing the Model

As the aquifer water abundance distribution is controlled by many factors, and because of the complexity of the nonlinear relationship between control factors and water abundance distribution, this paper introduces a method that can effectively address multifactor and nonlinear problems. At the same time, this model can overcome the subjectivity in weight determination of previous evaluation methods and can thus be used to analyze the water abundance of the Cenozoic bottom aquifer in the study area [53]. The aforementioned construction of the water abundance evaluation index system was employed according to the principle of the projection pursuit model.

In this study, a total of 377 sets of drilling data were obtained within the study area, each comprising values of the four indexes. After the 377 data sets were normalized, the projection pursuit comprehensive evaluation method in the Data Processing System (DPS) software was used to establish the water abundance evaluation model. The complex method was used to optimize the projection objective function; it achieves global convergence and can quickly calculate the best one-dimensional projection direction, which reflects the characteristic structure of the high-dimensional data. It is an effective algorithm for small- and medium-scale optimization problems and can serve as an auxiliary optimization tool for projection pursuit evaluation models. As shown in Figure 4, after optimization, the maximum value of the projection index function, Q(a) = 66.4869, was obtained, and the best projection direction was a = (0.7196, 0.0099, 0.6910, 0.0681). The absolute value of each component of the optimal projection direction essentially reflects the impact of the corresponding evaluation index on the water yield of the aquifer: the higher the absolute value, the greater the impact, and vice versa [54]. Although the best projection direction and the index weights differ in essence, they are also connected, because both reflect the internal relationship of the data and the impact of each index on the evaluation results. A formal difference is that the components of the best projection direction do not sum to 1 (here, their squares sum to approximately 1), whereas weights sum to 1.
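The DPS implementation and the exact form of the objective function are not reproduced in this section. As a minimal sketch, the following Python code assumes the commonly used projection index Q(a) = S_z · D_z (the standard deviation of the projection values multiplied by their local density, with window radius R = 0.1 S_z) and replaces the complex method with a plain random search over non-negative unit vectors. The function names, the radius factor, and the search strategy are illustrative assumptions, not the procedure used to obtain Figure 4.

```python
import numpy as np

def projection_index(a, x, r_factor=0.1):
    """Assumed projection index Q(a) = S_z * D_z for one direction a."""
    z = x @ a                                   # one projection value per sample
    s_z = z.std(ddof=1)                         # spread of the projection values
    radius = r_factor * s_z                     # window radius R (assumed 0.1 * S_z)
    dist = np.abs(z[:, None] - z[None, :])      # pairwise distances r_ij
    d_z = np.sum((radius - dist) * (dist < radius))  # local density within R
    return s_z * d_z

def random_search(x, n_trials=2000, seed=0):
    """Random search over non-negative unit vectors (stand-in for the complex method)."""
    rng = np.random.default_rng(seed)
    best_a, best_q = None, -np.inf
    for _ in range(n_trials):
        a = np.abs(rng.normal(size=x.shape[1]))  # candidate with non-negative components
        a /= np.linalg.norm(a)                   # enforce unit length
        q = projection_index(a, x)
        if q > best_q:
            best_a, best_q = a, q
    return best_a, best_q
```

Here `x` is expected to be an array of normalized index values with one row per borehole and one column per index. A random search gives no convergence guarantee, so it only illustrates how candidate directions are scored and compared.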

The best projection direction a = (0.7196, 0.0099, 0.6910, 0.0681) in this study indicates that the influences of aquifer thickness, sand and gravel layer thickness, ratio of sand and gravel content to total aquifer content, and number of sand and gravel layers on the water abundance of the aquifer in the study area decrease in turn. The aquifer thickness and the sand and gravel layer thickness are the most important factors affecting the water abundance of the Cenozoic bottom aquifer in the study area.

Substituting the optimal projection direction a into Formula (5) yields the comprehensive projection value, which integrates the information of the multiple evaluation indexes, as shown in Formula (10). The contour map of the comprehensive projection value of the water abundance of the Cenozoic bottom aquifer in the study area is shown in Figure 5. The size of the comprehensive projection value reflects the water abundance degree of the Cenozoic bottom aquifer in the study area: the greater the comprehensive projection value, the stronger the water abundance [34, 39]. Taking the comprehensive projection value as the basis for evaluating the water abundance degree allows the water abundance of each region of the study area to be compared intuitively, which is scientific and feasible [55]. Figure 5 shows that the comprehensive projection value of water abundance of the Cenozoic bottom aquifer is large in the north of the study area and small in the south, gradually decreasing from north to south.
$$z_i = 0.7196x_{i1} + 0.0099x_{i2} + 0.6910x_{i3} + 0.0681x_{i4}, \quad i = 1, 2, \ldots, 377. \tag{10}$$
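As a minimal sketch of how the comprehensive projection value is evaluated, the Python code below applies the projection direction reported in the text, a = (0.7196, 0.0099, 0.6910, 0.0681), to index values that are assumed to be already normalized to the range [0, 1]. The function name and the sample rows are hypothetical, and the columns are labelled only by position.

```python
import numpy as np

# Best projection direction reported in the text
A = np.array([0.7196, 0.0099, 0.6910, 0.0681])

def projection_values(x_norm):
    """Comprehensive projection value z_i for each row of normalized indexes."""
    x_norm = np.asarray(x_norm, dtype=float)
    return x_norm @ A

# Hypothetical normalized rows (not data from the study)
sample = [
    [0.0, 0.0, 0.0, 0.0],   # every index at its minimum
    [0.0, 1.0, 0.0, 0.0],   # only the second index at its maximum
    [1.0, 1.0, 1.0, 1.0],   # every index at its maximum
]
z = projection_values(sample)

# The direction is (numerically) a unit vector: its squares sum to about 1
unit_length = np.linalg.norm(A)
```

For indexes normalized to [0, 1], the projection value is bounded by 0 and the sum of the components, 1.4886, which is consistent with the reported range of 0.0099 to 1.2492.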

4.2.2. Water Abundance Zoning Evaluation

The comprehensive projection value of the water abundance of the Cenozoic bottom aquifer in the study area ranges from 0.0099 to 1.2492. This indicates that the water abundance of the Cenozoic bottom aquifer in the study area is unevenly distributed, with clear differences. To study the spatial distribution characteristics of the water abundance of this aquifer more objectively and quantitatively, the Jenks natural breaks classification method was used to classify and evaluate the comprehensive projection value [21]. This helps to characterize the level of water abundance of the aquifer, to refine its internal differences in water abundance, and to delineate key target areas for the prevention and control of water hazards in the Cenozoic bottom aquifer.

The comprehensive projection value that characterizes the water abundance of the aquifer reflects its natural attributes. The Jenks natural breaks classification method is based on the inherent natural grouping of the data. It identifies natural break points by maximizing the similarity within groups and the difference between groups, and it requires little human intervention. Therefore, it is reasonable and scientific to use the Jenks natural breaks classification method for grading [56, 57].
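The grading in this study was carried out in ArcGIS. To illustrate the principle only, the following Python sketch finds class boundaries by dynamic programming so that the total within-class sum of squared deviations is minimized, which is the usual formulation of natural breaks. The function name and the sample values are hypothetical.

```python
import numpy as np

def jenks_breaks(values, n_classes):
    """Return the upper bound of each class (the last one is the maximum)."""
    x = np.sort(np.asarray(values, dtype=float))
    n = len(x)
    # Prefix sums give the squared deviation of any slice x[i:j] in O(1)
    s1 = np.concatenate(([0.0], np.cumsum(x)))
    s2 = np.concatenate(([0.0], np.cumsum(x * x)))

    def ssd(i, j):
        """Sum of squared deviations from the mean for x[i:j], with j > i."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    # cost[k][j]: minimal total SSD when x[:j] is split into k classes
    cost = np.full((n_classes + 1, n + 1), np.inf)
    split = np.zeros((n_classes + 1, n + 1), dtype=int)
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):          # last class is x[i:j]
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    split[k][j] = i

    # Walk back through the stored split points to recover class upper bounds
    bounds, j = [], n
    for k in range(n_classes, 0, -1):
        bounds.append(float(x[j - 1]))
        j = split[k][j]
    return bounds[::-1]

# Hypothetical projection values forming three visible groups
demo = [0.01, 0.02, 0.03, 0.30, 0.31, 0.32, 0.90, 0.95, 1.00]
demo_breaks = jenks_breaks(demo, 3)   # [0.03, 0.32, 1.0]
```

The triple loop runs in time proportional to the number of classes times the square of the number of values, which is acceptable for a few hundred values.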

In mine production practice, the water abundance grade of a mine aquifer is generally divided according to the unit water inflow [21]. Data from 11 hydrogeological boreholes in the Cenozoic bottom aquifer in the study area show that the unit water inflow ranges from 0.00062 to 0.3358 L/(s·m), and the water abundance grade ranges from low to medium. Boreholes with low water abundance account for 54.5%, and boreholes with medium water abundance account for 45.5%. No hydrogeological boreholes with strong or very strong water abundance were found. However, the existing unit water inflow values vary widely. To reflect the relative strength of the water abundance of the Cenozoic bottom aquifer in the study area more objectively, and to make the water abundance zoning results more convenient for mine water prevention and control work, the low water abundance level is further subdivided into low water abundance and lower water abundance. Accordingly, the comprehensive projection value of water abundance of the aquifer was divided into three levels using the Jenks natural breaks classification method [21]. The resulting classes are the medium water abundance zone (comprehensive projection value of 0.5859–1.2492), the low water abundance zone (comprehensive projection value of 0.2494–0.5859), and the lower water abundance zone (comprehensive projection value of 0.0099–0.2494). Figure 6 shows the water abundance zoning map of the Cenozoic bottom aquifer in the study area. The medium water abundance zone is distributed in the north of the mine, the middle of the mine is mainly a low water abundance zone, and the south of the mine is mainly a lower water abundance zone. Therefore, when mining shallow coal seams, the north of the mine should be the key target area to prevent and control water damage of the Cenozoic bottom aquifer.
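The three-level grading can be expressed as a simple lookup. The Python sketch below uses the class limits reported above; the function name is hypothetical, and assigning a value that falls exactly on a class limit to the lower class is an assumption, since the text gives shared endpoints for adjacent zones.

```python
from bisect import bisect_left

# Class limits of the comprehensive projection value reported in the text
BREAKS = [0.2494, 0.5859]
ZONES = ["lower", "low", "medium"]

def water_abundance_zone(z):
    """Map a comprehensive projection value to its water abundance zone."""
    # bisect_left places a value equal to a limit in the lower class (assumption)
    return ZONES[bisect_left(BREAKS, z)]
```

For example, `water_abundance_zone(0.30)` returns `"low"`, and the reported maximum of 1.2492 falls in the `"medium"` zone.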

4.3. Verifying the Water Abundance Evaluation Model

Unit water inflow is the most direct basis for assessing the water abundance of aquifers [58]. The reliability of the water abundance evaluation model was tested using the unit water inflow data of the 11 hydrogeological boreholes in the Cenozoic bottom aquifer. Figures 6 and 7 show that the unit water inflow values of boreholes 62-4, 2010-SHS 1, 66-68-4, 2011-SHS 2, and 70-71-3 are relatively large, ranging from 0.1369 to 0.3358 L/(s·m). Of these, only borehole 70-71-3 is located in the low water abundance zone; the other four are located in the medium water abundance zone. Boreholes 2011-SHS 1, 2010-SHS 2, 2006-SHG 1, and 2019-SHS 1 are located in the low water abundance zone, and their unit water inflow values range from 0.02047 to 0.03317 L/(s·m). The unit water inflow values of boreholes 74-75-6 and 2009-SHG 2 are very small, ranging from 0.00062 to 0.00253 L/(s·m), and both boreholes are located in the lower water abundance zone. The unit water inflow values of the boreholes in the medium, low, and lower water abundance zones thus differ by orders of magnitude (only borehole 70-71-3 was misjudged). The relative water abundance differences given by the evaluation model agree well with the water abundance grades determined from unit water inflow, which shows that the model performs well.
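The borehole check described above can be tallied as follows. This Python sketch encodes only what the text states for each borehole: its unit water inflow group and the zone in which the model places it. The group labels and the function name are hypothetical; individual inflow values are not listed in the text and are therefore not used.

```python
# Zone expected for each unit water inflow group (ranges in L/(s·m) from the text)
EXPECTED_ZONE = {
    "relatively large": "medium",   # 0.1369-0.3358
    "intermediate": "low",          # 0.02047-0.03317
    "very small": "lower",          # 0.00062-0.00253
}

# borehole: (unit water inflow group, zone assigned by the model)
BOREHOLES = {
    "62-4": ("relatively large", "medium"),
    "2010-SHS 1": ("relatively large", "medium"),
    "66-68-4": ("relatively large", "medium"),
    "2011-SHS 2": ("relatively large", "medium"),
    "70-71-3": ("relatively large", "low"),
    "2011-SHS 1": ("intermediate", "low"),
    "2010-SHS 2": ("intermediate", "low"),
    "2006-SHG 1": ("intermediate", "low"),
    "2019-SHS 1": ("intermediate", "low"),
    "74-75-6": ("very small", "lower"),
    "2009-SHG 2": ("very small", "lower"),
}

def check_agreement(boreholes):
    """Return the misjudged boreholes and the share of boreholes that agree."""
    mismatched = [
        name for name, (group, zone) in boreholes.items()
        if EXPECTED_ZONE[group] != zone
    ]
    rate = 1 - len(mismatched) / len(boreholes)
    return mismatched, rate

mismatched, rate = check_agreement(BOREHOLES)
```

With the assignments stated in the text, 10 of the 11 boreholes (about 91%) fall in the zone that matches their unit water inflow group, and borehole 70-71-3 is the single mismatch.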

In addition, the occurrence of water inrush in the mining area and the volume of inrushing water during coal seam mining are direct indicators of the water abundance of the aquifer [26]. The stronger the water abundance of the aquifer, the more likely water inrush accidents are, i.e., water inrush accidents mostly occur where the comprehensive projection value of water abundance is large. Therefore, the positions of known water inrush points can be superimposed on the water abundance zoning map of the Cenozoic bottom aquifer to test the evaluation model further. The Xutuan mine is currently mining the No. 32 coal seam in the No. 32 mining area in the north-central part of the mine, which extends north towards the Xutuan fault. During mining, the DF206 normal fault was exposed by the cutting excavation of the No. 3222 working face, the fault surface discharged water continuously for an extended period, and the inrush volume stabilized at ~20 m³/h. Samples of the inrushing water were taken and subjected to water quality analysis, which showed that the source of the inrushing water was the Cenozoic bottom aquifer. The water inrush point T1 is located in the medium water abundance zone according to this model, as shown in Figure 6. This further corroborates that the model performs well.

In this paper, the projection pursuit model was used to evaluate the water abundance of the Cenozoic bottom aquifer in a mine in China, and reasonable evaluation results were obtained. The following aspects are worth pointing out: (1) The model is not affected by evaluation index data that are high-dimensional, nonlinear, and nonnormally distributed, and it overcomes the limitation of traditional comprehensive evaluation methods, which can only process normally distributed data. Although it is based on the linear projection of data, it searches for nonlinear structure in the linear projection and can solve nonlinear problems to a certain degree [39]. (2) The model converts high-dimensional index data into a one-dimensional space for data analysis and finds the optimal projection direction based on the structural characteristics of the data. Consequently, subjectivity in the weight determination is avoided and the interference of human factors is reduced. The evaluation results reflect the characteristics of the original data to the greatest extent, and the scientific soundness and accuracy of the research results are improved [32, 54]. (3) The model can eliminate the interference of variables that are irrelevant or only weakly related to the data structure and characteristics [55]. (4) Because the comprehensive projection value is used as the basis for evaluating the water abundance degree, the water abundance of each region can be compared intuitively. (5) With the water abundance grade determined from the unit water inflow of the mine pumping test boreholes as a reference, the comprehensive projection value was graded using the Jenks natural breaks classification method (based on geostatistics in the ArcGIS software). This approach is scientifically sound and practical, and the resulting water abundance zoning is more convenient for mine water prevention and control work.
The model can accurately evaluate the water abundance of the Cenozoic bottom aquifer in the study area, and it is relatively simple, intuitive, and easy to understand. Therefore, the model has strong operability, effectiveness, and good application value in the field of aquifer water abundance evaluation. This study provides a new research method and concept for multifactor aquifer water abundance evaluations.

In addition, it should be noted that the water abundance of the aquifer represents the ability of the rock stratum to release water, which is mainly determined by the recharge, storage, and hydraulic conductivity of the aquifer [23]. Many factors affect the water abundance of the Cenozoic bottom aquifer, including hydraulic characteristics, aquifer thickness, lithology, and sedimentary facies, and their control mechanisms are complex and nonlinear [9, 10, 20]. However, there is still a lack of unified theoretical guidance for the selection of evaluation indicators. Limited by the research level and data availability, this study was based only on the law of sedimentary characteristics controlling groundwater, starting with the sedimentary characteristics and lithologic structure of the Cenozoic bottom aquifer. The resulting water abundance evaluation index system is composed of aquifer thickness, sand and gravel layer thickness, number of sand and gravel layers, and ratio of sand and gravel content to total aquifer content. This index system does not cover all indexes that should be included in the water abundance evaluation of the Cenozoic bottom aquifer. Because the selection of evaluation indexes introduces uncertainty into the comprehensive evaluation results, a more comprehensive evaluation index system should be constructed in the future to further improve the evaluation performance of this model. In addition, different projection direction optimization algorithms may also affect the objectivity of the evaluation results [34]. For example, the impact of the particle swarm optimization algorithm [59] and the genetic optimization algorithm [32, 60] on the evaluation results, and a comparison of their results with those of the complex method, merit further study.

Based on the analysis of mine geology and hydrogeology data, this study established an evaluation index system of aquifer water abundance based on sedimentary characteristics. Moreover, the projection pursuit model was introduced into the evaluation of mine aquifer water abundance, and the water abundance of the study area was evaluated according to the comprehensive projection value of each sample. This approach effectively overcomes the subjectivity of weight calculation and reduces the interference that high-dimensional, nonlinear, and nonnormally distributed evaluation index data cause in the evaluation results. The reliability of the evaluation model was verified using limited hydrogeological borehole data and the known distribution of water inrush points. The main conclusions are summarized in the following:

  • (1)

    The bottom aquifer of the Cenozoic in the study area is a composite of diluvial fan facies, fluvial facies, and slope facies, which indicates that the corresponding sedimentary environments at the time of formation were a piedmont alluvial-diluvial environment, a fluvial environment, and a slope environment. Based on the law of sedimentary characteristics controlling groundwater, starting with the sedimentary characteristics and lithologic structure of the Cenozoic bottom aquifer in the study area, a water abundance evaluation index system was constructed. This system consists of four factors: aquifer thickness, sand and gravel layer thickness, the ratio of sand and gravel content to total aquifer content, and the number of sand and gravel layers.

  • (2)

    By establishing a projection pursuit comprehensive evaluation model, the multi-index problem was simplified to a one-dimensional projection problem, and the projection objective function was optimized by using the complex method. The obtained optimal projection direction showed that the influence of aquifer thickness, sand and gravel layer thickness, ratio of sand and gravel content to total aquifer content, and number of sand and gravel layers on the water abundance of the aquifer in the study area decreased in turn. The aquifer thickness and the sand and gravel layer thickness are the most important factors affecting the water abundance of the Cenozoic bottom aquifer in the study area.

  • (3)

    By taking the comprehensive projection value as the basis to characterize the water abundance degree of the Cenozoic bottom aquifer in the study area, the comprehensive projection value of water abundance of the aquifer was divided into three natural levels using the Jenks natural breaks classification method. These levels were the medium water abundance zone (0.5859–1.2492), the low water abundance zone (0.2494–0.5859), and the lower water abundance zone (0.0099–0.2494). When mining shallow coal seams, the north of the mine should be the key target area to prevent and control water damage of the Cenozoic bottom aquifer.

  • (4)

    The reliability of the water abundance evaluation model was tested using the unit water inflow values of 11 pumping test boreholes and the distribution of known water inrush points of the Cenozoic bottom aquifer in the study area. The results show that the relative water abundance differences given by the evaluation model are in good agreement with the water abundance grades determined from unit water inflow. The model therefore performs well.

  • (5)

    The model has strong operability and effectiveness and promising application value in the field of mine aquifer water abundance evaluation. In the future, it is necessary to construct a more comprehensive evaluation index system and further optimize the algorithm for identifying the optimal projection direction of the projection pursuit model. These measures can further improve the evaluation performance of the model.

The data used for calculation in this paper can be obtained from the corresponding author upon request.

The authors declare no conflict of interest.

The study was supported by the National Natural Science Foundation of China (Grant No. 41272278) and the Scientific Research Platform Innovation Team Construction Project in Universities of Anhui (Grant No. 2016-2018-24).

Exclusive Licensee GeoScienceWorld. Distributed under a Creative Commons Attribution License (CC BY 4.0).