## Abstract

Accurately predicting the development height of the water-conducting fracture zone (H_{W}) is imperative for safe coal mining and for protecting water resources and the environment. At present, few fine-scale zoning studies focus specifically on predicting H_{W} under high-intensity mining conditions in western China. Taking the Yushen mining area as an example, this paper studies the relationship between H_{W} and coal seam mining height, coal seam mining depth, the proportion coefficient of hard rock, and working face slope length, and proposes a method for determining H_{W} based on a multiple nonlinear regression model optimized using the entropy weight method (EWM-MNR). To assess the reliability of this model, random forest regression (RFR) and support vector machine regression (SVR) models were constructed for comparison. The results of the EWM-MNR model were in better agreement with the measured values. Finally, the model was used to accurately predict the development height of the water-conducting fracture zone in the 112201 working face of the Xiaobaodang coal mine. The research results provide a theoretical reference for water damage control and mine ecological protection in the Yushen mining area and other similar high-intensity mining areas.

## 1. Introduction

With increasing energy demand and mining intensity, changes in the geological conditions of the coal seam roof overburden and the development of mining-induced fissures are the direct cause of damage to key underground aquifers and the root cause of ecological degradation in mining areas. Coal mining destroys the overlying rock layers, forming a water-conducting fracture zone consisting of fractured and caved zones. Once the water-conducting fracture zone connects with an aquifer, water in overlying old goafs, or a surface water body during coal seam mining, it forms a water channel and can cause water damage accidents, along with ecological and environmental damage such as soil erosion, vegetation death, and ground collapse [1–4].

In recent years, with developments in mining equipment and coal mining technology, high-intensity mining methods have received increasing attention. As the development centre for coal resources shifts to the northwest of China, the fragile ecological and geological environment of that region, along with its water shortages, must be taken into account [5–7]. Large amounts of high-intensity mining will inevitably raise issues regarding green and safe coal mining, in addition to groundwater resource protection [8]. Therefore, the accurate prediction of H_{W} remains an important issue in current research. For several years, researchers have worked on predicting the development height of water-conducting fracture zones, achieving certain results. According to previous research, the main influencing factors of H_{W} are the coal seam mining height (M), coal seam inclination angle (A), mining depth (D), working face width (W), advancing speed (S), and proportion coefficient of hard rock (b) [9, 10].

Currently, the development height of the water-conducting fracture zone is mainly determined by field measurements [11–13], empirical equation calculations [14], theoretical analyses [15, 16], numerical simulation methods such as UDEC [17, 18], FLAC [19, 20], PFC [21], and RFPA [22], and physical similar-material simulations [23–25]. Among them, field measurements have the highest accuracy but are time consuming, laborious, and costly. The theoretical calculation method is too idealized and deviates greatly from the actual complex geological conditions. Similar-material simulation requires highly accurate material proportioning, which is difficult to achieve for complex geological conditions. The accuracy of numerical simulations depends closely on the geological parameters used to build the model, which limits the reliability of the results. At present, H_{W} calculation in China is mainly based on the “Code for the Preservation and Pressing of Coal Pillars in Buildings, Water Bodies, Railways, and Main Wells and Roadways.” The empirical formulas proposed in this specification (Table 1) consider only the mining height of the coal seam and the hardness of the overlying strata, and they apply to single-seam mining heights not exceeding 3 m; they are therefore insufficient to reflect the combined effect of multiple influencing factors.

In recent years, data-driven methods such as decision trees (DT), support vector machines (SVM), random forest regression (RFR), artificial neural networks (ANN), and multiple nonlinear regression (MNR) have gradually become mainstream for predicting the development height of water-conducting fracture zones and have improved the prediction accuracy to a certain extent. For example, He et al. [26] predicted H_{W} under longwall mining conditions using a multiple regression approach, which effectively reflected the relationship between H_{W} and different mining conditions. Further, Zhao and Wu [27] proposed a prediction method for H_{W} based on RFR. However, the significant diversity and complexity of geological conditions in China’s mining areas mean that mining influences H_{W} to differing degrees in different regions; the large study areas used in previous prediction models have reduced the prediction accuracy to a certain extent, and the adaptability of the prediction methods is low. Therefore, current prediction models need to be improved to carry out fine zoning predictions with improved accuracy [26, 28].

In this study, the relationship between H_{W} and M, W, D, and b was analysed, and the EWM-MNR prediction model was established, taking the Yushen mining area in the north of Ordos as the study area.

To verify the validity of the EWM-MNR model, it was applied to the Xiaobaodang coal mine in the Yushen mining area. In addition, RFR and SVR models were constructed to facilitate a comparison with the EWM-MNR model. The results show that the EWM-MNR model has good prediction performance, and the prediction results are in good agreement with the field measured data, which verifies the feasibility and accuracy of the proposed method.

## 2. Study Area and Data Collection

### 2.1. Overview of the Study Area

The Yushen mining area is located in the middle of the Jurassic coalfield in northern Shaanxi and is one of the main mining areas of the northern Shaanxi coal base. The mining area covers about 5,265 km^{2} and borders the Maowusu Desert and the Loess Plateau (Figure 1(a)). It has an average annual rainfall of about 400 mm and an average annual evaporation of about 2,000 mm, with a fragile ecological environment and a shortage of water resources.

The coal seams in the Yushen mining area have favourable mining conditions, excellent coal quality, large reserves, and a simple geological structure, and the dip angle of the seams is <10°. The burial depth of the main mining seam in the area is generally greater than 100 m and increases toward the west; the maximum burial depth exceeds 500 m, and the average recoverable coal thickness is 6.50 m. The surface water system in the area is relatively developed and primarily comprises the Kuye River and its tributaries in the northeast, the Tuwei River and its tributaries in the middle, and the Yuxi River and its tributaries in the southwest. According to the groundwater occurrence conditions and hydraulic characteristics, the water-bearing rock formations in the study area can be divided into two categories: the Sala Wusu group aquifer and the sandstone aquifer, as shown in Figure 1(b).

The Sala Wusu group aquifer is the only groundwater resource with a large-scale water supply and ecological significance in the study area under natural conditions. The region bears multiple strategic responsibilities for energy supply, water conservation, and ecological protection and restoration.

### 2.2. Measured Data Collection

Measured data on H_{W} and M, W, D, and $b$ after coal seam mining in the Yushen mining area were collected (Figure 2). The proportion coefficient of hard rock ($b$) is the ratio of the total thickness of hard rock strata to the estimated height of the water-conducting fracture zone (WCFZ) above the coal seam [26]. The calculation equation is as follows:

## 3. Prediction Method

where the dependent variable is H_{W}; $x_1, x_2, \cdots, x_n$ are the independent variables, corresponding to M, D, W, and b, respectively; $\alpha_n$ is the regression coefficient; and $\beta$ is the random error.

where the entropy value is that of the $j$th evaluation attribute, and $P_{ij}$ is the weight occupied by the value of the $i$th evaluation indicator under the $j$th indicator. $x_{ij}$ is the value of the $j$th coefficient of group $i$ in the data of Table 2. There are 20 data sets and four influencing factors in this study; therefore, the maximum value of $i$ is 20, and the maximum value of $j$ is 4. In addition, we specify that, when $P_{ij}=0$ or $P_{ij}=1$, $P_{ij}\ln P_{ij}=0$.

where the weight is that of the $j$th influencing factor.
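The entropy weight calculation described above can be sketched in Python. This is a minimal illustration that assumes the standard entropy weight formulation (column-wise proportions $P_{ij}$, entropy normalized by $\ln n$, and weights proportional to one minus the entropy); the paper's own equations and any preprocessing of the raw data are not reproduced here. The function name `entropy_weights` and the sample matrix are hypothetical and are not taken from Table 2.

```python
import math

def entropy_weights(data):
    """Return one entropy weight per column of `data` (rows are samples).

    Assumes the standard entropy weight method with non-negative values;
    an illustrative sketch, not the authors' implementation.
    """
    n = len(data)            # number of samples (i = 1..n)
    m = len(data[0])         # number of influencing factors (j = 1..m)
    entropies = []
    for j in range(m):
        column = [row[j] for row in data]
        total = sum(column)
        e = 0.0
        for x in column:
            p = x / total            # P_ij: share of sample i within factor j
            if 0.0 < p < 1.0:        # convention: P ln P = 0 when P is 0 or 1
                e -= p * math.log(p)
        entropies.append(e / math.log(n))   # normalize the entropy to [0, 1]
    diversity = [1.0 - e for e in entropies]
    total_diversity = sum(diversity)
    return [d / total_diversity for d in diversity]

# Hypothetical 4-sample, 2-factor matrix (not data from Table 2).
sample = [[1.0, 5.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
weights = entropy_weights(sample)
print(weights)
```

In this toy matrix the first factor is constant across samples, so it carries no information and receives a weight of essentially zero, while the second factor takes almost all of the weight. A factor whose values vary more between samples is weighted more heavily.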

## 4. Results and Discussion

In this study, a multiple nonlinear regression model was used to determine the relationship between the development height of the hydraulic fracture zone and other influencing factors; finally, the corresponding relationship equation was derived.

To establish the prediction model, the data in Table 2 were divided into two groups: 80% of the data were used as training samples to establish the prediction model, and 20% were used as test samples to test it. Nos. 1–16 in Table 2 are used as training samples, and Nos. 17–20 are used as test samples.

Finally, the accuracy of the prediction model established in this study was verified through a comparative analysis and field engineering applications.

### 4.1. EWM-MNR Model

To explore the correlation between H_{W} and $M$, $W$, $D$, and $b$, a single-factor regression analysis was conducted using SPSS software to establish 11 basic models for H_{W} and each influencing factor, as shown in Figure 3. There is a significant direct relationship between H_{W} and each regression variable except $b$.

To eliminate the interference of $M$ with the effect of $b$, the ratio of H_{W} to coal seam mining height (H/M) is introduced. The larger the value of H/M, the larger the value of H_{W} under the same coal seam mining height conditions.

Figure 3 shows the relationship between H_{W} and $M$, $W$, $D$, and $b$. As shown in Figures 3(a)–3(c), H_{W} increases with M, D, and W, respectively; however, the rate of increase decreases and tends to level off. As shown in Figure 3(d), there is no direct correlation between H_{W} and $b$. Figure 3(e) shows that H/M is positively correlated with $b$, and the rate of increase grows after the proportion coefficient of hard rock reaches about 0.65.

The fit of H_{W} with the other single factors was evaluated in each of the 11 models. Based on the $R^2$ and significance (sig) of each model, the optimal relationship between H_{W} and each single-factor regression variable was obtained. The specific relationship between H_{W} and each single-factor regression analysis is shown in Equation (8).
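The idea of choosing the best single-factor form by its goodness of fit can be sketched as follows. This is a simplified illustration, not the SPSS procedure used in the study: it compares only three candidate forms that are linear in a transformed predictor (the study compares 11), it ranks them by $R^2$ alone without the significance test, and the data are synthetic. The names `fit_line`, `r_squared`, `CANDIDATES`, and `best_single_factor_model` are hypothetical.

```python
import math

def fit_line(u, y):
    """Ordinary least squares for y = a + b*u; returns (a, b)."""
    n = len(u)
    mean_u, mean_y = sum(u) / n, sum(y) / n
    sxx = sum((ui - mean_u) ** 2 for ui in u)
    sxy = sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, y))
    b = sxy / sxx
    return mean_y - b * mean_u, b

def r_squared(y, y_hat):
    """Coefficient of determination of fitted values y_hat against y."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Candidate single-factor forms, each linear in a transformed predictor.
CANDIDATES = {
    "linear": lambda x: x,                  # y = a + b*x
    "logarithmic": lambda x: math.log(x),   # y = a + b*ln(x)
    "inverse": lambda x: 1.0 / x,           # y = a + b/x
}

def best_single_factor_model(x, y):
    """Fit every candidate form and return (best name, R^2 per form)."""
    scores = {}
    for name, transform in CANDIDATES.items():
        u = [transform(xi) for xi in x]
        a, b = fit_line(u, y)
        scores[name] = r_squared(y, [a + b * ui for ui in u])
    return max(scores, key=scores.get), scores

# Synthetic data generated from y = 20 + 60*ln(x); not measurements from Table 2.
x = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [20.0 + 60.0 * math.log(xi) for xi in x]
best, scores = best_single_factor_model(x, y)
print(best, scores)
```

Because the synthetic response rises quickly and then levels off, the logarithmic form fits it best, which mirrors the flattening trend described for Figures 3(a)–3(c).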

Based on the relationships between H_{W} and the influencing factors obtained from the one-way regression analysis, the EWM-MNR model was finally proposed.

### 4.2. Error Analysis

Assessing the accuracy of a predictive model is an important step to complete prior to its application. To further verify the accuracy of the EWM-MNR model, an RFR model and an SVR model were established using the same training samples in a Python environment. Figure 4(a) compares the predicted values obtained from the empirical equations in Table 1 and from the EWM-MNR, RFR, and SVR models. The reliability of each model was evaluated using two metrics: the coefficient of determination ($R^2$) and the root mean square error (RMSE).

The RMSE and $R^2$ of the different models were calculated according to Equations (10) and (11), respectively, and the calculation results are shown in Table 4.
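The two evaluation metrics, together with the mean absolute error used when discussing the residuals, can be sketched in Python as follows. This assumes the usual textbook definitions of RMSE and $R^2$, which may differ in notation from Equations (10) and (11); the function names and the sample values are hypothetical and are not data from Table 2 or Table 4.

```python
import math

def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)

def r2(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

def mean_abs_error(measured, predicted):
    """Average absolute residual."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)

# Hypothetical heights in metres, for illustration only.
measured = [100.0, 120.0, 140.0, 160.0]
predicted = [102.0, 118.0, 143.0, 157.0]

print(rmse(measured, predicted))
print(r2(measured, predicted))
print(mean_abs_error(measured, predicted))
```

A lower RMSE and an $R^2$ closer to 1 both indicate closer agreement between predictions and measurements, which is the basis for ranking the models.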

The EWM-MNR model has $R^2$ values of 0.97 and 0.96 for the training and validation samples, and RMSEs of 5.51 and 5.09, respectively. The RFR model has $R^2$ values of 0.73 and 0.89 for the training and validation samples and RMSEs of 15.29 and 6.90, respectively. The SVR model has $R^2$ values of 0.82 and 0.85 for the training and validation samples and RMSEs of 12.59 and 8.10, respectively.

Figure 4(b) shows the residual values for the different methods. It can be seen that the error values for the EWM-MNR model range from -12.70 m to 7.01 m, with an average absolute error value of 4.45 m. The error values of the RFR model range from -37.53 m to 37.91 m, with an average absolute error value of 9.13 m.

The error values for the SVR model range from -33.13 m to 25.43 m, with an average absolute error value of 8.31 m. The error values for the first medium-hard formula in Table 1 range from -116.92 m to -3.25 m, with an average absolute error value of 64.57 m. The error values for the second medium-hard formula in Table 1 range from -104.73 m to 0.26 m, with an average absolute error value of 58.57 m.

These results show that the predicted values of the EWM-MNR model proposed in this study are very close to the measured values of the training and validation samples, with a lower RMSE and a higher $R^2$ than the RFR and SVR models, which indicates better prediction performance. The model is therefore more suitable for H_{W} prediction under high-intensity mining conditions in the Yushen mining area. In addition, the prediction model proposed in this study will be continuously updated in the future to make it more widely applicable.

## 5. Case Study: 112201 Working Face of the Xiaobaodang Coal Mine

### 5.1. Overview of Working Face 112201 and Two Investigation Drillholes

The 112201 working face of the Xiaobaodang No. 1 Coal Mine in the Yushen mining area is the first mining face of this coal mine. The working face is 4660 m long and 350 m wide, the dip angle of the coal seam is 1°, the mining height of the coal seam is 6 m, and the coal is recovered using the fully mechanized longwall mining method. The 2^{-2} coal in the working face is located at the top of the fourth section of the Yan’an Group and is the thickest recoverable coal seam in the area. The ground elevation is 1283–1330 m, the burial depth of the 2^{-2} coal is 300–400 m, and the bottom elevation of the coal seam is +930 m to +970 m. To accurately detect the height of the WCFZ, two holes were drilled in the mined area of the 112201 working face. Figure 5 shows the locations of drill holes D1 and D2.

### 5.2. Application of the Prediction Model

Predicting the height of the WCFZ before mining is necessary. To better verify the accuracy of the prediction model, the EWM-MNR, RFR, and SVR models were used to predict the development height of the water-conducting fracture zone at the D1 and D2 drill holes before the recovery of the 112201 working face. The prediction results of the different models are shown in Table 5.

### 5.3. Field Measurements

During mining, the height of WCFZ at the 112201 working face was observed using a combination of drilling fluid loss monitoring and downhole color TV records.

Figure 6 shows the flushing fluid leakage and in-hole TV detection results during the construction of different boreholes. From the D1 flushing fluid loss in Figure 6(a), it can be seen that the top boundary of the H_{W} is 134.80 m, and the in-hole TV detection results show that 138.30 m is the top boundary of the H_{W} in this borehole. The D2 flushing fluid loss in Figure 6(b) shows that the top boundary of the H_{W} is 143.58 m, and the in-hole TV detection result shows that 142.18 m is the top boundary of the H_{W} in this borehole.

When recording the flushing fluid leakage in segments during drilling, there is an error in terms of observation lag, which leads to certain deviations in the recorded flushing fluid leakage location. The TV detection in the borehole involves the continuous real-time observation of the entire borehole section, and it can intuitively locate and quantitatively describe the fracture development and distribution position inside the rock body with high accuracy. Therefore, the H_{W} of the in-hole TV detection was finally adopted.

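The H_{W} is calculated as follows:

$$H_{W} = E_1 - E_2 - H_1 \quad (12)$$

where $H_1$ is the distance from the top boundary of the H_{W} to the ground.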

Finally, the $H_1$ value for the H_{W} top boundary of the D1 borehole is found to be 123.30 m, and that of the D2 borehole is 142.18 m. For the D1 borehole, $E_1$ is 1281.63 m and the corresponding $E_2$ is 981.26 m; substituting these into Equation (12) gives $H_W = 1281.63 - 981.26 - 123.30 = 177.07\,\text{m}$. For the D2 borehole, $E_1$ is 1288.76 m and the corresponding $E_2$ is 987.80 m; substituting these into Equation (12) gives the measured value $H_W = 1288.76 - 987.80 - 142.18 = 158.78\,\text{m}$. A comparison of the field measured results of the D1 and D2 boreholes with the prediction results of the different models is shown in Table 6.
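The substitution into Equation (12) can be checked with a few lines of Python. The sketch below simply repeats the arithmetic given above for the two boreholes; the function name `fracture_zone_height` and the dictionary layout are hypothetical.

```python
def fracture_zone_height(e1, e2, h1):
    """H_W = E1 - E2 - H1, the arithmetic used for boreholes D1 and D2."""
    return e1 - e2 - h1

# (E1, E2, H1) in metres, as quoted in the text for each borehole.
boreholes = {
    "D1": (1281.63, 981.26, 123.30),
    "D2": (1288.76, 987.80, 142.18),
}

heights = {
    name: round(fracture_zone_height(*values), 2)
    for name, values in boreholes.items()
}
print(heights)
```

The rounded results reproduce the measured heights of 177.07 m for D1 and 158.78 m for D2.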

The comparison between the field measured results and those from the different models shows that the relative errors between the predicted and field measured values of the EWM-MNR model proposed in this paper are 7.10% and 1.00%, those of the RFR model are 23.75% and 15.06%, and those of the SVR model are 20.00% and 8.52%, respectively. This indicates that the results of the proposed EWM-MNR prediction model agree with the field measurements more closely than the results of the other prediction models.
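The relative error used in this comparison can be sketched as the absolute difference between predicted and measured heights divided by the measured height. The definition is an assumption, since the paper does not state the formula explicitly; the predicted and measured values for the EWM-MNR model are those quoted in the conclusions, and the function name `relative_error_percent` is hypothetical.

```python
def relative_error_percent(predicted, measured):
    """Absolute relative error of a prediction, in percent of the measured value."""
    return abs(predicted - measured) / measured * 100.0

# (predicted, measured) H_W in metres for the EWM-MNR model, as quoted in the conclusions.
ewm_mnr = {"D1": (164.51, 177.07), "D2": (157.16, 158.78)}

errors = {
    hole: relative_error_percent(predicted, measured)
    for hole, (predicted, measured) in ewm_mnr.items()
}
for hole, error in errors.items():
    print(f"{hole}: {error:.2f}%")
```

Under this definition the two errors come out at roughly 7.09% and 1.02%, close to the 7.10% and 1.00% quoted in the conclusions; the small differences are consistent with rounding of the tabulated heights.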

## 6. Conclusions

To ensure the safe operation of coal mines, this study proposed an H_{W} prediction method based on the EWM-MNR model that is applicable to the high-intensity mining conditions in the Yushen mining area, and then applied the model to the Xiaobaodang coal mine in the Yushen mining area. The main conclusions of this study are as follows:

- (1) The EWM-MNR model was proposed using 20 sets of measured data from the Yushen mining area, and the indicators affecting H_{W} in this model mainly include four regression variables, $M$, $D$, $W$, and $b$. This model improves the prediction accuracy and stability of the H_{W}.

- (2) The RFR and SVR models were used to compare the accuracy of the EWM-MNR model using the same training data. The RMSEs of the training and test samples of the EWM-MNR model were lower, at 5.51 and 5.09, and the $R^2$ values were higher, at 0.97 and 0.96. In contrast, the $R^2$ values of the RFR model were 0.73 and 0.89, with RMSEs of 15.29 and 6.90, respectively. The corresponding $R^2$ values for the SVR model were 0.82 and 0.85, and the RMSEs were 12.59 and 8.10.

- (3) The prediction model proposed in this paper was applied to the 112201 working face of the Xiaobaodang coal mine. The predicted and field measured values of the H_{W} were 164.51 m and 177.07 m for hole D1, respectively, representing a relative error of 7.10%; the predicted and field measured values for hole D2 were 157.16 m and 158.78 m, respectively, reflecting a relative error of 1.00%. The accuracy and applicability of the prediction model in the high-intensity mining area were further verified.

- (4) The field measurement results of the 112201 working face show that the EWM-MNR model proposed in this paper can better predict the H_{W} under high-intensity mining conditions. The prediction model has important guiding significance for the synergistic issues of water damage control and groundwater resource protection in western high-intensity mining areas.

## Data Availability

The experimental test data used to support the findings of this study are available from the corresponding author upon request.

## Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

## Authors’ Contributions

D. F, E. H, and X. X conceptualized and designed the study. D. F and P. H contributed to the critical interpretation of the data. Manuscript drafting and critical revision prior to submission were performed by all authors.

## Acknowledgments

This study was sponsored by the National Natural Science Foundation of China (No. 42177174), the Basic Research Program of Natural Science of Shaanxi Province (2020ZY-JC-03), and the Shaanxi Province Joint Fund Project (2021JLM-09).