Land reclamation not only provides valuable space for urban development, but also creates an upper aquifer in fill materials. Analysis of groundwater level (GWL) fluctuations in coastal aquifer formed due to land reclamation can provide important insight into the groundwater system (GWS) evolution, including the connectivity between the GWL and influencing variables (ocean tide and rainfall). This study presents wavelet analysis, multichannel SSA-wavelet analysis (MSSA-WA), and lag correlations to analyze the response of GWL to ocean tide and rainfall in the reclamation area of Zhoushan Island, China. The MSSA-WA results and the lag correlations show that the MSSA-WA provides better analysis results, specifically, clay layer and rainfall filtered information. The influence of the influencing variables on the upper GWL is relatively greater than the clay layer, and rainfall has a relatively stronger impact on GWLs than tides. The GWLs of the upper layer, SW18 and SW21, which are heavily influenced by influencing variables, can be predicted through variations in influencing variables. Finally, the analysis of the results shows that the lithology of different aquifers, offshore distance, preferential flow path, and pressure load can be factors between tides and GWLs. For rainfall and GWLs, different lithology of aquifers, properties of vadose zone, and topography can be influential factors. The combination method provides an optimization method for GWL fluctuations in coastal reclamation area with combined MSSA and wavelet analysis for correlation analysis between GWL and influencing variables (ocean tide and rainfall) and analysis of corresponding causes and influencing factors.

Coastal land reclamation, a practice that is used to alleviate land pressure caused by increased population and economic driving factors, is widely applied around the world [16]. Land reclamation may have significant influences on groundwater system (GWS) [2, 7]; for example, the groundwater level (GWL) rises and the salt water-fresh water interface (SFI) moves seaward [3, 8]. Groundwater fluctuation in a coastal aquifer could potentially be related with ocean tides, precipitation, and other human activities [9, 10]. Earth tides or atmospheric tides create water head oscillations [11], and the pressure of tidal wave causes the periodic fluctuation of GWL [12]. Precipitation pushes the coastline position seaward, leading to the refreshment of the aquifer in the original coastal areas and some reclamation areas, which can eventually lead to corrosion of buildings and highway foundations [13]. Especially in urban reclamation areas, the potential damage is immense during heavy rainfall [14]. Groundwater and GWL dynamics may lead to groundwater flooding [15], and water table buildup is likely to lead to the settlement of buildings in the area [14]. However, these effects are a slow and inconspicuous process, and hence may easily be overlooked [3, 4]. Therefore, enhanced research on the evolution of GWS is needed for the evaluation of the potential engineering hazards and protection of the ecological environment in coastal reclamation areas.

There have been extensive studies about the evolution of GWS in coastal reclamation area. A summary of these studies is given in Table 1. Mahmood and Twigg (1995) applied statistical analysis to study the evolution of GWS in coastal reclamation area [16]. However, previous research methods and contents are relatively simple. From 2000 to 2008, a research team of the Department of Geosciences of the University of Hong Kong mostly used analytical solutions to study GWL, the groundwater divide (GWD), submarine groundwater discharge (SGD), and the SFI. However, most of these analytical solutions are steady state, so they can be applied to study the long-term effect of land reclamation on GWF [4, 8, 13, 1720]. The solution of Jiao et al. (2001) ignored SFI, which can be used as a simple and fast estimation [18]. Guo and Jiao (2007) did not consider the seepage surface on the coastline, so the cumulative water level was higher than the actual situation [8], and then, they used transient analytical solutions [2, 7] and groundwater simulation model (FEFLOW, PHREEQC 2.0) [3, 21] to study the evolution of GWS in coastal reclamation sites. However, when the groundwater escapes in the form of spring or leakage, the water level rise calculated by the transient analytical solution will be overestimated [2, 7]. The model of Hu and Jiao (2010) had the limitations of inadequate calibration and overestimated water level, which accounted to the lack of data and the neglect of the slow consolidation process of the soil in the reclamation site [3]. In 2015 and 2016, there have been many studies based on monitoring data. Salem et al. (2015) used 23 groundwater samples and GWLs measured in 40 observation wells in summer 2008 [22]. Liu et al. (2016) designed submarine groundwater monitoring wells and water samples based on field data to obtain GWL, groundwater electrical conductivity (EC), temperature, and hydrochemistry data [23]. Zhang et al. (2019) used wavelet analysis to simply study the response of GWL to tide in coastal reclaimed land and considered the short-term change of GWL. However, the accuracy of results in clay layer is relatively low [14]. Recently, fresh groundwater lenses (FGL) in coastal reclamation islands have received more attention. Sheng et al. (2020) and Sheng et al. (2021) respectively, used a numerical model (a numerical model of variable density flow and solute transport) and a combination of various methods (sand-tank experiments, sharp-interface analytical solutions, and numerical modeling) to study the influence of land reclamation on FGL and GWL in the South China Sea. However, the numerical model is limited by field data, and the combination of various methods ignores the influence of tidal fluctuation which affects the vertical flow and the release or compression of water [24, 25]. Ling et al. (2021) used a three-dimensional numerical model of variable density flow and solute transport to study the impact of island urbanization on FGL on a small coral island in Sansha City, Hainan Province, China. However, it overlooks environmental factors such as tides and natural disasters [26]. Zhang et al. (2021) developed new analytical solutions for the steady-state heads and nondispersive SFI, which considered extended situations where reclaimed soils were more permeable than the original aquifer. However, it neglects number of real-world characteristics in a real reclamation site such as tides, spatiotemporal recharge, and aquifer heterogeneities [27].

In general, the change of GWL mainly depends on land reclamation, tidal changes and rainfall in reclaimed sites [14]. Table 1 presents the impact of land reclamation on the variation of GWL in coastal areas. Mahmood and Twigg (1995) used statistical analysis to investigate the impact of land reclamation on GWL changes in coastal areas [16]. Since 2000, a research group of the Department of Geosciences of the University of Hong Kong has mostly used analytical solutions and groundwater simulation models to study GWL fluctuations in coastal reclamation areas [24, 7, 8, 1720]. However, most of these studies ignore the short-term changes of GWF, the analytical solutions are mostly stable, the calibration of groundwater simulation models is not strict enough, and the simulated water level is higher than the observed one. Recently, Zhang et al. (2019) considered the short-term variations of GWL [14]. Sheng et al. (2021) and Zhang et al. (2021), respectively, used new analytical solutions and a combination of various methods to study the response of GWL change to land reclamation in coastal areas. However, new analytical solutions ignore the real characteristics of the actual reclamation site and the combination of various methods overlookes the impact of tidal fluctuations [25, 27]. And then, analytical solutions have been mostly used to study the response of groundwater to tides. Ferris (1951) provided the most popular analytical solution (hereafter referred to as “Ferris solution”) for estimating the amplitude and phase of groundwater caused by ocean tides [34]. Williams et al. (1970) and Townley (1995) studied the analytical solution of tidal propagation in aquifers with circular boundaries. However, this analytical solution only considers the straight coastline and does not evaluate Ferris solution of curved coastline [35, 36]. Solόrzano-Rivas & Werner (2020) explored the accuracy of the Ferris solution to curved coastlines. However, the solution adopted the same basic assumption as Ferris’ solution, that is, homogeneous, isotropic, and confined aquifer [37]. In addition, according to different aquifer types, researchers have also conducted a large number of studies. The early analytical solutions were mainly concentrated on a single horizontal aquifer [34, 38]. Guo et al. (2010) derived the analytical solutions to the influence of tidal fluctuation on GWL in a two-zone aquifer consisting of two regions with markedly different hydraulic characteristics [39]. Fang et al. (2018) derived an analytical solution to describe the head fluctuations in an estuarine—coastal aquifer system assumed to be horizontal, but as certain real world, this complex leakage aquifer system is inclined [40]. Zhao et al. (2022) developed a new solution to the influence of double tides on groundwater in a multilayered island leaky aquifer system. However, it will need to consider more complicated models and factors in the future study [41]. Finally, many researchers have recently studied the relationship between groundwater and rainfall based on wavelet analysis [42, 43] and time series analysis [44, 45]. Dong et al. (2022) used methods based on a wavelet technique to investigate the coherence and phase lag between precipitation, river levels, and groundwater [42]. Gu et al. (2022) used partial wavelet coherency (PWC) to evaluate the bivariate relationships between GWL and precipitation or surface water. However, PWC may be used to an unclear research point about distinguishing the response of GWLs to climate change and human activities in future work [43]. Hussain et al. (2022) applied correlation method to investigate the influence of associated rainfall on GWL and established groundwater numerical modeling using WASH123D simulation which should be performed for long period to better plan and manage groundwater resources [44]. Le et al. (2021) used a vector autoregression (VAR) process for bivariate time series data at the annual time scale to interpret the interactions between local precipitation variability and GWLs in shallow and deep aquifers of the Vietnamese Mekong Delta (VMD). However, the measurement of groundwater in the delta is limited in this study [45]. In addition, He et al. (2021) applied ArcGIS to simulate the variability in GWF and GWL attributed to rainfall seepage. However, this study made a unified assumption for the actual GWL of each slope element, which may lead to deviation in the calculated value and inaccuracy of the model [46]. Kong et al. (2022) investigated the variation in the GWL using four trend analysis methods and its response and lag corresponding to rainfall by the HARTT model in the Gnangara region. However, the GWL needs no missing value and is continuous in time to meet the requirements of the model [47].

In order to study the responses of GWL to influencing variables in costal reclamation areas, data-driven models can be used. The wavelet analysis is an analysis method that provides time-frequency representation, and has a wide range of applications in the geophysical processes. Wavelet analysis is a useful method, suitable for the analysis of natural phenomena due to its unique properties [48]. Wavelet coherence is used to analyze the impact of GWL on tides in coastal reclamation areas. However, the effects of the analysis may not be obvious in the clay layer with low permeability [14]. The multichannel SSA (MSSA) extracts potentially useful information from a substantial number of historical data sets, while taking into account the correlation between different time series [49]. Based on the theoretical basis of the MSSA and wavelet analysis, it can be concluded that the multichannel SSA-wavelet analysis (MSSA-WA) mixture model can have the advantages of both methods. The performance of the analysis is improved by means of decomposing the time series into its spectral components, in order to identify useful information and reconstruct the time series [49]. Thus, the combination model is a good option for researching the evolution of groundwater in the coastal reclamation area.

Zhoushan Island (ZSI), the largest residential island city, has a long history of land reclamation from the ocean [50]. After the Three Gorges Dam (TGD) was completed, several changes have taken place to the coasts of the Zhoushan Islands due to the reclamation of artificial siltation, coastal engineering, and harbor dredging materials [51]. Previous studies have shown that the dynamic changes of the GWL in the process of coastal reclamation are greatly affected by the reclamation activities. The factors that affect the change of water level include tides and rainfall [14]. Based on the results of Zhang et al. (2019) [14], here we develop the MSSA-WA model, with tides and rainfall as influence variables, in order to further evaluate the effects of influence variables on the GWL fluctuations in the Zhoushan coastal reclamation area, such as the correlations, lags, and influence factors between the influence variables and GWL. With this composite method, potential useful information is extracted from the original time series containing noise while maintaining the correlation of the GWLs, rainfall, and tides, so that the response of the GWLs to influence variables in coastal reclamation areas could be studied as clearly as possible. In addition, in the process of wavelet analysis, the interference of other factors could be reduced, and the covered signals could be extracted as much as possible, such as the hidden information in the clay layer and the hidden information between rainfall and GWLs. Finally, the interferences of human factors are reduced to a minimum, different types of data are statistically compared, and a lag correlation is used to quantify the correlation changes with lags.

2.1. Site Description

Zhoushan Island (ZSI) is the major island in the Zhoushan Archipelago, located in the coastal area of the eastern Zhejiang province of China [52, 53]. Its terrain is mainly composed of central hill range, surrounding plains, and tidal flat wetlands. As an island city, the demand that urbanization has for land often causes island cities to cross waters and reach nearby islands and continents, which results in long-term coastal land reclamation activities [54]. The location of the investigated area is in the southeast corner of the ZSI (Figure 1).

2.1.1. Hydrogeology

The study area is in a typical subtropical marine monsoon climate, with annual precipitation of 921.6-1318.8 mm. The tide is semidiurnal, with an average tidal range being 1.91 m–3.31 m, and the maximum tidal range being 3.73 m–4.96 m. The geological cross-section profiles of the boreholes in the area are shown in Figure 2. The stratum mainly consists of three layers, namely, the reclamation layer, the clay layer, and the weathered bedrock under the clay layer. The upper reclamation layer of the nearby mountains mainly consists of granite blocks and gravel, and the clay layer mainly consists of muddy silty clay. The reclamation thickness varies from 0 to 11.8 m, which near the hills is large, and near the seawall is small. The groundwater in the pores of the reclamation layer and the clay layer are two types of groundwater in the area. The area is an artificial reclamation area, and the original GWF and recharge and discharge conditions have been destroyed. At present, the area is recharged by piedmont runoff in the west. When the river water level exceeds the GWL, the groundwater is recharged; otherwise, the groundwater is recharged to the river. However, the largest recharge source in the area is atmospheric precipitation attributed to the small runoff in front of the mountain. Due to the barrier of the dam, the natural discharge of groundwater to seawater is blocked in the area. The main discharge channel is the drainage pipe established in the dam. When the GWL is high, it can be discharged to seawater through the drainage pipe. At the same time, evaporation is also an important way of groundwater discharge in this area.

2.1.2. Monitoring Borehole

The wells in the study area are shown in Figures 1 and 2. SW13 monitors the upper filled layer, SW14 monitors the lower filled layer, SW21 monitors the upper clay layer, and SW22 monitors the lower clay layer. SW15, SW17, SW19, and SW25 are used to monitor the filled layer. SW16, SW18, SW20, and SW26 are used to monitor the clay layer. SW25 and SW26 are the control points in the north. Short-term and long-term monitoring of these wells were performed in order to obtain water levels at half-hour intervals.

2.1.3. Data Collection

The data used in this study include groundwater, tide, and rainfall data. The groundwater data are time series of the recorded GWLs in the upper filled layer (SW13, SW14, SW15, SW17, SW19, and SW25) and the lower clay layer (SW16, SW18, SW20, SW21, SW22, and SW26) from the wells in the study area. The GWLs in these wells are divided into 6 groups, namely, (1) SW13, SW14, (2) SW15, SW16, (3) SW17, SW18, (4) SW19, SW20, (5) SW21, SW22, and (6) SW25, SW26. The GWLs are monitored every 30 minutes, and the daily cumulative sum of GWLs (DAGWL) are calculated. Tide time series are monitored at every hour, and the daily accumulation of the tides (DAT) are calculated. Time series of rainfall are the daily accumulative sums of the rainfall. These hydrogeological time series are divided into two groups: Group 1, including the time series of tides and the hourly time series of groundwater (6 groups), is called the hourly hydrogeological time series. Group 2, including the rainfall time series, DAT, and DAGWL (6 groups), is called the daily cumulative hydrogeological time series.

2.2. Methods

In this paper, the hydrogeological time series included GWLs, rainfall, and tides in the Zhoushan coastal reclamation area. The composite method used in this paper included outlier processing, a systematic data processing method, a multichannel SSA (MSSA), a wavelet analysis, and lag correlation analysis. Among these methods, the quartile method was used to deal with the outliers. The outliers that conform to the trend were kept, and values that change suddenly were mostly processed. The systematic approach in this study included the cumulative departure transformation, detrending, and standardization. The wavelet analysis included continuous wavelet transformation (CWT) and the cross-wavelet transformation (XWT). A simple flowchart of the methodology is shown in Figure 3.

2.2.1. Wavelet Analysis

In this study, the continuous wavelet transform (CWT) with the Morlet wavelet was applied to each individual hydrogeological time series in order to examine the periodicities in the time-frequency space. The Morlet, chosen as the mother wavelet, achieves a good compromise between time and frequency accuracy [5557]. By translating the wavelet in time, the series is localized in time, and by scaling it by frequency, the sequence is localized in frequency:
where s is the scaling factor that controls the width of the wavelet, and τ is a translation parameter that indicates the position of the mother wavelet. s1/2 normalizes the transformed time series in order to make the energy identical on each scale. ψt is the mother wavelet, and ψ¯t refers to the complex conjugation of ψt.
The CWT can be considered as an inner product of the sequence which needs to be analyzed xt and the basis function ψτ,st:
where Wτ,s is the calculated CWT coefficients that indicates the similarity of the time series and the wavelet at the current frequency. Wτ,s2 refers to the power spectrum of the wavelet. The above equation shows that the CWT is a measure of similar frequency content between the wavelets and the time series.
Subsequently, the cross-wavelet transform (XWT) was used in order to detect significant common power and local relative phase between the two hydrogeological time series in the time-scale plane. Following Hudgins et al. (1993) [58], the XWT of two time series xt and yt is defined as
where Wxτ,s and Wyτ,s are the wavelet transforms of the time series xt and yt, respectively, and W¯yτ,s is a complex conjugation of Wyτ,s. The arg (Wx,yτ,s) denotes the phase relationship of the two sequences in the time-frequency domain. The cross-wavelet power is defined as Wx,yτ,s. This definition describes the local covariance of two time series at different times and frequencies, which implies a common high energy region and a phase relationship in the time-scale plane [56, 59].

2.2.2. Multichannel SSA-wavelet analysis (MSSA-WA)

The wavelet-based method can explain temporal (or spatial) changes in the spectral features [60]. The MSSA pays more attention to the common effect of a set of time series [61]. The MSSA-WA combines the MSSA and the wavelet analysis. The MSSA-WA can simultaneously extract useful components from the multivariate time series while maintaining the spatial correlation, examine the periodicities, and detect significant common power and local relative phase between two time series in the time-scale plane.

As an extension of the SSA, the MSSA approach does not require prior information on the function, nor a stochastic model [62]. It simultaneously analyzes multivariate time series with common structural characteristics [63] and extracts components related to eigenelements from the analyzed multivariate time series [64]. Accordingly, the MSSA was chosen in order to simultaneously extract useful components from the multivariate time series while maintaining the spatial correlation.

Let
be an ensemble of the original time series with L being channels with the length N. Here, the hydrogeological time series were selected as the original time series.
The starting point of the MSSA is to consider a matrix Xl~ composed of the M time-lagged copies of the raw multivariate time series Xln [65]:
where N=NM+1 and M denotes the length of the window. Usually, the MSSA is used for several principal PCA components of the spatial data, and the choice of M is large enough to obtain detailed temporal and spectral information from the original time series [66].
A grand lag covariance matrix T~ can be obtained from a full augmented trajectory matrix X~=X~1,X~2,,X~L[66]:
where Tll is the covariance matrix of channel l and l time series.
Then, the process of the diagonalization is performed after the singular value decomposition (SVD),
gives eigenvalues λk and eigenvectors Ekk=1,2,,L×M of T~. The eigenvectors Ek are called the spatiotemporal empirical orthogonal functions (ST-EOFs).
The associated spatiotemporal principal components (ST-PCs), denoted as Akn, are calculated by projecting the time series Xln onto Ek:
where k and l are the kth eigenvector and lth channel, respectively. The sum of j can be explained as a classical finite impulse response (FIR) that filters operations on channel l. The combination of the summation over j and the filtered channel represents a classic PCA [67].
Using the following equation, the partial signal Rlkn reconstructed by the ST-PCs and the ST-EOFs is defined as

In essence, this partial reconstruction can help in the process of separation of the mixture of signals of different frequencies in the original time series [62].

Finally, the equation of MSSA-WA could be obtained to set the reconstructed time series as xt and it was substituted into the above formula of wavelet analysis.

2.2.3. Lag Correlation Analysis

This research applied a systematic approach developed by Hanson et al. (2004) [68] which takes as a first step the preprocessing of the hydrogeological time series, thus enabling the reduction of the anthropogenic effects [6870] and red noise from the raw time series [71], and the comparisons of different types of data, frequencies, and hydrological environments [68]. Following Dickinson et al. (2014) [72], the cumulative departure transformation, detrending, and standardization are described as follows. The cumulative departure transformation, which can eliminate the influence of the zero value, is calculated as the sum of the difference between the continuous value and the mean value of the time series. The detrended time series is calculated as the difference between the fitting polynomial and the original time series at time t. Standardization, which is used for statistical comparison of different data types, converts a group of variables with normal distribution into a new variable with normal distribution with the sample mean value of 0, and the sample standard deviation of 1. The preprocessing method was applied to the reconstructed GWL (RGWL), the reconstructed rainfall (RR), and the reconstructed tide (RT), which were named as the preprocessed reconstructed GWL (PRGWL), the normalized departure reconstructed rainfall (NDRR), and the preprocessed reconstructed tide (PRT), respectively. Secondly, a lag correlation, which was obtained from the USGS Hydrologic and Climate Analysis Toolkit, can quantify the correlation changes with lags, especially the lags that have the strongest correlation between the dependent time series [72]. The strongest lag correlations between the NDRR or PRT and PRGWL from all the wells were calculated in order to measure the lag relationship between the time series and to better understand the hydrogeological systems, without the use of dynamic numerical models [73].

3.1. Optimization Results from the MSSA-WA

Given that they are central to the study area, the time series of GWLs from wells SW17 and SW18 were used. The MSSA-WA results of hydrogeological time series corresponding to wells SW17 and SW18 are shown from Figures 47.

Figure 4(a) shows the first 20 partially reconstructed components (RC) of the hourly hydrogeological time series, which correspond to wells SW17 and SW18 in Group 1, among which the first 12 correspond to the eigenvalues shown in Figure 4(b). In the first 12 order RCs, the same period appears in orders 2-3, 5-6, 7-8, and 11-12, which can be viewed as a group of periodic signals, respectively. From the 13th onward, the RCs are high-frequency signals or noise, which were not analyzed in this study. In Figure 4(b), most of the variances of time series are included in the first 12 eigenvalues, which decrease and gradually form a plateau after the 12th eigenvalue. The eigenvalues 2-3, 5-6, 7-8, and 11-12 form pairs, respectively. Thus, the hourly hydrogeological time series corresponds to wells SW17 and SW18 in Group 1, which were reconstructed using the first 12 RCs. The daily cumulative hydrogeological time series corresponds to wells SW17 and SW18 in Group 2, which were reconstructed using the first 9 RCs, in which RCs 2-3, 6-7, and 8-9 are sets of periodic signals, respectively (Figures 5(a) and 5(b). Figures 4(c) and 5(c) show reconstructions of the hydrogeological time series corresponding to wells SW17 and SW18 in Group 1 and Group 2, respectively. This illustrates that the time series are well recovered, given the correlation between different time series (Figures 4(c) and 5(c)).

Figure 6 shows the wavelet analysis and the MSSA-WA of the hourly hydrogeological time series (the tides, and the hourly GWL for wells SW17 and SW18 in Group 1). For the wavelet analysis, there are intermittent significant regions on the half-day and one-day time scales in the CWT of the GWL for well SW17. In addition, few significant regions were observed in the well SW18 (Figure 6(a)). The XWT between the tides and GWLs for the well SW17 has continuous significant regions at the half-day time scale, across the entire transect, and intermittent significant regions on a one-day time scale (Figure 6(c)). The XWT of the tides and GWLs for the well SW18 within the 5% significant regions are mostly concentrated before February 2016 (Figure 6(c)). Compared to the wavelet analysis, the MSSA-WA can show substantially more information, especially for the well SW18 in the clay layer (Figure 6). The MSSA-WA has continuous significant regions on half-day time scales, and one-day periodicity was found with the naked eye over the whole study period. In addition, the power of the SW18 (relatively moderate) is weaker than that of the SW17 (relatively strong) in Figures 6(b) and 6(d), respectively. As shown in Figure 7, the MSSA-WA shows continuous significant regions at larger than 8-day time scales, when compared to the wavelet analysis. The calculation of the daily accumulation may affect the periodic relationship between the tides and GWL compared to Figure 6. The hourly hydrogeological time series of Group 1 are chosen for the relatively accurate study of the relationship between the tides and GWLs [74]. However, because this study only had the data on the daily accumulated rainfall, the daily accumulated hydrogeological time series of Group 2 were used in order to analyze the relationship between rainfall and GWL. Overall, Figure 6 shows that half-day and one-day periods are the main periodic components. The influence of tides on the GWL in the SW17 is greater than that in SW18. Furthermore, the findings showcased in Figures 6 and 7 that the MSSA-WA can display more useful components hidden by the noise.

3.2. MSSA-WA between Influencing Variables and Groundwater Level Responses

Figure 8(a) shows the reconstruction of the hourly GWL of Group 1 and the MSSA-WA from the hourly hydrogeological time series, including the tides and the hourly groundwater time series from wells (6 groups) of Group 1, as shown in Figure 9. For the upper filled layer, there are generally two periods during the entire study period: half-day and one-day (Figure 9(a)) which correspond to the regular half-day and one-day changes in the time series of GWLs of the filled layer shown in Figure 8(a). The MSSA-WA between the SW19 and the tides has the highest power at a half-day period from 2015 to 2017 (Figure 9(a)). This is related to the presence of relative peaks in the groundwater time series of the SW19 from 2015 to 2017 (Figure 8(a)). In Figure 9(a), the strong power of the SW13 and SW14 with tide at half-day is lower than SW19, because the large peaks in the groundwater time series of SW13 and SW14 are less (Figure 8(a)). As shown in Figure 9(a), the MSSA-WA of the SW17 with tides has a relatively strong power at half-day that is approximately similar to the SW15. However, it has a stronger power at one-day when compared to the SW15, which is consistent with the GWL time series from SW15 and SW17 (Figure 8(a)). The MSSA-WA of the SW25 shown in Figure 9(a) indicates relatively weak power, which corresponds to the variations in their reconstructed series (Figure 8(a)).

The MSSA-WA of the hourly hydrogeological time series in the clay layer also has period variations at half-day and one-day, but with weaker degrees compared to the upper filled layer (Figure 9(b)). The power of the SW22 is moderate at half-day and one-day, and that of the SW21 is almost moderate, but only slightly weaker (Figure 9(b)). SW17 has similar variations as SW18 (relatively moderate at half-day) but has sharper degrees (Figures 8(a) and 9(b)). The MSSA-WA of the SW16 and the SW20 with tides, showcased in Figure 9(b), both have relatively week power, which agrees with the variations in the SW16 and the SW20 in Figure 8(a). The power of the SW26 with tides at the half-day and one-day periods is the lowest and is not in the 5% of the significant regions of the entire study period (Figure 9(b)), which corresponds to the very weak variations of their reconstructed series in the half-day and one-day periods (Figure 8(a)).

As illustrated in Figure 9, there are two main relative phase relationships between the tides and the GWLs over the periods of half-day and one-day, as shown by arrows: (1) the in-phase pointing right at half-day: SW15, SW17, and SW19; (2) the antiphase pointing left at half-day: SW13, SW14, SW18, and SW25; (3) the in-phase pointing right at one-day: SW19; and (4) the antiphase pointing left at one-day: SW13-SW16, SW20, SW25, and SW26. The direction of the variable arrows at different times indicates that the GWLs have different response times for tides at different times.

Figure 8(b) illustrates the reconstruction of the daily cumulative GWL of Group 2 and the MSSA-WA of the rainfall and daily GWLs (6 groups) in Group 2 (Figure 10). In Figure 10, the power of SW13, SW14, SW17, SW18, SW25, and SW26 with rainfall at 5% significance levels all have relatively strong to strong power in a period of about 8 to 64 days: (1) the MSSA-WA of SW13, SW14, SW17, and SW18 correspond to approximately the same variations of their reconstructed series in Figure 8(b), and (2) the MSSA-WA of SW25 and SW26 are consistent with the findings that the variations in the SW25 and the SW26 are roughly similar, but have different degrees (Figure 8(b)). As shown in Figure 10, there are common power in the SW15, SW16, SW19, and SW20 at the 5% significance level, i.e., around 6 to 8 days and 10 to 16 days, respectively, but have different degrees: (1) relatively moderate and relatively strong in the SW15; (2) relatively strong and strong in the SW16; (3) relatively moderate and relatively strong in the SW19; and (4) relatively strong and strong in the SW20. SW15, SW19, and SW20 also have other large period components within the 5% significance level: (1) about 36 to 76 days in the SW15 (strong); (2) about 24 to 65 days in the SW19 (strong); and (3) about 32 to 44 days in the SW20 (strong) (Figure 10). In addition, the common power of the SW21 (relatively moderate) and the SW22 (relatively strong) at the 5% significant level are from 5 to 8 days and from 13 to 15 days. There are also two other periods with strong power in the SW21: around 33 to 44 days and more than 50 days. Those variations in their reconstructed time series which correspond to these findings can also be seen in Figure 8(b).

For SW13, SW14, SW17, SW18, SW25, and SW26 in Figure 10, the relative phase relationships between the rainfall and GWLs over the periods in the 5% significance regions are in-phase. For the SW15 and the SW19, phase relationships in the 5% significance regions are almost in-phase, and the reverse phase occurs in few regions (Figure 10). In Figure 10, the SW16 and the SW20 are in-phase with the rainfall during the period of around 6 to 8 days; the phase relationship of the SW16 is changing from the in-phase to the antiphase in other significant regions, and that of the SW20 is in the antiphase during the period of about 32 to 44 days. The SW21 and the SW22 have the same phase relationships with rainfall at the 5% significant regions. The SW21 and the SW22 are in the antiphases with the rainfall at about 5 to 8 days period, and the other significant regions are in-phase (Figure 10). The GWLs have different response times for rainfall at different times (directions of the variable arrows at different times).

3.3. The Lag Analysis between Influence Variables and Groundwater Level Responses

The cumulative departure transformation was used to reconstruct the rainfall time series with zero values in order to eliminate the effect of zero value and make the reconstructed rainfall data consistent with the reconstructed GWL. Further, the detrending was performed by subtracting a regression-fitted low-order (cubic) polynomial from the cumulative departure series of the reconstructed rainfall and other reconstructed hydrogeological time series. Detrending removes periods with potential impacts of human activities and the lowest frequency cycles. Finally, the historical mean and variance were used to standardize the detrended time series for the statistical comparison among various data types, which is called the normalized departures (unitless) [75]. The preprocessed time series in the study area are illustrated in Figure 11.

The lag correlation was applied after the preprocessing of the composite RCs from the hydrogeological time series. Ocean tides increased or decreased the load of the confined aquifer and made the GWL rise and fall accordingly. This makes the positive correlation between them interesting for the study. Additionally, as documented by Tremblay et al. (2011) [76], increased rainfall resulted in the rising of the GWLs, and hence, the positive correlation between these two variables was considered. The composite RCs from the groundwater time series were almost related to the corresponding influencing variables at the 95% confidence interval. Figure 12(a) shows the maximum positive lag correlation coefficients and the corresponding lag time between the GWL time series and the tide time series in 0-1 days, 0-1 months, and 0-2 months. For the GWLs and tides in the upper aquifer, the GWL in the SW19 has the strongest lag correlation (0.516 in 0-1 days, and 0.530 in 0-1 months and 0-2 months) with the tide (Figure 12(a)) which can be attributed to the SW19 being closest to the sea (Figure 1). For other wells outside of the control point, the lag time decreases because of it being closer to the coast on the time scale of 0-1 days, and the lag correlation coefficient shows almost the opposite tendency with decreasing-distance from the shore (Figure 12(a)). For wells SW13 and SW14, the lag correlation coefficients (0.193) and the corresponding lag time (4 hours) are approximately the same in three time periods (Figure 12(a)), which may be a consequence of a recharge from the West River. We then concluded that the correlations of the GWL with tides in the clay aquifer have longer phase lags than those in the upper aquifer. The phase lags in the clay aquifer have statistically lower positive correlation coefficients compared to the upper aquifer (Figure 12). In Figure 12(a), the lag correlation coefficient of the GWL in the SW18 with the tide is the highest (0.117 for 0-1 days, and 0.120 for 0-1 months and 0-2 months), and the lag time is relatively short (4 hours for 0-1 days, and 53 hours for 0-1 months and 0-2 months), which corresponds to a similar trend of changes in the GWLs in the SW18 and the SW17 (Figure 8(a)). Compared to the GWL in other wells of the clay layers, the lag correlation coefficient in the SW21 is relatively large (0.073 for 0-1 days, and 0.074 for 0-1 months and 0-2 months), and the lag time is relatively short (0 hours for 0-1 days, and 178 hours for 0-1 months and 0-2 months) (Figure 12(a)). One of the explanations for this occurrence is the fact that the well SW21 monitors the upper clay layer and is close to the coast (Figure 2). SW22 is not above the 95% significance level in the time scale of 0-1 days (Figure 12(a)); i.e., groundwater in this time period reacts poorly to tides. Out of all the wells located in the clay aquifers, the GWL in the SW22 with the tides has the longest lag time (538 hours in 0-1 months and 1319 hours in 0-2 months), and the lag correlation coefficients between them are relatively weak (0.040 in 0-1 months and 0.053 in 0-2 months) (Figure 12(a)). This could be explained by the fact that the SW22 monitors the lower clay layer, and the thickness of the top filled layer at the SW22 is the thinnest (Figure 2). Although the SW25 and the SW26 are located at the impervious boundary, they have lag correlation and delay with the tides (Figure 12(a)), which indicate that tidal fluctuations may affect groundwater change through the dam pipe.

The lag correlation coefficient and the phase lag time between the rainfall and the GWL in 0-1 months and 0-2 months in the area were calculated. Correlations with the rainfall have shorter phase lags than those with tides. In addition to being shorter, the rainfall phase lags have almost statistically higher maximum positive correlation coefficients (Figure 12). As for the filled layer, the response time of the GWL is relatively short, ranging from 2 days to 4 days, and the correlation coefficient with rainfall is relatively high, ranging from 0.831 to 0.948 (Figure 12(b)), which shows that the filled layer has good permeability. For most of the clay layer, the maximum positive correlation coefficient of the GWL with the rainfall is smaller, and the corresponding phase lag time is longer when compared to the GWLs in the upper layer (Figure 12(b)). The lag correlation coefficient (0.954) and the lag time (3 days) of the SW18 with rainfall are approximately the same as those of the filled layer (Figure 12(b)), which coincides with a similar trend of changes in GWLs in SW18 and SW17 in Figure 8(b). SW26 also has a lag correlation coefficient (0.864) and lag time (4 days) similar to those in the upper layer (Figure 12(b)), which is consistent with the similar change of the GWL in the SW26 as that in the SW25 with a smaller amplitude, as shown in Figure 8(b). For the SW21 with rainfall, the lag time (4 days) is relatively short, while the lag correlation coefficient (0.668) is relatively high (Figure 12(b)), which may be explained by the findings showing that the upper layer is thin and that the position of the screen is close to the top plate of the clay layer (Figure 2). For the SW22 with rainfall, the lag time (26 days in 1 months and 60 days in 2 months) is relatively long, and the lag correlation coefficient (0.313 in 1 months and 0.388 in 2 months) is almost minimum (Figure 12(b)), which is probably the result of SW22 monitoring the lower clay layer (Figure 2).

4.1. Advantages of MSSA-WA Mixture Model

The wavelet analysis is poor for the GWL in the clay layer or the control point that is weakly affected by the tide, which usually shows weak or almost no periodic relationship. In addition, for rainfall and GWLs, most of the results of wavelet analysis have poor analysis effect, which is most likely a consequence of the chaotic nature of the precipitation events [76]. However, these situations, which cannot be observed in the presence of noise and other factors, could be analyzed through the MSSA-WA and are shown as the continuous periodic components corresponding to influence variables (tides or rainfall) in Figures 9 and 10, respectively. The MSSA draws more attention to the common influence among multiple time series [61], and the wavelet analysis decomposes the time series into time-frequency domain in order to determine the main change patterns and the changes of these patterns in time [57]. To sum up, the MSSA-WA can remove most of the noise based on maintaining the correlations of multiple variables and can better reconstruct the original time series. Therefore, the MSSA-WA can not only show a correlation that is difficult to capture using wavelet analysis in the time-frequency domain, but also can better analyze the period and phase relationships between different hydrogeological time series.

4.2. Analysis of the Results of MSSA-WA and Lag Correlation

The MSSA-WA, the wavelet analysis and lag correlation show that, in general, both tide and rainfall have correlations and time delays with GWLs in the study area. The MSSA-WA results show that the main oscillations between the tides and the GWL occur in half-day or one-day periods, and there are periods greater than four days between rainfall and GWL. In the filled layer, the lag correlations reveal that the GWLs with tides almost reach the maximum values of lag correlation coefficients in the time scale of 0-1 days (from 0.016 to 0.530), and there is little change in the time scale of 0-1 months or 0-2 months. The maximum lag correlation coefficients between rainfall and GWL for 2-4 days range from 0.831 to 0.948, respectively. In the clay layer, the lag correlations show that tides and GWL almost reach the maximum lag correlation coefficients in the time period of 0-1 months, and the SW22 reaches that in 0-2 months, ranging from 0.037 to 0.120. GWLs with rainfall almost reach the maximum lag correlation coefficients in 0-1 months, and the SW22 reaches that in 0-2 months, ranging from 0.388 to 0.954. This indicates that the correlation coefficients of most of the GWLs have a high correlation coefficient with rainfall, and that the lag time is shorter. The MSSA-WA results of GWL and rainfall have a higher power spectrum compared to the tide level in the study area. This in turn may reflect the fact that the impact of the rainfall on the GWL is stronger than the impact of tides, so it may be concluded that rainfall is the most important factor in this area. In addition, compared to the GWL in the clay layer, the influence variable has a greater impact on the filled layer. A small correlation coefficient between the GWL and rainfall in the clay layer may indicate that the rainfall signal is reduced by entering the system and passing through the clay layer, which corresponds to the results of the Lee and Lee (2000) on the recharge mechanism of the fractured bedrock aquifer system [77]. In most clay layers, filtration of the aeration zone may be the reason for a long response time of the GWL to rainfall in the study area. The MSSA-WA period correlations and the lag correlations of the GWLs with influence variables (tides and rainfall) are higher in the upper layer and in the wells SW18 and SW21 of the clay layer, which shows that the change patterns of the influence variables and the corresponding GWLs are almost identical. Therefore, the GWLs in the filled layer and the wells SW18 and SW21 can be predicted by changing influence variables in this area.

4.3. Analysis of Influencing Factors between Groundwater Level and Influencing Variables (Ocean Tide and Rainfall)

Permeability varies significantly on different types of lithology. The lithology of a coarse particle with good permeability favors infiltration recharge. On the contrary, infiltration recharge is not conducive. Therefore, lithology may be a major factor influencing GWL response to influence variables of different aquifers. In the filled layer, (1) the layer consists of granite blocks and gravel of the nearby mountain, which possess good permeability; and (2) in general, the dynamics of the groundwater near the coast are more affected by tides. Therefore, for tides and the GWL, the lithology of the filled layer and the distance from the shore are the main factors that affect the relationships between two variables in the area. As for the rainfall and the GWL, the lithology of the upper layer and the nature of the vadose zone may be important factors that influence the relations between them in the region. In the clay layer, (1) the layer consists of muddy silty clay with low permeability; (2) the results of the MSSA-WA and lag correlation for the SW18 with the influence variable in the clay layer are approximately the same as those for the SW17 in the filled layer, which can be explained by Zhang et al. (2019) [14] for the same area, which noted the existence of the preferential flow paths between the filled and clay layers. As documented by Jan et al. (2013) [78], macropores and priority flow paths have an important impact on rapid response; and (3) based on the MSSA-WA and lag correlation, it was observed that the SW26 has a weak response to tides, which may be attributed to the SW26 being relatively far from the coast and located in the clay layer. The response of the SW26 to rainfall is similar to that of the filled layer, which may be due to the screens closing to the gentle top plate of the clay layer on the SW26. Therefore, for GWLs with tides, these may indicate that the lithology of the clay layer, the distance from the shore, the preferred flow path, and the pressure load are factors that affect the response of the GWL to tides in this area, and as for the rainfall and GWL, the lithology of the clay layer and the topography may be the key influencing factors.

4.4. Limitations

In the coastal areas, the main influencing factors of the variations in the GWL and the evolution relationships of the GWS were analyzed, so as to better predict and alert the residents to the changes of the GWL in accordance with the main factors of the study area. The results of this study can also be used for the management of the GWS. However, the variations in the groundwater were not only influenced by one factor, but also influenced by multiple factors. Therefore, in order to manage the groundwater in coastal aquifers, many factors were considered in great detail. Moreover, in this paper, the length of the original rainfall time series is relatively small. Had there been more data, more information could have been inferred, and other results could have been analyzed, which could not be seen due to the presence of noise.

MSSA extracts multiple time sequences while maintaining spatial correlation, which not only has higher signal processing efficiency, but also extracts as much information as possible and minimizes noise interference. Wavelet transform extends time series to time-frequency space, in which the CWT provides the period characteristics and the XWT can analyze the resonance period and phase relationship between two time series. However, to some extent, the relatively short data length or the calculation of daily accumulation will affect the wavelet analysis results between influencing variables and GWL. The filtering effect and the chaos of rainfall will also lead to the poor effect of wavelet analysis. For the poor effect of wavelet analysis, the results of MSSA-WA analysis are superior to those of wavelet analysis.

Variations in GWL have correlations and lags to the changes of tide and rainfall. In MSSA-WA, the similar periodicities indicate that the GWLs are largely subjected to the variations of tide and rainfall in the study area. The lag correlations determine the lag with the greatest correlation in a well, which can indicate how long tide and rainfall will not have a significant impact on the GWL in the well. The results of MSSA-WA and lag correlations reveales that there are stronger power spectrums, larger lag correlation coefficients, and shorter lags between most GWL and influence variables (rainfall and tide) in the filled layer than those in the clay layer, that is, the influence of rainfall and tide on GWL in the upper layer is relatively high and stronger than that in the clay layer. Similarly, the effect of rainfall on GWL is higher than that on tide level. Moreover, GWLs in the upper layer and those of wells SW18 and SW21 are greatly affected by rainfall and tide in this area, which indicate that these can be simply predicted by the change of rainfall and tide.

Combined with MSSA-WA, lag correlations, and the response mechanism of GWL to influencing variables (tide and rainfall), it can be seen that different aquifer lithology, offshore distance, preferential flow path, and pressure load may be the factors affecting the relationships between tide and GWL. In addition, the lithology of different aquifers, the nature of vadose zone, and topography may be the influencing factors between rainfall and GWL.

The data used to support the findings of the study are available within the article.

The authors declare no conflicts of interest.

This research was partially supported by the Key Research and Development Program of China National Natural Science Foundation of China (2019YFC1804300) and the Natural Science Foundation of Jiangsu Province (No. BK20211208).