## Abstract

The empirical interpretation of cone penetration test (CPT) cone factors (*N*_{k}) can be subject to considerable variability for clays derived from weathered mudstones, leading to significant deviations in the estimation of undrained shear strength (*S*_{u}). This paper presents a comparison of triaxial and CPT data from a site investigation in clays derived from weathered mudstones in central England. Corrected cone factors (*N*_{kt,UU}) were derived from a one-to-one comparison of 94 pairs of unconsolidated, undrained triaxial and CPT data from equivalent depths. The performance of the cone factors was evaluated using a training set (75 pairs) and a test set (19 pairs). A parametric study was used to explore the variability of *N*_{kt,UU}, quantified using the coefficient of variation (COV_{Nkt,UU}), for varied separation distance thresholds (*D*_{s}) between individual triaxial and CPT data. The absolute deviation between the laboratory shear strength (*S*_{u(Lab)}) and that predicted from CPT profiles (*S*_{u(CPT)}) was not sensitive to *N*_{kt,UU} values in the range 25 < *N*_{kt,UU} < 31. The parametric study showed that *D*_{s} could be increased from 50 to 250 m, to include more data pairs for estimates of *N*_{kt,UU}, without substantially increasing COV_{Nkt,UU}.

The cone penetration test (CPT) is an *in situ* test that produces continuous measurement profiles to assess and characterize subsurface conditions. Empirical relationships can be used to relate CPT measurements to the engineering properties of soils and weak or weathered rocks, and extensive research has focused on developing and calibrating such relationships (Mayne and Kemper 1988; Robertson 2009; Rémai 2013; Cheshomi 2018; Mayne and Peuchen 2018; Bol *et al.* 2019; Pieczyńska-Kozłowska *et al.* 2021). The cone factor, *N*_{k} or *N*_{kt} (uncorrected or corrected) has been used to relate triaxial measurements of undrained shear strength, *S*_{u}, to the cone tip resistance of CPTs in normally and overconsolidated clays.

Laboratory-based calibration procedures for determining cone factors reveal several variables that affect *N*_{k} values including testing and sampling methods, the direction of loading, strain rate, boundary conditions, stress level and disturbance effects (Kulhawy and Mayne 1990). Shear strength measurements from unconsolidated, undrained (UU) and isotropically consolidated undrained (CIU) triaxial compression tests can be affected by sample disturbance, shearing rate and anisotropy. It is therefore preferable to use shear strength data from anisotropically consolidated undrained (CAU or CK_{0}U) triaxial compression tests for detailed design (Ladd and DeGroot 2003). For high-quality samples, UU measurements can be 25–50% above the CAU-measured average undrained shear strength. For low-quality samples, UU measurements can be 25–50% below the CAU-measured average undrained shear strength (Germaine and Ladd 1988; Ladd and DeGroot 2003). Hence published cone factors are often calibrated using shear strength data from anisotropically consolidated undrained (CAU or CK_{0}U) triaxial compression tests (Allievi *et al.* 2018; L'Heureux *et al.* 2018; Mayne and Peuchen 2018), although they can be limited in number.

Unconsolidated, undrained (UU) triaxial tests are widely used for routine site investigation and preliminary site characterization. These are generally lower in quality than CAU triaxial compression tests and they require interpretation or calibration (e.g. to account for sample quality) to obtain an average undrained shear strength that is useful for detailed design. However, they are often far more numerous in a typical site investigation than CAU triaxial tests. The use of UU triaxial data offers an opportunity to derive cone factors from a relatively large number of triaxial tests (e.g. >80 tests in the study by Bol *et al.* 2019) and explore the variability of the derived cone factors, while acknowledging that both UU and CAU triaxial tests do not fully characterize the *in situ* strength of fissured clay samples (e.g. Skempton *et al.* 1969; Marsland 1971; Vitone and Cotecchia 2011).

Cone factor variability depends on the soil type, with cone factors derived for stiff fissured clays showing greater variability than for intact, non-fissured materials (Mayne and Peuchen 2018). Clay mixtures derived from weathered mudstone include silty clays, clays and stiff clays, and are likely to have highly variable cone factors. This study considers CPT and UU triaxial data from a large ground investigation, along a 28 km length, in clay mixtures derived from weathered mudstones. The aims of this paper are as follows: (1) to derive cone factors (*N*_{kt,UU}) from *in situ* CPTs and unconsolidated, undrained (UU) triaxial tests; (2) to quantify the coefficient of variation of the cone factors (COV_{Nkt,UU}) for a range of separation distance thresholds (*D*_{s}) between pairs of CPTs and triaxial tests; (3) to measure the predictive performance of the undrained shear strength derived from CPT profiles (*S*_{u(CPT)}), for a range of cone factors. The large number of UU test data and their lateral extent provide an insight into the sensitivity and variability of cone factors across an outcrop of weathered mudstone materials. This can be used to inform the cone factors obtained from ground investigations and site characterization at other similar outcrops. However, it does not include cone factors for average undrained shear strength that might be obtained from comparison of CPT profiles and CAU triaxial tests.

## Materials and method

### The ground investigation data

A large ground investigation was undertaken in weathered clays and mudstones from the Whitby and Charmouth Mudstone Formations (Lias Group). The Whitby Mudstone and Charmouth Mudstone Formations were formerly known as the Upper Lias Clay and Lower Lias Clay and were formed 174–183 and 183–199 myr ago, respectively (Cox *et al.* 1999). They were formed predominantly from argillaceous sediments deposited within shallow seas, leading to remarkably uniform sediment sequences in many areas (Fig. 1; Hobbs *et al.* 2012). However, there was significant ground disturbance at the near surface. Briggs *et al.* (2022) showed that the Charmouth Mudstone is weathered as a result of glacial and periglacial conditions (up to *c*. 12 mbgl (metres below ground level)) and contemporary weathering (up to 4 mbgl). Weathered materials were described according to the Norbury (2020) weathering classification as partially weathered (Class Bb), distinctly weathered (Class C), destructured (Class D) and reworked (Class E) clays. They ranged from very stiff fissured, sheared bluish grey clays (Class Bb) to stiff light grey and light brown mottled clays (Class E), with a minimum discontinuity spacing ranging from very close (20–60 mm) to extremely close (<20 mm). The close discontinuity spacing of the weathered samples (Classes Bb–E) resulted in their predominantly ductile failure during UU triaxial tests. This led to less variability in the undrained shear strength measurements of the weathered clay samples than in those for the weathered and unweathered mudstone samples (Classes A and Ba), which had a wider discontinuity spacing (60–2000 mm) and a predominantly brittle failure mode (Briggs *et al.* 2022).

Data were collected from a commercial site investigation for the High Speed Two (HS2) railway near Banbury, England. Figure 1 shows the location of the CPT profiles and triaxial samples that were selected for analysis, after pre-processing. The site investigation included the whole 28 km cross-sectional length of the Whitby and Charmouth Mudstone Formation (Lias Group) outcrop at this location. Standpipe piezometer data (not shown) indicated a groundwater level of 0.5–1.0 mbgl across the site.

Unconsolidated, undrained (UU) triaxial shear strength in compression data were obtained from 213 tests conducted following BS1377-7:1990. The samples were collected from exploratory cable percussive drilling (118), rotary coring (91) and windowless sampler drilling (four). The samples were obtained from weathered clay layers and described in the borehole strata descriptions in accordance with BS EN 14689-1:1998 as high-strength stiff and very stiff clays, with extremely close (<20 mm) discontinuity spacing (Briggs *et al.* 2022). The tests were selected from those conducted on undisturbed Class 1 and Class 2 samples (BS EN 1997-2:2007 and BS EN ISO 22475-1:2021) from cores and open drive samplers with a sample diameter of ∼100 mm and a 2:1 length–diameter ratio. The samples were tested at cell pressures corresponding to 1–2 times the estimated *in situ* total vertical stress. The triaxial data were filtered to include clay soils (*S*_{u} <300 kPa) and exclude mudstones (*S*_{u} >300 kPa) (BS EN ISO 14689-1:2018; BS EN ISO 14688-2:2018; Norbury 2020). Soil classification data, including moisture content (*w*, %), liquid limit (*w*_{LL}, %) and plastic limit (*w*_{PL}, %), were obtained from 115 of the triaxial samples. These data plotted above the A-line on a Casagrande Chart (Fig. 2), showing that the material is a clay/silt of intermediate to high plasticity and compressibility, in agreement with data from other weathered clays in the Charmouth and Whitby Mudstone Formations (Hobbs *et al.* 2012; Briggs *et al.* 2022).

*c*. 14 mbgl. The CPTs were conducted using a 20.5 tonne track-truck mounted CPT unit (UK3) equipped with a 17 tonne capacity hydraulic ramset. This used an electric penetrometer conforming to the requirements of BS EN ISO 22476-1:2012. The measurements included the cone tip resistance (*q*_{c}), friction sleeve resistance (*f*_{s}) and dynamic porewater pressure (*u*_{2}), sampled at a 10 mm resolution (also known as CPTu). Empirical correlations are generally used to interpret soil parameters such as strength, permeability and stiffness from CPT data. The undrained shear strength (*S*_{u}) of soils and rocks can be estimated using the corrected cone resistance *q*_{t}, refining the Mayne and Kemper (1988) relationship to account for the corrected cone resistance as

$$S_{u(\mathrm{CPT})} = \frac{q_{t} - \sigma_{vo}}{N_{kt}} \qquad (1)$$

where *S*_{u(CPT)} is the undrained shear strength derived from the CPT, *q*_{t} is the corrected cone resistance, *N*_{kt} is an empirical cone factor and $\sigma_{vo}$ is the *in situ* vertical total stress. Bol *et al.* (2019) reported cone factors derived from UU triaxial tests as between 15 and 21 for normally consolidated clays and between 22 and 30 for stiff, fissured overconsolidated clays. Mayne and Peuchen (2018) derived typical cone factor values from CAU triaxial tests of 25 for fissured, overconsolidated clays and 14 for overconsolidated, intact clays. It should be noted that the Mayne and Peuchen (2018) cone factor values from CAU triaxial tests would differ from those derived from UU triaxial tests.
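As a minimal sketch of equation (1), assuming it takes the standard form *S*_{u(CPT)} = (*q*_{t} − σ_{vo})/*N*_{kt}, the conversion from cone resistance to undrained shear strength could be written as follows. The function name and the input values are hypothetical and are not taken from the ground investigation data.

```python
def su_from_cpt(q_t_kpa, sigma_vo_kpa, n_kt):
    """Undrained shear strength (kPa) from corrected cone resistance.

    q_t_kpa      : corrected cone resistance, q_t (kPa)
    sigma_vo_kpa : in situ total vertical stress (kPa)
    n_kt         : empirical cone factor (dimensionless)
    """
    if n_kt <= 0:
        raise ValueError("n_kt must be positive")
    return (q_t_kpa - sigma_vo_kpa) / n_kt


# Hypothetical reading: q_t = 2700 kPa and sigma_vo = 100 kPa,
# interpreted with a cone factor of 26.
su_example = su_from_cpt(2700.0, 100.0, 26.0)
print(f"S_u(CPT) = {su_example:.1f} kPa")
```

Because *N*_{kt} sits in the denominator, a larger cone factor gives a lower, more cautious strength estimate for the same cone resistance.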

### Pre-processing of the ground investigation data

For consistency, the CPT data were filtered to extract profiles of exactly 10 m length and ensure that each profile contained the same number of data points. Based on the soil classification chart (Robertson 1990) and the soil behaviour type index, *I*_{c} (Robertson and Wride 1998), the CPT profiles showed soil types varying from silt, sands and gravels (i.e. non-clay) to clay mixtures including silty clay, clay and very stiff clay. Clay mixtures in the CPT profiles were identified by *I*_{c} values greater than 2.6, for clay-like soils (Robertson 2010). The CPT data were also pre-processed to select homogeneous clay layers within the Charmouth and Whitby Mudstone Formations to ensure direct comparability with the clay triaxial samples (Sowers 1979). The CPT profiles were considered homogeneous when the coefficient of variation of the soil behaviour type index (COV_{Ic}) was less than 10% (Harr 1987; Uzielli *et al.* 2004; Tian and Sheng 2020). For the clay mixtures, 66 CPT profiles (with 31 586 data points) of 10 m length that met the homogeneity criteria were extracted for analysis. Figure 3 shows a typical CPT and *I*_{c} profile from the ground investigation. This shows that both the corrected cone tip resistance (*q*_{t}), and the sleeve friction (*f*_{s}) increased with depth, with a slight change in the profile gradients at approximately 5 mbgl. The *in situ* equilibrium porewater pressure measurement (*u*_{0}) showed a near-hydrostatic profile and the shoulder porewater pressure measurement (*u*_{2}) showed a negative profile that is typical for stiff clays. The soil behaviour type index (*I*_{c}) profile shows a mixture of fine-grained soil with layers of coarse-grained materials.
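The two screening criteria described above (clay-like behaviour and homogeneity) can be sketched as a simple filter. This is an illustration only: it assumes that the *I*_{c} > 2.6 criterion is applied to the profile mean and that COV_{Ic} uses the sample standard deviation, neither of which is stated in the text. The function names and the example profiles are hypothetical.

```python
import statistics


def cov_percent(values):
    """Coefficient of variation (%) = sample standard deviation / mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def is_homogeneous_clay(ic_values, ic_clay=2.6, cov_limit=10.0):
    """Screen one CPT profile of I_c readings.

    Assumption: the clay criterion is checked on the mean I_c and the
    homogeneity criterion on COV_Ic < cov_limit (%).
    """
    clay_like = statistics.mean(ic_values) > ic_clay
    homogeneous = cov_percent(ic_values) < cov_limit
    return clay_like and homogeneous


# Hypothetical I_c readings for two short profiles.
uniform_profile = [2.9, 3.0, 3.1, 3.0, 2.8]   # clay-like, low scatter
layered_profile = [1.8, 3.2, 2.0, 3.4, 3.1]   # interbedded coarse layers

print(is_homogeneous_clay(uniform_profile))
print(is_homogeneous_clay(layered_profile))
```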

The soil behaviour type index (*I*_{c}) was calculated as follows (Robertson and Wride 1998; Robertson 2009):

$$I_{c} = \sqrt{\left(3.47 - \log Q_{t}\right)^{2} + \left(\log F_{r} + 1.22\right)^{2}} \qquad (2)$$

$$Q_{t} = \frac{q_{t} - \sigma_{vo}}{\sigma'_{vo}} \qquad (3)$$

$$F_{r} = \frac{f_{s}}{q_{t} - \sigma_{vo}} \times 100\% \qquad (4)$$

where *q*_{t} is the corrected cone tip resistance, *f*_{s} is the sleeve friction, and $\sigma_{vo}$ and $\sigma'_{vo}$ are the *in situ* vertical total and effective stress, respectively. The corrected cone tip resistance, *q*_{t}, is derived from the uncorrected cone resistance, *q*_{c}, using the expression *q*_{t} = *q*_{c} + (1 – *a*_{n})*u*_{2}, where *a*_{n} and *u*_{2} are the tip net area ratio (*a*_{n} = 0.79) and shoulder porewater pressure, respectively. The values *Q*_{t} and *F*_{r} are the normalized values of tip resistance and sleeve friction.
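The calculation of *I*_{c} can be illustrated with a short sketch. It assumes the commonly used Robertson and Wride (1998) form, with *Q*_{t} = (*q*_{t} − σ_{vo})/σ′_{vo}, *F*_{r} = 100 *f*_{s}/(*q*_{t} − σ_{vo}) and base-10 logarithms. The function names and input values are hypothetical, and all stresses are assumed to be in kPa.

```python
import math


def corrected_cone_resistance(q_c, u_2, a_n=0.79):
    """q_t = q_c + (1 - a_n) * u_2, all in kPa."""
    return q_c + (1.0 - a_n) * u_2


def soil_behaviour_type_index(q_t, f_s, sigma_vo, sigma_vo_eff):
    """I_c from normalized tip resistance Q_t and friction ratio F_r (%)."""
    q_net = q_t - sigma_vo
    if q_net <= 0 or f_s <= 0 or sigma_vo_eff <= 0:
        raise ValueError("net resistance, f_s and effective stress must be > 0")
    big_q_t = q_net / sigma_vo_eff          # normalized tip resistance
    f_r = 100.0 * f_s / q_net               # normalized friction ratio (%)
    return math.sqrt((3.47 - math.log10(big_q_t)) ** 2
                     + (math.log10(f_r) + 1.22) ** 2)


# Hypothetical reading in a stiff clay with negative shoulder pore pressure.
q_t_example = corrected_cone_resistance(q_c=1500.0, u_2=-100.0)
ic_example = soil_behaviour_type_index(q_t=1100.0, f_s=50.0,
                                       sigma_vo=100.0, sigma_vo_eff=100.0)
print(f"q_t = {q_t_example:.0f} kPa, I_c = {ic_example:.2f}")
```

In this hypothetical case *I*_{c} exceeds 2.6, so the reading would be treated as clay-like under the criterion used in the pre-processing.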

Table 1 summarizes the descriptive statistics for the pre-processed CPT and triaxial data. Both the undrained shear strength (*S*_{u}) and the corrected cone tip resistance (*q*_{t}) increased with depth, as shown in Figure 3.

Table 1 shows that the undrained shear strength (*S*_{u}) values varied from 36 to 270 kPa, corresponding to classifications for medium to very high shear strength fine soils (BS EN ISO 14688-2:2018) and medium to stiff clays (Das 2021). The results also showed a coefficient of variation (COV_{Su}) of 43%. This was towards the upper bound of COV_{Su} measurements in clay materials tested in undrained triaxial tests, where COV_{Su} is ∼11–49% (Phoon and Kulhawy 1999). The corrected tip resistance (*q*_{t}) showed a coefficient of variation of 45%. This was similar to the variation in the *S*_{u} values from the UU triaxial tests.

### Determination of the cone factor, *N*_{kt,UU}

Cone factors (*N*_{kt}) are generally derived from CPT and triaxial measurements located at the same depth and at a close separation distance. However, this is not always possible during commercial ground investigations where the distance between CPTs and borehole samples may be large. The value of *N*_{kt,UU} was calculated from CPT and UU triaxial measurements by rearranging equation (1) as

$$N_{kt,UU} = \frac{q_{t} - \sigma_{vo}}{S_{u(\mathrm{Lab})}} \qquad (5)$$

The undrained shear strength measured in each UU triaxial test (*S*_{u(Lab)}) was paired with the corrected cone tip resistance (*q*_{t}) from the nearest CPT profile (Cheshomi 2018; Bol *et al.* 2019). The CPT and UU triaxial data were combined to generate a single, one-to-one dataset comparing the triaxial tests with all CPT data points at the same corresponding depth (mbgl). The triaxial and CPT data were paired by the shortest Euclidean distance. Where multiple pairs occurred at the same depth, the single closest pair was selected for the one-to-one dataset. The one-to-one dataset consisted of 94 pairs of CPTs and UU triaxial tests in weathered clay mixtures (*S*_{u} <300 kPa), located between the ground surface and 10 mbgl.
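The pairing procedure can be sketched as follows. This is a simplified illustration, not the processing code used in the study: it assumes that the Euclidean distance is measured in plan between sample and CPT coordinates, that "same depth" means agreement within a small tolerance, and that equation (5) has the form *N*_{kt,UU} = (*q*_{t} − σ_{vo})/*S*_{u(Lab)}. The dictionary keys, the tolerance and the data values are hypothetical.

```python
import math


def pair_with_nearest_cpt(triaxial_tests, cpt_readings, depth_tol=0.05):
    """Pair each triaxial test with the closest CPT reading at the same depth."""
    pairs = []
    for test in triaxial_tests:
        # Keep only CPT readings at (approximately) the sample depth.
        candidates = [r for r in cpt_readings
                      if abs(r["depth"] - test["depth"]) <= depth_tol]
        if not candidates:
            continue

        def plan_distance(reading, t=test):
            return math.hypot(reading["x"] - t["x"], reading["y"] - t["y"])

        nearest = min(candidates, key=plan_distance)
        n_kt_uu = (nearest["q_t"] - nearest["sigma_vo"]) / test["su_lab"]
        pairs.append({"distance": plan_distance(nearest), "n_kt_uu": n_kt_uu})
    return pairs


# Hypothetical data: one triaxial sample and two CPT readings at 3 mbgl.
triaxial_tests = [{"x": 0.0, "y": 0.0, "depth": 3.0, "su_lab": 100.0}]
cpt_readings = [
    {"x": 30.0, "y": 40.0, "depth": 3.0, "q_t": 2660.0, "sigma_vo": 60.0},
    {"x": 300.0, "y": 400.0, "depth": 3.0, "q_t": 3060.0, "sigma_vo": 60.0},
]
pairs = pair_with_nearest_cpt(triaxial_tests, cpt_readings)
print(pairs)
```

Storing the separation distance with each pair makes it straightforward to apply a distance threshold afterwards.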

The one-to-one dataset was randomly split into training and test sets using the holdout validation approach, with an 80:20 ratio. Holdout validation is an out-of-sample evaluation in which data are partitioned into a training set to fit a model and a test set, or holdout set, to validate the model (Sammut and Webb 2017). Therefore, 80% of the dataset (i.e. 75 pairs) was used as a training set to determine the values of *N*_{kt,UU} and for a parametric study. The remaining 20% of the dataset (i.e. 19 pairs) was used as a test set to measure the performance of the selected *N*_{kt,UU} values. This provided an unbiased estimate of the learning performance of the estimates (Ramasubramanian and Moolayil 2019).
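A holdout split of this kind can be sketched in a few lines. The shuffling method, the seed and the rounding rule are assumptions made for illustration; the text does not state how the random split was implemented.

```python
import random


def holdout_split(items, train_fraction=0.8, seed=0):
    """Randomly partition items into a training set and a test set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # seeded for repeatability
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# 94 pair indices stand in for the one-to-one dataset.
train_set, test_set = holdout_split(range(94))
print(len(train_set), len(test_set))
```

With 94 pairs and an 80:20 ratio, rounding gives the 75 training pairs and 19 test pairs reported above.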

### Influence of separation distance threshold, *D*_{s}, on the variability of *N*_{kt,UU}

A parametric study was conducted using the training set to explore the influence of the separation distance threshold (*D*_{s}) on the coefficient of variation of *N*_{kt,UU} (COV_{Nkt,UU}). The coefficient of variation (COV) is the ratio of standard deviation to the mean. It represents a measure of relative dispersion around a central tendency estimator (Uzielli *et al.* 2007). However, because it is based on the sample mean and standard deviation, outliers can adversely affect the COV, particularly when dealing with small (<30) sample sizes (Arachchige *et al.* 2022). The COV_{Nkt,UU} was compared for *D*_{s} between 10 and 800 m. A low threshold (e.g. *D*_{s} = 10 m) minimized both the separation distance of the pairs of CPTs and triaxial measurements and the number of pairs available for comparison. A greater threshold (e.g. *D*_{s} = 800 m) increased the allowable separation distance between the pairs of CPTs and triaxial measurements and increased the number of pairs available for comparison.
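The parametric study can be sketched as a loop over thresholds, keeping only the pairs whose separation distance does not exceed *D*_{s}. The sample standard deviation is assumed for the COV, and the (distance, *N*_{kt,UU}) pairs below are hypothetical values chosen only to show the mechanics.

```python
import statistics


def cov_by_threshold(pairs, thresholds):
    """Return {D_s: (n, mean N_kt, COV %)} for each distance threshold.

    pairs: iterable of (separation_distance_m, n_kt_uu) tuples.
    Thresholds with fewer than two pairs map to None.
    """
    results = {}
    for d_s in thresholds:
        values = [n_kt for distance, n_kt in pairs if distance <= d_s]
        if len(values) < 2:
            results[d_s] = None
            continue
        mean = statistics.mean(values)
        cov = 100.0 * statistics.stdev(values) / mean
        results[d_s] = (len(values), mean, cov)
    return results


# Hypothetical (distance in m, N_kt,UU) pairs.
example_pairs = [(5, 24.0), (8, 26.0), (40, 20.0),
                 (120, 30.0), (240, 25.0), (600, 40.0)]
cov_results = cov_by_threshold(example_pairs, [1, 10, 50, 250, 800])
for d_s, summary in cov_results.items():
    print(d_s, summary)
```

The sketch also shows the trade-off noted above: the smallest thresholds leave very few pairs, so a single unusual value can dominate the COV.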

### Performance of *N*_{kt,UU}

Cone factors (*N*_{kt,UU}) were derived from the training set of CPT and UU triaxial data using equation (5). A range of *N*_{kt,UU} values was used to derive individual undrained shear strength (*S*_{u(CPT)}) values from the CPT measurements using equation (1). Derived values were compared with the measured values using equation (6) for both (1) the training set (i.e. 75 pairs) and (2) the test set (i.e. 19 pairs). Equation (6) is written as

$$|d|\% = \frac{\left| S_{u(\mathrm{Lab})} - S_{u(\mathrm{CPT})} \right|}{S_{u(\mathrm{Lab})}} \times 100 \qquad (6)$$

where *S*_{u(Lab)} is the undrained shear strength measured in the UU triaxial test, *S*_{u(CPT)} is the undrained shear strength derived from selected values of *N*_{kt,UU} and |*d*|% is the absolute deviation (%) in terms of *S*_{u(Lab)}. The variation of the absolute deviation (|*d*|%) was calculated using probability thresholds, for a range of *N*_{kt,UU} values. These showed the probability of |*d*|% falling into one of two categories, representing error margins of 30% and 50% (Bol *et al.* 2019). These were (1) |*d*|% <30% and (2) |*d*|% <50%.
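The performance measure can be sketched as follows, assuming that equation (6) expresses the absolute difference between *S*_{u(Lab)} and *S*_{u(CPT)} as a percentage of *S*_{u(Lab)}, and that the probabilities are simple proportions of pairs below each error margin. The function names and the data are hypothetical.

```python
def absolute_deviation_percent(su_lab, su_cpt):
    """|d|% = |S_u(Lab) - S_u(CPT)| / S_u(Lab) * 100."""
    return 100.0 * abs(su_lab - su_cpt) / su_lab


def performance(pairs, n_kt, margins=(30.0, 50.0)):
    """Mean |d|% and the percentage of pairs below each error margin.

    pairs: iterable of (net cone resistance q_t - sigma_vo in kPa,
                        laboratory S_u in kPa) tuples.
    """
    deviations = [absolute_deviation_percent(su_lab, q_net / n_kt)
                  for q_net, su_lab in pairs]
    mean_dev = sum(deviations) / len(deviations)
    probs = {m: 100.0 * sum(d < m for d in deviations) / len(deviations)
             for m in margins}
    return mean_dev, probs


# Hypothetical (q_net, S_u(Lab)) pairs evaluated with N_kt,UU = 26.
example_pairs = [(2600.0, 100.0), (2600.0, 80.0),
                 (2600.0, 200.0), (1300.0, 80.0)]
mean_dev, probs = performance(example_pairs, n_kt=26.0)
print(mean_dev, probs)
```

Repeating the call for a range of `n_kt` values gives curves of mean deviation and probability against cone factor.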

## Results and discussion

### Derived cone factor, *N*_{kt,UU}, values

Figure 4a shows the *N*_{kt,UU} values for each of the 75 pairs of CPTs and UU triaxial tests in the training set (*D*_{s} = ∞). The results show scatter around the mean *N*_{kt,UU} = 26 with a high coefficient of variation of 40% (Harr (1987) defined high variability as COV >30%). The minimum and maximum individual *N*_{kt,UU} values were 7 and 63, respectively. Although the range of individual *N*_{kt,UU} values is large, the results in Figure 4b show that the lowest and highest values of *N*_{kt,UU} have a low probability of occurrence, and that 90% of the data fall within the range 11 < *N*_{kt,UU} < 43 of the probability histogram.

### Influence of separation distance threshold, *D*_{s}, on the COV_{Nkt,UU} of *N*_{kt,UU}

Figure 5 shows the *N*_{kt,UU} and COV_{Nkt,UU} values for the comparison of triaxial and CPT data in the training set, at various separation distance thresholds (*D*_{s}). Results are shown for the whole training set (*D*_{s} = ∞) and for *D*_{s} between 10 and 800 m. Figure 5 shows that the COV_{Nkt,UU} increased as *D*_{s} increased, but did not vary significantly for the range of *D*_{s} that was considered (10 m ≤ *D*_{s} ≤ 800 m). The lowest COV_{Nkt,UU} was for *D*_{s} of 10 m. However, small samples (*n* < 30) of CPT and triaxial pairs were obtained for this and other lower thresholds (*D*_{s} <200 m), making these more sensitive to outliers than the larger samples obtained with higher thresholds (*D*_{s} ≥ 200 m). A fairly constant COV_{Nkt,UU} was attained for *D*_{s} between 50 and 250 m. The magnitude of the derived *N*_{kt,UU} was also relatively constant (between 24 and 26) for the range of *D*_{s} that was considered. Figure 5 also shows the coefficient of determination, *R*^{2}, associated with each average line (the regression through the origin, shown as a dashed line). The results show that the goodness of fit (*R*^{2}) decreases as the separation threshold (*D*_{s}) increases and reduces below *R*^{2} = 0.9 at *D*_{s} >500 m.

### Performance of *N*_{kt,UU}

Figure 6 shows the absolute deviation, |*d*|%, for the training set (75 pairs) and test set (19 pairs) for *N*_{kt,UU} values in the range 11 < *N*_{kt,UU} < 43, representing ∼90% of the *N*_{kt,UU} values shown in Figure 4.

Figure 6a shows that the mean absolute deviation (|*d*|%) of the undrained shear strength derived from the CPT profiles (*S*_{u(CPT)}) in the training set is lowest for *N*_{kt,UU} values between 25 and 31. Figure 6b shows that the mean absolute deviation (|*d*|%) is most likely to remain within the 30% and 50% error margins for *N*_{kt,UU} values between 25 and 31. Therefore Figure 6 shows an optimal range of *N*_{kt,UU} values between 25 and 31 for the training set.

The performance of *N*_{kt,UU} values in the optimal range (25–31) was evaluated using the test set (19 pairs). Figure 6a shows that the mean absolute deviation (|*d*|%) of the test set was ∼30% for *N*_{kt,UU} values between 25 and 31. Figure 6b shows that the probability of |*d*|% remaining within the 30% and 50% error margins using the test set is similar for *N*_{kt,UU} values between 25 and 31. These values are comparable with the mean absolute deviation (|*d*|%) and probability values derived from the training set (Fig. 6a).

## Implications for CPT interpretation in design

If the constraint *S*_{u(CPT)} < *S*_{u(Lab)} is desirable for design, the probability that this condition is met can be considered for a range of *N*_{kt,UU} values. Figure 7 shows the probability of *S*_{u(CPT)} < *S*_{u(Lab)} for a range of *N*_{kt,UU} values using the training set (75 pairs) and the test set (19 pairs). These values relate to UU triaxial tests.

Figure 7 shows close agreement between results from the training and testing sets. The higher *N*_{kt,UU} values have a greater probability that the undrained shear strength derived from the CPT profiles (*S*_{u(CPT)}) will be less than the undrained shear strength measured in the laboratory UU triaxial test (*S*_{u(Lab)}). It is worth noting that probability values close to 100%, although desirable, represent very conservative scenarios, whereas probability values close to 0% represent overestimations of undrained shear strength from UU triaxial tests.
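The probability that the CPT-derived strength falls below the laboratory value can be sketched as a simple proportion of pairs, evaluated for several cone factors. This is an illustrative calculation with hypothetical data, assuming *S*_{u(CPT)} = (*q*_{t} − σ_{vo})/*N*_{kt,UU}; it does not reproduce the values in Figure 7.

```python
def probability_cpt_below_lab(pairs, n_kt):
    """Percentage of pairs for which S_u(CPT) < S_u(Lab).

    pairs: iterable of (net cone resistance q_t - sigma_vo in kPa,
                        laboratory S_u in kPa) tuples.
    """
    below = sum((q_net / n_kt) < su_lab for q_net, su_lab in pairs)
    return 100.0 * below / len(pairs)


# Hypothetical (q_net, S_u(Lab)) pairs.
example_pairs = [(2600.0, 100.0), (2600.0, 80.0),
                 (2600.0, 200.0), (1300.0, 80.0)]
prob_by_n_kt = {n_kt: probability_cpt_below_lab(example_pairs, n_kt)
                for n_kt in (10.0, 20.0, 40.0)}
print(prob_by_n_kt)
```

The probability rises with the cone factor, which matches the trend described for Figure 7: a higher *N*_{kt,UU} lowers every CPT-derived strength.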

## Conclusions

Cone factors (*N*_{kt,UU}) derived from a one-to-one comparison of UU triaxial *S*_{u} and CPT *q*_{t} data showed a mean *N*_{kt,UU} value of 26 in clay mixtures derived from weathered mudstones of the Charmouth and Whitby Mudstone Formations. The results were greater than the values generally reported for normally consolidated clays (15 < *N*_{kt,UU} < 21) and towards the upper limit of the reported range of 22 < *N*_{kt,UU} < 30 for stiff overconsolidated clays. Similar *N*_{kt} values (*N*_{kt} = 25) were derived by Mayne and Peuchen (2018) for fissured clays using CAU laboratory tests, although those researchers acknowledged the challenge of measuring the strength of such materials. For individual one-to-one comparisons, there was a wide range of *N*_{kt,UU} values (7 < *N*_{kt,UU} < 63), but most of the *N*_{kt,UU} values (∼90%) fell within the range 11 < *N*_{kt,UU} < 43. This reflects the varied composition of the clay mixture, including fissured clays, that form the weathered mudstone profile.

When deriving cone factors (*N*_{kt,UU}) from pairs of UU triaxial *S*_{u} and CPT *q*_{t} data located at equivalent depth, it is preferable to compare measurements in close proximity.

However, the results from a parametric study showed that the values of *N*_{kt,UU} and their coefficient of variation (COV_{Nkt,UU}) were not sensitive to greater separation distance thresholds (*D*_{s}), in the range 50 m ≤ *D*_{s} ≤ 250 m. The coefficient of determination (*R*^{2}) associated with the mean *N*_{kt,UU} was also greater than 0.9 for this range of separation distance thresholds. Therefore, when closely spaced data are not available, a greater separation distance between pairs of UU triaxial *S*_{u} and CPT *q*_{t} data can be used to derive values of *N*_{kt,UU}, without significantly increasing the variation in the data (COV_{Nkt,UU}).

A performance assessment showed that the absolute deviation between the laboratory UU triaxial shear strength (*S*_{u(Lab)}) and that predicted from CPT profiles (*S*_{u(CPT)}) was not sensitive to *N*_{kt,UU} values in the range 25 < *N*_{kt,UU} < 31 in these materials. The results were consistent for both the training and test sets. This shows that a range of *N*_{kt,UU} values can be used to derive *S*_{u(CPT)} from CPT profiles for clay mixtures derived from weathered mudstones while minimizing the absolute deviation between *S*_{u(CPT)} and *S*_{u(Lab)}.

## Acknowledgements

The authors thank C. Reale and the reviewers for their helpful comments and suggestions.

## Author contributions

**KMB**: conceptualization (lead), methodology (lead), writing – original draft (equal); **YTG**: data curation (lead), formal analysis (lead), investigation (lead), software (lead), writing – original draft (equal); **WP**: supervision (equal), writing – review & editing (equal); **SB**: writing – review & editing (equal); **NS**: supervision (equal), writing – review & editing (equal).

## Funding

This work was supported by the Royal Academy of Engineering and HS2 Ltd under the Senior Research Fellowship scheme (RCSRF1920\10\65). The data were provided by HS2 Ltd. This paper is an output from ACHILLES, an Engineering and Physical Sciences Research Council (EPSRC) programme grant led by Newcastle University (EP/R034575/1).

## Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

## Data availability

The data presented in this paper are available online via the University of Bath Research Data Archive and may be accessed at https://doi.org/10.15125/BATH-01162.