Marine sediments preserve archives of glacier behavior from many proxies, with lithofacies analysis providing direct evidence of glacial extent and dynamics. Many of these lithofacies have corresponding physical and geochemical properties that may be identified through quantitative, nondestructive logging properties. This study applies supervised and unsupervised classification to downcore logging data to attempt to model temperate glacimarine facies, which are independently identified via visual lithofacies analysis based on core photographs, digital X-radiography, and computed tomography scans. We test the limits of these methods by modeling both broad glacial and interglacial and small-scale variations in Late Pleistocene (<60,000 yr) glacier extent leading into the Holocene deglaciation for a temperate ice stream at Integrated Ocean Drilling Program Site U1419 in the Gulf of Alaska. Multi-meter–scale mud and diamict lithofacies interpreted as non-glacial versus glacial conditions can be modeled with both methods using downcore physical property logging data (b* color reflectance, magnetic susceptibility, and natural gamma-ray activity) augmented with scanning X-ray fluorescence (XRF) elemental abundance (Ca, Zr, Si, K, Rb, and Al). Physical properties are most useful for delineating decimeter-meter–scale variations in composition and clay content, whereas scanning XRF elements best capture differences in sand versus clay content and composition at decimeter-centimeter scales. Neither classification technique can model the observed small-scale variations in diamict facies using elemental abundance from higher-resolution scanning XRF or from physical properties. Comparison of unsupervised cluster model results with observed lithofacies allows for identification of three different glacial conditions at Site U1419—ice-proximal, fluctuating, and retreating. For small-scale variations in glacial extent, cluster model results are best used as complementary data to image-based lithofacies identification rather than as a replacement.

Variations in ice sheet and glacier extent are a dynamic component of Earth’s climate system. Study of glacial dynamics during major climate transitions helps establish ice sheet “tipping points” (Lenton et al., 2008; Kriegler et al., 2009) and for predicting how modern ice sheets and glaciers may respond to our warming climate. The marine sediment record provides a highly detailed archive of glacial dynamics through a combination of glacial extent proxies, such as lithofacies, biogenic carbonate chemistry, productivity, and mass accumulation rates (Andersen et al., 1996; Andrews and Principato, 2002; Naish et al., 2009; Ó Cofaigh et al., 2013). Of these proxies, lithofacies analysis provides one of the most direct records of glacial extent because spatial and temporal variations are recorded as spatial transitions in facies. The transition between glacial and interglacial states is interpreted from lithofacies across the globe (e.g., Henrich et al., 1989; Peterson et al., 2000), with the most obvious examples from glaciated continental margins (Ó Cofaigh and Dowdeswell, 2001). Open-water, nonglacial conditions most often result in accumulation of homogeneous, bioturbated mud with a varying biogenic component, whereas glacial conditions are recognized by heterolithic facies with abundant lonestones. In particular, temperate glacimarine strata contain evidence of abundant ice rafting, gravity flows, and meltwater-derived sediment that provide a wide potential range of distinctive lithofacies throughout a glacial-interglacial cycle (Powell and Cooper, 2002) (Fig. 1).

Glacimarine lithofacies are typically recognized in outcrop and core through visual lithostratigraphic logging, which can be augmented using more quantitative methods. Visual core descriptions (VCDs) are based on sediment texture, composition, color, and physical and biogenic sedimentary structures. Computed tomography (CT) scan images and X-radiographs contribute details on subtle features within cores (Cnudde and Boone, 2013). Although CT imagery is ideal for identification and description of a myriad of glacimarine sedimentary features (Boespflug et al., 1995; Gagnoud et al., 2009; St-Onge and Long, 2009; Davies et al., 2011; Tarplee et al., 2011), this technology is expensive and/or of restricted availability. Fortunately, many of the same visual features also have corresponding physical and geochemical properties that may be identified through quantitative nondestructive logging properties such as natural gamma radiation, color reflectance, bulk density, magnetic susceptibility, elemental concentration, etc. Logging data have been used to identify major end-member lithologies through supervised (e.g., support vector machine [SVM]; Hall, 2016) and unsupervised classification techniques (e.g., cluster analysis) for continental shelf environments (Inwood et al., 2013), trench sediments (Tudge et al., 2009), and glacial sediments (Williams et al., 2012; Hunze et al., 2013). Classification techniques using multivariate data are powerful analytical tools in the geosciences because they provide a quantitative, objective, and reproducible approach to lithofacies analysis (Davis, 2002; Dubois et al., 2007).

In this study, we attempt to model temperate glacimarine lithofacies using supervised and unsupervised classification techniques applied to downcore logging data. Model results are validated through comparison to lithofacies identified from a visual facies analysis of core photographs, CT scans, and X-radiographs. The goal is to test the limits of these classification techniques in identifying lithofacies representing both glacial and interglacial, and small-scale variations in glacier extent within heterogeneous diamict for a temperate glacial setting in the Gulf of Alaska, Integrated Ocean Drilling Program (IODP) Site U1419 (Fig. 2). Based on the shipboard age model, IODP Site U1419 extends back to the Late Pleistocene (<60,000 yr; Jaeger et al., 2014) and contains a last glacial maximum (LGM) to Holocene deglaciation sequence based on comparison with a neighboring well-dated core (Davies et al., 2011). These modeling techniques may identify subtle lithofacies variations and provide an alternative to using costly CT scans and/or X-radiographs to highlight features not otherwise observable from split core surfaces during VCD. Integration of scanning XRF data also informs on downcore compositional changes that are geochemical rather than visual and are typically observed in smear slides. Classification analyses also present a reproducible approach to categorizing core properties. A secondary objective is to test how different types of core logging data (i.e., physical properties versus elemental composition) and classification routine (e.g., supervised discriminant and SVM; unsupervised hierarchical versus model clustering) affect the classification analysis results. Previous studies using cluster analysis to identify glacial lithofacies used wireline downhole logging data (Williams et al., 2012; Hunze et al., 2013), which are not always collected in association with coring. This study tests the ability of logging data that are commonly collected on sediment cores (i.e., magnetic susceptibility, natural gamma radiation, color, scanning X-ray fluorescence) to predict lithofacies. When applying classification techniques to geologic data, there is still ambiguity over which method provides the most valid results (Templ et al., 2008); thus this study applies common classification techniques to evaluate their utility in glacimarine studies.

CT Imagery and Digital X-Radiography

Computed tomography (CT) scan images and X-radiographs of sediment cores allow for a detailed, nondestructive view of the entire core (e.g., St-Onge et al., 2007); this view is useful for identifying textural features such as sedimentary structures (Van Daele et al., 2014), bioturbation (Gagnoud et al., 2009), variations in sediment density (Ashi, 1997), and ice-rafted debris (IRD; Andrews and Principato, 2002; Jennings et al., 2018). Computed tomography scan images are cross-sectional slices of the core volume (Ketcham, 2005); while X-radiographs present an integrated view of the whole volume imaged.

U-channels from Site U1419 composite splice sections (Jaeger et al., 2014) were scanned at the Institut National de Recherche Scientifique–Eau Terre Environnement (INRS-ETE, Quebec City, Canada) using a Siemens SOMATOM Definition AS+ 128 CAT-scanner with a current range of 300 mA and a voltage of 140 kV. X-radiograph images were collected on the archive halves of Site U1419 at the IODP Gulf Coast Repository at Texas A&M University using a digital X-ray panel and veterinary X-ray unit (Walsh et al., 2014) on core sections that required additional imaging of a wider field of view to better discriminate features.

Downcore Logging

Downcore logging data are collected directly on whole and half-round core sections. This study uses downcore logging data commonly collected during and after IODP expeditions: physical properties and scanning X-ray fluorescence (XRF) elemental data.

Physical Properties

Physical properties were rapidly measured shipboard to relate with core lithology (Jaeger et al., 2014). Three of these properties—b* color reflectance, volume-normalized magnetic susceptibility (MS), and volume-normalized natural gamma radiation (NGR)—are used in this study as cluster model inputs. Magnetic susceptibility in a sediment core is a measure of how magnetized the sediment becomes when exposed to an external magnetic field (Hatfield et al., 2013). It is often used as a proxy for changes in sediment composition and lithology (Blum, 1997; Jessen et al., 2010). Regionally in the Gulf of Alaska, MS has been shown to covary with grain size; low MS values are found in clay-rich sediment and high MS values with sandy and silty sediments (Fig. 3; Cowan et al., 2006; Jaeger and Kramer, 2014; Walczak et al., 2017). NGR is naturally occurring gamma-ray radiation derived from K, U, and Th (Blum, 1997). It is a common tool used for interpreting lithology because K and Th are generally concentrated in clay minerals; thus, high NGR values indicate a more clay-rich lithology (Blum, 1997). Both MS and NGR data sets used in this study are volume normalized to account for variable filling of core liners (Walczak et al., 2015). The b* color reflectance represents sediment color on the International Commission on Illumination (CIE) color space; blue (negative) to yellow (positive) scale (Blum, 1997). Within Site U1419, sediment with higher b* values appears greener in core photographs. Variations in color reflectance have been correlated with lithologic changes primarily due to changes in productivity (Vidal et al., 1998; Peterson et al., 2000; Nederbragt et al., 2006; Pan and Chen, 2013), with higher b* values (greener) associated with periods of increased diatom productivity (Nederbragt and Thurow, 2001; Debret et al., 2006; Debret et al., 2011). At Site U1419, quantitative color from RGB values of digital line scan images of core EW0408 85JC shows this relationship of greener sediments with higher biogenic silica concentrations (Fig. 3; Addison et al., 2012). Magnetic susceptibility and NGR are measured on an 8-cm- and 10-cm-long whole-core volume and b* every 2 cm on the split core surface (Jaeger et al., 2014).

Scanning X-Ray Fluorescence

Scanning XRF elemental data were collected approximately every 2 cm, post-cruise on the Site U1419 composite spliced core (Penkrot et al., 2017a). To account for core surface inhomogeneity in grain size and porosity, which affects X-ray attenuation (Tjallingii et al., 2007), the scanning XRF data were calibrated with the normalized median-scaled method (NMS; Lyle et al., 2012). Ninety-five discrete bulk samples representing the full range of lithologies within Site U1419 were analyzed for major, minor, and trace elements via fused bead XRF analysis (Yamada, 2010; Watanabe, 2015) and an Element 2 inductively coupled plasma–mass spectrometer (ICP-MS) (Kamenov et al., 2009) at the University of Florida. Analytical uncertainty for all elements is better than ±5%. Normalized scanning XRF data for each element used are available in Supplemental File S11. Elements used for classification analysis are restricted to those most sensitive to grain-size variations (Al, Ca, K, Si, Rb, and Zr; Rothwell et al., 2006; Rothwell and Croudace, 2015). Elemental ratios of these elements successfully capture major lithologic changes in core EW0408 85JC (Fig. 3; Barron et al., 2009). Elements sensitive to changes in provenance (e.g., light rare-earth elements [LREEs], Sc, Th, etc.; McLennan, 1989; Kaiser et al., 2007) and digenesis (e.g., Mo, V, Fe, etc.; Piper, 2001; Barron et al., 2009) are avoided.

Bulk Mineralogy

Bulk-sample X-ray diffraction patterns were collected shipboard for mineralogy (Jaeger et al., 2014). Quantitative analyses of these scans were performed using the RockJock program in standardless analysis mode (Eberl, 2003).

Lithofacies Classification Analyses

Access to high-level computing and high-resolution sedimentary property data have driven many approaches that use machine learning to classify sedimentary facies from multivariate data sets. Most attempts have focused on classifying bulk sedimentary facies (e.g., mudstone, sandstone, marl, and limestone) and have had varying success when models are validated against observed lithofacies (Dubois et al., 2007; Hall, 2016; Sahoo and Jha, 2017). Successful attempts occur when there are numerous samples of classified data to use in training sets and lithofacies criteria are primarily compositional and/or mineralogical, which allows for effective classification by proxies such as natural gamma-ray logs (Benaouda et al., 1999; Dubois et al., 2007; Tang and White 2008).

The two modeling approaches used are supervised and unsupervised classification. In the former, algorithms are supplied with classified training data that let the model learn the relationships between the measurements and the classes (lithofacies), which are then assigned to unclassified measurements (Benaouda et al., 1999; Davis, 2002; Dubois et al., 2007; Tang and White, 2008). Unsupervised classification of lithofacies generally groups, or clusters, samples together based on common traits with a goal of maximizing the difference between groups. A primary limitation of this approach is that the number of groups is not known in advance, and interpretation of the clusters in terms of controlling factors can be ambiguous (Templ et al., 2008; Ellefsen et al., 2014). A potential benefit of applying both approaches is that the supervised classification can objectively highlight the parameters or variables that best correlate with individual lithofacies, which can later help interpret the unsupervised clustering results.

Supervised Classification-Discriminant Analyses

Linear and quadratic discriminant analyses (LDA and QDA) result in classification boundaries based on the calculation of discriminant scores that assign observations to a specific group (Davis, 2002). The LDA scores are calculated by finding linear combinations of the independent variables providing linear decision boundaries that assume equal covariance of variables in all groups. The QDA relaxes the requirement of equal covariance and is able to provide more accurate nonlinear, quadratic classification boundaries. Both techniques require Gaussian distributions of parameter values within groups; so outliers need to be eliminated and the distribution of each variable analyzed for each group (lithofacies) to test for normality. In general, QDA is less restrictive than LDA, but because it has a larger number of parameters to fit in its calculation, it requires larger data sets for the given number of groups and/or variables being analyzed. Both techniques use a Bayesian approach to provide a probability of group membership.

Supervised Classification-Support Vector Machines

Support vector machines (SVMs) have evolved to be the most flexible and used machine learning approach for supervised learning. This popularity arises because it requires fewer parameters to be fitted in the model and is less restrictive than LDA or QDA. Although as with all regression approaches, outliers in the data will reduce model accuracy. Within a SVM, data points are projected into higher-dimensional space using the kernel approach, where the points effectively become linearly separable into different classes. The optimal hyperplane between the groups (lithofacies) establishes the edges between the closest points of the classes, with points falling on the wrong side of the classification weighted down to minimize their importance of setting boundaries. A subset of training points provides the probability of group membership through a computationally expensive cross-validation approach, which in the past has been a primary restriction on the use of SVMs.

Unsupervised Cluster Analysis

Clustering is a common multivariate statistical method that is useful for grouping observations into homogeneous clusters. Two different clustering routines are applied to the downcore logging data—hierarchical clustering and model clustering. Both routines have been applied to geochemical and logging data (Templ et al., 2008; Hunze et al., 2013; Ellefsen et al., 2014), with no consensus on the ideal clustering technique for these data types.

Hierarchical clustering is based on the multivariate distance between observations that can be either agglomerative or divisive. In this study, we apply an agglomerative approach using Euclidean distances and the Ward method (Ward, 1963) of clustering (Templ et al., 2008). This analysis was performed in the R programming language using the “hclust” function (R Core Team, 2017).

Model clustering is based on specific models that describe the multivariate distribution of the data points forming clusters rather than on the distance between observations (Fraley and Raftery, 2002). In this study, we use a modified mixture-model clustering method from Ellefsen et al. (2014) for geochemical data which is based on a finite mixture model that was first presented by Templ et al. (2008). This analysis was performed in the R programming language using the “mclust” function (Fraley et al., 2012).

Supervised Classification Data Preparation

The accuracy of classification data analysis is highly dependent upon the intergroup variance of the training data, which requires a robust evaluation of how the data fit the assumptions that are part of each analytical technique. Our data preparation procedures follow Gorman Sanisaca et al. (2017), who use multivariate sediment data to assign membership to distinctive sources. All data preparation is performed using the statistical software R (R Core Team, 2017) and is presented in File S1 (footnote 1) for evaluation and implementation by others.

Data preparation involves screening data and then determining the optimal variables for use in classification algorithms. Because we use geochemical data in our model, we use preparation techniques appropriate for these closed data (i.e., expressed in units that sum to a constant; Davis, 2002; Templ et al., 2008). Data outliers are removed using the outCoDa function in the robCompositions package in R, where outliers are detected based on (robust) Mahalanobis distances after an isometric log-ratio transformation of the data (Filzmoser and Hron, 2008; Templ et al., 2011). This routine identified 28% of data as outliers (4661 used/6490 total). We test for any constant or almost constant predictors across samples using the function nearZeroVar from the caret package in R (Kuhn et al., 2017). Multivariate data sometimes contain predictors that take a unique value across samples (e.g., zero ppm concentrations), which add little discriminatory power to models and/or can complicate model result interpretations. A nonparametric Kruskal-Wallis test is used on each variable to see if the different lithofacies have unique values that allow for potential discrimination between lithofacies. Data are interrogated for collinearity (r >0.9) between variables, which can lead to serious stability problems in discriminant function analyses (Davis, 2002). A Levene test is used to test for homogeneity of variance (Gorman Sanisaca et al., 2017), which showed that several parameters (e.g., Ca and Zr) do not meet the LDA requirement of homogeneity; therefore, only QDA techniques are used. Closed compositional data are opened for classified models using a center-log ratio transform (Templ et al., 2008). Physical property data were standardized (z-score) to account for differences in measurement units when calculating distance coefficients (Milligan and Cooper, 1988; Kynčlová et al., 2016). Finally, for supervised classification models, we use a forward stepwise linear discriminant function approach to establish which parameters were most important in separating the lithofacies types using the function “greedy.wilks” in the R package klaR (Weihs et al. 2005). These parameters are then grouped for analyses in QDA and SVM models. Supervised model accuracy is evaluated using a cross-validation approach within the caret package in R (Kuhn et al., 2017). The number of training samples among the different lithofacies is balanced to prevent accuracy bias in the prediction model toward the most abundant lithofacies.

Unsupervised Clustering Procedure

Three physical properties (b*, MS, and NGR) and six elements (Al, Ca, Si, Rb, K, and Zr) were selected to be used as our multivariate cluster model inputs. Both clustering routines were run with three different groups of input data: group 1 (G1)—physical properties and elemental data inputs (b*, MS, NGR, Al, Ca, Si, Rb, K, and Zr); group 2 (G2)—physical properties only (b*, MS, and NGR); group 3 (G3)—elemental data only (Al, Ca, Si, Rb, K, and Zr).

Prior to clustering, the input data were rescaled at a 2 cm resolution and treated to avoid artifacts. An isometric log-ratio (ilr) transformation was performed using the robCompositions R package (Templ et al., 2011) on the NMS-normalized scanning XRF elements in order to open the data and remove the effects of a compositional closed data set (Templ et al., 2008). Physical property data were normalized (z-score). Finally, the input data were transformed into robust (i.e., no outliers) principal components (Templ et al., 2011) before clustering to reduce dimensionality, which helps to reduce uncertainty encountered while clustering (Fraley and Raftery, 2002), remove correlation between input variables, and improve cluster stability (Ellefsen et al., 2014). These principal components were then used in hierarchical and modified mixture-model clustering routines written in R (File S1 [footnote 1]). For both clustering routines and each input variable group, the spatial relationship of the resultant clusters is plotted downcore, creating a cluster lithology column.

Characterization of Observed Lithofacies

Twelve lithofacies are defined for Site U1419 based on shipboard visual core descriptions, smear slides, and core photographs supported by post-expedition digital X-radiography and CT scans. Lithofacies are delineated based on sediment texture, color, degree of bioturbation, and sedimentary structures (Fig. 4 and Table 1). In general, these lithofacies match those identified shipboard (Jaeger et al., 2014). However, CT and X-radiograph imagery allows for better identification and subclassification of diamict lithofacies than visual analysis of split core surfaces.

Diamict lithofacies are the most commonly observed lithofacies in Site U1419 (Fig. 4). Site U1419 diamicts are defined as poorly sorted facies containing >1% by area of clasts greater than 2 mm floating in a fine-grained matrix (Jaeger et al., 2014). Clast-poor diamicts contain 1%–5% clasts, and clast-rich diamicts contain >5% clasts. Mud with <1% clasts is classified as mud with dispersed clasts or lonestones. Stratified diamicts are defined by interlayered low-density (assumed to be mud-rich) stratification that ranges from a few mm to ∼1 cm thick, has diffuse contacts, and is only visible in CT scans. Sandy diamict lithofacies are higher density than clast-poor diamicts and tend to have sharp lower contacts and gradational upper contacts. Due to their density difference, they are most easily recognized in the CT scan images, and they tend to be relatively thin, ranging from a few cm to ∼20 cm. Interstratified massive diamicts are defined by 3–20 cm intervals of clast-poor diamict interstratified with clast-rich diamict and silt stringers. Clast-rich diamicts containing folded and/or crosscut silt stringers are occasionally observed.

Non-diamict lithofacies comprise the remainder of the composite splice. Interbedded mud and sand with dispersed clasts are recognized by alternating very thin to thin, mud and fine sand beds that have sharp bottom contacts and gradational upper contacts and <1% clasts. Massive sandy mud with dispersed clasts is bioturbated and contains <1% clasts. Laminated mud and ooze with dispersed clasts contain mud and/or silt laminae, appear greener in color than the surrounding gray diamicts, and contain abundant diatoms (Jaeger et al., 2014). Laminated mud lithofacies contain mud and silt laminae with gradational contacts. The massive mud facies are green in color, composed mainly of silt and clay, have abundant diatoms (Jaeger et al., 2014), and are structureless except for a few preserved burrows that are visible only in CT scans. Neither the laminated mud nor massive mud lithofacies contain clasts.

Site U1419 Lithostratigraphy

We use these 12 lithofacies to generate a lithostratigraphic section at cm-scale resolution for the composite Site U1419 record (Fig. 5A). For ease of comparison with our other data sets and cluster analysis testing, we condense this high-resolution record into a bimodal mud and diamict lithostratigraphy (Fig. 5B) and a diamict-only lithofacies (Fig. 5C). For the bimodal mud-diamict lithofacies section, any lithofacies that contain clasts are grouped into a general diamict facies, and those lacking lonestones are grouped into a general mud facies (Table 2). The mud lithofacies represents 7% of the composite splice, while the diamict portion is 93% of the core (Table 3).

We further subdivide the diamict portion of the lithostratigraphy at a vertical scale that allows for testing of whether the subtle, finer vertical-scale lithofacies can be identified using cluster analysis of downcore logging data. We focus on the diamict interval because diamict facies comprise a majority of the spliced section and contain ten out of the 12 visually observed lithofacies. In addition, the diamict facies likely contain much of the information on spatial variations in glacier extent (e.g., Cowan et al., 1997). We combine the ten clast-bearing lithofacies into five simplified lithofacies based on texture and concentration of lonestones (clasts)—clast-poor massive diamict, clast-poor stratified diamict, clast-poor sandy diamict, clast-rich diamict, and mud with lonestones, creating a diamict-only lithofacies (Table 4; Figs. 4B–4F and 5C 

Many of the ten clast-bearing lithofacies display fairly similar logging properties (Fig. 6; Fig. S1 [footnote 1]), and combining the most texturally similar facies allows for greater differentiation of the facies that are modeled through classification analysis. The facies that are condensed into overarching groups vary subtly, often by features not easily identified by the logging data (e.g., bioturbation), and they are interpreted to be deposited under similar glacier conditions. Additionally, some of these facies are only seen within very small intervals a few cm thick (i.e., sandy clast-poor diamict), which alone would be aliased by the resolution of the logging properties, which range from 2 to 10 cm.

Downcore Logging Data

The downcore logging data resolve core properties at a range of vertical scales that generally correspond with lithostratigraphy (Fig. 7). Sediment color reflectance (b*) and MS appear to generally be inversely correlated, with highest b* and lowest MS values occurring from 0–6.32 m CCSF-A, and the lowest b* and highest MS from 6.32–96.64 m CCSF-A. In the lower interval, several peaks in b* occur, with values similar to those found in the upper 6.32 m CCSF-A of core. Natural gamma radiation values mirror MS with the lowest values found in the upper 6.32 m CCSF-A, with higher and relatively consistent values observed below.

Aluminum and Si generally covary with average values increasing downcore. Both K and Rb show highest values in the upper 6.32 m CCSF-A of the core and appear to be correlated. A major low in K is observed between 83.2 and 91.4 m CCSF-A that is not seen in Rb. Calcium displays the most downcore variability of any of the elements and appears to roughly correlate with MS below 6.32 m CCSF-A. Average Ca concentrations decrease toward the bottom of the composite splice, with the most variability observed between 80.7 and 94.4 m CCSF-A. Zirconium is roughly inversely correlated with K and Rb, with values showing a stepwise increase at 6.32 m CCSF-A, and then steadily decreases until 73.6 m CCSF-A. Between 73.6 and 80.9 m CCSF-A, Zr displays high variability, and at 80.9 m CCSF-A, Zr displays a stepwise decrease in values.

Logging Properties of Lithofacies

To relate changes in logging properties with visually observed lithofacies, we compare the standardized (z-score) value of each logging property to the lithofacies (Fig. 6). The median and interquartile ranges of z-scores are calculated for each logging parameter within each lithofacies (File S2 [footnote 1]). Logging properties having positive median z-score values are relatively enriched, and those with negative values are relatively depleted, with the absolute value a measure of the degree of deviation from the mean of that property for the entire analyzed section (i.e., a value of +1 indicates enrichment at 1σ).

For the bimodal mud-diamict lithostratigraphic grouping (Figs. 6A and 6B), the diamict facies has nearly average concentrations of all the logging properties (i.e., value ∼0). In contrast, the mud facies is average for Zr, very enriched (∼+2σ) in b* (very green) and Rb, moderately enriched (∼+1σ) in Ca and K, moderately (∼–1σ) depleted in Al and Si, and very depleted (∼–2σ) in NGR and MS (Table 4). Because diamict is 93% of the total described section (Table 3), it is understandable its logging properties have standardized values close to zero; whereas the mud lithofacies displays greater variability.

For the diamict-only lithostratigraphic grouping, there is more variability in the range of standardized logging values within each sublithofacies. In this diamict-only interval, clast-rich diamict is the dominant lithofacies, representing 62.2% of the core, while stratified diamict makes up 14.7%, sandy diamict 10.5%, clast-rich diamict 9.7%, and mud with lonestones 3.0% (Table 3). Clast-poor diamicts have a comparatively average composition with slight enrichment in Al, K, and Si and slight depletion of Rb, Ca, and Zr (Fig. 6C). Stratified diamicts are defined by moderately enriched K, Rb, and Zr (∼+0.5σ) and moderately depleted Si, MS, and NGR (∼–0.5σ). Both sandy and clast-rich diamicts are compositionally similar, being moderately depleted in b*, Al, K, Rb, and Ca. However, sandy diamict is moderately enriched in Zr, while clast-rich diamict is moderately enriched in Si, moderately depleted in Al, and very depleted in K. Mud with lonestones is moderately enriched in b*, NGR, Si, and Ca, and moderately depleted in Al and Zr.

Principal Components

For the unsupervised cluster analysis, we reduce multivariate space by using only the first three principal components (PC). Regardless of samples or input parameters used in the factor model, PC1 represents 38%–54% of the variability, PC2 25%–34%, and PC3 15%–28%, with the three overall representing 90%–100% of the total variance (Fig. 8). Physical properties (b*, MS, and NGR) appear most useful for delineating variations in composition (b* and MS) and clay content (NGR). Elemental data appear to best capture differences in sand versus clay content (Ca, Zr, and Si versus K, Rb, and Al) and composition (Ca and Si).

Mud-Diamict Lithostratigraphy

For data groups G1 and G2, PCs 1 and 2 are dominated by parameters that tend to follow the relative parameter importance shown in Figure 6A. Magnetic susceptibility, NGR, and b* values dominate the variance explained by PC1 and PC2 (Figs. 8A and 8B), suggesting that these PCs reflect the relative difference in grain size and composition between the mud and diamict lithofacies. For the element-only data group G3, there is a greater variability on the composition of each PC (Fig. 8C). PC1 of G3 is largely influenced by Al and Si, which are both moderately depleted in the massive mud facies (Fig. 6A). For PC2, Ca, Si, and Zr group together and may represent a sandier component; whereas Al, Rb, and K group together and may represent a muddier component. PC3 is mostly represented by Ca alone, and may represent a nontextural (i.e., biogenic) Ca component.

Diamict-Only Lithostratigraphy

For data groups G1 and G2, PC1 is dominated by NGR and b* (Figs. 8D and 8E), suggesting its variance is explained by differences in grain size and composition. Magnetic susceptibility is the dominant variable for PC2 in G1 and PC3 in G2, and may represent a difference in sediment composition. PC3 in G1 and PC2 in G2 are mainly influenced by a b* and NGR grouping, representing a muddy, greener, diatom-bearing component. For G3, an Al and K versus Zr grouping dominates PC1, suggesting it represents a muddier versus coarser variant (Fig. 8F). Silicon, possibly representing a sand component, largely controls PC2. PC3 is predominately influenced by Ca but not with Si, suggesting it is representing biogenic Ca rather than Ca-rich plagioclase.

Supervised Classification Model Results

Quadratic discriminant analysis and SVM model results differ greatly in accuracy between the simpler mud-diamict lithostratigraphy and the more complex diamict-only lithostratigraphy. For the mud-diamict case, both QDA and SVM models are 100% accurate with a Cohen’s Kappa value of 1. The Kappa statistic is a metric that compares an observed accuracy with an expected accuracy given the relative abundance of that group or lithofacies (Fleiss et al., 2013). It is a more accurate measure of model performance when there is a strong imbalance in the number of observations in a particular class (e.g., diamict). In decreasing order of importance in the accuracy of the models are NGR, Rb, b*, Al, Si, MS, and Ca (File S1 [footnote 1]).

Neither supervised modeling approach is accurate with the diamict-only lithofacies. The Greedy Wilks analysis identified Si and Zr as the parameters that explained most of the variance in the diamict-only data set (File S1 [footnote 1]). The QDA model is only 44% accurate (Kappa = 0.21), and the SVM model is 42% accurate (Kappa = 0.22). In general, both models struggle with the nonmassive diamict facies; they successfully classify massive diamict correctly but also tend to classify the other lithofacies as massive diamict (File S1 [footnote 1]). In these models, the relative importance of parameters varies for each lithofacies, with b*, Si, and Al being relatively the most important in successfully distinguishing between the lithofacies groups.

Unsupervised Clustering Model Results

Clustering using the hierarchical and modified-mixture model techniques results in highly variable cluster lithologies that depend on the clustering routine and input data used (Figs. 9 and 10). A total of 12 cluster-based lithologies are produced—one for each of the input data groups (G1, G2, and G3) and clustering routine (hierarchical and modified-mixture model), for both the mud-diamict and diamict-only lithostratigraphies. Note that the cluster identifier (e.g., cluster 4), is arbitrarily set by the analysis code and is not consistently associated with a particular cluster-based lithology. For this study, we renamed the clusters in rank order so that cluster 1 indicates the most common cluster and Cluster 5 the least common.

Mud-Diamict Lithostratigraphy

Modeled cluster groupings have varying degrees of similarity with the observed mud-diamict lithostratigraphy. Although each clustering approach can create more than two cluster groups, for this first scenario, we limited models to two clusters to best compare with the bimodal mud-diamict lithostratigraphy. We compare the clustering results with the observed lithofacies by comparing the z-score of the variables within each cluster to those of the observed lithofacies. This helps identify which cluster is most similar to the lithofacies we are attempting to model (Fig. 11 and Fig. S2 [footnote 1]).

Data groups G1 and G2 have similar cluster modeling results (Fig. 9). Both the hierarchical and mixture-model clustering results contain a cluster that is related to the mud lithofacies. This cluster contains enriched b*, K, Rb, and Ca, and depleted NGR, MS, and Si values that are indicative of the greener, fine-grained mud lithofacies. The second cluster is composed of average physical property values and elemental concentrations that represent the volumetrically dominant diamict facies. Both clustering routines show similar logging property enrichment and depletion patterns within the clusters for G1 and G2. However, the hierarchical clustering routine reveals more highly enriched or more highly depleted logging properties, while the mixture-model clustering results show only moderately enriched or moderately depleted properties (Fig. S2 [footnote 1]).

G3 cluster results differ greatly from G1 and G2. There is no cluster for this model that definitively represents the distinctive mud facies (i.e., enriched b*, K, Rb, and Ca and depleted NGR, MS, and Si) (Fig. 9; Fig. S2 [footnote 1]). Both hierarchical and mixture-model cluster results contain similar variations in logging properties, with one cluster containing moderately enriched Al, moderately depleted Zr, and nearly average values for all other logging properties. Logging properties of the second cluster are opposite, with moderately enriched Zr, moderately depleted Al, and nearly average for all other variables.

We observe that the vertical heterogeneity of the logging data produces varying amounts of vertical detail represented by each cluster group. Data groups G1 and G2, which contain physical property logging data, have less downcore variability between clusters than G3, which only utilizes the scanning XRF elemental data. This is illustrated by the fact that G1 and G2 cluster lithologies have one cluster that comprises the top 6.32 m CCSF-A core (mud portion of the core), with the second cluster dominating the portion of the core below 6.32 m CCSF-A (diamict portion of the core). For G3, there is no single cluster that describes the entire top 6.32 m CCSF-A of core, and the section of core below 6.32 m CCSF-A frequently alternates between clusters. The difference in downcore cluster variance also is evident in the percentage of core contained in the dominant cluster (Table 5). There is little difference between G1 and G2 for the percentage of core section represented by the dominant and minor cluster. Within each data group (G1–G3), the model-based clustering approach has more downcore variability than hierarchical clustering (Fig. 9). This is also evident in the cluster percentages, where the difference between the two cluster lengths is greater for the G1 hierarchical clusters (92% versus 8%) than the G1 mixture-model clusters (76% versus 24%; Table 5).

Diamict-Only Lithostratigraphy

We suggest that cluster modeling can better resolve the heterolithic nature of this site if the number of clusters is increased to reflect the increased number of lithofacies and by limiting the input data to only the diamict portion of the core. As with the mud-diamict lithostratigraphy, we chose to limit the number of cluster groups to match the number of end-member lithologies we are trying to model (five). The same three data groups (G1–G3) are used for modeling the diamict-only lithostratigraphy. To identify which cluster is most similar to each observed lithofacies, the z-score of each logging property within each cluster again is compared to the observed diamict lithofacies (Fig. S3 [footnote 1]).

As expected, diamict-only lithostratigraphy modeling results are vertically more complex than those for the mud-diamict lithostratigraphy (Fig. 10). The most common cluster from each model result contains relatively average logging properties common to the observed clast-poor diamict that dominates this interval of core (Table 5; Fig. S3 [footnote 1]). However, the remaining clusters do not contain logging property associations that can be associated with the observed lithofacies (Fig. 6). Cluster groups for G1 and G2 hierarchical clustering and G3 hierarchical and mixture-model clustering are nearly identical; while G1 and G2 mixture-model clustering results are distinct (Fig. 10).

Our results indicate that it is possible to use supervised and unsupervised machine learning models on physical and elemental core logging data to classify glacimarine lithofacies with varying success. Bimodal mud-diamict lithofacies interpreted as nonglacial versus glacial conditions can be successfully identified using physical properties data augmented with scanning XRF analyses. In both QDA and SVM supervised classification models, the combination of these parameters resulted in high accuracy. For unsupervised cluster analysis, the inclusion of scanning XRF elemental abundance with physical property data offers only a slight improvement in model success. For the heterolithic diamict-only glacimarine lithofacies that are interpreted to represent variations in glacier extent, all modeling approaches are less successful; although unsupervised cluster analyses provided slightly better accuracy.

Supervised classification analysis is most useful for discerning between lithofacies when the number and types of facies expected is known. When using classification analysis as an alternative to CT scans and/or X-radiographs to highlight subtle lithofacies variations, unsupervised cluster analysis is preferable because it does not constrain the model to already known facies. The following discussion will primarily focus on interpretation of the unsupervised cluster models, which offers the opportunity to provide new information from the natural clustering of the logging data. We contend that the unsupervised cluster modeling results serve as a valuable complementary data set alongside visual observations and imaging for lithofacies coding, rather than as a replacement for a visual core description and CT scan and/or X-radiograph imaging. When cluster-model results are compared to observed lithostratigraphy and downcore plots of elemental ratios, a basic record of glacial dynamics can be discerned.

Evaluation of Classification Approaches

Both supervised and unsupervised classification approaches were most accurate with the bimodal lithofacies grouping but were less so when applied to the diamict-only interval. Quadratic discriminant analysis and SVM performed the most accurate classification without misclassification for the bimodal lithofacies, followed by hierarchical clustering. These three approaches offered the most accurate and interpretable results for the mud-diamict models presented in this study; whereas either classification approach used for the diamict-only models had little influence on the results, with no model able to accurately predict the diamict-only lithofacies. Unsupervised mixture-model clustering does not perform well when differentiating between clusters with uneven number of observations, such as those in the mud-diamict lithostratigraphy (Table 3; Ertöz et al., 2003). While the clusters within the diamict-only section are not completely even, they are more so than the mud-diamict clusters, and so the mixture-model clustering performed as well as hierarchical clustering. Overall, choice of input data had a greater impact on the clustering results than the clustering routine.

When using hierarchical clustering, choosing the appropriate number of clusters is critical to model performance, and with little to no a priori information on lithofacies, it is difficult to constrain the appropriate numbers of clusters (Milligan and Cooper, 1985). However, if the number of expected lithofacies within a model can be constrained, hierarchical clustering becomes more useful. When the number of clusters is unknown, mixture-model clustering may be a better choice, because the Bayesian information criterion (BIC) can be useful for informing on an appropriate number of clusters to select (Fraley and Raftery, 1998; Raftery and Dean, 2006). Plots of BIC values for models used in this study show that our choice of using two and five clusters for our mud-diamict and diamict-only lithostratigraphy modeling is appropriate (Fig. S4 [footnote 1]).

Evaluation of Model Input Data

Physical property logging data are best at capturing end-member lithologies, such as mud and diamict. This is because the greatest variation in physical properties between lithofacies is observed at the end-member scale (Figs. 3 and 6). The measurement resolution of the physical property logging data is too low to resolve thin beds of alternating lithofacies and only captures thicker major facies. Magnetic susceptibility and NGR measurements represent an average of an 8 cm and 10 cm interval, respectively; while b* and the scanning XRF elemental data are point measurements collected every 2 cm (Jaeger et al., 2014; Penkrot et al., 2017a). Some lithofacies within Site U1419 contain beds that are relatively thin—approximately a few cm (i.e., sandy diamicts) to 10–20 cm (i.e., mud with dispersed clasts). These facies are poorly resolved over the 8–10 cm of core that are integrated by the MS and NGR measurements, which confines the usefulness of our MS and NGR data sets to simple, end-member lithostratigraphies where beds occur on the meter scale rather than the cm scale. Cross plots of MS versus b* plotted by observed lithology illustrate the ability of the physical properties to better distinguish between the bimodal mud-diamict lithofacies than the heterogeneous diamict-only lithofacies (Fig. S5 [footnote 1]).

Scanning XRF elemental data are most useful in identifying subtle variations in sediment composition and texture that occur between lithofacies. Proxies for both coarse (Zr and Si) and fine-grained (Al, K, and Rb) minerals are represented in the elemental data. Out of the three shipboard-measured properties utilized in this study, NGR is the only one able to capture changes in grain size as a proxy for the relative amount of clay minerals. Lithofacies variations within the diamict-only section are primarily defined by differences in the amount of coarse grains (i.e., clast-rich diamict versus clast-poor diamict versus mud with lonestones); thus the elemental proxies for coarse grains provide more distinguishing power than NGR.

Inclusion of MS, NGR, and b* with elemental data for diamict-only classification did not improve overall accuracy for any method. In cluster analysis, it did not create a more accurate model of cluster lithology due to high loading scores of the physical properties that may overweight their importance compared to the elemental data (Figs. 8A and 8D). We recognize that this is caused by using standardized values for physical properties and log-ratio values for elemental data. However, when using mixed data treatment such as this, the resulting relationships between variables is not altered, and it is acceptable to combine both treatments into a single joint matrix prior to PCA analysis (Kynčlová et al., 2016). Both types of classification approaches for the diamict-only lithofacies that utilized b*, which has higher sampling resolution than MS and NGR, along with the elemental data, did not produce results that are able to accurately predict the observed lithofacies.

Evaluation of Cluster Model–Lithostratigraphy Relationships

Statistical measures are used to evaluate the fit between the unsupervised cluster model results and the observed lithology, while logging property distribution within each cluster allows for a textural and/or compositional interpretation to be made for the cluster. A Pearson’s chi-square test is used to test if the downcore position of cluster groups correspond with distinctive lithology categories, and Cramér’s V-value is used to test the strength of this relationship (Cramér, 1946; Liebetrau, 1983). Cramér’s V-value can vary from 0 to 1, with higher values indicating a higher association between cluster and observed lithofacies category.

Mud-Diamict Lithostratigraphy

Statistical evaluations for the six mud-diamict cluster models are found in Table 6. For all cluster models, a Pearson’s chi-square test results in a p value <<0.05, suggesting cluster modeling likely is able to capture the observed transition in end-member glacial diamict to nonglacial mud lithofacies. The diamict-to-mud transition is best approximated by the hierarchical clustering approach, with data group G1 having the highest Cramér’s V value. Incorporation of the scanning XRF data provides only a slight improvement on model fit over the hierarchical clustering of only physical properties (G2). The scanning XRF data only model (G3) has the least success. Within the hierarchical cluster models using G1 and G2 data, one cluster group contains average logging properties that are nearly identical to the diamict lithofacies (Fig. 10). The second cluster group contains an enriched b*, K, Rb, and Ca and depleted NGR, MS, and Si signature that is indicative of the more biogenic (green), fine-grained mud lithofacies.

Diamict-Only Lithostratigraphy

Statistical evaluations for the six diamict-only cluster models are found in Table 6. For all models, a Pearson’s chi-square test resulted in a p value <<0.05, suggesting downcore changes in clusters correspond at this probability level with variations in lithofacies. However, within the heterolithic diamict portion of the spliced interval, cluster modeling was not successful at direct correlation between unique cluster and lithofacies pairs (Table 7). The lack of correspondence also is observed in the Cramér’s V values for the diamict-only cluster models, which are much smaller than those for the mud-diamict models. The cluster model with the highest Cramér’s V value is the mixture-model clustering using only scanning XRF elements (data group G3) (Table 6). This cluster model is nearly identical to the hierarchical clustering model of the same inputs, only having a slightly higher Cramér’s V value.

For the data group G3 mixture-model, cluster 1 contains the largest number of observations (n = 2294; Table 7). Note that physical property data were not used in this model, but we present these values along with the XRF data to better describe and interpret cluster groups. Cluster 1 contains average values (z-score ∼0) for most properties, with moderate enrichment in Al and moderate depletion in b*, MS, Ca, and Zr. This indicates that it is likely a finer-grained lithology and it bears greatest similarity to the most common lithofacies, a clast-poor diamict; these two have a ∼74% co-occurrence in core (Table 7). We interpret this cluster as a mud-rich, clast-poor diamict.

Cluster 2 has the next largest number of samples (n = 1613) and is primarily characterized by enrichment primarily in Ca with moderate enrichment of b*, K, and Rb. Based on comparison with the bimodal mud-diamict models, this cluster could represent a lithology with an increased biogenic component. Bulk mineralogy results from U1419 do not reveal detectable amounts (i.e., >3% using RockJock; Eberl, 2003) of calcite, but smear slides do contain calcareous microfossils (Jaeger et al., 2014). Another explanation for the Ca enrichment could be a change of mineralogy to a more Ca-rich plagioclase, which is observed at Site U1419 (File S3 [footnote 1]). Together, this leads us to classify this cluster as a Ca-rich diamict. We use the b*/Ca ratio to distinguish between a more terrigenous or more biogenic source, with values >0.5 suggesting the Ca is more likely derived from a biogenic source (Fig. 12A).

Cluster 3 has a smaller number of observations (<15%, n = 769) and is moderately enriched in Si and Zr and depleted in b*, Al, K, Rb, and Ca. Zr and Si tend to be found in coarse sediment due to the presence of zircons and quartz in this size fraction (Rothwell and Croudace, 2015). The increased Si-rich quartz dilutes the relative amount of feldspar and heavy minerals where Al, K, Rb, and Ca are found. The coarse-grained texture of this cluster is also reflected in the Al/Si ratio, with lower values indicating a higher quartz sand component compared to clay (Hoang et al., 2010), and the Zr/Rb ratio, with higher values indicating a generally coarser texture (Wang et al., 2011) (Figs. 12B and 12C). We classify this as a coarse-grained diamict.

Cluster 4 also has a small number of observations (<15%, n = 730) and is highly enriched in Zr, moderately enriched in b*, Rb, and Ca, and depleted in NGR, Al, and Si. Depleted NGR and Al and enriched Zr, K, and Rb indicate fewer clay minerals and a greater amount of heavy accessory minerals (Blum 1997; Totten and Hanan, 1998). High Zr/Si and Zr/K ratios indicate a greater amount of heavy minerals in relation to quartz and feldspar (Figs. 12D and 12E). This cluster is classified as a heavy-mineral–rich diamict/mud with lonestones.

Cluster 5 has the fewest observations (∼1%) and appears sporadically downcore (Fig. 10G). As evidenced from the extreme enrichment and/or depletion of scanning XRF logging data (z-score values in excess of ±1–2σ), this cluster appears to represent noise and/or outliers.

Interpretation of Classification Model Results

Whereas the supervised classification approach has the benefit of a priori knowledge of the potential lithofacies for model validation, it is more likely that the physical properties data cannot be prescribed to particular lithofacies. Unsupervised classification provides relatively unbiased evaluation of property grouping and can provide a record of textural and compositional variations that vary independently from the visually observed lithofacies. Interpretation of model results here focuses on the value-added information derived from unsupervised classification models.

Mud-Diamict Lithostratigraphy

Supervised classification models and hierarchical clustering are most successful at identifying the bimodal mud-diamict lithofacies because the clast-free mud has logging characteristics (z-score values) in excess of ±1–2σ when compared with the diamict portion of the core (Fig. 6). This makes it easier to accurately delineate, using any of the classification approaches, the transition from glacial diamict to nonglacial mud at 6.32 m CCSF-A (Fig. 11). This temporal shift from glacially influenced diamict to hemipelagic mud corresponds with the regional retreat of outlet glaciers onto land during the last deglaciation (Davies et al., 2011). A benefit of this bimodal classification scheme is that it can identify intervals similar to those observed in the Holocene within the diamict-dominated portion of the core.

Using the G1 hierarchical clustering results, there are two intervals below 6.32 m CCSF-A that are identified as being similar to the upper Holocene section at 9.44 m and 70.79 m CCSF-A (Fig. 11). These intervals may indicate periods when regional climate was warm enough for an increase in productivity, but not for glaciers to fully retreat onshore, as evidenced by continued deposition of ice-rafted debris within these intervals. To test this interpretation, we compare these two intervals with the more highly resolved VCD (Fig. 4A). The interval at 9.44 m CCSF-A is represented by clast-poor diamict, but the interval at 70.79 m CCSF-A is represented by mud with lonestones. The interval at 9.44 m CCSF-A is very thin, ∼2 cm thick, and most likely represents noise within the logging data, while the mud with lonestones interval at 70.79 m CCSF-A more likely represents a time of relatively higher productivity within a glacier-dominated period.

Hierarchical clustering of the G2 physical property only data set (Fig. 9D) accurately captures the mud-diamict transition at 6.32 m CCSF-A, but it also identifies nine intervals as mud within the diamict portion of the core. Of these nine intervals, only four correlate with a green mud with lonestone facies, and the other intervals are most likely the result of data noise.

In summary, all classification models can identify a major deglaciation event characterized by the transition from diamict to mud, but cluster analysis also is able to identify sections of the glacial portion of the core that can be isolated for inspection to see if they are more similar to the nonglacial section. This biogenic-rich green mud within diamict intervals is important because it may represent warming, without a full collapse of the ice margin. Further paleoceanographic studies of these intervals are needed to provide insights into how glaciers respond to warming climates, helping to understand how tipping points for full glacial retreat may be reached.

Diamict-Only Lithostratigraphy

The relatively gradational and subjective differences in the diamict-only lithofacies are not ideal for either type of classification analysis. Supervised models struggle to more accurately differentiate lithofacies than the unsupervised cluster analysis. Clusters identified from the diamict-only portion of the composite splice do not correspond directly with unique visually observed facies (Fig. 13). This appears to be a consequence of the data sets used in this study. Although ideally chosen to represent textural proxies, all data sets can also be interpreted as compositional proxies, whereas the visual core description focuses on macrotexture (i.e., lonestones) and sedimentary structures.

We argue, rather than providing an alternative to using CT scans and/or X-radiographs, that the unsupervised cluster analysis results, specifically the unsupervised mixture-model clustering, can be used as a compositional component that augments the observed visual lithofacies. This can be used to develop robust lithofacies associations within the lithostratigraphy; these associations allow for study of temporal variations in glacial conditions. However, a highly detailed temporal record of glacial advance and retreat requires input from many complimentary records of regional glacial and climatic proxies (e.g., IRD mass accumulation rate, δ18O, and sea surface temperature) and is beyond the scope of this work. Based on downcore variations in cluster patterns, the diamict-only interval can be separated into three sections—lower (74.85–96.64 m CCSF-A), middle (19.3–74.85 m CCSF-A), and upper (6.32–19.3 m CCSF-A). These sections are distinctive in both clusters and observed lithofacies, reflecting variations in sedimentary processes related to glacier dynamics.

The lower section (74.85–96.64 m CCSF-A) is primarily defined by two thick intervals of cluster 3, interpreted to represent a coarse-grained component, which coincides with a clast-rich diamict and a sandy diamict unit, respectively (Fig. 13). The greater clast count in the clast-rich diamict is indicative of more intense ice rafting (Cowan et al., 1997), in this case perhaps from a quartz- and/or zircon-rich source, while the sandy diamict signifies deposition from turbidity currents (Cowan et al., 1999) in a proximal glacial setting (Ó Cofaigh and Dowdeswell, 2001). From this, cluster 3 can be interpreted to represent a period when the ice stream was relatively proximal to the core location. Thus, the core section dominated by these proximal facies, indicate ice-proximal glacial conditions, relative to site U1419 in the Gulf of Alaska (Fig. 13).

The middle section (19.3–74.85 m CCSF-A) is predominately represented by cluster 1, the mud-rich, clast-poor diamict. The prevalence of this type of diamict suggests a continuous glacial input for this period. However, this section is not homogeneous and contains intervals of cluster 2 (Ca-rich) and cluster 3 (coarse-grained) compositions, co-occurring with visually identified mud with lonestones, stratified diamict, and sandy diamict facies. When the Ca-rich cluster 2 co-occurs with the mud with lonestones facies, it is interpreted here to represent a relatively ice-distal glacimarine environment where productivity increases, but when glacial ice has not fully retreated onshore, as evidenced from the continued deposition of lonestones (Fig. 13). These intervals also have high b*/Ca ratios (>0.5), indicating the high Ca content defining the cluster is most likely a biogenic source (Fig. 12A). It is noted that at this co-occurrence, cluster 2 comprises a thicker interval than identified solely by the visual lithofacies, suggesting that ice-reduced conditions persisted for longer than what is recorded simply by the visual record. When cluster 2 co-occurs with the massive clast-poor diamict or sandy diamict facies, it most likely contains an increased amount of terrigenous-sourced Ca from plagioclase. Also within this middle section, cluster 3 corresponds with a unit of sandy diamict, interpreted as gravity flow deposition, indicating a time of relative glacier proximity. Clast-poor stratified diamicts also are observed in this section, and the mud-rich layers that comprise these stratified intervals are interpreted to be the product of proximal meltwater plume deposition due to their fine-grained nature, diffuse contacts, and cm-scale thickness (Cowan and Powell, 1991; Cowan et al., 1997; Ó Cofaigh and Dowdeswell, 2001; Curran et al., 2004; Dowdeswell et al., 2015). This suggestion of meltwater plume deposition indicates that a temperate ice stream was nearby. The combination of cluster 3, sandy diamict and/or clast-poor stratified diamict indicates relative glacier proximity (Fig. 1). The switching between lithofacies and clusters associated with relatively proximal and distal glacial conditions implies fluctuating glacial conditions with multiple occurrences of glacial advance and retreat (Fig. 13).

The upper section (6.32–19.3 m CCSF-A) is mostly represented by the cluster 4 (heavy-mineral–rich) composition, with some alternating sections of cluster 2 (Ca-rich) (Fig. 13). The visual lithofacies are chiefly clast-poor stratified diamict, which transitions into clast-poor massive diamict. The Ca-rich cluster 2 appears to primarily be controlled by an increase in a Ca-rich terrigenous source, most likely plagioclase (Fig. 12A; File S3 [footnote 1]). The Bering Glacier experienced documented periods of highly turbid meltwater discharge within the Holocene (Jaeger and Nittrouer, 1999; Jaeger and Kramer, 2014); thus high meltwater discharge could entrain and transport heavy minerals to the terminus. Therefore, we interpret heavy-mineral–rich cluster 4 to coincide with periods of increased meltwater discharge. A complementary or alternative interpretation is that it reflects a provenance shift to protoliths enriched in heavy minerals. The dominance of cluster 4 in this upper section suggests that it represents a time of warming, increased meltwater discharge, and retreating glacial conditions that lead up to the diamict-to-mud transition at 6.3 m CCSF-A, ∼15,000 yr B.P., when the Bering Glacier began its retreat onshore (Fig. 12 and 13; Davies et al., 2011). The start of this interval at 19.3 m CCSF-A may correspond with the beginning of regional deglaciation at ca. 19 ka (Clark et al., 2009).

Record of Glacial Dynamics in Southeastern Alaska

While this study places basic constraints on glacial dynamics in southeastern Alaska based on lithofacies observations and cluster model results informed by core geochemical and physical data, full evaluation requires data that cannot be derived from this analysis alone. The location of temperate tidewater glacial termini is controlled by many factors including but not limited to sea surface temperature, water depth, rate of ice calving, sedimentation, and amount of meltwater production. Thus, a robust record of offshore glacial dynamics at IODP site U1419 should integrate a high-resolution chronology (Walczak et al., 2016), with an alkenone-based sea surface temperature record, ice rafted debris mass fluxes (Sandefur et al., 2015), diatom sea ice records (LeVay et al., 2017), foraminifera assemblages (Belanger et al., 2016), and an environmental paleomagnetic record of sediment supply and dynamics (Velle et al., 2016). Provenance analysis based on an end-member mixing model also can add constraints to onshore glacial erosion, which can be correlated to records of offshore terminus advance or retreat (Penkrot et al., 2014; Penkrot et al., 2017b).

We are able to recognize a dynamic temperate glacimarine environment using visual lithofacies descriptions from core images and CT scans complemented with machine learning models of sediment composition derived from downcore logging data. Supervised quadratic discriminant function analysis, a support vector machine, and unsupervised cluster modeling are able to identify transitions in mud-diamict glacimarine lithofacies, whereas model results for heterolithic glacial diamict facies are unable to capture the subtle variations in facies seen in CT scan and/or X-radiograph core images. However, cluster modeling results for the diamict portion of the core provide a valuable complementary objective record that incorporates both textural and geochemical properties. The primary benefit of the cluster analysis is that it reduces noisy multivariate logging data into relatively simple clusters. Interpretation of clusters through comparison with logging data ratios, informed by principle component analysis, and observed lithofacies associations allows for changes in basic glacial condition to be identified. Variations in glacial conditions are more clearly observed in downcore changes in clusters than through comparison of multiple logging data ratios.

Twelve glacimarine lithofacies are described that are commonly observed in a temperate tidewater glacial system. The dominant lithofacies is a clast-poor diamict, but facies range from massive bioturbated mud to stratified diamicts. Principal component analysis shows that color reflectance (b*), magnetic susceptibility, and natural gamma ray logs resolve the major down-core transitions in lithofacies between mud and diamict, whereas scanning XRF data better resolve compositional changes within diamict facies. Physical properties (b*, MS, and NGR) are most useful for delineating variations in composition (b* and MS) and clay content (NGR). Elemental data appear to best capture differences in sand versus clay content and heavy-mineral abundance (Ca, Zr, and Si versus K, Rb, and Al) and biogenic contributions (Ca).

Success in using machine learning to classify lithofacies was more influenced by the data used in the model, and the variance of that data between lithofacies, than the choice of modeling approach. Supervised modeling techniques and hierarchical clustering produced a more accurate representation of glacial and interglacial lithologies than mixture-model clustering. There was little difference in success for either supervised or unsupervised approaches for the diamict-only lithostratigraphy. Physical properties NGR, b*, and MS are most useful for delineating groups or creating clusters that represent the mud and diamict end-member lithologies that reflect major variations in composition. Elemental data (Rb, Al, Zr, Si, and Ca) delineate more subtle variations in composition and texture due to their higher sampling resolution than the physical properties coupled with the inclusion of elemental proxies for both clay and heavy minerals. Elemental data are beneficial for adding a compositional component to the heterolithic diamict lithofacies.

A temporal record of variations in regional glacial conditions results from combining cluster-modeled compositional data with visually observed lithofacies associations and interpretations of logging ratios. This record reveals three general glacimarine settings: ice-proximal glacial conditions within the Gulf of Alaska relative to Site U1419; fluctuating glacial conditions with evidence of glacial advance and retreat; and retreating glacial conditions with increased meltwater influence that preceded a full glacial retreat onshore in the Holocene. Using cluster analysis to highlight compositional variations interpreted to represent meltwater production provides an additional proxy that is a less expensive and more time-efficient alternative to using sedimentary structures in CT scans and/or X-radiographs for identifying this influence. This may have implications for identifying times of past glacial instability for meltwater-dominated systems such as modern-day Greenland (Ó Cofaigh et al., 2016), Svalbard (Dowdeswell et al., 2015), Neogene Antarctica (Hambrey and McKelvey, 2000), and longer temporal records in the Gulf of Alaska drilled by Expedition 341 (Jaeger et al., 2014).

Expedition 341 was carried out by the Integrated Ocean Drilling Program (IODP). We thank the IODP-USIO and the captain and crew of the D/V JOIDES Resolution. We thank David Houpt for assistance with scanning XRF measurements at the Gulf Coast Core Repository, Clark Alexander for use of X-radiography equipment, Louis-Frédéric Daigle, Quentin Beauvais, and Julie Velle for assistance with CT scanning, and Sabine Hunze and two anonymous reviewers for comments and suggestions that significantly improved the manuscript. Funding for this research was provided by the Consortium for Ocean Leadership U.S. Scientist Support Program post-cruise grant to JMJ and LL and National Science Foundation grants OCE-1434402 (JMJ), OCE-1434945 (EAC), University of Florida graduate research fellowship for M.P., and by Natural Sciences and Engineering Research Council to G.S.

1Supplemental Materials. Figures S1–S5: Supplemental figures present box-plots of logging data grouped by observed lithofacies, and mud-diamict and diamict-only clusters. In addition, they show Bayesian information criterion (BIC) plots for the most accurate unsupervised clustering results and crossplots of b* vs. MS. File S1: Supervised and unsupervised classification R code. File S2: Basic statistical parameters for the logging data. File S3: Shipboard collected mineralogy RockJock results. Please visit or access the full-text article on to view the Supplemental Materials.
Science Editor: Shanaka de Silva
Associate Editor: Anthony E. Rathburn
Gold Open Access: This paper is published under the terms of the CC-BY-NC license.