In this study, a novel application of a statistical approach is utilized for analysis of downhole logging data from Miocene-aged siliciclastic shelf sediments on the New Jersey Margin (eastern USA). A multivariate iterative nonhierarchical cluster analysis (INCA) of spectral gamma-ray logs from Integrated Ocean Drilling Program (IODP) Expedition 313 enables lithology within this siliciclastic succession to be inferred and, through comparison with the 1311 m of recovered core, a continuous assessment of depositional sequences is constructed. Significant changes in INCA clusters corroborate most key stratigraphic surfaces interpreted from the core, and this result has particular value for surface recognition in intervals of poor core recovery. This analysis contributes to the evaluation of sequence stratigraphic models of large-scale clinoform complexes that predict depositional environments, sediment composition, and stratal geometries in response to sea-level changes. The novel approach of combining statistical analysis with detailed lithostratigraphic and seismic reflection data sets will be of interest to any scientists working with downhole logs, especially spectral gamma-ray data, and also provides a reference for the strengths and weaknesses of multicomponent analysis applied to continental margin lithofacies. The method presented here is appropriate for evaluating successions elsewhere and also has value for hydrocarbon exploration where sequence stratigraphy is a fundamental tool.

Sequence stratigraphy based on seismic data, downhole logs, and sediment facies forms a fundamental discipline in studies of sea-level change as well as for hydrocarbon exploration. However, statistical comparisons of how the lithologies of large-scale clinoforms correlate with sequence stratigraphic boundaries, as identified from seismic data, are scant in the literature and remain a topic of case-by-case analysis of limited cores (e.g., Catuneanu, 2006; Coe, 2003). The ability to evaluate critically the degree to which lithofacies can be distinguished by applying an objective classification scheme is only possible where detailed sedimentological descriptions can be compared with the physical data sets used in the analysis. Comprehensive seismic reflection data available for the New Jersey shelf (eastern USA) aids evaluation of the application of such statistical analysis to continental margin sequence stratigraphy.

Several generations of seismic surveys have been acquired from the New Jersey shelf, and 15 Early to early-Middle Miocene (ca. 23–13 Ma) seismic sequence boundaries were previously recognized (Monteverde et al., 2008; Monteverde, 2008). Combined with 5800 m of downhole geophysical logs and 1310 m of core recovered from 3 sites (M0027–M0029, Fig. 1A) during Integrated Ocean Drilling Program (IODP) Expedition 313, a considerable data set is available (Fig. 1B). These data have enabled the recognition of 18 distinct sequences within the drilled Miocene-aged successions (Miller et al., 2013b). The downhole logging data include spectral gamma-ray data acquired through pipe from almost the complete succession, representing the only continuous data set across sites. Sediments were recovered from the topsets, foresets, and toesets of a series of Early to Middle Miocene clinoforms from 750 to 180 m depth below seafloor with excellent (>80%) core recovery (Fig. 1B). In this context statistical analyses of the downhole spectral gamma-ray logs can be interpreted and compared with confidence to the lithology of the successions.

Gamma-ray logs are a fundamental data set and are acquired as standard for both scientific and commercial studies. The gamma-ray logs provide an indication of the composition of the sediments; this aids the recognition of major depositional units (e.g., Serra, 1984; Ellis and Singer, 2007; Table 1). The dominant sediment in the Expedition 313 topsets is nonconsolidated silt or clay (Mountain et al., 2010). Nonconsolidated coarse sand dominates in the foresets, and the toesets are characterized by glauconite-rich clay, silt, and fine sand (Mountain et al., 2010). The majority of the successions are dominantly siliciclastic, and X-ray diffraction results indicate an average quartz content of ∼60% (Mountain et al., 2010). Carbonate is rarely significant, with the exception of the uppermost ∼20 m of succession (Mountain et al., 2010). In these predominantly siliciclastic sediments, the relative proportions of sand, silt, and clay can theoretically be inferred from gamma-ray logs. However, this is complicated if the proportion of K-bearing micas and feldspars is high or if accessory minerals such as glauconite are present, as is common throughout the Expedition 313 sediments (Mountain et al., 2010). To recognize and resolve these various influences, the acquisition of spectral gamma ray logs, where the contents of the individual elements that emit gamma rays (U, Th, and K) are also differentiated, is invaluable, and this data set was relied on heavily during Expedition 313 for sequence stratigraphic interpretations (Table 1).

Advanced statistical analysis of the spectral gamma-ray data enables its significance and characteristic response to specific sedimentary heterogeneities within the Expedition 313 successions to be identified. This analysis subsequently benefits interpretations where surfaces are within minor coring gaps (generally <0.3 m) and in the poorly recovered upper 200 m, where coring was selectively undertaken (Fig. 1B). In summary, multivariate statistical analyses are performed with the intention of enabling the following: (1) a quantitative assessment of K, U, and Th concentrations and their comparative proportions as well as their relationship with lithology (see following discussions); (2) the provision of a method to objectively identify the major changes in spectral gamma ray in the three boreholes and to compare changes across sites; (3) an assessment of the relationship of the iterative nonhierarchical cluster analysis (INCA) results with sedimentary facies within the successions, as defined by Expedition 313 Scientists (Mountain et al., 2010); (4) the identification of significant changes in the successions from the statistical results and their relationship with the sequence stratigraphic surfaces identified from the recovered core and sequence boundaries inferred from seismic reflectors; (5) a continuous lithological interpretation from the statistical results; (6) an assessment of the extent to which the spectral gamma-ray log can be relied on to objectively predict sedimentary facies and identify key surfaces (see Discussion); and (7) a contribution to the IODP Expedition 313 goal of evaluating sequence stratigraphic facies models that predict depositional environments, sediment compositions, and strata geometries in response to sea-level change.

INCA is a multivariate statistical approach used to analyze data, suited to large data sets where a defined number of clusters can be hypothesized. This technique has been successfully used elsewhere to characterize geological formations based on log properties (Pelling et al., 1991; Tudge et al., 2009). INCA is a form of cluster analysis based on the k-means algorithm (Steinhaus, 1957; Lloyd, 1982; Forgy, 1965; Jancey, 1966; MacQueen, 1967; see Davis, 2002, for a review), which groups a set of data such that values within a group (cluster) are more similar to each other than to those in the other clusters, i.e., to minimize variability within a cluster and maximize variability between clusters. An initial seed point is selected for each cluster and moved iteratively while each data point is successively reallocated to the most appropriate cluster. Once no further reallocation takes place, the process is complete and the minimum variance for each cluster is achieved; k clusters of greatest possible distinction are produced. The k-means analysis presented in this paper uses the Euclidian distance between clusters and a minimum distance is set between seed points (Ball and Hall, 1965). INCA requires a set of variables to be chosen and either a hypothesis of the number of distinctive clusters within a data set and/or a search for the most likely number of clusters. The analysis is performed once each variable has been standardized to a group mean of zero and unit standard deviation. The general idea is that the more unique each individual output cluster, the better the analysis has performed and the more appropriate the hypothesis. Each output cluster is independent of depth, but can be subsequently analyzed against depth to investigate the variation downhole.

The variables run through INCA are determined by the intended interpretation. In this paper the focus is on the interpretation of lithology and an analysis of lithological variation from the downhole logs, for which spectral gamma-ray logs are ideal. An additional advantage of analyzing the K, U, and Th logs is that the acquisition of data through pipe enables evaluation of the complete drilled successions and results can be compared across sites.

In order to hypothesize the expected number of distinct clusters in the Expedition 313 successions, the number of different sedimentological compositions identified from the boreholes was considered. The lithological classification selected by the Expedition 313 Scientists uses composition and texture only to describe lithology. Where glauconite composes >50% of the overall constituents, the lithology prefix is glauconite and between 25% and 50% the prefix is glauconitic. The lithological classification separates sediments based on the degree of lithification (e.g., sand and sandstone are classified separately), but it is assumed that this does not significantly affect the gamma-ray response; therefore slightly raised counts in more compacted intervals and the role of diagenetic cement are not considered important for this analysis. Of the lithological classifications recognized across sites by the Expedition 313 Scientists (Mountain et al., 2010), 10 compose 99% of the recovered core. Several of these classifications are anticipated to have a distinctive gamma-ray signal, for example, glauconite sand is expected to have a high K/Th ratio and clay is generally characterized by a high Th content (Ellis and Singer, 2007). Certain sediments are expected to be less clear from the gamma-ray signal, such as distinguishing between sandy silt and silt. Thus it was decided to output 10 clusters (labeled C1–C10) intended to approximate these 10 lithological classifications and provide an effective characterization of the boreholes. This hypothesis was also tested by running analyses with cluster numbers between 2 and 20. However, due to the gradational nature of the relationship between K, U, and Th in the New Jersey sediments, an optimal number of clusters is not straightforward to formally select, but the changes in within-cluster and between-cluster distances can be used as a guide (see Supplemental File 11). The clusters that are output from the analyses are independent of depth and each has distinctive properties (Table 2; Fig. 2). If each lithology produces a distinct gamma-ray signature, each output cluster should be unique, and the hypothesis is appropriate. The clusters are plotted against depth in order to evaluate characteristic variation downhole. The method provides an objective means of assessment that is a beneficial tool to aid in the definition of lithology and key surfaces.

The clusters output from the INCA analysis can be interpreted in terms of sedimentological composition based on (1) the proportion of K, U, and Th (Table 2; Fig. 2) and (2) comparison with the recovered core by both numerical analysis (Table 3) and through visual comparison (Fig. 3).

Proportion of K, U, and Th

From the relative proportions of K, U, and Th in each of the 10 output clusters it is possible to predict the likely lithology, and from the range of values within each cluster, the distinguishing characteristics of each cluster are apparent. Cluster C1 has lower K, U, and Th values than that of any other cluster (Table 2; Fig. 2). These low total gamma-ray values suggest clean quartz-rich sandy sediments that contain little glauconite. There is no overlap in the interquartile range of cluster C1 for U and only a slight overlap for Th (with cluster C9) and K (with clusters C2 and C3) (Table 2). Clusters C2 and C10 have the highest Th content (Fig. 2), which is compatible with a clay-rich lithology, and are the only clusters with a Th concentration greater than U (Table 2). Values of Th in the interquartile range of cluster C2 do not overlap with those of any other clusters (Table 2). The clusters with highest K concentration (and high K/Th ratios) are C7, C8, and C9 (Fig. 2), and cluster C4 also has a high K/Th ratio (Table 2); this suggests that these clusters reflect a significant glauconite component. Values of K within the interquartile ranges of these clusters show no overlap with those of any other clusters (Table 2). Clusters C7 and C8 have the highest concentration of U with the 25th percentile value for both above the interquartile range of all other clusters (Table 2). Clusters C6, C7, and C8 have high U/Th ratios, typically indicating an elevated concentration of organic matter. However, unlike for clusters C7 and C8, a low absolute value of K and low K/Th ratio for cluster C6 suggest an absence of glauconite. There is greater overlap in interquartile ranges for clusters C3–C6 and C10. From the generally moderate concentrations of K, U, and Th in cluster C5 a correlation with silt-rich lithologies is anticipated, with the relatively high K content but low K/Th ratio consistent with the presence of a K-mica. Cluster C3 has low to moderate concentrations of K, U, and Th and could correlate with silts or sands (Fig. 2).

Comparison with Recovered Core

To establish the links between the INCA results and lithology, clusters are compared with sedimentological observations (where core recovery allows). The recovered core is depth-registered to the logging data through comparison of total gamma-ray logs with natural gamma measurements on the recovered core (aided by other physical property data sets as required). Therefore, core lithology can be precisely related to the spectral gamma-ray logs. From numerical analysis of the 10 clusters against the frequency with which they correspond to each identified lithology, the hypothesis that each cluster represents a distinct lithology can be evaluated (Table 3). The results indicate that this is true to a certain extent; for example, cluster C1 corresponds with quartz dominated sand in 85% of occurrences, as predicted (Fig. 3A). However, it is clear from Table 3 that not all clusters reflect a single lithology. For example, cluster C3 ranges in grain size from sand to clayey silt for 80% of occurrences (bottom row, Table 3; Fig. 3B). Within sand intervals, as the silt component increases, cluster C3 is observed to replace cluster C1 in dominance. Clusters C3, C5, and C6 correlate with sediments classified as silt, and >20% of occurrences with cluster C5 correlate predominantly with silt-dominated sediments (Table 3; Figs. 3B–3D). The comparative rarity of cluster C5 in relation to most clusters (top row, Table 2) can be diagnostic of a distinctive lithology or a key surface. Clusters C2 and C10 correspond predominantly with clay-rich sediments, as discussed herein, with silty clay most common for cluster C2 and clayey silt for cluster C10 (Table 3; Figs. 3E, 3F). Clusters C4, C7, C8, and C9 are observed to correlate with glauconite-containing sediments (Figs. 3G–3J). Clusters C4 and C7 commonly correspond with sediments that are classified as sand, but comparison with the recovered core indicates a glauco component (Table 3; Figs. 3G, 3H).

The numerical and visual comparisons with the recovered core indicate that although a specific gamma-ray signature is not always confined to a single lithology, a prediction can be made with a reasonable degree of confidence. For example, cluster C8 corresponds with either glauconite sand or glauconite mud in 82% of the analyzed interval. The combination of clusters is important in predicting a lithology; increased variability observed in silt-rich intervals reflects a more variable mineralogical composition (Figs. 3B–3D). This sensitivity to cluster combination is expected, considering that each cluster reflects a range of spectral gamma-ray values with the interquartile ranges for clusters C3–C6, and C10 in particular showing some overlaps (Table 2).

To summarize, sands, silts, clays, and sediment containing glauconite can be predicted with a reasonable degree of confidence from INCA clusters (Fig. 3), either as a discrete lithology (e.g., cluster C1 or C8) or as sediments encompassing a particular grain size range (Table 3, lowermost row).

The INCA analysis can be used to divide each site into a number of INCA divisions, guided by visual inspection of the most significant changes in cluster trends downhole in order to identify major changes between the relative dominance of the 10 clusters (Table 4). For example, the boundary between the two uppermost INCA divisions is located where cluster C1 increases dramatically uphole from less than 13% to greater than 68% dominant. On this basis the boreholes are divided into seven INCA divisions, labeled D7 at the base to D1 at the top of the hole, although M0027 is only divided into five (Fig. 4). From the cluster combinations and their relative percentages, the sedimentological characteristics of the seven INCA divisions (see summary of Table 4) are interpreted to reflect glauconite-rich sequences (divisions D7 and D5 and, in M0029, D3), finer grained sediments (divisions D6, D4, and D2), and coarser grained successions (divisions D3 and D1).

Sediments that accumulated in the same depositional environment in different sequences or systems tracts are expected to display similarities and so have similar cluster patterns. Accordingly, similar clusters are observed across sites in the clinoform topsets, rollovers, and foresets, although the toesets display greater variability (Fig. 4; see Supplemental File 22 for greater detail).

At all three sites, the upper few hundred meters of the boreholes pass through some clinoform topsets that are sand dominated, as evidenced by the occurrence of cluster C1 (Fig. 4). On the seismic profiles, they display a seismic facies consisting of relatively discontinuous internal seismic reflections (Mountain et al., 2010). This characteristic facies correlates with INCA division D1. Below reflector m4, and above m4.1, the seismic facies of the topsets changes to more organized continuous internal reflections (Fig. 4). This change fits with the change to silt at all sites, as evidenced by the appearance of cluster C3 (and subsidiary clusters) within INCA division D2 (Table 4). In M0027 and M0028 the boundary between divisions D1 and D2 is more abrupt than in M0029, where the sediments grade from sand to silt over a few meters.

Boreholes M0027 and M0028 show similar features at the clinoform scale (Fig. 4). Below reflector m4.1, the topsets are predominantly silt, with a dominance of cluster C3 and subsidiary C2, C4, C6, and C10 in very similar proportions (Table 4). Below the clinoform rollover, the upper parts of the clinoform foresets are clean sand deposits (cluster C1) that are enriched in glauconite (cluster C4) at their upper limit and correlate with INCA division D3 (Fig. 4). The lower part of the clinoform foresets is dominated by silts (clusters C3/C6 dominant with subsidiary C3-C6, C10) correlating with INCA division D4 (Fig. 4).

These silts in M0027 and the lower part of the clinoform foresets in M0028 are also characterized by significant occurrences of cluster C5, reflecting a higher K content that may reflect a higher abundance of K-rich mica, consistent with sedimentological observations (Fig. 4; Mountain et al., 2010). From an INCA statistical point of view, the difference between the silts from the topsets (division D2) and the foresets (division D4) is slight, essentially marked by the absence of sands (cluster C1) and a lower percentage of the high-Th cluster C2 in the foresets. The clinoform toesets sampled immediately below the foresets (division D5) are enriched at both sites in glauconitic sands (cluster C9).

The clinoform crossed in site M0029 shows a different geometry compared to the other sites, and the foresets have been sampled in a distal position, seaward of the clinoform rollover (Fig. 4). Accordingly, the vertical cluster succession is slightly different. Below the sandy topsets of division D1, the clinoform foresets are silt dominated (clusters C3, C6, and C10) with the absence of clean sand deposits (cluster C1) in the upper part. As in boreholes M0027 and M0028, the uppermost toesets sampled at this site (INCA division D3; Fig. 4) contain glauconitic sands (cluster C9). There is an absence of cluster C1 within the toesets at this site (INCA divisions D3 to D7) and within INCA division D2, with the exception of a small interval encompassing reflector m4.5, which reflects the lowest stratigraphic appearance of cluster C1 (Fig. 4). This absence of cluster C1 below this point indicates that intervals of clean sands are lacking and any sands here are related to condensed intervals with correspondingly higher gamma ray. A rarity of cluster C2 and the decreased dominance of cluster C10 within the toesets (except in division D6) correlates with the sands and silts being more depleted in Th with respect to U and K.

In addition to the broad-scale trends described in the preceding, the statistical results allow distinctive INCA cluster combinations to be recognized in detail across sites and facilitate the identification of lithofacies. Six distinct cluster patterns have been identified from detailed visual inspection of the cluster trends downhole (Supplemental File 2 [see footnote 2]) and are presented in Figure 5. For example, a dominance of cluster C2 with subsidiary C3 (Fig. 5A) characterizes INCA division D6 in M0028 and M0029 (clinoform toesets, Fig. 4). From comparison with the recovered core, this cluster combination reflects the compositionally similar tan clays (M0028) and clayey silts (M0029) of lithologic unit VI (Mountain et al., 2010). For clays elsewhere cluster C10 is commonly secondary to C2. Cluster pattern C2-C5-C10 with a significant absence of C3 and C6 characterizes an interval of clays within INCA division D2 in M0027 and M0028 (Fig. 5B). Notably high magnetic susceptibility reinforces this as a distinct interval (Mountain et al., 2010; Nilsson et al., 2013).

Cluster combination C1 and C4 is another distinctive pattern, reflecting sands containing glauconite (Fig. 5C). This combination occurs at the INCA division D2-D3 transition, at the upper limit of the clean sands found below the clinoform rollovers crossed in sites M0027 and M0028 (Fig. 4). This pattern is also locally evident within INCA division D1 in sites M0028 and M0029 (see discussion of Upper Unconsolidated Sediments). Variation in glauconite concentration is often coincident with major lithological changes and commonly characterizes key sequence stratigraphic surfaces recognized in the core. In INCA clusters variation in glauconite concentration is apparent from changes in the relative proportion of clusters C4 and C9, with cluster C9 correlating with higher concentrations. Such a pattern is observed, for example, in INCA division D3 in M0029, which is characterized by a high concentration of glauconite at the base, clearly contrasting from the underlying sediments (Fig. 5D).

The distinctive cluster combination of C7 and C8 (high U) is only observed in the deeper part of sites and is only dominant within INCA division D5 (lithologic unit VII) of M0027 (Fig. 5E). Both of these clusters are relatively rare (Table 4), particularly in M0028 and M0029. There are only three appearances of cluster C8 in M0028 and M0029, each time in close conjunction with cluster C7 within strongly bioturbated sediments and in the vicinity of an INCA (and sedimentological) boundary. Diagenetic processes in the vicinity of cemented sediments can also contribute to high U relative to K and Th. The most significant interval of cluster C7 in either M0028 or M0029 occurs in M0029, in conjunction with cluster C9 between 666 and 708 m in depth (INCA divisions D4, D5; Fig. 4). This is interpreted to be a deep offshore environment (Mountain et al., 2010) with the C7 and C9 combination reflecting the highest glauconite concentrations within this interval.

Cluster C5 is rare in comparison with the other clusters observed in silt-rich or sandy lithologies and the precise pattern of clusters with which it occurs with can be diagnostic of particular mineralogies. In certain intervals in the lower parts of the boreholes, cluster C5 occurs in conjunction with clusters C4 or C7–C9 in significant concentrations, suggesting that glauconite is present. Where glauconite-indicating clusters are absent, cluster combinations that include cluster C5 appear to be diagnostic of K mica-rich intervals, as confirmed by comparison with sedimentological descriptions of the recovered core (Mountain et al., 2010). Intervals characterized by subtle mineralogical changes can be clearer from these cluster patterns than from core observations; for example, cluster pattern C5-C6 in M0029 forms an INCA subdivision in D2 where no equivalent sedimentological subdivision is identified (Fig. 5F). As described in the preceding, cluster C5 is common in the silts that compose the lower part of the clinoform foresets in M0027 and M0028 (division D4), where it occurs in conjunction with C3-C6-C10 and/or C4-C2.

The sequence stratigraphic interpretation of key surfaces within the successions recognized from sedimentological observations of the recovered core or from the downhole logging data has been achieved through integration with the seismic profiles within the limits of their lower (∼5 m vertical) resolution (Mountain et al., 2010; Miller et al., 2013b). The gamma-ray logs aid with recognizing significant lithological changes and can help identify sequence boundaries and transgressive or flooding surfaces. The character of an individual sequence stratigraphic surface, and the associated gamma-ray response, will vary depending on its location along the clinoform profile due to the facies present. The presence or absence of glauconite may vary along a sequence stratigraphic surface. Sharp lithological changes are generally reflected by a distinct change in INCA clusters, whereas gradational transitions are expressed by a more subtle variation from one cluster combination to another. Examples of changes in clusters across the geometry of the clinoform structure are given in Figure 6, focusing on variation in the vicinity of the m5.8 and m5.4 sequence boundaries, as these sequences are well understood in terms of their sequence stratigraphic significance (Miller et al., 2013b; Proust, his own data).

Sequence Boundaries

Sequence boundaries are commonly characterized by a sharp change in the gamma ray log, typically to lower values uphole (e.g., Coe, 2003), but the change across the surface will vary if the systems tracts either side of the boundary are incomplete due to erosion, as is the case for several Expedition 313 sequences (Miller et al., 2013b).

Unconformities are generally clear on the topsets (landward of the clinoform rollover) where they are commonly a merged sequence boundary and transgressive surface (Mountain et al., 2010; Proust, his own data). An example of a sequence boundary in a topset position is m5.4 in M0027, which is interpreted from sedimentological observations to be an erosion surface separating the underlying silts from a very coarse sand lag (Fig. 6A; Miller et al., 2013b). A brief appearance of cluster C5 marks the surface.

The m5.2 sequence in M0029, the m5.4 sequence in M0027 (Fig. 6B), and the m5.8 sequence in M0028 (Fig. 6C) are the only thick foreset deposits recovered during Expedition 313 (Fig. 4). Unconformities are usually clear in these foreset deposits and display coarsening-upward lowstand deposits above the sequence boundary (Miller et al., 2013b). For example, sequence boundary m5.4 in M0028 (Fig. 6B) is marked by a distinct lithological change accompanied by a sharp decrease in gamma rays, as reflected by the disappearance of cluster C10 uphole (Fig. 6B). Within lowstand toe-of-slope apron glauconite sands, sequence boundary m5.8 in M0027 is positioned from sedimentological observations within an interval where INCA clusters signal a clear but gradual compositional change from U-rich clusters C7 and C8 to K-rich clusters C4 and C9 uphole (Fig. 6C). Stacked gravity flow deposits within the toesets can obscure sequence stratigraphic surfaces (Mountain et al., 2010). A localized increase or an upward change in glauconite may aid identification of a sequence boundary. For example, in the toesets of M0028 and M0029 cluster C9 appears immediately above sequence boundary m5.8 (Figs. 6D, 6E). Cluster C7 also underlies sequence boundary m5.8 in both sites (disappearing uphole at the top of INCA division D7; Figs. 6D, 6E), aiding cross-site correlation.

Flooding and Maximum Flooding Surfaces

These surfaces are located in the finest grained sediments (highest gamma ray values) but may also be characterized by high localized glauconite concentrations (Coe, 2003; Catuneanu, 2006). This peak in glauconite is apparent from INCA clusters by an isolated occurrence of cluster C4 or C7–C9 (refer to examples in Supplemental File 2 [see footnote 2]). Three maximum flooding surfaces are shown in Figures 6A–6C and are characterized in each case by a reappearance of cluster C4 at the surface, with an underlying gap in cluster C4 ranging from <2 m (e.g., in the M0027 topsets of the m5.4 sequence; Fig. 6A), to a more considerable absence (comprising INCA subdivision B in the M0027 foresets of the m5.8 sequence; Fig. 6C). Miller et al. (2013b) place this latter flooding surface at ∼458 m, coinciding with the reappearance of cluster C2 uphole (Fig. 6C). Within sequence m5.4 in M0028, the intrasequence reflector m5.35 is inferred from sedimentological observations to correspond with a maximum flooding surface (Miller et al., 2013b), and is located where INCA cluster C5 disappears uphole (Fig. 6B). This is another example of cluster C5 defining a distinct compositional change (within the alternating silt, sand, and silty sands) that is clearer from INCA clusters than from the recovered core.

Transgressive Surfaces

Transgressive surfaces can be characterized by overlying lag deposits (e.g., Coe, 2003), which are usually reflected in INCA clusters by an interval of low gamma ray values (cluster C1 or C3). In M0027, an interval of cluster C1 at 272 m corresponds with a coarse sand lag (Fig. 6A). The coarse sand lag overlying sequence boundary m5.4 is not entirely clear from INCA clusters, although it corresponds with a dominance of cluster C3 and a gap in cluster C10 (reflecting lower gamma radiation) (Fig. 6A). Transgressive surfaces within the thick foreset sequences shown in Figures 6B and 6C are not clearly defined by INCA clusters, although they do correspond with a change in composition (clusters C2 and C5 become significant uphole).

Fining and Coarsening Sequences

Gamma-ray profiles reflect coarsening-upward sediments in the lowstand and highstand systems tracts and fining upward in the transgressive systems tract. Such graded trends are visible from the distribution of INCA clusters, although continuous gradual grain size changes (as opposed to distinct jumps, e.g., from sand to silt to clay) are clearer from the gamma-ray log (Supplemental File 2 [see footnote 2]). Coarsening-upward trends can be recognized either by an increase in the dominance of a statistical cluster that reflects lower gamma radiation, or by a successive change in clusters from those reflecting higher to lower gamma radiation. Fining-upward successions show these changes in reverse. Above sequence boundary m5.4 in M0027, the sediments of the highstand systems tract coarsen upward to the sequence boundary m5.33 (∼289–271 m; Miller et al., 2013b; Fig. 6A). This is reflected in INCA clusters by an increase in the dominance of cluster C6 uphole (low gamma ray values) overlain by an interval of cluster C1 (lowest gamma ray values). The coarsening-upward successions recognized from core observations above sequence boundary m5.8 by Miller et al. (2013b) in M0027 (Fig. 6C) are to some extent identified in INCA clusters by a variation in the abundance of cluster C5 in finer grained sediments (higher gamma rays) and C6 in slightly coarser sediments (lower gamma rays).

To summarize, the quantitative assessment of all major changes in gamma rays provided by the statistical results is a useful aid for recognizing the significance of variations in spectral gamma rays in terms of key sequence stratigraphic surfaces. However, the statistical analysis is most useful for surface identification when analyzed in conjunction with core observations and other geophysical data (see Discussion).

The correspondence of INCA spectral gamma-ray clusters with the major lithologies enables a reasonable prediction of sand, silt, clay, and glauconite content. INCA can be used to interpret lithological characteristics within intervals of incomplete core recovery. In these intervals, the INCA clusters can also be used to estimate the most likely depth of sequence stratigraphic surfaces inferred from the seismic data; several examples are discussed herein (highlighted with asterisks in Fig. 4 and shown in more detail in Supplemental File 2 [see footnote 2]).

Sand-Dominated Sediments Within the Miocene Sequences

Although core recovery was excellent within the Miocene successions, sandier intervals, such as below clinoform rollovers, are generally characterized by lower core recovery, with the sequence stratigraphic surfaces inferred from the seismic profiles often coinciding with an uphole increase in gamma-ray values (Miller et al., 2013b). The statistical analysis is valuable for precisely locating the change from clean sands (cluster C1) to siltier sediments (clusters C3, C4, and C6) and thus confirms the depth in the borehole that corresponds to several seismic reflectors (e.g., m5.7, m5.47, and m5.45 in M0027, Fig. 4; m5.2 and m5.3 in M0028 in Supplemental File 2 [see footnote 2]; Figs. 4 and 5C).

In places within the clinoform toes, lower core recovery complicates definitive placement of a surface, and several alternative depths are proposed (Miller et al., 2013b). Here the INCA analysis can be a useful tool. For example, in M0029 INCA clusters support the placement at 746 m for sequence boundary m5.8 (in a coring gap) due to the consistency with cluster patterns in M0028 (a 2 m interval of cluster C9 immediately above the surface is succeeded uphole by an abrupt change to a dominance of cluster C2; Figs. 6D, 6E). In M0027 an isolated occurrence of cluster C9 within a small interval of poorly recovered glauconite sand likely correlates with the m6 surface (Fig. 5E). The statistical analysis confirms the absence of a major lithological change within a small gap in core recovery above sequence boundary m5.7 in M0028; clusters C3, C4, and C6 (Th-depleted sands) are present throughout. Similarly, within a larger gap in core recovery below sequence boundary m5.3 in M0029 (∼5 m, Fig. 1B), INCA clusters indicate no change in composition from the C3, C4, and C6 dominated silts locally recovered in this interval.

Upper Unconsolidated Sediments

Within INCA division D1 (upper ∼200 m), where unconsolidated material posed considerable drilling challenges and not all intervals were cored, through-pipe spectral gamma-ray measurements are the only continuous data set available (Fig. 1B). The sedimentology is inferred from the INCA analysis and best depth estimates for the Pleistocene and Miocene seismic reflectors within this interval are presented in a cross-site interpretation (Fig. 7; Table 5). Overall, the interval is characterized by clean sands (INCA cluster C1) interspersed with combinations of clusters C2, C3, C5, C6, and C10 (finer grained silt and/or clay layers), some of which contain cluster C4 (high K/Th ratio) and thus may contain glauconite. The uppermost fine-grained layer (F1, Fig. 7) is characterized by an absence of cluster C2 in comparison to the lower silts and clays of division D1, although C2 is less dominant throughout division D1 than within the Miocene clinoform sequences. Cluster C5 is rare within the fine-grained intervals of INCA division D1 in comparison to the sequences below (and absent from D1 in M0027), and in places this aids cross-site comparison between M0028 and M0029 (Fig. 7; Table 5). In M0028 and M0029 the distinctive cluster pattern C1-C4 is observed, with other clusters rare, in the Pleistocene sediments (layer G1, Fig. 7).

Miocene reflectors m4–m1 coincide in M0029 with three intervals of finer grained sediment (layers F4–F6), which can be correlated across sites from the recognition of similar cluster combinations, although the confidence of the correlation varies (Table 5). From the limited sedimentological observations possible in M0027 and M0029, the best estimate of the depth in the core that corresponds to these reflectors is within the depths suggested from the INCA clusters for layers F4–F6 (Fig. 7). Above reflector m1 is another finer grained layer F3 where the INCA cluster combination allows a tentative correlation across sites (Fig. 7). Rare isolated occurrences of cluster C9 within the finer grained intervals of division D1 suggest a significant glauconite component in places (Fig. 7).

The location suggested by INCA clusters for the shallow MIC3a and MIC3c Pleistocene reflectors is bracketing layer F1, although clusters are more disperse in M0029 (Fig. 7). The increased variability in clusters complicates INCA placement of reflector MIC4 in M0029 and in M0027 thick intervals characterized by cluster C1 alone make depth estimates corresponding to the seismic reflectors more challenging (Fig. 7). The interpretation provided here can be compared with that given for the Pleistocene sequences by Miller et al. (2013a, 2013b), where channel deposits above reflector MIC4 identified from seismic interpretation are inferred to be present in M0028 and M0029 but not M0027. These likely correspond to the cluster C1-C4 characterized interval (layer G1) in the statistical analysis (Fig. 7).

Extent to which Statistical Analysis Could Be Used for Lithological Interpretation

One of the key questions posed here is the extent to which spectral gamma-ray data can be relied on for lithological interpretation in the absence of any other data. The preceding described the relationship between INCA clusters and the main sediments observed in the New Jersey shelf successions and demonstrated that certain clusters or cluster combinations enable both a reasonable prediction of lithology and the key sedimentological changes within successions to be recognized.

To be effective at identifying lithological variation, the statistical clusters should be able to identify subtle grain size variations as well as the more significant lithological changes. The cluster variations will be most significant where there is greater mineralogical and textural change. Two further examples of subtle grain size variation, selected due to the similarity of the lithological transition but their markedly different response in INCA clusters are: (1) a gradational change across the INCA division D4-D3 boundary in M0027 from river-influenced offshore silts that become increasingly sandy uphole over a few meters (410 m; M0027 in Supplemental File 2 [see footnote 2]); and (2) an abrupt surface separating fine sands from silts within INCA division D2 in M0028 (310 m; Supplemental File 2 [see footnote 2]). In the first example, from the highstand systems tract of the m5.8 sequence, the change in statistical clusters is very clear, with a spread of clusters C2–C6 and C10 (silt-rich sediments) successively decreasing in dominance to be succeeded by cluster C1 (sands). In the second example, the fining uphole from fine sand to silt is not clear from INCA clusters, being reflected by only a very subtle increase in the proportion of cluster C10.

The Expedition 313 data sets benefit from comprehensive seismic data that enable detailed comparison of seismic reflectors with stratigraphic surfaces recognized in the recovered core and downhole logs (Mountain et al., 2010; G. Mountain, his data). Despite the difference in vertical resolution between seismic data and the Expedition 313 data sets, analyzing the statistical results from the spectral gamma-ray analysis in conjunction with the seismic data enables some sedimentological facies to be inferred. This analysis is invaluable where core recovery is low or absent, as shown by the interpretation of the upper 200 m. From the results presented, the effectiveness of applying an objective classification scheme to distinguish different lithofacies is apparent.

Extent to which Statistical Analysis Could Be Used to Identify Sequence Stratigraphic Surfaces

Where there are changes in clay content, glauconite concentration, or organic matter across a sequence boundary, maximum transgressive surface, or other flooding surface, the statistical analysis generally displays a clear change in clusters and is useful to (1) confirm sedimentological interpretations; (2) help identify the most significant changes in gamma ray data where a sequence stratigraphic surface is predicted from the seismic profiles but is not clear in the core or where alternative interpretations are possible; and (3) identify lithological changes that potentially correspond with sequence stratigraphic surfaces where there is no core recovery. Some important trends emerge from the analysis. For example, sequence boundaries in the poorly recovered sands below the clinoform rollovers in M0027 and M0028 are commonly characterized by an uphole gamma-ray increase and isolated occurrences of a distinct cluster often correlate with sequence stratigraphic surfaces, reflecting a distinct compositional change; maximum flooding surfaces are often located at the uphole reappearance of cluster C4 (glauconite increase) above a brief absence, and transgressive surfaces associated with a lag deposit are generally marked by an interval of lower gamma-ray clusters (see preceding discussion of flooding and transgressive surfaces).

However, only sequence stratigraphic surfaces that are characterized by a compositional change are invariably identified by INCA clusters. Other features, such as an erosive surface within a fairly homogeneous lithology, or cemented or bioturbated intervals, are not necessarily identified. In addition, even if such surfaces are identified from INCA clusters (e.g., clusters C7 or C8 may identify cemented intervals and/or bioturbation if associated with raised levels of K or U), the sequence stratigraphic implication of such distinctive clusters or cluster combinations can be unclear in the absence of additional observations. This is shown where a rare occurrence of cluster C7 in M0028 correlates with intensely bioturbated glauconite mud in a cemented interval (top of INCA division D5; Fig. 6B) but is not inferred from sedimentological observations to reflect a sequence stratigraphic surface (Miller et al., 2013b). Combining the statistical analysis of the spectral gamma ray logs with analysis of other geophysical data, notably sonic logs or density measurements, would help identify most sequence stratigraphic surfaces. Although intervals of sonic logs were collected and density values were derived from core measurements, including these data in our analysis was considered beyond the scope of this paper.

Variation that is Clearer from the Statistical Results than from Other Observations

Certain distinctive intervals and subtle compositional changes are clearer from statistical cluster variations than from a standard assessment of log trends (or from analysis of the recovered core). Examples of compositional variations that are very clear from the INCA clusters include the distinctive cluster combinations discussed herein (Fig. 5; brackets in Figs. 6B, 6C). In particular, the ∼100 m interval at the base of INCA division D2 in M0029 consists of just one sedimentary subdivision and is classed as fairly homogeneous with an absence of sequence stratigraphic surfaces, but is shown by INCA clusters to have significant variations in composition (M0029 in Supplemental File 2 [see footnote 2]). The statistical results can also help within regions of relatively homogeneous sediment where it is more difficult to tie the seismic reflectors to features recognized in the recovered core. For example, the Oligocene glauconite-containing sands of M0027 (INCA division D5; M0027 in Supplemental File 2 [see footnote 2]) are characterized by a dominance of cluster C5 with two instances where an appearance of INCA cluster C8 uphole reflects an increase in U content and coincides with the best estimate from sedimentological observations for a seismic reflector (o1 at 596.3 m and an unnamed surface at 538.68 m, shown in M0027 within Supplemental File 2 [see footnote 2]).

Limitations and Potential Variations in Future Analyses

All core data and all downhole logs, except the spectral gamma ray log measured through the drill pipe, have sections of the succession where no measurements are available (Fig. 1B). Due to relying on through-pipe gamma-ray logs to enable the most complete statistical analysis, the effect of the drill pipe was considered. Analysis of a 342 m interval of open-hole gamma-ray logs from M0027 (depths 603–410 m and 337–188 m) identified a statistically significant positive linear correlation between the open hole and through-pipe data sets with a ∼25% attenuation caused by the steel pipe that appears to be unrelated to lithology. The lack of a caliper log requires the assumption to be made that lower gamma-ray counts are the results of sandier lithologies rather than washout zones. However, the excellent core recovery within the Miocene clinoforms, which includes the recovery of many sandy lithologies, suggests that this is a valid assumption.

Lithologies could be analyzed in more detail by selecting a greater number of clusters to be output from the INCA analysis. For example, within intervals of INCA division D1 (e.g., 29–72 m in M0027; Fig. 7), the INCA analysis returns cluster C1, uniquely making further interpretation difficult. This is in contrast to observation of the limited core recovered where the fluvial to estuarine sands are subdivided into three successions: (1) clean well-sorted sands, (2) poorly sorted sands, and (3) poorly sorted sands with some burrows and gravels that may reflect transgressive lags (Mountain et al., 2010). A more detailed INCA analysis may better distinguish some of these compositions, although could be toward the limit of the gamma-ray log resolution.

There is scope to use other combinations of geophysical properties and/or derived quantities. For example, a statistical analysis of spectral gamma-ray ratios alongside density and magnetic susceptibility is an excellent indicator of glauconite concentration. An analysis including conductivity at shallow and deep depths of investigation (or their ratio) and sonic velocity better quantifies pore salinity variations and variations in induration. An analysis incorporating the full set of geophysical logs would produce a comparable and very effective assessment across sites of the most significant petrophysical changes within the successions and would be very beneficial for the identification of sequence stratigraphic surfaces. The statistical analysis is limited by the variable with the lowest resolution. Using petrophysical data acquired from measurements on the recovered core, where available, would be beneficial for studies at a higher resolution and includes properties such as density that are invaluable for comparisons with the seismic sequences.

The advantage of the k-means analysis applied to the problem presented in this paper is both its simplicity and suitability to the type of data set being analyzed. However, alternative or complementary statistical techniques that could be considered include principal component analysis, self-organizing maps, and discriminant analysis. These techniques were considered less suitable here because (1) principal component analysis is better suited to data sets containing a larger number of variables where some redundancy is suspected in those variables; (2) discriminant analysis is more appropriate where the number of different groups is known with certainty; and (3) a k-means analysis was considered to produce results that are easier to interpret than self-organizing maps (see Davis, 2002, for a review).Variations of the k-means analysis and other statistical techniques could be explored further, but is beyond scope of this paper, which focuses on the application of the statistical results to a geological problem rather than on an in-depth discussion of statistical techniques.

Four main lithological groups are statistically recognized: sand, silts, clays, and glauconite-containing lithologies. A distinct lithology may be characterized by dominance of a single INCA cluster (notably cluster C1 for clean sands) or by a specific combination of INCA clusters (e.g., cluster combination C2/C3 reflects tan clays).

In regions with low or no core recovery, consistent judgments on facies variation can be made from the statistical analysis of the continuous spectral gamma-ray logs.

Sediments and sequence stratigraphic surfaces in the same location of a clinoform sequence generally have a similar expression in INCA clusters; for example, cluster C1 is dominant below a clinoform rollover.

INCA clusters corroborate previously defined boundaries, with the occasional differences between the most significant INCA changes and the most significant changes observed in the core generally located in intervals of poorer core recovery.

An isolated occurrence of a particular cluster (e.g., cluster C9) or the occurrence of a comparatively rare cluster (e.g., cluster C8) or cluster combination (e.g., C2-C3) can be beneficial for highlighting significant variations that commonly correspond to sequence stratigraphic surfaces within the successions.

The statistical clusters can be valuable for identifying subtle lithological or textural changes in the sedimentary successions (e.g., variation in cluster C5 within silt-rich successions), which are sometimes clearer from the INCA results than from visual observation of the downhole logs or core.

The INCA results contribute to our understanding of the key stratal surfaces on the New Jersey continental margin by tying the continuous logging data more tightly into core observations.

The multivariate statistical approach to the analysis of downhole logs represents a novel application of an advanced analytical technique to quantifying significant changes and allows comparison of characteristics across sites. The detailed calibration of the statistical results with the Expedition 313 sediments in intervals of excellent core recovery provides an assessment of the ability to infer facies and key stratigraphic surfaces. The statistical results enable interpretations to be made with more confidence in regions of poorer core recovery, and the same principles can be applied to the study of siliciclastic margins elsewhere.

This research used data provided by the Integrated Ocean Drilling Program (IODP) and the International Continental Scientific Drilling Program (ICDP); we benefitted from numerous discussions with Peter K. Harvey and his INCA (iterative nonhierarchical cluster analysis) program to carry out the statistical analysis described. The three-dimensional plots presented in Figure 2 were constructed following discussion with Dursan Akaslan. We thank R. Wilkens, N. Christie-Blick, T. Lado Insua, and an anonymous reviewer for helpful reviews.

1Supplemental File 1. Statistical analysis in greater detail and justification of k-value. In order to investigate the effect of the selected number of clusters, k, on the statistical results and to evaluate the impact on the ability to distinguish the 10 major lithological classifications of the New Jersey successions, a variety of tables and plots are presented in Supplemental File 1. If you are viewing the PDF of this paper or reading it offline, please visit or the full-text article on to view Supplemental File 1.
2Supplemental File 2. Statistical analysis at the scale of the borehole. The iterative nonhierarchical cluster analysis (INCA) clusters (C1–C10) for M0027, M0028, and M0029 are plotted against depth in meters below seafloor, and shown both as a column and as individual clusters. Each cluster is colored to loosely correspond to lithology (see Table 2). If you are viewing the PDF of this paper or reading it offline, please visit or the full-text article on to view Supplemental File 2.