Gold Open Access: This paper is published under the terms of the CC-BY license.

ABSTRACT

DNA in marine sediment contains both fossil sequences and sequences from organisms that live in the sediment. The demarcation between these two pools and their respective rates of turnover are generally unknown. We address these issues by comparing the total extractable DNA pool to the fraction of sequenced chloroplast DNA (cpDNA) in sediment from two sites in the Bering Sea. We assume that cpDNA is a tracer of non-reproducing fossil DNA. Given >150,000 sequence reads per sample, cpDNA is easily detectable in the shallowest samples but declines with depth, suggesting that sequencing-based richness assessments of communities in deep subseafloor sediment are relatively unaffected by fossil DNA. The initial decrease in cpDNA reads suggests that most cpDNA decays within 100–200 k.y. of deposition. However, cpDNA from a few phylotypes, including some that match fossil diatoms, is present throughout the cored sediment, which ranges in age to 1.4 Ma. The relative fraction of sequences made up of cpDNA decreases non-linearly with increasing sediment age, suggesting that detectable cpDNA becomes more recalcitrant with age. This can be explained by biological activity decreasing with sediment age and/or by preferential long-term survival of only the most thoroughly protected DNA. The association of cpDNA reads with published records of siliceous microfossils, including diatom spores, at the same sites suggests that microfossils may help to preserve DNA. This DNA may be useful for studies of paleoenvironmental conditions and biological evolution on time scales that approach or exceed 1 m.y.

INTRODUCTION

Bulk marine sedimentary DNA may contain both fossil DNA (e.g., Coolen and Overmann, 2007) and DNA from the living subseafloor community (e.g., Teske and Sørensen, 2008). Identifying the relative distributions of these two sedimentary DNA pools versus sample age has three advantages. First, it may aid identification of microbial taxa that survive deep into sediment. Second, identification of fossil DNA distributions may aid paleoceanographic and paleobiological studies. Finally, it may improve the definition of the deep sedimentary biosphere.

Marine sedimentary microfossils and organic biomarkers have been used for many years. Because DNA is regarded as much more labile, subseafloor DNA is customarily considered of best use for understanding the “deep biosphere” (e.g., Teske and Sørensen, 2008). However, several recent studies have reported fossil DNA from marine sediment. Härnström et al. (2011) tracked genetic variation in diatoms cultured from anoxic sediment as much as 100 yr old. In euxinic basins where preservation is thought to be enhanced, DNA associated with photosynthetic organisms has been found in Holocene sediment and used for paleoecological study (Coolen, 2011; Boere et al., 2011a; Coolen et al., 2013). Boere et al. (2011a) suggested that bacterial preservation is taxon specific. Coolen et al. (2013) noted variation in eukaryotic DNA sequences with sediment depth, attributing this to Holocene variation in Black Sea water-column paleoconditions, and hinted that DNA preservation may be possible beneath oxic water columns. Episodic deposition of green sulfur bacteria has also been noted in organic-rich Mediterranean sapropels deposited over 200 k.y. (Coolen and Overmann, 2007), and the influence of overlying water masses can be seen in microbial communities within at least the first meter below seafloor (Hamdan et al., 2013). Interestingly, Orsi et al. (2013) noted the presence of photosynthetic, eukaryotic RNA in a variety of sediment samples aged between 0.03 and 2.7 Ma.

Without a continuous record that spans long time scales, it is difficult to understand the maximum extent of fossil DNA preservation. Inagaki et al. (2005) inferred the existence of a Cretaceous DNA “paleome” in ca. 100 Ma black shale, based on sequences associated with marine-type sulfate-reducing bacteria. However, these organisms may have inhabited the sediment since its deposition. In contrast, Allentoft et al. (2012) used ¹⁴C dating to calculate a 521 yr half-life for mitochondrial DNA in terrestrial bone. These large differences between 100 yr and 100 m.y. time scales illustrate that (1) the relative contributions of fossil versus indigenous materials to DNA pools are poorly constrained, and (2) in marine sediment, we cannot assume all DNA to be non-fossil. In general, genetic studies of material that pre-dates the Quaternary are at odds with the current understanding of DNA preservation, especially in wet environments (Lindahl, 1993). In this study, we examine DNA concentration and composition from continuous sedimentary records as old as 1.4 Ma at two sites in the Bering Sea. The geochemical regime is likely broadly representative of high-productivity continental margins (Wehrmann et al., 2011). In order to separate fossil DNA distributions from living communities in subseafloor sediment, we focus on chloroplast DNA (cpDNA) and assume that it represents ancient material from the sunlit surface world. This focus allows us to address the preservation and utility of DNA as a sedimentary biomarker on time scales up to and greater than 1 m.y.

METHODS

Sites and Sampling

Our sediment samples were retrieved using the advanced piston core system of the JOIDES Resolution during Integrated Ocean Drilling Program (IODP) Expedition 323 (Takahashi et al., 2011). The first site, Site U1339, was drilled at 54°40.2103′N, 169°58.9106′W in July 2009. The second site, Site U1343, was drilled at 57°33.4156′N, 175°48.9951′W in August 2009. Water depth is 1870 m at U1339 and 1953 m at U1343. Sediment from U1339 is a mix of biogenic (primarily diatom), volcaniclastic, and siliciclastic components, while sediment from U1343 is primarily silt (Takahashi et al., 2011). The samples in this study ranged in age from 0.001 Ma to 0.563 Ma at U1339, and from 0.0006 to 1.398 Ma at U1343 (Table DR1 in the GSA Data Repository¹) (Takahashi et al., 2011). Sedimentation rates varied between 22 and 50 cm/k.y. at U1339 and averaged 26 cm/k.y. at U1343 (Takahashi et al., 2011). Both sites exhibit slight to moderate bioturbation and have detailed micropaleontological records (Takahashi et al., 2011).
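As a rough cross-check on these ages, depth can be converted to age under a constant sedimentation rate. The sketch below is only an approximation under that assumption; the ages used in this study come from the depth-age models of Takahashi et al. (2011), not from this calculation, and the function name `age_ky` is hypothetical. The inputs are the 26 cm/k.y. average rate at U1343 and the 364 m CCSF maximum sampled depth at that site.

```python
def age_ky(depth_m: float, rate_cm_per_ky: float) -> float:
    """Approximate sediment age (k.y.) at a given depth (m),
    assuming a constant sedimentation rate (cm/k.y.)."""
    if rate_cm_per_ky <= 0:
        raise ValueError("sedimentation rate must be positive")
    return depth_m * 100.0 / rate_cm_per_ky  # convert m to cm, then divide by rate


# Site U1343: average rate 26 cm/k.y., deepest sample near 364 m CCSF
deepest_u1343_ky = age_ky(364.0, 26.0)
print(f"{deepest_u1343_ky:.0f} k.y.")  # 1400 k.y.
```

The result, about 1400 k.y., is close to the 1.398 Ma maximum age reported for U1343. The same shortcut is less useful at U1339, where the rate varies between 22 and 50 cm/k.y.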

Immediately after recovery (as 9.5 m cores in cylindrical plastic liners), cores were cut with sterilized blades into 1.5 m sections on the drillship catwalk (between the rig floor and ship laboratories). Perfluorocarbon tracers were used in the drilling process, and cores were tested onboard for coring and sampling contamination (Smith et al., 2000; Takahashi et al., 2011). We subsampled for DNA immediately, using clean technique and taking care to avoid sample cross-contamination (Coolen et al., 2009), by inserting a sterilized cut-off syringe into a freshly cut section face, avoiding disturbed sections or core ends. We wrapped and bagged the syringes and froze them at −80 °C. We report depths in “core composite depth below seafloor” (CCSF-A), to allow comparison between samples from different holes (horizontally offset ∼15 m) at each site. CCSF offsets and depth-age models are from Takahashi et al. (2011). Samples were shipped on dry ice and stored at −80 °C.

DNA Extraction and Amplification

We extracted DNA from subsamples using the PowerLyzer DNA isolation kit (MO BIO Laboratories). We extracted 0.25 g of sediment per tube, and also blanks without sediment, in an Envirco laminar flow work station. For lower-yield samples, we combined multiple extracts. We quantified DNA yield with a Qubit 2.0 fluorometer and determined a detection limit of 1.01 ng DNA/g sediment (3σ of our buffer).
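The 3σ criterion can be illustrated with a short calculation. This is a minimal sketch, assuming the limit is taken as three times the sample standard deviation of replicate buffer-only readings that have already been converted to ng DNA per g of sediment equivalent. The readings and the function name `detection_limit` are hypothetical and are not the measured values from this study.

```python
import statistics


def detection_limit(blank_readings, k: float = 3.0) -> float:
    """k-sigma detection limit from replicate blank readings.
    Uses the sample standard deviation (n - 1 denominator)."""
    if len(blank_readings) < 2:
        raise ValueError("need at least two blank readings")
    return k * statistics.stdev(blank_readings)


# Hypothetical buffer-only readings (ng DNA per g sediment equivalent)
blanks = [0.10, 0.12, 0.08, 0.11, 0.09]
limit = detection_limit(blanks)
print(f"detection limit = {limit:.3f} ng DNA/g sediment")
```

Any sample reading below this value would be treated as indistinguishable from the buffer.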

We amplified DNA using primers (518f, 926R) targeting the v4v5 hypervariable region of bacterial 16S ribosomal DNA (Huse et al., 2010). We amplified replicate aliquots as per manufacturer’s directions using PfuUltra II Fusion HS polymerase (Agilent Technologies). We sequenced samples and blanks using Illumina (MiSeq v3) technology at the University of Rhode Island (USA) Next Generation Sequencing facility. Detailed methods are provided in the Data Repository. Sequences have been deposited at the National Center for Biotechnology Information (Bethesda, Maryland, USA) under BioProject ID PRJNA321207.

Sequence Analysis

We trimmed and merged reads with CLC Workbench 6.0. We used the MOTHUR MiSeq pipeline to align sequences, assign taxonomy, and cluster (Schloss et al., 2009; Kozich et al., 2013). Our protocol deviates from the standard operating procedure in that we removed all taxa except chloroplasts before clustering. No remaining sequences overlapped between sediment samples and the kit blank.
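The chloroplast-only filtering step can be sketched outside the pipeline as follows. This is an illustration, not the MOTHUR commands used in this study: it assumes each sequence carries a semicolon-delimited lineage string with optional confidence values in parentheses, and the function names, sequence identifiers, and lineages are all hypothetical.

```python
from typing import Dict


def is_chloroplast(taxonomy: str) -> bool:
    """True if any rank in a semicolon-delimited lineage is 'Chloroplast'."""
    ranks = [rank.split("(")[0].strip().lower()
             for rank in taxonomy.split(";") if rank.strip()]
    return "chloroplast" in ranks


def keep_chloroplasts(assignments: Dict[str, str]) -> Dict[str, str]:
    """Drop every sequence whose lineage lacks a chloroplast rank."""
    return {seq_id: tax for seq_id, tax in assignments.items()
            if is_chloroplast(tax)}


# Hypothetical taxonomy assignments (sequence ID -> lineage)
example = {
    "seq_a": "Bacteria(100);Cyanobacteria(100);Chloroplast(100);",
    "seq_b": "Bacteria(100);Proteobacteria(98);Deltaproteobacteria(95);",
    "seq_c": "Bacteria(100);Cyanobacteria(100);SubsectionI(90);",
}
kept = keep_chloroplasts(example)
print(sorted(kept))  # ['seq_a']
```

Matching on whole rank names rather than substrings keeps free-living cyanobacteria (such as `seq_c` here) out of the chloroplast set.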

To build our phylogenetic tree, we aligned the ten most abundant operational taxonomic units (OTUs; defined at a 99% cutoff) with database sequences using Clustal X (Jeanmougin et al., 1998). We used MrBayes 3.2 (Ronquist et al., 2012) to infer a tree using 10⁶ generations (standard deviation <0.005; see the Data Repository).

RESULTS

We extracted and amplified DNA in sediment from near seafloor to a depth of 218 m CCSF at Site U1339 and 364 m CCSF at Site U1343 (Table DR1). The amount of DNA extracted per gram of wet sediment decreased rapidly with depth, with a best-fit curve following a power law (Fig. 1).
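A power-law relationship of the form y = a·x^b is a straight line on log-log axes, so it can be fit by ordinary least squares on the logarithms. The sketch below shows one way to do this; it is not necessarily the fitting procedure used for Fig. 1, the function name `fit_power_law` is hypothetical, and the depths and DNA yields are synthetic values generated from a known exponent, not data from this study.

```python
import math


def fit_power_law(x, y):
    """Least-squares fit of y = a * x**b via linear regression on
    log-transformed values. Returns (a, b). All inputs must be > 0."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx                      # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)  # intercept in log-log space = ln(a)
    return a, b


# Synthetic, illustrative values only (NOT measurements from this study)
depths_m = [0.5, 2.0, 10.0, 50.0, 200.0]
dna_ng_per_g = [400.0 * d ** -0.8 for d in depths_m]

a, b = fit_power_law(depths_m, dna_ng_per_g)
print(f"a = {a:.1f}, b = {b:.2f}")
```

Because the synthetic values lie exactly on a power law, the fit recovers the generating coefficient and exponent; real measurements would scatter about the line.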

The number of 16S reads after quality control varies between 150,567 and 324,963 reads per sample (Table DR1). Between 0.73% (n = 1586) and 0.006% (n = 13) reads per sample are taxonomically assigned as chloroplasts and fall inside major photosynthetic eukaryotic clades (Fig. 2). Free-living cyanobacterial sequences constitute a much smaller fraction (between 0.04% and zero).
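The percentages and read counts above can be checked against each other with simple arithmetic. The sketch below back-calculates the total reads implied by each reported pair; the function names are hypothetical, and the implied totals are approximate because the reported percentages are rounded.

```python
def percent_of_reads(target_reads: int, total_reads: int) -> float:
    """Percentage of a sample's reads assigned to a target group."""
    return 100.0 * target_reads / total_reads


def implied_total(target_reads: int, percent: float) -> float:
    """Total reads implied by a target read count and its percentage."""
    return target_reads / (percent / 100.0)


# Reported extremes: 0.73% (n = 1586) and 0.006% (n = 13)
high = implied_total(1586, 0.73)
low = implied_total(13, 0.006)
print(f"{high:.0f} and {low:.0f} total reads implied")

# Both implied totals fall inside the reported range of reads per sample
assert 150_567 <= high <= 324_963
assert 150_567 <= low <= 324_963
```

Both pairs imply totals of roughly 217,000 reads, within the reported range of 150,567 to 324,963 reads per sample.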

Dominant chloroplast taxa in our data set include sequences that group with the diatom genera Chaetoceros (OTU 01) and Thalassiosira (OTU 05) (Fig. 2). These diatom taxa are well resolved phylogenetically, and chloroplast sequences indicative of several other groups, such as picoeukaryotes and heterokonts, are also present. This gene fragment is poor for resolving taxonomy of land plants, however (Fig. 2).

The fraction of chloroplast reads per sample declines with sediment age at both sites, but never reaches zero (Fig. 3A). The fraction of reads identified as cpDNA is greatest in the shallowest sample at each site: 0.73% at 0.65 m CCSF for Site U1339 and 0.64% at 0.15 m CCSF for Site U1343 (Table DR1). Minimum fractions identified were 0.026% at U1339 and 0.006% at U1343, or 57 and 13 total reads, respectively. These read counts are very small, but they are detectable because the sampling intensity is high (>150,000 total reads per sample; see the Data Repository for methods; Table DR1).

For sediment younger than 100 ka, the observed fraction of cpDNA follows a curve with a half-life of 15 k.y. (Fig. 3A). At greater ages, the cpDNA data depart from a half-life decay model by converging on a value >0. Because the relative abundance of cpDNA represents a shrinking fraction of a shrinking DNA pool (Fig. 1), we multiply the fraction of cpDNA sequences by the measured quantity of total DNA extracted per gram of sediment. This product, which approximates the relative amount of cpDNA remaining over time, fits a power law instead of a half-life decay model (Fig. 3B; R² = 0.96 for Site U1339, R² = 0.84 for Site U1343).
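To see why a constant half-life cannot describe the older samples, it helps to evaluate what a 15 k.y. half-life predicts at the ages sampled here. The sketch below does this and also shows the product used as a proxy for remaining cpDNA; the function names are hypothetical, and the inputs passed to `cpdna_proxy` are placeholders rather than values from Table DR1.

```python
def fraction_remaining(age_ky: float, half_life_ky: float) -> float:
    """Fraction of the starting pool left after age_ky under
    first-order (constant half-life) decay."""
    return 0.5 ** (age_ky / half_life_ky)


def cpdna_proxy(cp_fraction: float, total_dna_ng_per_g: float) -> float:
    """Relative cpDNA amount: fraction of reads assigned to cpDNA
    multiplied by total extracted DNA (ng per g sediment)."""
    return cp_fraction * total_dna_ng_per_g


# Predicted survival under a 15 k.y. half-life at three ages (k.y.)
for age in (15.0, 100.0, 1400.0):
    print(f"{age:>6.0f} k.y.: {fraction_remaining(age, 15.0):.3e}")

# Placeholder inputs (NOT values from Table DR1)
print(cpdna_proxy(0.005, 200.0))
```

Under this model about 1% of the starting pool would remain at 100 k.y., and a vanishingly small fraction (many orders of magnitude below one read in 150,000) at 1.4 Ma. The detection of cpDNA reads in every sample is therefore incompatible with a single 15 k.y. half-life over the whole record.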

Contamination Considerations

When dealing with low-biomass samples, potential contamination issues must be addressed. Several lines of evidence indicate that there was no significant sample contamination during coring and sediment subsampling: (1) all of the samples were obtained by piston coring, which is generally much less disruptive to sample integrity than other drilling techniques (Smith et al., 2000); (2) the perfluorocarbon contamination tracer used during coring was not detected in any of the cores at these sites (Takahashi et al., 2011); (3) our observations show a trend with depth (Fig. 3), in contrast to previous studies of contamination tracers where no depth trends were observed (Lever et al., 2006); (4) a previous study including Sites U1339 and U1343 found no anomalous cell counts even at extremely low cell densities (∼10³ cells/cm³) (Kallmeyer et al., 2012); and (5) chloroplast DNA decreases with sediment age (Fig. 3), while contamination can be expected to cause a relative increase of contaminant to in situ biomass in older sediment.

Contamination during DNA extraction, amplification, and sequencing must also be considered. We believe that our laboratory technique and processing of blanks minimized this issue because (1) many of our commonly found taxa were also reported by shipboard micropaleontology (Takahashi et al., 2011); (2) commonly observed taxa included polar marine organisms (Fig. 2); and (3) again, contamination would impact deeper samples disproportionately, in opposition to our observed trends with depth (Fig. 3).

DISCUSSION

Some of the total DNA in these sediment sequences must be from active microbiota, as indicated by ongoing biogeochemical activity in the subseafloor at these sites (Wehrmann et al., 2011). The decrease in DNA content with increasing sediment age (Fig. 1) is consistent with decreases in cell concentrations (Kallmeyer et al., 2012) and biogeochemical activity (Wehrmann et al., 2011) with increasing depth at these sites.

Persistence of Fossil cpDNA over Geological Time

The individual sequences we report here are related to eukaryotic chloroplast 16S ribosomal genes (Fig. 2). Cyanobacterial reads are one to two orders of magnitude less common (Fig. DR1 in the Data Repository), though their presence in young sediment is consistent with other studies of Holocene sediment (Coolen et al., 2009). The decline in the fraction of cpDNA sequences with increasing sediment age is not surprising, as phytoplankton remains presumably continue to decay over time.

As described in the Results section, for both sites, the relationship of cpDNA sequence fractions to age best fits a power law (Fig. 3B). This suggests that residual cpDNA is more resistant to degradation in older sediment. This trend, of enhanced degradation in younger sediment and recalcitrance at depth, is similar to the pattern of viral particle abundance in marine sediment (Engelhardt et al., 2014).

We do not know the mechanism behind the apparent slowdown of DNA degradation with age. It may reflect decreased lability of residual DNA, an overall decrease in enzyme activity, a decrease in spontaneous decay rates, or some combination of these and other factors.

Some studies have shown that DNA preservation can be enhanced by mineral surfaces (Lindahl, 1993; Romanowski et al., 1991), and specifically silica (Grass et al., 2015). In this context, it is interesting that many of the cpDNA OTUs in this sediment, including six out of ten dominant OTUs, are associated with diatoms and silica-secreting heterokonts (Fig. 2). These taxa (most notably the diatoms Chaetoceros and Thalassiosira) are well represented and even dominant in microfossil counts from these sites (Takahashi et al., 2011). This association between cpDNA and siliceous microfossils indicates that the cpDNA has been present in the sediment since it was deposited. The most common sequence type in this sediment, OTU 01/Chaetoceros (50.4% of total cpDNA reads at Site U1339 and 30.1% at Site U1343), is taxonomically associated with a prominent microfossil that occurs throughout the sediment in the form of siliceous resting spores (Takahashi et al., 2011). This association of some sequences with spores suggests that cpDNA preservation may be enhanced on long time scales in dehydrated microenvironments.

Possible Utility of Fossil cpDNA

These records extend the known co-occurrence of microfossil taxa with their DNA by well over an order of magnitude in time (Lejzerowicz et al., 2013). This result suggests the utility of DNA, or possibly even RNA (Orsi et al., 2013), in siliceous microfossils as an information-dense biomarker, which could provide (1) paleoceanographically useful information about temporal and geographic variation of individual diatom strains, and (2) biologically useful information about adaptation and change of strains on time scales that exceed 1 m.y.

Definition of the Deep Biosphere

The upper boundary of the deep sedimentary biosphere is defined differently by different studies (e.g., Colwell and D’Hondt, 2013; Orcutt et al., 2013). Teske and Sørensen (2008, p. 4) advanced a definition of this upper boundary as “the sediment horizon where water column bacterial and archaeal communities are fading out and solely sediment-typical bacterial and archaeal communities are remaining.” This “fading out” occurs in sediment of 100–200 ka age at our sites (Fig. 3). However, rare cpDNA sequences are still detectable among >150,000 total reads in our ca. 1 Ma sediment. Consequently, it may take millions of years before detectable sequences consist exclusively of organisms that were active in the sediment.

We propose an alternate definition of the deep sedimentary biosphere’s upper boundary, as the age beyond which residual surface-world DNA changes little. At our sites, this inflection occurs at ca. 100–200 ka, suggesting that after this point, fossil DNA does not appear to interact at an appreciable rate with enzymes or cells found in this sediment.

CONCLUSIONS

Plankton DNA in marine sediment decays over geologic time (e.g., Boere et al., 2011b). At our Bering Sea sites, the majority of cpDNA sequences disappear within the first 100–200 k.y., but traces are present in sediment of every age sampled (as old as 1.4 Ma). Some of these cpDNA reads match siliceous microfossil taxa previously identified in the same sedimentary sequences, suggesting that microfossils may help to preserve DNA. This persistence of a small relative fraction beyond 1 Ma suggests that residual cpDNA becomes increasingly recalcitrant with increasing sediment age. These results highlight both (1) the potential of fossil DNA for paleoecology studies, and (2) its relative isolation from the biogeochemical processes driven by active subseafloor microbiota.

We thank the IODP; the staff, crew, and science party of Expedition 323; and Heather Schrum, Nils Risgaard-Petersen, Dennis Graham, Victoria Fulfer, Paul Johnson, and Janet Atoyan. We thank the National Science Foundation (NSF)–funded Center for Dark Energy Biosphere Investigations (C-DEBI: NSF grant OCE-0939564). We used the Rhode Island Marine Science Research Facility and the Genomics and Sequencing Center, supported in part by NSF under EPSCoR Grants Nos. 0554548 and EPS-1004057. This is C-DEBI contribution #329. We thank the editor and reviewers for their contributions to this manuscript.

¹GSA Data Repository item 2016199 (details of methods protocols, taxa-specific depth and site profiles, comparative trees, and a table of sample data) is available online or on request.