From 2007 to 2010, the United States Geological Survey (USGS) conducted a low-density (one site per 1600 km²) geochemical and mineralogical soil survey of the conterminous USA (c. 8 × 10⁶ km²). This project was initiated to address the lack of a national soil geochemical database that was a critical need for state and federal environmental agencies, public health specialists and those engaged in risk assessment of contaminated land. Sampling and analytical protocols were developed in consultation with stakeholders at a 2003 workshop, and pilot studies were carried out from 2004 to 2007. Sampling began in 2007 and concluded in 2010. Chemical and mineralogical analyses were completed in 2013, and the datasets were released to the public that same year. Geochemical and mineralogical maps were published in 2014, and an interactive website was released in 2019. The author was Project Chief for this effort throughout the lifetime of the study. The evolution of the project is discussed from its inception through to the publication of results and its impact. The lessons learned during the project are reviewed in the hope that applied geochemists who undertake such broad-scale geochemical mapping projects in the future will find them useful.

Thematic collection: This article is part of the Continental-scale geochemical mapping collection available at:

In 2007, the United States Geological Survey (USGS) initiated a low-density geochemical and mineralogical survey of soils of the conterminous USA (c. 8 × 10⁶ km²). The sampling density was one site per 1600 km², representing 4857 sites from which 14 434 samples were collected (Fig. 1). This effort was originally part of the North American Soil Geochemical Landscapes Project (NASGLP) that was to be conducted in collaboration with the Geological Survey of Canada (GSC) and the Mexican Geological Survey (Servicio Geológico Mexicano (SGM)). Recommendations for sampling and analytical protocols were developed at a workshop in 2003, and pilot studies were conducted from 2004 to 2007 to test and refine these recommended protocols. The final sampling protocol for the continental-scale survey included, at each site, a sample from a depth of 0–5 cm, a composite of the soil A horizon and a deeper sample from the soil C horizon or, if the top of the C horizon was deeper than 1 m, from a depth of c. 80–100 cm. The <2 mm fraction of each sample was analysed for a suite of 45 major and trace elements using methods that yield the total, or near-total, elemental content. The major mineralogical components (24 mineral species) in samples from the soil A and C horizons were determined by a quantitative X-ray diffraction (XRD) method using Rietveld refinement. A microbiological component, recommended at the 2003 workshop and tested during the pilot studies, was added to the study. This consisted of the determination of three soil pathogens: Bacillus anthracis (anthrax), Yersinia pestis (plague) and Francisella tularensis (tularemia or rabbit fever) in samples collected from a depth of 0–5 cm. Phospholipid fatty acid (PLFA) and soil enzyme determinations were performed on about 10% of the samples collected from the soil A horizon.

Sampling in the conterminous USA was finished in 2010, with chemical and mineralogical analyses completed in May 2013. The resulting dataset provides an estimate of the abundance and spatial distribution of chemical elements and minerals in soils of the conterminous USA, and represents a baseline for soil geochemistry and mineralogy against which future changes may be recognized and quantified.

Mexico completed its sampling and analysis in approximately the same time frame as the USA but the data have not yet been released to the public. Unfortunately, Canada dropped out of the project in 2009 after completing sampling in the provinces of Nova Scotia, New Brunswick and Prince Edward Island.

This paper focuses on the USA portion of the NASGLP, which came to be known as the Soil Geochemical Landscapes of the Conterminous United States Project.

In the late 1990s, with steadily increasing remediation activities being carried out at sites with contaminated soil, environmental regulators, risk-assessment specialists and the public health community needed data to estimate ‘background’ values of potentially toxic elements in soils. They quickly learned that national-scale soil geochemical data were sorely lacking in the USA, Canada and Mexico.

Neither Canada nor Mexico had a national-scale soil geochemical database, although some excellent regional-scale studies had been conducted in the Canadian prairies (Garrett and Thorleifson 1993; Garrett 1994; Garrett et al. 2008) and Ontario (Frank et al. 1976; OMOE 1993). The database most often used to estimate background element concentrations in soils of the conterminous USA was produced in a study conducted from the mid-1960s to the late 1970s by the USGS (Boerngen and Shacklette 1981; Shacklette and Boerngen 1984; Gustavsson et al. 2001; Smith and Reimann 2008; Smith et al. 2013b). It has been discussed in detail by Smith et al. (2013b). The study collected soil at a depth of c. 20 cm from 1323 sites sustaining native vegetation. Although the sites were distributed relatively evenly throughout the conterminous USA, they represent a density of only about one site per 6000 km². The <2 mm fraction of the samples was analysed for 50 elements using a variety of methods, all of which provided the total element concentration. However, 30 of the elements, including environmentally important elements such as Cd, Cr, Be and Pb, were analysed by an emission spectrographic method that only yielded semi-quantitative results. This small dataset, although very useful for its time, was inadequate to meet the growing needs of stakeholders.

The directors of the USGS, GSC and SGM agreed in 2001 that establishing a soil geochemical database for North America was a subject for collaboration between the three countries. In October 2002, a meeting was held in Denver, Colorado (USA) to discuss the way forward. Twenty-nine attendees were present representing the USGS, GSC, SGM, Agriculture and Agri-Food Canada, Health Canada, US Environmental Protection Agency (US EPA) and the Natural Resources Conservation Service (NRCS). It was agreed that the first step should be to engage the stakeholders for soil geochemical data to get their input regarding a continental-scale geochemical soil survey of North America.

In March 2003, the Soil Geochemistry Workshop was held in Denver, Colorado, attracting 112 attendees representing 43 stakeholder groups (Table 1). During the workshop, breakout groups were asked to provide recommendations on: (1) the design of a continental-scale geochemical soil survey; (2) the sampling protocols for the survey; and (3) the analytical protocols for the survey.

Survey design

The workshop group focusing on the survey design was charged with: (1) developing a statement of the objective for the survey; (2) defining the target population; and (3) making recommendations on the spatial distribution of the samples to be collected. The stated objective for the proposed survey, as developed by the group, was to determine the current, unbiased, geochemical status of North American soils. This included defining the continental-scale geochemical variations, delineating regional- to continental-scale geochemical patterns and interpreting the patterns in terms of the processes that caused them. The target population for the survey was defined as all soils of the North American continent, including agricultural soils, soils sustaining native vegetation, alluvial soil, residual soil and urban soils.

There was much debate among the group's participants on how to select the initial target sites for sampling. A stratified design based on various conceptual frameworks was discussed. Some of the options for stratification included soil series or other soil-classification subdivision, ecological classifications, bedrock or surficial geology, hydrological regions and population density. The discussion of these options brought into focus the competing interests of workshop participants representing the various physical sciences and those whose prime concern was human health. Although there were some recognized advantages to a stratified design, it was apparent that the selection of one method of stratification would be likely to put constraints on the use of the resulting data by other investigators with a need for a different conceptual framework, thus reducing the data's overall utility.

To preserve the maximum long-term use of the data, it was recommended that a site-selection scheme be employed that gave each part of North America an equal chance of being sampled. The subject of target-site selection was a matter of debate for the next few years as the pilot phase of the project (discussed below) was conducted. Another workshop was held in 2006 to reach final agreement on this subject. Thirteen attendees representing the USGS, GSC, EPA, NRCS, Centers for Disease Control and Prevention, Minnesota Geological Survey, Savannah River Ecology Laboratory, Agriculture and Agri-Food Canada, and Environment Canada participated in the meeting. The consensus recommendation from the group was to use a Generalized Random Tessellation Stratified (GRTS) design to select target sites for the continental-scale geochemical soil survey. The GRTS design produces a spatially balanced set of sampling sites without adhering to a strict grid-based system. Its attributes have been fully described by Stevens and Olsen (2000, 2003, 2004) and routines for implementing the design are readily available. It was also recommended that the target sites should represent a density of one site per 1600 km² (13 323 sites for the continent). This would be equivalent to grid-based sampling using 40 × 40 km grid cells.
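The relationship between the recommended site density and the equivalent grid spacing can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the area of c. 8 × 10⁶ km² for the conterminous USA is the rounded figure quoted elsewhere in this paper.

```python
import math


def grid_cell_side_km(area_per_site_km2: float) -> float:
    """Side length of a square grid cell holding one site per given area."""
    return math.sqrt(area_per_site_km2)


def nominal_site_count(total_area_km2: float, area_per_site_km2: float) -> int:
    """Nominal number of sites for a region at a given sampling density."""
    return round(total_area_km2 / area_per_site_km2)


# One site per 1600 km2 corresponds to a 40 x 40 km cell.
side_km = grid_cell_side_km(1600.0)

# Rounded area of the conterminous USA (c. 8 x 10^6 km2, from the text).
usa_sites = nominal_site_count(8e6, 1600.0)

# Area implied by the 13 323 continental target sites at the same density.
continent_area_km2 = 13323 * 1600
```

With the rounded area, the nominal count for the conterminous USA is 5000 sites, which is of the same order as the 4857 target sites reported for the survey.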

Sample collection protocols

The group focusing on sample collection protocols recommended that sampling at each site be based primarily on genetic soil horizons as opposed to constant depth intervals. Sampling by horizon provides data on discrete soil genetic units, whereas depth interval sampling mixes different genetic units in an uncontrolled and largely unknown manner. It was therefore recommended that the following samples should be collected at each site: (1) a composite of the soil O horizon; (2) a composite of the soil A horizon, the uppermost mineral soil; (3) a sample from the unit judged to be most representative of the soil B horizon; and (4) a sample from the soil C horizon, which is generally the partially weathered parent material. In addition, one depth-based sample from a depth of 0–5 cm was recommended at each site. This sample type was strongly supported by the public health specialists attending the workshop because this is the portion of the soil profile with which humans, especially children, most often come into contact during their daily activities.

There was a strong consensus that the size fraction for chemical analysis should be the <2 mm fraction to be compatible with most other published, broad-scale geochemical studies conducted throughout the world (e.g. Shacklette and Boerngen 1984; Sandström et al. 2005; Mackových and Lučivjanský 2014; Caritat and Cooper 2016; Batista et al. 2022). Establishing an archive of samples collected during this study to provide material for future research was strongly supported. The size of field samples, therefore, had to be adequate to meet immediate project needs as well as providing archival material.

Analytical methods

The group assigned to make recommendations on analytical protocols felt that the most important measurement for the study was total elemental composition. This was regarded as the most consistent type of analysis. Other methods, such as partial chemical extractions, are much more dependent on procedural details and may rely on the experience of the operators. It was recommended that the primary analytical protocol used on the soil samples should be a four-acid (HCl, HNO3, HF and HClO4) digestion followed by analysis using a combination of inductively coupled plasma atomic emission spectrometry (ICP-AES) and inductively coupled plasma mass spectrometry (ICP-MS). This methodology offers the combination of high throughput, good sensitivity and a broad range of analytes (>40 elements) at a reasonable cost. This approach would have to be supplemented by single-element techniques to determine As, Se, Hg and different forms of C. The group also recommended additional analyses, on at least a subset of samples, involving fusion (sintering). This is a more effective approach for the analysis of refractory elements such as Zr, Nb and Ti. These elements are often regarded as immobile relative to other elements and, thus, potentially serve as a useful reference for interpreting soil processes. In addition, it was recommended that a subset of samples be analysed by additional methodologies including instrumental neutron activation analysis (INAA) and X-ray fluorescence (XRF), providing data against which the primary analytical protocol could be compared.

The group also recognized a second important suite of soil characterization measurements including field moisture content, bulk density, soil pH, cation exchange capacity, particle-size analysis, and citrate-dithionite-extractable Fe, Al and Mn. In addition, it was suggested that quantitative mineralogy should be determined by XRD analysis on at least a subset of the samples.

In addition to basic elemental, mineralogical and soil characterization parameters, the breakout group considered other types of analyses that would add value to the proposed continental-scale geochemical soil survey: (a) bioaccessibility; (b) organic geochemistry; and (c) microbiology.

Two specific approaches were suggested to determine bioaccessibility. The first was a simple water leach from which distribution coefficients (Kds) could be derived to establish the solubility of metals in soil solution. The second type of bioaccessibility test would focus on human health issues. These studies would involve extractions with simulated human gastric fluid and simulated human lung fluid to mimic what happens chemically when soil particles are ingested or inhaled.

The presence of specific organic compounds in soil was of interest because of the toxicity of some of these compounds, and because regional- to continental-scale data on the distribution of these compounds is very limited. The incorporation of organic analysis of the soil samples would address the long-range transport of organic compounds, and the geographical distribution of pesticides and their breakdown products.

The consensus of the group was that soil microbiological studies should be considered as a component of the continental-scale geochemical survey. The recommendation was to utilize two complementary methods. Phospholipid fatty acid (PLFA) analysis would provide a measure of microbial mass, as well as markers for major groups of organisms (i.e. patterns of microbial ecology). It does not, however, reveal the identity of individual microbial species. The second method is DNA fingerprinting. This approach has the potential to provide both patterns of microbial ecology and identify specific micro-organisms. There are a number of pathogens potentially present in soils (e.g. anthrax) and a continental-scale soil survey could provide background data on their distribution. Sample preservation and transport issues are critical for soil microbial analysis. Sample collection must be performed using sterile procedures, including sterile gloves and sterile sample containers. The soil could be held at room temperature for 1–2 days but should be frozen for transport to the laboratory.

Pilot studies

The 2003 Soil Geochemistry Workshop produced recommendations for sample collection and analytical protocols for the geochemical soil survey of North America. As these were far more complicated than envisioned by the directors of the USGS, the GSC and the SGM or by the participants in the 2002 planning session, a pilot study was considered a necessity to determine what protocols could and could not be accomplished with the time, funding and personnel resources likely to be available for the study. A workshop was held in March 2004 at the University of California, Davis, to design these pilot studies. Twenty-six participants attended the meeting representing 10 stakeholder entities (Table 2). The outcome of this workshop was a pilot phase, initiated in 2004, that consisted of studies at two very different scales. One of the pilot studies was conducted at a continental scale, and consisted of sampling and analysis of soils at c. 40 km intervals along two transects across Canada, the USA and Mexico. The purpose of this continental-scale pilot study was to test and refine the recommendations from the 2003 workshop and to optimize field logistics. A pilot study at a more regional scale in a 20 000 km² area of northern California was conducted as a model for higher-resolution, process-orientated, follow-up studies that might be performed in areas of interest selected from the low-density continental-scale mapping. The results of these pilot studies were published as 21 papers in a special issue of Applied Geochemistry in August 2009 (Bern 2009; Cannon and Horton 2009; Chiprés et al. 2009a, b; Eberl and Smith 2009; Garrett 2009; Garrett et al. 2009; Goldhaber et al. 2009; Griffin et al. 2009; Grunsky et al. 2009; Holloway et al. 2009; Klassen 2009; McCafferty and Van Gosen 2009; Morman et al. 2009; Morrison et al. 2009; Reeves and Smith 2009; Smith et al. 2009; Tuttle et al. 2009; Tuttle and Breit 2009; Wanty et al. 2009; Woodruff et al. 2009). Chiprés et al. (2009b) and Smith et al. (2009) give details of the design, sample collection and analytical protocols for the continental-scale pilot, and Goldhaber et al. (2009) provide similar information for the northern California, regional-scale, pilot study.

The USA–Canada portion of the continental-scale pilot study carried out as many of the recommendations from the 2003 workshop as budget would allow. At each site, up to four separate samples were collected as follows: (1) material from a depth of 0–5 cm; (2) soil O horizon, if present; (3) soil A horizon; and (4) soil C horizon or, if the top of the C horizon was at a depth of more than 1 m, a sample from c. 80–100 cm. It was decided not to collect a sample judged to be the most representative of the soil B horizon because such a judgement requires an experienced soil scientist. Most of the collection of samples would be performed by geologists and applied geochemists for the pilot phase. A separate sample of soil from a depth of 0–5 cm was collected for determination of organochlorine pesticides, the only organic analysis that budget would allow. A separate sample of the soil A horizon was collected for microbial characterization, including PLFA analysis, soil enzyme assays, and determination of selected human and agricultural pathogens (Brucella abortus, E. coli, Cryptosporidium parvum and Bacillus anthracis). Volumetric sampling was conducted for the soil O and A horizons to allow estimates of bulk density. Analytical protocols are summarized in Table 3. More detailed sampling and analytical protocols are given in Smith et al. (2005).

Upon completion of the pilot study in 2007, an evaluation was made regarding what sampling and analytical protocols would be used in the full continental-scale survey. Several important decisions were made based on the pilot studies:

  1. Soil O horizon material is not available at most sites in North America and, thus, this sampling medium is not appropriate for generating data for the entire study area.

  2. Volumetric sampling for the estimation of bulk density required too much time at each site, thus adding considerable cost to the project, and so was not included in the full continental-scale project.

  3. Analyses involving sintering of samples, XRF, and INAA were not included in the full continental-scale project because of cost considerations. Archival samples will be available if these analyses are needed in the future.

  4. The cost of sampling and analysing for organic compounds is prohibitively expensive; therefore, no organic analyses were performed in the full project.

  5. While bioaccessibility studies are extremely interesting, it was decided that resources were insufficient to include these in the full project. Archival samples will be available if these studies are desired in the future.

  6. Although quantitative mineralogy by XRD is extremely labour-intensive in terms of sample preparation and interpretation of spectra, the resulting data are vital for the interpretation of observed geochemical patterns. Therefore, quantitative mineralogy was added to the full continental-scale project.

  7. A reduced microbiological component was maintained because of the interest generated by the pilot studies. This consisted of the determination of three soil pathogens in samples of soils collected from a depth of 0–5 cm: Bacillus anthracis (anthrax), Yersinia pestis (plague) and Francisella tularensis (tularemia or rabbit fever). In addition, PLFA and soil enzyme analysis was performed on about 10% of the samples from the soil A horizon.

The analytical protocols used for the full continental-scale survey of the conterminous USA are summarized in Table 4.

The pilot studies were completed in the USA and Canada in 2007, and sampling for the full continental-scale project began that same year. In Mexico, pilot studies were completed in 2008 and the full sampling started in 2009. Unfortunately, Canada dropped out of the project in 2009 because of a change in scientific priorities at the GSC. Mexico continued in the project but, as of early 2022, has not released its data. For this reason, the remainder of this paper will focus only on the USA portion of the study.

Project management

The project had one Project Chief and a three-person Steering Committee (including the Project Chief). This Steering Committee made all the decisions throughout the life of the project. It should also be noted that this was a very small project in terms of USGS staff assigned to it: it never had more than the equivalent of five or six full-time USGS scientists in any one year.

Sampling protocols

A GRTS design, as discussed in a previous section, was used to select target sites that represented a density of c. one site per 1600 km² (4857 sites for the conterminous USA). If a target site was inaccessible for any reason during the sampling programme, the field crew would select an alternative site as close as possible to the original site, with landscape and soil characteristics as similar as possible to those of the original site. The following guidelines were also used in the site-selection process to ensure that samples were not collected from obviously contaminated areas:

  1. No sample should be collected within 200 m of a major highway.

  2. No sample should be collected within 50 m of a rural road.

  3. No sample should be collected within 100 m of a building or structure.

  4. No sample should be collected within 5 km downwind of active major industrial activities such as power plants or smelters.
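The four exclusion guidelines above lend themselves to a simple screening check when candidate or alternative sites are evaluated. The following is a minimal sketch only: the `CandidateSite` record, its field names and the assumption that distances have already been measured in metres are all hypothetical, and in the survey itself these guidelines were applied by field crews rather than by software.

```python
from dataclasses import dataclass

# Minimum separation distances in metres, taken from the four guidelines.
MIN_DISTANCE_M = {
    "major_highway": 200.0,
    "rural_road": 50.0,
    "building": 100.0,
    "upwind_industry": 5000.0,  # site must not lie within 5 km downwind
}


@dataclass
class CandidateSite:
    """Hypothetical record of distances (m) from a candidate site."""
    major_highway: float
    rural_road: float
    building: float
    upwind_industry: float  # nearest active major industrial source upwind


def guideline_violations(site: CandidateSite) -> list:
    """Return the names of the guidelines that the site fails."""
    return [
        name
        for name, minimum in MIN_DISTANCE_M.items()
        if getattr(site, name) < minimum
    ]


too_close = CandidateSite(150.0, 60.0, 120.0, 6000.0)
acceptable = CandidateSite(900.0, 75.0, 400.0, 12000.0)
```

Here `guideline_violations(too_close)` reports only the highway guideline, whereas `acceptable` passes all four checks.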

The sampling protocols used for the full survey were finalized based on the experience from the pilot study. The protocols represented a combination of depth-based and horizon-based sampling. Ideally, the following samples were collected at each site: (1) soil from a depth of 0–5 cm; (2) a composite of the soil A horizon (the uppermost mineral soil); and (3) a sample from the soil C horizon (generally partially weathered parent material) or, if the top of the C horizon was deeper than 1 m, a sample from about 80–100 cm. In addition, a separate sample of surface soil (0–5 cm) was collected at each site for the determination of soil pathogens, and separate samples of all three sample types were collected at 10% of the sites for future microbial characterization studies.

Samples were collected by state beginning in 2007, with the last sample collected in late 2010 (Fig. 1). Sampling in 2007 (the six New England states and New York) was conducted by USGS personnel. Most of the sampling during the final 3 years was conducted by teams of students chosen for their academic course work in soil science and participation on their university's soil judging team. Nineteen students representing 12 universities participated in this sampling programme. Samples in North Dakota and South Dakota were collected by staff of the US Department of Agriculture's Natural Resources Conservation Service. The Pennsylvania Geological Survey, the Conservation and Survey Division of the University of Nebraska's School of Natural Resources, and the Minnesota Geological Survey collected samples in their respective states. Not all sample types were collected at each site. For some urban sites (e.g. city parks or private gardens), only the surface sample (0–5 cm) was collected for fear of digging into buried utilities or sprinkler systems. In addition, a small number of samples were lost in shipping, so some sites have only one or two sample types.

Sample preparation and chemical analysis

All samples were shipped to the USGS laboratories in Denver, Colorado, where they were prepared and submitted for analysis by a USGS contract laboratory in the order they were collected, by state. As a result of this process, chemical analyses were carried out from late 2007 to early 2013. For large geochemical surveys like this one, the ideal course of action is to submit the samples for chemical analysis in a single batch after all samples have been collected in order to avoid bias in the chemical data caused by changes during the several years of the collection phase, such as changes in analytical instruments or analysts. The year-to-year budget process in the USGS, however, dictated that samples had to be submitted on a yearly basis. All samples within a given state were randomized prior to chemical analysis to avoid confusing spatial variation with any possible systematic bias within a given analytical technique. This randomization does not eliminate a systematic error but the error is effectively transformed into one that is random with regard to geographical location within a state.
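The within-state randomization described above can be illustrated with a short sketch. The sample identifiers, the dictionary layout and the use of a fixed seed are assumptions made for the example; the procedure actually used by the USGS laboratory is not described in that detail here.

```python
import random


def randomized_submission_order(sample_ids_by_state, seed=0):
    """Shuffle sample IDs within each state before submission.

    Shuffling breaks any link between analytical run order and
    geographical position within a state, so that instrumental drift
    appears as random noise rather than as a spatial pattern.
    """
    rng = random.Random(seed)  # fixed seed keeps the order reproducible
    order = {}
    for state, sample_ids in sample_ids_by_state.items():
        shuffled = list(sample_ids)  # copy so the field order is preserved
        rng.shuffle(shuffled)
        order[state] = shuffled
    return order


# Hypothetical sample IDs listed in the order they were collected.
collected = {
    "ME": ["ME-001", "ME-002", "ME-003", "ME-004"],
    "NY": ["NY-001", "NY-002", "NY-003"],
}
submitted = randomized_submission_order(collected, seed=42)
```

Each state is shuffled independently, which matches the state-by-state submission of samples; a bias that changes between yearly submissions would still differ between states.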

Each sample was air dried at ambient temperature, disaggregated and sieved to <2 mm. The <2 mm material was then crushed to <150 µm prior to chemical analysis. Concentrations of 41 elements were determined by ICP-AES using a method adapted from Briggs (2002) and by ICP-MS using a method adapted from Briggs and Meier (2002). The sample was decomposed using a near-total four-acid (HCl, HNO3, HF and HClO4) digestion at a temperature between 125 and 150°C. The lower reporting limits (LRL) are shown in Table 5. Mercury (Hg) was determined by cold-vapour atomic absorption spectrometry (AAS) after dissolution in a mixture of nitric and hydrochloric acids in a modification of the method published by the US EPA (1994). The LRL was 0.01 mg Hg kg⁻¹. For analysis of arsenic (As), the sample was fused in a mixture of sodium peroxide and sodium hydroxide at 750°C. The fused mixture was then dissolved in hydrochloric acid and analysed by hydride-generation AAS using a method similar to Hageman et al. (2002). The LRL was 0.6 mg As kg⁻¹. Selenium (Se) was determined by hydride-generation AAS after dissolution in a mixture of nitric, hydrofluoric and perchloric acids (Hageman et al. 2002). The LRL was 0.2 mg Se kg⁻¹. Total carbon (C) was determined by the use of an automated carbon analyser. The sample was combusted in an oxygen atmosphere at 1370°C to oxidize C to carbon dioxide (CO2). The CO2 gas was measured by a solid-state infrared detector using a method similar to Brown and Curry (2002). The LRL was 0.01% C. The concentration of organic carbon was calculated by subtracting the amount of inorganic (carbonate) carbon (determined from the mineralogical data for the carbonate minerals calcite, dolomite and aragonite) from the total carbon concentration.
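The organic carbon calculation can be sketched as follows. The carbon mass fractions are derived from the ideal formulae CaCO3 (calcite and aragonite) and CaMg(CO3)2 (dolomite) using standard atomic masses; the function names and the example composition are hypothetical, and the handling of rounding or of negative differences in the published dataset is not specified here.

```python
# Standard atomic masses (g/mol).
CA, MG, C, O = 40.078, 24.305, 12.011, 15.999

# Mass fraction of carbon in each carbonate mineral (ideal formulae).
CARBON_FRACTION = {
    "calcite": C / (CA + C + 3 * O),               # CaCO3
    "aragonite": C / (CA + C + 3 * O),             # CaCO3 polymorph
    "dolomite": 2 * C / (CA + MG + 2 * (C + 3 * O)),  # CaMg(CO3)2
}


def inorganic_carbon_pct(minerals_wt_pct):
    """Carbonate carbon (wt%) from XRD mineral percentages."""
    return sum(
        CARBON_FRACTION[mineral] * minerals_wt_pct.get(mineral, 0.0)
        for mineral in CARBON_FRACTION
    )


def organic_carbon_pct(total_c_pct, minerals_wt_pct):
    """Organic carbon (wt%) as total carbon minus carbonate carbon."""
    return total_c_pct - inorganic_carbon_pct(minerals_wt_pct)


# Hypothetical sample: 3.0 wt% total C and 10 wt% calcite by XRD.
example_inorganic = inorganic_carbon_pct({"calcite": 10.0})
example_organic = organic_carbon_pct(3.0, {"calcite": 10.0})
```

For the hypothetical sample, about 1.2 wt% of the carbon is carbonate carbon, leaving about 1.8 wt% organic carbon. Because the mineralogical data have their own uncertainty, the difference can be slightly negative for carbonate-rich samples with little organic matter.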

Statistical summaries for each sample type can be found in Smith et al. (2013a) and all the data can be downloaded from this same publication.

Quality control (QC)

In this project, trueness estimation was carried out on three separate levels. The USGS contract laboratory analysed a certified reference material (CRM) with every batch of 48 samples. At the second level, the USGS QC officer inserted at least one reference material between every batch of 20–30 samples. Both a CRM from the National Institute of Standards and Technology and a non-certified USGS in-house reference material prepared specifically for this study were used. The USGS principal investigator for the study (D.B. Smith) initiated the final QC tier, which included the insertion of two blind reference materials in each batch of 20–30 samples.

Precision is generally assessed by repeated analyses of a reference material or by replicate analyses of real project samples. With regard to the latter method, the USGS contract laboratory inserted duplicate samples at random intervals at an approximate rate of one duplicate sample per 80 regular samples. The results from the QC protocols can be found in Smith et al. (2013a).

Mineralogical analysis

All A-horizon and C-horizon samples were analysed by XRD, and the percentages of major mineral phases were calculated using a Rietveld refinement method. Splits of the <2 mm fraction were used for analysis. Zinc oxide (ZnO: 10 wt%) was added to each sample as an internal standard, which allows the calculation of the amorphous component (the portion of sample that is not quantified by the diffraction technique). The sample–ZnO mixture was ground for 3 min in isopropyl alcohol using a micronizing mill and agate beads. Dried samples were disaggregated by being passed through a 400 µm sieve and lightly pressed into back-loaded sample mounts. Samples were analysed on a PANalytical X'Pert PRO Materials Research Diffractometer using Cu Kα radiation to collect digital data continuously from 3° to 70° 2θ (scan speed = 0.0567° 2θ s⁻¹). PANalytical HighScore Plus software version 2.2a was used for pattern processing, mineral-phase identification and Rietveld quantitative mineral analysis. Rietveld refinements simultaneously adjust the percentage of each identified phase to achieve the best least-squares fit between the observed diffractogram and the model diffractogram calculated as the combined contributions of each individual phase. The refinements include calculations that correct for the preferred orientation of phyllosilicate minerals and account for variations in the peak shape.
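The way an internal standard yields the amorphous component can be sketched with the standard internal-standard relation. This is an illustration under stated assumptions, not a description of the HighScore Plus calculation: it assumes that the 10 wt% ZnO refers to the sample–ZnO mixture, and the function name is hypothetical. Because the refinement normalizes the crystalline phases to 100%, any amorphous material inflates the apparent ZnO percentage above the amount actually added.

```python
def amorphous_fraction(standard_added: float, standard_refined: float) -> float:
    """Amorphous weight fraction of the original (standard-free) sample.

    standard_added   -- known weight fraction of standard in the mixture
    standard_refined -- weight fraction of standard reported by the
                        refinement, which sums crystalline phases to 1
    """
    if not 0.0 < standard_added < 1.0:
        raise ValueError("standard_added must lie between 0 and 1")
    if standard_refined <= 0.0:
        raise ValueError("standard_refined must be positive")
    # Fraction of the mixture that the refinement did not account for.
    amorphous_in_mixture = 1.0 - standard_added / standard_refined
    # Rescale to the original sample by removing the added standard.
    return amorphous_in_mixture / (1.0 - standard_added)


# Hypothetical case: 10 wt% ZnO added, refinement reports 12.5 wt% ZnO.
example = amorphous_fraction(0.10, 0.125)
```

In the hypothetical case, the mixture is 20% amorphous, which corresponds to about 22% of the original sample. A refined value equal to the added value implies no detectable amorphous material.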

Evaluation of the reliability of this method was performed by analysing standard mixtures of pure mineral phases prepared in-house and statistically evaluating the data. Standard ST1001 contained quartz and ZnO. Standard ST1003 contained quartz, ZnO, orthoclase, plagioclase (albite), illite and calcite. Each of these standards was included in each batch run of 45 unknowns to evaluate any instrumental drift. A series of additional standard mineral mixtures prepared in-house that contained from two to six phases of common rock-forming minerals were analysed separately between five and 10 times to evaluate the dolomite, chlorite, muscovite and amorphous content. Rock standard USGS G-2 (Flanagan 1969, 1976) was also analysed to test the method, even though it is a crystalline rock matrix.

Table 6 shows the mineral components reported in this study. Complete data and statistical summaries are available in Smith et al. (2013a).

Map preparation

In the maps shown in Smith et al. (2014, 2019), the spatial distribution of each element or mineral is shown either by an interpolated and smoothed average colour surface map or a proportional symbol map. Elements and minerals with only a small proportion of samples above the LRL are shown as proportional symbol maps; the others are shown as colour surface maps.

Maps showing predicted element and mineral concentrations for the conterminous USA are interpolated from the actual data points published in Smith et al. (2013a) by an inverse distance weighted (IDW) technique using ArcGIS software. The IDW method used predicted unique concentration values for an array of 444 km2 grid cells covering the conterminous USA (c. 8 × 106 km2). A weighted average of concentration values for all data points within 75 km of the centre of each grid cell was calculated. The closer a sample point was to the cell centre being estimated, the more influence that point had on the averaging process. The relative influence of closer and more distant points was adjusted by varying a power function. For all of the maps, a default value of 2 was used for the power function, which assigns relatively high significance to the nearest data points. Use of this value commonly produces numerous small, circular areas showing the influence of individual sample values that vary substantially from neighbouring samples and yields a spotted appearance of local high and low values. Many of these could be removed by varying interpolation parameters to produce a smoother surface. The IDW method is an exact interpolator in which minimum and maximum concentrations can occur only at sample points, so the interpolated values all lie within the range of analytically determined concentrations. To produce the final maps, the array of grid cells generated by IDW was further smoothed using a bilinear interpolation to suppress the appearance of sharp grid-cell boundaries. To produce the geochemical maps, samples containing less than the LRL for an element were assigned a value of one-half of the LRL. For mineralogical data, a nominal LRL of 0.2 wt% was used for all mineral phases that had been quantified, realizing that actual detection limits may have been higher in cases of severe peak overlap. A value of 0.1 wt% was assigned to all cases of non-detection for each mineral. 
For approximately half of the geochemical maps, a small number of outliers (between one and six) were removed prior to the IDW interpolation. Those outliers are extreme high concentrations that appear to represent a point source of either natural or anthropogenic origin and are not representative of a larger region. If they are included in the interpolation, they produce an unrealistically large area of predicted high concentrations and also unduly affect the classification of values used in the display. All removed outlier samples are shown as points (diamond symbols) on the interpolated maps along with their element concentrations.
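
The interpolation described above can be sketched in a few lines. This is not the ArcGIS implementation used for the published maps; it is a minimal illustration of inverse distance weighting with a power of 2 and a 75 km search radius, together with the one-half-LRL substitution. It assumes that coordinates are in kilometres in a projected coordinate system and that non-detects are stored as any value below the LRL. The function names are hypothetical.

```python
import numpy as np


def substitute_nondetects(values, lrl):
    """Replace values below the lower reporting limit (LRL) with LRL / 2."""
    values = np.asarray(values, dtype=float)
    return np.where(values < lrl, lrl / 2.0, values)


def idw_estimate(cell_xy, sample_xy, sample_values, radius_km=75.0, power=2.0):
    """IDW estimate at one grid-cell centre from samples within radius_km.

    Returns NaN when no sample lies inside the search radius.
    """
    sample_xy = np.asarray(sample_xy, dtype=float)
    sample_values = np.asarray(sample_values, dtype=float)
    dist = np.hypot(sample_xy[:, 0] - cell_xy[0], sample_xy[:, 1] - cell_xy[1])
    inside = dist <= radius_km
    if not inside.any():
        return float("nan")
    dist, vals = dist[inside], sample_values[inside]
    # Exact interpolator: a cell centre on a sample point takes its value.
    if np.any(dist == 0):
        return float(vals[dist == 0][0])
    # Closer samples receive larger weights; power controls the fall-off.
    weights = 1.0 / dist ** power
    return float(np.sum(weights * vals) / np.sum(weights))


# Hypothetical samples: two within 75 km of the cell centre, one far away.
xy = [(0.0, 0.0), (30.0, 0.0), (200.0, 0.0)]
conc = substitute_nondetects([10.0, 20.0, 1000.0], lrl=1.0)
estimate = idw_estimate((10.0, 0.0), xy, conc)
```

Because the estimate is a weighted average, it always lies between the lowest and highest values of the contributing samples, which is the property noted in the text. The distant high value in the example has no influence because it falls outside the search radius.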

As pointed out by Reimann (2005), perhaps the most important step in the construction of an informative geochemical map is the choice of classes for colour-coding the element concentrations. The chosen classes must relate to the statistical distribution of the concentrations if processes governing the regional distribution of an element are to be accurately discerned. In accordance with the recommendation of Reimann (2005), classes were chosen on the basis of percentiles of the analytically determined values for each element and mineral. The interpolated grid cell values are thus partitioned into 10 classes as follows: 0–10th percentile, 10th–20th percentile, 20th–30th percentile, 30th–40th percentile, 40th–50th percentile, 50th–60th percentile, 60th–70th percentile, 70th–80th percentile, 80th–90th percentile and 90th–100th percentile. Each class is represented by a unique colour, ranging from blues (lowest concentrations) to reds (highest concentrations). With these classes, each colour represents approximately 10% of the data. For elements with more than 10% of concentration values below the LRL, the percentile representation was modified so that all values below the LRL are included in the lowest concentration class, and the remaining values are shown by the appropriate percentile values and colours.
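
The percentile-based classification can be sketched as follows. This is an illustration only, not the procedure used to build the published maps: class boundaries are taken at the deciles of the analytically determined values and then applied to interpolated grid values. The function names are hypothetical, and the modification for elements with more than 10% of values below the LRL is not implemented.

```python
import numpy as np


def decile_breaks(measured):
    """Class boundaries at the 10th, 20th, ..., 90th percentiles."""
    measured = np.asarray(measured, dtype=float)
    return np.percentile(measured, np.arange(10, 100, 10))


def classify(grid_values, breaks):
    """Return class index 0 (lowest, blue end) to 9 (highest, red end).

    A value equal to a boundary is placed in the lower class.
    """
    return np.digitize(np.asarray(grid_values, dtype=float), breaks, right=True)


# Hypothetical measured concentrations: the integers 1 to 100.
measured = np.arange(1, 101)
breaks = decile_breaks(measured)
classes = classify(measured, breaks)
counts = np.bincount(classes, minlength=10)
```

Applied to the measured values themselves, each of the 10 classes in this example holds the same number of samples, which is the sense in which each colour represents approximately 10% of the data.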

Examples of a geochemical map (As) and a mineralogical map (quartz) are shown in Figures 2 and 3, respectively.

Publication strategy

The objective of this project was to release the raw geochemical and mineralogical data as quickly as possible so that stakeholders could begin to use them. This was accomplished in 2013 with the release of USGS Data Series 801 (Smith et al. 2013a). The next goal was to release the geochemical and mineralogical maps. This was done in 2014 with the publication of USGS Open-File Report 2014-1082 (Smith et al. 2014). The final objective was to publish interpretations of the observed geochemical and mineralogical patterns. For this, an interactive website was developed and released in 2019 as USGS Scientific Investigations Report 2017-5118 (Smith et al. 2019). In this publication, the user can view all the geochemical and mineralogical maps, view an interpretation of the major geochemical and mineralogical patterns, download the raw data, view statistical graphics for each element and mineral, and download each map as either a TIFF file or in .KML format, which allows the map to be opened in Google Earth.

Engaging stakeholders

The input of stakeholders to the planning stage of the project was invaluable. The decisions and recommendations coming from the 2003 workshop provided the basis for carrying the project into the future. Being able to state that the design, sampling protocols and analytical protocols were the result of such inclusive discussions provided a degree of legitimacy that might have been missing if all decisions had been made by a small group of USGS geochemists.

While communicating with stakeholders is extremely valuable, the process can lead to some frustrations. A few stakeholders felt that because USGS personnel would be accessing sample sites for the soil geochemistry project, the sampling crews could easily take additional time on site either to collect additional samples or to make on-site measurements to suit the interests of the particular stakeholder. These requests came with no offer to provide funding. Project leadership quickly learned that it was necessary to have total focus on the objectives that were developed through the 2003 workshop and refined by the pilot studies. Any agreement to take on additional activities had to be made very carefully. Such activities, no matter how interesting or useful they were, could put the entire project in jeopardy by using up all the hard-earned funding before the core objectives were met.

While early communication with stakeholders through the workshop was successful, maintaining communication throughout the decade-long project was less so. The project tried to develop a website where stakeholders could easily leave comments but this also made it easy for hackers to trash the site at regular intervals. For this reason, the website had to be taken down permanently after only a few months. After that, communication was primarily through presentations at scientific conferences and by email. Given the advances in social media, communication with stakeholders would be much more efficient today.

Sample medium

The project was designed from the very beginning as a geochemical soil survey, so there was never any discussion or debate about the appropriate sample medium.

Sampling density

The decision to have a sampling density of one site per c. 1600 km2 was based on the report by Darnley et al. (1995) that recommended sampling for a global-scale geochemical survey be based on a grid system of 160 × 160 km cells. Each cell could be subdivided into cells that were 80 × 80 or 40 × 40 km. The members of the 2003 workshop focusing on sampling design felt that a density of one site per 6400 km2 (80 × 80 km) was too sparse. Sampling at a density comparable to a 40 × 40 km grid (one site per 1600 km2) was thus recommended for the US study.
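
The arithmetic behind these grid options is simple enough to check directly. The short sketch below is only a worked calculation, assuming square cells and the approximate area of 8 × 10^6 km2 quoted for the conterminous USA; the function name is hypothetical.

```python
AREA_KM2 = 8e6  # approximate area of the conterminous USA, from the text


def sites_for_grid(cell_side_km, area_km2=AREA_KM2):
    """Return (area per site in km2, approximate number of sites)."""
    cell_area = cell_side_km ** 2
    return cell_area, round(area_km2 / cell_area)


for side in (160, 80, 40):
    cell_area, n_sites = sites_for_grid(side)
    print(f"{side} x {side} km: one site per {cell_area} km2, about {n_sites} sites")
```

A 40 × 40 km grid implies about 5000 sites, which is consistent with the 4857 sites actually sampled, whereas an 80 × 80 km grid would have given only about 1250.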

Sample collection

Collecting samples at almost 5000 sites throughout an area the size of the conterminous USA (c. 8 × 106 km2) in a limited amount of time is a huge undertaking. At any one time, the project had as many as eight separate field crews working. This meant that it was essential that each field crew was thoroughly trained in the sampling protocols. The sampling protocols were simple enough that anyone with undergraduate training in soil science could successfully carry them out. Field trips with USGS personnel and personnel from other agencies engaged in sampling were conducted to make sure the sampling was standardized to the fullest extent possible. As students came onto the project each summer, they were taken into the field by one of the USGS scientists and taught the sampling protocols in detail.

Sample preparation and chemical analysis

Given that soil was the only material being sampled, it was an obvious decision that the <2 mm fraction would be analysed. As mentioned in an earlier section, this fraction is used worldwide for chemical analysis of soils and the US project needed to maintain consistency with these other studies.

The methods chosen for chemical analysis yield the total, or near-total, elemental content of the sample. Most of the procedures involve a four-acid dissolution step. Some minerals are naturally resistant to acid attack and do not fully dissolve. These minerals include chromite (FeCr2O4), cassiterite (SnO2) and rutile (TiO2). For this reason, soils containing those minerals may show concentrations of Cr, Sn or Ti that underestimate the true values. For example, Morrison et al. (2009), in a publication resulting from this project's pilot phase, compared Cr concentrations for chromite-bearing soils based on the four-acid extraction with a lithium metaborate fusion of the sample followed by ICP-AES. The fusion digestion provides a true total concentration. The authors found that Cr concentrations determined by the fusion method were up to 10 times higher than the concentrations determined by the four-acid extraction. By the fusion method, Cr concentrations in samples of soil formed on a chromite-bearing ophiolite ranged from 1200 to 11 670 mg kg−1, whereas the four-acid extraction gave values ranging from 488 to 2610 mg kg−1. Therefore, care must be taken in interpreting chemical analyses on soils that may contain these acid-resistant minerals. For soils that do not contain these resistant minerals, the four-acid extraction gives a good estimate of the total concentration.

It should be noted that the breakout group charged with making recommendations on chemical analysis selected methods that were available through the USGS contract laboratory. These methods, as discussed previously, provided analysis for a relatively large number of elements (see Table 4). However, the LRL for some important elements was higher than desired for such a project. For example, Ag had an LRL of 1 mg kg−1. This is about 19 times higher than the estimated average concentration of 0.053 mg kg−1 (Rudnick and Gao 2003) in the Earth's upper crust. This resulted in well over 99% of the Ag values determined in this study being reported as <1 mg kg−1. Other elements with LRLs higher than would be desired include (% below LRL in the soil C horizon is shown in parentheses) Cd (45%), Cs (73%), Se (57%) and Te (96%). One of the primary improvements that could be made if such a continental-scale geochemical soil survey were repeated would be to analyse for more elements and have the LRLs for all elements well below the crustal average.

Analytical methods that would have added such important elements as Au, platinum group elements and rare earth elements would have been prohibitively expensive given the available budget. The cost to analyse more than 15 000 samples (including QC samples) for 44 elements plus forms of C in this survey was about US $1 million. Adding just the elements listed above to the project would have pushed the analytical costs beyond US $1.5 million. This was a cost the project could not absorb.

Pilot studies

The 3 year pilot phase of the project was absolutely essential to the success of the full continental-scale project. The pilot studies allowed the sampling protocols to be tested and refined prior to implementing the full sampling. As discussed previously, many of the recommended procedures coming from the 2003 workshop were found to be too time-consuming and expensive to carry out in the full project.

Project ‘add ons’: quantitative mineralogy and microbiology

Recommendations from the 2003 workshop included adding quantitative mineralogy by XRD to the study. After the pilot studies, it was decided to include this component for samples from the soil A and C horizons (9575 samples). The importance of this dataset in the interpretation of the observed geochemical patterns cannot be overestimated.

There was also much interest in the 2003 workshop in adding a microbiological component to the survey. The pilot studies included a strong microbiological component (see Table 3). After the pilot study, the microbiological protocols for the full US study were modified to include: (1) the determination of Bacillus anthracis in all samples collected from a depth of 0–5 cm, and the determination of Yersinia pestis and Francisella tularensis in 2000 samples from the same depth (Griffin et al. 2014); and (2) PLFA and soil enzyme analysis on about 10% of the samples collected from the soil A horizon (Waldrop et al. 2017).

International collaboration

As discussed previously, the project was initially a collaborative effort between the USGS, GSC and SGM with the goal of mapping the soil geochemistry for the entire continent of North America. This status was maintained until about 2009 when the GSC dropped out of the project. The GSC had new leadership and went through a strategic planning process whereby the continental-scale soil geochemistry effort was deemed not to be of sufficient priority to justify funding. This was perhaps the biggest disappointment encountered during the lifetime of the project. The USA and Mexico carried out their work individually after that. As of the time of writing, early 2022, the dataset from Mexico has not yet been released.

Support of USGS management

A project of this duration and expense could not have been completed without the full support of USGS management. The project was fortunate to have this continuing support from multiple USGS directors and acting directors who served during the project as well as support from the USGS Mineral Resources Program, which provided all funding for the project.

Sample archives

Splits of each sample collected during this study have been archived and are available for study. Several investigators have taken advantage of this archive but one thing has become clear: hardly any organization is willing to conduct their work on all the collected samples. The hopes were that future studies would be truly national in scope and would provide determinations on all the collected samples, or at least all of one sample type. However, it quickly became evident that most investigators were only able to conduct studies on a small subset of the sample archive. This usually consisted of perhaps 200 samples from a specific area of interest or representing a transect across an area of interest.

Throughout the course of the project, several important stakeholders requested in-person briefings on the project. These included the EPA, the US Department of Agriculture, the US National Committee for Soil Science of the National Academy of Sciences, the Assistant Secretary of the Interior for Water and Science, the Association of American State Geologists, the Senate Committee on Energy and Natural Resources, the House Committee on Natural Resources, and the White House Office of Science and Technology Policy.

Upon release of the data and maps in 2013 and 2014, respectively, the project received communications from various people and organizations congratulating the USGS on completing the project. Here are just three examples:

Simply put, the maps offer the most complete profile of the chemical and mineral makeup of U.S. soil ever produced. It is truly significant data, and the maps produced from it should be an important tool for anyone who designs, manages, or, well, just lives on the land

Jennifer Reut, Landscape Architecture Magazine.

The data isn't so fine that it will tell you what lies in your backyard behind the raspberry bush. But it will show you the metal and mineral patterns that color your part of the world and they will remind you – as they should – of the astonishing and diverse chemistry that the planet creates under our feet

Deborah Blum, Pulitzer Prize-winning science writer,

[C]ongratulations again on such a huge contribution to the field of geochemistry. USGS should be very proud of seeing this through – awesome is an overused word but in this case totally justified!

Fiona Fordyce, British Geological Survey.

To date, the data and maps have been cited more than 300 times in the scientific literature, Master's and PhD theses, and state and federal government publications. Topics of these publications include public health, global climate change, forensics, environmental regulation, food safety, homeland security, ecosystems and mineral exploration, as well as basic geology, geochemistry and mineralogy. These continental-scale geochemical (and mineralogical) mapping projects generally take several years to plan and execute. In addition, the extensive fieldwork and chemical (and mineralogical) analyses cost a lot of money. However, the resulting data, maps and sample archives will be used by generations of Earth scientists, public health specialists, environmental regulators, climate scientists and specialists in other fields. The initial investment is large and there will be frustrations along the way, but the effort will pay dividends for decades to come.

This continental-scale geochemical mapping project could not have been completed without the numerous USGS personnel who helped with sample collection and sample preparation. In addition, assistance in sample collection was provided by the Natural Resources Conservation Service, the Minnesota Geological Survey, the Pennsylvania Geological Survey, and the University of Nebraska's Conservation and Survey Division. The project could not have been completed without the unflagging support of Kate Johnson, Coordinator of the USGS Mineral Resources Program from 1999 to 2011.

Thoughtful reviews by Paul Morris and Alecos Demetriades greatly improved the manuscript.

DBS: project administration (lead), writing – original draft (lead), writing – review & editing (lead).

This work was funded by the United States Geological Survey Mineral Resources Program.

The author declares that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

All geochemical and mineralogical data generated in this project are freely available in USGS Data Series 801 (Smith et al. 2013a, as shown in the References).

This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License (