Despite their importance to environmental issues and mineral resources, surficial sediments are largely unmapped at depth in the United States. Full three-dimensional (3D) models of these materials are needed to support hydrologic modeling, geotechnical engineering, and mineral-resource inventory applications. The main source of information for such models in the Midwestern United States is lithology logs from public water-well records. Three-dimensional modeling of lithology was conducted to elucidate the nature and quality of spatial information sourced from water wells. The modeling was conducted on an ∼130-km2 area near Lake Erie in northeast Ohio that contained an end moraine superimposed on a buried glacial valley. An integrated approach to 3D modeling was adopted where traditional interpretive techniques were used to define glacial stratigraphic units (glacial outwash, till, etc.) with geostatistical simulation of lithofacies conducted within the stratigraphic layers. Despite the large amount of variability and noise inherent in water-well data, there were statistical patterns in the lithology records related to glacial stratigraphy. In contrast, the well data provided only minimal information on facies geometry within these units. The spatial structure of facies in the vertical direction was based on thickness statistics from the water-well data and on geologic interpretation for the horizontal direction. Several sequential indicator simulation models were conducted to investigate the effect of grid-cell thickness, stratigraphic trimming, and length-to-thickness ratios on the reproduction of thickness statistics. This case study confirms that water wells are a viable data source for stratigraphic modeling, but better data sources (e.g., outcrops and geophysics) are needed to parameterize facies simulations.


Maps and models of surficial materials are key to solving current problems in geology and the environmental sciences. Hydrologic studies that address issues such as water supply and non-point source pollution require accurate and realistic representations of unconsolidated sediments at depth. Surficial materials maps are also essential to engineering and geologic hazard investigations such as risk assessments for seismic damage and landslides. Despite the importance of surficial deposits, mapping at depth is rare across the United States. When available, surficial models are typically in the form of two-dimensional (2D) maps drawn at county and state scales. Such maps are common for states with glacial deposits. An additional source, the digital soil survey (SSURGO, available from the United States Department of Agriculture, Natural Resource Conservation Service (Gabriel et al., 1992)), provides information at the county scale to an average depth of ∼1.5 m. For most areas, the depth below the surveyed soil to the bedrock interface is unmapped. The Ohio Division of Geological Survey is conducting three-dimensional (3D) mapping of surficial materials, 1:100,000-scale, as a first step toward filling this knowledge gap (Venteris, 2007; Schumacher, G.A., Venteris, E.R., and Swinford, E.M., unpublished data, 2007). In addition to this work, quantitative techniques for mapping are being investigated to better understand the spatial variability of glacial sediments and the quality of geologic information available from lithology logs sourced from public water-well databases.

Three-dimensional modeling techniques are essential for the mapping of surficial materials due to the complexity of glacial sediments. The glacial stratigraphy in Ohio is the result of a lengthy depositional and erosional history involving several oscillations of the Laurentide ice sheet. This complexity is especially reflected in the sediments of the buried valleys, which typically contain alternating deposits of till, lacustrine sediments, and >100-m-thick out-wash. Such valleys are often a major source of groundwater and the subject of numerical modeling studies involving contaminant distribution and water supply. Three-dimensional geologic models are required to define the physical characteristics of this highly heterogeneous flow medium. In addition, 3D models reduce the abstraction inherent in traditional maps and cross sections, making such models invaluable for describing and illustrating complex glacial geology to the general public.

The quality and quantity of available data creates serious challenges to the mapping and modeling of surficial sediments. The main source of data at depth for this and similar studies is lithology logs from public water-well records (Ohio Department of Natural Resources Division of Water (ODOW), 2007). Well records contain lithologic information (sediment type, thickness, and color) filed by private water-well companies. These wells have a typical spatial density of four wells per square kilometer, which has been found to be insufficient to characterize the lateral spatial structure of alluvial and buried valley sediments by traditional geostatistical structural analysis (Weissmann et al., 1999; Ritzi et al., 2000). The depths of these wells vary widely, but rarely extend beyond 50 m, often leaving large intervals of buried valleys with little or no data upon which to constrain models. In addition, the lithologic information contained in individual water wells ranges in quality, from highly detailed borings that can provide reliable information on both stratigraphic units and facies, to clearly erroneous records. Careful evaluation and interpretation of the water-well data set is required before their use. Water-well data can occasionally be supplemented from detailed borings (from private sector studies, the Ohio Environmental Protection Agency, or the Ohio Department of Transportation), which contain penetration tests and full laboratory analysis of texture. Such detailed records were not available for the immediate study area.

This work seeks to create 3D models based on water-well data using a combination of stratigraphic interpretation and geostatistical simulation. Geologic interpretation, confirmed by statistical analysis of the well data, is used to define regions where the assumption of stationarity is reasonable (Carle et al., 1998, Weissmann and Fogg, 1999; Ritzi et al., 2000; Proce et al., 2004). Facies heterogeneity is then modeled within the defined stratigraphic units using sequential indicator simulation (SISIM).

Spatial variability in lithology and hydrologic parameters at the facies scale is usually modeled using indicator geostatistics (Journel, 1983; Carle and Fogg, 1996; Ritzi et al., 2000). In this approach, the sedimentary facies are divided into a set of categorical variables to be estimated or simulated. The set of indicators can vary in detail from several texture classes to binary approaches that divide textures into categories of high and low conductivity. Simulated facies can then be assigned engineering or hydrologic properties (porosity or saturated hydraulic conductivity) by using average values or through further geostatistical simulation (Feyen and Caers, 2006).

The indicator simulation approach is justified on practical and theoretical grounds. Previous studies of fluid flow within consolidated (Deutsch, 2003) and unconsolidated sediments (Ritzi et al., 2000; Weissmann et al., 2002; Carle et al., 2006) have demonstrated that accounting for the heterogeneity of facies within sedimentary units is critical to realistic modeling (as opposed to the simpler approach of assigning average values of hydrologic parameters to each stratigraphic unit). The generation of many equiprobable models (realizations) greatly facilitates uncertainty assessment. In addition, the indicator approach provides solutions to many statistical difficulties (Journel, 1983; Deutsch, 2003). For example, hydraulic conductivities occur over a wide range of magnitudes and show strong partitioning between sediment types. Hence, the modeling of hydraulic conductivity as a continuous variable presents difficulties because assumptions of normality and-stationarity are rarely met. On a practical level, water wells and current geophysical techniques typically do not provide information of sufficient quantity or quality to allow explicit mapping at facies scales. The multiple realizations generated by geostatistical simulation provide a method of accounting for facies heterogeneity that directly incorporates the considerable uncertainty into the work flow. Finally, simulating lithology types rather than flow parameters allows the direct comparison of modeling results to traditional geological maps and models (Venteris, 2007).

This work seeks to better understand the potential uses and limitations of water-well data for the creation of 3D models at the stratigraphic and facies scales. It is critical to obtain a better understanding of the information contained within this data set. For much of the state of Ohio, and most of the industrialized world, similar well databases represent the main source of information about the shallow subsurface.

Study Area

A buried valley near Lake Erie was the focus of this modeling exercise, which was initially conducted to support qualitative mapping of portions of the Ashtabula 30 × 60-min quadrangle (Venteris, 2007; Schumacher, G.A., Venteris, E.R., and Swinford, E.M., unpublished data, 2007). In general, the study area (Fig. 1) is glaciated with extensive deposits of Wisconsinan and some Illinoian-age drift at depth (White and Totten, 1979). Near Lake Erie, ice proximal (till, kames, and outwash) and lake deposits (lacustrine deposits and beach ridges) cover the Portage escarpment (Brockman, 1998). Farther inland, the depositional environment transitions to till plains and buried valleys. A key feature of most buried valleys in this region is that they were ice-dammed to the north (Bagley, 1953). These buried valleys are different than those studied to the south that drained into the Ohio River (Ritzi et al., 2000). In general, the blocking of drainage resulted in the deposition of a greater amount of lacustrine facies and a higher proportion of fine sediments than in valleys with free drainage.

The area where the Painesville end moraine is superimposed on the buried valley (in the southwest corner of the 1:24,000 Ashtabula South quadrangle) is the subject of this modeling study (Fig. 1). On the immediate surface, the study area contains end moraines, beach ridges, and lacustrine sediments. Till is the dominant deposit, especially at depths below 3 m. In general, the upper layer is a till with an average thickness of 12 m, which contains occasional sand and gravel lenses. Below this till is the main subsurface feature of the study area, a major north-south bedrock valley. The feature is approximately four miles wide and contains up to 100 m of glacial sediments. The valley is filled with till (overlying) and sand and gravel deposits interbedded with lacustrine and probably remnant till facies (Venteris, 2007). The area has been mapped at the surface many times (White and Totten, 1979; Pavey et al., 1999), but studies at depth have not been published.


Preprocessing of Well Data

The first step in modeling was to convert the water-well data to a format appropriate for geostatistical modeling. The water-well database from ODOW consisted of lithologies described using three-letter codes (for example, CLA for clay and DRF for drift). For this study area, there were sixty-five different codes describing sediment, rock, and anthropogenic material such as fill. These descriptions required simplification for modeling, a process that included interpretation and generalization. A lookup table was created to reclassify the lithologic descriptions into four main lithologies (clay, silt, sand, and gravel). Subsequent comparisons between the water-well data and more detailed lithology logs from bridge borings for Ashtabula County suggested that silt was grossly underrepresented 01(Table 1). It was likely that the “clay” units of the water-well records contained lithologies ranging from clay to silt. These four texture classes were modeled using sequential indicator simulation (SISIM) techniques in Venteris (2007), where the goal was to aid decision making for qualitative mapping. For this work, the goal was to create a model appropriate for groundwater modeling. Accordingly, the lithologies were further grouped into materials with high and low hydraulic conductivity (Fig. 2). Ritzi et al. (2000) found that simple binary models dividing the domain into fine-grained (abbreviated m, representing clay and silt) and coarse-grained (abbreviated s, representing sand and gravel) hydrofacies captured most of the important variation in hydraulic conductivity. In addition, by defining categories that group clay with silt, the issue of the under-detection of silt in the water-well data set was avoided.

The wells were discretized in the vertical direction to facilitate modeling. Lithologies for wells in the original ODOW database are assigned over-depth ranges; the upper and lower contacts for each lithologic unit are provided. After conversion to indicator codes, the water wells were discretized at 0.305-m (one-foot) increments. The interval represents the minimum thickness of data in the original records.

Stratigraphic Modeling

A stratigraphic 3D model was developed for the study area based on the water wells and geologic interpretation. In general, the study area was interpreted as containing a glacial till layer overlying a glaciofluvial system. Thin (<3 m) surface features apparent in Figure 1 were not incorporated into the stratigraphic model. The top of the till layer was defined by a digital elevation model (Powers et al., 2002). The elevation of the interface between the glaciofluvial system and overlying till was determined through analysis and interpretation of the water-well data. The data were processed initially to extract the bottom of the first recorded clay layer for each location. This was used as the first approximation of the depth of till. An iterative process followed wherein an interpolated surface was created (kriging using Geostatistical Analyst, ESRI, 2007), and each data point was evaluated for potential exclusion. Wells with values significantly above or below the probable elevation of the till/glaciofluvial interface were removed, and the surface was recalculated. In effect, the position of the interface at unreliable wells (often too shallow due to sand lenses or too deep due to over-generalized records) was interpolated from the reliable values at surrounding wells. During this process, five wells were considered erroneous and removed from all subsequent modeling. The bottom of the glaciofluvial system was defined by the bedrock topography map (Fig. 2), which was created mainly by hand contouring (Powers and Swinford, 2004). The bedrock topography contours were edited to account for new unpublished data collected using seismic refraction. The new study was inconsistent with a side valley that was drawn on the southeast, bottom edge of the map; therefore, it was removed, and the bedrock topography was reinterpreted.

Statistical and Geostatistical Modeling of Facies

The general modeling approach was to create a series of 3D grid models (voxel or “brick” models) using SISIM (using SGeMS software, v1.4; Remy et al., 2006). The key task was specifying the facies proportions and spatial structures for input into the simulation algorithm.

Spatial structures for the models were defined using the approach of Ritzi (2000). Typical geostatistical practice is to define spatial structures empirically by the calculation of an experimental variogram from the data and then fitting a model variogram for use in estimation and simulation. However, it is well established that variograms derived from water-well data are unreliable (Weissmann et al., 1999; Ritzi, 2000; Ritzi et al., 2000), and alternative methods are needed to model spatial structure. Experimental variography was attempted on this data set but did not produce useful variograms, especially in the horizontal direction. Accordingly, length statistics from the well logs were converted to structural ranges for input into SISIM (Ritzi, 2000). The approach is outlined below. Firstly, Carle and Fogg (1996) showed that the mean facies (k) length (l) in a given direction (u) is related to the derivative of the autovariogram at the origin  
where pk is the facies proportion, and h is the lag or separation vector. Taking the derivative of the spherical or exponential model variograms (Deutsch and Journel, 1998) at h = 0 and using c = pk(1 – pk) (for a binary variable) to define the positive variance contribution term (c), yields the equation  
where ϕ is 3 or 1.5 for the exponential and spherical models, respectively, and ak is the structural range of the autovariogram. Combining equations one and two gives the result  
which relates the autovariogram range to the mean facies length and proportion.

Solutions to Equation 3 were obtained from statistics calculated from the discretized water-well data, grouped by the till and glaciofluvial stratigraphic units described earlier in this paper. The proportions and mean vertical lengths (thicknesses) were calculated for each facies. Clustering bias was checked using the DECLUS routine in GSLIB software (Deutsch and Journel, 1998) and was found not to be an important source of error. Structural lengths for the simulations were based on the mean thickness of facies s. This approach differs from that of Ritzi et al. (2000) and Proce et al. (2004), where facies m was used to avoid the potential bias in s due to incomplete penetration. In this study area, the majority of the wells penetrate to bedrock. Also, for the till layer, the thickness of facies m is largely controlled by the layer geometry, which is not stationary. Facies s, within the till, is a minor constituent, and hence is less affected by thickness variations in the till sheet.

As in previous studies, constraining the structural model in the horizontal direction proved difficult. Experimental variography did not provide a useful model, and there was no geophysical or outcrop information available. In the absence of information on the lateral extent of facies, sensitivity studies were conducted over length-to-thickness ratios ranging from 1 to 1 to 500 to 1. A single realization was generated for each ratio using the same random seed number. It was found that ratios ranging from 1 to 1 to 100 to 1 remained faithful to the model proportions, with the range of mean thickness of facies s being 0.20 m for the till layer and 0.5 m for the glaciofluvial layer. Large length-to-thickness ratios (>100 to 1) decreased the proportions and thickness of facies s. Over the studied range of length-to-thickness ratios, maxima in v occurred at a ratio of 10 to 1 for facies s in the till layer, and 25 to 1 in the glaciofluvial layer. In the absence of reliable information on the lateral extent of facies, these ratios were used in further simulation experiments. Subsequent analysis was focused on the reproduction of thickness statistics, and detailed analysis of horizontal lengths was postponed until better constraining information becomes available.

Conditional and unconditional simulations were conducted to explore several modeling issues. Several grid dimensions were used in this study and are summarized in 02Table 2. Unconditional simulations conducted on grids A and B were used to explore the impact of cell thickness on the v of generated facies. The dimensions of these test grids were chosen to minimize the effect of the top and bottom bounding surfaces on the simulation and thickness statistics (Emery, 2004). Three simulations with different seed numbers were conducted on the B grid to compare length statistics between runs with otherwise identical parameters. Finally, conditional and unconditional simulations were run over the entire domain of the well data (grids C, D, and E) with grids of reduced resolution. Facies proportions and thickness distributions were generated for all simulation runs to compare the results.


Glacial Stratigraphy

The hypothesized stratigraphic model was a layer of Wisconsinan till over a glaciofluvial system (Fig. 3). This preliminary conceptual model was developed from geomorphic relationships apparent in surface maps and a few detailed borings outside of the study area (in the same buried valley; Venteris, 2007). Ultimately, the stratigraphy should be tested with additional detailed borings and perhaps geophysical data. However, for this work, the main use of stratigraphy was to delineate regions where the proportion and spatial structure of facies were similar. Therefore, the stratigraphic model was at least partially validated if the statistics calculated from the well data show contrasts between the units. The proportion of facies s and the coefficient of variation (cv) are less in the till layer than in the underlying glaciofluvial system 03(Table 3), confirming that the model delineates a contrast in facies structure.

Sand and gravel deposits were found at depth throughout nearly the entire study area (Fig. 4), raising interpretive questions. In particular, facies s was common below the northeast-southwest-trending Painesville end moraine (prominent topographic high apparent in Fig. 3). The origin of the sand and gravel core was unknown, but determining its nature remained important for hydrofacies modeling. The core may be an old beach ridge that was subsequently covered with till or an erosional remnant from a larger glaciofluvial deposit. Arbitrarily, the latter interpretation was adopted for this model. Facies simulations were based on the assumption that there was an ice proximal glaciofluvial deposit covering the study area that was subsequently truncated (the till/glaciofluvial interface is interpreted as an erosional surface). By this interpretation, the spatial structure of facies was assumed constant throughout the glaciofluvial layer.

Preliminary Unconditioned Simulations

Unconditioned simulations were conducted to help choose the structural model and grid parameters. The model for spatial structure was based on the statistics generated from the well data 03(Table 3) and the results of previous heuristic analysis (Ritzi, 2000). The facies for modeling and the functional form (linear, spherical, and exponential) of the variogram must be chosen for each stratigraphic unit. The glaciofluvial system had a cv greater than one, so the exponential structural model was appropriate. The structural range was nearly identical between facies m and s03(Table 3). The till layer was more problematic because the cv was nearly one, and the structural ranges were not the same between the facies (the proportion of s in the till layer was so small that it had little practical importance, and it was modeled mainly for illustrative purposes). Facies s was chosen for this layer because the thickness of m is known not to be constant over the domain due to the geometry of the till sheet (Fig. 3). Models using both exponential and spherical variograms were conducted for the till layer.

The values of v from the models showed sensitivity to the thickness of the grid cells. For grid A, the cells were thinner than the minimum recorded thickness in the well data, and this was strongly reflected in the results. For the till layer, the mean and median thicknesses were too thin, and there was a greater proportion of facies m04(Table 4). The model for the glaciofluvial system matched the proportions correctly but was also too thin. The grid with a 1-ft interval (B) was more appropriate for comparison because the minimum thickness for the model and well data were equal. Accordingly, the match between the models and the well data was improved. There was additional, if minor, improvement when using the exponential over the spherical model for the till layer. Overall, v was increased and the quantiles matched better, although they were still too thin 04(Table 4). The model and well histograms were of similar shape (Figs. 5 and 6). However, histograms for the wells from both stratigraphic units show a larger proportion of middle and extreme thicknesses than the simulation models. For all models, the standard deviation of thickness was much less than the well data.

Additional runs were conducted with different random seed numbers to illustrate the variability in the thickness statistics between realizations. Most of the variation between runs with different seed numbers occurs at the tail end of the distribution, above the third quartile 04(Table 4).

Simulations over the Full Domain

Further simulation was conducted to make models for visualization and to investigate the effects of trimming the models to stratigraphy and data conditioning on the thickness statistics. Initial attempts at modeling used the dimensions of grid B (extended to cover the entire domain), but produced files that were too large for efficient visualization on a personal computer (using EVS-Pro software (Ctech, 2007)). Therefore, a coarser grid (C) was used for these experiments.

Thickness statistics were relatively insensitive to data conditioning and stratigraphic trimming. Using a thicker grid cell (0.61 m) further increased the mean, standard deviation, and quantile thicknesses of all runs, bringing the distributions in closer agreement with the well data 05(Table 5). Trimming the realizations by the stratigraphic surfaces (Figs. 7 and 8) reduced the mean and maximum thickness of facies s for both layers, but the effect was not large. Likewise, data conditioning had little effect on the distribution of thicknesses other than forcing a match between maximum thicknesses between the model and well data.


That SISIM does not reproduce the length statistics of the input data is well known from earlier studies (e.g., Carle et al., 1998). However, the comparative analysis of histograms presented here illustrated the issue in more detail. The results were sensitive to the thickness of the grid cells. The vertical cell size set the minimum thickness of the model distribution, and many single cell thicknesses were generated. The potential error from choosing an inappropriate vertical resolution was especially apparent when it was set below that of the well data. Also, in cell-based modeling, each thickness for a sedimentary layer is defined by a series of adjacent cells with like lithology. The probability of a continuous series of facies over a given length decreases as cell size decreases because there are more chances to interrupt the sequence. Mean thickness was made to match the well data by progressively increasing the thickness of the grid cells 05(Table 5). Doing so improved the fit of the average length and standard deviation, but degraded the fit for all quantiles except the maximum.

There are many proposed solutions to improving the match between the model input and output parameters such as simulated annealing (Deutsch and Journel, 1998) and the quenching step contained in TPROGS software (Carle, 1998). The usual justifications for annealing are to improve the fit of the realizations to the input parameters (proportions, etc.), improve the results of groundwater simulations, or bring the realizations closer to the geologists' concept of facies heterogeneity (Deutsch, 2003). Alternate simulation algorithms and post-processing are beyond the scope of this paper because there is little independent information outside of the thickness statistics to evaluate competing models. Also, the results contained conflicting information as to the quality of fit of the SISIM models to the well data. Clearly, the mean and standard deviation of facies thickness did not match, but these measures of center and spread are intended for normal distributions (the distributions are not truly log normal; therefore, simple transforms do not address the issue). For grid resolutions B and C, the fit of the overall distributions as measured by the quantiles seems reasonable, especially when considering the quality of the well data. However, unconditioned SISIM outputs a thickness distribution of consistent shape (Figs. 5 and 6), and any deviation of the well histogram from that shape cannot be accurately reproduced by SISIM alone.

The modeling exercise illustrates the current limited knowledge about the geology of the study area and the gains in that knowledge that can be made using water-well data. Firstly, the interpreted stratigraphy model of till on the glaciofluvial system was very simple, but clearly partitioned the data by proportion of facies. As a whole, the water-well data set provided meaningful, if incomplete, information on glacial stratigraphy. Further study was needed to determine if the sand and gravels below the end moraine were from the same deposit as those found in the buried valley. In contrast, the water-well data provided very limited information on lithofacies geometry. While it was possible to generate statistics on facies thickness, individual well records showed a wide range in quality due to lumping and splitting issues, filing of erroneous locations, lithologies, etc. Such issues certainly created a bias in the length statistics, but that bias was not quantifiable without high-resolution, independent data. In addition, due to the spatial intensity and accuracy of the water-well data, no information could be obtained from the well data on the lateral continuity of facies. The horizontal ranges could be based on the maximum thickness obtained from the sensitivity analysis as presented above, or from geologic interpretation. Geostatistical analysis based on the interpreted 3D map of the area (Venteris, 2006) suggested a length-to-thickness ratio of 100 to 1 for these sediments. Thickness statistics generated from this model 05(Table 5) were generally thinner than the previous runs based on smaller ratios, but the standard deviation was increased closer to the target values.


Due to the quantity and quality of available data, the effective modeling of surficial materials necessitates the collaboration of geoscientists from many disciplines and perspectives. Geostatistical modeling requires a stratigraphic framework to define regions of like spatial structure (stationarity). Given the variable quality of water-well records and the complexity of glacial sedimentary environments, stratigraphic mapping based on water-well data and geologic knowledge is a viable approach. The veracity of such stratigraphic modeling can at least be partially checked through statistical analysis of the well data. However, public water-well data are generally not of sufficient resolution to allow interpretive, explicit mapping of facies. Geostatistical simulation or other Monte Carlo techniques provide a convenient method to generate multiple equiprobable models upon which to base uncertainty analyses. As in this study, such simulations are often limited by the lack of realistic models of the spatial structure of lithofacies, particularly the horizontal length of layers. Facies studies based on detailed 3D outcrop models (as described in other papers in this special issue of Geosphere), close proximity borings, and high-resolution geophysical data are needed to create training images (Strebelle, 2002) and geostatistical structure models to produce more realistic facies simulations for glacial sedimentary environments.

This work was partially supported by the Central Great Lakes Geologic Mapping Coalition, United States Geological Survey Cooperative Agreement No. 04ERAG0061. Gary S. Weissmann and Robert W. Ritzi Jr. provided helpful reviews, which resulted in significant improvements to the modeling and the manuscript.