Field geology has traditionally relied on two-dimensional, paper-based workflows. Digital mapping techniques are rapidly replacing paper ones, and three-dimensional (3-D) terrain models and 3-D visualizations have the potential to revolutionize field studies, yet to date few studies have embraced this technology. The development of structure-from-motion (SfM) photogrammetry has allowed routine production of high-resolution terrain models from a series of photographs taken at arbitrary angles using “multiview stereo” (MVS) software. However, few studies have applied the MVS approach outside of specific, highly controlled field environments that are easily accessible. In this study, we examine methods for ad hoc application of ground-based MVS in remote field areas with large-scale (>2 km²) multifaceted topography and complex geology. Specifically, we emphasize methods that could be employed in a typical geologic field study without the use of specialized equipment beyond a camera, and we identify various pitfalls that can be avoided during this type of work. We present several scenarios that illustrate the different ways that MVS can be implemented in the field. These scenarios vary with respect to: (1) the manner in which ground control points (GCPs) are collected and distributed; (2) the baseline-to-distance ratio of the imagery; (3) the number of photographs taken; and (4) the type of camera used. Each scenario yields 3-D terrain models from which plane orientations can be extracted and upon which 3-D linework can be drawn. We caution that if absolute accuracy (the difference between the location of the objects on the model and their true position on a geodetic coordinate system) is critical to a project, then great care must be taken in using MVS models obtained solely from ground-based photographs because several factors can contribute to spatial errors as large as hundreds of meters over scales of a few square kilometers. The two primary factors that contribute to these significant spatial errors in MVS models are (1) the distribution and positional accuracy of GCPs and (2) the baseline-to-distance ratio. Nonetheless, MVS is a tool that can easily be applied to any field study regardless of terrain complexity, scale, or accessibility, and it has the potential to revolutionize field studies, particularly in areas with steep terrain.

Until recently, field geology techniques had changed little since the advent of geologic mapping itself (i.e., Smith, 1815), but new technologies now promise to revolutionize all aspects of field geology, from initial data acquisition to final interpretations. This revolution began with the replacement of paper-based mapping by field GIS-based digital mapping (e.g., Jones et al., 2004; Clegg et al., 2006; Pavlis et al., 2010; Whitmeyer et al., 2010), but these developments are only the vanguard of further technological advances (Pavlis and Mason, 2017). We believe that the most important among these advances is the photogrammetric technique based on the structure-from-motion (SfM) algorithm (e.g., Longuet-Higgins, 1981; Faugeras et al., 1987; Weng et al., 1988; Viéville and Faugeras, 1990; Crowley et al., 1992; Faugeras, 1993; Taylor and Kriegman, 1995; Westoby et al., 2012), the implementation of which has become more widespread due to recent improvements in computational power. SfM has gained prominence as a result of its inclusion in popular “multiview stereo” (MVS) software packages (Furukawa and Hernández, 2015) such as Pix4D and Agisoft PhotoScan, which allow construction of three-dimensional (3-D) terrain models from a series of photographs taken from arbitrary camera positions.

Terrain models generated at various scales have been available for nearly 20 yr from airborne (aircraft- to space-based) and ground-based platforms, primarily using either radar or lidar (light detection and ranging) instruments. Similarly, terrain models in the form of conventional topographic contour maps generated with conventional photogrammetric methods from stereo aerial photographs have been available since the dawn of aviation. Nonetheless, data from high-resolution methods are not always available, and high data acquisition and data processing costs limit their applicability (e.g., Smith et al., 2016; Pavlis and Mason, 2017). In contrast, with MVS, a single person armed with a good camera for imaging at ground level, or lofted in the air by a drone, and a global navigation satellite system (GNSS) receiver for spatial positioning is capable of acquiring data that can easily generate terrain models with resolutions comparable to that of lidar (Pavlis and Mason, 2017). Moreover, MVS can be rapidly deployed for any required duration at a fraction of the cost of other methods.

Although MVS is poised to transform the way we do field geology (Pavlis and Mason, 2017), the techniques involve principles not widely understood outside the small community of experts in photogrammetry and software engineers developing this technology. Excellent tutorials exist on using MVS software (e.g., Agisoft, 2017; Matthews et al., 2016), but MVS software is so user friendly that a complete novice armed with only a smartphone’s built-in camera can generate a 3-D outcrop model in minutes with applications like SCANN3D. This ease of use leads to “black box” approaches by typical users, yet misunderstanding the limitations of these systems can lead to major errors. What those errors are is not widely known in the geoscience community.

In this paper we evaluate this problem through an analysis of the results of a case study. The case study represents a realistic scenario that would be faced by a non-expert in photogrammetry attempting to apply MVS to a field geology problem. Thus, one goal of this paper is to illustrate some pitfalls that are likely to lead to dead ends that waste field time, laboratory time, or both. Similarly, another goal of this paper is to suggest best practices for the most productive application of MVS methods in the field.

There is extensive literature on the accuracy of terrain models using MVS in controlled scenarios (e.g., Bemis et al., 2014; Colomina and Molina, 2014; Siebert and Teizer, 2014; Carrivick et al., 2016; Wilkinson et al., 2016; Cawood et al., 2017; Mosbrucker et al., 2017). Specifically, comparisons between models derived from terrestrial lidar and from MVS have been thoroughly evaluated in controlled experiments (e.g., Wilkinson et al., 2016; Mosbrucker et al., 2017). Nonetheless, these studies have generally focused on hand specimen– to individual outcrop–scale 3-D models (e.g., DePaor, 2016; Wilkinson et al., 2016; Cawood et al., 2017). To our knowledge, there have been no studies to date that consider the realistic field geology scenario of acquiring these data, ad hoc, during the progression of a field study, particularly over macroscopic scales of a few kilometers.

The analysis we present in this paper was made possible by a serendipitous study (Brush, 2015) in which MVS-derived terrain models were acquired in an ad hoc fashion during the course of fieldwork in an area where we were experimenting with techniques for 3-D mapping on a lidar-based terrain model. The result was that our MVS models overlapped extensively with the lidar models, which allowed us to directly compare the two data sets. As such, we are able to assess the results of different 3-D mapping methods, specifically the accuracy of MVS models, in realistic situations where terrain, field time, and field logistics limit data acquisition, rather than in controlled experimental conditions typical of other studies (e.g., McCaffrey et al., 2008; Westoby et al., 2012; Wilkinson et al., 2016; Cavalli et al., 2017). In addition, our underlying approach deviates strongly from that of most geoscience studies to date (e.g., see the extensive summary in Carrivick et al. [2016]) that have focused primarily on relatively flat terrain and morphometric terrain analyses. Instead, our study emphasizes the use of MVS models for bedrock geologic mapping, a problem that places distinctly different constraints on the required model accuracy.

Our emphasis in this paper is that our ad hoc MVS workflow is a likely scenario for most casual users because of the ease of use in acquiring these data. Thus, our results are critical for future users as this technology becomes widespread. During the course of this study, the use of drones became more widespread, but we continued to focus on ground-based methods so that their application to field studies could be evaluated separately from methods used by drones and/or other unmanned aerial vehicles (e.g., Pavlis and Mason, 2017). In particular, we emphasize a number of key issues with MVS that are not obvious at first glance:

  1. Although MVS photogrammetry is user friendly and robust in its ability to construct a 3-D model through iterative numerical techniques, the accuracy of the resulting MVS model is still dependent upon fundamental principles of photogrammetry. These include model dependence on the baseline-to-distance ratio (baseline length of the image array versus distance to the features being imaged), distribution and accuracy of ground control points (GCPs), and the quality of the camera and camera settings used to acquire the imagery (e.g., Wolf and Dewitt, 2000).

  2. Imagery acquired exclusively from ground-based platforms suffers from the same problems as terrestrial lidar where limited look angles produce model distortions (e.g., Cawood et al., 2017) that can be easily misinterpreted. More importantly, MVS models can carry other potential errors, which can produce pitfalls that are not obvious a priori. For example, inconsistent lighting, changing shadows, or a change from dry to wet ground during photograph capture can lead to miscorrelations in the SfM processing step, producing distortions that can be difficult to remove or may even be missed if field observations on the geometry of geologic structures were not thorough.

  3. Despite some challenges, geologic interpretations derived from MVS models are far superior to interpretations derived from comparatively low-resolution terrain models derived from other sources. The main reason for this is that intricate terrain produces complex intersections with geologic structure and such intersections are easy to identify in an MVS model. More work, however, is needed to determine best practices regarding what methods should be applied under different field scenarios.
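
The dependence on the baseline-to-distance ratio noted in item 1 can be illustrated with the textbook two-image (normal-case) stereo approximation, in which depth uncertainty grows with the square of the distance to the target and shrinks as the baseline lengthens. The sketch below is an idealization only: MVS software adjusts many convergent images simultaneously, and the function names, the focal length expressed in pixels, and the 0.5-pixel matching precision are assumptions chosen for illustration, not values from this study.

```python
def baseline_to_distance(baseline_m, distance_m):
    """Ratio of camera baseline length to camera-to-target distance."""
    return baseline_m / distance_m


def depth_sigma(distance_m, baseline_m, focal_px, match_sigma_px=0.5):
    """Approximate depth uncertainty (m) for an idealized two-image stereo pair.

    Uses the normal-case relation sigma_Z = Z**2 / (B * f) * sigma_p, where
    Z is the distance, B the baseline, f the focal length in pixels, and
    sigma_p the image-matching precision in pixels (assumed value).
    """
    return distance_m ** 2 / (baseline_m * focal_px) * match_sigma_px


if __name__ == "__main__":
    # Hypothetical cliff face 1 km away, imaged with a 5000-pixel focal length.
    for baseline in (10.0, 100.0):
        ratio = baseline_to_distance(baseline, 1000.0)
        sigma = depth_sigma(1000.0, baseline, 5000.0)
        print(f"B/D = {ratio:.2f}: depth sigma ~ {sigma:.1f} m")
```

Under these assumed values, shortening the baseline by a factor of ten inflates the depth uncertainty by the same factor, which is why photographs taken from a short stretch of canyon floor constrain a distant cliff face poorly.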

To consider this problem, we begin with a description of the study area and an overview of the series of field campaigns carried out as part of this study. We then consider a comparison of various data acquired under different scenarios within the scope of this study. Lastly, we discuss the origins of discrepancies and suggest practical guides for use of this technology as we move into extensive use of these data in real-world field studies.

To date, geologic studies using high-resolution terrain models have typically focused on geomorphic (e.g., James and Robson, 2012; Westoby et al., 2012; Hugenholtz et al., 2013; Reitman et al., 2015; Brunier et al., 2016; Carrivick et al., 2016; Cavalli et al., 2017) or engineering applications (e.g., Siebert and Teizer, 2014; Vanneschi et al., 2014; Smith et al., 2016; Telling et al., 2017). For an extensive recent review of these studies and the methods, see Carrivick et al. (2016).

Although geomorphologists have embraced MVS extensively, bedrock geologic studies have made surprisingly little use of this technology. Several studies have emphasized hand specimen– to outcrop-scale models (e.g., McCaffrey et al., 2008; Lato et al., 2013; Pickel et al., 2015; DePaor, 2016; Wilkinson et al., 2016; Telling et al., 2017; Cawood et al., 2017) as well as remote analyses of fracture arrays (e.g., Bemis et al., 2014; Vasuki et al., 2014). Particularly prominent among the outcrop-scale models are paleontological applications that are referred to as “close-range photogrammetry” (e.g., Matthews et al., 2016). This terminology, however, is misleading because MVS is not limited to close-range (≤300 m from camera to object) modeling (Wolf and Dewitt, 2000). Other studies have used airborne lidar over large areas to resolve bedrock structure where outcrop is poor (e.g., Pavlis and Bruhn, 2011; Dyess and Hansen, 2014) using subtle topographic features to track compositional layering. Studies by Svennevig et al. (2015) and Tavani et al. (2014) are important examples where researchers experimented with full 3-D mapping procedures, in which geologic features were interpreted with 3-D polylines and structural information was extracted from models of map-scale outcrops. Nonetheless, all of these studies are isolated examples in a field that continues to be dominated by a flat map–centric—and typically paper-based—paradigm for field data collection and analysis. Here we report on the first phase of a project to use high-resolution terrain models in an area with complex metamorphic structure, perhaps the most difficult visualization problem in field geology.

The central Panamint Mountains of eastern California, USA (Fig. 1), are an upper greenschist to amphibolite facies metamorphic complex that developed in the hinterland of the Cordilleran fold-thrust belt during Mesozoic contraction (Labotka et al., 1980). The protolith consists of middle Proterozoic augen gneiss, quartzofeldspathic gneiss, and muscovite-biotite-quartz schist basement overlain by late Proterozoic miogeoclinal sedimentary rocks that were deposited during the initiation of the Cordilleran passive continental margin (Labotka et al., 1980). Previous studies (Lanphere et al., 1964; Labotka, 1978; Labotka et al., 1980; Hodges et al., 1987; Crossland, 1995; Stevens et al., 1997; Cichanski, 2000; Andrew, 2002), together with our own work, indicate a protracted period of Mesozoic ductile deformation during low-pressure, high-temperature metamorphism. The initial latest Triassic to mid-Cretaceous ductile deformation that generated the main, continuous cleavage was primarily a Jurassic event (Labotka et al., 1985; Andrew, 2002; Cobb, 2015). Late Cretaceous deformation produced an overprinting crenulation cleavage with associated folds within the main Jurassic continuous cleavage (Brush, 2015; Cobb, 2015). The Cretaceous ductile deformation was intense, producing isoclinal folds in compositional layering and a strong L-S tectonite fabric associated with large finite strains as indicated by deformed objects that commonly show maximum strain ellipsoid axial ratios >10 (Fig. 2; Brush, 2015; Cobb, 2015).

The metamorphic assemblage is exposed in a partial crustal section exhumed by low-angle, west-dipping normal faults generated during Neogene extension (Burchfiel and Stewart, 1966; Labotka et al., 1980; Stewart, 1983; McKenna and Hodges, 1990; Albee et al., 1981; Hodges et al., 1990; Labotka and Albee, 1990; Wernicke, 1992; Andrew and Walker, 2009). Extensional structures are more complex in the northern Panamint Mountains (Wernicke et al., 1988; Wernicke, 1992), but in the central Panamint Mountains the structure is essentially an east-tilted crustal block that exhumes low-grade to unmetamorphosed rocks on the east and the highest-grade rocks on the west (Labotka and Albee, 1990). Thus, the structures described here had a pre-extensional geometry that requires a rigid body rotation of 30°–50° to the west as well as some component of vertical-axis rotation to restore the Mesozoic geometry (Serpa and Pavlis, 1996). The tectonic significance of these field data is considered elsewhere (Brush, 2015; Cobb, 2015; Pavlis et al., 2016), but this regional geologic perspective is important for assessment of the complex structures described below in the context of MVS modeling.

Overview of Field Campaign

Data were collected over five field seasons (about five person-months total). This phased approach illustrates a typical workflow for investigators who are new to 3-D techniques and who are investigating a new field area. However, because we have been continually testing software and alternative techniques as part of the project, our field methods were commonly very inefficient at first, and improved over time. Therefore, the methods, results, and revisions to the workflows are herein presented simultaneously in order to provide context to our workflow development.

Throughout the project we applied field GIS mapping techniques based on the data structure described by Pavlis et al. (2010) for metamorphic terranes. Base map layers comprised 0.5- to 1-m-resolution orthophotos acquired from the U.S. Geological Survey and from ESRI ArcGIS Online. These were co-registered to digital raster graphics (DRGs) of U.S. Geological Survey 7.5-min quadrangle topographic maps. Over the course of five field seasons (2013–2017), we completed digital geologic mapping in three canyons along the western slope of the Panamint Mountains (from south to north; Fig. 1): Pleasant Canyon; lower Surprise Canyon; and upper Wildrose Canyon. These sites were chosen for both their geologic significance and ease of access, i.e., road access in Pleasant and Wildrose Canyons and hiking trail access in Surprise Canyon. In our first approach (Brush, 2015) we used ArcPad field GIS software on a Trimble TDS Recon device running Microsoft Windows Mobile and equipped with a compact flash (CF) card and a Wide Area Augmentation System (WAAS)–enabled GNSS receiver for digital geologic mapping (position accuracy 5 m; Nautikaris, 2016). We also employed a LaserCraft Contour XLRic laser rangefinder, which gave us limited capability for 3-D mapping of cliff faces from distances of as much as 1 km away with a bearing accuracy of 0.5°. In later field seasons we used a digital geologic mapping system based on QGIS software on Microsoft Windows tablets paired with a WAAS-enabled Bluetooth GNSS receiver. Table 1 summarizes the software used in this research.
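
As a rough check on what a 0.5° bearing accuracy implies at long range, the angular error can be converted to a cross-range distance. The sketch below is a back-of-the-envelope calculation, not part of the field workflow; the function names are hypothetical, and combining the bearing error with the 5 m GNSS position accuracy in quadrature assumes that the two errors are independent.

```python
import math


def cross_range_error(range_m, bearing_error_deg):
    """Lateral position error (m) caused by a bearing error at a given range."""
    return range_m * math.tan(math.radians(bearing_error_deg))


def combined_horizontal_error(range_m, bearing_error_deg, operator_error_m):
    """Quadrature sum of the cross-range error and the operator position error.

    Assumes the two error sources are independent (an illustrative assumption).
    """
    return math.hypot(cross_range_error(range_m, bearing_error_deg),
                      operator_error_m)


if __name__ == "__main__":
    lateral = cross_range_error(1000.0, 0.5)
    total = combined_horizontal_error(1000.0, 0.5, 5.0)
    print(f"cross-range error at 1 km: {lateral:.1f} m")
    print(f"combined with 5 m GNSS error: {total:.1f} m")
```

At the 1 km maximum range, the bearing error alone corresponds to roughly 9 m of lateral uncertainty, so the rangefinder rather than the GNSS receiver is likely to dominate the error of distant points.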

In March 2014 we acquired terrestrial lidar data sets at the three sites mentioned above using a Riegl LMS-Z620 terrestrial laser scanner (TLS) provided by UNAVCO, with a UNAVCO field engineer operating the instrument and processing the data (Fig. 1; Supplement 1 [footnote 1]). For all three localities, the TLS data were restricted to the canyon floors because of equipment mobility limitations. It was during the 2014 field season that we began experimenting with MVS image acquisition (Brush, 2015).

Between 2015 and 2017 we acquired more imagery for our MVS studies, and we used our field observations, together with 3-D visualizations, to clarify structural interpretations. We then gained insights related to the geologic structure of the region from working with 3-D visualizations and used those insights to test specific geologic hypotheses through field observations at particular locations (e.g., see Pavlis and Mason [2017] for one example). Specifically, we acquired new imagery of the south wall of Pleasant Canyon and located new GCPs on the north canyon wall to evaluate our techniques (see below). The working geologic maps of Pleasant and lower Surprise Canyons developed from two-dimensional (2-D) methods, aided by 3-D visualization, are shown in Figure 3.

Lidar Data Collection

The Riegl TLS system we used is mounted on a tripod and has a maximum range of 2 km, a scanning rate of 8000 points/s, and a rangefinder accurate to 10 mm (Riegl Laser Measurement Systems, 2010; Fig. 4). A field of view of 80° vertical and 360° horizontal was used to collect point clouds. Simultaneously, co-registered digital images were captured with a Nikon D700 12.1 megapixel (4256 × 2832 pixels) digital single-lens reflex (DSLR) camera that was mounted to the TLS. The Nikon camera was equipped with a 20 mm, fixed-focal-length Nikkor lens, and the photographs were taken with variable f-stop and exposure levels and an ISO of 200. These images were merged with the lidar point cloud in the Riegl RiSCAN Pro software, the same software used to operate the TLS from a rugged laptop (Table 1), to generate a colored point cloud for data visualization. We experimented with draping the photographs on filtered triangulated irregular network (TIN) models of the lidar point cloud, but generally avoided doing so due to image smear and the time-consuming workflow relative to the MVS models we developed during the study. Thus, most of our work with lidar models either used the colored point cloud for interpretation or simply used the lidar models to reference our MVS models (see below). The TLS and camera settings were chosen in order to get the best point cloud resolution possible within the field time allotted for data collection in each canyon (two days per canyon).

The TLS was also equipped with a roving Trimble R10 GNSS receiver. A Trimble NetR9 GNSS receiver was used for the base station and set up within 8 km of each site. GNSS data were post-processed using Trimble Business Center software, from which scanner locations were pinpointed within 2–3 cm. The laser scanner is stated to be accurate to within 10 mm (Riegl Laser Measurement Systems, 2010). This resulted in georeferencing accuracy for each scan site of ∼3 cm, with a relative accuracy (scan position to scan position alignment) of ∼1 cm.

Several TLS deployment sites were chosen in each of the three canyons studied (Fig. 1; Supplement 1 [footnote 1]). Because the TLS is a heavy (16 kg) and sensitive piece of equipment that can be difficult to move from place to place, even with two field personnel, field locations for TLS deployment were chosen based on the ease with which the site could be reached while still maintaining a clear 360° view of the canyon being scanned. This was carried out by driving or hiking the TLS to each site and setting up the tripod, laser scanner, camera, R10 GNSS rover, and controlling laptop. Reflective targets were not necessary for scanning with this TLS due to iterative closest point (ICP) processing (Zhang, 1994).

Multiview Stereo (MVS) Photogrammetry Data Collection

Digital photographs for our MVS studies were obtained using three different camera and lens configurations: (1) a Canon EOS Rebel T3i DSLR camera with a resolution of 18 megapixels and an 18–55 mm IS II lens; (2) a Nikon D5300 DSLR with a resolution of 24.2 megapixels and a Nikkor fixed-focal-length AF-S DX 35 mm, f/1.8 lens; and (3) a Sony Cyber-shot DSC-HX9V digital camera with a resolution of 16.2 megapixels and a 16× optical zoom lens (imagery was acquired at maximum wide-angle zoom, equivalent to 18 mm). Both the Sony and Nikon cameras are equipped with an internal GPS receiver, but we used GPS data only from the Sony to locate camera positions. The Nikon GPS was virtually unusable due to unacceptably long wait times for satellite fixes and because the camera battery life was compromised with the GPS active. Where GPS positions were not available directly from the camera (i.e., for the Canon and the Nikon), we located the camera positions using the WAAS-enabled Bluetooth GNSS receivers we carried with us for our digital mapping system. Table 2 summarizes the camera and other MVS-specific characteristics of each site in this study. Though studies have shown promise in employing smartphones to acquire photographs for MVS studies (e.g., Micheletti et al., 2015; Prosdocimi et al., 2015), we opted not to use them because of concerns about camera quality, image resolution, and internal memory as well as a lack of manual control. We used two relatively high-quality DSLRs (Canon and Nikon) as well as a more basic “point and shoot” camera with a reliable GPS (Sony) in order to evaluate results from different camera qualities, an important problem for field studies that employ variable-quality cameras limited by field logistics; e.g., a large DSLR may be undesirable in many field scenarios. The Sony also represents the logical alternative to a smartphone for studies in remote areas such as this because it is lightweight, is GPS enabled, has low cost, can store large amounts of data without relying on internet access, and is less prone to damage than a smartphone.

Ground Control Point (GCP) Collection for MVS

GCPs were obtained from individual sites using four methods that are described here and evaluated in detail below. Method 1 was an experiment to use a widely available device, a handheld laser rangefinder, to geolocate natural objects as GCPs. This is an attractive scenario for a typical geologic field study given that such a device is highly portable, widely available, and can acquire GCP locations on otherwise impassable cliff faces up to 1 km away. Here we used the LaserCraft Contour XLRic laser rangefinder to locate large boulders, the bases of trees or shrubs, and color contrasts on natural outcrops. GCP location was determined within ESRI ArcPad software by merging the operator’s position as determined from the WAAS-enabled CF-card GPS receiver and the ranging information (offset distance, azimuth, and elevation angle) from the rangefinder. Digital images were captured at these sites using the tablet computer used for digital mapping, and the objects whose locations were collected with the rangefinder were annotated on these images for reference to the site position. These annotated digital images were later used to locate the GCPs in the photographs used for MVS modeling.
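
The merge of operator position and ranging information in method 1 amounts to a polar-to-Cartesian conversion. The sketch below illustrates that geometry only; it is not the ArcPad implementation. It assumes projected coordinates in meters (for example, UTM easting and northing), an azimuth measured clockwise from grid north, and an elevation angle measured upward from horizontal, and it ignores instrument height, magnetic declination, and Earth curvature. The function name and the example coordinates are hypothetical.

```python
import math


def gcp_from_rangefinder(easting, northing, elevation,
                         slope_range_m, azimuth_deg, elevation_angle_deg):
    """Project a rangefinder shot from the operator position to a target.

    Returns the target (easting, northing, elevation) in the same projected
    coordinate system as the operator position.
    """
    azimuth = math.radians(azimuth_deg)
    angle = math.radians(elevation_angle_deg)
    horizontal = slope_range_m * math.cos(angle)  # map-view distance
    return (easting + horizontal * math.sin(azimuth),
            northing + horizontal * math.cos(azimuth),
            elevation + slope_range_m * math.sin(angle))


if __name__ == "__main__":
    # Hypothetical shot: 800 m slope range, azimuth 045, 20 degrees upward.
    target = gcp_from_rangefinder(480000.0, 3990000.0, 1200.0,
                                  800.0, 45.0, 20.0)
    print(tuple(round(value, 1) for value in target))
```

Because the target position inherits the full error of the operator position plus the range-dependent angular errors of the instrument, the quality of GCPs obtained this way degrades with distance.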

In method 2 we exploited the overlap between our MVS imagery and the TLS data as an experiment to simulate a scenario in which a high-precision differential GNSS unit was used to locate natural objects, and those positions were used as GCPs. Note, however, that this method probably generally provided a better GCP geolocation system than might actually be provided with a high-precision GNSS. This is because we could use objects on cliff faces visible in the TLS data as GCPs, an unlikely situation that in reality would require rock climbing with a heavy high-precision GNSS unit.

Method 3 was a more conventional method for obtaining GCPs in which we placed artificial markers within the scene and located the markers with the WAAS-enabled Bluetooth GNSS receivers we carried with us for our digital mapping system. Markers were then located in the photographs used for MVS modeling.

Finally, in method 4 we used camera positions as our only reference points to develop models. In other words, no GCPs were used to build these models. However, in all of the cases where we used method 4, we had a second model that utilized one of the other three GCP methods for comparison. Table 2 summarizes methods used for establishing GCPs at each site.

Lidar Data Processing

GNSS coordinates were collected at the scan origin of each individual scan setup (Supplement 1 [footnote 1]). These scan locations were then assigned in Riegl RiSCAN Pro software to each scan position, and the scan positions were aligned to each other using the Riegl Multi-Station Adjustment (MSA) module. This adjustment uses a variant of the ICP algorithm to align successive scans as opposed to the more common method of using fixed targets in the scan area to tie successive scans together (Williams et al., 2012). The MSA algorithm also assigns an error sphere around each scan location so that the individual scan locations are allowed to adjust within this sphere. These spheres were set to account for worst-case expected accuracy of the GNSS locations (4 cm); this flexibility can potentially increase the relative accuracy of the scan-to-scan ties beyond the georeferencing accuracy of the GNSS locations. Once MSA is complete, all of the individual scans are then tied together in one rigid body.

A series of seven photographs was collected simultaneously with each lidar scan using the Nikon D700 which was attached to the scanner with a calibrated mount. These seven photos were then used to color the points collected by the scanner. Distortions in the camera lens and misalignment of the calibrated mount resulted in individual point coloring that is not always true to the color of the real-world objects.

MVS Data Processing

The photographs collected in the field were processed using Agisoft’s PhotoScan Professional software (Table 1) to generate dense 3-D point clouds, TIN models (typically texture-mapped with imagery), or both. The amount of time required to generate an MVS point cloud depends on the number of images, camera resolution, the number of central processing unit (CPU) cores, and graphics processing unit (GPU) performance. Processing efficiency is outside the scope of this paper; for a review of multicore performance specifically for PhotoScan, refer to Bach (2015). Our initial workflows for this processing are as described by Brush (2015), and typically followed the processing scheme suggested in the PhotoScan documentation (Agisoft, 2017). We later experimented with more advanced camera optimization techniques (Matthews et al., 2016), but we found these methods typically made only incremental improvements over the general workflow suggested by Agisoft (2017), provided the original images were of high quality and properly acquired. Camera and image optimization is not evaluated here; however, a recent evaluation can be found in Mosbrucker et al. (2017).

Both the MVS- and TLS-generated point clouds were exported from PhotoScan and RiSCAN Pro, respectively, as ASCII (x, y, z [coordinates of a point], R [red], G [green], B [blue], I [infrared]) files and .las (binary format for storing lidar data) files, respectively, and imported into Maptek I-Site Studio software for additional processing and analysis (Fig. 5; Table 1). First, the point clouds were quality checked using I-Site Studio, and, if necessary, they were edited to remove obvious outliers and spikes using both manual selection and automated filtering. In lidar data, outliers and spikes can arise due to noise (e.g., the scanner catches dust, a bird, or insect in flight) as well as the presence of vegetation. In MVS data, outliers and spikes can arise for other reasons such as improper masking of sky (particularly with clouds), cloud shadows that moved across the scene while taking photographs, and foreground clutter. Each of these can lead to errors if not removed. However, these types of features were not abundant in most of our MVS and TLS data (Fig. 5).

We conducted numerous experiments using TIN models generated from the point clouds, both as textured models exported from PhotoScan as well as models generated within I-Site Studio after point cloud editing. The former was used for geologic interpretations, and the latter was used for the spatial accuracy analysis discussed below. Where these point clouds proved too cumbersome to visualize readily, we tiled the data using CloudCompare software (open source) to subdivide the data into more reasonably sized areas for analysis.
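
Tiling a point cloud is conceptually simple: each point is assigned to a square cell of a horizontal grid, and each cell is then handled as a separate, smaller cloud. The sketch below shows the idea in plain Python under the assumption that points are (x, y, z) tuples in meters; it is not the CloudCompare implementation, and the function name and tile size are hypothetical.

```python
import math
from collections import defaultdict


def tile_points(points, tile_size_m):
    """Group (x, y, z) points into square map-view tiles.

    Returns a dict mapping (column, row) tile indices to lists of points.
    """
    tiles = defaultdict(list)
    for point in points:
        key = (math.floor(point[0] / tile_size_m),
               math.floor(point[1] / tile_size_m))
        tiles[key].append(point)
    return dict(tiles)


if __name__ == "__main__":
    cloud = [(12.0, 7.5, 1501.2), (260.4, 31.0, 1533.9), (255.1, 48.8, 1540.0)]
    for key, members in sorted(tile_points(cloud, 250.0).items()):
        print(key, len(members))
```

For clouds with many millions of points, a dedicated tool or an array-based library would be far more efficient than this loop; the sketch is meant only to make the operation explicit.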

To compare the MVS and lidar point clouds and interpretations on them, we used both I-Site Studio and CloudCompare. Each software package has its strengths and weaknesses, but both contain automated functions for analyzing point cloud mismatches. In addition to using these automated functions, which have a potential for error if the point clouds are distorted or scaled improperly, we also queried specific positions from recognizable objects in the point clouds for use as a direct comparison between the lidar and MVS point clouds (see below).
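
A direct comparison of recognizable objects can be reduced to a few summary numbers once the same features have been picked in both point clouds. The sketch below is one minimal way to do so, assuming matched lists of (x, y, z) positions in a common coordinate system; the function names and the example coordinates are hypothetical, and the calculation is not the automated mismatch function of either software package.

```python
import math


def checkpoint_offsets(reference_pts, test_pts):
    """Offset vector (test minus reference) for each matched pair of points."""
    if len(reference_pts) != len(test_pts):
        raise ValueError("point lists must be matched one-to-one")
    return [tuple(t - r for r, t in zip(ref, tst))
            for ref, tst in zip(reference_pts, test_pts)]


def mean_offset(offsets):
    """Mean offset vector; a large value suggests a systematic shift."""
    count = len(offsets)
    return tuple(sum(o[i] for o in offsets) / count for i in range(3))


def rmse_3d(offsets):
    """Root-mean-square 3-D distance between the matched points."""
    squared = sum(dx * dx + dy * dy + dz * dz for dx, dy, dz in offsets)
    return math.sqrt(squared / len(offsets))


if __name__ == "__main__":
    lidar = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]   # hypothetical reference picks
    mvs = [(3.0, 4.0, 0.0), (10.0, 0.0, 12.0)]    # same features in the MVS model
    offsets = checkpoint_offsets(lidar, mvs)
    print(mean_offset(offsets), round(rmse_3d(offsets), 2))
```

Separating the mean offset from the scatter about that mean helps distinguish a bulk shift of the model, which affects absolute accuracy, from internal distortion, which affects relative accuracy.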

3-D Mapping Methods

Software development for geologic interpretation advanced rapidly during the course of this study. In our initial work, I-Site Studio was the only practical software platform for making 3-D interpretations directly on 3-D models. Later in this study other software became available, including ArcGIS Pro, CloudCompare, and Move (Midland Valley). For the bulk of this study we continued to use I-Site Studio because of the rapid visualization provided in this software versus the other platforms. For final model development and cross-section preparation we exclusively used Move software (Table 1).


MVS models have seen extensive use in engineering and geomorphology applications (e.g., Siebert and Teizer, 2014; Vanneschi et al., 2014; Colomina and Molina, 2014; Bemis et al., 2014; Carrivick et al., 2016; Telling et al., 2017). In these applications, particularly engineering, high accuracy is essential, with absolute errors of ±1 m being unacceptable. In the context of field geology, absolute accuracy of a model is usually not as important as the relative accuracy between points in the model, with some caveats. For example, in field projects involving the analysis of bedrock structure, as long as the relative positions of features within the model are accurate and the scaling is accurate, the geometry of the structure can be constructed with high precision. However, there are situations in field geology where absolute accuracy is important. In studies where it is necessary to merge models in order to construct one large model, poor absolute accuracy can make this task difficult. Indeed, we experienced this problem in Surprise Canyon, with multiple iterations required to properly register all of the point clouds. With regard to mapping, features mapped on a dense point cloud or surface model (e.g., Pavlis and Mason, 2017) need to have high absolute accuracy in order to be merged with regional data, such as geologic mapping on a digital elevation model (DEM) reference base. In addition, it is important to realize that absolute accuracy remains subject to the basic data acquisition process that generated the model, which is the primary subject of this paper.

Although there is significant discussion about the accuracy of MVS terrain models in engineering and geomorphology literature (e.g., Haneberg, 2008; Stojakovic, 2008; James and Robson, 2012; Hugenholtz et al., 2013; Siebert and Teizer, 2014; Colomina and Molina, 2014; Furukawa and Hernández, 2015; Reitman et al., 2015; Carrivick et al., 2016; Mosbrucker et al., 2017), these analyses generally consider cases with well-controlled data acquisition parameters. For example, Carrivick et al. (2016) provided an exhaustive review of previous accuracy assessment studies, but in all of the cases described, large numbers of GCPs were available (typically >50), the terrain was generally very subdued, and the researchers were concerned with absolute accuracy at the sub-meter level. We suggest that these analyses have limited applicability to a field science problem where absolute accuracy is less important than relative accuracy, and where logistics, terrain, equipment restrictions, and time restrictions conspire to produce far-from-ideal data acquisition conditions. With this approach in mind, we consider four practical problems in data acquisition and their effects on both absolute accuracy and relative accuracy:

  1. We evaluate different techniques for georeferencing models, including various GCP selection methods in addition to camera positions, and also the case where model georeferencing is based solely on camera positions.

  2. We evaluate the effects of the spatial distribution of camera positions, what we refer to as image array geometry, on model accuracy.

  3. We evaluate the relative accuracies of indirect measurements of bed-parallel foliation orientation versus orientations collected in the field.

  4. We evaluate the relative accuracy of direct mapping on high-resolution terrain models versus the accuracy of mapping on low-resolution terrain models.

To evaluate the absolute positional accuracy of the terrain models we derived from MVS photogrammetry, we compared them quantitatively to the TLS-derived terrain models we obtained for the same locations. In this analysis, the TLS-derived terrain models were used as the reference due to their centimeter-scale georeferencing and relative accuracy. To quantify the mismatch between MVS and TLS models, we used two approaches:

  1. We manually selected ∼7–11 objects that were visible on both TLS and MVS models and queried the point cloud for their (x, y, z) locations, taking care to match the objects selected as closely as possible. The differences in (x, y, z) locations were then computed and visualized using a MATLAB script (Fig. 6; Supplement 2 [footnote 2]). We always began our analyses using this manual selection method to ensure we were comparing the same points without the potential for false correlations in automated procedures. In this manual analysis, we attempted to use both easily recognized points and points well distributed across the model. Note that in this step, the TLS models were not used to georeference or transform the MVS models in any way.

  2. We used an automated function in CloudCompare to evaluate the residual mismatch between TLS- and MVS-derived models from the resulting scalar error cloud.
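The manual comparison in approach 1 amounts to differencing matched coordinate pairs. The following is a minimal Python sketch of that arithmetic, not the authors' MATLAB script; the function names and the coordinates in the usage example are hypothetical, and the points are assumed to be in the same projected coordinate system with units of meters.

```python
import math

def point_mismatch(tls_points, mvs_points):
    """Return (dx, dy, dz, distance) for each matched TLS/MVS point pair.

    Both inputs are lists of (x, y, z) tuples in the same coordinate
    system; entry i of one list must be the same object as entry i of
    the other. Differences are MVS minus TLS (TLS is the reference).
    """
    if len(tls_points) != len(mvs_points):
        raise ValueError("point lists must be matched one-to-one")
    rows = []
    for (xt, yt, zt), (xm, ym, zm) in zip(tls_points, mvs_points):
        dx, dy, dz = xm - xt, ym - yt, zm - zt
        rows.append((dx, dy, dz, math.sqrt(dx * dx + dy * dy + dz * dz)))
    return rows

def summarize(rows):
    """Mean, root-mean-square, and maximum 3-D mismatch distance."""
    dists = [r[3] for r in rows]
    return {
        "mean": sum(dists) / len(dists),
        "rms": math.sqrt(sum(d * d for d in dists) / len(dists)),
        "max": max(dists),
    }

if __name__ == "__main__":
    # Hypothetical picks of the same two objects on each model.
    tls = [(500100.0, 4000200.0, 1500.0), (500400.0, 4000350.0, 1620.0)]
    mvs = [(500103.0, 4000204.0, 1500.0), (500410.0, 4000350.0, 1620.0)]
    print(summarize(point_mismatch(tls, mvs)))
```

Keeping the signed components as well as the distance makes it possible to see whether the mismatch is systematic (for example, consistently in one direction) rather than random.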

To evaluate the mismatch between mapping on high-resolution and low-resolution terrain models, we aligned the MVS point clouds to the TLS benchmark using a variety of methods. We used either a point-by-point matching scheme implemented in I-Site Studio to align the point clouds, or a rigid-body translation based on a single point match. Once large shifts such as these between point clouds were removed, we were generally able to further refine the alignment of point clouds using automated functions in I-Site Studio.
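The single-point rigid-body translation mentioned above can be illustrated with a short Python sketch. This is not the I-Site Studio implementation; the function names are hypothetical, and points are assumed to be (x, y, z) tuples in a common coordinate system.

```python
def translation_from_match(mvs_point, tls_point):
    """Shift vector that moves one MVS point onto its TLS counterpart."""
    return tuple(t - m for m, t in zip(mvs_point, tls_point))

def apply_translation(points, shift):
    """Apply the same shift to every point in an MVS point cloud."""
    return [tuple(c + s for c, s in zip(p, shift)) for p in points]

if __name__ == "__main__":
    # Hypothetical matched pick and a two-point MVS cloud.
    shift = translation_from_match((1.0, 2.0, 3.0), (4.0, 6.0, 3.0))
    print(apply_translation([(1.0, 2.0, 3.0), (0.0, 0.0, 0.0)], shift))
```

A pure translation of this kind removes only the bulk offset; any rotation or scale difference between the clouds remains, which is why a subsequent refinement step is still needed.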

It is well known, from decades of photogrammetry work using vertical aerial photography, that the distribution and accuracy of GCPs are critical in constructing accurate terrain models (e.g., Wolf and Dewitt, 2000). This issue extends to photogrammetric MVS (e.g., Carrivick et al., 2016; Agisoft, 2017). Thus, we experimented with four field methods for obtaining GCPs, as described above: (1) using the locations of natural objects as measured with a laser rangefinder linked to a WAAS-enabled GNSS receiver; (2) using the locations of natural objects as extracted directly from the TLS terrain model; (3) placing artificial markers in the scene prior to imaging, which were then located with a WAAS-enabled GNSS receiver; and (4) using only camera positions (no additional GCPs) located with the WAAS-enabled GNSS. Note that the first three methods used camera positions in addition to GCPs to construct the MVS models, and method 4 is used as our basis of comparison against which we evaluate the other three methods. The distribution of the GCPs for all methods was qualitatively evaluated based on the amount of horizontal and vertical spread within the scene (Supplement 1 [footnote 1]; Table 2). In addition, a quantitative approximation of vertical distribution is expressed as the elevation range covered by GCPs divided by the elevation range of the scene (i.e., percent elevation coverage; Table 2).
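The percent elevation coverage defined above is a simple ratio of elevation ranges. A minimal Python sketch follows; the function name and the elevations in the usage example are hypothetical.

```python
def percent_elevation_coverage(gcp_elevations, scene_elevations):
    """Elevation range spanned by GCPs as a percentage of the scene range.

    Both arguments are iterables of elevations in the same units; only
    their minimum and maximum values are used.
    """
    gcp_elevations = list(gcp_elevations)
    scene_elevations = list(scene_elevations)
    gcp_range = max(gcp_elevations) - min(gcp_elevations)
    scene_range = max(scene_elevations) - min(scene_elevations)
    if scene_range <= 0:
        raise ValueError("scene must span a nonzero elevation range")
    return 100.0 * gcp_range / scene_range

if __name__ == "__main__":
    # Hypothetical scene from 1200 m to 1600 m with GCPs between 1200 m and 1300 m.
    print(percent_elevation_coverage([1200, 1250, 1300], [1200, 1600]))
```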

Another well-known principle in photogrammetry (e.g., Wolf and Dewitt, 2000) is that the effectiveness of depth calculations from photographs falls off markedly with distance (distance between camera and object being photographed) due to simple trigonometry. Parallax decreases with distance such that for a single stereo pair of photographs, depth differences are undetectable at distances of a few times the spacing between photographs (the baseline length) (Wolf and Dewitt, 2000). Quantitatively, depth calculations are most accurate when the baseline-to-distance ratio is 2:1 (Knötzl and Reiterer, 2010). Unfortunately, in a real-world field study like this one, this important geometric parameter is the most difficult to control. For example, the baseline length is commonly dictated less by photogrammetric rigor and more by access, specifics of the terrain, and field time, resulting in imaging conditions that can be suboptimal. Accuracy of the resulting MVS model is also dependent on the variation in the vertical distribution of camera positions, as this variation allows for capturing the scene from a wide variety of viewpoints, ideally surrounding the object (Furukawa and Hernández, 2015). In this study, the distribution of camera positions along the baseline, both horizontally as it relates to object distance and vertically, is described as the image array geometry. Due to the serendipitous nature of the data sets, the image array geometry naturally varied from site to site based on natural limitations (i.e., steepness of terrain, presence of slippery talus slopes, large fallen boulders, etc.). We were able to evaluate the impact of image array geometry in areas where the same camera and GCP method were used to collect the data for the site (see below). Table 3, as well as Figures 6 and 7, summarize the results of these experiments.
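The fall-off of depth precision with distance can be illustrated with the textbook relation for an idealized stereo pair with parallel camera axes, in which depth Z = fB/p (focal length f in pixels, baseline B, parallax p in pixels), so a parallax measurement error σp maps to a depth error of roughly Z²/(fB)·σp. The Python sketch below uses that idealization only; it is not a model of the convergent, multi-image geometry used in this study, and the function names and numbers are hypothetical.

```python
def baseline_to_distance(baseline_m, distance_m):
    """Ratio of camera spacing to camera-object distance."""
    return baseline_m / distance_m

def depth_uncertainty(distance_m, baseline_m, focal_px, parallax_sigma_px=1.0):
    """Approximate depth error for an idealized parallel-axis stereo pair.

    From Z = f * B / p, a parallax error sigma_p gives
    sigma_Z ~= Z**2 / (f * B) * sigma_p.
    """
    return distance_m ** 2 / (baseline_m * focal_px) * parallax_sigma_px

if __name__ == "__main__":
    # Hypothetical case: 100 m baseline, 5000 px focal length.
    for z in (500.0, 1000.0, 2000.0):
        print(z, baseline_to_distance(100.0, z), depth_uncertainty(z, 100.0, 5000.0))
```

Under this idealization, doubling the object distance at a fixed baseline quadruples the depth error, which is why far-field parts of a scene are the most sensitive to a short baseline.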

GCP Method 1

Method 1, using natural objects geolocated using a laser rangefinder, is seemingly an ideal method where steep terrain affords limited access for placement of ground control, but suffers a tradeoff in accuracy related to steady aiming. It is possible to improve aiming a laser ranging device by bringing a tripod, but this would add weight, and we intended to test a minimalist approach in order to prevent negating one of the greatest benefits of MVS over lidar, which is field mobility for data acquisition (e.g., Pavlis and Mason, 2017). In this study, we chose to experiment with an ad hoc field scenario where portability is favored over precision by using a lightweight (1.6 kg), handheld laser ranging device for this ground control method. When reviewing the results from GCP method 1, it is best to first compare the results from the Wildrose North and Wildrose South sites (upper Wildrose Canyon), as they utilized the same camera and thus make the best direct comparison in this serendipitous data set (Table 3; Fig. 7). Looking at the characteristics of these two sites (Table 2), we would expect the Wildrose North model to have better accuracy than that of Wildrose South because Wildrose North has more GCPs, a more favorable GCP distribution, and a better image array geometry (Fig. 8; Supplement 1 [footnote 1]). According to the results of manual analysis (Table 3) and CloudCompare (Fig. 7), however, the Wildrose South model had better accuracy than that of Wildrose North. The origin of this discrepancy is not obvious. The only characteristic of this method (Table 2) that could be considered unfavorable is GCP position versus object distance. Specifically, Wildrose North has the largest object distance, with a maximum distance outside the maximum targeting distance of the laser rangefinder. Comparison of Figures 7A and 7B shows an important distinction in this context. 
In Wildrose South, mismatch is generally distributed across the model with some increase in error with distance (Fig. 7A), yet in Wildrose North there is a clear increase in mismatch with distance (Fig. 7B). Note that this is a clear indication that in Wildrose North the MVS model is systematically rotated relative to the TLS model, with a rotation axis subparallel to the long axis of the scene, a geometry also clearly visible in visualizations showing both the TLS and MVS models. It is also important to note, however, that scalar assessments like those in Table 2 may be deceptive in a true 3-D environment. That is, although there is a significant elevation range in camera coverage for Wildrose North (Table 2), the image array is nearly linear, and when that geometry is paired to GCPs with a limited distance range, the 3-D geometry is akin to a cylinder with a degree of freedom about the rotation axis of the cylinder (Fig. 9). Thus, no software could resolve this degree of freedom, producing the observed rigid-body rotation. This relationship is important because were it not for the TLS survey, we would probably have remained ignorant of the error caused by not having enough GCPs at far offsets.
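The link between a rigid-body rotation and mismatch that grows with distance can be sketched with basic geometry: a point at perpendicular distance r from the rotation axis is displaced by a chord of length 2r·sin(θ/2) when the model is rotated by an angle θ. The Python sketch below is illustrative only; the function name and the angle and distances are hypothetical and are not values measured in this study.

```python
import math

def rotation_mismatch(distance_from_axis_m, rotation_deg):
    """Displacement of a point after a rigid rotation about an axis.

    distance_from_axis_m is the perpendicular distance from the rotation
    axis; the result is the chord length 2 * r * sin(theta / 2).
    """
    theta = math.radians(rotation_deg)
    return 2.0 * distance_from_axis_m * math.sin(theta / 2.0)

if __name__ == "__main__":
    # Hypothetical 1 degree rotation evaluated at increasing distances.
    for r in (100.0, 500.0, 1000.0):
        print(r, rotation_mismatch(r, 1.0))
```

Because the displacement is proportional to r, a small unresolved rotation is negligible near the axis but can dominate the error at far offsets.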

Though the Clair Camp site (middle Pleasant Canyon) was our first test of the photogrammetry method and photos were taken with a different camera, the results of spot analysis show that the Clair Camp model had marginally better accuracy than those of both Wildrose sites despite a very low baseline-to-distance ratio and fewer GCPs (Table 3; Fig. 8). The GCPs at Clair Camp were only marginally better distributed than at Wildrose South, and certainly not as well distributed as at Wildrose North (Fig. 8). We suspect that the Clair Camp model had slightly better results due to camera type, as the Canon is a higher-caliber camera (large image sensor and high-quality lens) than the Sony, a conclusion supported by Mosbrucker et al. (2017). More significant, however, is that the spatial errors for both Wildrose sites and Clair Camp are systematic, with mismatch increasing with distance between camera and object being imaged, despite a favorable image array geometry for Wildrose South and Wildrose North (Figs. 6E, 7A, 7B, and 8). It is important to recognize that spatial errors for the method 1 sites are significantly larger than errors produced by GPS (in)accuracy alone (<12 m; Department of Transportation of the United States of America and Federal Aviation Administration, 2008; Table 3).

GCP Method 2

Method 2, using natural objects geolocated directly from the TLS model, is meant as a simulation of a GCP approach of locating natural objects with a high-precision (sub-meter) GNSS receiver. That is, outside of a comparative study such as this, it would usually be absurd to collect TLS data solely for the purpose of ground control for MVS. Our serendipitous data set allowed us to use this method to simulate the sub-meter location of natural objects for use as GCPs. Data collection for both the Surprise North and Surprise South sites (lower Surprise Canyon) utilized the Nikon camera; Surprise North had slightly more GCP elevation coverage but fewer GCPs overall (Table 2; Supplement 1 [footnote 1]). GCPs were considered to be distributed poorly in both sites. However, Surprise North has a slightly more favorable image array geometry, and thus image array geometry may be the primary factor controlling the results from these two sites (Table 3; Fig. 6). Of note is that Surprise South had more GCPs than Surprise North. However, when the number of GCPs is combined with the number of camera positions, Surprise North has more total reference points, which may also contribute to the superior results from this site. When considering results from the Noonday site (lower Pleasant Canyon) in conjunction with the results of these two Surprise Canyon sites, we originally suspected that the Noonday model would outperform those of the other two sites as it has the most favorable image array geometry (Supplement 1 [footnote 1]). In reality, however, of the three sites that used method 2, the Noonday site had the poorest results (Table 3). This could be the result of differences between sites (i.e., Noonday has more holes in the TLS data than the Surprise sites due to the natural limitations of each site), but the alternative explanation is that both Surprise North and Surprise South had more total reference points (GCPs plus camera positions) than Noonday.

To test the impact GCPs have on model accuracy, we took the same Noonday model and improved the vertical distribution of the GCPs by adding three more GCPs farther up on the ridgeline than the initial seven GCPs. Table 3 and Figure 6 show that adding the GCPs, and thereby improving the distribution, yielded better results, indicating that GCP amount and distribution are key factors that require careful consideration when planning large-scale photogrammetry studies. Note that overall, method 2 produced the best results of all four methods, which is not surprising considering that the GCP source was the same TLS model that the photogrammetry model was evaluated against (Table 3).

GCP Method 3

Method 3, using artificial markers as GCPs, is a standard procedure for both photogrammetry studies and lidar, although our implementation has the important difference of using a relatively low-precision (meter) recreational-grade GNSS receiver rather than a more elaborate—and therefore less portable and more complicated—high-precision (sub-meter to centimeter) mapping- or survey-grade system. This method was a standalone experiment presented only to see what kind of results could be achieved by using markers. In this case it was possible to place artificial GCP markers across most of the Surprise West site (lower Surprise Canyon), resulting in discrepancies that are close to the range of GPS errors (<12 m; Department of Transportation of the United States of America and Federal Aviation Administration, 2008; Table 3). The goal was to see if this method produced better results than the handheld laser rangefinder, and it did. There is a small systematic error across the MVS model, with error increasing from south to north (Fig. 7C). However, this systematic error can be virtually eliminated by a small rigid-body rotation, which is consistent with examples we have seen for other sites (Figs. 6 and 7). The real value in the result from Surprise West is that this site produced similar results to method 2 sites, which used the same Nikon camera. It is interesting to note that Surprise West had a comparable baseline-to-distance ratio, GCP distribution, and number of GCPs to the Noonday site, which may explain the similar results between the two sites, with the Noonday site producing marginally better results possibly due to the use of method 2 for GCP acquisition and/or more elevation coverage of camera positions (Table 3). However, more thorough experiments are needed to verify this method.

GCP Method 4

Method 4 was used as a control to see how having no GCPs would affect model accuracy. Instead, ground control was obtained exclusively from camera positions measured by the WAAS-enabled GNSS receiver used with our field GIS mapping system. When the MVS and TLS models are observed together, the MVS models are rotated away from the TLS with the largest errors at points farthest away from camera positions, indicating that distance is not being resolved correctly at greater distances away from the baseline, as expected from the work of Wolf and Dewitt (2000) (Animation 1). When comparing the Surprise North and Surprise South models constructed using method 4 to their method 2 counterparts and the Clair Camp method 4 model to its method 1 counterpart, it is clear that having GCPs, regardless of method, produces models more accurate than those constructed without GCPs (Table 3; Figs. 6A and 6B). At first glance, the results from the three sites that used method 4 (Table 3; Supplement 1 [footnote 1]) suggest that adding more camera positions, thus increasing baseline, improved model absolute positional accuracy, with Surprise North being the most accurate. However, in addition to having more camera positions along the baseline, Surprise North had camera positions at a wider range of elevations (defining the vertical component of the image array) compared to the other two sites. We suggest that having more camera positions spread out both horizontally and vertically provides an image array geometry that is favorable for improving absolute positional accuracy of MVS models. It is possible that Surprise North greatly outperformed the other two sites in this method 4 category simply because it had more camera positions as reference points (Table 2). 
This suggests that the image array geometry plays a critical but secondary role in determining accuracy, whereas ground control, be it GCPs or camera positions, plays the primary role in georeferencing, consistent with results seen in method 2.

The results from Clair Camp without GCPs were very poor, as expected because the image array geometry is very poor and Clair Camp employed the fewest camera positions of the three sites that tested this method (Table 3; Animation 1). Surprise South had a slightly better image array geometry than Clair Camp, which may explain why Surprise South outperformed Clair Camp. It is also possible that Surprise South produced slightly better results than Clair Camp due to the differences between cameras; at Surprise South we used the Nikon, which has a larger sensor with better pixel resolution than the Canon used at Clair Camp. This interpretation is consistent with Mosbrucker et al. (2017), whose case study showed a strong correlation between image quality and accuracy. Sensor size, in addition to pixel resolution, plays a role in image quality. Mosbrucker et al. (2017) concluded that the camera system selection, camera configuration, and image acquisition parameters all play a role in model accuracy.

Method Comparison

Though there is no strict control in this serendipitous field experiment to determine which GCP method performed better, general observations can be made from the results of manual analysis and CloudCompare (Table 3; Figs. 6 and 7). First is that results from sites where we used the same method seem to be grouped into distinct error ranges despite the variations between sites, where method 2 performed the best, followed closely by method 3, then method 1, and finally method 4. This outcome was generally expected based on previous work (Wolf and Dewitt, 2000). For example, it makes sense that method 4, having no GCPs in the scene, would have the worst results when compared to their counterparts that used GCPs in the scene obtained with either method 1 or method 2. That method 2 performed the best is also as expected because the GCPs came from the same TLS model that the MVS model was compared to. More important is the final observation that method 3 performed nearly as well as method 2, indicating that method 3, which uses a low-precision GNSS receiver, generates comparable results to having a high-precision GNSS for ground control. However, more data are needed to support this conclusion.

Orientation Analyses

I-Site Studio, CloudCompare, and Move contain routines that allow estimation of planar surface orientations (strike and dip). These routines include simple three-point analysis as well as multipoint analysis. It is also possible to estimate the orientation of a plane by manually drawing the trace of its intersection with the point cloud or TIN model surface. I-Site Studio can also use multiple points or triangular patches in the TIN to calculate a best-fit orientation, a method referred to as “patch selection”. These routines provide the ability to extract orientation data for cliff faces and other areas inaccessible in the field. However, they require an accurate terrain model and the ability for a user to visualize the surface and evaluate the validity of the geologic interpretation without the benefit of ground-truth measurements (Animation 2).
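The simple three-point analysis mentioned above can be sketched in Python as follows. This is a generic illustration of the geometry, not the routine used in I-Site Studio, CloudCompare, or Move; the function name is hypothetical, coordinates are assumed to be (east, north, up) in consistent units, and strike is reported using the right-hand rule (dip direction is 90° clockwise from strike).

```python
import math

def three_point_orientation(p1, p2, p3):
    """Strike, dip, and dip direction (degrees) of the plane through 3 points.

    Points are (east, north, up) tuples. The plane normal is the cross
    product of two in-plane vectors, flipped if needed to point upward;
    its horizontal projection then points in the down-dip direction.
    """
    ax, ay, az = (p2[i] - p1[i] for i in range(3))
    bx, by, bz = (p3[i] - p1[i] for i in range(3))
    nx = ay * bz - az * by
    ny = az * bx - ax * bz
    nz = ax * by - ay * bx
    if nx == 0 and ny == 0 and nz == 0:
        raise ValueError("points are collinear; no unique plane")
    if nz < 0:  # force the normal to point upward
        nx, ny, nz = -nx, -ny, -nz
    dip = math.degrees(math.atan2(math.hypot(nx, ny), nz))
    # For a horizontal plane the dip direction is undefined; 0 is returned.
    dip_direction = math.degrees(math.atan2(nx, ny)) % 360.0
    strike = (dip_direction - 90.0) % 360.0
    return strike, dip, dip_direction

if __name__ == "__main__":
    # Hypothetical picks on a surface that drops 10 m over 10 m toward the east.
    print(three_point_orientation((0, 0, 0), (0, 10, 0), (10, 0, -10)))
```

With only three points the result is very sensitive to the position error of each pick, especially when the points are closely spaced or nearly collinear, which is one motivation for the multipoint and patch-based alternatives.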

To evaluate the application of these remote orientation measurement techniques, we analyzed two sites in Pleasant Canyon: Clair Camp and Noonday. For the Clair Camp site (Figs. 10A and 10B), orientations obtained from the TLS terrain model could be directly compared to field measurements, whereas most of the Noonday site (Figs. 10C and 10D) is an inaccessible cliff face (Animation 2). Thus, for the Noonday structure we can only compare the MVS model–derived orientations to field measurements from outcrops at the base of the cliff, from the ridge tops, and from the south side of the canyon. In all cases, the orientation analysis was done either with a simple three- to six-point analysis from the point cloud or using patch selection on a slope that was interpreted to show exposed layering on the terrain model. The TLS model was used at the Clair Camp site because of the inaccuracy of the MVS model (see above), but the results would undoubtedly be similar from a more closely aligned MVS model. At the Noonday site, we chose to use the MVS model to extract orientations because it had been aligned to the TLS model.

A stereographic plot of 30 TLS-derived strike-and-dip orientations of S1 foliation from the Clair Camp site is similar to that of the 30 orientations taken in the field (Figs. 10A and 10B). Both stereograms show: (1) dominantly west-dipping foliation with dips that range from shallow to steep as a result of younger folding of foliation; and (2) a π-pole in the southwest quadrant reflecting the axis of the fold in foliation that is easily seen on the hillslope (Fig. 3A, at north end of the easternmost purple zone [quartzite] there is a fold in foliation [black lines] and the axial trace of that fold is indicated by a blue line). The π-pole for the TLS-derived orientations plunges 35° toward azimuth 215°, whereas the π-pole from the field orientations plunges 18° toward 193°. The scatter, however, is different between the two stereograms. The TLS-derived measurements display a greater angular dispersion around the fold than the field measurements, providing a clearer definition of the fold orientation for this structure. The field data have greater azimuthal scatter over a more constant dip range, primarily reflecting small-scale variations that are not visible on the model. Nonetheless, the results are comparable and reasonably acceptable. The similarity in results between the two techniques in this area probably reflects the relative ease of recognizing foliation surfaces in the TLS model in an area dominated by relatively flaggy schists.

The measurements of foliation at the Noonday structure (Figs. 10C and 10D) are more difficult to interpret. The field data (Fig. 10D) display a classic great-circle distribution of foliation poles with a well-defined π-pole that plunges 0° toward 350°. The MVS model measurements (Fig. 10C) yield a similar π-pole that plunges 9° toward 338°, but the data are markedly more scattered. This scatter could be real and reflect noncylindrical fold systems with curved fold axes, which could be expected because there are known refolded fold systems on this cliff (Fig. 3B). Alternatively, however, the increase in scatter could be an artifact of the measurement method and its associated errors. This is suggested strongly by the azimuthal scatter on steeply dipping surfaces in the model measurements (Fig. 10C) versus the field data (Fig. 10D). Whereas accurate strike measurements are the norm in field data, particularly where dip is steep, in a multipoint analysis, steep dips are the most difficult to measure because small errors are amplified.
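One way to put a single number on how closely two π-poles agree is the angle between the two axes. The Python sketch below converts plunge and trend to unit vectors and computes that angle, treating fold axes as undirected lines; the function names are hypothetical and this calculation is not part of the analysis reported here.

```python
import math

def line_to_vector(plunge_deg, trend_deg):
    """Unit vector (east, north, up) for a line given by plunge and trend."""
    p, t = math.radians(plunge_deg), math.radians(trend_deg)
    return (math.cos(p) * math.sin(t), math.cos(p) * math.cos(t), -math.sin(p))

def angle_between_lines(line1, line2):
    """Acute angle in degrees between two (plunge, trend) lines.

    The absolute value of the dot product is used because a fold axis
    has no preferred sense of direction.
    """
    a = line_to_vector(*line1)
    b = line_to_vector(*line2)
    dot = abs(sum(x * y for x, y in zip(a, b)))
    return math.degrees(math.acos(min(1.0, dot)))

if __name__ == "__main__":
    # Pi-poles quoted in the text, as (plunge, trend) in degrees.
    print("Clair Camp, TLS vs. field:", angle_between_lines((35, 215), (18, 193)))
    print("Noonday, MVS vs. field:", angle_between_lines((9, 338), (0, 350)))
```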

Nevertheless, for both the Clair Camp and Noonday sites, the model measurements were consistent with field observations, but the scatter patterns are distinctly different. We suggest that further work is needed to resolve the source of the differences in scatter pattern between field- and model-derived measurements before methods of the type employed in this study are used widely for structural analysis.

Comparison of 3-D Interpretations versus 2-D Mapping

Virtual-globe software like Google Earth or National Aeronautics and Space Administration (NASA) WorldWind displays Earth’s surface by draping satellite imagery onto relatively low-resolution (typically 30–90 m/pixel) DEMs (Mahdavi-Amiri et al., 2015). This “2.5-D” visualization of a 3-D world gives satisfactory representations of Earth’s surface for subdued terrain, but where slopes are >45° this approach produces image distortions and pixel smear that can confuse 3-D interpretation and introduce large errors in geologic models (e.g., Pavlis and Mason, 2017). Here we attempt to quantify this effect by comparing mapping performed using the 2.5-D method versus direct mapping on a 3-D model. In all cases we compare linework drawn directly on MVS point clouds using I-Site Studio (referred to here as 3-D mapping or true 3-D) to 2-D map traces projected onto a 30 m DEM using Move software (referred to here as 2.5-D mapping).

The simplest case for this comparison is the Surprise South site (Fig. 11). Here we compare the 3-D traces of a nearly homoclinal succession of layers that dip subparallel to one another but are slightly steeper than the ∼50° topographic slope, forming a classic dip slope. In our experience (e.g., Pavlis et al., 2012) this classic “law of Vs” visualization is commonly rendered poorly with 2.5-D methods, particularly when topographic details are smaller than the DEM cell size. The law of Vs refers to the V shape that geological contacts, or fault lines, make when they intersect valleys as drawn on a 2-D topographic map. The apex of the V points in the dip direction of the contact or fault. The openness or narrowness of the V indicates the general dip of the plane (i.e., short, open Vs indicate steep dips whereas long, narrow Vs indicate shallow dips). For Surprise South (Fig. 11) this is exactly the case, with mapped bedding trace lines appearing ragged and non-coplanar in a 3-D visualization of the 2.5-D model. In contrast, using the true 3-D method (Fig. 11), the 3-D visualization appears more realistic and the bedding traces are uniformly planar to gently curved in shape. This improvement is produced by the higher spatial resolution of the MVS model, which captures subtle terrain variations that faithfully reproduce the 3-D traces of contact V shapes in the small gullies that are poorly resolved in the lower-resolution DEM.

If a cylindrical fold is accurately mapped in 3-D it should be easily projected to a cross-section line by projection of the linework along the trend of the fold axis, as in classic down-plunge section construction techniques. Thus, this method should provide an assessment of how surface model accuracy, and geologic information derived from it, can affect reconstruction of geologic structure in 2.5-D versus 3-D methods. Figure 12 shows this approach applied to a fold system in lower Surprise Canyon, where one line set is derived from the 2.5-D method and the other from the 3-D method. Both sets of lines were produced by projecting the 3-D positions of points on the line parallel to the local fold trend (169° azimuth with a 12° plunge determined from stereogram analysis) onto the same vertical cross-section perpendicular to the local fold trend using the “project to section” module in Move software. In this case, the fold geometry obtained from the 3-D method (red lines) produces a nearly perfect rendering of the fold profile versus the poor rendering of the fold geometry obtained from the 2.5-D method (blue lines). This result is somewhat surprising given that the terrain in this area is not excessively steep, with most of the area easily accessible on foot. The problems with the 2.5-D method on the northern side of lower Surprise Canyon are similar to those of the Surprise South case, in that the scale of subtle terrain variations is comparable to the 30 m resolution of the DEM, which smears the linework in 3-D sufficiently that it cannot be projected faithfully to the line of section.
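The projection described above, in which points are moved parallel to the fold axis until they reach a vertical section perpendicular to the fold trend, can be sketched in Python as follows. This is a generic geometric illustration and not the “project to section” module in Move; the function name, the (east, north, up) coordinate convention, and the choice of section origin are assumptions made for the example.

```python
import math

def project_to_section(points, trend_deg, plunge_deg, origin=(0.0, 0.0, 0.0)):
    """Project points along a fold axis onto a vertical section.

    points: iterable of (east, north, up) tuples.
    trend_deg, plunge_deg: orientation of the fold axis.
    origin: a point the vertical section passes through.
    Returns (horizontal position along the section, elevation) pairs.
    """
    t, p = math.radians(trend_deg), math.radians(plunge_deg)
    if abs(math.cos(p)) < 1e-12:
        raise ValueError("a vertical axis never intersects a vertical section")
    # Unit vector pointing down-plunge along the fold axis.
    axis = (math.sin(t) * math.cos(p), math.cos(t) * math.cos(p), -math.sin(p))
    # Horizontal normal of the section plane (parallel to the fold trend).
    normal = (math.sin(t), math.cos(t), 0.0)
    # Horizontal direction lying within the section plane.
    along = (math.cos(t), -math.sin(t), 0.0)
    out = []
    for pt in points:
        rel = tuple(c - o for c, o in zip(pt, origin))
        offset = sum(r * n for r, n in zip(rel, normal))
        s = -offset / math.cos(p)  # distance to slide along the axis
        proj = tuple(r + s * a for r, a in zip(rel, axis))
        h = proj[0] * along[0] + proj[1] * along[1]
        out.append((h, proj[2] + origin[2]))
    return out

if __name__ == "__main__":
    # Hypothetical point 10 m north of a section, axis plunging 45 degrees north.
    print(project_to_section([(0.0, 10.0, 0.0)], 0.0, 45.0))
```

For the fold trend quoted in the text, the call would take the form project_to_section(points, 169.0, 12.0, origin) with an origin chosen on the desired line of section. Linework that is truly coplanar with a cylindrical fold collapses to a clean profile under this projection, whereas position errors in the input points appear as scatter in the section.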

To further analyze these methods, we explored the complex structure on the north wall of Pleasant Canyon using both the 2.5-D and the 3-D methods to compare maps obtained for near-vertical cliffs. We refer to this feature as the Noonday structure because it is primarily developed in the Noonday Dolomite (lighter colored rock capping the ridge; Animation 2) and is the same area described as the Noonday site above. The two methods yield vastly different results (Fig. 13; Animation 3). The greatest discrepancy is a mismatch along the prominent cliff face where the main Noonday structure is exposed (Fig. 13). Here the contact lines drawn from the 2.5-D visualization are shifted downslope relative to the 3-D lines drawn on the MVS model (Fig. 13; Animation 3). In addition, the contact lines drawn from 3-D visualization are shaped differently than those of lines from 2.5-D visualization. This discrepancy cannot be due to a mismatch between models because the MVS model is co-registered with the TLS model to within centimeters, and the TLS model, in turn, is referenced to the same geographic reference as the DEM used in the 2.5-D method. Discrepancies of this sort on cliff faces are consistent with known issues with the 2.5-D method (e.g., Pavlis and Mason, 2017), but there is a notable subtlety in this particular case. Specifically, the systematic shift in the linework between the two methods suggests a fundamental georeferencing error in the 2.5-D data. We suggest that this shift originates from improper orthorectification of the satellite image that was used to produce the base map image used in the 2.5-D method, information that was unavailable in these downloaded data. 
This conclusion arises from previous experience with orthorectification of high-resolution, off-nadir satellite imagery in steep terrain (Pavlis et al., 2012), where orthorectification software cannot faithfully orthocorrect the much higher-resolution image using the low-resolution DEM, and this orthocorrection error is propagated into the 3-D line traces in the 2.5-D method. In Pleasant Canyon, the most likely origin of the observed distortion in the 2.5-D model is a combination of georeferencing errors that are artifacts of the orthorectification and an additional effect resulting from the look angle. To understand the latter, consider the case of a satellite image taken off nadir with a look angle from the south. In such an image, the Noonday site cliff face would occupy a larger fraction of the resulting scene relative to a nadir-looking view. If that image were draped onto a high-resolution terrain model with extensive ground control on the image, orthorectification software could correct for this geometry. However, in cases like this one where the image is draped on a low-resolution DEM, the orthorectification introduces a systematic error. This effect is well known in orthocorrection and is essentially the complementary effect of pixel smear; i.e., pixel smear is generated when the look angle of an image is close to parallel to the topographic slope, smearing pixels along the look direction, whereas in the Noonday case, the slope is at a high angle to the view and excess pixels allow a clear but distorted image of the cliff. Note that had we limited our work to the 2.5-D method, this error in 3-D geologic interpretations would have been undetectable and the resultant geologic model distorted from its true geometry.
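The size of the look-angle effect can be gauged with the standard relief-displacement relation: when an off-nadir image is orthorectified against a DEM whose elevation is wrong by some amount at a given ground point, the point is displaced horizontally by roughly that elevation error times the tangent of the off-nadir angle. The sketch below is illustrative only. The look angle of the satellite image used here is not reported, so the 30 m elevation error and 20° off-nadir angle are hypothetical values, and the function name is an assumption.

```python
import math

def ortho_shift_m(dem_height_error_m, off_nadir_deg):
    """Approximate horizontal displacement of an orthorectified pixel caused
    by a DEM elevation error, for a distant sensor viewing off nadir."""
    return dem_height_error_m * math.tan(math.radians(off_nadir_deg))

# Hypothetical example: 30 m elevation error on a cliff, image 20 degrees off nadir.
print(round(ortho_shift_m(30.0, 20.0), 1))  # about 10.9 m along the look direction
```

Because a coarse DEM smooths a cliff into a slope, the elevation error varies systematically across the cliff face, so the resulting shift is systematic rather than random.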

Finally, we consider the case of Wildrose Canyon where there is significant topographic relief but modest slopes of <45° (Fig. 8B). Here, lithologic units are manifest as conspicuous color bands easily seen on all imagery, allowing direct comparison of 2.5-D and 3-D mapping (Fig. 14). In this case, comparison of the same contacts drawn on the TLS 3-D terrain model versus the 2.5-D method produces line traces that are nearly indistinguishable from each other (Fig. 14; Animation 4). Indeed, in this case the 2.5-D method is arguably superior because the imagery used in that model is of higher resolution than the 3-D model and the contacts are more easily seen on the vertical-incident flat-map images than in the 3-D model acquired at ground level. Nonetheless, there is a notable systematic shift of ∼5 m between the two models that could have resulted from inaccuracy of the image drape onto the DEM, an improper vertical datum correction between the models, or both.

Sources of Error Associated with Obtaining Ground Control

The four ground control methods utilized in the uncontrolled photogrammetric modeling scenarios had varying combinations of error sources that affected model accuracy. The discussion here is focused on ground control accuracy and how the process of obtaining ground control might be improved to increase model accuracy. How these errors, combined with image collection errors, affect the resulting models of each method will be discussed in the following section.

The purpose of exploring method 1 was to show the results of an approach that requires the least amount of field and preparation time to perform. Method 1 is subject to location error of the recreational GNSS receiver and the error from the laser rangefinder. The laser rangefinder error is related to a combination of instrument error, aiming issues, and (in)ability to accurately recognize the natural object later on in the imagery. It may be possible to minimize the potential for error caused by recognizing natural objects in the imagery by placing artificial markers in the field (essentially a combination of methods 1 and 3), but this approach would largely eliminate the advantages of method 1. The aiming issue, however, is probably the largest source of error in this method. Aiming a handheld rangefinder and accurately hitting a target is analogous to aiming a handgun and hitting a target at 1 km, virtually impossible for even the best marksman. For this same reason, it is important to make sure the targeting system of the laser ranging device is calibrated (analogous to making sure the gunsight is not off). We suggest that one solution to the aiming problem would be routine use of a tripod for the laser rangefinder, preferably a lightweight collapsible one in order to maintain the mobility advantage of this method, as well as a remote trigger.

Method 2 is subject to location error associated with the TLS data as well as the ability to reliably match objects between the two models. In the case of method 2, the errors that are present are almost certainly controlled primarily by the size of the objects selected as GCPs and the ability of the analyst to precisely locate the same natural object, and the exact selected point on the natural object, in two different models. Objects should ideally be non-rounded shapes such that the analyst can choose a specific corner for the location of the point, uniquely colored relative to the majority of the scene, and ∼1–10 m in length for large-scale scenes such as ours—though it should be noted that the ideal size of objects depends heavily on the overall scale of the scene being imaged.

Method 3 represents the standard method used for obtaining GCPs, which is placing artificial markers in the scene and obtaining the position of those markers with a high-precision GNSS. However, method 3 also has the potential to suffer from recreational-grade GNSS location error, errors associated with locating artificial markers later on in the imagery, and field site limitations that may not permit marker placements in parts of the scene.

Method 4 is subject to recreational-grade GNSS location error, but more importantly, the lack of GCPs in the scene compounds other geometric factors like baseline-distance ratios such that this method produces unacceptable results for all ground-based cases we considered. Large rigid-body rotation errors are characteristic of this method, largely due to other geometric factors discussed below.

Sources of Error for Photogrammetrically Derived Terrain Models in Uncontrolled Scenarios

Our experiments comparing the accuracy of MVS models relative to TLS reference models illustrate that significant care should be taken when a terrain model is produced from oblique, ground-based imagery alone. The primary observations from our experiments (Table 3) include the following:

  1. All cases that based ground control solely on camera positions (method 4) produce a large spatial reference error. Use of a well-calibrated camera lens would likely lower this error, but systematic errors, like model rotation, would not be mitigated by using a better camera. It appears that models built without GCPs and that instead only employ ground-based camera positions should never be used for geologic mapping purposes if geographic reference of any kind is needed, but more experiments are needed to test this hypothesis.

  2. Of the cases that used GCPs in addition to camera positions, the laser rangefinder method (method 1) produced the largest spatial error. Also, the error increases with distance from the image baseline unless GCPs are well placed on objects farthest away from the camera positions.

  3. In all cases that derived GCPs from natural objects co-located on both the TLS and MVS models (method 2), the spatial error is small, on the order of the size of the objects used for ground control (1–10 m). The exception is the Noonday site, where GCPs obtained from the TLS data were spatially limited to close offset distances from the baseline, which resulted in errors becoming larger at far distance offsets, an effect that was removed when GCPs at far offsets were added.

  4. In the Surprise West case (method 3) the results were comparable to those of method 2, indicating that this method produces results comparable to having a high-precision GNSS.

Collectively these observations suggest that the accuracy and distribution of GCPs used to reference a MVS model are the most important control on the model’s accuracy, with a second, potentially equally significant role played by the geometry of the image array. This conclusion is consistent with other studies (Carrivick et al., 2016), with some important additions related to modeling steep terrain, large-scale outcrop, and our field-oriented methods.
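A comparison of this kind, in which the same reference points are located on both the TLS and MVS models, can be summarized with point-to-point residual distances. The following Python sketch is not a reproduction of the MATLAB code in Supplement 2; it assumes only that each model supplies matching (easting, northing, elevation) coordinates in meters for the same reference points, and the sample coordinates are hypothetical.

```python
import math

def residuals(tls_points, mvs_points):
    """3-D distance between each TLS reference point and its MVS counterpart."""
    return [math.dist(t, m) for t, m in zip(tls_points, mvs_points)]

def rmse(values):
    """Root-mean-square of a list of residual distances."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# Hypothetical reference points (meters); real values would come from the two models.
tls = [(1000.0, 2000.0, 500.0), (1200.0, 2100.0, 650.0)]
mvs = [(1003.0, 2004.0, 500.0), (1200.0, 2100.0, 662.0)]
r = residuals(tls, mvs)
print(r, rmse(r))
```

Plotting each residual against the distance of its reference point from the image baseline is a simple way to reveal the growth of error with offset noted in observations 2 and 3.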

First, we suggest here that these observations together indicate that certain common image array geometries conspire with GCP error to produce rigid-body rotation errors that are difficult to eliminate in any ground-based MVS study. Regardless of the source of GCP error, consider the geometric problem of an approximately linear imaging array. In this case, the system has cylindrical symmetry with the cylinder axis coincident with the imaging array (Fig. 9). Because the array is long in one direction, the array geometry is highly favorable for resolving position along the axis of the cylinder (z) as well as radial distance (r) provided r is on the order of the length of the array (Fig. 9). However, although distance can be accurately assessed, the system is subject to rigid-body rotation errors around the axis of the cylinder (Figs. 6E and 7). This is undoubtedly the primary reason for failure of method 4 with camera positions only; i.e., if we know only camera positions along a linear array, there is an infinite range of solutions distributed as rigid-body rotations about the cylinder axis (Animation 1), hence camera positions alone are subject to large rigid-body rotation errors in almost any scenario where the image array is not well distributed in 3-D. Similarly, when GCPs are not uniformly distributed through the scene, particularly when they are limited to close offsets (e.g., method 1), large errors can result from seemingly tiny rigid-body rotations. For example, consider the case of Wildrose North, where the imaging array is relatively linear and GCPs are limited to ∼500 m maximum distance from the baseline in a scene that extends to nearly 1500 m from the baseline in 3-D (Fig. 8B). If the GCPs have an error of 5 m at 500 m, the resultant angular error is small (∼0.5°), but if this error is transferred as a rigid-body rotation, the error at 1500 m is ∼15 m. 
In the specific case of Wildrose North, the error at ∼500 m is ∼15 m, and thus, a rigid-body rotation error at 1500 m would be ∼45 m, very close to what is observed (Table 3; Fig. 7B). Note that because this rotation is small (∼1.5° for Wildrose North), it would be difficult to recognize in the absence of our TLS data. Given this insight, we have identified similar rigid-body errors in all of our cases where the imaging array was approximately linear. In Pleasant Canyon, for example, the imaging array is approximately linear, and when only close-in GCPs from the TLS model were used, there was a large error, but this error disappeared when GCPs were placed at the far offsets. This relationship also explains why the Surprise Canyon models generally gave better results, even when GCP distribution was marginal, because the imaging arrays in those cases departed significantly from linear.
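The arithmetic behind this argument is simple enough to check directly. The Python sketch below treats the GCP error as a small rotation about the imaging baseline and scales it out to a farther offset; the function names are illustrative, and the inputs are the values quoted in the text.

```python
import math

def rotation_angle_deg(gcp_error_m, gcp_offset_m):
    """Rotation about the baseline implied by a GCP error at a given offset."""
    return math.degrees(math.atan2(gcp_error_m, gcp_offset_m))

def error_at_offset_m(gcp_error_m, gcp_offset_m, target_offset_m):
    """Positional error produced by that same rotation at a farther offset."""
    angle = math.atan2(gcp_error_m, gcp_offset_m)
    return target_offset_m * math.tan(angle)

# Generic example from the text: 5 m error at 500 m, evaluated at 1500 m.
print(round(rotation_angle_deg(5, 500), 2), round(error_at_offset_m(5, 500, 1500), 1))
# -> 0.57 15.0

# Wildrose North: about 15 m error at about 500 m, evaluated at 1500 m.
print(round(rotation_angle_deg(15, 500), 2), round(error_at_offset_m(15, 500, 1500), 1))
# -> 1.72 45.0
```

The computed angles (about 0.6° and 1.7°) agree with the approximate values quoted above, and the error grows linearly with offset from the baseline, which is why GCPs at far offsets are so effective at suppressing it.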

In retrospect, this conclusion is obvious from a well-known relationship in the MVS literature (e.g., Furukawa and Hernández, 2015) that the ideal imaging array for MVS is 3-D, surrounding the subject. Nonetheless, real-world scenarios of ground-based imaging generally make this ideal geometry nearly impossible to obtain. For example, a common scenario for ground-based imaging in steep terrain is the one we typically did here, largely because of logistical constraints. That is, walking a ridge across from a cliff or walking along a valley below the cliff is logistically straightforward, yet either scenario generates a relatively linear array. This means that in order to best image a cliff face for MVS modeling, a linear image array should be avoided unless extensive GCP placement is possible. If GCP placement is not possible, a vertical component needs to be added to the image array as much as possible, a potentially difficult task if steep terrain makes this very challenging or even dangerous on foot. In a canyon this problem might be solved by imaging along the valley floor and adjacent ridgeline, but in isolated cliffs this would be impossible. Thus, in any realistic ad hoc field scenario like that examined here, this problem is likely to exist, and future studies should keep this inherent geometric limitation in mind when designing a field study.

Second, camera resolution as well as camera lens quality clearly play roles in model accuracy and resolution, but we did not conduct controlled experiments with variations in equipment to analyze this issue. There has been some work on how camera resolution and lens quality affect model accuracy (Carrivick et al., 2016; Mosbrucker et al., 2017), and we see many examples of the problem from experiments in progress with drone-based platforms. Nonetheless, more work is needed on the problem, particularly as camera systems and data processing technologies improve.

Third, in realistic field scenarios, use of natural objects as GCPs is an attractive scenario because it does not require placing and/or recovering artificial markers. Moreover, with method 1, use of a laser rangefinder could allow GCPs to be remotely located, allowing GCP placement on unreachable cliff faces or ridge tops. Unfortunately, our experience with method 1 was discouraging because of large errors, yet this experience indicates that there are solutions to the problem. The use of a tripod and a remote trigger with a rangefinder could minimize aiming errors, and multiple repeat measurements on the same object could allow collection of statistics on location precision, potentially making method 1 viable. Thus, future studies should address this technique. In any case, for all of the sites where we used natural objects co-located on TLS and MVS models as GCPs, the spatial errors are on the order of the size of the object (1–10 m in length). Thus, absolute positioning errors on the order of the size of the objects used for the GCPs can be readily obtained, and for nearly all bedrock field geology studies this level of accuracy is very acceptable.
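The suggestion of repeat rangefinder measurements can be made concrete with a short Python sketch. It assumes, hypothetically, that each shot records slope range (m), azimuth (degrees clockwise from north), and inclination (degrees above horizontal) from a station with known (east, north, up) coordinates; the function names and sample shots are illustrative and were not part of this study.

```python
import math
from statistics import mean, stdev

def shot_to_position(station, slope_range_m, azimuth_deg, inclination_deg):
    """Convert one rangefinder shot into target (east, north, up) coordinates."""
    az = math.radians(azimuth_deg)
    inc = math.radians(inclination_deg)
    horizontal = slope_range_m * math.cos(inc)
    return (station[0] + horizontal * math.sin(az),
            station[1] + horizontal * math.cos(az),
            station[2] + slope_range_m * math.sin(inc))

def summarize_shots(station, shots):
    """Mean target position and per-axis sample standard deviation
    for repeated shots (at least two) at the same object."""
    positions = [shot_to_position(station, *shot) for shot in shots]
    columns = list(zip(*positions))
    centre = tuple(mean(c) for c in columns)
    spread = tuple(stdev(c) for c in columns)
    return centre, spread

# Hypothetical repeat shots at one natural object: (range, azimuth, inclination).
shots = [(812.4, 41.2, 6.1), (815.0, 41.5, 6.0), (810.9, 40.9, 6.3)]
centre, spread = summarize_shots((0.0, 0.0, 0.0), shots)
print(centre, spread)
```

The spread reflects only the repeatability of aiming and ranging. It does not capture the GNSS error of the station itself or a miscalibrated sight, both of which shift every shot in the same way.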

Sources of Error in 3-D Interpretations

More work is needed to determine the accuracy of 3-D geologic interpretations using different methods, but this study suggests some clear guidance on where errors arise and methods diverge. Aside from spatial errors within the terrain models themselves, the primary source of mapping errors in both 3-D and 2.5-D mapping methods apparently arises from image drape. This source of error is most obvious in 2.5-D mapping methods where vertical imagery is draped onto a low-resolution DEM producing a variety of artifacts ranging from pixel smear to image distortions (Pavlis and Mason, 2017). In Surprise South, one example of errors in the 2.5-D approach is apparent, where the low-resolution DEM leads to inaccuracy in vertical position, producing ragged lines in 3-D rather than smoothly curved to planar surfaces seen in true 3-D representations (Fig. 11). In an area of more subdued terrain such as Wildrose Canyon, the distinction between the 2.5-D and 3-D mapping methods is not significant (Fig. 14; Animation 4), indicating that the 2.5-D approach is acceptable in these conditions.

Problems like those in Surprise and Wildrose Canyons are relatively well known, but more subtle discrepancies are present in the Pleasant Canyon model that could be easily misinterpreted. Here pixel smear is minimal in the 2.5-D model and the structure seems visually reasonable when the 2.5-D mapping is shown in 3-D visualizations (Animation 3; Fig. 13), suggesting that the 2.5-D method is producing a reasonable rendering of the 3-D geometry. This impression, however, is false. When lithologic boundaries are compared (Fig. 13), there is a systematic shift in 3-D positions between the linework obtained using 3-D mapping versus that drawn using the 2.5-D method. Moreover, smoothing of the terrain in the low-resolution DEM gives the appearance of a relatively smooth, albeit steep, slope where a prominent isoclinal fold is visible in both models (Fig. 13). In fact, this structure is actually exposed entirely on a near-vertical cliff, which is visible clearly in the MVS view (Fig. 13A; Animation 2). We suggest that the absence of pixel smear on the 2.5-D model actually hints at the source of the systematic error seen in this visualization. That is, the original satellite image used in the 2.5-D method had a look angle at a relatively high angle to the cliff face, which faithfully preserved the image of the cliff but distorted the image when draped onto the low-resolution DEM. We reiterate that this distortion would have been impossible to detect were it not for the true 3-D representation provided by the MVS model.

It is important to note that image drape errors are not limited to the case of near-vertical images draped onto a DEM. This is most obvious at the Clair Camp site, where we recognize deviation in interpretations of the TLS-derived terrain model relative to map-based orthophoto interpretations that appears to result from stretching of the draped photograph on the TLS model (Animation 5). This arises because the photographs were taken from near the canyon floor, leading to a view that is highly oblique to the target surface, generating pixel smear and distortions analogous to pixel smear and image distortions in the 2.5-D method where vertical imagery is draped onto an elevation model in steep terrain. That is, because of look angle, this method remains a 2.5-D method, despite the high-resolution terrain model. Thus, in this case, line positions mapped on the nadir-looking orthophoto are probably as well constrained, or better constrained, than line positions based on the image drape to the TLS terrain model.

Because image drape errors appear to be the major issue in both TLS and the conventional 2.5-D methods with vertical imagery, our limited data set suggests strongly that direct mapping on an MVS point cloud is a superior method for 3-D mapping, provided the MVS model is accurately georeferenced. This conclusion is relatively obvious from the basic distinction between any model that requires a photographic drape (e.g., TLS or the 2.5-D method) versus MVS. In any colored terrain model where the imagery is draped on the model, the image drape is subject to look-angle distortions, whereas in MVS, every point in the point cloud is in its true position and has the proper color for its position because it is made from the same photographs that were used to generate the model. In essence, this means that in a MVS point cloud each point is a 3-D pixel that is not subject to pixel smear. Its spatial position may be misplaced due to model errors, but it will always be the proper color for its relative position.

Based on these observations, we suggest that different methods should be considered based on local terrain. Where terrain is relatively subdued (slopes generally <45°) and steeper escarpments are smaller than the scale of geologic features being analyzed, 2.5-D methods are preferred due to their simplicity and their tie to well-established methods. As terrain becomes steep, particularly where features to be analyzed are smaller than the scale of escarpments, a true 3-D mapping approach is needed. MVS modeling provides the simplest and generally superior method for generating the terrain model base, provided good spatial referencing can assure an accurate terrain model.

Remote Sensing of Orientations

The similarities in orientations obtained from analyzing points on the TLS terrain model relative to field measurements suggest that these digital techniques show great promise in analyzing orientations in inaccessible sites. Nonetheless, the model-based measurements versus the field measurements (Fig. 10) show different types of scatter. Some of this scatter may be real, but much of the difference appears to be measurement error on the model that depends on outcrop geometry. This suggests that spatial analysis tools of this type are useful, but caution is needed in assessing the 3-D validity of the analysis. Because these data can be acquired rapidly, one solution to the problem may be to obtain very large numbers of measurements from the model, even repeat measurements of the same feature, to obtain a better representation of geometry. Similarly, in our experience, visual inspection of the results in a 3-D rendering can greatly aid quality control, but this assessment is dependent on user skill and the quality of the visualization. Thus, this area is an important avenue for future research.
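Model-based orientation measurements of this kind ultimately reduce to finding the plane through points picked on a surface. As a minimal Python sketch of the simplest case, the classic three-point solution, the code below returns dip direction and dip for three points given as (east, north, up) coordinates. It is not the algorithm of any particular software package, and the function name and sample points are hypothetical.

```python
import math

def plane_orientation(p1, p2, p3):
    """Dip direction (degrees clockwise from north) and dip (degrees)
    of the plane through three non-collinear (east, north, up) points."""
    v1 = tuple(b - a for a, b in zip(p1, p2))
    v2 = tuple(b - a for a, b in zip(p1, p3))
    # Plane normal from the cross product of two in-plane vectors.
    nx = v1[1] * v2[2] - v1[2] * v2[1]
    ny = v1[2] * v2[0] - v1[0] * v2[2]
    nz = v1[0] * v2[1] - v1[1] * v2[0]
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("points are collinear; no unique plane")
    if nz < 0:                      # force the normal to point upward
        nx, ny, nz = -nx, -ny, -nz
    dip = math.degrees(math.acos(nz / length))
    # The horizontal part of the upward normal points down-dip.
    dip_direction = math.degrees(math.atan2(nx, ny)) % 360.0
    return dip_direction, dip

# Hypothetical points on a surface dipping 45 degrees toward the east.
print(plane_orientation((0, 0, 0), (0, 1, 0), (1, 0, -1)))
```

Three closely spaced or nearly collinear points give an unstable normal, which is one way that outcrop geometry can introduce the measurement scatter described above. Repeating the calculation over many point triplets on the same surface gives a direct view of that scatter.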

Developing Photogrammetric Terrain Models from Oblique Imagery: Best Practices

The data presented here suggest strongly that care should be taken in developing MVS models from ground-based photographs alone, particularly in areas with steep terrain and large outcrops, but that the problems are not insurmountable. We suggest the following best practices:

  • Use a high-quality camera (such as the Nikon or the Canon used here) along with a recreational-grade GNSS receiver for locating camera positions. Camera settings and image acquisition practices suggested by Mosbrucker et al. (2017) as well as data acquisition guidelines synthesized by Smith et al. (2016) would be ideal to follow.

  • Plan both the image array geometry and the placement of GCPs prior to acquisition of imagery for a MVS model using a digital globe system (e.g., Google Earth, ArcGIS Pro, etc.). This step will also allow for evaluation of terrain and access to the field area. The image array geometry should ideally have a 2:1 baseline-to-distance ratio as well as significant elevational variation (ideally the same length as the baseline) to avoid linear imaging arrays as much as possible. A drone with a GNSS receiver is recommended in areas where the terrain prevents obtaining camera positions at sufficiently varied elevations. Make sure GCPs are well distributed across the scene.

  • If a laser rangefinder is to be used for location of GCPs, it should be tripod mounted and make use of a remote trigger. Calibrate the laser rangefinder at each new location. Even if the GCPs are preplanned, take photos and detailed notes (or annotate photos if using a tablet for mapping) in order to ensure positive identification during image processing.

  • If field time allows and/or absolute accuracy is critical to the project, place several artificial markers in the scene and obtain positions using the recreational-grade GNSS receiver or higher-precision receiver if possible. These GCPs will be an addition to the GCPs obtained with the laser rangefinder. However, if relative accuracy is all that is required or field time is limited, this step may be skipped, potentially at the expense of model accuracy.

  • After models are constructed and relative accuracy is determined to be reasonable, the model may be used to draw 3-D interpretations and obtain orientations. Three-dimensional mapping is a learned skill. It is critical to manipulate the model from multiple views to avoid interpretation errors. For example, in folded rocks, views down the plunge of folds are informative and help resolve pseudo-folds generated by surfaces intersecting terrain when models are manipulated. After a line is drawn, it is advisable to turn off the background view to search for artifacts; e.g., when interpreting a point cloud, stray points or holes can produce errors.

  • For obtaining orientations, begin with taking several measurements at locations on the model that correspond to known measurements in the field. This will allow a quality check of the model to ensure reliable measurements are obtained. Three-dimensional contacts can also be used to evaluate orientations when carefully drawn.

  • If the quality check passes, obtain orientation measurements on clear and recognizable surfaces within the scene. Avoid areas with vegetation or data gaps. Make sure the surface selection method is selecting the surface intended by the user prior to saving the measurement (most software packages use a three-point pixel selection method, a triangular drag-and-draw patch selection method, or an automated surface detection method).

Potential of Aerial Imagery with Ground-Based Photography for MVS Models

Terrestrial photogrammetry and terrestrial laser scanning work best when the features being photographed or scanned are oriented perpendicular to the optic axis of the camera or scanner. With recent advances in inexpensive, small unmanned aerial vehicles (UAVs), many of the issues from an insufficient range of look angles should disappear. With these platforms, a combination of nadir-looking aerial imagery, oblique aerial imagery, and ground-based oblique imagery could easily be merged into a terrain model of unprecedented detail and accuracy. Unmanned aerial vehicles can be equipped with submeter positioning, potentially allowing high-resolution models from camera positions alone, which could further revolutionize use of this technology. Already, the terrain models generated from MVS technology are something field geologists a generation ago would not have thought possible, but the addition of airborne imagery to these models will, we believe, revolutionize field geology, particularly where high-resolution mapping is needed to solve a problem.

Multiview stereo photogrammetry has drawn attention as a more portable, cost-effective alternative to lidar, and studies have shown that MVS photogrammetry produces high-resolution models with absolute and relative positional accuracy that are comparable to lidar (see summaries by Carrivick et al. [2016]). However, many of the existing studies have been controlled experiments, focused on hand sample– and/or outcrop-scale features. We present a real ad hoc field scenario where MVS photogrammetry can be applied to macroscale features such as entire canyon walls that would be difficult and impractical to map on foot. We demonstrate here that though MVS photogrammetry produces high-resolution models allowing 3-D mapping in unprecedented detail, caution is needed where absolute position is concerned because it is strongly controlled by the quality and distribution of GCPs as well as the geometry of the image array. These problems are not insurmountable but need to be considered as this technology is applied. Future studies may show that the addition of oblique aerial imagery obtained from UAVs may reduce some of the errors shown here. Nevertheless, field geology is poised for change and is closer to making a substantial transition into the digital world of 3-D visualization and mapping.

This project was supported by U.S. National Science Foundation grant EAR-1250388 to Pavlis. We thank UNAVCO for providing TLS equipment. We thank Midland Valley Exploration Ltd. and Maptek Ltd. for software donations of Move and I-Site Studio software, respectively, which made this project possible. Josh Cobb, Warren Allen, Tai Subia, and Samantha Ramirez helped us during different phases of the field work, and we thank Laura Serpa for input on the manuscript. Two anonymous reviewers’ comments led us to realize an initial draft of this manuscript could be easily misunderstood on its purpose and scope, and led to this improved version of the paper.

Supplement 1. Shapefiles of the camera positions and ground control points from the seven multiview stereo sites used in this study: Clair Camp; Noonday; Surprise North; Surprise South; Surprise West; Wildrose North; and Wildrose South. All files are projected in North American Datum 1983, Universal Transverse Mercator zone 11N.
Supplement 2. The trigonometry-based MATLAB code for calculating the residual distance between terrestrial laser scanner (TLS) and multiview stereo (MVS) terrain models. Included are two CSV files containing the coordinates of the ten selected reference points used to evaluate the results for the Noonday site: one for the TLS coordinates, and the other for the MVS coordinates.
Science Editor: Raymond M. Russo
Associate Editor: Francesco Mazzarini
Gold Open Access: This paper is published under the terms of the CC-BY-NC license.
