Digital outcrop models (DOMs) have revolutionized the way twenty-first century geoscientists work. DOMs are georeferenced three-dimensional (3-D) digital representations of outcrops that facilitate quantitative work on outcrops at various scales. Outcrop digitalization has been traditionally conducted using laser scanners, but in the past decade, it has grown exponentially because of efficient and consumer-friendly structure-from-motion (SfM) algorithms concurrent with the rapid development of cost-effective aerial drones with high-resolution onboard cameras. While DOMs are routinely used in geoscientific research, education, and industry, enhanced DOM usage is restricted because raw data (e.g., photographs) and metadata are often incomplete and/or unavailable. In this contribution, we present the Svalbox Digital Model Database (Svalbox DMDb), a database of metadata and openly available data packages for individual DOMs. The Svalbox DMDb is a regional DOM database geographically constrained to the Norwegian High Arctic archipelago of Svalbard at 74°N–81°N and 10°E–35°E. Svalbard offers exceptional-quality, vegetation-free outcrops with a wide range of lithologies and tectono-magmatic styles, including extension, compression, and magmatism. Data and metadata of the systematically digitalized outcrops across Svalbard are shared according to FAIR principles through the Svalbox DMDb. Fully open-access, downloadable data packages include not just the DOMs themselves, but also the input data, processing reports and projects, and other data products such as footprints and orthomosaics. Rich metadata for each DOM include both technical and geological parameters, enabling visualization and integration with regional geoscientific data available through the Norwegian Polar Institute and the Svalbox online portal. The current release of Svalbox DMDb, documented in this contribution, comprises 135 DOMs that cumulatively cover 114 km2 of Proterozoic to Cenozoic stratigraphy.

Outcrops provide an important bridge between multiscale data sets (e.g., borehole = high resolution, low spatial extent; seismic data = lower resolution, significant spatial coverage). Digital outcrop models (DOMs) complement traditional field work by extending the field season indefinitely and providing safe access to steep and remote sites that are otherwise inaccessible through traditional field work. Used as subsurface analogues, qualitative and quantitative outcrop data are critical for petroleum exploration, CO2 storage, and groundwater production (Howell et al., 2014; Marques et al., 2020). Correctly georeferenced digital outcrops streamline the process of digital data extraction and allow integration of quantitative data that are suitable for mapping (Jacquemyn et al., 2015; Nesbit et al., 2018; Martinelli et al., 2020) and geomodeling (Pringle et al., 2006; Howell et al., 2014).

Rapid advances in three-dimensional (3-D) modeling methods and imaging platforms have driven the uptake of DOMs as an important addition to real-world analogues in geology and the wider geosciences (James et al., 2019; Marques et al., 2020; Over et al., 2021; Bistacchi et al., 2022). In the past, DOMs were collected using terrestrial LiDAR scanners (TLS; where LiDAR denotes light detection and ranging; Buckley et al., 2008a; Howell et al., 2014; Howell and Burnham, 2021). Large outcrops that span several kilometers, such as the Book Cliffs, Utah, and the Billefjorden Trough in Svalbard (Buckley et al., 2008b), were scanned with helicopter-mounted LiDAR scanners (e.g., Rittersbacher et al., 2014a; Smyrak-Sikora et al., 2021). LiDAR scanning, be it from the ground or from the air, often involves expensive equipment, significant mobilization costs, and lengthy processing times with specialized software (Howell and Burnham, 2021).

In recent years, surface reconstruction methods that utilize structure-from-motion (SfM) and multiview stereo-photogrammetry techniques have been applied to digital outcrop data collection, and they represent a cost-effective approach to generate DOMs with off-the-shelf components (e.g., Westoby et al., 2012; Chesley et al., 2017; Nesbit et al., 2020; Bistacchi et al., 2022). The exponential growth of cost-effective consumer-grade unmanned aerial vehicles (UAVs, or drones) with high-quality cameras revolutionized safe access to previously inaccessible outcrops and has established the use of UAV-based SfM photogrammetry as a useful field tool in the geosciences (Colomina and Molina, 2014; Chesley et al., 2017; Nesbit and Hugenholtz, 2019; Śledź et al., 2021; Asadzadeh et al., 2022; Betlem et al., 2023a). Increasingly, 3-D surface modeling methods are also applied to rocks, hand samples, and fossils, as well as in archaeology and other disciplines, to generate high-resolution 3-D models of small objects (e.g., Westoby et al., 2012; Pepe and Costantino, 2020; Betlem et al., 2020a; Kaneda et al., 2022).

High-resolution DOMs provide a means for (semi-)automated workflows, including the interpretation of planar structures (e.g., Gomes et al., 2016; Prabhakaran et al., 2021), identification of sedimentary features, and quantitative mapping from local and regional to basin scales (e.g., Fabuel-Perez et al., 2010; Rittersbacher et al., 2014b; Rabbel et al., 2018; Burnham and Hodgetts, 2018; Burnham et al., 2020; Marques et al., 2020). DOMs also facilitate the analysis of temporal processes such as glacial mass-balance studies and geomorphic change (e.g., Eltner et al., 2017; Geissler et al., 2021). Importantly, the image-based data sets are suitable for big data and machine learning algorithms that aid in automated interpretations (e.g., Mohamed et al., 2020; dos Santos et al., 2022; Alzubaidi et al., 2022; Wu et al., 2022).

The Svalbox initiative aims to systematically digitalize Svalbard’s outcrops and integrate them with other geoscientific data (Fig. 1). The Svalbard archipelago comprises all islands located in the Norwegian High Arctic at 74°N–81°N and 10°E–35°E and features pristine outcrops with limited vegetation owing to its harsh climate. Less than 57% of the archipelago is covered by glaciers (Nuth et al., 2013), with the remainder featuring a tundra climate affected by permafrost and periglacial processes (Humlum et al., 2003). The outcropping strata record a nearly complete Devonian to Paleogene stratigraphic record, during which Svalbard drifted northward from the equator to its current position and experienced both a changing global climate and shifts in global and local tectonic configurations (Henriksen et al., 2011; Olaussen et al., 2022). Svalbard’s outcrops illustrate both extensional and compressional tectonics and feature a wide range of lithologies, including sedimentary, igneous, and metamorphic rocks. It is thus unsurprising that Svalbard has been used extensively as an analogue for the Barents Sea and other Arctic Basins (Henriksen et al., 2011; Olaussen et al., 2022). To fully realize Svalbard’s geoscientific potential and (partly) safeguard said potential from the lasting changes associated with a rapidly changing Arctic climate, as many of the archipelago’s unique outcrops as possible must be (digitally) documented and made accessible to the geoscientific community. While there are several online global databases of DOMs, notably V3Geo (Buckley et al., 2022) and e-Rock (Cawood and Bond, 2019), most do not allow users to (openly) download input data, processing reports, or DOMs for further usage. 
This does not follow the FAIR principles (i.e., findable, accessible, interoperable, and reusable; Wilkinson et al., 2016), as open accessibility of the entire data package (including input data) is essential for reprocessing the models and reevaluating previous interpretations, both of which are identified as key aspects in future research (Burnham et al., 2022).

Herein, we present the Svalbox Digital Model Database (Svalbox DMDb), a database composed of metadata and data (input, processing, and output) that are openly available. Specifically, we focus on the database framework and illustrate our efforts to digitalize and publish DOMs from Svalbard under the FAIR principles. First, we highlight the technical aspects of the database with sufficient detail such that others can recreate it for other study areas and clarify such aspects as DOM accuracy and usability. We then provide examples of how DOMs facilitate data integration and quantitative data extraction across scales in different case studies. Finally, we discuss the broader implications of un-FAIR DOMs within the geosciences and suggest a way forward for data and metadata publishing.

Data Acquisition and Processing

Established guidelines were followed to acquire and process data used to construct DOMs within the Svalbox DMDb (e.g., Roth et al., 2018; Over et al., 2021; Howell et al., 2021). A user-friendly format suitable for teaching was provided by Betlem and Rodes (2022) and is summarized in the following sections.

Data Acquisition

Images of targeted outcrops were typically collected with UAVs, though some were collected with smartphones or handheld cameras with integrated GPS receivers. Special care was given to lighting and cover (e.g., scree, snow) conditions, and acquisition parameters were adjusted to reflect this and the targeted resolution and scale of the outcrops.

Consumer-grade drones have been used for UAV-based data acquisition since 2016, with the DJI Mavic 2 Pro (with Smart Controller) primarily used since 2020. At least two transects were flown for each outcrop at various distances, elevations, and camera angles to ensure complete imagery coverage of the outcrop (>70% overlap between images). Although Svalbard’s exposures are treeless, all UAV operations were conducted manually because field conditions hinder automated flight planning. Svalbard is known for its challenging field conditions and limited data communication. The barren ground and glacial and periglacial processes greatly affect the weathering, erosion, and deposition rates. Drastic landscape changes thus take place over short time spans (e.g., a surging glacier may move several tens of meters per day). As a result, available digital terrain and digital elevation models (DEMs; Aas and Moholdt, 2020) are locally outdated, have very low confidence, and are not suitable for advanced 3-D waypoint planning software. The weather, flight (e.g., icing risk), and lighting conditions change frequently and further affect the difficulty of automation and the image quality.

Svalbox DMDb data were typically acquired for two purposes: high-resolution, proximal data suitable for centimeter-scale interpretation and distal data suitable for basin-scale analyses. Proximal data were collected at distances <30 m from the outcrop and in stationary mode or at velocities that did not exceed 1 m/s. Distal data were collected at larger distances from the outcrop and at higher velocities of up to 10 m/s. All handheld camera data acquisition was from stationary positions. Shutter speed remained above the minimum estimated from the ratio of ground sampling distance (GSD) resolution and UAV velocity.
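The relationship between GSD, UAV velocity, and shutter speed can be illustrated with a minimal sketch. It assumes that motion blur should stay within one pixel, so the exposure time must not exceed the GSD divided by the velocity; the function names and the one-pixel blur threshold are illustrative assumptions, not parameters taken from the Svalbox workflow.

```python
def max_exposure_time(gsd_m, velocity_m_s, max_blur_px=1.0):
    """Longest exposure time (s) that keeps motion blur within max_blur_px pixels.

    gsd_m        : ground sampling distance in meters per pixel
    velocity_m_s : UAV velocity relative to the outcrop in meters per second
    max_blur_px  : tolerated blur in pixels (assumed threshold, default one pixel)
    """
    if gsd_m <= 0:
        raise ValueError("GSD must be positive")
    if velocity_m_s < 0:
        raise ValueError("velocity must not be negative")
    if velocity_m_s == 0:
        # Stationary acquisition: motion of the platform imposes no limit.
        return float("inf")
    return max_blur_px * gsd_m / velocity_m_s


def min_shutter_speed(gsd_m, velocity_m_s, max_blur_px=1.0):
    """Minimum shutter speed expressed as the denominator x of 1/x s."""
    exposure = max_exposure_time(gsd_m, velocity_m_s, max_blur_px)
    return 0.0 if exposure == float("inf") else 1.0 / exposure
```

Under these assumptions, a proximal flight at 1 m/s with a 1 cm GSD would call for a shutter speed of roughly 1/100 s or faster, whereas a distal flight at 10 m/s with a 5 cm GSD would call for roughly 1/200 s or faster.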

Ground control points (GCPs) and check points (CPs) were only acquired for some of the DOMs to reduce the positional errors associated with internal GPS sensors. ArUco markers (Garrido-Jurado et al., 2014) and the OpenCV image library (Bradski, 2000) were used for the automated detection of markers, implemented through the Automated Metashape Python package (Betlem, 2022). Automated Metashape was also used to create standardized project folders and automate part of the SfM photogrammetry processing.

Data Processing

Key processing steps included quality assurance validation of tie points (i.e., through filtering based on reprojection error, reconstruction uncertainty, and projection accuracy), dense point cloud (i.e., through filtering based on confidence), and mesh (i.e., through filtering based on connected component size). An in-depth example of the processing and data assessment of a high-resolution DOM was provided by Betlem et al. (2022) and an associated data submission (see also Fig. 2 for a schematic based on Svalbox DMDb DOM 2021–0015).

All digital outcrops were reconstructed with the Agisoft Metashape photogrammetric software package (v.1.5.x to v.2.0.x). Photo matching and alignment used the upscaled or full-scale images (settings “highest” and “high,” respectively) for the greatest accuracy of the sparse cloud. The generation of depth maps and dense point clouds used either the full-scale or downscaled (×2) images as a trade-off between resolution/accuracy and processing/storage cost. Textured meshes and tiled models were generated from either depth maps or dense point clouds. DEMs and orthomosaics were generated for a subset of the data sets but can be easily generated for the remainder from the input data, which are provided as part of the data packages.

For each of the individual DOM data packages, special attention was given to the associated horizontal and vertical datums. The Svalbox DMDb internally uses the World Geodetic System 1984 (WGS 84) Universal Transverse Mercator (UTM) Zone 33N (European Petroleum Survey Group [EPSG] code 32633) projection for the DOMs’ footprints in the metadata table. The metadata table excludes a vertical datum due to the lack of a standardized regional vertical datum for Svalbard. With the exception of differential global navigation satellite system (dGNSS; e.g., GPS, Galileo, GLONASS) positioned models, the vertical datum is thus implicitly based on the built-in GNSS receiver of the camera system, corresponding to EPSG:4979. Earth Gravitational Model 2008 (EGM2008; Pavlis et al., 2012) was used during processing of the three DOMs georeferenced through dGNSS.

Svalbox DMDb Architecture and Framework

The Svalbox DMDb was implemented as a PostgreSQL database with PostGIS extension, both of which are well-established open-source systems (Fig. 3). Model metadata are grouped by model type (e.g., DOM, digital sample/drill-core models) and stored within type-specific tables. Each table row is given a unique identifier and holds the metadata, including digital object identifier (DOI), for a single DOM data package that is uploaded and registered with an external data repository (e.g., Zenodo [European Organization for Nuclear Research and OpenAIRE, 2013], Norwegian National Infrastructure for Research Data [NIRD]). Accepted key-value pairs have developed over time to accommodate more complex settings, with several parameters (“columns”) currently featuring parameter dictionaries in the JSON format to circumvent ArcGIS Server limitations, specifically key lengths. In the future, existing columns will likely be converted to the JSON format to facilitate the capture of additional metadata, such as the inclusion of processing-specific software parameters (e.g., name and build number/version of external software and processes, data set version control). DOM footprints or outlines are automatically calculated from the model data with the Point Data Abstraction Library (PDAL; PDAL Contributors, 2020), though this can also be done manually in geographic information system (GIS) applications by tracing raster (e.g., DEM, orthomosaic) outlines. An ArcGIS Server REST application programming interface (API) facilitates access to the underlying geospatial database. A calendar-versioned database release is also hosted on Zenodo (Betlem et al., 2023b), which itself features a well-documented API for direct access to the archived data packages.
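As a hedged illustration of how such an ArcGIS Server REST end point might be queried, the sketch below builds a standard feature-layer `query` request. The layer URL is a placeholder and not the actual Svalbox end point (see Table 2 for the real link); only the generic `where`, `outFields`, and `f` query parameters of the ArcGIS REST API are used, and whether a given server returns GeoJSON depends on its version and configuration.

```python
from urllib.parse import urlencode

# Placeholder layer URL for illustration only; NOT the real Svalbox end point.
LAYER_URL = "https://example.org/arcgis/rest/services/dmdb/FeatureServer/0"


def build_query_url(layer_url, where="1=1", out_fields="*", fmt="geojson"):
    """Build a feature-layer query URL for an ArcGIS Server REST end point.

    where      : SQL-like filter; "1=1" selects all rows
    out_fields : comma-separated column names, or "*" for all columns
    fmt        : response format requested from the server
    """
    params = {"where": where, "outFields": out_fields, "f": fmt}
    return f"{layer_url.rstrip('/')}/query?{urlencode(params)}"


url = build_query_url(LAYER_URL)
```

The resulting URL can be passed to any HTTP client or opened directly in GIS software that supports REST feature layers.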

All DOM data packages are checked for accurate georeference and registration information and model resolution errors following the guidelines by Buckley et al. (2022) and Howell et al. (2021). Metadata that detail acquisition and processing parameters of the source data and output are required for all models (Table S1). All geoscientific metadata are integrated with the Svalbox DMDb and accessible through the ArcGIS Server REST API and calendar-versioned database release. Geoscientific metadata include observations, interpretations, and references to further work provided by the principal data contributor. An initial assessment of data and metadata ensures that a submission meets a minimum technical standard. It does not reflect a scientific review in the traditional sense. Accuracy of geoscientific metadata therefore remains the responsibility of the principal data contributor.

Svalbox DMDb Availability

The calendar-versioned database release (Betlem et al., 2023b) details digital outcrop packages and related metadata that are available in a variety of formats (i.e., GeoPackage, GeoJSON, and comma-separated values). The version-controlled database follows an annual release schedule and includes a reference to the data package along with acquisition, processing, and publishing metadata. Daily instances are available through the Svalbox ArcGIS REST API, enabling geospatial integration with the latest data sets. An overview of available metadata key-value pairs is available in the Supplemental Material (Table S1).
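A minimal sketch of how the GeoJSON version of a database release might be filtered with the Python standard library follows. The property names `dom_id` and `gsd_cm` and the sample records are hypothetical placeholders; the actual metadata keys are those listed in Table S1.

```python
import json


def select_doms(geojson_text, max_gsd_cm):
    """Return identifiers of features whose GSD does not exceed max_gsd_cm.

    The keys "dom_id" and "gsd_cm" are assumed names used for illustration.
    """
    collection = json.loads(geojson_text)
    selected = []
    for feature in collection.get("features", []):
        props = feature.get("properties") or {}
        gsd = props.get("gsd_cm")
        if gsd is not None and gsd <= max_gsd_cm:
            selected.append(props.get("dom_id"))
    return selected


# Two made-up records that mimic the structure of a GeoJSON FeatureCollection.
SAMPLE = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None,
         "properties": {"dom_id": "example-a", "gsd_cm": 0.8}},
        {"type": "Feature", "geometry": None,
         "properties": {"dom_id": "example-b", "gsd_cm": 6.5}},
    ],
})

fine_scale = select_doms(SAMPLE, max_gsd_cm=1.0)
```

The same filter could be applied to the comma-separated values export with the `csv` module, or to the GeoPackage with a GIS library of choice.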

Digital outcrop data contributed to the Svalbox DMDb are available for download, reuse, and reprocessing (Table S2 in the Supplemental Material). Most data packages have been made available through the Zenodo repository (European Organization for Nuclear Research and OpenAIRE, 2013) and are limited to 50 GB (SI units). Data packages exceeding the 50 GB limit have been made available through Norway’s NIRD. Each data package is structured as follows:

  • An input folder with raw data, with subfolders that may include:

    • Image folders (some sorted per flight),

    • GCP/CP data/coordinates, and

    • Raw and processed GNSS information;

  • A processing/metashape folder that includes the Metashape project and processing files;

  • An export folder that includes processed output data, such as point clouds, orthomosaics, textured meshes, and DEMs (either in the root directory or accordingly named subdirectories); and

  • Processing reports and overview images (which are found in the root folder).

Subitems within the main directories are systematically ordered in subdirectories to enable automated data handling. The processing reports provide a complementary processing metadata record that supplements the metadata recorded in the Svalbox DMDb. DOMs have been made available through commercial online platforms such as SketchFab and V3Geo (Buckley et al., 2022), albeit with reduced polygon counts, to facilitate visualization and integration with the Svalbox digital portal (Senger et al., 2021). Because the data are openly provided, visualization, integration, and interpretation are possible through both local and online solutions and through freeware and licensed software.
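The systematic layout lends itself to simple automated checks. The sketch below tests whether a downloaded data package contains the three main directories described above; the directory names `input`, `processing`, and `export` are assumptions inferred from the description and may differ from the names used in individual packages.

```python
from pathlib import Path
from typing import List

# Assumed top-level directory names; adjust to the package at hand.
EXPECTED_DIRS = ("input", "processing", "export")


def missing_package_dirs(package_root) -> List[str]:
    """List the expected top-level directories that are absent from a package."""
    root = Path(package_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

An empty list indicates that all expected directories are present, after which processing reports and overview images can be looked up in the package root.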

Model Classification and Geological Tags

Geological tags were attributed to all data sets to document geological features (e.g., rock type, lithology, structural information; Table 1). Geological tags were supplemented with metadata keys that indicate the degree of surface cover. Combined with model processing metadata (e.g., point density, resolution) afforded from processing, surface cover metadata provide crucial information on the exposure of outcrops in the models. In Svalbard, vegetation is limited, and surface blanketing of the outcrops is primarily caused by snow and scree. Cover extent was visually assessed, and the areal extent was classified through discrete bins with 20% intervals, ranging from 0 (absent) to 1 (0%–20%) up to 5 (80%–100%) (Table 1).
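The binning of surface cover can be expressed compactly. The following sketch maps an estimated cover percentage to the classes 0 to 5 described above; assigning values that fall exactly on a bin boundary (e.g., 20%) to the lower class is an assumption, as the text does not specify boundary handling.

```python
import math


def cover_class(cover_percent):
    """Map areal surface cover (0-100%) to a class from 0 (absent) to 5 (80%-100%)."""
    if not 0 <= cover_percent <= 100:
        raise ValueError("cover must be between 0 and 100 percent")
    if cover_percent == 0:
        return 0  # cover absent
    # Bins are 20% wide; boundary values are assigned to the lower class (assumed).
    return math.ceil(cover_percent / 20)
```

For example, an outcrop with an estimated 65% snow cover would fall in class 4 (60%–80%).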

Unlike surface cover and geological tagging, geological age and formation classifications were automated and relied on the footprint-clipping of vectorized regional geological maps (1:250,000 scale; Norwegian Polar Institute, 2016). The geological polygons were clipped to reveal attributes enclosed within the model footprints. Where applicable, Era, Period, Group, and Formation attributes were recorded in the Svalbox DMDb. A similar method was used to determine the overlap with stratigraphic type localities as defined by the Lithostratigraphic Lexicon of Svalbard (Dallmann, 1999).

Data Availability, Scripts, and External Resources

Table 2 provides an overview of the scripts that were used for characterization and appraisal of the Svalbox DMDb content. Table 2 also provides links to the Svalbox online portal and links to the ArcGIS Server REST API end point and relevant geoscientific data sets published by the Norwegian Polar Institute (NPI).

The Svalbox DMDb provides a unique resource for geoscientists working in Svalbard. It is also, as far as we are aware, the first DOM database globally that publishes the full suite of digital outcrop data. All input (raw imagery, GNSS data), processing, and output data are available following FAIR principles. At present, the Svalbox DMDb (v2023.3) contains 135 DOMs (Table S2), the majority of which are located across Spitsbergen, Svalbard’s largest island. Herein, we present the latest release of the Svalbox DMDb (Betlem et al., 2023b) and focus on the database’s quality, spatial extent, geological/topographic coverage, diversity, and case studies.

Data Overview, Quality, and Statistics

The 135 DOMs cover 114 km2 of Svalbard’s estimated ice-free landmass (20,000 km2). DOMs range from just 0.386 m2 to as large as 6.71 km2, with a mean size of 0.846 km2. Most DOMs were acquired as part of University Centre in Svalbard (UNIS) courses or UNIS-led research projects across western and central Spitsbergen (Figs. 1 and 4), where access by either ship or snowmobile is straightforward. Nathorst Land was primarily targeted through a dedicated boat-based data acquisition campaign to Van Mijenfjorden and Van Keulenfjorden in summer 2021. The campaign was specifically planned to acquire more DOMs and serves as a model for future acquisition campaigns. Digital data were typically acquired from April to September to maximize available sunlight and avoid the long polar night from October to February. DOMs located inland (and mainly accessible by snowmobile) generally feature more snow and were acquired in late spring (April–May), while more accessible outcrops (e.g., nearshore, proximity to settlements) were imaged during the short snow-free summer field season (June–September). For the integration of DOMs with other geospatial data, the absolute positional error, quality of the input imagery, and DOM resolution are important parameters to consider.

Data Quality

The quality of digital outcrop data is directly tied to the quality of the input imagery. Remotely sensed image quality (sharpness) is affected by the sensor, lens, aperture, focal length, and shutter speed as well as image compression (Roth et al., 2018). Low image quality and suboptimal imagery cover were found to be the main error sources affecting data submitted to the Svalbox DMDb curation process.

Snow and ice affect the SfM processing and are a typical source of data paucity (i.e., holes) in data sets. Outcrops with minimal surface cover are targeted. Models acquired during the summer season tend to be larger and contain fewer holes than those acquired in spring. Table 3 provides the number of models and estimated areal extent affected by scree and snow cover. Both parameters relate to the percentage of “usable” outcrop versus the DOM’s full extent, though they do not document the resolution or quality of the exposures per se. In total, 81 DOMs are devoid of snow cover or contain snow across less than 20% of the surface, while 8 DOMs have >60% snow cover. The distribution of scree cover is similar, with 12 DOMs that feature >60% scree cover and isolated outcrop sections.

While surface cover such as snow adversely affects overall data usability, surface cover occasionally has its benefits. Snow and scree accentuate larger features such as cliff-forming units that hold implicit information about the rock’s properties. In the case of the Productustoppen (DOM 2019–0004) and Braganzatoppen (DOM 2021–0033) DOMs, the snow cover emphasizes the large-scale structural deformations that were mapped as part of regional investigations of the West Spitsbergen fold-and-thrust belt (see Evolution of the West Spitsbergen Fold-and-Thrust Belt and Foreland Basin Sedimentation section). Even at smaller scales, such as the DOM of the ice cave at Passfjellbreen (DOM 2021–0055), important geomorphological and sedimentological clues can be extracted.

Like snow and ice, complex terrain features such as overhanging sections, steep slopes, and obscured surfaces may also lead to data paucity and locally introduce errors (Cawood et al., 2017). In these situations, the positional accuracy of features can be affected by compression, distortion, and other factors that warp or transform features out of place. Model edges are good examples of this, as are difficult-to-reach bedding planes atop ridges and steep cliffs. In general, UAV-based acquisition improves both the completeness of reconstruction and the relative accuracy because it affords greater unoccluded coverage from multiple viewing angles and distances (e.g., James and Robson, 2012; Cawood et al., 2017). Amongst the DOMs within the Svalbox DMDb, this is reflected by a larger number of distortions in handheld- and distal UAV-acquired data sets. Where present, the errors are typically minor and limited to local areas, and they do not affect the interpretation of the models beyond the affected areas.

Because most issues stem from the acquisition stage, we list here the most common pitfalls that we observed. Please see Howell et al. (2021) for a more comprehensive acquisition tutorial.

  1. Hard shadows and variable lighting conditions (e.g., partially overcast) must be avoided as they negatively affect the performance of image matching algorithms and texture quality.

  2. The same applies to bright blue sky; image acquisition should avoid capturing the sky as much as possible.

  3. Camera image capture settings must be fine-tuned to the target, lighting conditions, and available acquisition time. Auto exposure typically works fine when the lighting is relatively uniform and low contrast exists between the outcrop and the background. Elsewhere, the use of small aperture (high f-number), low shutter speed, and low International Organization for Standardization (ISO) camera sensitivity setting is preferred to maximize the depth of field and minimize sensor noise (Betlem et al., 2020a).

  4. Camera movement must be accounted for during flight planning to reduce blur (Morgenthal and Hallermann, 2014; Roth et al., 2018). Motion blur degrades the image sharpness (Sieberth et al., 2014) and typically occurs when the flight speed is high relative to shutter speed. This may result in an object that is imaged by more than one pixel in the same frame.

  5. Targeted outcrops must be imaged with sufficient overlap between individual photos (>70%) and with at least two flight passes or transects parallel to the cliff face at similar outcrop-UAV distances. The camera tilt (e.g., a set of 10°, 40°, 70°) and height above ground should be varied during the transects. Additional transects captured at greater distances to the target aid in the georeferencing (especially tilt) of the digital data where the targeted outcrop is linearly shaped.

  6. The data acquisition plan must consider the scientific objective. Unless specifically needed, a kilometer-scale, regional overview model does not require a sub-centimeter GSD resolution, which adversely affects processing time and data storage requirements.
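The overlap requirement in item 5 can be translated into a distance between consecutive photos. The sketch below assumes that the image footprint along the flight direction equals the GSD multiplied by the image extent in pixels in that direction, which ignores camera tilt and surface relief; the function name and parameters are illustrative.

```python
def photo_spacing(gsd_m, image_extent_px, overlap=0.7):
    """Distance (m) between consecutive photos that yields the given overlap.

    gsd_m           : ground sampling distance in meters per pixel
    image_extent_px : image size in pixels along the flight direction
    overlap         : target overlap as a fraction (0.7 corresponds to 70%)
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in the range [0, 1)")
    footprint_m = gsd_m * image_extent_px  # approximate footprint on the outcrop
    return footprint_m * (1.0 - overlap)
```

Dividing the spacing by the flight velocity gives the longest interval between exposures that still achieves the target overlap.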

Positional and Spatial Reconstruction Errors

The absolute positional quality of the DOMs is best estimated through the calculated total camera error, which indicates how closely georeferenced features correspond to their real-world locations. The total camera error is calculated as the root-mean-square error of the differences between the measured camera positions and the camera positions calculated during SfM photogrammetry processing. The median total camera error for the Svalbox DMDb is 3.30 m, with a minimum of 1.6 cm and a maximum of 410 m. These values indicate a positional quality that is consistent with known GNSS limitations: 75% of the DOMs have total camera errors that are <10 m (Figs. 5 and 6).
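The total camera error described above can be sketched as follows. The example assumes that both sets of camera positions are given as (x, y, z) tuples in the same projected, metric coordinate system and that the per-camera difference is the 3-D Euclidean distance; this is an illustration of a root-mean-square error, not a reproduction of the photogrammetry software's internal routine.

```python
import math


def total_camera_error(measured, estimated):
    """Root-mean-square of 3-D distances between measured and estimated camera positions."""
    if not measured or len(measured) != len(estimated):
        raise ValueError("position lists must be non-empty and of equal length")
    squared = [
        (mx - ex) ** 2 + (my - ey) ** 2 + (mz - ez) ** 2
        for (mx, my, mz), (ex, ey, ez) in zip(measured, estimated)
    ]
    return math.sqrt(sum(squared) / len(squared))
```

Because the squared differences are averaged before the root is taken, a few cameras with poor GNSS fixes can dominate the total error of an otherwise well-positioned model.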

GNSS constellations and GNSS positioning are limited by the 55°–65° inclinations of their orbits, which coincides with the highest latitude at which they can be seen overhead. In Svalbard, and the rest of the Arctic, this causes satellites to be seen low on the horizon, though more are visible at the same time than at lower latitudes (Reid et al., 2016; Zhang et al., 2020). The low elevation angle is suboptimal for horizontal positioning but especially leads to poor vertical positioning and larger daily variations (Zhang et al., 2020). Furthermore, complex terrain features such as steep mountain sides may obstruct precise GNSS control and cause a paucity of data (e.g., Betlem et al., 2022). Indeed, we observe the largest total camera errors in DOMs that are generated from (1) geometrically complex terrain features such as overhanging cliffs that prevented precise GNSS satellite location solutions; (2) images captured over multiple days with varying GNSS errors; and (3) images captured with handheld camera systems with low-quality GNSS receivers such as smartphones, camera systems coupled with external GNSS receivers, and DJI’s Mavic Air UAV (Fig. 6). The latter category mostly affected the quality of the vertical accuracy of the camera positions while maintaining a high degree of precision for geological features within the DOM extent (thus not affecting the quality of thickness or structural measurements within the reference frame of the DOM).

We did not identify any correlations between the total camera error and either the areal size or the data acquisition distance (Fig. 6). Neither did we identify distortions in the DOM scaling or distortions in overall size. However, DOMs with very large total camera errors (more than tens of meters; e.g., Criocerasdalen, DOM 2019–0019) were found to be rotated or warped. DOMs that are positioned through internal GNSS receivers typically have a very high degree of precision as long as there is sufficient overlap between images and varying image capture angles, and therefore they afford reliable measurements of internal structures such as bed thicknesses and orientations (Howell et al., 2021). Thus, even distorted DOMs permit the relative interpretation of features, though they will need to be transformed to their accurate, real-world position prior to their integration with other georeferenced data sets and DOMs.

Differential positioning, whether built-in or through GCPs, typically provides the highest accuracies. While beneficial for reducing the absolute positional error, GCP use may significantly increase associated costs in field time, logistical demands, and expenses. This is especially true when the targeted outcrops are of seismic scale, and the terrain must be traversed on foot for the physical placement of markers across the study area. Most of Svalbard’s seismic-scale outcrops are difficult to reach, and the data are typically captured from further away. GCP-supported surveys were therefore conducted only where purpose and data needs exceeded the accuracy of integrated (built-in) GNSS. Only three of the 135 Svalbox DMDb models were constrained by dGNSS-positioned GCPs (DOM 2020–0039, Konusdalen West [Betlem et al., 2022]; DOM 2020–0040, Criocerasdalen [Ogata et al., 2023]; DOM 2021–0056, Konusdalen [Betlem et al., 2023c]). The sites had previously been mapped at lower resolutions, and these reconnaissance data were used to constrain subsequent dedicated UAV operations. With further technological advances that enable the onboard implementation of real-time and postprocessed kinematic GNSS positions, the constraints of using high-quality positioning are likely to become less of a concern.

Resolution, Scale, and Geological Mapping Potential

The GSD and quality of the digital outcrops typically depend on the size of the outcrop and camera specifications. The GSD parameter indicates the smallest pixel size that can be extracted from the data and quantifies the resolution of the DOM. The parameter thus represents the imaging limit of the (visible) data: Within a pixel’s bounds, all information is averaged, and subpixel features may not be confidently interpreted. As in seismic data, the geological interpretation of DOMs faces constraints related to the detection limit, specifically to extracting data beyond the GSD. Technologically similar obstacles are encountered in the field of (satellite-based) remote sensing and have led to various subpixel visualization and image processing methods (e.g., Kemeny and Post, 2003; Pardo-Pascual et al., 2012; Guo et al., 2019; Toomanian, 2022). Given the similarities, such methods can also be applied to DOMs, especially where high-contrast geological features such as veins and weathered fractures dominate the recorded pixel value. Subpixel features can also be inferred from small-scale offsets and terminations. This is similar to the imaging of faults in seismic data (Faleide et al., 2021). More opaque transitions and less distinctive features (e.g., coarsening trends) require higher-resolution input data (i.e., lower GSD), with a pixel size several times smaller than the dimension of the targeted feature. Alternatively, features such as gradual changes in lithology can be implicitly derived from trends that span distances several times that of the pixel dimensions.
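
The dependence of GSD on acquisition distance and camera specifications can be illustrated with a minimal sketch based on the standard pinhole-camera approximation, GSD = (sensor width × distance) / (focal length × image width). The camera values below and the `min_pixels` factor of 3 (standing in for "several times" the pixel size) are hypothetical placeholders rather than parameters taken from the Svalbox DMDb.

```python
def ground_sampling_distance(sensor_width_mm, focal_length_mm,
                             image_width_px, distance_m):
    """Approximate GSD in cm/pixel for a camera facing the outcrop.

    Pinhole approximation: the footprint of one pixel scales linearly
    with the camera-to-outcrop distance.
    """
    gsd_m = (sensor_width_mm * distance_m) / (focal_length_mm * image_width_px)
    return gsd_m * 100.0  # metres -> centimetres


def is_resolvable(feature_size_cm, gsd_cm, min_pixels=3.0):
    """True if the feature spans at least `min_pixels` pixels.

    `min_pixels` is an assumed rule-of-thumb factor, not a published threshold.
    """
    return feature_size_cm >= min_pixels * gsd_cm


# Hypothetical camera: 13.2 mm sensor width, 8.8 mm focal length,
# 5472 px image width, flown 100 m from the outcrop face.
gsd = ground_sampling_distance(13.2, 8.8, 5472, 100.0)
print(f"GSD: {gsd:.2f} cm/pixel")
print("10 cm bed resolvable:", is_resolvable(10.0, gsd))
print("5 cm vein resolvable:", is_resolvable(5.0, gsd))
```

Under these assumptions, halving the acquisition distance halves the GSD, which is why proximal acquisitions are needed for centimeter-scale features.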

During manual screening of the DOMs, sub-centimeter GSDs were typically found to be sufficient for the identification of joints. Additionally, DOMs derived from proximal data acquisitions and close-up imaging were also found to have sufficient detail to identify sub-centimeter and centimeter-scale geological features such as pebbles in conglomerates and bed-bound fractures. With a median GSD value of 3.33 cm/pixel, many Svalbox DMDb DOMs are suitable for this purpose (75th percentile: 7.07 cm/pixel), and these statistics show the resolution, quality, and geological usability of the DOMs. Of the 79 DOMs with GSD <5 cm, 33 are sub-centimeter in scale and thus may facilitate identification of joints. In addition, 29 DOMs were determined to be suitable for fracture mapping. Mineralized joints, or veins, appear less frequently and were observed in only nine DOMs.

In addition to centimeter-scale structural elements, the Svalbox DMDb DOMs facilitate the interpretation of a large part of Svalbard’s geology and tectono-stratigraphic features (Table 4) at a scale and resolution unavailable in other resources (e.g., NPI public geodata [NPI, 2016]). Structural elements (including centimeter-scale elements) have been identified in 103 DOMs (76%) and successfully capture the regional impact of the Paleogene West Spitsbergen fold-and-thrust belt and older tectonic events (Olaussen et al., 2022, and references therein). The identified structural elements comprise both reverse and normal faults (41 DOMs) and associated folding (35 DOMs) structures.

Svalbard’s well-exposed stratigraphic successions are well covered by the Svalbox DMDb DOMs. The models cover Neoproterozoic- (14), Paleozoic- (60 DOMs), Mesozoic- (51 DOMs), Cenozoic- (31 DOMs), and Quaternary-age (40) outcrops or sections. Paleozoic DOMs mainly belong to the Tempelfjorden (36) and Gipsdalen (37) Groups, which are dominated by carbonate and evaporite sediments. Key examples include Fjordnibba (DOM 2020–0021) and Gerardfjella (2020–0024) in the Tempelfjorden area. DOMs that capture Mesozoic strata are dominated by the Sassendalen (21 DOMs), Kapp Toscana (15 DOMs), and Adventdalen (23 DOMs) groups, and 11 DOMs feature igneous sequences of the Early Cretaceous High Arctic large igneous province (HALIP), known locally as the Diabasodden Suite. The Early Cretaceous igneous sills are identified as rigid cliffs; in central Spitsbergen, they often cap mechanically weaker sedimentary strata and were emplaced predominantly in shale-rich units. Tschermakfjellet (DOM 2016–0001) provides an example of a cup-shaped sill emplaced in the shale-dominated sequences of the Botneheia and Tschermakfjellet formations (Betlem et al., 2020b). Cenozoic DOMs (27) have been primarily acquired within the Central Spitsbergen Basin.

In total, 42 established geological type localities (Dallmann, 1999) lie within close proximity (500 m) of one or several DOMs. Multiple type localities overlap with DOMs at Adriabukta (2021–0054; 4 localities), Bravaisberget (2021–0006; 5 localities), Festningen (DOM 2020–0001; 10 localities [Senger et al., 2022]), and Landnørdingsvika (2021–0025; 7 localities) and supplement the existing lithostratigraphic lexicon of Svalbard (Dallmann, 1999) with immersive, 3-D data suitable for teaching, outreach, and research.
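
A proximity screening of this kind can be sketched as follows, assuming that DOM footprints and type localities are available in a projected coordinate system with units of meters. The footprint polygon, locality names, and coordinates are hypothetical, and the sketch is not the workflow used to derive the counts above.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to segment a-b; all (x, y) in metres."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_in_polygon(p, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    px, py = p
    inside = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def distance_to_footprint(p, polygon):
    """Zero inside the footprint, otherwise distance to the nearest edge."""
    if point_in_polygon(p, polygon):
        return 0.0
    n = len(polygon)
    return min(point_segment_distance(p, polygon[i], polygon[(i + 1) % n])
               for i in range(n))


# Hypothetical footprint (1000 m x 200 m) and type localities.
footprint = [(0.0, 0.0), (1000.0, 0.0), (1000.0, 200.0), (0.0, 200.0)]
localities = {
    "locality_1": (500.0, 100.0),   # inside the footprint
    "locality_2": (1300.0, 100.0),  # 300 m east of the footprint
    "locality_3": (1000.0, 900.0),  # 700 m north of the footprint
}
nearby = sorted(name for name, xy in localities.items()
                if distance_to_footprint(xy, footprint) <= 500.0)
print(nearby)
```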

Manual screening and accurate annotation of geological features such as formation boundaries consume more resources than computer-aided interpretation through the correlation of available GIS data. As the latter is limited to the quality and extent of existing data, we deemed it to be a more useful approach to publish the models with semi-automated geological tags, or geotags, that are based on the available GIS data and then manually screen these models. The comparison of feature counts calculated by manual screening and automatic spatial-joining of existing geodata (NPI, 2016) produced mismatched results. We attribute this to the relatively low resolution (1:250,000 scale) of publicly available source material. Differences were observed for the occurrence of igneous intrusions (11 manual vs. 7 automatically joined), unconsolidated Holocene deposits (34 vs. 98), and other formations. For instance, DOM 2016–0006 (Konusdalen), 2021–0028 (Muninelva), and 2021–0029 (Medalen) were each automatically classified as unconsolidated Holocene deposits, but they instead feature Triassic, Devonian, and Paleogene outcrops, respectively. These classifications were added manually.
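
The comparison between automatically joined and manually screened geotags can be sketched as a per-DOM set difference. The three DOM identifiers and their classifications follow the examples given above; the tag strings are simplified, `example-0001` is a hypothetical record added to show a matching case, and the function name is illustrative only.

```python
def tag_mismatches(auto, manual):
    """Return, per DOM, the tags found by only one of the two methods."""
    report = {}
    for dom_id in sorted(set(auto) | set(manual)):
        a = auto.get(dom_id, set())
        m = manual.get(dom_id, set())
        if a != m:
            report[dom_id] = {
                "auto_only": sorted(a - m),
                "manual_only": sorted(m - a),
            }
    return report


# Tags from automatic spatial joining with low-resolution map data.
auto_tags = {
    "2016-0006": {"unconsolidated Holocene deposits"},
    "2021-0028": {"unconsolidated Holocene deposits"},
    "2021-0029": {"unconsolidated Holocene deposits"},
    "example-0001": {"Triassic"},  # hypothetical matching record
}
# Tags from manual screening of the same models.
manual_tags = {
    "2016-0006": {"Triassic"},
    "2021-0028": {"Devonian"},
    "2021-0029": {"Paleogene"},
    "example-0001": {"Triassic"},
}

mismatches = tag_mismatches(auto_tags, manual_tags)
for dom_id, diff in mismatches.items():
    print(dom_id, diff)
```

A report of this kind flags the models for which manual classification needs to be added, while leaving agreeing records untouched.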

The geotags facilitate the selection of relevant data and provide a starting point for high-resolution mapping. Most models are suitable for the interpretation of geological features and formation boundaries at resolutions higher than currently available map data. Svalbox DMDb DOMs are frequently used for geological mapping purposes in remote settings such as Billefjorden (Smyrak-Sikora et al., 2021), Deltaneset (Betlem et al., 2022), and the Ekmanfjorden-Dicksonfjorden (Sartell, 2021) areas. The Svalbox DMDb is thus a key regional asset that facilitates systematic improvement of available low-resolution map data.

Data Interpretation, Integration, and Case Studies

The Svalbox DMDb captures many aspects of Svalbard’s diverse geology and type localities across the archipelago that are crucial for understanding the Norwegian continental shelf and other circum-Arctic sedimentary basins, and thus it acts as an interactive reference framework for analogue studies and data integration. The collection of surface data (e.g., DOMs, orthomosaics, and high-resolution DEMs) is most powerful when integrated with the expansive multiphysical geoscientific data sets collected across Svalbard and the Norwegian continental shelf. The photorealistic 3-D digital models enable geoscientists to analyze and revisit data at multiple scales and types, bridging a crucial resolution gap between seismic and well data (Nesbit et al., 2020).

Svalbard (and the Svalbox DMDb) is of significant scientific interest with a rich geoscience literature base (e.g., Olaussen et al., 2022, and references therein) because it is the exposed part of the Barents Shelf. Petroleum and coal exploration boreholes have contributed significantly to our understanding of the stratigraphic evolution of Svalbard, the greater Barents Shelf, and other Arctic basins. Wildcat, scientific, and coal-exploration wells cover most of the Paleozoic and Mesozoic interval and provide physical samples and downhole logging data that supplement the Svalbox DMDb (e.g., Johannessen et al., 2011; Olaussen et al., 2019, and references therein; Senger et al., 2019; Zuchuat et al., 2020). Some of Svalbard’s best-known geoscientific outcrops include the near-vertical Permian–Cenozoic sequence at Festningen (DOM 2020–0001 [Senger et al., 2022]), the paleokarst systems found at Fortet (2019–0001), the normal faults at Kvalhovden (DOM 2019–0018), and thick-skinned tectonics at Lagmannstoppen (DOM 2020–0015; Fig. 7), all of which have digital twins available through the Svalbox DMDb.

Many tools and environments exist that enable the integration, annotation, and interpretation of digital surface data, including DOMs and DEMs. Spatially aware surface data can easily be integrated with other geospatial data and are compatible with existing GIS applications (such as QGIS and ESRI’s ArcGIS products) and visualization toolsets, many of which are openly available and free to use. Several tools are specifically designed for 3-D geometric and geoscientific data, some of which are open-access and open-source software. Blender (Blender Online Community, 2018), CloudCompare (Girardeau-Montaut, 2016), Lime (Buckley et al., 2019), VRGS (Hodgetts et al., 2015), MOSIS (Gonzaga et al., 2018), and VTK-derived software solutions (e.g., Schroeder et al., 2006; Sullivan and Kaszynski, 2019) have been used to visualize, interpret, and characterize Svalbox DMDb data. The data management is inherently straightforward because the data are correctly georeferenced.

Manual and automatic mapping methods that work in either two-dimensional (2-D) or 3-D space exist for the structural characterization of DOMs (e.g., Lato and Vöge, 2012; Vöge et al., 2013; Thiele et al., 2017; Drews et al., 2018). Notably, the physical compass method is the most time-consuming and the most accurate, while the digital characterization of large digital data sets is increasingly seen as an alternative that is orders of magnitude faster (Maerten et al., 2001; Vasuki et al., 2014; Novakova and Pavlis, 2017; Walter et al., 2022). Cawood et al. (2017) provided an extensive comparison between field-based measurements and those derived from LiDAR- and SfM-based DOMs. While single digital measurements may show significant errors, the grouped data often lie within the confidence limit of physical compass measurements, and a few physical measurements suffice for the calibration of large digital data sets (e.g., Cawood et al., 2017; Drews et al., 2018). Furthermore, the resulting data are directly compatible as input for geomodeling, including semi-automated fracture network analysis workflows (Fig. 8) such as those employed by Larssen et al. (2020) for a carbonate reservoir and Betlem et al. (2022) for the appraisal of the Longyearbyen CO2 Laboratory cap rock.
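
As a simple illustration of how a structural orientation can be derived from a georeferenced DOM, the sketch below computes dip and dip direction from three points picked on a planar surface (the classic three-point problem). It assumes coordinates given as (east, north, up) in meters; the function name and the example points are hypothetical, and the sketch does not reproduce the algorithms implemented in the software packages cited above.

```python
import math


def plane_orientation(a, b, c):
    """Dip and dip direction (degrees) of the plane through three points.

    Points are (east, north, up) tuples. The dip direction is measured
    clockwise from north.
    """
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    # Plane normal from the cross product u x v.
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = math.sqrt(sum(comp * comp for comp in n))
    if length == 0.0:
        raise ValueError("points are collinear; no unique plane")
    if n[2] < 0.0:
        n = [-comp for comp in n]  # force the normal to point upward
    dip = math.degrees(math.acos(n[2] / length))
    # The horizontal part of an upward normal points down-dip.
    dip_direction = math.degrees(math.atan2(n[0], n[1])) % 360.0
    return dip, dip_direction


# Hypothetical picks on a bed dipping 45 degrees toward the east.
dip, dip_direction = plane_orientation((0.0, 0.0, 0.0),
                                       (0.0, 10.0, 0.0),
                                       (10.0, 0.0, -10.0))
print(f"dip {dip:.1f}, dip direction {dip_direction:.1f}")
```

Single three-point solutions of this kind are sensitive to picking errors, which is consistent with the observation that grouped digital measurements are more reliable than individual ones.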

Through several case studies and data set examples, we illustrate the usability and limitations of the digital outcrop data for geological analyses, ranging from structural measurements and the logging of facies and marker beds to using DOMs as input to synthetic seismic data modeling. The examples also highlight thematic groupings and existing legacy data sets that complement the DOMs and facilitate multiphysical data integration.

Longyearbyen CO2 Laboratory

The Longyearbyen CO2 Laboratory drilled and fully cored eight wells in central Spitsbergen to assess a heavily fractured Mesozoic reservoir (De Geerdalen Formation and Wilhelmøya Subgroup) and cap rock (Agardhfjellet Formation, Janusfjellet Subgroup) succession (Braathen et al., 2012; Olaussen et al., 2019, and references therein). The project aim was to capture CO2 produced at the coal-fueled power plant in Longyearbyen and store it in fracture-dominated sandstone successions of Late Triassic to mid-Jurassic age, which are the condensed time-equivalent to the Realgrunnen Subgroup on the southern Barents Sea shelf. The Agardhfjellet Formation is the source of a technical gas discovery in Svalbard (Ohm et al., 2019) and has time-equivalent stratigraphic intervals that represent a major source rock and the primary top seal of many oil and gas fields and CO2 sequestration sites across the Norwegian continental shelf (Spencer et al., 2008). The reservoir and cap rock form a gentle regional monocline that tilts the stratification toward the southwest, so relevant outcrops are found ~15 km northeast of the drill site in the Deltaneset area, and these facilitate direct correlation of the exhumed sequences (Fig. 9; Braathen et al., 2012; Olaussen et al., 2019, and references therein). While no CO2 was ever injected because of high capture costs and the impending power plant closure, the integrated data sets of fully cored boreholes, water injection tests, geophysical profiles, and the nearby outcrop analogues make this a unique data set with which to characterize both the reservoir and cap rock (Olaussen et al., 2019, and references therein).

Thousands of structural and sedimentary measurements have been collected in the outcropping strata and in the cored material at known, georeferenced locations (e.g., Ogata et al., 2014; Mulrooney et al., 2018; Løvlie, 2020; Nakken, 2020). Although the field data provide very good and quantitative constraints, the recordings themselves are not digital and thus are not straightforward to replicate. This severely restricts their use and integration with other geospatial data available to the Longyearbyen CO2 Laboratory (Olaussen et al., 2019, and references therein).

DOMs are increasingly used to digitize legacy data and to acquire additional data for the appraisal of the targeted reservoir, cap rock, and their time-equivalent deposits on the Norwegian shelf. Across Nordenskiöld Land, DOMs capture the reservoir (Kapp Toscana Group: 15 DOMs; Wilhelmøya Subgroup: 6 DOMs) and cap rock (Janusfjellet Subgroup: 23 DOMs) in sufficient detail to map sedimentary and structural features at centimeter-scale resolutions.

In the Deltaneset area (Fig. 9), both the upper part of the reservoir (e.g., Konusdalen, DOM 2021–0056) and lower part of the cap rock (Agardhfjellet Formation; e.g., Konusdalen West, DOM 2020–0039) are affected by horst-and-graben–type, mesoscale normal fault systems, siliciclastic dikes and veins, and mechanical deformation linked to the emplacement of the Diabasodden Suite mafic igneous intrusions (Senger et al., 2013; Ogata et al., 2014; Mulrooney et al., 2018; Betlem et al., 2022). Locally, the Early Cretaceous–age intrusions (e.g., Hyperittfossen, DOM 2020–0006) may have played an important role in terms of diagenesis and compartmentalization of Svalbard’s Mesozoic succession (Senger et al., 2013), including that of the fracture-dominated reservoir (e.g., Agardhbukta beach, DOM 2019–0002; Deltaneset, DOM 2019–0020).

High-resolution DOMs (e.g., Fig. 9) have been used to correlate the thousands of field measurements with data available to the Longyearbyen CO2 Laboratory in well data and regional geophysics, in some cases through the use of synthetic seismic techniques (see From Outcrop Model to Synthetic Seismic Models section). Mulrooney et al. (2018) largely focused their discussion of the reservoir’s fluid-flow potential on the Konusdalen (DOM 2021–0056) exposure. The reservoir-revealing Konusdalen outcrop is well studied and, like the Konusdalen West cap-rock DOM (2020–0039), has dGNSS-calibrated DOM data available, thus favoring cross-site lithostratigraphic integration and the extraction of structural measurements of the highest quality. Indeed, comparison between field (Ogata et al., 2014) and DOM-derived measurements in VRGS (DOM 2021–0056, this study) along roughly the same intervals shows significant similarities in structural orientations (Fig. 10), including variations between different lithological sequences. Similar relationships are observed at the mesoscale, where synthetic faults within the shale-dominated cap rock (DOM 2020–0039) are observed to be shallower than their reservoir counterparts (DOM 2021–0056). Larssen et al. (2020) previously noted that the use of automatic mapping methods favors outcrop-parallel fracture planes, which is different from measurements acquired in the field, which are typically oriented perpendicular to the outcrop (Senger et al., 2015). Although we did not observe the described bias in the manually measured digital data of the Konusdalen and Konusdalen West DOMs, we did note other differences between field-based and digital data sets. Primarily, natural changes (e.g., erosion, weathering) of the outcrop in the decade between field measurements (in 2011) and digitalization (2021) make it difficult to directly correlate the different data sets. 
Based on photographs alone, it is difficult to find the original scan-line positions because some of the measured beds have been fully eroded or have since been covered by scree, while other portions of the outcrop are now exposed. Most scan lines are measured where accessible, typically limited to the lowermost 2 m of the outcrop face. In addition to such intervals being prone to erosion (as is the case at Konusdalen) or local burial by scree or snow, limited accessibility spatially restricts data acquisition to the reachable parts of an outcrop. DOMs mitigate this limitation and negate the need for composite data sets; in other words, they facilitate the use of outcrops to the fullest.

The lithostratigraphic integration of the DOMs with the borehole data facilitates core-outcrop-wireline comparisons, which are further facilitated by the availability of digital drill-core models (DCMs) and samples. Correctly scaled, DCMs that target the reservoir and cap-rock sequences have previously been contributed to the Svalbox database (Betlem et al., 2020a). The DCMs enable the identification of disaggregation-deformation bands, fault planes, and slickensides within the shale-dominated cap rocks at millimetric resolutions and facilitate improved integration of structural measurements with wireline televiewer data. Digital drill-core data complement the current offering of digital outcrop data and, alongside digital sample models, supplement the computer-readable outcrop data available within the Svalbox DMDb with sample-scale data sets.

Besides extensive fault systems and fracture networks, the Longyearbyen CO2 Laboratory cap rock is also affected by igneous and siliciclastic intrusions that may facilitate fluid migration. Using Svalbox DMDb DOMs and Longyearbyen CO2 Laboratory data sets, two siliciclastic injection complexes were mapped that intrude from the reservoir into the cap-rock sequences (Ogata et al., 2023). The investigation into the intrusion complexes provided useful insights on the influence of fluid migration in and across the shale-dominated cap rock and explored migration pathways other than the well-studied fracture networks and fault systems. The spatial evolution of the upper complex, which comprises two dikes, can be traced in DOM 2019–0023. Herein, the sandstone dikes are found to taper out vertically within a 50 m stratigraphic thickness and wedge out near the contact with the overlying Rurikfjellet Formation (i.e., the upper part of the Janusfjellet Subgroup). DOM-derived structural measurements match those from the field and show that the dikes are several tens of centimeters wide and extend laterally for more than 200 m. The second complex comprises a network of interconnected dikes and sills that shoot off from isolated bodies, including a sand volcano that is mapped in detail by DOM 2020–0040 (Ogata et al., 2023). The analysis and partial digitalization of both complexes provide a unique opportunity to address the paucity of finer-scale data from siliciclastic intrusion complexes, which are typically investigated from seismic profiles (Grippa et al., 2019; Ogata et al., 2023).

Evolution of the West Spitsbergen Fold-and-Thrust Belt and Foreland Basin Sedimentation

Many of the fractures, folds, and through-going faults that affect the reservoir (e.g., Konusdalen, DOM 2016–0002/0006), cap rock (Konusdalen West, DOM 2020–0039), and overburden (Janusfjellet, DOM 2020–0020) of the Longyearbyen CO2 Laboratory (Ogata et al., 2014; Mulrooney et al., 2018) show the impact of the early Cenozoic West Spitsbergen fold-and-thrust belt. The West Spitsbergen fold-and-thrust belt and its associated foreland basin, the Central Spitsbergen Basin, are a direct result of Eurekan transpressional deformation, which was related to the opening and spreading of the North Atlantic and Arctic Oceans (Helland-Hansen and Grundvåg, 2021). East-west crustal shortening during the transpression is estimated at 20–40 km and has significantly altered Svalbard’s tectonic settings (Bergh et al., 1997; Leever et al., 2011). The spatial variability of the tectonic zones is best highlighted across east-west cross sections (Fig. 11) and shows the eastward change in tectonic style from thick-skinned to thin-skinned features (Braathen et al., 1995, 1999; Bergh et al., 1997; Horota et al., 2023). Horota et al. (2023) provided a digital educational package in structural geology on the West Spitsbergen fold-and-thrust belt. Notably, the analyses were primarily conducted using Svalbox DMDb DOMs, photospheres, and relevant published literature, and the approach highlights the importance of digital outcrop data in facilitating the interlinkage of multidimensional observations across field sites and time (4-D).

Integration of the Festningen model (DOM 2020–0001), profile (Mørk and Grundvåg, 2020, and references therein), and regionally available subsurface data by Senger et al. (2022) is a key example of how Svalbox DMDb DOMs enable the spatiotemporal interlinkage of observations across disciplines. In this case, the Festningen DOM was used to constrain Svalbard’s regional tectonostratigraphy, even where the quality of regional seismic lines is impeded by structural complexity and high sediment velocities that reduce seismic resolution. The renowned section is a regionally important stratigraphic reference profile and one of many Svalbox DMDb DOMs that illustrate the effects of tectonic deformations associated with the West Spitsbergen fold-and-thrust belt on Svalbard’s sequences. Herein, as elsewhere across the eastern limb of the West Spitsbergen fold-and-thrust belt, Lower Carboniferous to Cenozoic sequences are nearly vertically tilted and provide easy access to Svalbard’s geology along an ~7 km profile. Further westward toward the western hinterland, the heavily deformed Alkhornet Formation at the Lagmannstoppen outcrop (DOM 2020–0015) is another good example of thick-skinned tectonics. Here, Z-folding and the structural orientations of a prominent antiformal, east-closing fold on the northeastern slopes of Lagmannstoppen (DOM 2020–0015; Fig. 7) suggest an eastward tectonic transport direction, which corresponds with the belt’s east-west crustal shortening (Bergh et al., 1997; Burzyński et al., 2018).

There are 24 relevant DOMs that span the section between Festningen and Akseløya and provide information on the transition between the hinterland and the thick-skinned part of the West Spitsbergen fold-and-thrust belt. Along the north-south section, thrusts can be traced through the Vardeborg (2019–0003), Productustoppen (e.g., DOM 2021–0035/0036), Vøringen (2021–0043), Braganzatoppen (2021–0033), and Strandlinuten (2021–0042) DOMs, and these models provide regional constraints on the faults within the thick-skinned part of the West Spitsbergen fold-and-thrust belt.

The Akseløya (DOM 2021–0002) DOM and the 10 DOMs that form the Van Keulenfjorden digital model transect (DMT; Fig. 11) illustrate the transition into thin-skinned tectonics and especially the sedimentation in the foreland basin developing in front of the orogenic belt. The DOMs of the Van Keulenfjorden DMT are a key resource with which to understand the West Spitsbergen fold-and-thrust belt and its associated foreland basin (e.g., Senger et al., 2022; Horota et al., 2023), not least as a digital platform for the integration of legacy data and observations. Additionally, the DOMs are an ideal framework in which to study the paleogeographic change within the basin because they improve quantification and certainty in areas that are otherwise inaccessible and difficult to investigate (Jensen et al., 2021; Bøgh, 2021). The transect spans the northern shore of Van Keulenfjorden and begins with nearly vertically dipping Permian carbonates that sharply turn into mudstones at the Permian-Triassic boundary on Akseløya. The near-vertical sequences can be mapped in detail and traced at the bed level between DOMs of Akseløya (DOM 2021–0002), Festningen (2020–0001) further north, and both Midterhuken (2021–0018) and Bravaisberget (2021–0006) to the south.

The Midterhuken DOM highlights several major unconformities, faults, detachments, folds, and Cretaceous intrusions that affect the basement and Carboniferous-, Permian-, and Triassic-age sequences (Maher et al., 1986). DOM-based, digital analysis of Midterhuken’s northern cliff face illustrated the principle of parasitic folds, in particular, Z-folds, which are also present in Lagmannstoppen (DOM 2020–0015; Horota et al., 2023). Uplifted metamorphic basement is exposed in the western part of the outcrop, which is separated from the overlying Carboniferous strata by an unconformity. Deformation is apparent from the thrusting of early Carboniferous over late Carboniferous successions along a detachment fault and from slip surfaces located at the onset of gypsum-rich sequences of the early Permian Gipshuken Formation, which correspond with the lower regional detachment (Maher et al., 1986; Horota et al., 2023). The middle regional detachment is expressed along the base of the organic-rich, mudstone-dominated Bravaisberget Formation of Middle Triassic age, locally known as the Midterhukbreen detachment (Braathen et al., 1999; Horota et al., 2023).

The Middle Triassic Bravaisberget Formation in western Nathorst Land forms a 10-km-long belt stretching NNW-SSE between Van Mijenfjorden and Van Keulenfjorden. The section has been defined as the stratotype for the formation (Dallmann, 1999; Krajewski et al., 2007), and it is almost fully captured by the Midterhuken and Bravaisberget DOMs. An extensive discussion of the type section was provided by Krajewski et al. (2007), whose many observations can be directly integrated with the Bravaisberget DOM and tied to DOMs of the same sequences elsewhere to facilitate regional correlations.

Further east, the Annaberget-Ullaberget outcrop area marks the transition into the Cretaceous. The scree-covered mudstone slopes emphasize the overlying presence of a mechanically strong succession that forms an erosional unconformity between the Rurikfjellet Formation and the Festningen Member of the Helvetiafjellet Formation (Midtkandal et al., 2008). The latter is an important marker across Spitsbergen and can be observed in, e.g., the Festningen (2020–0001) and Janusfjellet (2020–0002; Fig. 9) DOMs, where it is folded nearly vertically and cut by a brittle thrust fault, respectively. A low-angle thrust fault has cut up section through the Rurikfjellet Formation in the area between the Annaberget and Ullaberget DOMs, displacing the hanging wall toward the east (Midtkandal et al., 2008). Integrated with the observations, logs, and profiles provided by Midtkandal et al. (2008), the Annaberget and Ullaberget DOMs provide a scope for further investigations into the regional subaerial unconformity between the Rurikfjellet and Helvetiafjellet Formations.

The Firkanten, Pallen, Brogniartfjella, and Storvola DOMs illustrate the preserved Central Spitsbergen Basin foreland infill (Helland-Hansen and Grundvåg, 2021). The kilometer-scale thick Paleogene progradational succession contains the Frysjaodden (offshore), Battfjellet (shallow marine), and the Aspelintoppen (continental) Formations and has long been used as a scientific and educational laboratory (Helland-Hansen and Grundvåg, 2021, and references therein), which the Svalbox DMDb extends with a fully digital component. The DOMs capture the extraordinary exposures and facilitate their use as accessible, digital analogues to subsurface systems ranging from centimeter to decimeter facies-scale to seismic-scale geometries (Helland-Hansen and Grundvåg, 2021). Both Brogniartfjella and Storvola are world-class examples of seismic-scale clinoforms, here formed by Eocene deltaic sediments that filled the foreland basin as the active West Spitsbergen fold-and-thrust belt became a source for clastic material. The unique viewing angles afforded by the DOMs shed new light on existing interpretations and resulted in previously unnoticed or hard-to-make observations, such as the ~20 m offset fault in Storvola (Fig. 12).

Both seismic data and boreholes penetrate the sequences captured by the Van Keulenfjorden DMT. The 1085-m-deep BH10–2008 (also known as Sysselmannbreen) research borehole drilled, logged, and fully cored the entire clinoform succession. The borehole, along with numerous coal exploration wells, facilitates core-log-outcrop and multiphysical integration, as previously detailed by Johannessen et al. (2011). Higher up in the stratigraphy, the integration of legacy West Spitsbergen fold-and-thrust belt core-log-outcrop data with Svalbox DMDb DOMs was nicely demonstrated by Bøgh (2021). Bøgh (2021) used DOMs to interpret erosive channels, shales, and sandstone sheets of varying thicknesses in the Aspelintoppen Formation and quantified the distribution across outcrops. Combined with structural measurements on other DOMs, field observations, and other data, these features were used as input to improve the depositional environment model of the infill of the Central Spitsbergen Basin.

Landnørdingsvika: A Barents Sea Analogue from Bjørnøya

The Landnørdingsvika model (DOM 2021–0025; Fig. 7) is one of the few DOMs available from Bjørnøya and another DOM suitable for stratigraphic and paleoenvironmental studies. It is an example of the detail with which Svalbox DMDb models can be interpreted, and it represents an excellently exposed near-vertical outcrop section. With a GSD of 2 cm/pixel, it is possible to interpret thin bedding, such as centimeter-thick mudstone/shale beds, and structural heterogeneities. This allows for interpretation well below seismic resolution, even down to the scale of digital cores/samples and thin sections. At larger scales, the DOM facilitates the analysis of geometries and the spatial evolution of beds and stacking patterns of the Carboniferous–Permian sequences, which have been of interest for hydrocarbon exploration in the Barents Sea, particularly because of the Gohta and Alta discoveries (Matapour et al., 2018).
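
The gap between DOM resolution and seismic resolution can be put into perspective with the commonly used quarter-wavelength rule of thumb for vertical seismic resolution, λ/4 = v / (4f). In the sketch below, only the 2 cm/pixel GSD is taken from the text; the interval velocity of 4500 m/s and dominant frequency of 30 Hz are assumed, illustrative values rather than measurements from Bjørnøya or the Barents Sea.

```python
def quarter_wavelength_m(velocity_m_s, dominant_frequency_hz):
    """Rule-of-thumb vertical seismic resolution (lambda / 4) in metres."""
    wavelength_m = velocity_m_s / dominant_frequency_hz
    return wavelength_m / 4.0


gsd_m = 0.02  # 2 cm/pixel, as stated for the Landnørdingsvika DOM
seismic_limit_m = quarter_wavelength_m(4500.0, 30.0)  # assumed v and f
ratio = seismic_limit_m / gsd_m

print(f"Quarter wavelength: {seismic_limit_m:.1f} m")
print(f"Seismic limit / GSD: {ratio:.0f}")
```

Under these assumptions, the DOM resolves features roughly three orders of magnitude smaller than those separable in seismic data, although interpretable features still need to span several pixels.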

The DOM nicely illustrates the changing depositional environment from east to west. The (semi-)arid red floodplain sediments, indicated from paleosol formation (Kraus, 1999), and conglomerates of the Landnørdingsvika Formation transition upward to the tidally influenced shallow-marine sandstones and carbonates that are associated with the Kapp Kåre Formation. The transition took place during a period in which tectonic activity slowed down and gave way to a regional transgression (Worsley et al., 2001). The base of the Kapp Kåre Formation can be identified from the DOM and is defined as the base of the first distinct carbonate bed. Within the Kapp Kåre Formation west of the transition, three karstified areas can be seen that may be used as exposed, onshore analogues to the karstified carbonate reservoirs of the Alta discovery.

From Outcrop Model to Synthetic Seismic Models

It is generally accepted that geophysical imaging of the subsurface provides nonunique solutions (Schaaf and Bond, 2019; Faleide et al., 2021). Uncertainty further arises from limitations in data acquisition and the highly complex Earth system (Schaaf and Bond, 2019; Faleide et al., 2021). Outcrops provide key constraints on geometries, sequences, and processes (Howell et al., 2014), and we underline that conventional field work remains a necessity to complement, constrain, and ground truth geophysical data, including DOM interpretations. However, DOMs, like outcrops, form the ideal digital data bridge between scales and disciplines (Fig. 13). This is especially true where DOMs span the dimensions from facies to seismic scale, such as in the Van Keulenfjorden DMT.

Apart from the well-established geological and stratigraphical framework that Svalbard has to offer, the remoteness of the area from signal-corrupting noise makes the archipelago well suited for geophysical studies and outcrop-driven synthetic geophysics (Beka et al., 2016). Vegetation-free exposures allow data acquisition close to outcrops and direct integration of surface and subsurface information. Within this setting, shallow geophysical techniques such as ground-penetrating radar (GPR), seismic hammer surveys, and electrical resistivity tomography yield subsurface data that most closely approach the spatial extent and resolution offered by outcrops and DOMs. Indeed, high-resolution DOMs have been integrated with shallow subsurface geophysical data to investigate paleokarst systems (Janocha et al., 2021) and a faulted, shale-dominated sequence (Betlem et al., 2022). Both studies highlight the potential of DOM-geophysics integration to better constrain the internal architecture, dimensions, and composition of the subsurface beyond the outcrop. In the former, outcrop features were correlated with GPR reflectors to extract 3-D subsurface geometries of carbonate sequences and paleokarst breccias at Rudmosepynten (Janocha et al., 2021). These studies illustrate the potential of Svalbard’s exceptional exposures when combined with geophysics, and they open up additional use cases for the Svalbox DMDb DOMs as accessible, analogue data sets.

Integration of DOM data sets and geophysics extends beyond the immediate vicinity of the outcrop and near-subsurface geophysics. Broad 2-D seismic data are available for most of Svalbard’s fjords and Spitsbergen’s largest valleys (e.g., Bælum et al., 2012), with shallow and deep electromagnetic data (e.g., magnetotellurics) more sparsely collected across the latter (Beka et al., 2016, 2017). Seismic data in Svalbard and the northern Barents Shelf are characterized by high seismic velocities and typically are of poor quality (Anell et al., 2016). The combination of core-log-outcrop integration and synthetic seismic modeling has proven to be useful to constrain and predict the geophysical response to the scale, resolution, and detail of geological features observed in outcrops (Lecomte et al., 2015; Anell et al., 2016). This is also the case in Svalbard, where uncertainty remains high, and synthetic seismic modeling has been actively used to aid in the interpretation of subsurface features:

  1. Synthetic seismic modeling of the Kvalpynten DOM on Edgeøya, for example, was used to bridge the gap between the onshore and offshore realms on the northwestern Barents Shelf and illuminate the effect of small-scale geological features such as channels, intrusions, and faults on seismic data (Anell et al., 2016).

  2. Lubrano-Lavadera et al. (2018) tested the seismic response to along-fault fluid migration of brine and CO2 through the reservoir and cap-rock sequences of the Longyearbyen CO2 Laboratory. The simplified reservoir and cap-rock model was inspired by the fault systems documented in the Konusdalen (DOM 2016–0002) and Konusdalen West (DOM 2019–0013) localities.

  3. Dynamic geomodels of the contact aureole of the Tschermakfjellet igneous sill complex were extracted from the Tschermakfjellet DOM (2016–0001). They were subsequently used to assess the impact of contact metamorphism and the contact aureole on the seismic response (Betlem et al., 2020b). The geomodels implemented realistic rock properties that varied based on the sill thickness to examine the influence of elastic property variations between the intrusions, metamorphosed zone, and host rock.
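The principle behind synthetic seismic modeling can be illustrated with a deliberately simplified 1-D convolutional sketch: acoustic impedances assigned to layers interpreted from a DOM are converted to reflection coefficients and convolved with a wavelet. This is a swapped-in simplification for illustration only and not the modeling approach of the studies listed above; the velocities, densities, sampling interval, and 30 Hz Ricker wavelet are assumed values.

```python
import math


def ricker(peak_freq_hz, dt_s, half_length_s):
    """Return a zero-phase Ricker wavelet sampled every dt_s seconds."""
    n = int(round(half_length_s / dt_s))
    samples = []
    for i in range(-n, n + 1):
        a = (math.pi * peak_freq_hz * i * dt_s) ** 2
        samples.append((1.0 - 2.0 * a) * math.exp(-a))
    return samples


def reflectivity(impedance):
    """Normal-incidence reflection coefficients between adjacent samples."""
    return [(z2 - z1) / (z2 + z1) for z1, z2 in zip(impedance, impedance[1:])]


def convolve_same(signal, wavelet):
    """Convolve and keep the central part, so output length equals input."""
    half = len(wavelet) // 2
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue  # skip samples without an impedance contrast
        for j, w in enumerate(wavelet):
            k = i + j - half
            if 0 <= k < len(out):
                out[k] += s * w
    return out


# Assumed rock properties: velocity (m/s) times density (kg/m^3).
host = 3000.0 * 2400.0   # hypothetical shale host rock
sill = 6000.0 * 2900.0   # hypothetical igneous sill

# Time-sampled impedance trace (1 ms sampling): host, 10 ms of sill, host.
impedance = [host] * 100 + [sill] * 10 + [host] * 90
rc = reflectivity(impedance)
synthetic = convolve_same(rc, ricker(peak_freq_hz=30.0, dt_s=0.001,
                                     half_length_s=0.05))
print(f"top-sill reflection coefficient: {rc[99]:.3f}")
print(f"peak absolute synthetic amplitude: {max(abs(a) for a in synthetic):.3f}")
```

Changing the sill thickness or wavelet frequency in this sketch shows how closely spaced reflections interfere, which is one reason outcrop-scale features can be difficult to resolve in seismic data.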

(Meta)Data Publishing Standards: The Way Forward

While digital models have seen a significant uptake in the geosciences, archaic methods of static screen grabs, webpage referencing styles, and poor archiving of the source material still dominate scientific literature (Buckley et al., 2022). Fortunately, the geoscientific community has taken note of the interlinked role that digital models play in providing enhanced accessibility, inclusivity, and reproducibility (Burnham et al., 2022). Increasingly, models are contributed to online digital model databases (Burnham et al., 2022). While certainly an improvement over not being available at all, these data are rarely accompanied by source data (e.g., photographs) and processing parameters.

The general lack of published source material significantly impedes future reprocessing and reinterpretation of data because only complete knowledge of the digital resource allows for efficient reuse and wider usage of the resources. To date, no major platforms facilitate the inclusion of source data and processing parameters alongside the published digital models. Furthermore, community-developed guidelines for standardized formats for metadata and data sharing are currently minimal.

The implementation of modern visualization tools with direct reference to the version-controlled source material (e.g., input data, digital models) using a versioned DOI provides users with a scientific citation and the ability to interpret the data in a manner of their choosing. The use of persistent identifiers (i.e., DOIs) further provides statistics on data use, which may direct future acquisition campaigns and iterative improvement of data quality over time. The latter requires known acquisition and processing histories, both of which are often overlooked yet critical for reproducibility. Advances in processing and imaging techniques in related fields have significantly improved the quality and resolution of legacy data sets, and improved archiving procedures and policies may facilitate reinterpretations that are otherwise difficult or impossible (e.g., Beccaletto et al., 2011). Best practices exist in related fields such as archaeology (D’Andrea and Fernie, 2013; Mi and Pollock, 2018) that may in the future lead the way to fully interoperable digital model databases in the geosciences.
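As a minimal sketch of what a reusable, citable model record might need to carry, the example below checks a record for a versioned persistent identifier, archived source imagery, and a processing report. The class, field names, file names, and the placeholder DOI are hypothetical and do not represent the Svalbox DMDb schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DomRecord:
    """Hypothetical metadata record for one digital outcrop model."""
    dom_id: str
    doi: Optional[str] = None                # persistent identifier
    version: Optional[str] = None            # version of the data package
    source_images: List[str] = field(default_factory=list)
    processing_report: Optional[str] = None


def missing_for_reuse(record):
    """List the items that would hinder reprocessing or citation."""
    missing = []
    if not record.doi:
        missing.append("doi")
    if not record.version:
        missing.append("version")
    if not record.source_images:
        missing.append("source_images")
    if not record.processing_report:
        missing.append("processing_report")
    return missing


def cite(record):
    """Build a versioned citation string; requires a DOI and a version."""
    if not record.doi or not record.version:
        raise ValueError("a versioned DOI is required for citation")
    return (f"DOM {record.dom_id}, version {record.version}, "
            f"https://doi.org/{record.doi}")


# Placeholder values for illustration only (not real identifiers).
complete = DomRecord(dom_id="example-0001", doi="10.0000/placeholder",
                     version="v1", source_images=["IMG_0001.JPG"],
                     processing_report="report.pdf")
screen_grab_only = DomRecord(dom_id="example-0002")

print(missing_for_reuse(complete))
print(missing_for_reuse(screen_grab_only))
print(cite(complete))
```

A check of this kind could run at submission time, so that models lacking source data or processing parameters are flagged before publication.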

With this contribution, we have taken the first step to address current shortcomings in digital model data publishing and have documented the Svalbox DMDb as a proof of concept. In doing so, we hope to encourage other repositories and the geoscientific community to adopt a standardized approach to digital outcrop data description, including methods and techniques chosen to generate the models, and crucially, record and openly share all source data. While still a work in progress, a reasonable compromise was sought between the need to store a complete record of data generation on the one hand and a practical metadata standardization process for digital outcrop models on the other hand, one that is also friendly to end users. As a result of this approach, multiple Svalbox DMDb models have been used in key scientific publications (Larssen et al., 2020; Janocha et al., 2021; Senger et al., 2022; previously given examples). The number of available models is expected to grow significantly as we further standardize the inclusion of Svalbard-acquired data in the Svalbox DMDb as part of the submission process in the coming years.

Herein, we present a description of the Svalbox DMDb, which currently includes 135 DOMs across the Svalbard archipelago, and our best practices for the acquisition, processing, and publishing of data. By providing all input, processing, and output data under FAIR principles, we hope to establish a precedent and encourage scientific publication of digital outcrop data, which often lack input imagery and processing parameters. The calendar-versioned release of the Svalbox DMDb provides technical and geological metadata that enhance geospatial integration of the DOMs. The data cover Proterozoic (14), Paleozoic (60), Mesozoic (51), and Cenozoic (31) outcrops, and they facilitate appraisal and characterization of several regionally important stratigraphic packages, including:

  1. 42 digitalized NPI type localities;

  2. 103 DOMs with structural elements, of which 11 DOMs comprise the Van Keulenfjorden DMT, visualizing a key cross section through the West Spitsbergen fold-and-thrust belt; and

  3. 44 DOMs that capture regionally important organic-rich source and cap-rock sequences, highlighting the presence of structural (e.g., faults and folding) and sedimentary heterogeneities (e.g., sand volcano, sandstone dike, and injectites) across the Longyearbyen CO2 Laboratory reservoir, cap rock, and overburden.

The DOMs enable multiscale quantitative data extraction and integration from localized fracture mapping to regional geomodeling, backstripping, and forward geophysical modeling. We also demonstrate that the Svalbox DMDb is readily integrated with the suite of available multiscale and multiphysical data that has been recorded on Svalbard. The database is expected to grow significantly as more of Svalbard’s geology is digitalized through dedicated campaigns and the submission of Svalbard-acquired data, including digital drill-core models and digital sample models.

Data provided by the Svalbox DMDb are of particular value as analogues because of Svalbard’s geological similarities to the Barents Shelf. Finally, we hope that the Svalbox DMDb may bring the geoscientific community one step closer to adopting a standardized approach to publishing digital model data, with equal emphasis on final data products (e.g., orthomosaics, DEMs, DOMs) and the associated input and metadata (e.g., source imagery and processing parameters).

¹Supplemental Material. Table S1: Svalbox Digital Model Database metadata parameters, values, and descriptors. Table S2: Svalbox Digital Model Database digital outcrop models and identifiers. Table S3: Additional details for the framework schematic provided in Figure 3 and steps on how to recreate the Svalbox Digital Model Database elsewhere. Please visit https://doi.org/10.1130/GEOS.S.23739885 to access the supplemental material, and contact editing@geosociety.org with any questions.
Science Editor: Christopher J. Spencer
Associate Editor: Francesco Mazzarini

We sincerely thank our colleagues, in particular, Andy Hodson, Anna Bøgh, Anna Sartell, Annelotte Weert, Erik Schytt Mannerfelt, Gabby Kleber, Gareth Lord, Julian Janocha, Karoline Løvlie, Kei Ogata, Kristine Larssen, Lilith Kuckeroo, Lise Nakken, Lotte van Hazendonk, Marjolein Gevers, Marte Hergot Festøy, Matthijs Nuus, Niklas Schaaf, Rafael Horota, Snorre Olaussen, Simon Oldfield, Sondre Hagevold, Tereza Mosočiová, Trine Andersen, and Veerle van Winden, for meaningful discussions and their contributions during several seasons of extensive field campaigns, some plagued by polar bears. We also thank Nils Nolde for the initial work on the database infrastructure. We further appreciated the data provided by the University Centre in Svalbard (UNIS) CO2 Laboratory (http://CO2-ccs.unis.no/) and acknowledge the academic licenses provided by Agisoft (Metashape) and VRGeoscience Limited (VRGS; https://www.vrgeoscience.com/). This study was partly funded by the Norwegian CCS Research Centre (NCCS; industry partners and the Research Council of Norway [RCN] no. 257579), the Suprabasins project (industry partners and RCN no. 295208), and the Research Centre for Arctic Petroleum Exploration (ARCEx; industry partners and RCN no. 228107), in addition to various Svalbard Science Forum (SSF) Arctic Field Grants, Svalbard Strategic Grants, and other grants provided by the Research Council of Norway (295627, 310638, 310639, 317492, 322259, 322398, 331679, 333145, 342078). The University of the Arctic (UArctic) funded additional field work and most of the drones used in the campaigns, and the Ørnen project funded by Lundin Norway AS financed the acquisition of the Festningen (DOM 2012–0001) outcrop. We thank Brian Burnham, an anonymous reviewer, and Associate Editor Francesco Mazzarini for constructive comments that improved the original manuscript.

Gold Open Access: This paper is published under the terms of the CC-BY license.