Recent Advancement in Geoinformatics and Data Science
Geoscience now faces the huge potential enabled by cyberinfrastructure, sensor networks, big data, cloud computing, and data science. In this new era, what skills should geoscientists have, and what actions can they take to foster new research topics? Are there already success stories of data science in the geosciences, and what lessons do they offer? Can data science bring fresh ideas to the geosciences, and vice versa? The chapters in this Special Paper present the latest progress and discoveries in both the methodology and technology of geoinformatics, and provide answers to those questions. The presented methodologies, technologies, and best practices will make this volume a useful reference with long-term impact for data-intensive geoscience in the next decade and beyond.
Improving reproducibility of geoscience models with Sciunit
*Erratum: In the first version of this chapter published online, “Ayman Nassar” was misspelled as “Ayam Nassar.” GSA sincerely regrets this error.
Published: March 22, 2023
Raza Ahmad, Young Don Choi, Jonathan L. Goodall, David Tarboton, Ayman Nassar*, Tanu Malik, 2023. "Improving reproducibility of geoscience models with Sciunit", Recent Advancement in Geoinformatics and Data Science, Xiaogang Ma, Matty Mookerjee, Leslie Hsu, Denise Hills
For science to reliably support new discoveries, its results must be reproducible. Assessing reproducibility is a challenge in many fields, including the geosciences, that rely on computational methods to support these discoveries. Reproducibility in these studies is particularly difficult: the researchers conducting studies must agree to openly share research artifacts, provide documentation of underlying hardware and software dependencies, ensure that computational procedures executed by the original researcher are portable and execute in different environments, and, finally, verify that the results produced are consistent. Often these tasks prove to be tedious and challenging for researchers.
Sciunit (https://sciunit.run) is a system for easily containerizing, sharing, and tracking deterministic computational applications across environments. Geoscience applications in the fields of hydrology, solid Earth, and space science have actively used Sciunit to encapsulate, port, and repeat workflows across computational environments. In this chapter, we provide a comprehensive survey of geoscience applications that have used Sciunit to improve sharing and reproducibility. We classify the applications based on their reproducibility requirements and show how Sciunit accommodates relevant interfaces and architectural components to support reproducibility requirements within each application. We aim to provide these applications as a Sciunit compendium of use cases for replicability, benchmarking, and improving the conduct of reproducible science in other fields.
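The final task listed in the abstract, verifying that a repeated run produces consistent results, can be illustrated with a minimal Python sketch. This is not Sciunit's own mechanism or API. It assumes that each run of a deterministic application writes its outputs to a directory, and the names `digest_outputs`, `compare_runs`, and `streamflow.csv` are hypothetical, chosen only for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def digest_outputs(output_dir):
    """Map each file's relative path to the SHA-256 digest of its bytes."""
    root = Path(output_dir)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_runs(original_dir, repeated_dir):
    """Return relative paths that differ between runs or exist in only one."""
    original = digest_outputs(original_dir)
    repeated = digest_outputs(repeated_dir)
    return sorted(
        name
        for name in original.keys() | repeated.keys()
        if original.get(name) != repeated.get(name)
    )


if __name__ == "__main__":
    # Two throwaway directories stand in for the outputs of an original run
    # and a repeated run in a different environment (hypothetical data).
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        Path(a, "streamflow.csv").write_text("day,flow\n1,3.2\n")
        Path(b, "streamflow.csv").write_text("day,flow\n1,3.2\n")
        print("mismatches:", compare_runs(a, b))

        # Change one value in the repeated run to simulate an inconsistency.
        Path(b, "streamflow.csv").write_text("day,flow\n1,3.3\n")
        print("mismatches:", compare_runs(a, b))
```

A byte-level comparison like this is strict. Outputs that embed timestamps or that differ by floating-point rounding across platforms would be flagged as mismatches, so a tolerance-based comparison may be a better fit for such cases.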