GEOREF RECORD

Advancing geologic document digitalization and information retrieval with generative AI

Song Hou, Tianhao Dong, Ojasvi Sancheti and Hao Liu
Advancing geologic document digitalization and information retrieval with generative AI (in Generative and physics-informed AI, Oleg Ovcharenko (prefacer), Haibin Di (prefacer), Umair Bin Waheed (prefacer) and Vladimir Kazei (prefacer))
Leading Edge (Tulsa, OK) (February 2025) 44 (2): 108-113

Abstract

This paper demonstrates how generative artificial intelligence (AI) enhances geoscientific document processing by improving text analysis, table extraction, and figure classification. Traditional workflows struggle with domain-specific terminology, poor-quality inputs, and rare formats. To address these challenges, we employ domain fine-tuned bidirectional encoder representations from transformers (BERT) models to enhance text processing. Additionally, we utilize multimodal large language models for precise table recognition and context-aware image classification. Finally, a domain-optimized retrieval system, GeoRAG, improves the relevance and accuracy of information retrieval. These AI-driven advancements streamline digitalization, enhance data extraction, and enable efficient handling of complex geoscientific documents. While challenges such as hallucinations, interpretability, and output consistency remain, this study highlights the transformative potential of generative AI for geoscience workflows and decision-making processes.


ISSN: 1070-485X
EISSN: 1938-3789
Serial Title: Leading Edge (Tulsa, OK)
Serial Volume: 44
Serial Issue: 2
Title: Advancing geologic document digitalization and information retrieval with generative AI
Title: Generative and physics-informed AI
Author(s): Hou, Song; Dong, Tianhao; Sancheti, Ojasvi; Liu, Hao
Author(s): Ovcharenko, Oleg (prefacer)
Author(s): Di, Haibin (prefacer)
Author(s): Bin Waheed, Umair (prefacer)
Author(s): Kazei, Vladimir (prefacer)
Affiliation: Viridien, London, United Kingdom
Affiliation: NVIDIA, Dubai, United Arab Emirates
Pages: 108-113
Published: 202502
Text Language: English
Publisher: Society of Exploration Geophysicists, Tulsa, OK, United States
References: 13
Accession Number: 2025-022118
Categories: Miscellaneous
Document Type: Serial
Bibliographic Level: Analytic
Illustration Description: illus. incl. 1 table
Secondary Affiliation: Calgary, AB, CAN, Canada
Country of Publication: United States
Secondary Affiliation: GeoRef, Copyright 2025, American Geosciences Institute. Reference includes data from GeoScienceWorld, Alexandria, VA, United States. Reference includes data supplied by Society of Exploration Geophysicists, Tulsa, OK, United States
Update Code: 2025
