ABSTRACT

Recent developments in distributed data systems now allow faster ingestion and processing of larger quantities of time‐series data than is possible with current seismic, hydroacoustic, and infrasonic analysis platforms. However, the data model and storage architecture of these systems differ significantly from those in use today. We developed a data acquisition and signal analysis platform using a relatively inexpensive cluster of commodity computing hardware running a Hadoop Distributed File System, an Accumulo database infrastructure, and the Zeppelin web‐based analytics tool suite. The Accumulo data model allows individual waveform samples and their associated metadata to be stored as discrete rows in the database. This is a significant departure from traditional storage practices, in which continuous waveform segments are stored with their associated metadata as a single entity. Our design allows rapid table scans of large data archives within the Accumulo database, so that specific waveform segments can be located, retrieved, and analyzed directly. The system infrastructure is horizontally scalable: data acquisition and processing resources can be expanded simply by adding new data nodes. This scalability permits the system to accommodate the ingestion and analysis of new data as the network of seismic, hydroacoustic, and infrasonic sensors grows. The barrier to entry for establishing a functional data acquisition system is relatively low: a proof‐of‐concept prototype can be developed on a limited budget and later scaled as storage and processing requirements dictate.
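
The sample‐per‐row idea described above can be illustrated with a minimal Python sketch. It does not use the Accumulo client API and is not the authors' schema; it only mimics a sorted key‐value table in memory. The row‐key layout (station, channel, and a zero‐padded timestamp in microseconds), the class `SortedSampleTable`, and the helpers `row_key`, `ingest_segment`, and `fetch_window` are all hypothetical names introduced for illustration. The point it shows is that when keys sort by channel and then by time, retrieving a waveform window reduces to a single range scan.

```python
import bisect
from typing import List, Tuple


def row_key(station: str, channel: str, t_us: int) -> str:
    """Hypothetical row key: 'STATION.CHANNEL|<zero-padded epoch microseconds>'.

    Zero padding makes lexicographic order match time order
    (assumes non-negative timestamps).
    """
    return f"{station}.{channel}|{t_us:020d}"


class SortedSampleTable:
    """In-memory stand-in for a sorted key-value table (illustration only)."""

    def __init__(self) -> None:
        self._keys: List[str] = []
        self._values: List[float] = []

    def put(self, key: str, value: float) -> None:
        # Keep keys sorted on insert; overwrite if the key already exists.
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._values[i] = value
        else:
            self._keys.insert(i, key)
            self._values.insert(i, value)

    def scan(self, start_key: str, end_key: str) -> List[Tuple[str, float]]:
        # Range scan over [start_key, end_key) using binary search.
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_left(self._keys, end_key)
        return list(zip(self._keys[lo:hi], self._values[lo:hi]))


def ingest_segment(table: SortedSampleTable, station: str, channel: str,
                   start_us: int, rate_hz: float, samples: List[float]) -> None:
    """Explode one continuous segment into one row per sample."""
    step_us = round(1_000_000 / rate_hz)
    for i, sample in enumerate(samples):
        table.put(row_key(station, channel, start_us + i * step_us), sample)


def fetch_window(table: SortedSampleTable, station: str, channel: str,
                 t0_us: int, t1_us: int) -> List[float]:
    """Return the samples of one channel with t0_us <= time < t1_us."""
    rows = table.scan(row_key(station, channel, t0_us),
                      row_key(station, channel, t1_us))
    return [value for _, value in rows]


if __name__ == "__main__":
    table = SortedSampleTable()
    # Ten samples at 40 Hz, i.e. one sample every 25,000 microseconds.
    ingest_segment(table, "STA1", "BHZ", 0, 40.0, [float(i) for i in range(10)])
    print(fetch_window(table, "STA1", "BHZ", 50_000, 125_000))  # -> [2.0, 3.0, 4.0]
```

In a real deployment the sorted table would be distributed across data nodes and the key design would need to account for metadata and load balancing; those details are outside the scope of this sketch.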
