Exploration for natural hydrogen seepages has grown rapidly, with surface seepages being seen as a primary indication of an underlying resource. While soil gas sampling has proven successful in detecting hydrogen in the soil, current understanding of hydrogen in the subsurface remains incomplete, with various potential sources of origin.

This paper presents a selected review of published hydrogen hotspots around the world, examining their distribution with respect to two main features commonly associated with natural seepages: sub-circular depressions (SCD) and fault zones. Based on the findings, the paper proposes a conceptual soil-gas survey design for efficiently identifying potential hydrogen hotspots. The proposed scheme is tested on a SCD and a fault zone located in Western Australia. The study findings near the Perth Basin and the Yilgarn Craton highlight the presence of anthropogenic artifacts in hydrogen measurement, necessitating further investigation to constrain the possible sources of hydrogen generation.

The study of the SCD in the Perth Basin supports the development of a statistical understanding of hydrogen distribution around surface features associated with hydrogen hotspots. Such a framework can guide soil-gas surveys and target areas with a higher likelihood of detecting natural hydrogen seepage. By addressing these points, prospective areas for natural hydrogen seepage can be better identified and evaluated, ultimately contributing to the development of hydrogen as a sustainable energy resource.

Thematic collection: This article is part of the Hydrogen as a future energy source collection available at: https://www.lyellcollection.org/topic/collections/hydrogen

Natural hydrogen is a significant component of natural gas, but it is not generally considered in the accounting of hydrogen resources for industrial usage (Nikolaidis and Poullikkas 2017). Most of the hydrogen currently produced comes from hydrocarbons (96%, IRENA 2023), principally via the reforming of natural gas, with coal gasification playing a lesser role. Although the coupling of natural gas reforming with carbon capture and storage (CCS) represents a low-carbon alternative, it is not yet widely commercial (Wood Mackenzie 2021). Green hydrogen produced by renewables-powered electrolysis accounts for only about 1% of global hydrogen output (IRENA 2023). As a result, there is a growing interest in identifying natural hydrogen accumulations and potential hydrogen gas fields. The production of natural hydrogen represents a promising method for obtaining large quantities of low-carbon hydrogen at a price point similar to that of hydrogen derived from hydrocarbons (Deville and Prinzhofer 2016; Gaucher 2020).

Our current understanding of hydrogen in the subsurface remains partial and fragmented, with various potential sources of origin (Deville and Prinzhofer 2016; Donze et al. 2020). Zgonnik (2020) classified natural hydrogen sources into two groups: primordial (or primary) hydrogen stored within the earth's core and mantle (Larin 1993; Walshe et al. 2005; Walshe 2006), and secondary hydrogen generated from chemical reactions within the mantle and crust. Natural secondary hydrogen can have a biogenic source, including both thermogenic and microbial processes (Tissot and Welte 1984; Hunt 1996; Nandi and Sengupta 1998; Hallenbeck and Benemann 2002) or an abiogenic source mainly relying on water–rock interaction in the crust (Reeves and Fiebig 2020 and references therein). The latter mainly involves hydrogen resulting from (e.g. Klein et al. 2020; Boreham et al. 2021): (1) hydrothermal alteration or the reduction of water during the oxidation of iron in minerals, which includes serpentinization of ultramafic-mafic rocks; (2) radiolysis induced by the decay of K, Th, and U in minerals, which generates α, β, and γ radiation that breaks apart the H–O bonds of water and generates hydrogen; (3) mechanoradical reactions on the wet surfaces of faults and the dissociation of the Si–O bond in silicate minerals by mechanical forces. Alternatively, emission of magmatic hydrogen is observed during, or after, volcanic eruptions and is related to the oxidation of hydrogen sulfide.

Since the fortuitous discoveries of natural hydrogen accumulations in various places and the initial years of production in Mali (Prinzhofer et al. 2018), exploration for natural hydrogen has grown rapidly. Initially, a top-down approach has been adopted, primarily focusing on surface seepages as an indication of an underlying resource. The mobility of hydrogen and its vertical migration can be responsible for surface hydrogen seepages, which can be measured directly or indirectly observed in the form of characteristic sub-circular topographic features referred to as ‘Zapadyny,’ ‘Carolina Bays,’ ‘fairy circles,’ ‘sub-circular depressions’ (i.e. SCD), or other local terms (e.g. Sukhanova et al. 2013; Larin et al. 2015; Zgonnik et al. 2015; Prinzhofer et al. 2018; Rezaee 2020; Frery et al. 2021, 2022; Moretti et al. 2021a, b). These SCDs have been attributed to localized collapse caused by the alteration of rock along vertical hydrogen migration pathways (Myagkiy et al. 2020). While a correlation between SCDs and unusual hydrogen content in soil has been found at some locations, there is limited data to support a direct causation, as not all of them may be related to hydrogen or other gas leakages. The review of McMahon et al. (2023) identified 18 SCDs with hydrogen measurements, and when combined with other studies (e.g. Sukhanova et al. 2013; Frery et al. 2022) and non-English literature, we likely have fewer than 100 SCDs with hydrogen measurements. This is only a small portion of the hundreds or even thousands of SCDs that can be identified from satellite data (e.g. Moretti et al. 2021a).

Alternatively, surface hydrogen seepages have also been reported near outcropping fault zones (e.g. Sugisaki et al. 1983; McCarthy and Kiilsgaard 2001; Dogan et al. 2007; Singh and Mukherjee 2020; Xiang et al. 2020; Etiope 2022).

Although recent exploration programs have shown that continuous monitoring is more effective in characterizing hydrogen emissions by accounting for flux and periodicity (e.g. Prinzhofer et al. 2019; Moretti et al. 2021b; Etiope 2022), spot surface measurement or sampling remains one of the most practical and cost-effective methods for early exploration due to the scarcity of long-term monitoring capabilities. This technique involves deploying a gas analyser in the soil to measure hydrogen concentration and has been widely used to identify hydrogen hotspots in the soil; this can be coupled with gas sampling for composition analysis. Several studies have employed this approach, including Larin et al. (2015), Sukhanova et al. (2013), Zgonnik et al. (2015), Prinzhofer et al. (2018, 2019), Frery et al. (2021, 2022), and Xiang et al. (2020). Soil gas sampling has been successful in identifying pockets of hydrogen in the soil potentially linked to a deeper trap or sources (e.g. Larin et al. 2015; Zgonnik et al. 2015; Prinzhofer et al. 2018, 2019; Moretti et al. 2021b). To minimize the cost of operators’ deployment and time on site, careful survey design and selection of sampling points and areas are crucial.

While there is evidence supporting the presence of natural hydrogen sources in the subsurface (Smith 2002; Zgonnik 2020; Boreham et al. 2021; Milkov 2022), it is important to note that hydrogen detected in soil or during drilling can have an anthropogenic origin resulting from artificial processes. These include:

  • Corrosion of steel casing from exposure to H2S (Lyon and Hulston 1984).

  • Corrosion reactions between acids that are either naturally present in groundwaters (Zinger 1962) or introduced during wellbore operations (Feldbusch et al. 2018).

  • Decomposition of borehole water due to aqueous reactions with drilled sediment and steel from the drilling tools or casing (Bjornstad et al. 1994).

  • Drill-bit metamorphism, whereby drilling-induced heat cracks organic matter (Halas et al. 2021).

The last two processes are particularly relevant for evaluating natural hydrogen seeps, as they can potentially occur during spot soil-gas measurements when using rotary or percussion drills. Therefore, it is important to consider both natural and anthropogenic sources when studying the origin of hydrogen in the subsurface.

Another element to consider when investigating hydrogen seep through soil gas measurement is that hydrogen seems ubiquitous within the soil and can be produced by multiple microbially mediated processes with concentrations and signatures overlapping those of geologic origins (Etiope 2022). Interpreting hydrogen sources requires support from geochemical data, correct contextualization, and analyses of associated gases (e.g. CO2, CH4, N2, He) (Etiope 2022).

Despite these uncertainties and limitations, the definition of natural seeps still represents a critical step in the current exploration workflow for the validation of a hydrogen system and migration processes (e.g. Prinzhofer et al. 2019; Hand 2023). Therefore, this paper presents a comprehensive review of published hydrogen hotspots around the world, examining their distribution with respect to two main features commonly associated with natural seepages: sub-circular depressions (SCDs) and fault zones. Based on the findings, the paper proposes a conceptual soil-gas survey design for efficiently identifying potential hydrogen hotspots. The proposed scheme is tested on a SCD and a fault zone located in the Perth Basin, Western Australia, with the study findings highlighting the likely presence of anthropogenic artifacts in the measurement of hydrogen in soil. The results of this study offer valuable insights for researchers, policymakers, and industry stakeholders interested in exploring and developing hydrogen as a clean energy source.

Soil-gas spot measurements

The hydrogen content in soil gas is often measured using gas analysers, gas chromatographs, or mass spectrometers at the surface of natural hydrogen seeps (e.g. Sukhanova et al. 2013; Larin et al. 2015; Zgonnik et al. 2015; Susanto et al. 2016; Prinzhofer et al. 2018, 2019; Dugamin et al. 2019; Xiang et al. 2020; Frery et al. 2021, 2022; Lefeuvre et al. 2021).

Soil acts as a major sink for hydrogen, which can mask geologically supplied inputs, making it difficult to identify subsurface hydrogen systems (Zgonnik 2020). However, when the influx of hydrogen from a point source outweighs the sink, hydrogen seepage occurs, which can be a positive indicator of subsurface hydrogen systems. To accurately evaluate hydrogen emanation, it is essential to sample it from strata below any zones of elevated biological activity, which are usually concentrated in the topsoil, varying from a few millimeters to 60 cm of soil depth. For instance, in the coastal plain environment of the eastern USA, Zgonnik et al. (2015) reported an increase in hydrogen concentration with depth, with a threshold observed at the transition from peat to sand, within a 50–100 cm interval. Successful field measurements of hydrogen concentration in soils have been made at depths of 10, 30 and 100 cm (Larin et al. 2015), 120 cm (Sukhanova et al. 2013; Larin et al. 2015), and even depths ranging from 15 to 240 cm (Shimada et al. 2008; Xiang et al. 2020). A classic approach for spot surface soil-gas sampling is to drill the soil and couple a portable gas analyser to measure hydrogen at a depth between 80 and 120 cm (Prinzhofer et al. 2018; Dugamin et al. 2019; Frery et al. 2021, 2022; Moretti et al. 2021b).

Rotary drilling has traditionally been a widely used method for drilling into the subsurface to collect hydrogen samples. However, some authors recommend the use of percussion drilling instead, such as Lefeuvre et al. (2021) and Halas et al. (2021), to prevent any production of hydrogen through heating and cracking of organic matter (Lewan 1997; Lorant and Behar 2002; Li et al. 2017). Furthermore, there is evidence to suggest that mechanoradical processes linked to the dissociation of silicates in water-saturated rocks can also produce hydrogen during drilling (Kameda et al. 2003; Takehiro et al. 2011), which may contribute to high concentrations of hydrogen in soil gas associated with active tectonic faults (Kita et al. 1982; Sato et al. 1984). As such, to minimize the potential for hydrogen production during drilling, percussion drilling may be a preferable option over rotary drilling.

Expressions of hydrogen surface seepage

Surface hydrogen seepage is a phenomenon that can be observed directly or indirectly (e.g. Prinzhofer et al. 2019; Myagkiy et al. 2020; Rezaee 2020; Frery et al. 2021; Moretti et al. 2021a, b). Such seepage can be an indicator of an ineffective seal, and can be associated with circular to elliptic depressions (i.e. SCDs). On the other hand, elevated concentrations of hydrogen may be linked to the presence of a basement-rooted fault zone (e.g. Dugamin et al. 2019).

Hydrogen and sub-circular depressions

Hydrogen has been measured seeping from surface depressions with characteristic sub-circular shape (e.g. Sukhanova et al. 2013; Larin et al. 2015; Zgonnik et al. 2015; Prinzhofer et al. 2018, 2019; Rezaee 2020; Frery et al. 2021, 2022; Moretti et al. 2021a, b; Mainson et al. 2022) (Fig. 1). The source of this hydrogen is believed to be deep beneath the sedimentary section, migrating upward to the Earth's surface. The SCDs observed are thought to be caused by localized collapse resulting from the alteration of rocks along hydrogen migration pathways. The hydrogenation of rocks produces acid that can dissolve the rocks during the upward movement of gas, creating a preferred vertical migration pathway (Larin et al. 2015; Zgonnik et al. 2015; Myagkiy et al. 2020). In some cases, the depressions are aligned along structural trends corresponding to basement faults (Larin et al. 2015).

The depth of the depression is usually less than 10 m below the baseline elevation (Moretti et al. 2021a), and changes in soil and vegetation have been associated with surface hydrogen emanations. Vegetation is often absent or sparsely distributed within the depression, with soil bleaching or discoloration possibly related to hydrogen seepages outlining the depression (Sukhanova et al. 2013; Zgonnik et al. 2015), and/or rim(s) of denser and/or taller vegetation found at the edges of the depression (e.g. Larin et al. 2015; Zgonnik et al. 2015; Rezaee 2020; Frery et al. 2021). The average diameter of these emitting structures varies from a few meters to several kilometres. According to 2D advective-diffusive modelling conducted by Myagkiy et al. (2020), the radius of the structure is linked to the time spent by the hydrogen in the soil, which is dependent on the soil permeability, the depth of gas leakage point, and the pressure of the bubble.

In certain areas, hydrogen accumulations have been detected in the subsurface without any visible surface morphology indicating the presence of such accumulations. This has been observed in the Amadeus Basin in Australia, above the Bourakebougou accumulation in Mali, and in New Caledonia (Moretti et al. 2021a).

Etiope (2022) emphasized the difficulty in determining the source of hydrogen gas present in soil, specifically in SCDs that occur in arid grasslands. While some studies suggest that the hydrogen in such features is of geological origin (e.g. Larin et al. 2015; Frery et al. 2021), other research indicates that it may be produced by microbial processes in waterlogged soils (e.g. Kramer and Conrad 1993). Furthermore, the hydrogen concentrations found in SCDs are similar to those produced by biological processes in surface ecosystems, such as wetlands and agricultural soils (10⁰–10³ ppmv, Etiope 2022). Therefore, it is important to be cautious when assuming a deep origin for hydrogen in these cases.

Hydrogen and fault zones

The available data indicates that fault zones can serve as pathways for the migration of hydrogen gas from deep sources or from in-situ mechanoradical reactions within the fault zone itself. This hypothesis is supported by the observation of elevated concentrations of hydrogen in soil gases sampled along fault zones (Wakita et al. 1980; Sugisaki et al. 1983). It is possible for both in-situ hydrogen and hydrogen migrating from deeper sources to coexist or alternate within fault zones, as has been demonstrated in previous studies (Sugisaki et al. 1983; Fang et al. 2018).

In many instances, a strong correlation has been observed between hydrogen anomalies and fault zones, particularly those intersecting the crystalline basement (Larin et al. 2015, and references therein). In the absence of soil cover, hydrogen emissions are frequently observed along these fault zones, which are believed to act as migration pathways for hydrogen (Myagkiy et al. 2020). It has been proposed that many of the circular depressions in Western Australia could be aligned with linear deep faults and doleritic dykes where hydrogen generation and migration can occur (Rezaee 2020). Xiang et al. (2020) observed that hydrogen migrates through fractures in deformation zones associated with active thrust faults and can charge the soil up to several kilometres away from the main slip surface. Similarly, Donze et al. (2020) demonstrated a correlation between hydrogen occurrences and deep faults delineating horst and graben structures, which affect the entire sedimentary sequence of the São Francisco Basin. Hydrogen seepages within fault zones have also been reported from the East African Rift (Pasquet et al. 2022, 2023) and along the Pyrenean Frontal Thrust (Lefeuvre et al. 2021).

A study of hydrogen mud logging in the Wenchuan Earthquake Fault (China) revealed that there was a notable absence of hydrogen in the centre of the fault zone, while high concentrations of hydrogen were observed in the fractured zones, which showed a positive correlation with fracture density (Fang et al. 2018). Hydrogen is thought to have originated from the interaction between water and fresh silicate mineral surfaces exposed during faulting as well as from the mantle. Experimental studies have indicated that hydrogen concentrations are significantly associated with faults and their activity (Wakita et al. 1980). Active Japanese faults have recorded hydrogen concentrations of up to 9.36% (Sugisaki et al. 1983). Soil gases collected along the Yamasaki fault in Japan exhibited molecular hydrogen concentrations of up to 3% (Kita et al. 1980). Studies have revealed that historically active faults in Japan contain more hydrogen than ‘prehistorically’ active faults. Laboratory experiments have also shown that hydrogen in fault breccia and gouge can be generated from a paste made of newly pulverized rocks and water, indicating that the fresh mineral surface produced by tectonic stresses reacts with groundwater to generate hydrogen (Sugisaki et al. 1983). Concentrations of up to 25% of hydrogen were observed in gases extracted from a drill core obtained in the fracture zone of the Nojima Fault Zone in Japan (Arai et al. 2001). Furthermore, drill-mud gases collected from fractured zones surrounding the San Andreas Fault revealed hydrogen concentrations of up to 6% (Wiersberg and Erzinger 2008).

These findings suggest that fault zones can act as migration pathways for hydrogen, highlighting its potential as a tracer of fault activity. This is in line with legacy work from the oil and gas industry that shows that active or critically-stressed faults are more likely to be fluid conduits whereas inactive or non-critically stressed faults are thought more likely to act as barriers (e.g. Sibson 1987; Muir-Wood and King 1993; Anderson et al. 1994; Barton et al. 1995; Sanderson and Zhang 1999; Wiprut and Zoback 2000; Zoback and Townend 2001; Revil and Cathles 2002; Wilkins and Naruk 2007).

Hydrogen distribution in sub-circular depressions

Measurements of hydrogen concentrations at a depth of c. 1 meter beneath the surface have been obtained from circular depressions located in various regions across the globe, including Russia, Ukraine, the United States of America, Brazil, Mali, and Australia (Larin et al. 2015; Zgonnik et al. 2015; Prinzhofer et al. 2018, 2019; Zgonnik 2020; Frery et al. 2021; Moretti et al. 2021a, b; Mainson et al. 2022).

The selection of sampling locations for soil-gas studies in these features has varied among researchers and sites, primarily influenced by land accessibility, water distribution, and vegetation. Myagkiy et al. (2020) have reported that the distribution of hydrogen concentration within the circular structures is not uniform, with higher concentrations observed in the rim regions and near-zero concentrations detected in the centre. This spatial variation in hydrogen concentration is attributed to the increased hydrogen flow in areas of high permeability.

In Figure 2, we present the distribution of normalized hydrogen readings (normalized to [0;1]) plotted against the relative distance from the centre of circular depressions based on selected published hydrogen soil-gas sampling surveys (Sukhanova et al. 2013; Larin et al. 2015; Zgonnik et al. 2015; Prinzhofer et al. 2019; Frery et al. 2021; Moretti et al. 2022). The relative distance is defined as the ratio of the distance from the centre of the SCD to the radius at that location; this approach is used to normalize the distance to the centre and enable the comparison of samples in elliptical depressions (i.e. with varying radius) and between SCDs of different sizes. In this study we define the outer limit of SCDs based on variation of vegetation from satellite images. However, it should be noted that this choice is somewhat subjective and that uncertainties remain around the definition of the outer limit of SCDs, which can be defined by the surface morphology and slope inflection (e.g. in arid environment) or by the limit of denser vegetation (e.g. Moretti et al. 2021a, 2023); gamma spectrometry data also suggest that hydrogen emanation in SCDs might not always be restricted to areas with anomalous vegetation (Rigollet and Prinzhofer 2022). The normalized hydrogen readings show a scattered distribution with no clear trend between hydrogen concentration and location within the features (i.e. radius ratio). Simple linear regression analysis conducted on eleven structures revealed four negative and five positive relationships between hydrogen concentration and distance from the centre, while the trendline for all the data (i.e. Aggregate in Fig. 2) showed a very slight positive relationship between hydrogen concentration and distance from the centre.
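The radius-ratio normalization described above can be sketched in a few lines of Python. This is a minimal illustration rather than the processing used for Figure 2: it assumes that the outline of a depression is approximated by an ellipse, that sample coordinates are expressed in a local metric grid, and that readings are rescaled with a min-max transform within each survey. The function names and parameters (relative_distance, min_max_normalize, azimuth_deg) are hypothetical.

```python
import math


def relative_distance(x, y, cx, cy, semi_major, semi_minor, azimuth_deg=0.0):
    """Distance from the centre divided by the ellipse radius in that direction.

    Returns 0 at the centre, 1 on the mapped outer limit and >1 outside it.
    azimuth_deg is the orientation of the major axis, measured
    counter-clockwise from the +x axis (an assumed convention).
    """
    theta = math.radians(azimuth_deg)
    dx, dy = x - cx, y - cy
    # Rotate the offset into the ellipse's own axes.
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)
    # For an ellipse, r / R(direction) reduces to this expression.
    return math.hypot(u / semi_major, v / semi_minor)


def min_max_normalize(values):
    """Rescale readings to [0, 1]; a constant series maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

With this definition a sample halfway between the centre and the rim returns 0.5 whatever the bearing, which is what allows elliptical depressions of different sizes to be plotted on a common axis.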

Figure 3 illustrates the frequency distribution of hydrogen soil-gas sampling surveys conducted in circular depressions, plotted against their relative distance from the centre of the depression (Sukhanova et al. 2013; Larin et al. 2015; Zgonnik et al. 2015; Prinzhofer et al. 2019; Frery et al. 2021; Moretti et al. 2022). Harmonized hydrogen frequency values for percentiles P90, P75, and P50 are presented as frequency histograms and normal distributions. The P90 samples, representing those with hydrogen concentration values greater than 90% of the sample population, are found to be predominantly located near the rim of the circular depression, with a maximum at c. 0.75 of the feature radius. In contrast, the P75 and P50 samples are closely distributed; however, the maximum of the fitted normal distribution is around 0.65 for P75, suggesting a statistically preferential location near the rim of the depression compared to the P50 samples, whose maximum is around 0.55. Overall, this analysis reveals that the hydrogen concentration distribution is moderately correlated with the location in the circular features, with a discernible trend observed between hydrogen concentration and distance from the centre.
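The percentile classes and radial histograms of this kind can be reproduced with a short script. The sketch below is illustrative only and rests on stated assumptions: each sample is a (relative distance, hydrogen reading) pair, a percentile class contains every sample at or above that percentile of its population, and percentiles are interpolated linearly between ranks. The helper names (percentile, samples_above, radial_histogram) are hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between ranks."""
    s = sorted(values)
    if not s:
        raise ValueError("percentile() needs at least one value")
    pos = (len(s) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(s) - 1)
    return s[lower] + (s[upper] - s[lower]) * (pos - lower)


def samples_above(samples, q):
    """Keep the (relative_distance, h2) pairs at or above the q-th percentile."""
    threshold = percentile([h2 for _, h2 in samples], q)
    return [(d, h2) for d, h2 in samples if h2 >= threshold]


def radial_histogram(samples, n_bins=10):
    """Fraction of samples per relative-distance bin between 0 and 1."""
    counts = [0] * n_bins
    for d, _ in samples:
        if 0.0 <= d <= 1.0:
            # A sample exactly on the rim goes into the outermost bin.
            counts[min(int(d * n_bins), n_bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins
```

Calling radial_histogram(samples_above(samples, 90)) gives the share of the highest readings in each radial bin; the position of the largest bin is the quantity compared between P90, P75 and P50 in the text.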

The observed trend in the data is consistent with the findings from Myagkiy et al. (2020) where the hydrogen concentration is higher near the rim of the circular features. However, caution is needed in interpreting these results due to the possibility of sampling bias affecting the statistical analysis.

Hydrogen distribution in fault zones

Studies investigating soil-gas surveys near fault zones have reported substantial spatial and temporal variability in hydrogen concentrations (Dogan et al. 2007; Xiang et al. 2020). Notably, studies have demonstrated a correlation between maximum hydrogen concentration and fault activity, providing evidence that hydrogen surveys can serve as a proxy for monitoring active fault zones (Dogan et al. 2007). However, the efficacy of such studies is limited by variations in sampling protocols, including differences in sampling steps, profile lengths, and orientations, which are largely dictated by factors such as site access and the size and geometry of the fault zone (e.g. McCarthy and Kiilsgaard 2001; Dogan et al. 2007; Singh and Mukherjee 2020; Xiang et al. 2020).

Although hydrogen produced by frictional mechanisms is expected to be present within the high strain fault core, hydrogen that migrates along fault zones from deeper sources is more likely to be present in the fractured damage zone (Fang et al. 2018). The migration of hydrogen along fault zones can result in complex surface expressions, as demonstrated by Xiang et al. (2020) in a large thrust fault in Xinjiang (China). Soil-gas data from around the fault revealed that a highly deformed hangingwall with extensional fractures facilitated the upward escape of hydrogen, whereas lower permeability fractures in the footwall led to the accumulation of hydrogen gas in shallow soil. No hydrogen anomaly was observed directly above the main slip surface. Figures 4–6 illustrate the frequency distribution of hydrogen soil-gas sampling surveys conducted along fault zones (McCarthy and Kiilsgaard 2001; Dogan et al. 2007; Singh and Mukherjee 2020). The distance from the fault core is plotted against the frequency of P90 samples (i.e. samples with hydrogen concentration value greater than 90% of the sample population) and <P90 samples.

In Figure 4, we present the frequency distribution for hydrogen soil-gas sampling surveys conducted along the Trans-Challis Fault System in Idaho, USA (McCarthy and Kiilsgaard 2001). Our analysis shows that the highest hydrogen concentrations (P90) are found within 2000 m of the fault core. The frequency of lower hydrogen concentration samples (<P90) peaks at 2000 m from the fault and gradually decreases to less than 5% at 4000 m from the fault. This pattern suggests a gradual decrease in hydrogen soil-gas concentration away from the fault zone, possibly due to the decreasing structural permeability in the damage zone and/or conductive secondary faults.

The sparse data from the Yamasaki fault (Japan) (Dogan et al. 2007) (Fig. 5) indicates that the highest hydrogen concentrations (P90) are observed within 10 m of the fault core, and the lower hydrogen concentration samples (<P90) are distributed over a 70 m wide zone from the fault core. This pattern reveals a rapid decline in hydrogen soil-gas concentration away from the 60 m thick fault zone, with the majority of structural permeability and/or conductive secondary faults likely situated <10 m from the mapped fault.

The data from Singh and Mukherjee's (2020) study of a dense network of active faults, fractures, and lineaments in Haryana and Delhi states, India (Fig. 6), reveal that the highest hydrogen peroxide (H2O2) concentrations (P90) are situated within 5000 m of the fault zones, and the lower H2O2 concentration samples (<P90) are mostly dispersed in an 11 000 m wide zone from the fault core. This pattern indicates a gradual decrease in H2O2 soil concentration away from the fault zone, with most structural permeability and/or conductive secondary faults probably located <10 km from the fault.

The data analysis of the fault zones indicates a loose inverse relationship between hydrogen soil-gas concentration and distance from the fault core, which could be attributed to the diminishing structural permeability in the damage zone and/or conductive secondary faults. However, caution should be taken in interpreting the results due to limited data, likely rheological and structural complexities in fault zones, and the possibility of sampling bias affecting the statistical analysis.
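A distance-to-fault frequency analysis of the type shown in Figures 4 to 6 can be sketched as follows. This is an illustrative outline under stated assumptions, not the workflow applied to the published datasets: the mapped fault is represented as a polyline of (x, y) vertices in metres, sample positions share that grid, and frequencies are counted in fixed-width distance bins. All function names are hypothetical.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest distance from point p to the segment a-b (all (x, y) tuples)."""
    px, py = p
    ax, ay = a
    bx, by = b
    abx, aby = bx - ax, by - ay
    length_sq = abx * abx + aby * aby
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its ends.
    t = ((px - ax) * abx + (py - ay) * aby) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * abx), py - (ay + t * aby))


def distance_to_fault(p, trace):
    """Shortest distance from p to a fault trace given as a list of vertices."""
    return min(distance_to_segment(p, trace[i], trace[i + 1])
               for i in range(len(trace) - 1))


def frequency_by_distance(distances, bin_width, n_bins):
    """Fraction of all samples per distance bin; far samples are not binned."""
    counts = [0] * n_bins
    for d in distances:
        idx = int(d // bin_width)
        if idx < n_bins:
            counts[idx] += 1
    total = len(distances)
    return [c / total for c in counts] if total else [0.0] * n_bins
```

Running frequency_by_distance separately on the P90 and <P90 subsets, with the same bin width, yields the two curves that are compared for each fault system.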

Effect of soil and drilling on spot measurement

The potential existence of anthropogenic hydrogen generated as a result of artificial processes during sampling represents a source of uncertainty when characterizing natural hydrogen seeps. Recent soil crushing experiments conducted by Halas et al. (2021) have demonstrated that drilling can generate enough energy to crack organic matter naturally present in soil through a milling action or excessive heating, resulting in the artificial production of hydrogen and alkenes during rock crushing. The discovery of hydrogen in a shallow borehole (Bjornstad et al. 1994), along with laboratory experiments, supports the notion that hydrogen can be generated during percussion drilling through the decomposition of borehole waters resulting from aqueous reactions with drilled sediments and steel from the drilling tool.

Recent experimental data from the Perth Basin and Yilgarn Craton in Western Australia have highlighted the potential for false positive results in spot soil-gas measurements due to anthropogenic hydrogen generated during the sampling process. Random spot measurements were conducted at various locations with different soil types (Fig. 7), and hydrogen concentrations were measured both before and after drilling. The drill duration varied depending on the soil resistance, but typically consisted of three measurements with a maximum drilling duration of 60 seconds.

At nine locations no hydrogen was measured (Fig. 7). At four locations the measurements revealed a positive correlation between hydrogen concentration in the drillhole and drilling duration (Figs 7, 8).

The hydrogen concentration in a drillhole located in the southern Perth Basin, east of Golden Bay (Fig. 7), was found to increase significantly with drilling duration, with concentrations rising from 30 ppm after 30 seconds of drilling to 280 ppm after 60 seconds of drilling (Fig. 8). The soil type present at this location is the Southern River soil (Department of Agriculture and Food, WA 2003), which is rich in peat and clay, and is likely responsible for the generation of hydrogen due to the cracking of organic matter.

A weaker, but similar correlation between hydrogen concentration and drilling duration was observed at three additional locations on the Darling Scarp, separating the coastal plain and Perth Basin from the Archean Yilgarn Craton (Figs 7, 8). These locations were characterized by a thin layer of red and yellow earth directly above an Archean granitic basement that posed significant resistance to drilling. Anthropogenic hydrogen production resulting from aqueous reactions with drilled sediments and steel from the drilling tools could possibly be responsible for the observed hydrogen concentrations at these locations.
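The relationship between drilling duration and measured hydrogen can be screened with a simple least-squares slope. The sketch below is a hedged illustration, not the procedure used to produce Figure 8: it assumes that each drillhole has two or more readings taken after known cumulative drilling durations, and the function name drilling_trend is hypothetical.

```python
def drilling_trend(durations_s, h2_ppm):
    """Least-squares slope of hydrogen reading against drilling duration (ppm/s)."""
    n = len(durations_s)
    if n < 2 or n != len(h2_ppm):
        raise ValueError("need two or more paired readings")
    mean_t = sum(durations_s) / n
    mean_c = sum(h2_ppm) / n
    sxx = sum((t - mean_t) ** 2 for t in durations_s)
    if sxx == 0:
        raise ValueError("drilling durations must differ")
    sxy = sum((t - mean_t) * (c - mean_c)
              for t, c in zip(durations_s, h2_ppm))
    return sxy / sxx


# Readings quoted above for the drillhole east of Golden Bay:
# 30 ppm after 30 s of drilling and 280 ppm after 60 s.
slope = drilling_trend([30, 60], [30, 280])
print(f"{slope:.1f} ppm per second of drilling")
```

A clearly positive slope at a site would be a reason to treat the reading as a possible drilling artefact and to repeat the measurement with a shorter or different drilling method before interpreting it as a seep.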

Survey design for spot measurement in sub-circular depressions

Based on the data presented in Figure 3, it appears that the probability of high hydrogen concentration in surface depression decreases as one moves from the rim towards the centre. The zone with the highest probability of including high hydrogen concentration in soil is the first ring, which has a width of c. 25% of the radius and starts from the rim of the depression. This first ring encompasses 64 and 53% of the P90 and P75 hydrogen concentration values, respectively, as shown in Figure 3. Moving towards the centre of the depression, a second ring with a width of around 35% of the radius can be observed, which includes 16, 32, and 41% of the P90, P75, and P50 hydrogen concentration values, respectively. This ring has the second-highest likelihood of including high hydrogen concentration in soil. Additionally, Figure 3 demonstrates that 34% of the P90 hydrogen concentration values are within a distance of less than 10% of the radius from the centre of the depression.

Therefore, it is suggested that a soil-gas survey conducted in a surface depression should prioritize targeting the outermost ring, with a width of 25% of the depression's radius, to maximize the likelihood of sampling the highest hydrogen concentration (Fig. 9). Subsequently, the survey should proceed towards the centre to sample the next ring, which has a width of around 35% of the radius and encompasses 25% of the hydrogen measurements greater than P75 (Fig. 9). Figure 3 also shows that high hydrogen concentration values can be found near the centre of the depression.

A survey design with radial transects and targeted sampling in the outer half of a depression and near its centre would increase the likelihood of detecting local hydrogen hotspots (Fig. 9). Alternatively, a circular survey design with higher sample density in the first ring and lower sample density in the second ring would allow for targeted sampling in areas most likely to contain high hydrogen concentrations. It may also be informative to collect samples immediately outside the visible feature since previous studies have shown that hydrogen soil-gas values can remain elevated beyond the depression rim (e.g. Larin et al. 2014; Zgonnik et al. 2015; Rigollet and Prinzhofer 2022).
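The two layouts described above can be sketched as station coordinates. The following minimal Python example generates a circular track at a chosen normalized radius and a radial transect from the centre to the rim, again approximating the depression as an ellipse. The function names, the ellipse approximation and the numerical values in the usage lines are hypothetical and serve only to illustrate the geometry.

```python
import math


def _to_east_north(along, across, azimuth_deg):
    """Rotate axis-aligned offsets into (east, north) offsets in metres."""
    az = math.radians(azimuth_deg)
    east = along * math.sin(az) + across * math.cos(az)
    north = along * math.cos(az) - across * math.sin(az)
    return (east, north)


def ring_track(semi_major, semi_minor, r_norm, n_points, azimuth_deg=0.0):
    """Stations on a scaled copy of the rim ellipse (r_norm = 1 is the rim).

    Stations are evenly spaced in parametric angle, which is only
    approximately even in ground distance for an elongated depression.
    """
    stations = []
    for i in range(n_points):
        t = 2.0 * math.pi * i / n_points
        along = r_norm * semi_major * math.cos(t)
        across = r_norm * semi_minor * math.sin(t)
        stations.append(_to_east_north(along, across, azimuth_deg))
    return stations


def radial_transect(semi_major, semi_minor, t_deg, r_norms, azimuth_deg=0.0):
    """Stations along one straight line through the centre.

    t_deg is the parametric angle measured from the major axis;
    r_norms lists the normalized radii at which to place stations.
    """
    t = math.radians(t_deg)
    return [
        _to_east_north(r * semi_major * math.cos(t),
                       r * semi_minor * math.sin(t),
                       azimuth_deg)
        for r in r_norms
    ]


# Hypothetical depression: semi-axes 250 m and 200 m, major axis at 135 degrees.
# Denser sampling in the first ring, sparser in the second ring.
first_ring = ring_track(250.0, 200.0, 0.875, 24, azimuth_deg=135.0)
second_ring = ring_track(250.0, 200.0, 0.575, 12, azimuth_deg=135.0)
# One transect from the centre to just outside the rim.
transect = radial_transect(250.0, 200.0, 0.0,
                           [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.1],
                           azimuth_deg=135.0)
```

The normalized radii 0.875 and 0.575 are the mid-lines of the first and second rings; the last transect station at 1.1 lies outside the rim, in line with the suggestion to sample immediately beyond the visible feature.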

Currently, there is a lack of data regarding the optimal sampling step or sample density for hydrogen soil-gas surveys, as lateral variability patterns in hydrogen soil-gas values vary widely in the literature. However, it is reasonable to assume that a regular sampling step, ranging from tens to hundreds of metres, in each of the rings would produce the most informative dataset.

Sampling of the Lake Beermullah sub-circular depression, Western Australia

In the northern Perth Basin of Western Australia, a series of swamps and lakes are aligned along a NNW–SSE structural trend that is subparallel to the Darling Fault. One such swamp, forming Lake Beermullah, is an example of a sub-circular depression with a major axis of 512 m, oriented NW–SE, and a minor axis of 407 m (Fig. 10). To evaluate the hydrogen distribution at this site, a soil-gas survey was conducted. Based on the global dataset presented earlier, the survey was designed to include one transect and a circular track around the first ring (Figs 9, 11). To measure the hydrogen concentration, a portable gas analyser with a resolution of 1 ppm and an 80 cm inox tube was used. The tube was introduced into a hole made with a percussion and rotary drill. Due to land access limitations, the transect was located c. 60 m from the centre of the SCD (Fig. 11). The transect consisted of 18 samples spaced at an average distance of 24 m along a length of c. 340 m. The circular track around the first ring included 23 samples spaced at an average distance of 56 m along a length of c. 1300 m (Fig. 11).

Figure 12 presents the frequency distribution of hydrogen soil-gas concentrations for the SCD at Lake Beermullah. The hydrogen concentrations are divided into percentiles (P90, P75, and P50) to better analyse the distribution of the data. The results show that the P90 samples, which have the highest hydrogen concentration values, are primarily located near the rim of the SCD, in the first ring (as shown in Fig. 6). Meanwhile, the P75 and P50 samples, with hydrogen concentration values greater than 75 and 50% of the sample population, respectively, are located in both the first and second rings.

The distribution of hydrogen concentrations in the Lake Beermullah samples is consistent with the global dataset, as indicated by the correlation between the histogram and the normal distribution calculated from the global dataset. This finding suggests that the hydrogen distribution at this site is similar to that of other locations surveyed worldwide.
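The percentile classes used above can be illustrated with a short Python sketch that computes a percentile cutoff and the share of samples at or above that cutoff falling in each ring. Two assumptions are made here and are not confirmed by the text: the classes are treated as cumulative (the P75 class includes the P90 samples), and percentiles are obtained by linear interpolation between ranked values. The sample values in the usage lines are synthetic and are not survey results.

```python
from collections import Counter


def percentile(values, q):
    """q-th percentile (0 to 100) by linear interpolation between ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values supplied")
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def ring_shares(samples, q):
    """Fraction of samples at or above the q-th percentile found in each ring.

    samples: list of (h2_ppm, ring_label) pairs.
    Returns a dict mapping ring_label to a fraction between 0 and 1.
    """
    cutoff = percentile([h2 for h2, _ in samples], q)
    rings = [ring for h2, ring in samples if h2 >= cutoff]
    counts = Counter(rings)
    return {ring: count / len(rings) for ring, count in counts.items()}


# Synthetic values for illustration only (ppm, ring label).
synthetic = [
    (1, "second ring"), (2, "inner zone"), (3, "second ring"),
    (4, "first ring"), (5, "second ring"), (6, "first ring"),
    (7, "second ring"), (8, "first ring"), (9, "first ring"),
    (10, "first ring"),
]
p90_shares = ring_shares(synthetic, 90)
p50_shares = ring_shares(synthetic, 50)
```

With exclusive rather than cumulative classes the shares would differ, so the convention needs to be fixed before comparing a local survey with a compiled dataset.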

Survey design for spot measurement in fault zones

Based on the presented data (Figs 4–6), it appears that the probability of encountering hydrogen hotspots diminishes with increasing distance from the fault core. Additionally, the data reveal substantial variability in the distance at which hydrogen (or H2O2) hotspots occur. This variability likely reflects differences in fault size, structural style, distribution, geometry, and size of the structural elements (e.g. fault core, damage zone, secondary faults), which are dependent on the accumulated strain, lithological composition, and tectonic regime. Specifically, P90 concentration hotspots have been observed within distances ranging from a few metres to several kilometres from the zone of maximum strain. Lower concentrations exhibit a decline away from the P90 values, with the rate of decrease varying. The extent over which the hydrogen associated with the fault zone is detectable likely depends on the thickness of the damage zone. Investigations into normal faulting over the past few decades have revealed distinct displacement-thickness scaling relationships (Robertson 1982; Evans 1990; Shipton et al. 2006; Wibberley et al. 2008; Childs et al. 2009). Therefore, it is plausible that the presence of hydrogen in the fault zone is dependent on the degree of accommodated strain.

The data from Xiang et al. (2020) also suggests that there may be distribution variability between the hangingwall and footwall, particularly in cases where there are different lithologies present and where differential strain accommodation occurs on each side of the fault core.

To effectively capture hydrogen seeps in proximity to fault zones, a soil-gas survey should employ transects perpendicular to the primary deformation plane, with a length that approximates the estimated width of the fault zone and its corresponding damage zone. Although the damage zone of normal faults in siliciclastic environments can be roughly estimated to be similar in width to the displacement (Childs et al. 2009), compilations of published fault displacement-thickness data (e.g. Shipton et al. 2006; Childs et al. 2009) indicate considerable variability, with up to three orders of magnitude of scatter in measured thickness at a single displacement value. This finding highlights the level of uncertainty surrounding the thickness of the damage zone and the appropriate length of the survey required to accurately capture hydrogen seeps in proximity to fault zones.

In order to address potential lateral variability of strain in the fault zone, a systematic sampling approach such as serial transects normal to the fault strike could be employed, leading to a more comprehensive understanding of the fault zone as a potential migration pathway for hydrogen. The sampling strategy should target all fault zone components, with a focus on the damage zone due to its higher likelihood of exhibiting increased structural permeability. Subsequently, attention should be paid to visible or mapped fault intersections, which are expected to display enhanced vertical permeability (e.g. Gartrell et al. 2004), as well as secondary or conjugate faults.
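The serial-transect layout described above can be sketched as follows. This minimal Python example places stations along several parallel transects normal to the fault strike, each extending the same distance on either side of the fault trace. The function name, its parameters and the values in the usage lines are hypothetical; in practice the half-length would be chosen from the estimated damage-zone width, which is itself highly uncertain.

```python
import math


def fault_transects(origin, strike_deg, half_length, step, n_transects, spacing):
    """Return serial transects normal to the fault strike.

    origin: (east, north) of a point on the fault trace, in metres.
    strike_deg: fault strike, clockwise from north.
    half_length: distance covered on each side of the fault trace (m).
    step: station spacing along each transect (m).
    n_transects, spacing: number of transects and their separation
        along strike (m); the set is centred on the origin.
    Each station is (east, north, signed_distance_from_trace).
    """
    s = math.radians(strike_deg)
    along = (math.sin(s), math.cos(s))      # unit vector along strike
    normal = (math.cos(s), -math.sin(s))    # unit vector normal to strike
    n_side = int(half_length // step)
    transects = []
    for k in range(n_transects):
        shift = (k - (n_transects - 1) / 2.0) * spacing
        cx = origin[0] + shift * along[0]
        cy = origin[1] + shift * along[1]
        line = []
        for j in range(-n_side, n_side + 1):
            d = j * step
            line.append((cx + d * normal[0], cy + d * normal[1], d))
        transects.append(line)
    return transects


# Hypothetical north-striking fault: three transects 500 m apart,
# each reaching 1000 m either side of the trace with a 250 m step.
layout = fault_transects((0.0, 0.0), 0.0, 1000.0, 250.0, 3, 500.0)
```

The sign of the distance only separates the two sides of the trace; which side is the hangingwall depends on the dip direction and has to be assigned from the mapped geology. Extra stations at fault intersections or on secondary faults would be added to this regular layout by hand.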

Currently, there is a lack of data regarding the optimal sampling step or sample density for hydrogen soil-gas surveys near fault zones, as lateral variability patterns in hydrogen soil-gas values vary widely in the literature. However, it is reasonable to assume that a regular sampling step, ranging from tens to hundreds of metres, in each of the fault zone components would produce the most informative dataset.

Sampling of the Darling Fault zone, Western Australia

In the Permian, Triassic, and Jurassic periods, the north–south to NNW–SSE trending Darling Fault played a dominant role in shaping the Perth Basin and continued to do so until the Cretaceous, when Australia separated from Greater India. The fault's displacement exhibits lateral variability, yet the overall offset is estimated to be in the range of tens of kilometres (Dentith et al. 1993). The footwall compartment, comprising the Yilgarn Craton, predominantly consists of gneiss and gneissic granite, displaying mylonitic and cataclastic textures that are suggestive of a gradual reduction in deformational activity with increasing distance from the fault core. The hangingwall, on the basin side, is characterized by at least 6 km of Paleozoic to Mesozoic sediments overlying the Archean basement (Mory and Iasky 1996).

A soil-gas survey was implemented to assess the distribution of hydrogen and determine the potential for natural seepage. The survey was designed to cover an area perpendicular to the Darling Fault, with dimensions of 13 km (NNW–SSE) by 17 km (WSW–ENE). The survey included a total of 15 samples, with 6 located in the hangingwall and 9 in the footwall. However, due to restricted land access, the distribution of samples is highly irregular, with sample distances ranging from 1.9 to 5.7 km as shown in Figure 13. To measure the hydrogen concentration, a portable gas analyser with a resolution of 1 ppm and an 80 cm inox tube was used. The tube was introduced into a hole made with a percussion and rotary drill.

Figures 14 and 15 illustrate the results of the soil-gas survey conducted in the Darling Fault zone, displaying the frequency distribution and absolute concentration of hydrogen categorized by compartment (i.e. hangingwall and footwall) and divided into percentiles P90 and <P90. The data show that the highest hydrogen concentration (P90, 47 ppm) was found in samples located near the fault core (<1000 m) and on the Yilgarn Craton (i.e. footwall) (Figs 13, 14). However, the remaining measurements in the footwall do not provide substantial evidence to support a significant correlation between hydrogen concentration and proximity to the fault core (Figs 14, 15). Furthermore, no clear association was observed between hydrogen concentration and distance from the fault core on the basin side (i.e. hangingwall) (Figs 14, 15).

The observation data collected during the soil-gas survey reveal a strong correlation between soil type, resistance to drilling and hydrogen concentration. Notably, the four highest hydrogen concentrations (ranging from 19 to 47 ppm; Fig. 13) were detected in thin laterite overlying Archean gneiss. The drilling penetration at these locations was approximately an order of magnitude lower than that observed in sediments of the Perth Basin. Moreover, the drilling time at these sites was frequently greater than 40 seconds. The soil type, substratum and drilling pattern are similar to those encountered on the Darling Scarp in the southern Perth Basin (Fig. 7), where anthropogenic hydrogen production resulting from aqueous reactions with drilled sediments and steel from the drilling tools is suspected (Fig. 8).

Importantly, while the current interpretation of the results from the Darling Fault cannot rule out the presence of anthropogenic hydrogen, the observed disparity between hydrogen concentrations in the hangingwall and footwall zones may also be partially attributed to differences in the deformation and rheology of the compartments, as proposed by Xiang et al. (2020). Moreover, the distribution of heightened hydrogen concentration (>19 ppm) near the fault core may be influenced by the presence of high deformation zones, such as cataclasite and mylonite.

The findings presented here support the notion that interpretation of hydrogen soil-gas measurements should be approached with caution. Further investigation, such as mineralogical analysis and isotopic analysis of the hydrogen and associated compounds, should be carried out to more fully characterize the soil composition and gain a better understanding of the possible sources of hydrogen generation (e.g. Etiope 2022).

Conclusions

The discussions and findings from the soil-gas measurements near various surface depressions and faults highlight the importance of understanding the distribution of hydrogen in relation to these features for identifying and evaluating prospective areas for natural hydrogen seepage. The following points summarize the conclusions:

  1. Current technology does not allow spot soil-gas measurements or continuous monitoring of hydrogen concentrations to be carried out on all potential surface seeps. As a result, it is essential to develop a statistical understanding or framework of hydrogen distribution around surface features commonly associated with hydrogen hotspots.

  2. This statistical framework can help guide the design of soil-gas surveys and target areas with a higher likelihood of detecting natural hydrogen seepage, which in turn would contribute to a more efficient evaluation of prospective regions for hydrogen exploration and potential resource development.

  3. The relationship between hydrogen concentration and proximity to geological features, such as faults and surface depressions, can be complex and influenced by factors such as lithology, deformation, and rheology.

  4. A soil-gas survey near surface depressions should prioritize targeting the outermost ring, which has a width of c. 25% of the depression's radius, and proceed towards the centre, as this area has the highest probability of encountering high hydrogen concentrations in soil.

  5. Soil-gas surveys near fault zones should employ transects perpendicular to the primary deformation plane and cover the fault zone and its corresponding damage zone to effectively capture natural hydrogen seeps and account for variability in fault size, structural style, distribution, geometry, and size of the structural elements.

  6. The presence of anthropogenic hydrogen, which may be generated artificially during the sampling process, adds uncertainty to the interpretation of soil-gas measurements; thus, careful analysis and further investigation, such as mineralogical analysis and isotopic analysis of the hydrogen and other compounds, should be carried out to better constrain the possible sources of hydrogen generation.

By addressing these points and developing a robust statistical framework for hydrogen distribution, researchers and industry professionals can better identify and evaluate prospective areas for natural hydrogen seepage, ultimately contributing to the development of hydrogen as a sustainable energy resource.

The authors would like to thank Juanma García-Ruiz and the anonymous reviewer for their valuable comments.

LL: conceptualization (lead), data curation (lead), formal analysis (lead), investigation (lead), methodology (lead); JS: data curation (equal), investigation (supporting), methodology (equal)

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

The datasets generated during and/or analysed during the current study are not publicly available due to ongoing research activities but are available from the corresponding author on reasonable request.