## Abstract

We describe a method for laser ablation inductively coupled plasma mass spectrometry (LA-ICP-MS) elemental analysis of geological materials using low-dilution Li-borate fused glass wavelength dispersive X-ray fluorescence (WDXRF) pellets, with samples, drift monitor and 18 reference materials (RMs) identically prepared. After analysis for 46 elements by WDXRF, LA-ICP-MS intensities from samples and RMs are collected, and background corrected with *Iolite* software. *HALite*, a new software application, was developed to derive the elemental compositions from the LA-ICP-MS net signals. In *HALite*, elements are drift-corrected using polynomial functions, and flux-fused RM element sensitivities are calculated from known mass fractions. Multiple internal standard (IS) elements are used to model each sample's laser response. Analyte mass fractions in unknowns are determined using the calibrated sensitivity correlation models for multiple IS elements. Either the WDXRF mass fractions or the initial round of calculated LA-ICP-MS mass fractions are used to calculate weighted mean sensitivities. Validation experiments with flux-fused RMs run as unknowns yield results with less than 5–10% total relative uncertainty for most analytes. We derive equations which allow calculation of the precision and total uncertainty as a function of mass fraction for each analyte element.

**Supplementary material**: Table 1 - RMs used; Table 2 - Operating parameters; Table 3 - Model v. accepted mass fractions; Table 4 - NIST 610 v. multiple IS models; Table 5 - Fitting parameters; Appendix 1 - HALite description; Appendix A - Summary calibration graphs; Appendix B - Validation results; Appendix C - WDXRF comparisons; Appendix D - Repeatability uncertainties; Appendix E - RM uncertainties; and Appendix F - Total uncertainties for this article are available at https://doi.org/10.6084/m9.figshare.c.6639885

The trace element compositions of rocks are often measured by dissolving rock powders into solutions, then analysing the solutions using inductively coupled plasma mass spectrometry (ICP-MS). The digestion may be directly conducted using multiple-acid methods, or by making Li-borate flux-fused glass first, then dissolving the powdered glass (e.g. Awaji *et al.* 2006; Park *et al.* 2013; Amosova *et al.* 2016). The latter is a method used by some research labs for combined major element (by X-ray fluorescence; XRF) and trace element (by solution ICP-MS) analyses. With the availability of laser ablation (LA) micro-sampling for ICP-MS analysis, we report a method, a variation of that proposed earlier by several others (e.g. Ødegård and Hamester 1997; Ødegård *et al.* 1998; Eggins 2003; Orihashi and Hirata 2003; Yu *et al.* 2003; Vogel *et al.* 2006; Kon *et al.* 2011; Kil and Jung 2015) to analyse the Li-borate glasses after wavelength dispersive XRF (WDXRF) analysis by LA-ICP-MS. The method bypasses the digestion procedure of solution ICP-MS to expedite the whole-rock trace element analysis, with equivalent accuracy and precision relative to solution ICP-MS analysis.

Over the past few decades LA-ICP-MS methodology has been developed and widely used to determine the elemental mass fractions of minerals (e.g. Arrowsmith 1987; Fryer *et al.* 1995; Longerich *et al.* 1996; Günther and Hattendorf 2005; Sylvester 2008). LA-ICP-MS was also used to measure the trace and sometimes major element mass fractions in specific bulk geologic materials (e.g. Imai 1990; Jeffries *et al.* 1995; Li *et al.* 2011; Peng and Hu 2012). Bulk analyses of natural glasses, either MORB or obsidian, for example, lend themselves well to laser ablation determinations because of the inherent homogeneity in such materials (Pereira *et al.* 2001; Jenner and O'Neill 2012*a*, *b*). Methods employing ablation of solid rock or pressed rock powders have been proposed but suffer from the inherent inhomogeneity of natural samples at the micron scale, the sampling scale of the ablation technique, even when finely ground (e.g. Van Heuzen and Morsink 1991; Durrant 1992; Jarvis and Williams 1993; Jeffries *et al.* 1995; Tibi and Heumann 2003; Fernández *et al.* 2008; Peng and Hu 2012; Macholdt *et al.* 2016). Typical pressed pellet methods suffer from poor repeatability and accuracy; improved repeatability and accuracy are reported from ablations of nano-ground powders (Garbe-Schönberg and Müller 2014; Mukherjee *et al.* 2014; Peters and Pettke 2017; Wu *et al.* 2018). However, nano-grinding requires expensive agate milling and extensive washing with ultra-pure water and ethanol to make nanoparticle pressed pellets, and accuracy and precision are still not ideal. For example, Peters and Pettke (2017) report very low mass fractions for Zr in GSP-2 (445 v. expected 621 $\mu $g g^{–1}) and Mo (67 v. expected 250 $\mu $g g^{–1}) in BCR-2, which is not surprising given the small sample masses they employed (2.25 g for nano-milling; 120 mg in pressed pellet). Such small aliquots are subject to nugget effects, which result in low mass fraction determinations.

To increase sample homogeneity, high-temperature fusions have been used to create synthetic glasses for whole-rock composition analysis using LA-ICP-MS (e.g. Van Heuzen 1991; Fedorowich *et al.* 1993; Perkins *et al.* 1993; Lichte 1995; Chen *et al.* 1997; Ødegård and Hamester 1997; Ødegård *et al.* 1998; Becker and Dietze 1999; Becker *et al.* 2000; Pickhardt *et al.* 2000; Günther *et al.* 2001; Eggins 2003; Orihashi and Hirata 2003; Yu *et al.* 2003; Kurosawa *et al.* 2006; Vogel *et al.* 2006; Petrelli *et al.* 2007; Nehring *et al.* 2008; Stoll *et al.* 2008; Kon *et al.* 2011; Leite *et al.* 2011; Zhu *et al.* 2013; Kil and Jung 2015; Kon and Hirata 2015; He *et al.* 2016; Reading *et al.* 2017; Monsels *et al.* 2018; Wu *et al.* 2018). These methods also sometimes suffer from the use of small sample masses, and often from uncertainties no better than 10–30% relative. In our early experiments with adaptations of published methods we found uncertainties for many elements to exceed our uncertainty goal of 10% relative. The high uncertainty of traditional LA-ICP-MS mineral or bulk composition analysis is chiefly due to the use of only one external standard, often not matrix matched with the samples, and one internal standard element. In this paper, we propose a method using a combination of multiple external standards and multiple internal standard elements to establish calibrations. In addition, we reiterate the calibration models to further improve the analytical accuracy and precision, to levels approaching those of solution ICP-MS. We have also developed software to handle the data reduction and multiple internal standard element modelling. In addition, we provide a detailed analysis of the relative uncertainty for each element as a function of concentration.

## Materials

Rock samples and reference material (RM) powders were made into low-dilution (1:2, sample:flux with Li-tetraborate only) doubly fused glasses in this study. The reason and procedures are explained below. To make glasses, the choice is between flux-free and fluxed fusions. Flux-free fusions (e.g. Fedorowich *et al.* 1993; Becker *et al.* 2000; Kurosawa *et al.* 2006; Nehring *et al.* 2008; Stoll *et al.* 2008; Zhu *et al.* 2013; He *et al.* 2016; Reading *et al.* 2017; Wu *et al.* 2018) require very high fusion temperatures, which can lead to loss of volatile analytes such as Cs, Rb, Ge, Cu, Pb, Sn and Zn (e.g. Brown 1977; Fedorowich *et al.* 1993; Nehring *et al.* 2008; Stoll *et al.* 2008; Reading *et al.* 2017). Fluxed fusions with Li-borates are another means of generating homogeneous glasses and eliminating the particle size and mineralogic effects in pressed powders, at much lower fusion temperatures than flux-free fusions. Most Li-borate fusions are done with high dilution factors (1:5 to 1:24.8), likely because of the potential for crucible damage with low-dilution fusion in Pt crucibles. Some workers used low-dilution borate glasses (1:1.3 to 1:3; Lichte 1995; Orihashi and Hirata 2003; Yu *et al.* 2003; Vogel *et al.* 2006; Petrelli *et al.* 2007), because low-dilution fusion improves signal strength, lowers blanks and reduces deposits on the sampling cones (Lichte 1995). In this study, we used low dilution, 1:2, Li-borate glasses; the sample mass employed was 3.5 g.

One advantage of borate-fluxed fusion over flux-free fusion (or nano-grinding) is the reduction of potential nugget effects due to the larger sample masses employed. Sylvester (2001) summarized other advantages of Li-borate flux-fused rock powders for LA-ICP-MS calibration and analysis: (1) the ability of Li-borate fluxes to dissolve resistant minerals, (2) improved matrix matching between calibration materials and samples, and (3) the fact that the flux blank is built into the calibration curve so it is automatically subtracted if multiple external standards are used. Lower fluxed fusion temperatures (*c.* 1000°C) also favour improved volatile element retention compared with flux-free fusion (e.g. Norman *et al.* 2003).

The glasses were doubly fused to attain the best homogeneity, using an adaptation of the preparation method described by Johnson *et al.* (1999). The chief difference is in our use of diamond polishing, rather than SiC grit, to create the analytical surface. First fused beads were re-ground to powder in a tungsten carbide ring mill for 30 seconds and fused again; both fusions were under static conditions in a 1000°C muffle furnace. The flat surface of the doubly fused bead was ground to a 15 $\mu $m finish on diamond lapping plates and sonicated in ethanol for WDXRF analysis. For subsequent LA-ICP-MS analysis, the glasses were further polished to a 1 $\mu $m finish.

This study used 18 diverse RMs for LA-ICP-MS calibration and 17 diverse RMs for validation as shown in Supplementary Table 1. RMs were obtained from the USGS (United States Geological Survey), the GSJ (Geological Survey of Japan), the IAG-GeoPT (International Association of Geoanalysts-GeoProficiency test), the CCRMP (Canadian Certified Reference Materials Project), the GIT-IWG (Groupe International de Travail-International Working Group), and the South African Mintek.

## Choice of drift monitor

A commonly employed drift monitor in LA-ICP-MS analysis is the trace element doped soda lime glass NIST 610. However, a low-dilution Li-tetraborate glass has a much lower and more uniform ablation sensitivity (net analyte LA-ICP-MS intensity divided by mass fraction) than a soda lime glass (Fig. 1). In our initial work we found the accuracy of many elements to be compromised due to the use of NIST 610 as the drift monitor. To drift correct with a more closely matrix-matched monitor, with ablation characteristics like our Li-borate flux-fused glasses, we cast an artificial drift monitor with basaltic andesite bulk composition, doped with several powdered minerals such as zircon and garnet to provide strong trace element signals, as a low-dilution (1:2) Li-tetraborate glass pellet. This produced a drift monitor with very similar ablation characteristics to the RMs and unknowns being measured: the sensitivity of any element in the monitor equals the median of the sensitivities of the same element in the flux-fused RMs (Fig. 1).

## RM mass fraction choices

In selecting RMs to employ for both calibration and validation, the ideal requirements are: (1) RMs that are well-characterized for more than 40 element mass fractions, and (2) RMs that would provide a wide range of mass fraction values for every element of interest. These constraints are not always attainable; satisfaction of the second goal sometimes requires relaxation of the first. Most especially, problems with uncertainty are encountered with the published values for several RMs, especially HREEs in several GSJ reference samples (Fig. 2); these issues were noted in earlier work by Orihashi and Hirata (2003). For example, the initial published data (Imai *et al.* 1999) for JR-3 for Dy and Ho were derived from fewer than five reporting laboratories and had very large uncertainties. Data for Er, with extremely large uncertainties, were derived from only three reporting laboratories, and no ‘recommended’ value for Tm was proposed. In all these cases collection and evaluation of post-publication data compiled by GeoREM (Jochum *et al.* 2005) were used to update the values. Most often, the median of the data determined during the past two decades was selected, but in a few cases the values reported by labs that we regard as highly reliable (e.g. Korotev 1996) were chosen. The RM element mass fractions adopted are compiled in Supplementary Table 3.

## Analytical methods

The fused glasses were analysed by WDXRF first, then LA-ICP-MS. For WDXRF determinations, single backgrounds for each element were measured, and background was subtracted with formulas derived from pure-element spike experiments. Equal time was spent counting peak and background positions for all elements. Total analytical time per pellet was *c.* 130 minutes. All intensities were collected at 45 kV and 45 mA with a Rh target tube. Over 200 spectral interferences were corrected with net intensity ratio factors or formulas derived from experiments with pure element spikes doped into either SiO_{2} or Al_{2}O_{3} matrices. Loss on ignition-eliminated influence coefficients were used for matrix correction. WDXRF calibration was done using the intensities gathered from 72 flux-fused RMs, chiefly those issued by the USGS and GSJ, but also including RMs from the CRPG, GIT-IWG, NIST, BAS, Mintek, and other sources. Further details of the calibration procedure and extensive validation of the WDXRF method can be accessed at https://www.hamilton.edu/academics/analytical-lab.

LA and ICP-MS operating parameters at the two LA-ICP-MS labs are compared in Supplementary Table 2. Both labs used raster lines of the same length, with each track laid far enough from the others to avoid spallation debris. At Rensselaer Polytechnic Institute (RPI), line speeds were higher (100 $\mu $m s^{−1}), but the beam diameter was larger (150 µm), and four tracks were ablated; at Colorado School of Mines (CSM) only two tracks were ablated with a smaller diameter beam (100 $\mu $m), but line speed was much slower (20 $\mu $m s^{−1}) so total ablation time per sample was 2.5 times that at RPI. At RPI the maximum laser fluence (10 J cm^{–2}) was used to improve coupling with transparent, low-Fe samples. The higher fluence facilitated good ablation on a wide range of compositions. At CSM, fewer difficulties were encountered with transparent samples, but rhyolites and carbonates with less than 1%m/m FeO* were still found to couple poorly to the laser. For these samples a small addition (10–50 mg) of spec-pure (five ‘9s’) Fe-oxide was added to the glass, requiring a third fusion following the WDXRF analysis. This was effective at improving signal strength and reducing variability in element sensitivities. Also, at CSM, all samples were put in a vacuum desiccator for at least 12 hours before analysis. After the samples were loaded in the sample chamber, the chamber was pumped to −80 kpa then filled with He for three cycles during the backfill process.

During each analytical run the custom drift monitor was ablated 15–20 (CSM) or 30–40 (RPI) times (on an un-ablated surface each time) in alternation with the samples and the flux-fused RMs. The *Iolite* software package (Paton *et al.* 2011) was used to select the most stable portions of each ablation signal, and to correct for background. The latter was measured on each trough by turning off the laser before or after collecting signal with power on. New *HALite* software (Supplementary material Appendix 1) was used for further data reduction including (1) drift correction (using higher-order polynomial functions as suggested by Cheatham *et al.* 1993), (2) averaging of the multiple track signals per unknown and flux-fused RMs, (3) fitting of sensitivity calibration equations, (4) selection of multiple internal standard (IS) elements, and (5) calculation of mass fractions in unknowns.

### Sensitivity calibrations

Linear fitting of the sensitivity correlations established using the 18 external flux-fused RMs is the basis of our calibration method. Use of multiple external RMs allows the relationship between the analyte sensitivity and the element sensitivities of every potential IS element to be determined and used for prediction of the analyte element sensitivity of both RMs and unknowns. Some examples of the many possible sensitivity relationships are shown in Figure 3. Establishing the sensitivity of the IS element allows prediction of the analyte element sensitivity using the linear least-squares fit to the correlation between the two sensitivities. For RMs the IS element sensitivities are calculated from the known mass fractions and the drift-corrected ablation intensities. For unknowns the WDXRF mass fraction data are employed for sensitivity predictions in a first-round model, and the first-round modelled mass fractions are then used in a second-round model (see below). Linear fits to sensitivity correlations in general do not zero well, although the differences are typically small. As expected, the sensitivities of elements with similar chemical behaviour (e.g. LREEs, or HREEs, *etc*.) are well-correlated (e.g. Ho and Dy in Fig. 3). Trimming of outliers in the correlations (Fig. 3) is sometimes needed to prevent RMs with very low analyte mass fractions (and attendant higher uncertainties in calculated analyte sensitivities) from dominating the linear least-squares fitting; the new software *HALite* includes tools for handling those situations (Supplementary material Appendix 1).

Prediction of the analyte element sensitivity is improved by pooling the predictions from multiple IS elements, and calculation of a weighted mean predicted sensitivity. In all models, more weight is given to the sensitivity correlations with the highest coefficients of determination.

*et al.*(1996), with the exception that a weighted mean sensitivity is employed in place of a single IS element sensitivity. The improvement in the determination of analyte mass fraction with an increase in the number of internal standard elements used is shown for Dy (Fig. 4). The figure also illustrates the improvement in calculated analyte mass fractions in models that employ additional IS elements whose concentrations were calculated in the first-round models.

### Multiple IS element modelling and calibration summaries

For each analyte element a summary graph of the predicted flux-fused RM mass fractions v. accepted mass fractions is always examined, and is an essential tool used in the modelling. The summary graphs serve as targets for optimizing the multiple IS element choices; selection of multiple IS elements which maximize the linear fit of the model and actual RM mass fractions is the goal of the modelling. Alternatively, some reduction of the maximum fit can be accepted to include pooling of less optimal IS element data in the model results. That alternative may be desirable simply to include more data in the predictions, and for lithologies such as peridotites and carbonates which are provided with few IS element options from the WDXRF analysis. In addition, the slope of the linear fit indicates any significant bias in need of secondary correction (see below). Some examples of summary graphs are shown in Figure 5, graphs for the analyte elements not displayed can be found in Supplementary material Appendix A. *HALite* (Supplementary material Appendix 1) includes an auto-picking option which selects the set of IS elements which maximizes the linear least-squares fit of the model and actual RM mass fractions.

### Reiteration method for LA-ICP-MS

Following the initial round of model building for all analytes, the mass fraction data are exported and a new input file for *HALite* is constructed. The choice of IS elements in the first round of models is restricted to those well-determined by WDXRF. Once preliminary mass fractions are calculated for all determined analytes with the initial models, a second or ‘reiteration’ (‘RE’) round of models is constructed with no restriction on IS element choice. Typically, the ‘RE’ models out-perform the initial models (Supplementary Tables 3 and 4) and further reduce the standard error of the 18 flux-fused RM calibration summary linear fits (Figs 4, 5). The ‘RE’ model results are then exported to an external spreadsheet for any necessary nonlinear detector or secondary corrections (see below). As an example (Fig. 4), five IS elements with sensitivities calculated from WDXRF data reduce the standard error for Dy in the 18 RMs to *c.* 0.15 $\mu $g g^{–1}; further reduction in uncertainty of *c.* 50% is attainable by including an additional five IS elements in ‘RE’ models. Similar reductions in mass-fraction uncertainties are common for all analytes using weighted mean multiple IS element predicted sensitivities compared with using single IS element sensitivities.

### Secondary corrections: correction of nonlinear detector response

Manufacturers claim that the dynamic range of modern ICP-MS instruments is linear over many orders of magnitude. However, in our experience, the crossover from pulse counting to analogue mode is not sufficiently linear, even after closely following the suggested method for linearizing the two counting modes. This problem is well-known (e.g. Campbell and Burns 2001; Jenner and O'Neill 2012*a*). The analogue mode invariably appears to over-count compared to the pulse counter (Fig. 6). The older ICP-MS at RPI did not suffer from this problem because it lacked an analogue counter (but the highest count rates for Sr would sometimes flood the pulse counter, thus rendering Sr mass fractions >*c.* 700 $\mu $g g^{–1} undeterminable), while the newer instrument at CSM reaches the crossover threshold for the elements Ce, Nb, Sr and Zr, and for all the major and minor elements. The crossover is automatic (and unrecorded in software) and occurs around 1000 000–1500 000 cps. Because the calibration set includes flux-fused RMs on both sides of the crossover in most cases, it is possible to correct for the nonlinearity. Correction by separate linear fitting on both sides of the crossover (Fig. 6) removes the bias introduced by the analogue counter and improves the model fitting v. the accepted values. If samples exceed the crossover count rate for elements calibrated solely with pulse counting, positive bias typically of 3–5% relative will be present in the results. In the absence of a software record of which detector is employed when flux-fused RM or sample intensities fall within the crossover range, correction decisions are made upon the basis of comparison to expected mass fractions for the flux-fused RMs, or the WDXRF values for unknowns. Modelled mass fractions consistently higher than the expected values indicate the analogue detector; those consistently equal to or lower than the expected values indicate the pulse counter.

### Secondary corrections: use of secondary calibration

For unknown reasons, but most likely related either to volatility during ablation or during double fusions, the elements As, Bi, Cr and Sb have significant biases against the given mass fractions even though they are calibrated to flux-fused RM sensitivities using multiple IS elements (Fig. 7). Secondary correction or calibration based upon linear fitting of the model mass fractions to the accepted ones are applied to remove the consistent biases. Slight biases in other elements (e.g. Pb, Sc, V, etc.) can be similarly corrected by linear fitting of the model results to the given RM values. The need for such corrections is evident when using multiple external RMs, and easily made to both calibration flux-fused RMs and unknowns.

## Validation

To validate the method, 17 flux-fused RMs were analysed, and element mass fractions were calculated using the procedure for unknown samples. Validation model results for Cs, Hf, Nb and Ta are shown in Figure 8; the full set of data is found in Supplementary material Appendix B and Supplementary Table 3. Statistics for all determined analyte models are in Supplementary Table 4. A visual means of evaluating the accuracy of the method is to compare all REE mass fractions for validation RMs using logarithmic chondrite-normalized plots v. atomic number (Fig. 9). Both figures (8 and 9) illustrate the ability of the method to yield accurate mass fraction data for diverse geologic samples.

### Comparison with WDXRF data

As a further check on the validity of the unknown sample LA-ICP-MS model mass fractions, comparison to WDXRF mass fractions is readily made for analyte elements well-determined by both techniques (Fig. 10 and Supplementary material Appendix C). The ability of both methods to yield highly correlated mass fractions is a strong validation of both techniques. Figure 10 also illustrates two major improvements in the method, the first the development and use of *HALite*, which facilitated incorporation of multiple IS elements in the calibration models, compared with the initial tedious use of Excel spreadsheets, and the second the switch to CSM with a newer laser and ICP-MS and increased ablation time per sample, resulting in improved precision by roughly a factor of two. While it is common to compare data with linear plots such as Figure 10a and b, such plots obscure the relative percentage differences in the data. A plot of the LA-ICP-MS data divided by the WDXRF data for the same samples illustrates the relative ratios much more clearly (Fig. 10c through f, and Supplementary material Appendix C). On such plots the mass fraction plotted on a log scale compresses the dataset and allows easy inclusion of the uncertainty ‘horn’ (asymmetric when plotted on a log scale of ratios) of the WDXRF data defined by the approximate WDXRF uncertainty in mass fractions of each element (Fig. 10). Most data should fall within the horn at low mass fractions, reflecting the much higher relative uncertainties in the WDXRF data than in the LA-ICP-MS data. The ideal case is present for Nd, as shown in Figure 10f.

### Comparison with results calculated with NIST 610

NIST 610 glass is frequently the default choice (both of our partner LA-ICP-MS labs initially suggested that option) as the sole drift monitor and internal calibration standard in LA-ICP-MS analysis, so the validation experiment included this customary use of NIST 610 in addition to the instrumental drift and data handling procedures using our custom drift monitor and *HALite* software that allows for multiple IS elements. A comparison of results calculated using NIST 610 v. *HALite* is shown in Supplementary Table 4. Typically, we do not determine major and minor element mass fractions using LA-ICP-MS, because they are better determined using WDXRF, but for this comparison we included them in the modelling and in Supplementary Table 4. NIST 610 performed well for selected elements including Al, Ca, P, Ba, Cr, Mo, Nb, Ni, Sc, Sr, V and Zr. Determinations for those elements agreed well with RM mass fractions and had little or no bias. Cerium was the single REE well-determined using NIST 610; the other REEs had significant biases. Overall, the superiority of *HALite* determinations, which have better agreement with RM mass fractions and less bias, and which can be corrected for calibration bias and detector nonlinearity, is clear. NIST 610 determinations cannot be bias-corrected unless matrix-matched RMs are ablated in each run.

## Assessment of LA-ICP-MS measurement uncertainty

Total measurement uncertainty includes the uncertainties in RM values, measurement repeatability (‘precision’), and method validation (‘accuracy’ or ‘bias’). Several methods of calculating total measurement uncertainty have been proposed and are in common usage (e.g. Joint Committee for Guides in Metrology 2008; Ellison and Williams 2012; Magnusson *et al.* 2017). Unfortunately, we have found the recommendations of these sources to be of little use when considering the problem of expressing LA-ICP-MS uncertainty across the wide mass fraction ranges found in geologic materials. For example, Magnusson *et al.* (2017) suggest that mass fraction data be binned, and bin means fit with linear least squares. While an effective method, it is a tedious process when dealing with dozens of analytes with wide-ranging mass fractions.

The goal of our uncertainty analysis is to establish the total LA-ICP-MS measurement uncertainty as a function of mass fraction for each analyte; such functions have been termed ‘characteristic’ by Thompson and Wood (2006). Instead of the binning proposed by Magnusson *et al.* (2017), we chose to use a functional fitting method. The simplest functions suggested are that the uncertainty is constant, or scales linearly with mass fraction (e.g. Zitter and God 1971; Thompson 1988; Magnusson *et al.* 2017). More complex relationships of uncertainty as a function of mass fraction have been suggested, and include quadratic (e.g. Bubert and Klockenkämper 1983), power (Horwitz and Albert 1995; Sawlan 2017) and logarithmic (Erlich *et al.* 1969) functions. Quadratic functions are more likely to reflect the reality of the several forms of ‘noise’ (electronic or ‘white’, shot or quantum, and flicker), which produce uncertainty in any analytical system (Bubert and Klockenkämper 1983; Prudnikov and Shapkina 1984). Zitter and God (1971) demonstrated the utility of plotting the logarithm of the standard deviation of repeat measurements v. the logarithm of mass fraction; repeatability uncertainties from diverse analytical methods they examined displayed gently or sharply curving arrays on such plots. Thompson (1988) also constructed this type of plot with similar results using ICP optical emission spectrometry repeatability data and noted that the use of logarithms removed the need to use the weighting factors he found necessary in fitting linear uncertainty models.

*n*= 184) was restricted to determinations obtained at CSM. Repeated determinations of the calibration flux-fused RMs modelled as unknowns (using the calibrated sensitivity relationships) from run to run were included to collect between run as well as within run uncertainties. The precision estimates we derive are thus at the ‘intermediate precision’ level as defined by Potts (2012). Ordinary least-squares fitting to the data resulted in the classic heteroscedastic pattern of increasing residuals (differences between linear fit and data) with increasing mass fractions (Fig. 11). Weighted linear least squares (determined using Excel SOLVER), with weights equal to the inverse of the square root of the mass fractions, evened out the residuals to near zero across the range of mass fractions (Fig. 11):

*w*equals the weighting factor (in this case one divided by the square root of

_{i}*x*);

_{i}*x*is mass fraction;

_{i}*y*is standard deviation;

_{i}*a*is the intercept, and

*b*is the slope, of the fitted line. When transformed to percent relative standard deviations (%RSD), the weighted least-squares models flatten at high mass fractions, which is more realistic than the power function model prediction of ever-diminishing uncertainty with increasing mass fraction (nearly equivalent in this case to the Zitter and God 1971 model). The power function does appear to be a better fit to the data at low mass fractions, which implies that the weighted least-squares models are very conservative near the detection limits.

The weighted least-squares fitting estimates the mean repeatability of a measurement for a given mass fraction, which is not the way analytical uncertainty is commonly expressed. To calculate 2 sigma (2$\sigma $) uncertainties the mean uncertainties were multiplied by coverage factors so that 95% of the data were less than the expanded uncertainties (Fig. 12 and Supplementary material Appendix D). Supplementary Table 5 includes all the fitted parameters and coverage factors for each analyte. Repeatability uncertainties at 2$\sigma $ for three common RMs are listed in Supplementary Table 5. Repeatability uncertainty for most elements is below 5%, and for some elements below 2%. Cu, Cr and Ni have higher uncertainties due to low mass fractions in these RMs. The metals Mo, Sn and Zn have higher uncertainty. Volatile elements such as As, Ge, Pb, Bi, Sb and Tl have uncertainties that are much larger, in the range of 10–50%. This is likely in part the result of variable loss of some of these elements during fusion of the borate disks. Silver and Cd uncertainties are also large, reflecting the very low mass fractions of these elements in most geological materials, as well as the poorly established RM mass fractions for these elements.

*C*= measured mass fraction,

*u*= standard deviation or difference, and

*a*,

*b*, and

*c*are the parameters of the fit. Conversion of the quadratic (or linear) log–log equations to %RSD terms is necessary to combine the repeatability uncertainty with the RM and validation uncertainties (see below):

Newer RMs, such as those collected and tested using the IAG GeoProficiency Testing protocol, have relatively low uncertainties due to the large number of participating laboratories (thus reducing the standard error of the mean value of the reported mass fractions). The reported values for older RMs typically have higher uncertainties due to fewer participating laboratories and less precise analytical instrumentation when initially analysed. Because of this, some older RMs have been re-assessed, and updated mass fractions and uncertainties derived from post-certificate data and/or different statistical methods have been published (sources cited in Fig. 13). The fitting parameters in Supplementary Table 5 are based upon models constructed using both recent and older RMs. Note that the unexpected increase in uncertainty at higher mass fraction for some elements (e.g. Rb) (Fig. 13), simply follows the available data; more recent data, especially GeoPT data, better fit the Zitter and God (1971) uncertainty model. The observed pattern for La, of near-constant standard deviation at low mass fraction and near-constant %RSD at high mass fraction, is the expected pattern predicted for most analytes (cf. Magnusson *et al.* 2017). Magnusson *et al.* suggest fitting uncertainty v. mass fraction data solely with those two constraints; however, the data are clearly curvilinear, not bilinear, and thus comport better with three-component quadratic uncertainty models (e.g. Bubert and Klockenkämper 1983). Linear fittings in log–log terms (in other words, power functions such as the widely used Horwitz equation) provide poor fits for most elements at both the lowest and highest mass fractions.

Validation introduces an additional uncertainty component, due either to (1) uncertainty in the RM mass fractions, (2) systematic bias in the determined validation RM mass fractions v. their accepted mass fractions, or (3) the uncertainty in validation determinations being greater than the repeatability uncertainty. The first and third conditions may apply in this case, where relative validation differences often exceed the estimated repeatability uncertainties (Fig. 14 and Supplementary material Appendix F). Note that the relative differences of validation and calibration model mass fractions from accepted values overlap for all determined analyte elements (Fig. 14), so the validation uncertainty simply reflects the calibration model uncertainty and is the dominant uncertainty component in our method.

*U*= combined or total uncertainty; %RSD

^{2}

_{Rprec}= repeatability variance; %RSD

^{2}

_{RMu}= variance of RM mass fractions; Valid

_{cf}= validation coverage factor.

The validation coverage factors were not required for all elements; the factors, where necessary, are listed in Supplementary Table 5. Factors were determined empirically so that 95% of the data expressed in terms of the relative differences in validation and calibration mass fractions fell within the 2$\sigma $ combined uncertainty boundaries (Fig. 14 and Supplementary material Appendix F). All the parameters necessary for these calculations are given in Supplementary Table 5. Typical 2$\sigma $ combined uncertainties for commonly used RMs are also tabulated in Supplementary Table 5. Many elements have total uncertainties in the 5–10% range for mass fractions commonly observed in geological materials; elements with poorer precision, as noted earlier, naturally have larger total uncertainties, sometimes more than 100%.

As corroboration of the uncertainty analysis presented here, and as further validation of the method, the results from the past four rounds of the GeoPT, during which two year period the data were acquired at CSM, are also plotted in Figure 14 and Supplementary material Appendix F. The five diverse sample types included silty soils, monzonite, basalt and calcareous sediment. Except for a consistent bias in Sb and difficulties with the determination of Zn in some runs, the GeoPT results support our estimated uncertainty limits. The GeoPT results also demonstrate that our uncertainty estimates are valid from run to run, not just within single runs.

Detection limits (Supplementary Table 5) were calculated from both the mean and 2$\sigma $ precision equations, with the simple assumption that detection limit equals precision at 100%RSD (we note this is not common usage, but ‘detection limit’ has several definitions). Due to the change in sensitivity of each analyte with bulk sample composition, a strict single detection limit cannot be defined for each element because it varies with the optical coupling of each sample. The values shown should be taken as rough, and conservative, estimates of the detection limits for each analyte using this method. Detection limits determined from background counting uncertainties are lower in our experience. A spreadsheet to calculate the measurement uncertainty and total uncertainty for any element in any sample is available upon request.

## Discussion

Our calibration technique is based upon the Longerich *et al.* equation but employs multiple external and internal standards instead of a single external RM and a single IS element. Use of multiple external RMs tests both the assumptions of linearity and of zero intercept inherent in the strictly single-point calibration implied by the Longerich *et al.* equation. Multiple external RMs provide a more robust estimate of the sensitivity correlation between any two analytes, and its variation with matrix composition. Calibration with multiple external standards thus allows for better matrix matching with unknown samples, so long as their sensitivities fall within calibrated ranges. Multiple external standards also allow for correction of detector nonlinearity, and for other systematic biases in the data.

The use of multiple IS elements also improves performance by pooling more data in the estimation of analyte sensitivities in unknown samples. The power of the weighted mean calculation gathers the pooled data to provide more robust estimates of the analyte sensitivities and predicted mass fractions.

Some puzzles remain, especially the need for secondary corrections to a few volatile elements. And there are some limitations to the type of samples which can be successfully analysed with no additional modification. Samples with very low Fe, such as some high-silica rhyolites and some carbonates, couple poorly to the laser, and require extra doping to improve their ablation yields, and to render the yields more consistent from track to track. Carbonate analysis also suffers from the lack of sufficient RMs; there are very few which are characterized for a full suite of REEs, for example.

## Conclusions

We have developed a new method for LA-ICP-MS analysis of geologic samples, which relies upon multiple external RMs, multiple IS elements and a matrix-matched drift monitor. We have written new software for LA-ICP-MS data handling following collection of background corrected intensities. The method is applicable to a wide range of geologic materials, and relies upon samples, RMs, and drift monitor prepared as low-dilution (1:2) Li-tetraborate flux-fused glass. It also requires that the mass fractions of multiple elements in the samples have been previously determined by another analytical method prior to LA-ICP-MS analysis. Our method uses small pieces of larger glass disks previously prepared for, and analysed by, WDXRF. The advantages of this method are: (1) a relatively large initial sample size (grams), thereby reducing or eliminating the possibility of nugget effects, (2) the homogeneity of sample glasses created by fluxed borate fusion, and (3) the similarity of matrices (and thus response to the laser) of samples, standards and drift monitors.

Calibrations are created for each run using 18 diverse geologic RMs, cast as low-dilution fused glasses. Drift during each run is monitored by repeatedly ablating a custom, low-dilution Li-tetraborate glass, which has similar sensitivities for all analyte elements to the median sensitivities of the RMs and unknowns. Drift is corrected by fitting the measured response of the drift monitor for each analyte to a high-order polynomial function. Linear best-fit calibration models establish the relationships between the sensitivity of an analyte to each of the IS element sensitivities of 18 external RMs. As with using the Longerich *et al.* equation, where the sensitivity of an analyte element is related to the sensitivity of a single IS element of known mass fraction, our method instead relates the sensitivity of an analyte for a given sample or RM to the weighted mean of analyte sensitivity predicted by the chosen multiple IS elements. First-round calibration models employ only those IS elements well-determined by WDXRF; reiteration models allow for the choice of any analyte as an IS element. Results can be further corrected for bias or detector nonlinearity using the known 18 RM mass fractions.

Measurement uncertainty (‘precision’) at 2$\sigma $ was estimated for all elements as a function of mass fraction based upon repeatability determinations of diverse samples. Combined (total) uncertainties at 2$\sigma $, also as functions of mass fraction, were estimated using RM uncertainty functions and flux-fused RM validation determinations. Total uncertainties compare favourably with solution ICP-MS, typically less than 5–10%RSD for most elements at mass fractions common in bulk geological materials.

## Acknowledgements

A big thanks to Stuart Hirshfield (Hamilton Computer Sciences Department) for supervising Ben, Jack, and Oliver's senior project, the creation of *HALite*. Suggestions by three anonymous reviewers led to some improvements in the initial manuscript. We thank Scott Wood for his editorial assistance. RC thanks Thomas Meisel and Cornelia Kriete for the excellent uncertainty analysis short course they taught in Buzios, Brazil at GeoAnalysis 2012.

## Author contributions

**RMC**: conceptualization (lead), data curation (equal), formal analysis (equal), methodology (equal), validation (equal), writing – original draft (lead), writing – review & editing (equal); **DGB**: conceptualization (equal), formal analysis (equal), methodology (equal), project administration (lead), writing – original draft (supporting), writing – review & editing (supporting); **JWS**: data curation (supporting), formal analysis (supporting), investigation (supporting), methodology (supporting); **LJW**: formal analysis (supporting), methodology (supporting), writing – review & editing (supporting); **BP**: methodology (supporting), software (equal); **JH**: methodology (supporting), software (equal); **OK**: methodology (supporting), software (supporting); **ZC**: data curation (supporting), formal analysis (supporting), methodology (supporting), writing – original draft (supporting), writing – review & editing (supporting); **SH**: data curation (supporting), formal analysis (supporting), methodology (supporting)

## Funding

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

## Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

## Data availability

The tabulated data shown in the article are the only data available to share. Other data used for illustration belongs to clients of our laboratory and would require explicit permission from each client to share their data.