Abstract
Apatite major and trace element chemistry is a widely used tracer of mineralization because it sensitively records the characteristics of the magmatic-hydrothermal system at the time of crystallization. Previous studies have proposed indicators and binary discrimination diagrams to distinguish apatites from mineralized and unmineralized rocks; however, their effectiveness has proven limited when applied to other systems and to larger data sets. This work applied a machine learning (ML) method to classify the chemical compositions of apatites from both fertile and barren rocks, with the aim of helping to determine the mineralization potential of an unknown system. Approximately 13 328 apatite compositional analyses from 241 locations in 27 countries worldwide were compiled and labeled, and three apatite geochemical data sets were established for training XGBoost ML models. The classification results suggest that the developed models (accuracy: 0.851–0.992; F1 score: 0.839–0.993) are much more accurate and efficient than conventional methods (accuracy: 0.242–0.553). Feature importance analysis of the models demonstrates that Cl, F, S, V, Sr/Y, V/Y, Eu*, (La/Yb)N, and La/Sm are important variables in apatite for discriminating between fertile and barren host rocks, and indicates that the V/Y and Cl/F ratios and the S content, in particular, are crucial parameters for discriminating metal enrichment and mineralization potential. This study suggests that ML is a robust tool for processing high-dimensional geochemical data and presents a novel approach that can be applied to mineral exploration.
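As an illustration of the quantities mentioned above, the following minimal Python sketch derives four of the element ratios (Sr/Y, V/Y, Cl/F, La/Sm) from a single apatite analysis and computes accuracy and F1 score for a binary fertile/barren classification. It is not the pipeline used in this study: the function names, the dictionary keys, and the example values are hypothetical, the label encoding (1 = fertile, 0 = barren) is an assumption, and the XGBoost model itself, Eu*, and the chondrite-normalized (La/Yb)N are deliberately left out.

```python
from typing import Dict, Sequence


def ratio_features(analysis: Dict[str, float]) -> Dict[str, float]:
    """Derive element ratios from one apatite analysis.

    The keys ("Sr", "Y", ...) are hypothetical column names; units are
    assumed to be consistent within each ratio.
    """
    pairs = [("Sr", "Y"), ("V", "Y"), ("Cl", "F"), ("La", "Sm")]
    features: Dict[str, float] = {}
    for num, den in pairs:
        denominator = analysis[den]
        # A zero denominator yields NaN instead of raising an error.
        features[f"{num}/{den}"] = (
            analysis[num] / denominator if denominator else float("nan")
        )
    return features


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Binary accuracy and F1 score, assuming 1 = fertile and 0 = barren."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    # F1 is the harmonic mean of precision and recall: 2TP / (2TP + FP + FN).
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return {"accuracy": accuracy, "f1": f1}


if __name__ == "__main__":
    # Made-up values for illustration only; not data from this study.
    example = {"Sr": 400.0, "Y": 200.0, "V": 10.0,
               "Cl": 0.5, "F": 2.0, "La": 300.0, "Sm": 100.0}
    print(ratio_features(example))
    print(accuracy_and_f1([1, 1, 0, 0], [1, 0, 0, 1]))
```

In a workflow of the kind described here, ratios such as these would be appended to the measured element concentrations to form the feature table passed to the classifier, and the two metrics would be evaluated on held-out analyses.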