Earthquake forecasting models represent our current understanding of the physics and statistics that govern earthquake occurrence processes. Providing such forecasts as falsifiable statements can help us assess a model’s hypothesis to be, at the least, a plausible conjecture to explain the observations. Prospective testing (i.e., with future data, once the model and experiment have been fully specified) is fundamental in science because it enables confronting a model with completely out‐of‐sample data and zero degrees of freedom. Testing can also help inform decisions regarding the selection of models, data types, or procedures in practical applications, such as Probabilistic Seismic Hazard Analysis. In 2010, a 10‐year earthquake forecasting experiment began in Italy, where researchers collectively agreed on authoritative data sources, testing rules, and formats to independently evaluate a collection of forecasting models. Here, we test these models with ten years of fully prospective data using a multiscore approach to (1) identify the model features that correlate with data‐consistent or ‐inconsistent forecasts; (2) evaluate the stability of the experiment results over time; and (3) quantify the models’ limitations to generate spatial forecasts consistent with earthquake clustering. As each testing metric analyzes only limited properties of a forecast, the proposed synoptic analysis using multiple scores allows drawing more robust conclusions. Our results show that the best‐performing models use catalogs that span over 100 yr and incorporate fault information, demonstrating and quantifying the value of these data types. Model rankings are stable over time, suggesting that a 10‐year period in Italy can provide sufficient data to discriminate between optimal and suboptimal forecasts. Finally, no model can adequately describe spatial clustering, but those including fault information are less inconsistent with the observations. Prospective testing assesses relevant assumptions and hypotheses of earthquake processes truly out‐of‐sample, thus guiding model development and decision‐making to improve society’s earthquake resilience.

You do not have access to this content, please speak to your institutional administrator if you feel you should have access.