The assessment of earthquake forecast models for practical purposes requires more than simply checking model consistency in a statistical framework. One also needs to understand how to construct the best model for specific forecasting applications. We describe a Bayesian approach to evaluating earthquake forecasting models, and we consider related procedures for constructing ensemble forecasts. We show how evaluations based on Bayes factors, which measure the relative skill among forecasts, can be complementary to common goodness‐of‐fit tests used to measure the absolute consistency of forecasts with data. To construct ensemble forecasts, we consider averages across a forecast set, weighted by either posterior probabilities or inverse log‐likelihoods derived during prospective earthquake forecasting experiments. We account for model correlations by conditioning weights using the Garthwaite–Mubwandarikwa capped eigenvalue scheme. We apply these methods to the Regional Earthquake Likelihood Models (RELM) five‐year earthquake forecast experiment in California, and we discuss how this approach can be generalized to other ensemble forecasting applications. Specific applications of seismological importance include experiments being conducted within the Collaboratory for the Study of Earthquake Predictability (CSEP) and ensemble methods for operational earthquake forecasting.
Online Material: Tables of likelihoods for each testing phase and code to analyze the RELM experiment.
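As an illustration of the Bayes-factor comparison and posterior-probability weighting described in the abstract, the following is a minimal sketch of Bayesian model averaging over a forecast set. It assumes only that each model's joint log-likelihood over the observed earthquakes is available; the function names and the uniform prior over models are illustrative, and the Garthwaite–Mubwandarikwa capped eigenvalue conditioning for correlated models is omitted for brevity.

```python
import numpy as np

def bayes_factor(log_like_i, log_like_j):
    # Bayes factor B_ij = P(D | M_i) / P(D | M_j), a measure of the
    # relative skill of model i versus model j given the observed data D.
    return np.exp(log_like_i - log_like_j)

def posterior_weights(log_likelihoods, priors=None):
    # Posterior probability of each model, P(M_i | D) ∝ P(D | M_i) P(M_i),
    # computed from joint log-likelihoods. A uniform prior is assumed
    # when none is supplied (an illustrative choice, not the paper's).
    ll = np.asarray(log_likelihoods, dtype=float)
    if priors is None:
        priors = np.full(ll.size, 1.0 / ll.size)
    # Subtract the maximum log-likelihood for numerical stability
    # before exponentiating.
    w = np.exp(ll - ll.max()) * np.asarray(priors, dtype=float)
    return w / w.sum()

def ensemble_forecast(rates, weights):
    # Weighted average of per-model forecast rates: `rates` is an
    # (n_models x n_cells) array of expected event counts per space-
    # magnitude bin; the result is the ensemble rate in each bin.
    return np.asarray(weights) @ np.asarray(rates, dtype=float)
```

For example, three models with joint log-likelihoods of -10, -12, and -11 would receive posterior weights that favor the first model, and the ensemble rate in each bin is the corresponding weighted average of the individual forecasts.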