The selection and weighting of ground‐motion models (GMMs) introduces a significant source of uncertainty in U.S. Geological Survey (USGS) National Seismic Hazard Modeling Project (NSHMP) forecasts. In this study, we evaluate 18 candidate GMMs using instrumental ground‐motion observations of horizontal peak ground acceleration (PGA) and 5%‐damped pseudospectral acceleration (0.02–10 s) for tectonic earthquakes and volcanic eruptions, to inform logic‐tree weights for the update of the USGS seismic hazard model for Hawaii. GMMs are evaluated using two methods. The first is a total residual visualization approach that compares the probability density function (PDF), mean, and standard deviation of the observed and predicted ground motion. The second is the commonly used total residual probabilistic scoring method (log likelihood [LLH]). The LLH method provides a single score that can be used to weight GMMs in the Hawaii seismic hazard model logic trees. The total residual PDF approach provides additional information that a single‐value LLH score cannot: it preserves GMM over‐ and underprediction across a broad range of periods. We apply these GMM evaluation methods to two different data sets: (1) a database of instrumental ground motions from historical earthquakes in Hawaii from 1973 to 2007 (M 4–7.3) and (2) available ground motions from recent earthquakes (M 4–6.9) associated with the 2018 Kilauea eruptions. The 2018 Kilauea sequence contains both volcanic eruptions and tectonic earthquakes, allowing for statistically significant GMM comparisons of the two event classes. The Kilauea ground‐motion observations provide an independent data set, allowing us to evaluate the predictive power of GMMs implemented in the new USGS nshmp‐haz software system. We evaluate GMM performance as a function of earthquake depth, and we demonstrate that short‐period volcanic eruption ground motions are not well predicted by any of the candidate GMMs.
Nine of the initial 18 candidate GMMs fit the observed ground motions and meet established criteria for inclusion in the update of the Hawaii seismic hazard model. For PGA, a weighted mean of four top‐performing GMMs in this study (NGAsubslab, NGAsubinter, ASK14, A10) is 50% lower than that of the GMMs used in the previous USGS seismic hazard model for Hawaii.
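
As an illustration of how a single LLH score can be turned into a logic‐tree weight, the following is a minimal sketch, not the implementation used in this study. It assumes the widely used form of the score, in which LLH is the negative mean base‐2 log of the model's predicted probability density at each observation, and it assumes each GMM supplies a normal distribution (mean and standard deviation) in natural‐log ground‐motion units. The function names `llh_score` and `llh_weights` and the weighting rule (weight proportional to 2 raised to the negative LLH) are illustrative assumptions.

```python
import math


def llh_score(observed_ln, predicted_mean_ln, predicted_sigma_ln):
    """Negative mean log2-likelihood of the observations under one GMM.

    Assumption: the GMM predicts a normal distribution in natural-log
    ground-motion units for every record. Lower scores mean a better fit.
    """
    total_log2 = 0.0
    for obs, mu, sigma in zip(observed_ln, predicted_mean_ln, predicted_sigma_ln):
        z = (obs - mu) / sigma  # normalized total residual
        # Natural log of the normal density at the observation.
        ln_pdf = -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))
        total_log2 += ln_pdf / math.log(2.0)  # convert to base 2
    return -total_log2 / len(observed_ln)


def llh_weights(scores):
    """Turn LLH scores into normalized weights (assumed rule: w ~ 2**(-LLH))."""
    raw = [2.0 ** (-s) for s in scores]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical example: two models scored against the same three records.
obs = [math.log(0.10), math.log(0.05), math.log(0.20)]
model_a = llh_score(obs, [math.log(0.11), math.log(0.05), math.log(0.18)], [0.6] * 3)
model_b = llh_score(obs, [math.log(0.30), math.log(0.15), math.log(0.50)], [0.6] * 3)
weights = llh_weights([model_a, model_b])
```

In this hypothetical example the first model predicts the records more closely, so it receives the lower LLH score and the larger weight. Because the score collapses all periods and records into one number, it cannot show where a model over‐ or underpredicts, which is the information the residual PDF comparison retains.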