We assessed the applicability of several ground‐motion models (GMMs) to Iran using local data. The candidate GMMs were selected from those developed for shallow crustal regions such as Iran, Turkey, Japan, Europe and the Middle East, and the western United States. To assess the predictive capability of the models, we made the evaluation database prospective to all candidate GMMs, that is, independent of the data used to develop them. The evaluation database comprises 643 records from 240 earthquakes with magnitudes ranging from 3.9 to 7.3 and Joyner–Boore distances up to 300 km. To evaluate the candidate models, we implemented the log‐likelihood method of Scherbaum et al. (2009), the Euclidean distance‐based ranking proposed by Kale and Akkar (2013), and the multivariate logarithmic score of Mak et al. (2017). In ranking the GMMs, we accounted for score variability: we generated resampled datasets from the whole database using the cluster bootstrap technique and ranked the models by their relative performance across the bootstrap samples. Overall, the local models of Sedaghati and Pezeshk (2017), Zafarani et al. (2018), and Farajpour et al. (2019) outperform the remaining models when the whole database is considered over the entire frequency range. For high‐seismicity regions, the Zhao et al. (2006) model can be used alongside the first two local models to better quantify the epistemic uncertainty associated with model selection. In addition to the aforementioned local models, the Bindi et al. (2014) model shows acceptable performance against small‐to‐moderate magnitude data and may be considered for estimating seismic hazard in low‐seismicity regions of Iran.
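As an illustration of the scoring and resampling steps mentioned above, the following is a minimal sketch, not the authors' implementation. It computes the average negative log2-likelihood (LLH) in the sense of Scherbaum et al. (2009), assuming each GMM supplies a normal predictive density in natural-log ground-motion units, and it resamples whole earthquakes with replacement (a cluster bootstrap) so that records from the same event stay together. All function names, the two "models", and the synthetic data are hypothetical.

```python
import numpy as np


def llh_score(log_obs, log_pred, sigma):
    """Average negative log2-likelihood of observations under a normal
    predictive density with mean log_pred and standard deviation sigma.
    Lower values indicate a better match between model and data."""
    z = (log_obs - log_pred) / sigma
    log_pdf = -0.5 * z**2 - np.log(sigma * np.sqrt(2.0 * np.pi))
    return -np.mean(log_pdf) / np.log(2.0)  # convert natural log to log2


def cluster_bootstrap_indices(event_ids, rng):
    """Draw earthquakes with replacement and return the indices of all
    records belonging to each drawn earthquake."""
    unique_events = np.unique(event_ids)
    drawn = rng.choice(unique_events, size=unique_events.size, replace=True)
    return np.concatenate([np.flatnonzero(event_ids == ev) for ev in drawn])


def bootstrap_llh(log_obs, log_pred, sigma, event_ids, n_boot=200, seed=0):
    """LLH score of one model on n_boot cluster-bootstrap resamples."""
    rng = np.random.default_rng(seed)
    scores = np.empty(n_boot)
    for b in range(n_boot):
        idx = cluster_bootstrap_indices(event_ids, rng)
        scores[b] = llh_score(log_obs[idx], log_pred[idx], sigma[idx])
    return scores


# Synthetic example (assumed values, not the paper's database).
rng = np.random.default_rng(1)
event_ids = np.repeat(np.arange(30), rng.integers(1, 6, size=30))
n = event_ids.size
log_obs = rng.normal(0.0, 0.7, size=n)   # synthetic log ground motions
sigma = np.full(n, 0.7)                  # assumed total standard deviation
pred_a = np.zeros(n)                     # hypothetical unbiased model
pred_b = np.full(n, 0.5)                 # hypothetical biased model

# The same seed gives both models identical resamples (paired comparison).
scores_a = bootstrap_llh(log_obs, pred_a, sigma, event_ids)
scores_b = bootstrap_llh(log_obs, pred_b, sigma, event_ids)

# Fraction of bootstrap samples in which model A scores better than model B.
win_fraction = np.mean(scores_a < scores_b)
```

Ranking by the fraction of resamples in which one model beats another is one simple way to express relative performance across bootstrap samples; other summaries of the score distributions would serve equally well.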