The use of ground-motion-prediction equations to estimate ground shaking has become a very popular approach in seismic-hazard assessment, especially within a logic-tree framework. Owing to the large number of published ground-motion models, however, selecting and ranking appropriate models for a particular target area often poses serious practical problems. Here we show how observed ground-motion records can help to guide this process in a systematic and comprehensible way. A key element in this context is a new, likelihood-based goodness-of-fit measure that not only quantifies the model fit but also measures, to some degree, how well the underlying statistical model assumptions are met. By design, this measure scales between 0 and 1, with a value of 0.5 when the model perfectly matches the sample distribution in both mean and standard deviation. We have used it in combination with other goodness-of-fit measures to derive a simple classification scheme that quantifies how well a candidate ground-motion-prediction equation models a particular set of observed response spectra. This scheme is demonstrated to perform well in recognizing a number of popular ground-motion models from their rock-site-recording subsets, which indicates its potential for aiding the assignment of logic-tree weights in a consistent and reproducible way. We have applied our scheme to the border region of France, Germany, and Switzerland, where the Mw 4.8 St. Dié earthquake of 22 February 2003 in eastern France recently provided a small set of observed response spectra. These records are best modeled by the ground-motion-prediction equation of Berge-Thierry et al. (2003), which is based on the analysis of predominantly European data. The fact that the Swiss model of Bay et al. (2003) is not able to model the observed records in an acceptable way may indicate general problems arising from the use of weak-motion data for strong-motion prediction.
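The abstract does not state the formula for the likelihood-based measure, so the following is only a minimal sketch of one construction that has the stated properties. It assumes that residuals are normalized as (observed minus predicted) divided by the model's standard deviation, all in natural-log units, and that each residual is scored by its two-sided exceedance probability under a standard normal distribution. The median of these scores lies between 0 and 1 and equals 0.5 when the residuals follow the assumed distribution exactly. The function names are hypothetical and are not taken from the paper.

```python
import math
from statistics import NormalDist, median


def normalized_residuals(observed_ln, predicted_ln, sigma_ln):
    """Return (observed - predicted) / sigma for each record (assumed log units)."""
    return [(o - p) / s for o, p, s in zip(observed_ln, predicted_ln, sigma_ln)]


def lh_value(z):
    """Two-sided probability that a standard normal variate exceeds |z|.

    Assumed form of the score: equals 1 for z = 0 and tends to 0 for large |z|.
    """
    return math.erfc(abs(z) / math.sqrt(2.0))


def median_lh(residuals):
    """Median score over a set of normalized residuals."""
    return median(lh_value(z) for z in residuals)


# Idealized residuals: evenly spaced quantiles of the standard normal,
# standing in for a sample that the model matches in mean and sigma.
ideal = [NormalDist().inv_cdf(i / 1000) for i in range(1, 1000)]

print("matching model      :", median_lh(ideal))
# A biased mean or an underestimated sigma pushes the median below 0.5;
# an overestimated sigma pushes it above 0.5.
print("biased mean (+1)    :", median_lh([z + 1.0 for z in ideal]))
print("sigma underestimated:", median_lh([2.0 * z for z in ideal]))
print("sigma overestimated :", median_lh([0.5 * z for z in ideal]))
```

Under these assumptions, a median well below 0.5 points to a biased mean or a model standard deviation that is too small, while a median well above 0.5 points to a model standard deviation that is too large. This is the sense in which such a score reflects the statistical assumptions as well as the fit.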