The frequency–Bessel transform (F–J) method, which can reliably provide multimodal surface‐wave dispersion spectrograms from recorded ambient noise, has been applied in many studies of the Earth’s velocity structure. However, extracting dispersion curves and determining their roots can be challenging. To circumvent these challenges, we present a new, objective spectrum inversion scheme for multimodal dispersion spectrograms. In this method, a quasi‐Newton algorithm directly minimizes the image dissimilarity between the observed dispersion spectrogram and the synthetic kernel spectrum of Green’s function to invert for the subsurface velocity structure. During the inversion, the kernel spectrum and its derivatives are calculated efficiently by the generalized reflection and transmission coefficient method, which allows rapid structure inversion for multimodal dispersion spectrograms. Finally, synthetic tests validate the reliability of the new method, and a field application demonstrates its practicality.
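
The inversion loop described above can be illustrated with a minimal sketch, under several loudly stated assumptions. The forward model here is a toy Gaussian ridge along a hypothetical dispersion curve c(f) = m0 + m1/f, not the kernel spectrum computed by the generalized reflection and transmission coefficient method. The image dissimilarity is assumed to be a plain least-squares difference between the two images. The gradient is approximated by central finite differences rather than the analytic derivatives used in the method, and the quasi-Newton step is a textbook BFGS update with backtracking. All names (`toy_spectrum`, `misfit`, `bfgs_invert`) and all numerical values are hypothetical.

```python
import math

# Hypothetical frequency and phase-velocity axes of the spectrogram image.
FREQS = [0.5 * k for k in range(2, 18)]   # 1.0 ... 8.5
VELS = [1.0 + 0.1 * k for k in range(41)]  # 1.0 ... 5.0
SIGMA = 0.5                                # assumed ridge width


def toy_spectrum(m):
    """Toy stand-in for the kernel spectrum: a Gaussian ridge along c(f) = m0 + m1/f."""
    image = []
    for f in FREQS:
        ridge = m[0] + m[1] / f
        image.append([math.exp(-0.5 * ((c - ridge) / SIGMA) ** 2) for c in VELS])
    return image


def misfit(m, observed):
    """Assumed image dissimilarity: half the sum of squared pixel differences."""
    synthetic = toy_spectrum(m)
    return 0.5 * sum((s - o) ** 2
                     for row_s, row_o in zip(synthetic, observed)
                     for s, o in zip(row_s, row_o))


def gradient(m, observed, h=1e-6):
    """Central finite-difference gradient (replaces analytic derivatives)."""
    grad = []
    for i in range(len(m)):
        up, dn = list(m), list(m)
        up[i] += h
        dn[i] -= h
        grad.append((misfit(up, observed) - misfit(dn, observed)) / (2.0 * h))
    return grad


def bfgs_invert(m0, observed, max_iter=100, tol=1e-8):
    """Textbook BFGS with Armijo backtracking on the image misfit."""
    n = len(m0)
    m = list(m0)
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # inverse Hessian estimate
    g = gradient(m, observed)
    for _ in range(max_iter):
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break
        d = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]  # search direction
        f0 = misfit(m, observed)
        slope = sum(gi * di for gi, di in zip(g, d))
        step = 1.0
        while step > 1e-12:
            trial = [mi + step * di for mi, di in zip(m, d)]
            if misfit(trial, observed) <= f0 + 1e-4 * step * slope:
                break
            step *= 0.5
        m_new = [mi + step * di for mi, di in zip(m, d)]
        g_new = gradient(m_new, observed)
        s = [a - b for a, b in zip(m_new, m)]
        y = [a - b for a, b in zip(g_new, g)]
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 1e-12:  # update only when the curvature condition holds
            rho = 1.0 / sy
            Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
            yHy = sum(yi * hyi for yi, hyi in zip(y, Hy))
            H = [[H[i][j] - rho * (s[i] * Hy[j] + Hy[i] * s[j])
                  + (rho * rho * yHy + rho) * s[i] * s[j]
                  for j in range(n)] for i in range(n)]
        m, g = m_new, g_new
    return m


# Synthetic test: build an "observed" image from a known model, then recover it.
true_model = [2.0, 1.5]
observed = toy_spectrum(true_model)
recovered = bfgs_invert([2.3, 1.0], observed)
print("recovered model:", recovered)
```

Because the misfit is computed on the whole image, no dispersion curve has to be picked and no mode has to be identified, which is the point of a spectrum inversion. In a realistic setting the toy forward model would be replaced by the layered-medium kernel spectrum, and the model vector would hold layer velocities rather than two curve coefficients.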