Full-waveform inversion has evolved into a powerful computational tool in seismic imaging. New misfit functions for matching simulated and measured data have recently been introduced to avoid the lack of convergence traditionally caused by cycle skipping. We have introduced the Wasserstein distance from optimal transport for computing the misfit, and several groups are currently developing this technique further. We evaluate three essential observations about this new metric, with implications for future development. The first is the discovery that trace-by-trace comparison with the quadratic Wasserstein metric works remarkably well together with the adjoint-state method. The second is the close connection between optimal-transport-based misfits and integrated techniques with normalization, such as the normalized integration method. Finally, we study the convexity with respect to selected model parameters for different normalizations and remark on the effect of normalization on the convergence of the adjoint-state method.
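As a rough illustration of the trace-by-trace comparison and of its relation to integration-based misfits, the following minimal sketch compares one synthetic trace with time-shifted copies of itself. It relies on the standard one-dimensional identity that the squared quadratic Wasserstein distance between two unit-mass densities is the integral of the squared difference of their inverse cumulative distribution functions. Everything specific here is an assumption made for illustration and is not taken from the paper: the Ricker test wavelet, the normalization (squaring plus a small offset `eps`), the grid, and the helper names `ricker`, `to_density`, `cdf`, `w2_squared`, `nim_misfit`, and `l2_misfit`. The integration-type misfit is written as the squared difference of the cumulative distributions, which is one common way such a misfit is expressed.

```python
import numpy as np

def ricker(t, t0, freq=5.0):
    """Ricker wavelet centered at t0 (a synthetic stand-in for a seismic trace)."""
    a = (np.pi * freq * (t - t0)) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def to_density(trace, dt, eps=1e-3):
    """ASSUMED normalization: square the trace, add a small offset, scale to unit mass."""
    p = trace ** 2 + eps          # strictly positive, so the CDF is strictly increasing
    return p / (np.sum(p) * dt)

def cdf(trace, dt):
    """Cumulative distribution of the normalized trace on the time grid."""
    return np.cumsum(to_density(trace, dt)) * dt

def w2_squared(f, g, t, n_quantiles=4000):
    """Squared 1D quadratic Wasserstein distance via inverse CDFs (quantile functions)."""
    dt = t[1] - t[0]
    F, G = cdf(f, dt), cdf(g, dt)
    s = (np.arange(n_quantiles) + 0.5) / n_quantiles   # midpoints in (0, 1)
    f_inv = np.interp(s, F, t)    # F^{-1}(s) by linear interpolation
    g_inv = np.interp(s, G, t)    # G^{-1}(s)
    return float(np.mean((f_inv - g_inv) ** 2))

def nim_misfit(f, g, t):
    """Integration-type misfit: squared difference of the two CDFs."""
    dt = t[1] - t[0]
    return float(np.sum((cdf(f, dt) - cdf(g, dt)) ** 2) * dt)

def l2_misfit(f, g, t):
    """Conventional least-squares misfit on the raw traces, for contrast."""
    dt = t[1] - t[0]
    return float(np.sum((f - g) ** 2) * dt)

t = np.linspace(0.0, 2.0, 401)
observed = ricker(t, 1.0)
for shift in (0.0, 0.1, 0.3):
    simulated = ricker(t, 1.0 + shift)
    print(shift,
          l2_misfit(observed, simulated, t),
          w2_squared(observed, simulated, t),
          nim_misfit(observed, simulated, t))
```

In this synthetic setup the least-squares misfit is larger at a 0.1 s shift than at a 0.3 s shift, which is the non-monotone behavior associated with cycle skipping, while the two cumulative-distribution-based misfits grow with the shift. The outcome depends on the assumed normalization, so a different choice may behave differently.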