The optimal transport distance has recently been promoted as a tool to measure the discrepancy between observed and calculated seismic data within the full-waveform-inversion strategy. This high-resolution seismic imaging method, based on a data-fitting procedure, suffers from the nonconvexity of the standard least-squares discrepancy measure, an issue commonly referred to as cycle skipping. The convexity of the optimal transport distance with respect to time shifts makes it a good candidate to provide a more convex misfit function. However, the optimal transport distance is defined only for the comparison of positive functions, whereas seismic data are oscillatory. We review the different attempts made in the literature to overcome this difficulty and illustrate their limitations: the existing strategies are either not applicable to real data or lose the convexity property of optimal transport. On this basis, we introduce a novel strategy based on the interpretation of the seismic data in the graph space. Each individual trace is considered, after discretization, as a set of Dirac points in a 2D space, where the amplitude becomes a geometric attribute of the data. This ensures the positivity of the data while preserving the geometry of the signal. The differentiability of the misfit function is obtained by approximating the Dirac distributions with 2D Gaussian functions. The benefit of this approach is illustrated numerically by computing misfit-function maps in schematic examples before moving to more realistic synthetic full-waveform-inversion exercises, including the Marmousi model. The improved convexity of the graph-based optimal transport distance is demonstrated. On the Marmousi model, starting from a 1D linearly increasing initial model, with data containing no low frequencies (no energy below 3 Hz), a meaningful estimate of the P-wave velocity model is recovered, outperforming previously proposed optimal-transport-based misfit functions.
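The graph-space idea can be illustrated with a minimal Python sketch, given here under several assumptions that are not taken from the text. Each discretized trace is mapped to the point cloud (t_i / tau, d_i / amp), and the optimal transport cost between two clouds of equal size and uniform mass is computed as a linear assignment problem with a squared Euclidean cost. The Gaussian smoothing of the Dirac points is not included. The Ricker wavelet, the 5 Hz peak frequency, the sampling, and the scaling parameters `tau` and `amp` are all hypothetical choices made for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def ricker(t, f_peak):
    """Ricker wavelet with unit peak amplitude, centred at t = 0."""
    a = (np.pi * f_peak * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)


def l2_misfit(d_cal, d_obs):
    """Standard least-squares misfit between two traces."""
    return 0.5 * np.sum((d_cal - d_obs) ** 2)


def graph_space_ot_misfit(d_cal, d_obs, t, tau, amp):
    """Optimal transport cost between the graphs of two traces.

    Each trace becomes the point cloud (t_i / tau, d_i / amp). With equal
    numbers of points and uniform masses, optimal transport reduces to a
    linear assignment problem, solved exactly here.
    """
    time_cost = ((t[:, None] - t[None, :]) / tau) ** 2
    amp_cost = ((d_cal[:, None] - d_obs[None, :]) / amp) ** 2
    cost = time_cost + amp_cost
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].sum()


# Hypothetical test case: one wavelet and time-shifted copies of it.
dt, n, f_peak = 0.01, 201, 5.0
t = dt * np.arange(n)
t0 = t[n // 2]
d_obs = ricker(t - t0, f_peak)

amp = np.abs(d_obs).max()   # amplitude scaling (assumption)
tau = t[-1] - t[0]          # time scaling (assumption): trace duration

shifts = dt * np.arange(-40, 41)   # time shifts from -0.4 s to 0.4 s
l2 = np.array([l2_misfit(ricker(t - t0 - s, f_peak), d_obs) for s in shifts])
gsot = np.array([
    graph_space_ot_misfit(ricker(t - t0 - s, f_peak), d_obs, t, tau, amp)
    for s in shifts
])
```

Plotting `l2` and `gsot` against `shifts` gives a misfit profile of the kind discussed above. In this sketch the least-squares profile is not monotonic in the shift, which is the cycle-skipping signature. The graph-space value is bounded above by the amplitude-only cost, so its behaviour at large shifts depends on the choice of `tau` relative to `amp`.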