Least-squares reverse time migration is a powerful approach for true-amplitude seismic imaging of complex geologic structures, but the successful application of this method is currently hindered by its enormous computational cost, as well as its high memory requirements for computing the gradient of the objective function. We have tackled these problems by introducing an algorithm for low-cost sparsity-promoting least-squares migration using on-the-fly Fourier transforms. We formulate the least-squares migration objective function in the frequency domain (FD) and compute gradients for randomized subsets of shot records and frequencies, thus significantly reducing data movement and the number of overall wave equations solves. By using on-the-fly Fourier transforms, we can compute an arbitrary number of monochromatic FD wavefields with a time-domain (TD) modeling code, instead of having to solve individual Helmholtz equations for each frequency, which becomes computationally infeasible when moving to high frequencies. Our numerical examples demonstrate that compressive imaging with on-the-fly Fourier transforms provides a fast and memory-efficient alternative to TD imaging with optimal checkpointing, whose memory requirements for a fixed background model and source wavelet are independent of the number of time steps. Instead, the memory and additional computational costs grow with the number of frequencies and determine the amount of subsampling artifacts and crosstalk. In contrast to optimal checkpointing, this offers the possibility to trade the memory and computational costs for image quality or a larger number of iterations and is advantageous in new computing environments such as the cloud, where computing is often cheaper than memory and data movement.