Time-frequency (TF) transforms are commonly used to analyze local features of nonstationary seismic data and to help uncover structural or geologic information. Traditional TF transforms, such as the short-time Fourier transform, the continuous wavelet transform, and the S-transform, are constrained by the Heisenberg uncertainty principle, which limits their TF resolution. The sparse TF (STF) transform has been proposed to address this limitation; however, its high computational cost and the difficulty of parameter selection hinder its use. We have developed a self-supervised TF representation model based on generative adversarial networks (STFR-GAN) that maps a 1D seismic signal into a 2D STF image. The model includes three components: a generator, a discriminator, and a reconstruction module. The generator produces the STF spectrum of the input seismic trace, whereas the discriminator judges whether the generated STF spectrum is optimal. The reconstruction module serves as a physical constraint to ensure the accuracy of the generated STF spectrum. During training, the discriminator learns to identify the ideal STF spectrum and guides the generator to produce a TF spectrum closer to the ideal one. After training, we applied the model to synthetic and field data to demonstrate its effectiveness and stability in characterizing the TF features of seismic data. Our results show that STFR-GAN provides TF representations with higher readability than those of traditional TF methods, which in turn benefits fluvial channel delineation.
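The role of a reconstruction constraint can be illustrated with a small sketch. This is not the STFR-GAN reconstruction module, whose details are not given here: as an assumption, a Hann-windowed short-time Fourier transform stands in for the TF operator, and the names `forward_tf`, `reconstruct`, and `reconstruction_loss` are hypothetical. The idea shown is only that a TF spectrum consistent with the input trace reproduces that trace when inverted, whereas an inconsistent spectrum incurs a penalty.

```python
import cmath
import math


def hann(n_win):
    # Periodic Hann window: copies shifted by n_win/2 sum to exactly 1.
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_win) for n in range(n_win)]


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def forward_tf(trace, n_win=16):
    """Stand-in TF operator (assumption): Hann-windowed DFT frames, 50% overlap.

    n_win must be even. Returns a list of frames, each a list of complex bins.
    """
    hop = n_win // 2
    n_frames = -(-len(trace) // hop) + 1  # ceil(len / hop) + 1
    # Zero-pad so every original sample is covered by exactly two frames.
    padded = [0.0] * hop + list(trace) + [0.0] * (n_frames * hop - len(trace))
    w = hann(n_win)
    return [dft([w[n] * padded[i * hop + n] for n in range(n_win)])
            for i in range(n_frames)]


def reconstruct(spectrum, n_out):
    """Overlap-add inverse of forward_tf; returns the first n_out samples."""
    n_win = len(spectrum[0])
    hop = n_win // 2
    out = [0.0] * ((len(spectrum) + 1) * hop)
    for i, frame in enumerate(spectrum):
        for n, value in enumerate(idft(frame)):
            out[i * hop + n] += value
    return out[hop:hop + n_out]


def reconstruction_loss(trace, spectrum):
    """Mean squared error between the trace and its reconstruction."""
    rec = reconstruct(spectrum, len(trace))
    return sum((a - b) ** 2 for a, b in zip(trace, rec)) / len(trace)


if __name__ == "__main__":
    # Synthetic chirp-like trace (illustrative only, not seismic data).
    trace = [math.sin(2 * math.pi * (0.02 + 0.004 * t) * t) for t in range(48)]
    spec = forward_tf(trace)
    # Consistent spectrum: loss is at floating-point round-off level.
    print(reconstruction_loss(trace, spec))
    # Inconsistent spectrum (amplitudes halved): loss is clearly nonzero.
    damped = [[0.5 * c for c in frame] for frame in spec]
    print(reconstruction_loss(trace, damped))
```

In a training setting, a term of this kind would typically be added to the generator's objective alongside the adversarial term; the weighting between the two is a design choice that this sketch does not address.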