Weak signal preservation is critical in seismic data denoising, especially in deep seismic exploration. Weak signals are usually important for seismic data analysis, but they are hard to separate from random noise because they are less compressible or sparsifiable. Conventional sparse coding models exploit local sparsity by learning a union of bases, but they do not take into account any prior information about the internal correlation of patches. Motivated by the observation that data patches within a group are expected to share the same sparsity pattern in the transform domain, so-called group sparsity, we have developed a novel transform learning with group sparsity (TLGS) method that jointly exploits local sparsity and internal patch self-similarity. Furthermore, for weak signal preservation, we have extended the TLGS method and developed transform learning with external reference. External clean or denoised patches serve as anchored references, which are grouped together with similar corrupted patches. The patches in each group are jointly modeled under an adaptively learned sparse transform, which is achieved by jointly learning a subset of the transform for each group of data. Our method achieves better denoising performance than existing denoising methods in terms of signal-to-noise ratio values and visual preservation of weak signals. Comparisons with the f-x deconvolution method and the data-driven tight frame method on one synthetic data set and three field data sets are also provided.
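To make the group sparsity idea concrete, the following is a minimal sketch and not the method described above: it assumes a fixed orthonormal DCT in place of the adaptively learned transform, and it selects one shared support per group by keeping the transform rows with the largest joint energy. The names `dct_matrix`, `group_sparse_denoise`, `reference`, and `ref_weight` are hypothetical and chosen only for illustration; the reference patch is assumed to be a clean patch that shares the sparsity pattern of the group.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix (assumed stand-in for a learned transform)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    W = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    W[0, :] /= np.sqrt(2.0)
    return W

def group_sparse_denoise(Y, W, s, reference=None, ref_weight=1.0):
    """Keep the s transform rows with the largest joint energy over the group.

    Y: (n, k) noisy patches stored as columns; W: (n, n) orthonormal transform.
    reference: optional (n, r) clean or denoised patches (hypothetical anchor)
    whose energy is added to the row energies before the support is chosen.
    """
    A = W @ Y                                  # transform coefficients
    energy = np.sum(A**2, axis=1)              # joint energy of each row
    if reference is not None:
        energy = energy + ref_weight * np.sum((W @ reference) ** 2, axis=1)
    support = np.argsort(energy)[-s:]          # one support shared by the group
    A_hat = np.zeros_like(A)
    A_hat[support, :] = A[support, :]
    return W.T @ A_hat, np.sort(support)

# Synthetic group: weak coefficients on a shared support, plus random noise.
rng = np.random.default_rng(0)
n, k, true_support = 16, 8, [1, 3, 5]
W = dct_matrix(n)
A0 = np.zeros((n, k))
A0[true_support, :] = 0.5 * rng.standard_normal((len(true_support), k))
X0 = W.T @ A0                                  # clean patches
Y = X0 + 0.5 * rng.standard_normal((n, k))     # noisy patches

# One clean reference patch with the same sparsity pattern (an assumption).
ref_coef = np.zeros((n, 1))
ref_coef[true_support, 0] = 1.0
ref = W.T @ ref_coef

X_hat, support = group_sparse_denoise(Y, W, s=3, reference=ref, ref_weight=1e3)
print("selected support:", support)
print("noisy error:   ", np.linalg.norm(Y - X0))
print("denoised error:", np.linalg.norm(X_hat - X0))
```

In this sketch the reference only steers which rows are kept, so weak coefficients that fall on the anchored support survive the thresholding even when their energy is comparable to the noise. A learned transform and a per-group transform subset, as described above, would replace the fixed DCT and the simple energy ranking.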