Distributed acoustic sensing (DAS) is a novel and fast-developing seismic acquisition technology that offers many advantages over traditional geophones. However, DAS data often suffer from severe and diverse types of noise with varying amplitudes, which lower the signal-to-noise ratio (S/N) and make the extraction of hidden signals challenging. An efficient denoising method that generalizes well is therefore crucial for improving the S/N of DAS data and for subsequent processing. We develop a dense connection network with a kernel-wise attention mechanism to attenuate complex and diverse noise (e.g., high-amplitude erratic, high-frequency, random, and horizontal noise) in real DAS data sets. We use an integrated denoising framework suited to attenuating DAS noise to generate labels for network training. Our network consists of five types of blocks: convolutional, dense, transition down, transition up, and selective kernel blocks (SKBs). In particular, the SKB fuses multiscale features by weighting them, thereby improving denoising accuracy. Computational efficiency and denoising performance are further improved by a patching method that segments the DAS data into many small-scale patches. Our network is trained on a small DAS data set and tested on synthetic and field data from vastly different geographic areas. Comparisons with three state-of-the-art deep-learning-based benchmark models show that our network performs more robustly and extracts signals more effectively.
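The patching step mentioned above can be illustrated with a minimal sketch. It assumes a 2D DAS section stored as a NumPy array (channels by time samples), overlapping square patches, and reassembly by averaging the overlaps. The patch size, stride, zero-padding, and the function names `extract_patches` and `merge_patches` are all hypothetical choices made for illustration; the abstract does not specify them, and this is not necessarily how the authors segment their data.

```python
import numpy as np


def extract_patches(data, patch_size=(64, 64), stride=(32, 32)):
    """Cut a 2D section (channels x time samples) into overlapping patches.

    The section is zero-padded at its bottom/right edges so that every
    sample falls inside at least one patch. Sizes are illustrative only.
    """
    n_ch, n_t = data.shape
    p_ch, p_t = patch_size
    s_ch, s_t = stride
    if s_ch > p_ch or s_t > p_t:
        raise ValueError("stride must not exceed patch size (would leave gaps)")

    # Number of patch positions along each axis (at least one).
    k_ch = max(int(np.ceil((n_ch - p_ch) / s_ch)), 0) + 1
    k_t = max(int(np.ceil((n_t - p_t) / s_t)), 0) + 1

    # Padded size that the patch grid covers exactly.
    pad_ch = (k_ch - 1) * s_ch + p_ch
    pad_t = (k_t - 1) * s_t + p_t
    padded = np.zeros((pad_ch, pad_t), dtype=data.dtype)
    padded[:n_ch, :n_t] = data

    patches, corners = [], []
    for i in range(k_ch):
        for j in range(k_t):
            r, c = i * s_ch, j * s_t
            patches.append(padded[r:r + p_ch, c:c + p_t])
            corners.append((r, c))
    return np.stack(patches), corners, (pad_ch, pad_t)


def merge_patches(patches, corners, padded_shape, out_shape):
    """Reassemble patches, averaging samples covered by several patches."""
    acc = np.zeros(padded_shape, dtype=np.float64)
    count = np.zeros(padded_shape, dtype=np.float64)
    p_ch, p_t = patches.shape[1:]
    for patch, (r, c) in zip(patches, corners):
        acc[r:r + p_ch, c:c + p_t] += patch
        count[r:r + p_ch, c:c + p_t] += 1.0
    merged = acc / count  # stride <= patch size, so count is never zero
    return merged[:out_shape[0], :out_shape[1]]


# Usage: a random array stands in for a DAS section; no denoiser is applied.
rng = np.random.default_rng(0)
section = rng.standard_normal((100, 150))
patches, corners, padded_shape = extract_patches(section)
restored = merge_patches(patches, corners, padded_shape, section.shape)
```

In a full workflow, a trained network would process each patch between the two calls. Averaging the overlapping regions is one common way to suppress seams at patch boundaries, at the cost of processing some samples more than once.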