Seismic facies interpretation supports the analysis of subsurface geologic environments and the prediction of reservoirs. Traditional interpretation methods require extensive manual work and depend heavily on the experience and expertise of the interpreter. We have developed supervised deep-learning algorithms to perform automatic seismic facies interpretation. In deep learning, conventional convolutional neural networks (CNNs) and encoder-decoder architectures are widely used for image classification and segmentation problems, respectively. Based on these two architectures, we build a 3D conventional CNN and a conventional encoder-decoder, and we then apply an enhanced encoder-decoder (DeepLabv3+) that integrates more advanced structures. To train the networks, we develop an effective scheme that automatically generates diverse augmented data from well-labeled 2D seismic sections in which facies are divided into nine classes. We perform experiments on the Netherlands F3 data set by training on diverse samples and tuning parameters, apply the trained networks to the whole data volume, and then quantitatively evaluate the results. The test results of the encoder-decoders are more accurate than those of the conventional CNN, are obtained more efficiently, and are more consistent with the geologic background. The mean intersection-over-union values are 87.8% for the conventional encoder-decoder and 92.4% for the enhanced encoder-decoder, whereas the value for the conventional CNN is 67.8%. In addition, predicting one seismic section takes less than 1.0 s with the encoder-decoders, whereas it takes 4.0 min with the conventional CNN.
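
For readers unfamiliar with the evaluation metric, the mean intersection-over-union quoted above can be sketched as follows. This is a minimal illustration and not the authors' evaluation code: the function name `mean_iou` is hypothetical, and the choice to average only over classes that occur in the labels or the predictions is an assumption, because the abstract does not say how absent classes are treated.

```python
import numpy as np


def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection-over-union for integer class labels.

    Hypothetical helper for illustration. Classes that appear in neither
    the labels nor the predictions are skipped (an assumption).
    """
    y_true = np.asarray(y_true).ravel().astype(np.int64)
    y_pred = np.asarray(y_pred).ravel().astype(np.int64)

    # Confusion matrix: rows are true classes, columns are predicted classes.
    flat_index = y_true * num_classes + y_pred
    cm = np.bincount(flat_index, minlength=num_classes ** 2)
    cm = cm.reshape(num_classes, num_classes)

    # Per class: intersection = true positives,
    # union = all pixels labeled or predicted as that class.
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection

    present = union > 0
    return float((intersection[present] / union[present]).mean())


# Toy example with two of the nine facies classes present.
labels = [0, 0, 1, 1]
predictions = [0, 1, 1, 1]
print(mean_iou(labels, predictions, num_classes=9))
```

In the toy example, class 0 has an IoU of 1/2 and class 1 has an IoU of 2/3, so the mean is 7/12. The same function applies unchanged to a full 2D label section, since the inputs are flattened first.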