Nonearthquake seismic events such as quarry blasts, when misclassified, can contaminate an earthquake catalog. Local earthquakes sometimes have features similar to those of quarry blasts, which makes manual discrimination difficult and unreliable. We therefore propose to use the compact convolutional transformer (CCT) and a capsule neural network to discriminate between earthquakes and quarry blasts. First, we extract 60 s three‐channel seismograms spanning 10 s before to 50 s after the P‐wave arrival time. Then, we transform the time‐series data into the time–frequency domain (a scalogram) using the continuous wavelet transform. Next, the CCT network extracts the most significant features from the input scalograms. The capsule neural network then captures the spatial relations among the extracted features using the routing‐by‐agreement approach (dynamic routing). It produces distinct digit vectors for the earthquake and quarry blast classes, which enables robust classification. The proposed algorithm is evaluated on a seismic dataset recorded by the Egyptian Seismic Network, split into 80% for training and 20% for testing. Although the dataset is unbalanced, the proposed algorithm achieves a testing accuracy of 97.31%, with precision, recall, and F1‐score of 97.23%, 98.83%, and 98.02%, respectively. In addition, the proposed algorithm outperforms traditional deep learning models, for example, a convolutional neural network, ResNet, Visual Geometry Group (VGG), and AlexNet. Finally, a real‐time monitoring experiment demonstrates that the proposed method generalizes well.
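The preprocessing described above (a 60 s window around the P‐wave arrival, followed by a continuous wavelet transform) can be illustrated with a minimal NumPy sketch. It is not the implementation used in this work: the sampling rate, the Morlet mother wavelet, the parameter `w0`, the frequency grid, and the function names `extract_window`, `morlet_scalogram`, and `f1_score` are all assumptions made for illustration. The last helper only checks that the reported F1‐score is consistent with the reported precision and recall.

```python
import numpy as np


def extract_window(traces, p_index, fs, pre_s=10.0, post_s=50.0):
    """Cut a window from pre_s before to post_s after the P-wave arrival.

    traces  : array of shape (3, n_samples), one row per channel
    p_index : sample index of the P arrival (assumed to be already picked)
    fs      : sampling rate in Hz (assumed; not stated in the abstract)
    """
    start = p_index - int(round(pre_s * fs))
    stop = p_index + int(round(post_s * fs))
    if start < 0 or stop > traces.shape[1]:
        raise ValueError("record too short for the requested window")
    return traces[:, start:stop]


def morlet_scalogram(x, fs, freqs, w0=6.0):
    """Magnitude of a Morlet CWT of a 1-D signal (assumed wavelet choice)."""
    x = np.asarray(x, dtype=float)
    out = np.empty((len(freqs), len(x)))
    for i, f in enumerate(freqs):
        scale = w0 / (2.0 * np.pi * f)           # scale in seconds for centre frequency f
        half = int(np.ceil(4.0 * scale * fs))    # truncate the wavelet at +/- 4 scales
        if 2 * half + 1 > len(x):
            raise ValueError("signal too short for the lowest frequency")
        t = np.arange(-half, half + 1) / fs
        psi = np.pi ** -0.25 * np.exp(1j * w0 * t / scale) * np.exp(-0.5 * (t / scale) ** 2)
        psi /= np.sqrt(scale)                    # L2 normalisation across scales
        # Correlation with the wavelet = convolution with its reversed conjugate.
        coef = np.convolve(x, np.conj(psi)[::-1], mode="same") / fs
        out[i] = np.abs(coef)
    return out


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    fs = 100.0                                   # hypothetical sampling rate
    rng = np.random.default_rng(0)
    record = rng.standard_normal((3, int(120 * fs)))   # synthetic stand-in for a seismogram
    window = extract_window(record, p_index=int(30 * fs), fs=fs)
    freqs = np.linspace(1.0, 20.0, 20)           # hypothetical frequency grid in Hz
    scalograms = np.stack([morlet_scalogram(ch, fs, freqs) for ch in window])
    print(window.shape, scalograms.shape)        # (3, 6000) (3, 20, 6000)
    print(round(100 * f1_score(0.9723, 0.9883), 2))    # 98.02
```

Under these assumptions, each event yields a three‐channel stack of scalograms that can serve as the image‐like input to a feature extractor such as the CCT. A wavelet library could replace the hand‐written transform without changing the overall flow.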