The traditional magnitude estimation method, which establishes a linear relationship between a single warning parameter and the magnitude, exhibits considerable scatter and tends to underestimate magnitude. In addition, feature extraction from raw waveforms by a deep learning network is a black box. To provide a more robust magnitude estimate and to construct a deep learning network with interpretable input, we draw on deep learning and earthquake rupture physics to establish a magnitude estimation network model (MEANet) that combines time series of physics-based features, an attention mechanism, and neural networks. We use events with 4 ≤ M ≤ 7.5 that occurred in Japan and the Sichuan-Yunnan region, China, to train and validate MEANet, and then test MEANet on additional events. Our results show that MEANet estimates magnitude more robustly than the traditional methods, with a standard deviation of error of ±0.25 magnitude units at a single station with a 3 s P-wave time window. Within 10 s after the first station is triggered, based on the weighted average of the triggered stations, MEANet provides robust magnitude estimation without underestimation for events with 4 ≤ M ≤ 7.5. Our findings imply that, viewed through the combination of deep learning and physics-based features, the final magnitude is to some degree deterministic. MEANet may therefore have potential for earthquake early warning.