Xin et al. (2019) presented 3D seismic velocity models (Vp and Vs) of the crust and uppermost mantle of continental China using seismic body‐wave travel‐time tomography, referred to as Unified Seismic Tomography Models for Continental China Lithosphere 1.0 (USTClitho1.0). Compared with previous models of continental China, the Vp and Vs models of USTClitho1.0 have the highest spatial resolution, 0.5°–1.0° in the horizontal direction, and are useful for better understanding the complex tectonics of continental China. Although USTClitho1.0 is implicitly constrained by surface‐wave data, because the Vs model from surface‐wave tomography and the converted Vp model were used as initial models for body‐wave travel‐time tomography, the surface‐wave dispersion curves predicted from USTClitho1.0 do not fit the observed data well. Here, we present updated 3D Vp and Vs models of the continental China lithosphere (USTClitho2.0) obtained by joint inversion of body‐wave arrival times and surface‐wave dispersion data. Relative to the earlier joint inversion scheme of Zhang et al. (2014), and similar to Fang et al. (2016), the new joint inversion system is further improved by including the sensitivity of surface‐wave dispersion data to Vp. As a result, the shallow Vp structure is also better imaged. In addition, the new joint inversion scheme accounts for the large topography variations between the eastern and western parts of China, so USTClitho2.0 better resolves the upper‐crustal structure of the Tibetan plateau. Unlike USTClitho1.0, USTClitho2.0 fits both body‐wave arrival times and surface‐wave dispersion data. The new velocity models are therefore more accurate and can serve as a better reference model for regional‐scale tomography and geodynamic studies in continental China.
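As a schematic illustration only, a joint inversion of this kind can be written as one stacked linear system in which body‐wave and surface‐wave data update the same model perturbations. The notation below is assumed for illustration and is not taken from the paper: d^{T_P}, d^{T_S}, and d^{SW} denote residuals of P arrival times, S arrival times, and surface‐wave dispersion data; \Delta H, \Delta m_P, and \Delta m_S denote perturbations to hypocenters, Vp, and Vs; G denotes the corresponding sensitivity matrices; and w is a hypothetical relative weight between the two data types.

```latex
% Schematic joint system (illustrative notation, not the authors' exact formulation)
\begin{equation}
\begin{bmatrix}
  G^{T_P}_{H} & G^{T_P}_{V_P} & 0 \\
  G^{T_S}_{H} & 0             & G^{T_S}_{V_S} \\
  0           & w\,G^{SW}_{V_P} & w\,G^{SW}_{V_S}
\end{bmatrix}
\begin{bmatrix}
  \Delta H \\ \Delta m_P \\ \Delta m_S
\end{bmatrix}
=
\begin{bmatrix}
  d^{T_P} \\ d^{T_S} \\ w\,d^{SW}
\end{bmatrix}
\end{equation}
```

In this sketch, the block G^{SW}_{V_P} carries the sensitivity of the dispersion data to Vp. Setting it to zero gives a system in which the dispersion data constrain only Vs, so any Vp update would come from P arrival times alone.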