We have developed an artificial neural network to estimate P-wave velocity models directly from prestack common-source gathers. Our network is composed of a fully connected layer set and a modified fully convolutional layer set. The parameters in the network are tuned through supervised learning to map multishot common-source gathers to velocity models. To boost the generalization ability, the network is trained on a massive data set in which the velocity models are derived from natural images collected from an online repository. Multishot seismic traces are simulated from those models with the acoustic wave equation in a crosswell acquisition geometry. Shot gathers from different source positions are supplied to the network as separate input channels to increase data redundancy. The training process is expensive, but it occurs only once, up front. The cost of predicting velocity models is negligible once the training is complete. Different variations of the network are trained and analyzed. The trained networks yield encouraging results for predicting velocity models from prestack seismic data that are acquired with the same geometry as in the training set.
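As a minimal illustration of using shot gathers from different source positions as network channels, the sketch below stacks per-shot gathers into one channels-first array. It is not the authors' implementation: the function name `stack_shots_as_channels`, the array sizes, the channels-first layout, and the random stand-in data are all assumptions made for this example.

```python
import numpy as np


def stack_shots_as_channels(gathers):
    """Stack per-shot gathers of shape (n_time, n_receivers) into a single
    channels-first array of shape (n_shots, n_time, n_receivers)."""
    shapes = {g.shape for g in gathers}
    if len(shapes) != 1:
        raise ValueError("all shot gathers must share one shape")
    # Each source position becomes one channel along axis 0.
    return np.stack(gathers, axis=0).astype(np.float32)


# Hypothetical sizes; random noise stands in for simulated crosswell traces.
n_shots, n_time, n_receivers = 4, 100, 32
rng = np.random.default_rng(0)
gathers = [rng.standard_normal((n_time, n_receivers)) for _ in range(n_shots)]

network_input = stack_shots_as_channels(gathers)
print(network_input.shape)
```

With this layout, one training example holds every shot recorded over a single velocity model, so the first layer of a network could see all source positions at once.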