It is well known that the Gutenberg-Richter power law distribution must be modified for large seismic moments, for reasons of energy conservation and geometry. Several models have been proposed: a second power law with a larger b-value beyond a crossover magnitude, a “hard” magnitude cut-off, or a “soft” magnitude cut-off using an exponential taper. Since large-scale tectonic deformation is dominated by the very largest earthquakes, and since their impact on loss of life and property is huge, it is important to constrain the shape of their distribution as tightly as possible. We present a simple and powerful probabilistic approach showing that the gamma distribution is the best model under two hypotheses: that the Gutenberg-Richter power law distribution holds in the absence of any constraint, and that one or several constraints are imposed, based either on conservation laws or on the nature of the observations themselves. The selection of the gamma distribution does not depend on the specific nature of the constraint. We illustrate the approach with two constraints: the existence of a finite moment release rate and the observed size of the largest earthquake in a finite catalog. Our predicted “soft” maximum magnitudes compare favorably with those obtained by Kagan for the Flinn-Engdahl regionalization of subduction zones, collision zones, and mid-ocean ridges.
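To make the “soft” cut-off concrete, the following minimal sketch compares a pure power-law density in seismic moment with the same density multiplied by an exponential taper, which is the gamma-type shape discussed above. The function names, the exponent `beta = 2/3` (the value usually associated with b = 1), the lower threshold `m_min`, and the corner moment `m_corner` are illustrative assumptions, not values taken from this work, and the tapered density is deliberately left unnormalized.

```python
import math


def gr_density(m, beta, m_min):
    """Pure power-law (Pareto) density in seismic moment, for m >= m_min."""
    return beta * m_min**beta / m ** (1.0 + beta)


def tapered_density_unnormalized(m, beta, m_min, m_corner):
    """Power law times an exponential taper (gamma-type shape).

    Not normalized: only the shape relative to the pure power law is shown.
    """
    return gr_density(m, beta, m_min) * math.exp(-m / m_corner)


# Hypothetical parameter values, chosen only for illustration.
beta = 2.0 / 3.0
m_min = 1e17      # lower moment threshold (N*m), assumed
m_corner = 1e21   # corner ("soft" maximum) moment (N*m), assumed

for m in (1e18, 1e20, 1e21, 1e22):
    ratio = tapered_density_unnormalized(m, beta, m_min, m_corner) / gr_density(
        m, beta, m_min
    )
    print(f"moment {m:.0e} N*m: tapered / power-law = {ratio:.3e}")
```

Well below the corner moment the two densities are nearly indistinguishable, while above it the taper suppresses the power law exponentially. This is what distinguishes a “soft” cut-off from a “hard” one, where the density drops abruptly to zero.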