In my own experience, I've encountered cocks ranging from roughly the size of my thumb to the size of my forearm, or from about 2.5 to around 11 inches long. While ones at the extremes are rare, I have seen several. Since I have certainly not seen millions of cocks, that means I have seen far more large ones than your statistics would predict. For reference, I would estimate that cocks in the 8 to 9 inch range are found on perhaps 1% or so of the men I have played with, not one in a million.
There are perhaps several reasons for this, but the most likely one has to do with improper use and poor understanding of statistical techniques, leading to invalid conclusions. You, and many of the researchers you quote, have assumed that the data has a Gaussian (normal) distribution when applying the concepts of mean and standard deviation (SD), and have then used these numbers to extrapolate the expected frequency of finding any particular size. Unfortunately, this is often done on datasets which are not truly Gaussian.
The mean and SD are valid concepts for data with any distribution. However, using these two values to predict the frequency in the way you did is only valid if the distribution is Gaussian. It is common for researchers to summarize data as the mean and SD, and leave it to readers to assume that the data set is Gaussian, when in fact the accuracy of fit to the Gaussian model was never verified. The fact that the distribution curve seems "bell shaped" doesn't mean that it accurately follows the Gaussian model. Even if the data appears to be a reasonable fit to a Gaussian model, it is important to note that the further you get from the mean, the less accurate the extrapolation may become. This is because the tests used to determine the goodness of fit to the model aren't very sensitive to points out on the "tails" of the distribution, as there are comparatively few of these points in a dataset of roughly Gaussian shape. So, you have to beware of large extrapolations (many SD) in all cases, and even be skeptical closer to the mean if the goodness of fit to the model is suspect.
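To put rough numbers on that warning, here is a minimal Python sketch comparing the upper tail of a Gaussian with the upper tail of a logistic distribution that has exactly the same mean and SD. The logistic curve is my own choice of a second "bell shaped" example, not something taken from your data, and the function names are made up for illustration.

```python
import math

def gaussian_tail(k):
    """Fraction of a Gaussian population lying more than k SD above the mean."""
    return 0.5 * math.erfc(k / math.sqrt(2))

def logistic_tail(k):
    """Same fraction for a logistic distribution with the same mean and SD.

    A logistic distribution with scale s has SD = s * pi / sqrt(3), so a point
    k SD above the mean sits k * pi / sqrt(3) scale units out.
    """
    return 1.0 / (1.0 + math.exp(k * math.pi / math.sqrt(3)))

# Compare the two tails at increasing distances from the mean.
for k in (1, 2, 4, 6, 10):
    g = gaussian_tail(k)
    lo = logistic_tail(k)
    print(f"+{k:>2} SD: Gaussian {g:.3g}, logistic {lo:.3g}, ratio {lo / g:.3g}")
```

Both curves look like a bell and share the same mean and SD, yet by 4 SD their predicted frequencies differ by more than a factor of ten, and the gap keeps widening from there. Which of the two is closer to real data is exactly what the mean and SD alone cannot tell you.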
To have a true Gaussian distribution, the data has to be truly random in nature. It is doubtful that penis size is truly random, as genetics, mate selection, physical limitations, and other somewhat deterministic factors are involved. Therefore, it is not surprising that actual experience differs from the results extrapolated using the mean and SD assuming a Gaussian model. Even a subtle deterministic effect can cause large deviations from the model at the "tails" of the curve.
In fact, even if penis length were truly random, the model using a mean length plus/minus so many standard deviations is obviously flawed. For example, you say the world's largest cock would be at +10 standard deviations, at 12 x 8.5 inches. But, as a Gaussian distribution is symmetrical about the mean, this means that the smallest penis, at -10 standard deviations, would be negative 1 inch long by 0.5 inches in circumference (this is using your numbers; I didn't bother to verify your calculations). Since a negative length is physically impossible, it is obvious that your model cannot be accurate at the large extrapolations you make.
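The arithmetic behind that objection can be sketched in a few lines of Python. The only inputs are the two length figures quoted above (12 inches at +10 SD and -1 inch at -10 SD); the mean and SD are back-calculated from them under the assumption of a symmetric model, and all variable and function names are my own.

```python
import math

# Lengths quoted in this letter (your numbers, not verified by me).
PLUS_10_SD = 12.0    # inches, claimed length at +10 SD
MINUS_10_SD = -1.0   # inches, mirror-image length at -10 SD
K = 10

# A symmetric model puts the mean midway between the two points,
# and the two points are 2 * K standard deviations apart.
implied_mean = (PLUS_10_SD + MINUS_10_SD) / 2       # 5.5 inches
implied_sd = (PLUS_10_SD - MINUS_10_SD) / (2 * K)   # 0.65 inches

def gaussian_length(k):
    """Length the symmetric model assigns to a point k SD from the mean."""
    return implied_mean + k * implied_sd

# Number of SD below the mean at which the model's length reaches zero.
zero_crossing = implied_mean / implied_sd

# Fraction of the population the Gaussian model places below zero length.
p_below_zero = 0.5 * math.erfc(zero_crossing / math.sqrt(2))

print(f"implied mean {implied_mean} in, implied SD {implied_sd} in")
print(f"model length at -{K} SD: {gaussian_length(-K):.2f} in")
print(f"length hits zero at -{zero_crossing:.2f} SD")
print(f"model probability of a negative length: {p_below_zero:.3g}")
```

The model's length reaches zero about 8.5 SD below the mean, so any claim made at 10 SD above the mean has a mirror image that cannot physically exist.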
When measuring the size of things that vary widely (which cocks do, despite the feel-good BS to the contrary), statistics based on geometric distributions, rather than arithmetic ones, sometimes work out as better fits. For example, the arithmetic Gaussian distribution you assumed predicts that for every cock X inches longer than the arithmetic mean, there is one X inches shorter, given the symmetrical distribution curve. However, in a geometric distribution, for every cock Y times the length of the geometric mean, you have one equal to the mean divided by Y. (Note that you can't get negative lengths this way; just nearly zero.) A simple way to analyze a data set in this way is to take the logarithms of the lengths, then apply the usual arithmetic Gaussian statistics to the logs (statisticians usually call the result a log-normal distribution). This is still assuming a Gaussian distribution (which may not really be true due to deterministic factors), but now applies it to the ratio of the length to the mean. I suspect that will produce results that agree better with observation.
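As a rough illustration of the log approach, here is a minimal Python sketch. The helper names are hypothetical, and the sample values are placeholders I invented purely to exercise the code; they are not measurements and say nothing about real frequencies.

```python
import math
import statistics

def log_stats(lengths):
    """Return (geometric mean, multiplicative SD factor) of positive lengths.

    Takes logs, applies the ordinary mean and sample SD to the logs,
    then converts both back to the original scale.
    """
    logs = [math.log(x) for x in lengths]
    return math.exp(statistics.mean(logs)), math.exp(statistics.stdev(logs))

def length_at(k, geo_mean, geo_sd):
    """Length the log-based model assigns to a point k SD from the mean."""
    return geo_mean * geo_sd ** k

# Placeholder values, made up only to run the code; NOT real data.
sample = [4.0, 5.0, 5.5, 6.0, 7.5]
geo_mean, geo_sd = log_stats(sample)

for k in (-10, -2, 0, 2, 10):
    print(f"{k:+3d} SD: {length_at(k, geo_mean, geo_sd):.3f} in")
```

Whatever numbers go in, the point k SD above the geometric mean is the mean multiplied by some factor, the point k SD below is the mean divided by that same factor, and neither can ever come out negative.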