**Normal distribution**

When mathematicians in the eighteenth century began to
investigate the distributions of random variables, the familiar bell curve of
the normal distribution soon came to the fore. It was first found by De Moivre
as a limiting form of the binomial distribution in 1733, but was neglected until
it was rediscovered by Gauss in 1809 and it is often called the Gaussian
distribution. It came to be thought apodictic that it always applied, which has
led to some serious errors even to the present day. Yet it can fairly be said that the
normal distribution is *almost* universal, for several reasons:

- The normal distribution was put forward by Gauss as a plausible distribution for measurement errors, and indeed it is found to be applicable over almost the whole of science and engineering measurement.
- Many other distributions tend towards the normal under commonly found conditions, as we have seen with Poisson.
- A remarkable theorem known as the *Central Limit Theorem* (due to Laplace) states that, under very general conditions, when *n* random variables, whatever their distributions, are added together, the distribution of the sum tends towards the normal as *n* increases.
- As a result of the Central Limit Theorem, averages of random variables also tend to the normal. Many observed variables are, in fact, composed of sums of other variables, including measurement errors.
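The Central Limit Theorem can be seen numerically in a short sketch. The choice of the uniform distribution, and the values of *n* and the number of trials, are illustrative assumptions, not from the text:

```python
import random
import statistics

# Sketch of the Central Limit Theorem: sums of n Uniform(0, 1)
# variables (each far from normal) approach a bell curve as n grows.
# A Uniform(0, 1) variable has mean 1/2 and variance 1/12, so the
# sum of n of them should have mean n/2 and variance n/12.
random.seed(1)

n, trials = 12, 10_000
sums = [sum(random.random() for _ in range(n)) for _ in range(trials)]

print(round(statistics.mean(sums), 1))      # close to n/2 = 6.0
print(round(statistics.variance(sums), 1))  # close to n/12 = 1.0
```

A histogram of `sums` would show the familiar bell shape, even though each summand is flat-distributed.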

When does the normal distribution not apply? We have seen, for example with the Poisson, that it fails for small expected values, as it does with the binomial. The biggest errors, however, have been made in attempts to apply it to extreme values. In areas such as the strength of materials, which are subject to the principle that a chain is only as strong as its weakest link, smallest-value distributions apply. In the study of floods and other record highs, largest-value distributions apply. The birth month fallacy is an example of a very common error of this sort, and it emerges several times a year.
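The weakest-link point can be sketched numerically: even when the individual values are normal, the *minimum* of many of them is not. The sample size and trial count below are arbitrary assumptions for illustration:

```python
import random
import statistics

# "Weakest link": take the minimum of 100 normal values, many times
# over. The minima follow a skewed smallest-value distribution, not
# a normal one.
random.seed(1)

minima = [min(random.gauss(0, 1) for _ in range(100)) for _ in range(5_000)]

# A normal distribution is symmetric, so its mean and median coincide.
# For the minima, the mean lies below the median: the left tail is
# stretched, which normal-based reasoning would misjudge.
print(round(statistics.mean(minima), 2), round(statistics.median(minima), 2))
```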

The normal density function is given by:

$$f(x) = \frac{1}{s\sqrt{2\pi}} \exp\!\left(-\frac{(x - m)^2}{2s^2}\right)$$

where *m* is the mean and *s* is the standard deviation. The peak (mode) is also at *x* = *m*, and there are inflection points at *x* = *m* ± *s*. An important property of the distribution is the rate at which it declines as *x* moves away from *m*. Here is a table of the *percentage* probability that *x* lies *outside* the range *m* ± *ls*:

| *l* | P (%) |
|-----|--------|
| 1 | 31.731 |
| 2 | 4.550 |
| 3 | 0.270 |
| 4 | 0.006 |

Thus two standard deviations correspond to roughly 5% (good enough for epidemiology), three correspond to much less than 1% and are therefore regarded as unlikely, while four standard deviations are very unlikely indeed.
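The percentages in the table can be checked directly: for a normal variable, the probability of lying more than *l* standard deviations from the mean is the complementary error function erfc(*l*/√2), which Python's standard library provides:

```python
import math

# Two-sided tail probability of the normal distribution, as a
# percentage, for l = 1..4 standard deviations from the mean:
# P(|x - m| > l*s) = erfc(l / sqrt(2)).
for l in (1, 2, 3, 4):
    p = 100 * math.erfc(l / math.sqrt(2))
    print(l, round(p, 3))
```

The printed values reproduce the table above to the precision shown.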

Engineers sometimes use the three-sigma rule to exclude “outliers” from their graphs. This is one of the reasons that, for example, extreme value distributions were discovered so late.