# Central Limit Theorem simulation

A well-known result in probability theory is the Central Limit Theorem (CLT), which states that the suitably centered and rescaled average of i.i.d. random variables with finite variance converges in distribution to a normal distribution.

Let $$X_1, \dots, X_n$$ be a sample of i.i.d. random variables in $$L^2$$ with expected value $$\mu$$ and variance $$\sigma^2$$. We are interested in the average

$S_n := \frac{X_1 + \dots + X_n}{n}$

By the Strong Law of Large Numbers, this quantity converges almost surely, and hence also in probability, to the expected value $$\mu$$. The CLT states that

$\sqrt n \left(S_n - \mu \right) \xrightarrow{\mathcal{L}} \mathcal{N}(0, \sigma^2)$

In other words, the fluctuations of the average of $$n$$ independent realizations of a random variable around $$\mu$$ are of order $$1/\sqrt{n}$$, and once rescaled by $$\sqrt{n}$$ they are asymptotically normal.
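A short computation, using only the independence of the $$X_i$$ and their common variance $$\sigma^2$$, shows why $$\sqrt{n}$$ is the natural scaling: it is exactly the factor that keeps the variance of the rescaled quantity constant.

$\mathrm{Var}(S_n) = \frac{1}{n^2} \sum_{i=1}^n \mathrm{Var}(X_i) = \frac{\sigma^2}{n}, \qquad \mathrm{Var}\left(\sqrt n \left(S_n - \mu\right)\right) = n \cdot \frac{\sigma^2}{n} = \sigma^2$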

We provide here a tool to check that, as $$n$$ grows, the distribution of the variable $$\sqrt{n}(S_n - \mu)$$ gets closer and closer to a normal distribution.
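One possible way to make this check quantitative is to measure the largest gap between the empirical distribution function of simulated values of $$\sqrt{n}(S_n - \mu)$$ and the distribution function of $$\mathcal{N}(0, \sigma^2)$$. The following is only a minimal sketch, assuming NumPy and exponential samples of parameter $$\lambda = 1$$ (so $$\mu = \sigma = 1$$); the function names `simulate_Y` and `sup_distance`, the seed and the number of simulations are illustrative choices, not part of the original tool.

```python
import math
import numpy as np

def simulate_Y(n, n_sim, rng, lam=1.0):
    """Draw n_sim independent copies of sqrt(n) * (S_n - mu) for Exp(lam) samples."""
    mu = 1.0 / lam
    x = rng.exponential(scale=1.0 / lam, size=(n_sim, n))
    return math.sqrt(n) * (x.mean(axis=1) - mu)

def normal_cdf(y, sigma):
    """Distribution function of N(0, sigma^2) evaluated at y."""
    return 0.5 * (1.0 + math.erf(y / (sigma * math.sqrt(2.0))))

def sup_distance(y, sigma):
    """Largest gap between the empirical CDF of y and the N(0, sigma^2) CDF."""
    y = np.sort(y)
    m = len(y)
    cdf = np.array([normal_cdf(v, sigma) for v in y])
    upper = np.arange(1, m + 1) / m - cdf   # empirical CDF just after each point
    lower = cdf - np.arange(0, m) / m       # empirical CDF just before each point
    return max(upper.max(), lower.max())

rng = np.random.default_rng(12345)  # fixed seed, assumed only for reproducibility
dist = {}
for n in (1, 5, 50):
    dist[n] = sup_distance(simulate_Y(n, 20000, rng), sigma=1.0)
    print(n, dist[n])
```

Under these assumptions the printed distance should shrink as $$n$$ increases, although it never reaches zero because of the finite number of simulations.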

Below are the density functions (or the probability mass functions) of some common distributions.

Normal distribution with mean $$\mu$$ and standard deviation $$\sigma$$

$f(x) = \frac 1 {\sigma \sqrt{2 \pi}} e^{- \frac{(x-\mu)^2}{2 \sigma^2}}$

Uniform distribution on the real interval $$[a,b]$$

$f(x) = \left\{ \begin{array}{cc} \frac 1 {b-a} & \text{if } x \in [a,b] \\ 0 & \text{otherwise} \end{array} \right.$

Exponential distribution of parameter $$\lambda$$

$f(x) = \left\{ \begin{array}{cc} \lambda e^{-\lambda x} & x \geq 0 \\ 0 & x < 0 \end{array} \right.$

Gamma distribution with shape parameter $$\alpha$$ and rate parameter $$\beta$$

$f(x) = \frac {x^{\alpha-1} \beta^\alpha e^{-\beta x }}{\Gamma(\alpha)}, \quad x > 0$

Bernoulli distribution of parameter $$p$$

$\mathbb{P}(X = 1) = 1 - \mathbb{P}(X = 0) = p$

Binomial distribution of parameters $$n$$ and $$p$$

$\mathbb{P}(X = k) = \binom n k p^k(1-p)^{n-k}$

Poisson distribution of parameter $$\lambda$$

$\mathbb{P}(X = k) = \frac{\lambda^k}{k!} e^{-\lambda}$
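To centre and rescale samples from these distributions, one needs their mean $$\mu$$ and variance $$\sigma^2$$. The sketch below pairs each distribution with a NumPy sampler and its theoretical mean and variance. It is an illustration under assumptions: the helper `make_distributions` and all default parameter values are hypothetical, the binomial number of trials is called `m` in the code to avoid a clash with the sample size $$n$$, and the Gamma distribution is parametrized by shape `alpha` and rate `beta` (NumPy expects the scale `1 / beta`).

```python
import numpy as np

def make_distributions(mu=0.0, sigma=1.0, a=0.0, b=1.0, lam=2.0,
                       alpha=3.0, beta=2.0, p=0.3, m=10):
    """Map a name to (sampler, theoretical mean, theoretical variance)."""
    return {
        "normal": (lambda rng, size: rng.normal(mu, sigma, size),
                   mu, sigma ** 2),
        "uniform": (lambda rng, size: rng.uniform(a, b, size),
                    (a + b) / 2, (b - a) ** 2 / 12),
        "exponential": (lambda rng, size: rng.exponential(1.0 / lam, size),
                        1.0 / lam, 1.0 / lam ** 2),
        "gamma": (lambda rng, size: rng.gamma(alpha, 1.0 / beta, size),
                  alpha / beta, alpha / beta ** 2),
        # A Bernoulli variable is a binomial variable with a single trial.
        "bernoulli": (lambda rng, size: rng.binomial(1, p, size),
                      p, p * (1 - p)),
        "binomial": (lambda rng, size: rng.binomial(m, p, size),
                     m * p, m * p * (1 - p)),
        "poisson": (lambda rng, size: rng.poisson(lam, size),
                    lam, lam),
    }

rng = np.random.default_rng(2024)  # arbitrary seed, for reproducibility only
for name, (sampler, mean, var) in make_distributions().items():
    x = sampler(rng, 100000)
    print(name, "sample mean:", x.mean(), "theoretical mean:", mean)
```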

If we set the variable $$seed = 0$$, the seed used to initialize the pseudorandom number generator changes at each call of the algorithm. If we choose a nonzero integer, that value is used to initialize the pseudorandom generator. This is useful when we want to reproduce the same simulation.
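The seed convention described above could be implemented along the following lines. This is a sketch assuming NumPy's `default_rng`; the helper name `make_rng` is hypothetical and is not taken from the original tool.

```python
import numpy as np

def make_rng(seed=0):
    """Return a generator: unpredictable if seed == 0, reproducible otherwise."""
    if seed == 0:
        # No seed given: NumPy pulls fresh entropy from the operating system.
        return np.random.default_rng()
    # NumPy expects a non-negative integer here.
    return np.random.default_rng(seed)

# Two generators built from the same nonzero seed produce the same numbers.
first = make_rng(42).random(3)
second = make_rng(42).random(3)
print(np.array_equal(first, second))
```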

Let $$Y_n = \sqrt n ( S_n - \mu )$$ and let $$(Y_n^k)_{k\geq 1}$$ be i.i.d. copies of $$Y_n$$, i.e. $$Y_n^k \sim Y_n$$ for each $$k \geq 1$$. In order to visualize the convergence to the normal distribution $$\mathcal N (0, \sigma^2)$$, we run 1000 simulations of $$Y_n$$, i.e. $$(Y_n^1, \ldots, Y_n^{1000})$$, and we plot them in a normalized histogram with a fixed number of bins.
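The histogram construction can be sketched as follows. The example assumes NumPy and uniform samples on $$[0,1]$$ (so $$\mu = 1/2$$ and $$\sigma^2 = 1/12$$); the function name `normed_histogram`, the seed and the default values are illustrative assumptions. It only computes the bar heights and the target density at the bin centres; drawing them is left to any plotting library.

```python
import numpy as np

def normed_histogram(n, bins, n_sim=1000, seed=1):
    """Histogram of n_sim simulations of Y_n, normalized to total area 1."""
    rng = np.random.default_rng(seed)
    a, b = 0.0, 1.0
    mu = (a + b) / 2
    sigma = (b - a) / np.sqrt(12)
    # Each row holds one sample (X_1, ..., X_n); each row gives one Y_n^k.
    x = rng.uniform(a, b, size=(n_sim, n))
    y = np.sqrt(n) * (x.mean(axis=1) - mu)
    # density=True rescales the counts so that the bars integrate to 1.
    heights, edges = np.histogram(y, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Density of N(0, sigma^2) at the bin centres, for comparison.
    target = np.exp(-centers ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))
    return heights, edges, target

heights, edges, target = normed_histogram(n=30, bins=20)
area = np.sum(heights * np.diff(edges))
print(area)
```

Normalizing the histogram to unit area is what makes the bar heights directly comparable with the normal density.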

Note that for $$n=1$$ we have $$Y_1 = X_1 - \mu$$, so the histogram approximates the distribution of the centered random variable itself.

We have to set the number of realizations $$n$$ of the random variable and the number of bins in the histogram.