We will start with some observations and experiments. At the end, we will summarize what we learned, and that summary will serve as the definition and explanation.

Say we have a die, and the sample space (the set of all possible outcomes) for a roll is {1, 2, 3, 4, 5, 6}.

Pick a random number from this set or, if you have a die, roll it. Say I rolled it and got a 6. Now let us roll it 100 times. Since rolling by hand that many times is tedious, we will use a computer to do it.

Let us create a digital die first.

Input Code: a = Table[n, {n, 1, 6}]

Output: {1, 2, 3, 4, 5, 6}

From this digital die, we will pick a random sample of 100 (this is like rolling a die 100 times and noting the outcomes).

Input Code: Table[RandomChoice[a], {n, 1, 100}]

Output: {4, 2, 1, 1, 5, 6, 2, 3, 1, 2, 5, 2, 2, 2, 4, 3, 1, 1, 5, 2, 2, 3, 1, 5, 4, 1, 3, 4, 1, 1, 3, 5, 4, 2, 5, 1, 2, 6, 1, 5, 6, 4, 6, 6, 6, 4, 1, 4, 4, 4, 4, 4, 2, 2, 1, 6, 4, 6, 4, 5, 5, 4, 6, 2, 4, 5, 2, 3, 5, 3, 5, 5, 2, 1, 2, 6, 6, 6, 5, 6, 3, 3, 3, 6, 3, 6, 5, 4, 6, 1, 1, 5, 5, 6, 1, 6, 2, 4, 3, 1}

The mean of all those values is 3.53.
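The roll-and-average step above can also be done in one go. This is a minimal sketch, assuming the digital die is stored in a variable named a; the name sample and the call to SeedRandom are illustrative additions (the seed only makes the run repeatable and was not part of the experiment above).

```mathematica
(* The digital die; a is the name the rest of the code refers to *)
a = Range[6];

(* Optional: fix the seed so the run is repeatable (illustrative choice) *)
SeedRandom[1];

(* RandomChoice[a, 100] draws 100 values with replacement, like 100 rolls *)
sample = RandomChoice[a, 100];

(* How often each face came up, sorted by face value *)
Sort[Tally[sample]]

(* Mean of the 100 rolls as a decimal; Mean alone returns an exact value *)
N[Mean[sample]]
```

Each run without a fixed seed gives a different sample, so the mean will wander around 3.5 rather than land on the same number every time.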

Let us call this one set of calculations: 100 rolls and their mean.

Now let us make a hundred such sets and record the mean of each one.

Input Code: Table[N[Mean[Table[RandomChoice[a], {i, 1, 100}]]], {j, 1, 100}]

Output: {3.57, 3.12, 3.61, 3.4, 3.63, 3.67, 3.69, 3.38, 3.26, 3.56, 3.63, 3.36, 3.51, 3.52, 3.43, 3.29, 3.38, 3.84, 3.27, 3.57, 3.55, 3.11, 3.35, 3.42, 3.48, 3.45, 3.33, 3.5, 3.38, 3.56, 3.57, 3.64, 3.63, 3.46, 3.69, 3.48, 3.44, 3.41, 3.54, 2.98, 3.56, 3.29, 3.53, 3.5, 3.64, 3.6, 3.3, 3.65, 3.39, 3.55, 3.38, 3.87, 3.61, 3.44, 3.57, 3.58, 3.03, 3.53, 3.28, 3.4, 3.35, 3.31, 3.71, 3.24, 3.06, 3.66, 3.29, 3.5, 3.41, 3.56, 3.51, 3.75, 3.47, 3.39, 3.52, 3.43, 3.59, 3.9, 3.26, 3.77, 3.72, 3.56, 3.56, 3.84, 3.5, 3.31, 3.59, 3.63, 3.31, 3.33, 3.64, 3.4, 3.57, 3.24, 3.44, 3.4, 3.59, 3.63, 3.5, 3.47}

Let us look at the distribution of these 100 means using a histogram.

What do you see? The means do not all occur with the same frequency. Let us try a larger number of trials (that is, a larger number of averages).

*Distribution of 100 Averages*

*Distribution of 1000 Averages*

*Distribution of 5000 Averages*

*Distribution of 10000 Averages*

*Distribution of 50000 Averages*

*Distribution of 100000 Averages*

*Distribution of 500000 Averages*

*Distribution of 1000000 Averages (A Million!)*
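A series of histograms like the ones listed above could be produced along the following lines. This is only a sketch: the helper name sampleMeans is hypothetical, the plot labels are an illustrative addition, and the list of counts is cut short at 10000 to keep the run quick.

```mathematica
(* The digital die *)
a = Range[6];

(* sampleMeans[k] returns k averages, each taken over 100 simulated rolls.
   RandomChoice[a, {k, 100}] builds a k x 100 array of rolls in one call,
   and Mean /@ takes the mean of each row. *)
sampleMeans[k_] := N[Mean /@ RandomChoice[a, {k, 100}]];

(* One histogram per count, each with 50 bins as in the original code *)
Table[
  Histogram[sampleMeans[k], 50,
    PlotLabel -> Row[{"Distribution of ", k, " Averages"}]],
  {k, {100, 1000, 10000}}
]
```

The larger counts work the same way, but a million averages means a hundred million simulated rolls, so expect that run to need noticeably more time and memory.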

*What have we seen from this?*

As the number of experiments (here, the number of averages) increases, the histogram of the means looks more and more like a normal distribution. And there is a surprise for you.

What was the set we started the experiment with? It was *{1, 2, 3, 4, 5, 6}*, and the mean of all its elements is 3.5.

Look at the last diagram now. Where is the peak of the distribution located? At 3.5.

Notice that we did not need to know much about the underlying data. We only had the random samples drawn from it, yet we were able to get a good estimate of the mean of the set by looking at the distribution of the sample means. This is, in essence, what the central limit theorem says: the means of sufficiently large random samples are approximately normally distributed around the mean of the set they were drawn from.
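The theorem also says something about how wide the bell is: the standard deviation of the sample means should be close to the standard deviation of a single roll divided by the square root of the sample size (here 100). The sketch below checks both the center and the width. The names means, mu and sigma and the choice of 10000 averages are illustrative assumptions, not part of the original experiment.

```mathematica
(* The digital die *)
a = Range[6];

(* 10000 averages, each over 100 simulated rolls *)
means = N[Mean /@ RandomChoice[a, {10000, 100}]];

(* Observed center and spread of the sample means *)
{Mean[means], StandardDeviation[means]}

(* Predicted center: the mean of the set, 3.5 *)
mu = N[Mean[a]];

(* Standard deviation of a single roll over the whole set, Sqrt[35/12].
   StandardDeviation[a] would use the n - 1 sample formula instead. *)
sigma = N[Sqrt[Mean[(a - Mean[a])^2]]];

(* Predicted center and spread of the means: 3.5 and about 0.171 *)
{mu, sigma/Sqrt[100]}
```

The observed pair should come out close to the predicted pair, though not identical, since the samples are random.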

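The code used is below, where a is the digital die defined earlier and numAverages is the number of averages to plot:

numAverages = 100;

Histogram[Table[Mean[Table[RandomChoice[a], {i, 1, 100}]], {j, 1, numAverages}], 50, ChartStyle -> Hue[0.58], ChartBaseStyle -> {Opacity[0.2], EdgeForm[{Black}]}]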
