Random Variate Correlations

This is a study conducted to study how the random numbers generated independently are correlated to each other. Since the expected correlation is 0, the behavior unfolds as we increase the sizes of the individual samples. Each of the following charts contains several smooth histograms overlapped. Each of the smooth curve (i.e., the smooth histogram) is made from 1000 correlations. Each correlation is calculated between two samples and the sizes of the samples are indicated above the chart.

How To Simulate? The following steps outline the methodology

  1. Obtain a random sample of size ‘n’ from a Standard Uniform Distribution
  2. Obtain another random sample size ‘n’ from a Standard Uniform Distribution
  3. Calculate the Pearson correlation between them
    1. Repeat the above for many trials and construct a distribution with it
      1. For overlaying several distributions, one could repeat above to plots similar to the ones shown below

The plots look really interesting BTW 😉

 

The aggregate chart showing the correlation histograms for all the sample sizes is

Exercises: 

  1. For the plots, I have used the univariate uniform distributions but you could try that with any combinations of other types of variables.
  2. The ideal random variate function should yield similar results regardless of the function itself (like, Normal or Weibull, etc,.)

Code

 

You could use the code below for the exercises.

ClearAll[correlationPlot];
correlationPlot[n_?IntegerQ] :=
    Module[{tickGrid = {Range[-1.5, 1.5, 0.1], Range[0.1, 1.5, 0.1]},
      sampleSize = n},
      SmoothHistogram[
        Table[Correlation[
          RandomVariate[UniformDistribution[{0, 1}], sampleSize],
          RandomVariate[UniformDistribution[{0, 1}],
            sampleSize]], {100}, {1000}], ImageSize -> 788, Frame -> True,
        PlotRange -> {{-1.5, 1.5}, {0, 1.5}},
        PlotStyle -> {{Red, Thickness -> 0, Opacity@0.1}},
        GridLines -> tickGrid, FrameTicks -> tickGrid,
        AspectRatio -> Divide[1, 2], Axes -> False,
        PlotLabel -> "Sample Size = " <> ToString[sampleSize]]] //
        Framed[#, FrameMargins -> 10, FrameStyle -> {GrayLevel[0.5]}] &

correlationPlot[n_?ListQ] :=
    Module[{tickGrid = {Range[-1.5, 1.5, 0.1], Range[0.1, 3, 0.1]}},
      SmoothHistogram[(Table[
        Correlation[RandomVariate[NormalDistribution[0, 1], #],
          RandomVariate[NormalDistribution[0, 1], #]], 10000] & /@ n),
        ImageSize -> 788, Frame -> True,
        PlotRange -> {{-1.5, 1.5}, {0, 3}}, GridLines -> tickGrid,
        FrameTicks -> tickGrid, AspectRatio -> Divide[1, 1],
        Axes -> False, PlotLegends -> Placed[n, Top]]] //
        Framed[#, FrameMargins -> 10, FrameStyle -> {GrayLevel[0.5]}] &

correlationPlot[#] & /@ Range[3, 10]

correlationPlot[{3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 20, 25, 30, 35, 40, 45, 50}]

End of the post