Sheldon Ross 10: Exercise 3.63

Question: Suppose there are n types of coupons, and that the type of each new coupon obtained is independent of past selections and is equally likely to be any of the n types. Suppose one continues collecting until a complete set of at least one of each type is obtained.

  1. Find the probability that there is exactly one type-i coupon in the final collection. Hint: Condition on T, the number of types that are collected before the first type i appears
  2. Find the expected number of types that appear exactly once in the final collection

Analytical Solution

We will slightly digress from what has been asked in the question. Instead of solving what has been asked (which is sort of a special case), we will look at the larger events and see the following aspects of the problem

  1. What is the distribution of the number of coupons that have been only collected once before the set is complete
  2. What is the distribution of the number of coupons that have been collected ‘n’ times before the set is complete
  3. What is the behavior of the above as the number of unique coupons increases

Simulation Solution

Part 1 and 2

How to read the plot?

  • The x-Axis shows the repetition number and the y-Axis shows the number of coupons that were repeated that many times
    • for example, consider that Coupon-A, Coupon-B, and Coupon-C have appeared 2 times, each, then that would be shown have the x-position of 2 and y-Position of 3 (as in the number of types of coupons which is A, B, and C)
  • Why do we see points scattered?
    • I intentionally randomized the position the points to see all of them as opposed to non-randomized plotting where all the points would overlap and we would not have any idea of the density of the distribution of the counts and the occurrences
    • In summary, all the points of a cluster have the same value, they have been intentionally scattered to show all of them
      • If we were to do this in a 3D plot, it would simply be a 3D histogram and the randomization would not be necessary

Notice that the plot is not evenly distributed next to the y-axis, there are some sparsely distributed areas above x=1. Can you guess why that is happening?

Part 3

This plot is more informative and contains more information. In this plot, we see what happens what happens to patterns in the final collection when the number of the unique coupons available “out there for grabs” is changing. We can make the following observations from the plot. You need to understand the plot above to understand the one below.

  • Firstly, the shape evolves as if looking to optimize itself
  • Secondly, you can see how the patchy areas start to appear in the plot as the number of unique coupons available for grams is increasing

Drop a comment if you have any questions about the representation.

 

Part 3 denser plots

 

 


Code

The code for this one is a little involved because of the multi-level dictionaries that are been concatenated to extract the simulation data. I have used shortcuts in many places and hence the code is compact.

ClearAll[associationRandomizer];
associationRandomizer[association_?AssociationQ, offset_Real : 0.3] :=
    MapThread[{#1 + RandomReal[{-offset, offset}], #2 +
        RandomReal[{-offset, offset}]} &, {Values@association,
      Keys@association}]

Module[{plotData = List[], plotExtent = 15},
  Do[
    Module[{coupons = Take[Alphabet[], 26], newCoupon,
      couponCollection = List[], plotSubData},
      newCoupon := RandomChoice@coupons;
      While[True,
        AppendTo[couponCollection, newCoupon];
        If[Union@couponCollection == coupons, Break[]]
      ];
      plotSubData = KeySort@Counts@Values@Counts@couponCollection;
      AppendTo[plotData, plotSubData]
    ]
    , 2000];
  ListPlot[associationRandomizer /@ plotData, AspectRatio -> 1,
    Frame -> True, PlotRange -> ConstantArray[{0, plotExtent}, 2],
    GridLines -> ConstantArray[Range@plotExtent, 2],
    PlotStyle -> {{PointSize@0.002, Darker@Green}},
    GridLinesStyle -> Directive[Lighter@Gray, Dashed], ImageSize -> 788,
    FrameLabel -> (Style[#, Gray, 16] & /@ {"Repeating coupons",
      "Number of such occurrences"})]
]


Grid[Partition[#, 3]] &[
  Table[Module[{plotData = List[], plotExtent = 15},
    Do[
      Module[{coupons = Take[Alphabet[], take], newCoupon,
        couponCollection = List[], plotSubData},
        newCoupon := RandomChoice@coupons;
        While[True,
          AppendTo[couponCollection, newCoupon];
          If[Union@couponCollection == coupons, Break[]]
        ];
        plotSubData = KeySort@Counts@Values@Counts@couponCollection;
        AppendTo[plotData, plotSubData]
      ]
      , 1000];
    ListPlot[associationRandomizer /@ plotData, AspectRatio -> 1,
      Frame -> True, PlotRange -> ConstantArray[{0, plotExtent}, 2],
      PlotStyle -> {{PointSize@0.0, Darker@Green}},
      GridLinesStyle -> Directive[Lighter@Gray, Dashed],
      ImageSize -> 200,
      Epilog -> {Text[
        "Unique coupons \nfor taking = " <> ToString[take], {10, 12}]}]
  ], {take, Range[4, 26, 2]}]]

End of the post 🙂


.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.