Sheldon Ross 10: Exercise 3.31

Sheldon Ross 10 Exercise 3.31_Master I found the Sequence-01-01.jpg

Question: Each element in a sequence of binary data is either 1 with probability p or 0 with 1 − p. A maximal subsequence of consecutive values having identical is called a run. For instance, if the outcome sequence is 1, 1, 0, 1, 1, 1, 0, first run is of length 2, the second is of length 1, and the third is of length 3

  1. Find the expected length of the first run
  2. Find the expected length of the second run

Analytical Solution

This problem is rather easy to compute if we use the principles of conditioning. Since we are interested only on the first two sequences, the following possibilities exist

  • The first sequence is zeroes followed by ones OR
  • The first sequence is ones followed by zeroes.

Since the probabilities for the type of bit is dependent on ‘p’, we can determine the relation between the p and the expected run length of the first sequence (S1) and the second sequence (S2)

E[L(S1)] = E[ L(S1) | bit1=1 ] P{bit1=1} + E[ L(S1) |  bit1=0 ] P{bit1=0}

E[L(S1)] = 11-p(p) + 1p(1-p) = p1-p + 1-pp

Similarly,

E[L(S2)] = E[ L(S2) | bit1=1 ] P{bit1=1} + E[ L(S2) |  bit1=0 ] P{bit1=0}

E[L(S2)] = 11-p(1-p) + 1p(p) = 1-p1-p + pp = 2


Simulation Solution

This problem can be simulated but needs some tricks to track the ‘flipping-over’ of the bit sequences. The plots for the different probabilities are shown below.

A trivia : Why do the plots for p = 0.05 and p = 0.95 look similar? Shouldn’t the plots be flipping over when the probabilities have flipped over? I would allow you to ponder on that. If you are not able to figure out, please drop a comment and I will try responding.


Code

Used python for the simulation and Mathematica for the plots. Both are available below.

from sys import stdout
import csv
from mathematica.sheldon_ross_support_functions import randomChoiceCustom

_folderPath = 'D:\\Mathematica Files 4K\\sheldon_ross\\sheldon_ross_chapter_03\\sheldon_ross_10_exercise_3.031\\'

for _p in [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95]:

    _csvfile = open(_folderPath + 'sheldon_ross_10_exercise_3.031_' + _p.__str__() + '.csv', 'w', newline='')
    _csvWriter = csv.writer(_csvfile)

    _steps = 10000
    _subSteps = _steps // 100
    _progressPadTop = "-" * 106
    _progressPadBottom = "-" * 106
    _masterList = []

    print("\n\nWriting data to the file :" + _csvfile.name)
    print(_progressPadTop)

    _masterSequenceFlips = []
    for i in range(0, _steps):
        __games = []

        _bitSequence = []
        _tracker = 0
        while True:
            _bitSequence.append(randomChoiceCustom(_p))
            if _bitSequence.__len__() > 1 and _bitSequence[-1] != _bitSequence[-2]:
                _tracker += 1
            if _tracker == 2:
                del _bitSequence[-1]
                _csvWriter.writerow(_bitSequence)
                break

        if (i + 1) % _subSteps == 0:
            stdout.write("\r" + ("\r" + ("#" * (i // _subSteps + 1)).ljust(100)) + " " + str(i // _subSteps + 1) + " %")
            stdout.flush()

    print("\n" + _progressPadBottom)
    print("Closing the file: " + _csvfile.name)
    _csvfile.close()

Module[{files =
    SortBy[FileNames["*.csv", NotebookDirectory[]],
      ToExpression[StringDelete[Last[StringSplit[#, "_"]], ".csv"]] &],
  labels},
  labels = StringDelete[Last[StringSplit[#, "_"]], ".csv"] & /@ files;
  Grid@Partition[
    MapThread[
      DistributionChart[
        Transpose[(Length /@ Gather[#]) & /@ Import[#1]],
        PlotLabel -> "\nProbability " <> #2,
        ImageSize -> 300] &, {files, labels}], 2]]

End of the post 😉


.

.

.

.

.

.

.

.

.

.

.

.

.

.