2

I am looking for help analyzing a variant of the Coupon Collector Problem where the coupons can come from more than 1 source. And the probability of each a given coupon that comes from multiple sources might differ between the sources.

Example:

Assume there are 2 Unfair Dice.

  • Die 1 has sides {A, B, C, D, E, O} with probabilities {1/10, 1/10, 1/5, 1/5, 1/20, 7/20}

  • Die 2 has sides {E, F, O} with probabilities {1/5, 1/4, 11/20}

The Collector whishes to obtain at least 1 of {A, B, C, D, E, F} The Collector does not care if O is received, this is the probability that no coupon is received on a given roll.

The Collector wants to know what the is the expected number of rolls $E[N]$ of each die in order to complete their collection?

From this paper The Unequal Probability variant of the Coupon Collector Problem can be solved for each die with the following:

Die 1:

$$\int_0^\infty1-(1-e^\frac{-t}{10})^2(1-e^\frac{-t}{5})^2(1-e^\frac{-t}{20})dt \approx 26.275$$

Die 2:

$$\int_0^\infty1-(1-e^\frac{-t}{5})(1-e^\frac{-t}{4})dt \approx 6.778$$

However doing this would effectively double count the E coupon in the total expectation of dice rolls.

Removing Coupon E from both formulas yields:

Die 1:

$$\int_0^\infty1-(1-e^\frac{-t}{10})^2(1-e^\frac{-t}{5})^2dt = 16.25$$

Die 2:

$$\int_0^\infty1-(1-e^\frac{-t}{4})dt = 4$$

The issue I am looking for help on is how to resolve this double counting. How can I account for the probability that the Collector might have received Coupon E while rolling Die 1 or Die 2? How would one account for the shared probability of Coupon E in the $E[N]$ of each die?

EDIT: I want to add the my central question could probably be reframed as "What is the expected number of rolls of Die 2 given that Die 1 was rolled enough to complete the Die 1 specific Coupons to obtain both Coupon E and Coupon F?"

In my example; given that I rolled Die 1 16.25 times how many times am I expected to have to roll Die 2 to complete the collection?

  • Can you solve for the the best way to just get 1 E? Then the simple inclusion-exclusion rule should apply I would think, subtract the two – Alan Dec 28 '22 at 19:47
  • @Alan inclusion-exclusion seems like the right approach, but I am struggling to understand what my A, B and A+B is in this scenario. The best way to get just 1 E is to roll only die 2 with an $E[E] = 5$ – TheRealOne Dec 28 '22 at 21:55
  • Are you saying the two dice are rolled together, which constitutes one "roll of the dice" for counting purposes? – awkward Jan 02 '23 at 00:11
  • @awkward The dice are rolled one at a time. They can be rolled in any order. So for example one could choose to roll Die 1 until they have received coupons A - D and hope they get lucky getting E, then move on to rolling Die 2 until they get F and E if they did not get it from Die 1. – TheRealOne Jan 15 '23 at 01:48
  • I think your $16.25$ should be $16.5$? – joriki Mar 16 '23 at 19:28

1 Answers1

1

As noted in the comments and your edit, an optimal strategy is to roll die $1$ until you have all coupons specific to that die and then to roll die $2$ until you have all coupons. (The situation isn’t symmetric because die $2$ has the higher probability for the common coupon $E$, so doing it the other way around wouldn’t be optimal.)

To figure out how long this is expected to take, we need, in addition to the expected time required to collect the coupons specific to die $1$, the probability that when we have them we’re still missing coupon $E$. This is the probability that coupon $E$ is the last coupon collected in a complete collection from die $1$ (of course always ignoring coupon $O$). Due to the linearity of expectation, we don’t need to worry about the strong correlation between $E$ being the last coupon collected from die $1$ and the time required for the collection of the other coupons to finish. ($E$ is more likely to be collected last the earlier the other coupons are complete.)

The probability $p$ that $E$ is collected last from die $1$ is calculated in two different and quite instructive ways at Last coupon collected in the coupon collectors problem, once using inclusion–exclusion and once using Poissonization (the latter in more detail in this cstheory answer linked to there). The result is

$$ p=\sum_{S\subset R}(-1)^{|S|}\frac{p_E}{p_E+\sum_{m\in S}p_m}\;, $$

where $R=\{A,B,C,D\}$ is the set of the remaining coupons on die $1$. With your probabilities, this is

\begin{eqnarray} p &=& \frac11-\frac2{1+2}-\frac2{1+4}+\frac1{1+2+2}+\frac4{1+2+4}+\frac1{1+4+4} \\[5pt] &&-\frac2{1+2+2+4}-\frac2{1+2+4+4}+\frac1{1+2+2+4+4} \\[5pt] &=&1-\frac23-\frac25+\frac15+\frac47+\frac19-\frac29-\frac2{11}+\frac1{13} \\[5pt] &=&\frac{22016}{45045} \\[5pt] &\approx&49\%\;. \end{eqnarray}

Now we can calculate the time required for a full collection from both dice. First, by the standard result for a coupon collection with unequal probabilities, we have to roll die $1$ an expected number

$$ 10\left(\frac21+\frac22-\frac1{1+1}-\frac4{1+2}-\frac1{2+2}+\frac2{1+1+2}+\frac2{1+2+2}-\frac1{1+1+2+2}\right)\\ =10\left(2+1-\frac12-\frac43-\frac14+\frac12+\frac25-\frac16\right) \\=\frac{33}2 $$

of times to collect all coupons specific to that die. Then, with probability $p$ we’re still missing coupon $E$, so we have to roll die $2$ an expected number

$$ 20\left(\frac14+\frac15-\frac1{4+5}\right)=\frac{61}9 $$

of times to collect both $E$ and $F$; whereas with probability $1-p$ we already have coupon $E$ and only have to roll die $2$ an expected $4$ times to collect $F$.

Altogether, the expected number of rolls required for a full collection is

$$ \frac{33}2+\frac{22016}{45045}\cdot\frac{61}9+\left(1-\frac{22016}{45045}\right)\cdot4=\frac{3544481}{162162}\approx21.86\;. $$

Here’s Java code that checks this result by simulation.

joriki
  • 238,052
  • Thank you very much. It feels all too obvious now. To expand this further to scenarios where there might be more than one shared coupon and/or more than two dice; for example:

    Die 1 {A, B, C, D, E} Die 2 {E, F, G} Die 3 {F, G, H} ...

    One would need to employ this same inclusion-exclusion logic, but one level up to all of the various combinations of shared coupons and dice.

    – TheRealOne Aug 31 '23 at 01:33