
I was looking at various extensions of the classical Coupon Collector problem, but couldn't find any answers or hints for the following modification:

Assume there are $n$ distinct coupons and you get them in batches of random size. That is, each time you receive a black box containing between $0$ and $n$ coupons, and the coupons within a box are distinct. The probability that a box contains a specific coupon is $p$ (the same for every coupon). What is the expected number of black boxes one needs to open to collect all coupons?

Vika

2 Answers


I'll assume that you intended to imply that each coupon independently has probability $p$ to be in any given black box.

Then each black box amounts to $n$ independent Bernoulli trials, one per coupon, each with success probability $p$, and we are looking for the expected time until all of them have succeeded at least once. The probability that a given coupon has not been collected after $k$ black boxes is $(1-p)^k$, so it has been collected with probability $1-(1-p)^k$. By independence, the probability that all coupons have been collected is $\left(1-(1-p)^k\right)^n$, so the probability that they haven't all been collected is $1-\left(1-(1-p)^k\right)^n$. If $T$ denotes the number of black boxes required, this is $\mathbb P(T\gt k)$, and since $T$ takes non-negative integer values, $\mathbb E[T]=\sum_{k=0}^\infty\mathbb P(T\gt k)$. The expected number of black boxes required to collect all coupons is therefore the sum of these probabilities:

\begin{align} \sum_{k=0}^\infty\left(1-\left(1-(1-p)^k\right)^n\right) &=\sum_{k=0}^\infty\left(1-\sum_{j=0}^n\binom nj\left(-(1-p)^k\right)^j\right) \\ &=\sum_{k=0}^\infty\sum_{j=1}^n(-1)^{j-1}\binom nj(1-p)^{jk} \\ &=\sum_{j=1}^n(-1)^{j-1}\binom nj\sum_{k=0}^\infty(1-p)^{jk} \\ &=\sum_{j=1}^n(-1)^{j-1}\binom nj\frac1{1-(1-p)^j}\;. \end{align}
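As a numerical sanity check on this derivation, here is a minimal Python sketch that evaluates the closed form, sums the tail probabilities directly, and runs a small Monte Carlo simulation. The function names, the tolerance, the number of trials and the seed are illustrative choices, not part of the original answer. For large $n$ the alternating sum probably suffers from floating-point cancellation, so the sketch is only meant for small $n$.

```python
from math import comb
import random


def expected_boxes(n, p):
    """Closed form: sum_{j=1}^n (-1)^(j-1) * C(n, j) / (1 - (1-p)^j)."""
    q = 1.0 - p
    return sum(
        (-1) ** (j - 1) * comb(n, j) / (1.0 - q ** j)
        for j in range(1, n + 1)
    )


def expected_boxes_tail_sum(n, p, tol=1e-12):
    """Sum P(T > k) = 1 - (1 - (1-p)^k)^n over k = 0, 1, 2, ... until terms drop below tol."""
    q = 1.0 - p
    total = 0.0
    k = 0
    while True:
        term = 1.0 - (1.0 - q ** k) ** n
        if term < tol:
            break
        total += term
        k += 1
    return total


def simulate(n, p, trials=20000, seed=0):
    """Monte Carlo estimate: open boxes until no coupon is missing."""
    rng = random.Random(seed)
    total_boxes = 0
    for _ in range(trials):
        missing = n
        boxes = 0
        while missing > 0:
            boxes += 1
            # Each still-missing coupon is in this box independently with probability p.
            missing -= sum(1 for _ in range(missing) if rng.random() < p)
        total_boxes += boxes
    return total_boxes / trials


if __name__ == "__main__":
    n, p = 5, 0.3
    print("closed form:", expected_boxes(n, p))
    print("tail sum:   ", expected_boxes_tail_sum(n, p))
    print("simulation: ", simulate(n, p))
```

The three numbers should agree up to truncation and sampling error. The simulation only tracks how many coupons are still missing, which is enough because coupons that were already collected do not affect the stopping time.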

The same result can also be derived using the maximum-minimums identity. The time it takes to collect all coupons is the maximum of the times $X_i$, where $X_i$ is the time it takes to collect the $i$-th coupon:

\begin{align} \def\ex#1{\mathbb E\left[#1\right]} \ex{\max_iX_i} &=\sum_i\ex{X_i}-\sum_{i\lt j}\ex{\min\{X_i,X_j\}}+\sum_{i\lt j\lt k}\ex{\min\{X_i,X_j,X_k\}}-\cdots \\ &=\sum_{j=1}^n(-1)^{j-1}\binom nj\frac1{1-(1-p)^j}\;, \end{align}

since the expected value of the minimum of the collection times of $j$ particular coupons is the expected time it takes for a Bernoulli trial with success probability $1-(1-p)^j$ to succeed, which is $\frac1{1-(1-p)^j}$, and there are $\binom nj$ ways to choose the $j$ coupons. Equivalently, we could use inclusion-exclusion to find the probability that not all coupons have been collected and then sum over $k$ as above.
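As a quick check of the formula for small cases (a worked illustration, not part of the original derivation): for $n=1$ it reduces to the mean $1/p$ of a geometric distribution, and for $n=2$ it gives

\begin{align} \frac2p-\frac1{1-(1-p)^2} &=\frac2p-\frac1{p(2-p)} \\ &=\frac{3-2p}{p(2-p)}\;, \end{align}

which equals $1$ for $p=1$ (a single box then contains both coupons) and $\frac83$ for $p=\frac12$.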

joriki

Hint 1: The expected number of coupons in a single box is $E[X]$. The expected number of coupons in two boxes is $E[X+X] = E[X] + E[X] = 2E[X]$. You can extend this to the expected number of coupons in $m$ boxes. So you can calculate the expected number of boxes required to get a total of $y$ coupons.

Hint 2: There is a formula for counting the number of coupons required to collect all $n$ coupons. Once you calculate this, you can replace $y$ in Hint 1.

ashleydc
  • Am I right in thinking that it will be about $\frac{n \ln n}{K}$, where $K$ is the expected number of coupons in a box, that is, $K = np$? – Vika Apr 06 '16 at 17:23
  • @Vika the expected number of coupons in a box is $K = \frac{n+1}{2}$, other than that, you're correct. – ashleydc Apr 06 '16 at 18:14
  • Why doesn't the answer depend on $p$ then? If $p$ is really small, then I think we will need to buy more boxes. – Vika Apr 06 '16 at 19:29
  • @Vika My apologies, I mis-read the question and didn't see the part about the coupons in the black box being distinct. I thought that there could be duplicate coupons. I'm taking a second look at this and will edit my answer if I can come up with an approach – ashleydc Apr 07 '16 at 02:44
  • I was confused because when I read that there could be between $1$ and $n$ coupons in a box, I thought that each box size was equally likely (for example, that the probability of receiving a box with $n$ coupons was $\frac{1}{n}$). It looks like the problem is actually stating that there is $1$ box with $n$ coupons, $n$ boxes with $1$ coupon, ... Is this correct? – ashleydc Apr 07 '16 at 03:07
  • The probability of having a specific coupon in a box is $p$ and is the same for all coupons. So, the number of coupons can be 0, 1, 2, ..., n. There is no guarantee that one box will contain $n$ coupons. But there cannot be any repetitions of coupons within a box. For example, one box can have 2 coupons: one of type 1 and one of type 3. The second can have 3 coupons: one of type 1, one of type 3 and one of type 4. There are n types. – Vika Apr 13 '16 at 01:34
  • What I'm trying to clarify is this. Suppose you have $5$ coupons. There is one possible way to have a box with all $5$ coupons in it. But there are $5$ different ways to have a box with a single coupon in it, because a box with a single coupon can have any of the $5$ coupons. This means there's a higher probability of getting a box with $1$ coupon than there is of getting a box with all $5$ in it. Is that a correct way to interpret the problem you're trying to solve? – ashleydc Apr 13 '16 at 18:28
  • Apart from the issues of interpretation of the question discussed in the comments, the switch from expected number of coupons to expected number of boxes seems unwarranted. – joriki May 21 '16 at 11:58