
I'm trying to solve a problem similar to this one. I understand the answers to that problem, but my problem is the following.

Suppose there are $n$ bins, ordered and labeled from $0$ to $n-1$. At each step an index $j \in \{0, \dots, n-1\}$ is selected uniformly at random, and two bins are chosen at once: the $((j+c) \bmod n)$-th and the $((j-c) \bmod n)$-th, each of which receives a ball. On average, how many selections do I need to fill each of the $n$ bins with at least one ball? I was able to verify the answer to the linked question when the two bins are chosen at random: for $n=16$ the mean is approximately 27. In my setting, however, it is approximately 25 when $c = 7$. Could you please help me understand why the mean is 25 and not 27?

1 Answer


Here's some analysis that follows the simple and neat analysis of the coupon collector problem presented by Levin, Peres and Wilmer on page 23 in their book Markov Chains and Mixing Times.

The probability that the $i$th bin is not filled in time step $t$ is the probability that neither $(i + c) \bmod n$ nor $(i - c) \bmod n$ was picked as $j$ in time step $t$, i.e., $1 - \frac{2}{n}$ (assuming $2c \not\equiv 0 \pmod n$, so that these two indices are distinct).
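As a quick sanity check of this count, the sketch below enumerates every possible $j$ and records which choices put a ball in each bin. The function name `fillers` is hypothetical, and the code assumes that picking $j$ fills bins $(j+c) \bmod n$ and $(j-c) \bmod n$, as described in the question.

```python
def fillers(n, c):
    """Map each bin i to the set of choices j that put a ball in bin i."""
    hit_by = {i: set() for i in range(n)}
    for j in range(n):
        hit_by[(j + c) % n].add(j)
        hit_by[(j - c) % n].add(j)
    return hit_by


if __name__ == "__main__":
    n, c = 16, 7  # values taken from the question
    # Distinct numbers of fillers per bin; a single value 2 means
    # every bin is missed with probability 1 - 2/n in one step.
    print({len(js) for js in fillers(n, c).values()})
```

When $2c \equiv 0 \pmod n$ (for example $n = 16$, $c = 8$) the two bins coincide, each bin has only one filler, and the per-step miss probability becomes $1 - \frac{1}{n}$ instead.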

Let $\tau$ be the (random) number of steps taken for all the bins to be filled. Let $A_i(k)$ be the event that the $i$th bin was not filled in the first $k$ steps.

Note that the probability that $\tau > k$ is the same as the probability that there exists a bin that was not filled in the first $k$ steps, i.e.,

$P(\tau > k) = P \left( \cup_{i=0}^{n-1} A_i(k)\right)$.

By the union bound we get $P(\tau > k) \leq \sum_{i=0}^{n-1} P(A_i(k)).$

For a given bin $i$, the probability of not filling it in $k$ independent throws is $\left(1 - \frac{2}{n}\right)^k$. So we get

$P(\tau > k) \leq n\left(1 - \frac{2}{n}\right)^k \leq n e^{-2k/n}.$
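To get a feel for the size of this bound, the following sketch evaluates both sides numerically. It also uses the standard identity $E[\tau] = \sum_{k \geq 0} P(\tau > k)$ for nonnegative integer-valued $\tau$, together with the trivial bound $P(\tau > k) \leq 1$, to turn the tail bound into an upper bound on the mean. The function names are hypothetical, and the code assumes $n \geq 3$.

```python
import math


def union_bound(n, k):
    """Union bound on P(tau > k): n * (1 - 2/n)^k."""
    return n * (1 - 2 / n) ** k


def exp_bound(n, k):
    """Weaker closed-form bound: n * exp(-2k/n)."""
    return n * math.exp(-2 * k / n)


def mean_upper_bound(n, tol=1e-12):
    """Upper bound on E[tau] obtained by summing min(1, union_bound(n, k)).

    The sum is truncated once a term drops below tol.
    """
    total, k = 0.0, 0
    while True:
        term = min(1.0, union_bound(n, k))
        if term < tol:
            return total
        total += term
        k += 1


if __name__ == "__main__":
    n = 16  # value taken from the question
    for k in (10, 20, 30, 40):
        print(k, union_bound(n, k), exp_bound(n, k))
    print("upper bound on the mean:", mean_upper_bound(n))
```

Because it comes from a union bound, the resulting number only bounds the mean from above; it says nothing about how the dependence between the two bins changes the true mean.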

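Taking $k = \frac{n}{2} (\log n + c)$, where $c$ now denotes an arbitrary constant rather than the offset from the question, we get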

$P\left(\tau > \frac{n}{2} (\log n + c)\right) \leq e^{-c}.$
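Spelled out, the substitution works as follows, with $c$ being the constant in the tail bound (not the bin offset) and $k = \frac{n}{2}(\log n + c)$, so that $\frac{2k}{n} = \log n + c$:

$n e^{-2k/n} = n e^{-(\log n + c)} = n \cdot \frac{1}{n} \cdot e^{-c} = e^{-c}.$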

So the coupon collector time in this case is unlikely to exceed $\frac{n}{2} \log n$ by much, and $\frac{n}{2} \log n$ is exactly the time expected if the coupon collector process were run at "double speed", i.e., if two balls were thrown independently at each step. The dependence present in your situation is not accounted for, since we have used only a simple union bound. My guess is that it might be hard to show that this process is faster than the "double speed" coupon collector process and, in fact, it may be slower.
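One way to probe the comparison with the "double speed" process empirically is a small Monte Carlo experiment. The sketch below is only an illustration under stated assumptions: $j$ is uniform on $\{0, \dots, n-1\}$, the "double speed" process throws two independent uniform balls per step (so they may land in the same bin), and all function names are hypothetical.

```python
import random


def paired_steps(n, c, rng):
    """Steps until all bins are filled when each step fills (j+c)%n and (j-c)%n."""
    filled = [False] * n
    remaining = n
    steps = 0
    while remaining:
        j = rng.randrange(n)
        for b in ((j + c) % n, (j - c) % n):
            if not filled[b]:
                filled[b] = True
                remaining -= 1
        steps += 1
    return steps


def independent_steps(n, rng):
    """Steps until all bins are filled when each step throws two independent balls."""
    filled = [False] * n
    remaining = n
    steps = 0
    while remaining:
        for b in (rng.randrange(n), rng.randrange(n)):
            if not filled[b]:
                filled[b] = True
                remaining -= 1
        steps += 1
    return steps


def mean_steps(sampler, trials):
    """Average the step count returned by sampler() over the given number of trials."""
    return sum(sampler() for _ in range(trials)) / trials


if __name__ == "__main__":
    rng = random.Random(0)
    n, c, trials = 16, 7, 20000  # n and c taken from the question
    print("paired:     ", mean_steps(lambda: paired_steps(n, c, rng), trials))
    print("independent:", mean_steps(lambda: independent_steps(n, rng), trials))
```

A simulation like this can only suggest which process is faster for particular $n$ and $c$; it does not replace an argument that handles the dependence between the two bins.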