
Suppose I buy a big box of assorted sweets containing 10 distinct types in equal numbers. I am separating them out into party bags for a party. I make up 100 party bags, each containing four sweets of four different types. There are therefore 400 sweets in total, with 40 of each variety distributed amongst the bags.

The question is, if I look inside $n$ bags (without replacement), what is the probability that I will have seen every variety of sweet?

I appreciate that if I knew the exact tally of each sweet variety I wanted to find, I could use a multivariate hypergeometric distribution. This is a version of the coupon collector's problem, but without replacement. Is it the case that this problem isn't easily solvable analytically? Am I doomed to be unable to do this with so many variables? Any advice or input would be greatly appreciated. Perhaps calculating a Chao index is required?
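One way to make the question concrete is inclusion-exclusion over the sets of varieties that could be missed. If the contents of every bag are known, the probability of having seen all varieties after opening $n$ of the $N$ bags is $\sum_{S} (-1)^{|S|} \binom{m_S}{n} / \binom{N}{n}$, where $m_S$ is the number of bags containing none of the varieties in $S$. The sketch below is only an illustration under that assumption: the function name `prob_all_seen` and the cyclic bag layout are hypothetical, chosen so that each of the 10 types appears in exactly 40 of the 100 bags. The answer depends on how the types are grouped into bags, not just on the totals.

```python
from itertools import combinations
from math import comb


def prob_all_seen(bags, n, num_types):
    """Exact P(every type seen) when n bags are opened without replacement.

    Assumes the contents of every bag are known. Uses inclusion-exclusion
    over the subsets S of types that might all be missed.
    """
    total = len(bags)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the number of bags")
    bag_sets = [frozenset(b) for b in bags]
    acc = 0  # exact integer numerator
    for k in range(num_types + 1):
        for missing in combinations(range(num_types), k):
            s = set(missing)
            # m_S: bags that contain none of the types in S
            avoid = sum(1 for b in bag_sets if b.isdisjoint(s))
            # comb(avoid, n) is 0 when n > avoid
            acc += (-1) ** k * comb(avoid, n)
    return acc / comb(total, n)


# Hypothetical layout (an assumption, not given in the question):
# bag i holds types i, i+1, i+2, i+3 (mod 10), so each type is in 40 bags.
NUM_TYPES = 10
bags = [[(i + j) % NUM_TYPES for j in range(4)] for i in range(100)]

if __name__ == "__main__":
    for n in (3, 5, 10, 20):
        print(n, prob_all_seen(bags, n, NUM_TYPES))
```

With 10 varieties there are only $2^{10} = 1024$ subsets to visit, so the sum is cheap here; the cost doubles with each extra variety.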

  • This is the coupon collector problem where you draw by fours. – Ross Millikan May 10 '18 at 19:10
  • Without replacement though? I thought the CC problem was explicitly for replacement? – Zack Ashman May 10 '18 at 19:13
  • Yes, although replacement will not matter much here as you have so many of each kind. Whether there are $40$ or $38$ of each kind to draw from is a small change. If you are interested in that level of accuracy, I will reopen it. Another approximate approach is to compute the chance you have found a particular kind, then take the tenth power of that for the chance you have seen them all. That ignores that having seen one makes seeing another slightly less likely. – Ross Millikan May 10 '18 at 19:26
  • Many thanks for this, but I do unfortunately need this greater level of accuracy; the above problem is just an example of a more general framework that I need for a lot of sampling simulation problems! I will edit the question title to make it clearer that I am looking for solutions to a CC problem without replacement. – Zack Ashman May 10 '18 at 22:56
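The approximation suggested in the comments (the chance of having found one particular kind, raised to the tenth power) can be sketched as follows. Since each bag holds four different types, a given type sits in 40 of the 100 bags, so the chance of missing it in $n$ bags drawn without replacement is $\binom{60}{n} / \binom{100}{n}$. The function name `approx_prob_all_seen` and its parameter names are illustrative assumptions, and the result treats the varieties as independent, which they are not.

```python
from math import comb


def approx_prob_all_seen(n, total_bags=100, bags_with_type=40, num_types=10):
    """Approximate P(every type seen) by treating the types as independent.

    p_miss is the exact hypergeometric chance that one given type is absent
    from all n opened bags; the final power is the approximation step.
    """
    # comb returns 0 once n exceeds the number of bags lacking the type
    p_miss = comb(total_bags - bags_with_type, n) / comb(total_bags, n)
    return (1 - p_miss) ** num_types


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(n, approx_prob_all_seen(n))
```

Note that this approximation gives a small positive value for $n = 1$ and $n = 2$, even though fewer than three bags of four sweets can never cover ten varieties, so it is least reliable for small $n$.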
