Probability that the sum of $k$ distinct integers selected from $1, 2, \dots, n$ is divisible by $n$

Question

Would you please help me solve Problem 7 of Section 4.5, "Combinatorial Number Theory," in An Introduction to the Theory of Numbers, Niven, Zuckerman, Montgomery, 5th ed., Wiley (New York), 1991:

Let $n$ and $k$ be positive, relatively-prime integers with $n > k$. Prove that if $k$ distinct integers are selected at random from $1, 2, \dots, n$, the probability that their sum is divisible by $n$ is $1/n$.

The authors describe three methods for solving combinatorial problems: (1) the pigeonhole principle, (2) the one-to-one correspondence procedure, in which elements in a finite set or between two sets are paired off to determine the number of elements, and (3) the inclusion-exclusion principle. I know that the denominator of the probability fraction is $n \choose k$, so I tried (without success) to use the second method to count the number of desired outcomes to be $(n - 1)!/((n - k)! k!)$.

Because $n$ and $k$ are relatively prime, I know that $n \choose k$ is divisible by $n$. In that case, numerical evidence suggests that the sum in question is uniformly distributed modulo $n$. If I can prove that conjecture, I am done.

I found similar problems with particular values for $n$ and $k$, but the solutions do not help me solve this general case.

score 6 · Accepted Answer · answered Sep 05 '17 at 16:52

6

Think of the numbers as integers modulo~$n$. Consider the map $A=\{a_1,a_2,\ldots,a_k\}\mapsto\{a_1+1,a_2+1,\ldots,a_k+1\}$. This adds $k$ modulo $n$ to the sum of the elements of $A$.

answered Sep 05 '17 at 16:52

Angina Seng

158,341

1

Can you explain more? How does this solve the problem? – C.S. Jun 02 '18 at 15:32

Marko Riedel · Answer 2 · 2022-11-06T00:02:01.970

The exponential formula tells us that the cycle index $Z(P_k)$ of the unlabeled set operator

$$\def\textsc#1{\dosc#1\csod} \def\dosc#1#2\csod{{\rm #1{\small #2}}}\textsc{SET}$$

on $k$ slots has OGF

$$Z(P_k) = [w^k] \exp\left(\sum_{l\ge 1} (-1)^{l-1} a_l \frac{w^l}{l}\right).$$

The desired probability is given by

$${n\choose k}^{-1} \frac{1}{n} \sum_{p=0}^{n-1} \left. Z(P_k) \left(\sum_{q=1}^n z^q\right)\right|_{z=\exp(2\pi ip/n)}.$$

This is

$${n\choose k}^{-1} \frac{1}{n} \sum_{p=0}^{n-1} \left. [w^k] \exp\left(\sum_{l\ge 1} (-1)^{l-1} \left(\sum_{q=1}^n z^{ql}\right) \frac{w^l}{l}\right)\right|_{z=\exp(2\pi ip/n)}.$$

Evaluating the contribution for $p=0$ first we get

$${n\choose k}^{-1} \frac{1}{n} [w^k] \exp\left(\sum_{l\ge 1} (-1)^{l-1} n \frac{w^l}{l}\right) \\ = {n\choose k}^{-1} \frac{1}{n} [w^k] \exp\left(n \log(1+w)\right) \\ = {n\choose k}^{-1} \frac{1}{n} [w^k] (1+w)^n = {n\choose k}^{-1} \frac{1}{n} {n\choose k} = \frac{1}{n}.$$

It remains to evaluate the contribution from $1\le p\le n-1.$ Now for these $p$ if $l$ is a multiple of $m = n/\gcd(p, n)$ we have

$$\sum_{q=1}^n \exp(2\pi ip/n)^{ql} = n.$$

We get zero otherwise. This yields for the remaining terms without the scalar in front

$$\sum_{p=1}^{n-1} [w^k] \exp\left(\sum_{l\ge 1} (-1)^{ml-1} n \frac{w^{ml}}{ml} \right) = \sum_{p=1}^{n-1} [w^k] \exp\left(-\frac{n}{m} \sum_{l\ge 1} \frac{(-w)^{ml}}{l} \right) \\ = \sum_{p=1}^{n-1} [w^k] \exp\left(-\frac{n}{m}\log\frac{1}{1-(-w)^m}\right) = \sum_{p=1}^{n-1} [w^k] (1-(-w)^{n/\gcd(p, n)})^{\gcd(p, n)} \\ = \sum_{p=1}^{n-1} [w^k] (1+(-1)^{1+n/\gcd(p, n)} w^{n/\gcd(p, n)})^{\gcd(p, n)}.$$

Using an Iverson bracket this becomes

$$\sum_{p=1}^{n-1} [[n/\gcd(p,n)|k]] \times {\gcd(p, n)\choose k\gcd(p,n)/n} (-1)^{k\gcd(p,n)/n+k}.$$

Putting it all together we thus obtain

$$\bbox[5px,border:2px solid #00A000]{ \frac{1}{n} + (-1)^k {n\choose k}^{-1} \frac{1}{n} \sum_{p=1}^{n-1} [[n/\gcd(p,n)|k]] \times {\gcd(p, n)\choose k\gcd(p,n)/n} (-1)^{k\gcd(p,n)/n}.}$$

Note that $n/\gcd(p,n)$ is a divisor of $n$ that is at least two (we would need $p=n$ to get $n/\gcd(p,n) = 1$ but $p\lt n$). This means when $\gcd(k, n) = 1$ the Iverson bracket fails for all $p$ and only the first term remains, which is what we wanted to prove.

As a sanity check we evaluate the formula when $k=n.$ The sum of the one subset is $n(n+1)/2$ so we should get for the probability one when $n$ is odd and zero when it is even. We find

$$\frac{1}{n} + (-1)^n {n\choose n}^{-1} \frac{1}{n} \sum_{p=1}^{n-1} {\gcd(p, n)\choose \gcd(p,n)} (-1)^{\gcd(p,n)} \\ = \frac{1}{n} + (-1)^n \frac{1}{n} \sum_{p=1}^{n-1} (-1)^{\gcd(p,n)} = (-1)^n \frac{1}{n} \sum_{p=1}^{n} (-1)^{\gcd(p,n)} \\ = (-1)^n \frac{1}{n} \sum_{d|n} \sum_{\gcd(p,n) = d} (-1)^d = (-1)^n \frac{1}{n} \sum_{d|n} (-1)^d \sum_{\gcd(q,n/d) = 1} 1 \\ = (-1)^n \frac{1}{n} \sum_{d|n} (-1)^d \varphi(n/d).$$

We evaluate this using formal Dirichlet series, starting from $\sum_{d|n} \varphi(d) = n$ which yields

$$\sum_{n\ge 1} \frac{\varphi(n)}{n^s} = \frac{\zeta(s-1)}{\zeta(s)}.$$

Furthermore we have

$$\sum_{n\ge 1} \frac{(-1)^n}{n^s} = -\left(1-\frac{2}{2^s}\right)\zeta(s).$$

This means that

$$\sum_{n\ge 1} \frac{1}{n^s} \sum_{d|n} (-1)^d \varphi(n/d) = -\left(1-\frac{2}{2^s}\right) \zeta(s-1).$$

Now we have for $n$ even

$$- (-1)^n \frac{1}{n} [n^{-s}] \left(1-\frac{2}{2^s}\right) \zeta(s-1) = - (-1)^n \frac{1}{n} (n - 2 [(n/2)^{-s}] \zeta(s-1)) \\ = - (-1)^n \frac{1}{n} (n - 2 \times n/2) = 0.$$

We obtain zero as required. On the other hand with $n$ odd we find

$$- (-1)^n \frac{1}{n} [n^{-s}] \left(1-\frac{2}{2^s}\right) \zeta(s-1) = - (-1)^n \frac{1}{n} [n^{-s}] \zeta(s-1) \\ = - (-1)^n \frac{1}{n} \times n = 1,$$

again as required. This concludes the sanity check. and indeed the entire argument.

Remark. Some of this material is duplicate where I have not found the links however.

Addendum. The sum can be simplified. Starting with

$$\sum_{p=1}^{n-1} [[n/\gcd(p, n)|k]] \times {\gcd(p,n) \choose k \gcd(p,n)/n} (-1)^{k\gcd(p,n)/n}.$$

With the GCD being $d$ we get

$$\sum_{d|n \wedge d \lt n} \sum_{\gcd(p,n) = d} [[n/d|k]] \times {d \choose kd/n} (-1)^{kd/n}.$$

where we retain $1\le p\le n-1$. This is

$$\sum_{d|n \wedge d \lt n} \sum_{\gcd(q,n/d) = 1} [[n/d|k]] \times {d \choose kd/n} (-1)^{kd/n} \\ = \sum_{f|k \wedge f\gt 1} \sum_{d|n \wedge d \lt n} \sum_{\gcd(q,n/d) = 1, n/d=f} {d \choose kd/n} (-1)^{kd/n}$$

Here we have $1\le qd\le n-1$ but since $d\lt n$ we may actually use $1\le qd\le n$ to obtain (this is because $\gcd(n/d,n/d) = n/d \gt 1$)

$$\sum_{d|n \wedge d \lt n \wedge n/d|k} \sum_{\gcd(q,n/d) = 1} {d \choose kd/n} (-1)^{kd/n} = \sum_{d|n \wedge d \lt n \wedge n/d|k} \varphi(n/d) {d \choose kd/n} (-1)^{kd/n} \\ = \sum_{d|n \wedge d \gt 1 \wedge d|k} \varphi(d) {n/d \choose k/d} (-1)^{k/d} = \sum_{d|\gcd(n,k) \wedge d \gt 1} \varphi(d) {n/d \choose k/d} (-1)^{k/d}.$$

Merge this into the boxed formula to get

$$\frac{1}{n} + (-1)^k {n\choose k}^{-1} \frac{1}{n} \sum_{d|\gcd(n,k) \wedge d \gt 1} \varphi(d) {n/d \choose k/d} (-1)^{k/d}$$

or alternatively

$$\bbox[5px,border:2px solid #00A000]{ (-1)^k {n\choose k}^{-1} \frac{1}{n} \sum_{d|\gcd(n,k)} \varphi(d) {n/d \choose k/d} (-1)^{k/d}.}$$

Probability that the sum of $k$ distinct integers selected from $1, 2, \dots, n$ is divisible by $n$

2 Answers2

Linked