18

I have tried something to solve the series $$\binom{n}{1}^2+2\binom{n}{2}^2 + 3\binom{n}{3}^2 + 4\binom{n}{4}^2+\cdots + n\binom{n}{n}^2.$$ My approach is : $$(1+x)^n=\binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n.$$ Differentiating the above equation $$n(1+x)^{n-1} = \binom{n}{1} + \binom{n}{2}x + \cdots + n\binom{n}{n}x^{n-1}$$

Also, $$ \left(1+\frac{1}{x}\right)^n =\binom{n}{0} + \binom{n}{1}\frac{1}{x} + \binom{n}{2}\left(\frac{1}{x}\right)^2 + \cdots + \binom{n}{n}\left(\frac{1}{x}\right)^n$$ Multiplying above two equation I get, $$\begin{align*} &{n(1+x)^{n-1}\left(1 + \frac{1}{x}\right)^n}\\ &\quad= \left( \binom{n}{1}^2 + 2\binom{n}{2}^2 + 3\binom{n}{3}^2 + 4\binom{n}{4}^2 + \cdots + n\binom{n}{n}^2\right)\left(\frac{1}{x}\right) + \text{other terms} \end{align*}$$

So I can say that coefficient of $\frac{1}{x}$ in expansion of $n(1+x)^{n-1}(1+\frac{1}{x})^n$ will give me the required answer.

Am I doing it correct,please correct me if I'm wrong ?

If I'm right,please tell me how to calculate the coefficient of $\frac{1}{x}$ ?

Based on the answers,I tried to implement the things in a C++ code.

I tried implementing the code using extended euclidean algorithm so that the problem of truncated division can be eliminated but still not abled to figure out why am I getting wrong answer for n>=3. This is my updated code : http://pastebin.com/imS6rdWs I'll be thankful if anyone can help me to figure out what's wrong with this code.

Thanks.

Solution:

Finally abled to solve the problem.Thanks to all those people who spent their precious time for my problem.Thanks a lot.This is my updated code :

http://pastebin.com/WQ9LRy6F

4 Answers4

18

First recall that the coefficient of $x^n$ in $(1+x)^n(1+x)^n=(1+x)^{2n}$ implies $$ \begin{align} \sum_{k=0}^n\binom{n}{k}^2 &=\sum_{k=0}^n\binom{n}{k}\binom{n}{n-k}\\ &=\binom{2n}{n}\tag{1} \end{align} $$ and then note that $$ \begin{align} \sum_{k=0}^nk\binom{n}{k}^2 &=\sum_{k=0}^nk\binom{n}{n-k}^2\\ &=\sum_{k=0}^n(n-k)\binom{n}{k}^2\tag{2} \end{align} $$ Adding the first and last parts of $(2)$ yields $$ \begin{align} 2\sum_{k=0}^nk\binom{n}{k}^2 &=n\sum_{k=0}^n\binom{n}{k}^2\\ &=n\binom{2n}{n}\tag{3} \end{align} $$ Therefore, $$ \sum_{k=0}^nk\binom{n}{k}^2=\frac{n}{2}\binom{2n}{n}\tag{4} $$

robjohn
  • 345,667
  • It looks great but I have one problem with this result.Actually I need to write a code in C to calculate S(result) where n can be as large as 10^6.Although I'm asked to print S % mod as final output where mod=10^9+7 but still calculating S which is derived will definitely overflow.So Can you help me on that part ? – code_hacker Mar 19 '12 at 17:35
  • 1
    @code_hacker: This identity might help $$ \begin{align} \binom{2n}{n} &=\frac{2n(2n-1)}{n^2}\binom{2(n-1)}{n-1}\ &=\frac{4n-2}{n}\binom{2(n-1)}{n-1} \end{align} $$ – robjohn Mar 19 '12 at 17:47
8

This kind of looks like you want to appeal to Vandermonde's convolution, or at least the method you'd use to prove it. It can be applied directly as follows:

Let $S = \sum\limits_{k=0}^n k\binom{n}{k}^2$ be the sum we want to compute. Note that $S = \sum\limits_{k=0}^n (n-k) \binom{n}{n-k}^2 = \sum\limits_{k=0}^n (n-k) \binom{n}{k}^2$. Therefore $2S = n\sum\limits_{k=0}^n \binom{n}{k}^2 = n \binom{2n}{n}$. Then

$$S = \frac{n}{2}\binom{2n}{n}.$$

Did
  • 279,727
  • It looks great but I have one problem with this result.Actually I need to write a code in C to calculate S(result) where n can be as large as 10^6.Although I'm asked to print S % mod as final output where mod=10^9+7 but still calculating S which is derived will definitely overflow.So Can you help me on that part ? – code_hacker Mar 19 '12 at 17:35
6

Building on the excellent work of @thomas-belulovich & @robjohn, but addressing your subsequently revealed goal....

You want $$ T(n) =S(n)-m\left\lfloor\frac{S(n)}m\right\rfloor =S(n)\text{ mod }m \qquad\text{for}\qquad S(n) =\frac{n}{2}{2n\choose n}. $$ Note that $$ \frac{S(n)}{S(n-1)} =\frac{n}{n-1}\cdot\frac{2n(2n-1)}{n^2} =2\frac{2n-1}{n-1}. $$ A brute force approach to this might be to calculate $$ R(k)=2(2k-1)(k-1)^{-1}\pmod m $$ for each $1<k\le n$ and then multiply modulo $m$ termwise to obtain $$ T(n)=S(1)\prod_{k=2}^nR(k). $$ Calculating each $R(k)=(4k-2)x$ requires $O(\log k)$ divisions using the extended Euclidean algorithm to find an $x$ so that $$ (k-1)x+my=1. $$

Extended Euclidean Algorithm: Given nonzero $a,b\in\mathbb{Z}$, find $x,y\in\mathbb{Z}$ so that $ax+by=\text{gcd}(a,b)$. (Adapted from David Joyner's book.)

  1. Set $\overline{u}=\langle{u_0,u_1,u_2}\rangle$ $\leftarrow\langle{a,1,0}\rangle$ and $\overline{v}\leftarrow\langle{b,0,1}\rangle$ (vectors in $\mathbb{Z}^3$). Set $\overline{v}\leftarrow-\overline{v}$ if $b<0$.

  2. While $v_0 \ne 0$, repeat:

  3. $\qquad$ Calculate the quotient $q=\text{quo}(u_0,v_0)=u_0-v_0\left\lfloor\frac{u_0}{v_0}\right\rfloor$.

  4. $\qquad$ Set $\quad\overline{w}=\overline{u}-q\overline{v},\quad$ then $\quad\overline{u}=\overline{v},\quad$ and then $\quad\overline{v}=\overline{w}.\quad$

  5. Return $a=u_0$, $x=u_1$, and $y=u_2$.

This is doable in not too many lines of C code, and it works as long as $m=10^9+7$ has no (prime) factors $p\le n$ (in fact, this $m$ is prime). If you needed a more efficient algorithm and $m$ were a composite product of distinct primes (which it isn't), you could use the prime factorization of $m$ and a nice fact about binomial coefficients modulo primes [Lukas 1878] to separately calculate residues $$ {a\choose b}\equiv{[a/p]\choose[b/p]}{a\mod p\choose b\mod p}\pmod p $$ modulo each factor $p$, and then recover $T(n)$ using the Chinese Remainder Theorem.

Here's a few pre-computed values: $$\matrix{ k& n=10^k &{2n\choose n}\text{ mod }m &T(n) \\0& 1 &2 &1 \\1& 10 &184756 &923780 \\2& 100 &407336795 &366839610 \\3& 1000 &72475738 &237868748 \\4& 10000 &703593270 &966325381 \\5& 100000 &879467333 &366342189 \\6& 1000000 &192151600 &799327475 }$$

If you want to find an efficient method to solve this problem, you'll probably want to look at the works of Donald Knuth, Andrew Granville, and Richard Stanley, who also has compiled outstanding lists of bijective proof problems and characterizations of the Catalan numbers $C_n=\frac1{n+1}{2n\choose n}$, to which our $S(n)$ are closely related since $S(n)={n+1\choose2}C_n$.


One might be tempted to try to shorten the computation using Wilson's theorem, $$ p\text{ prime} \quad\implies\quad (p-1)!\equiv-1\pmod p $$ Morley's (1895) congruence, $$ p=2n+1\text{ prime} \quad\implies\quad {2n\choose n}\equiv(-1)^n4^{2n}\pmod{p^3} $$ and/or Kummer's theorem using enough "landmark" primes near $n$ and $2n$, with extended Euclidean algorithm for inverses and the CRT (there's the catch!) to get the final result. For example, here are some prime pairs $q_i=2p_i+1$ near $n=10^6$ and $2n$: $$ \matrix{ i & p_i-n & q_i-2n \\ 1 & -251 & -501 \\ 2 & -191 & -381 \\ 3 & 151 & 303 \\ 4 & 193 & 387 \\ } $$ From Wilson's theorem for odd primes $q$, grouping $(q-1)!=(1\cdots\frac{q-1}2)(\frac{q+1}2\cdots q-1)$ and noting that the second term is $(-1)^\frac{q-1}2$ times the first term modulo $q$, we find that $$ \left(\left(\tfrac{q-1}{2}\right)!\right)^2 \equiv(-1)^{\frac{q+1}2} \pmod q $$ so that for prime pairs $q_i=2p_i+1$, $$ {2p_i\choose p_i}\equiv(-1)^{p_i}=-1\pmod{q_i}. $$ Thus we can compute (using the extended Euclidean algorithm for the inverses of $k$ modulo $q_i$) $$ {2n\choose n}\equiv-\prod_{k=p_i+1}^n\frac{4k-2}{k} \pmod{q_i} $$ for $i=1,2$ above. However, we cannot use the CRT to get $T(n)$. We would have to have enough prime pairs to reconstruct $S(n)$, and then compute its residue at the end. Since the central binomial coefficient is approximately $${2n\choose n}\approx\frac{4^n}{\sqrt{\pi n}}\left(1-\frac1{nc_n}\right),\qquad c_n\in(8,9)$$ we would need about 96 thousand pairs (p,q), making this approach infeasible.

bgins
  • 7,990
  • Based on the recursive formula : S(n)/S(n-1) = (4n-2)/(n-1) why shouldn't I pre compute S(n) for all n upto 10^6.Can I do like that ? – code_hacker Mar 19 '12 at 19:51
  • Of course, you can precompute these if it suits your needs. Please not I made some corrections to the ratio $S(n)/S(n-1)$ and hence also (my) $R(n)$. – bgins Mar 19 '12 at 20:41
  • Wow I learnt more from this answer. My +1 for this. – Kirthi Raman Mar 19 '12 at 21:16
  • @bgins : I tried to find the multiplicative inverse using extended euclidean algorithm but it's giving me wrong answer for n>=3 In fact the corresponding x(multiplicative inverse) at every step is very large.Can you please tell me where am I doing wrong ? One thing I'm sure of is that my extended_euclid() function is correct as I have tested and implemented it on the description and example of wikipedia.Thanks in advance. Link to the updated code : http://pastebin.com/imS6rdWs – code_hacker Mar 20 '12 at 07:55
  • You have one or two issues left. Your xgcd needs to use signed types since you are subtracting. Also, as a general purpose routine, it's not returning inverses when the gcd is greater than one (which shouldn't be a problem here since $m$ is prime and $0<n<m$). Do you really need this in C++? Why not try sage or sage online? Type xgcd? or inverse_mod? for example (and binomial(2*n,n) % m for a direct, costly, arbitrary precision calculation). Although right now the public server seems to be loaded down. – bgins Mar 20 '12 at 13:09
  • @bgins : I think problem is much more than expected.I modified my code and now using only long long data types but still getting absurd results for n>=3.Here is my updated code: http://pastebin.com/FsaZrxe1 One more thing,when I'm calculating x(multiplicative inverse) it sometimes comes out to be negative.why is it coming negative ? Can you please explain the idea of this x in calulcations.On getting x,should I simply multiply it with (4k-2) and then take modulo of that.Is it right ? It might be irritating for you and I'm sorry for that.But I'm really stuck with this code.Please help me. – code_hacker Mar 20 '12 at 21:20
  • I got the c++ code working and it's quite fast. You just need to reduce mod $m$ sufficiently often, as @AndreNicolas recommended. Was this homework for math or computer science, or something else? – bgins Mar 20 '12 at 22:56
  • @code_hacker: so your background is computer science? or something else? The extended euclidean algorithm is a beautiful algorithm from elementary number theory. An example: gcd$(3,5)=2\cdot3-1\cdot5$. In general, when $a,b>0$, then $x$ & $y$ must have opposite signs. Because of the subtractions, it can produce negative numbers. But, miraculously as it may seem, $x$ and $y$ are never too large in magnitude, so a single addition by $m$ will "correct" a negative sign. It is generally necessary to reduce mod $m$ after every significant operation in order to avoid overflow, as mentioned before. – bgins Mar 20 '12 at 23:23
  • @bgins:Thanks finally abled to solve it.Overflow was occuring earlier which need to be reduced using mod m.Thanks for the help. – code_hacker Mar 21 '12 at 10:30
2

First we address the overflow issue. Note that $10^9+7$ is relatively prime to all the numbers that come up in a naive calculation of $\binom{2n}{n}$. Indeed $10^9+7$ happens to be prime. So when we are calculating, we can always reduce modulo $10^9+7$ as often as necessary to prevent overflow.

Now to the identity. We have $n$ boys and $n$ girls. We want to choose $n$ people. The number of ways this can be done is $\binom{2n}{n}$. But we can choose $0$ boys and $n$ girls, or $1$ boy and $n-1$ girls, and so on. We can choose $k$ boys and $n-k$ girls in $\binom{n}{k}\binom{n}{n-k}$ ways, or equivalently in $\binom{n}{k}^2$ ways. This gives the standard derivation of the identity $$\binom{2n}{n}=\sum_{k=0}^n \binom{n}{k}^2.$$ Note now that the $\binom{2n}{n}$ choices are all equally likely. The expected number of boys is, by symmetry, equal to $\frac{n}{2}.$ But the probability that there are $k$ boys is $\frac{\binom{n}{k}^2}{\binom{2n}{n}}$, and therefore the expected number of boys is $$\sum_{k=0}^n k\frac{\binom{n}{k}^2}{\binom{2n}{n}}.$$ The term corresponding to $k=0$ is $0$, so can be omitted, and we get $$\sum_{k=1}^n k\frac{\binom{n}{k}^2}{\binom{2n}{n}}=\frac{n}{2},$$ which is essentially our identity.

Remark: There is a very nice book on bijective proofs called Proofs that Really Count. A title that so far doesn't seem to have been used is Mean Proofs.

André Nicolas
  • 507,029
  • This is the way to do things, and will definitely run fast enough. (You will have to implement a function that computes multiplicative inverse mod p, but that isn't terrible and runs very quickly.) – Thomas Belulovich Mar 19 '12 at 18:45
  • I have implemented the things in a C++ code.This is the link to the code http://pastebin.com/qxQsZyz2 but it's giving me wrong answer when I tried to submit it on online judge.Can any one please tell me where am I doing wrong ? – code_hacker Mar 19 '12 at 19:05
  • You're dividing by $i$ on line $17$, which is a truncated division. See Thomas's comment and my post about computing inverses modulo $m$ (my post) or mod (your OP notation). – bgins Mar 19 '12 at 19:28
  • Nice expected value argument! – Thomas Belulovich Mar 20 '12 at 18:41