Here's a combinatorial proof of this fact, picked from
Andrews, George E., Identities in combinatorics. I: On sorting two ordered sets, Discrete Math. 11, 97-106 (1975). ZBL0301.05006.
(On a side note, check out the other parts of "Identities in Combinatorics").
I'm a fan of combinatorial arguments, and it is fascinating that one can prove intricate binomial identities using counting devices.
We begin with the following setup : let $S = \{s_1<s_2<\ldots<s_{n+m}\}$ be a set (of real numbers, say) of size $n+m$. Let $S_1,S_2$ be subsets of $S$, such that $|S_1| = n, |S_2| = m$ (Note : $S_1,S_2$ need not be disjoint). Suppose that $S_1 = \{s_1^1<s^1_2 < \ldots < s^1_n\}$ and $S_2 = \{s^2_1 < s^2_2 < \ldots < s^2_m\}$.
Consider the $n$ smallest elements of the union $S_1 \cup S_2$ (the union has at least $n$ elements, since it contains $S_1$). This set will be called the intermingling set in this answer.
Our quantity of interest is the cardinality of the intersection of the intermingling set with $S_2$, which we call the "intermingling coefficient". Note that the intermingling coefficient is between $0$ and $\min\{n,m\}$.
Examples : Suppose $S = \{1,2,3,\ldots, 10\}$, $n=6,m=4$, and $S_1 = \{2,3,4,5,6,8\}$ and $S_2 = \{3,4,5,7\}$. Then, $S_1 \cup S_2 = \{2,3,4,5,6,7,8\}$. The six smallest elements are $\{2,3,4,5,6,7\}$; this is the intermingling set. Of these, four (namely $3,4,5,7$) belong to $S_2$, so the intermingling coefficient is $4$.
On the other hand, suppose that $S,S_1$ are as above but $S_2 = \{1,8,9,10\}$. Then, $S_1 \cup S_2 = \{1,2,3,4,5,6,8,9,10\}$ and the six smallest numbers are $\{1,2,3,4,5,6\}$ (this is the intermingling set), of which only one belongs in $S_2$, hence the intermingling coefficient is $1$.
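The definition is easy to check mechanically. Here is a minimal Python sketch; the function name `intermingling` is just a label chosen for this illustration, and the sets are assumed to consist of mutually comparable elements such as integers.

```python
def intermingling(S1, S2):
    """Return (intermingling set, intermingling coefficient) of S1 and S2."""
    S1, S2 = set(S1), set(S2)
    n = len(S1)
    # The intermingling set: the n smallest elements of the union.
    mingling_set = set(sorted(S1 | S2)[:n])
    # The coefficient: how many of those elements lie in S2.
    return mingling_set, len(mingling_set & S2)


# The two examples above, with S1 = {2,3,4,5,6,8}.
print(intermingling({2, 3, 4, 5, 6, 8}, {3, 4, 5, 7})[1])
print(intermingling({2, 3, 4, 5, 6, 8}, {1, 8, 9, 10})[1])
```

The two printed coefficients are the $4$ and $1$ computed by hand above.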
We will prove our identity by counting, in two ways, the pairs of subsets $(S_1,S_2)$ of a fixed set $S$ with $|S| = m+n$, such that $|S_1| = n$ and $|S_2| = m$.
Indeed, since $|S| = m+n$ and $|S_1|=n,|S_2|=m$, one way of doing this is to treat the choice of $S_1,S_2$ as independent choices, and then combine those. That leads to $\binom{m+n}{n}$ choices of $S_1$ and $\binom{m+n}{m}$ choices for $S_2$, hence an answer of $\binom{m+n}{n}\binom{m+n}{m}$.
The other way : for each $0 \leq k \leq \min(n,m)$, we will find the number of pairs of subsets with $|S_1|=n,|S_2| = m$ and intermingling coefficient $k$.
Lemma : Given $|S| = m+n$, the number of pairs of subsets $S_1,S_2$ with $|S_1|=n,|S_2|=m$ having intermingling coefficient $k$ (where $0 \leq k \leq \min(n,m)$) equals $$
\binom{m}{k}\binom{n}{k} \binom{m+n+k}{k}
$$
Summing the lemma over all $k$ and comparing with the first count proves the identity. If $k>n$ or $k>m$ then that particular term of the "infinite" sum is $0$, because $\binom nk \binom mk = 0$. Thus the infinite sum terminates at $\min(n,m)$.
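Before going through the proof, the lemma can be sanity-checked by brute force for small $n,m$. The following Python sketch assumes $S = \{1,\ldots,n+m\}$ and uses helper names (`coefficient`, `count_by_coefficient`, `lemma_holds`) invented for this illustration; it is a numerical check, not a proof.

```python
from collections import Counter
from itertools import combinations
from math import comb


def coefficient(S1, S2):
    """Intermingling coefficient of S1 (size n) and S2."""
    smallest = sorted(set(S1) | set(S2))[:len(S1)]
    return sum(1 for s in smallest if s in S2)


def count_by_coefficient(n, m):
    """Count the pairs (S1, S2) in {1..n+m}, grouped by coefficient."""
    S = range(1, n + m + 1)
    counts = Counter()
    for S1 in combinations(S, n):
        for S2 in combinations(S, m):
            counts[coefficient(S1, set(S2))] += 1
    return counts


def lemma_holds(n, m):
    counts = count_by_coefficient(n, m)
    # Each class should have size C(m,k) C(n,k) C(m+n+k,k) ...
    per_k = all(
        counts[k] == comb(m, k) * comb(n, k) * comb(m + n + k, k)
        for k in range(min(n, m) + 1)
    )
    # ... and all classes together should give C(m+n,n) C(m+n,m) pairs.
    total = sum(counts.values()) == comb(m + n, n) * comb(m + n, m)
    return per_k and total


print(all(lemma_holds(n, m) for n in range(1, 4) for m in range(1, 4)))
```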
The main claim behind the proof
Suppose that $S = \{s_1<s_2<\ldots<s_{m+n}\}$. We will use a device originally due to Golomb to create sets that have intermingling coefficient $k$, for a fixed $k$.
Let $s_{m+n}<x_1<x_2<\ldots<x_k$ be a choice of $k$ real numbers, $0\leq k \leq \min(m,n)$. We will replace these later on with elements of $S$. Let $S' = S \cup\{x_1,x_2,\ldots,x_k\}$. Pick a subset $T$ of size $m+n$ from $S'$. This can be done in $\binom{m+n+k}{m+n} = \binom{m+n+k}k$ ways.
Suppose that $T$ is arranged in ascending order, so that $T = \{a_1<\ldots < a_n < b_1 < \ldots < b_m\}$ (so that it has $m+n$ total elements). Let $T_0 = \{a_1,\ldots,a_n\}$ and $T'_0 = \{b_1,\ldots,b_m\}$.
These sets have size $n$ and $m$ respectively, but as written they have intermingling coefficient $0$ by choice. Therefore, we must do two things at this point. One is to make them mingle by exchanging elements. The second, is to get rid of the extra $x_i$s.
To do this, let $\sigma_0 \subset T_0, \sigma'_0 \subset T'_0$, such that $|\sigma_0| = |\sigma'_0| = k$. To make the mingling occur, we will exchange the memberships of elements of these sets. That is, let $$
T_1 = (T_0 \setminus \sigma_0) \cup \sigma_0' \\
T'_1 = (T'_0 \setminus \sigma'_0) \cup \sigma_0
$$
How many choices do we have for $\sigma_0$ and $\sigma'_0$? They are subsets of size $k$, therefore we have $\binom nk$ and $\binom mk$ choices for them respectively. We see that every binomial term has now appeared, so there should be no need to choose anymore.
Claim : The set of pairs of subsets $(S_1,S_2)$ with an intermingling coefficient of $k$ is in bijection with the set of triples $(T, \sigma_0,\sigma_0')$ that can be created as above, for a fixed choice of $x_1,\ldots,x_k$.
If we prove this claim, then we're done because we've already counted that the latter set has size $\binom{m}{k}\binom{n}{k}\binom{m+n+k}{k}$.
One direction of the bijection
What is the intermingling set and coefficient at this point? Indeed, the intermingling set of $T_1$ and $T'_1$ is still $T_0$. However, by choice, $T_0 \cap T'_1 = \sigma_0$, so the intermingling coefficient is equal to $k$. One part is done.
The question is, how do we get rid of the $x_i$s now, and what even was their purpose? To understand this, let's take an example.
Say $S = \{1,2,3,4,5,6,7,8\}$, and $n=4,m=4$. Let $k=3$. Then we add $3$ elements to $S$, to get $S' = \{1,2,3,4,5,6,7,8,x_1,x_2,x_3\}$. Then we choose a subset of size $4+4 = 8$, say $T= \{1,3,4,6,7,x_1,x_2,x_3\}$. This breaks into $T_0 = \{1,3,4,6\}$ and $T'_0 = \{7,x_1,x_2,x_3\}$.
As written, $T_0$ and $T'_0$ are not subsets of $S$ and they have intermingling coefficient $0$. Let $\sigma_0 = \{1,4,6\}$ and $\sigma'_0 = \{7,x_2,x_3\}$. Then, following the "exchange", $T_1 = \{3,7,x_2,x_3\}$ and $T'_1 = \{1,4,6,x_1\}$ have intermingling coefficient equal to $3$ and intermingling set still equal to $\{1,3,4,6\}$.
However, at this point, we observe what the $x_i$ might represent. Indeed, we have not accounted for repeated elements as of yet. Note that if $S_1,S_2$ are general choices of subsets of $S$, then they need not be disjoint. However, as written above, the sets $T_1,T'_1$ will always be disjoint!
Therefore, the $x_i$ will model repeated elements in some way.
Going back to the above example, we then expect that $x_1$ will be replaced by one of $3,7$, and $x_2,x_3$ will each be replaced by one of $1,4,6$. However, think about it : how can this be done without affecting the intermingling coefficient?
Let's take the above example. Suppose that $x_1 \to 3$. Then, $T'_1 = \{1,3,4,6\}$ will end up having intermingling coefficient $4$ regardless of the choices of $x_2,x_3$ so this is clearly out of the way. Thus, $x_1 \to 7$ is forced.
The better way of thinking about this is that the larger $x_1$ is, the better its chances of not appearing in the intermingling set : so we take the largest possible choice of $x_1$.
What about $x_2,x_3$? Actually, any choice from $\sigma_0 = \{1,4,6\}$ works as far as the intermingling coefficient is concerned : every such choice already lies in the intermingling set, and the coefficient is only affected if the second set is affected (which is not the case here). Since we need a canonical choice that can be undone later, we send $x_i$ to the $i$-th smallest element of $\sigma_0$, i.e. $x_2 \to 4,x_3 \to 6$.
Doing these manipulations, we land up with $S_1=\{3,4,6,7\}$ and $S_2 = \{1,4,6,7\}$, which have intermingling coefficient $3$.
Let's make this rigorous and phrase it properly. Suppose that $\sigma_0 = \{c_1<\ldots<c_k\}$ and $\sigma_0' = \{d_1<\ldots<d_k\}$, where the $c_i,d_i \in S'$, i.e. they could be in $S$ or be one of the $x_i$.
Suppose that $x_i$ occurs in $T_1$. Then replace $x_i$ with $c_i$. (The "canonical" choice for each $i$.)
Suppose that $x_j$ occurs in $T_1'$. Then, find out $j' = \#\{l < j : x_l \in \sigma_0'\}$, and replace $x_j$ with $d_{j-j'}$. (This is always an element of $S$ rather than some $x_l$ : the index $j-j'$ counts the $x_l \notin \sigma_0'$ with $l \leq j$, and there are at most as many of those as there are elements of $\sigma_0'$ lying in $S$.)
Check that, under this scheme, we have $x_1 \to 7, x_2 \to 4,x_3 \to 6$ in the example above.
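After these replacements, call the resulting sets $S_1$ (obtained from $T_1$) and $S_2$ (obtained from $T_1'$).
Claim : $S_1,S_2$ have intermingling coefficient $k$.
Proof : The replacements only add elements of $\sigma_0 \subset T_0$ to $S_1$ and elements of $\sigma_0' \cap S$ to $S_2$, so the intermingling set of $S_1,S_2$ is still $T_0$. The elements of $S_2$ that belong to it are, by our construction, exactly the elements of $\sigma_0$, which has size $k$ by definition.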
Thus, the constructions of $S_1,S_2$ are complete, and one direction is done.
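The two replacement rules ($x_i \to c_i$ in $T_1$ and $x_j \to d_{j-j'}$ in $T_1'$) can be applied literally by a short program. The Python sketch below is only an illustration under two assumptions of mine: $S = \{1,\ldots,n+m\}$, and the formal element $x_i$ is encoded as the integer $n+m+i$. The name `forward` is likewise made up here.

```python
def forward(T, sigma0, sigma0p, n, m):
    """Map a triple (T, sigma_0, sigma_0') to the pair (S_1, S_2).

    Assumption: S = {1, ..., n+m} and x_i is encoded as the integer n+m+i.
    """
    N = n + m
    sigma0, sigma0p = set(sigma0), set(sigma0p)
    T = sorted(T)
    T0, T0p = set(T[:n]), set(T[n:])      # n smallest / m largest elements
    assert sigma0 <= T0 and sigma0p <= T0p
    assert len(sigma0) == len(sigma0p)
    c = sorted(sigma0)                    # c[i-1] plays the role of c_i
    d = sorted(sigma0p)                   # d[i-1] plays the role of d_i

    # Exchange step.
    T1 = (T0 - sigma0) | sigma0p
    T1p = (T0p - sigma0p) | sigma0

    # Replacement in T_1: x_i -> c_i.
    S1 = {c[t - N - 1] if t > N else t for t in T1}

    # Replacement in T_1': x_j -> d_{j-j'}, j' = #{l < j : x_l in sigma_0'}.
    S2 = set()
    for t in T1p:
        if t > N:
            j = t - N
            jp = sum(1 for l in range(1, j) if N + l in sigma0p)
            t = d[j - jp - 1]
        S2.add(t)
    return S1, S2


# Motivating example: n = m = 4, k = 3, with x_1, x_2, x_3 encoded as 9, 10, 11.
S1, S2 = forward({1, 3, 4, 6, 7, 9, 10, 11}, {1, 4, 6}, {7, 10, 11}, n=4, m=4)
print(sorted(S1), sorted(S2))
```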
The other direction
Begin with $S$ and $S_1,S_2\subset S$ of sizes $m+n,n,m$ respectively. Find the intermingling coefficient, which is some $r$, say. Add the formal elements $\sup S < x_1<\ldots<x_r$ to $S$ to create $S'$, as usual.
We know how to find $\sigma_0$ : find the intermingling set and intersect it with $S_2$. That has size $r$. (interestingly enough, $\sigma_0$ doesn't contain any $x_i$ : that's a consequence of $k\leq m$).
What about $\sigma_0'$? Well, first find all the elements of $S_1$ which don't belong in the intermingling set. Those go into $\sigma_0'$. Then, suppose that $\sigma_0 = \{c_1<\ldots<c_r\}$, and find all $i \leq r$ such that $c_i \in S_1$. The corresponding $x_i$ will be put into $\sigma_0'$ to finish off. Let the set of these indices (which is a subset of $\{1,2,\ldots,r\}$) be equal to $I$.
What about $T$? Of course, add each of $S_1,S_2,\sigma_0,\sigma_0'$ to $T$. We account for the remaining repeated elements as follows : look at the part of $\sigma_0'$ which belonged in $S_1$ but not in the intermingling set, say $d_1<\ldots<d_q$, and list the unused indices as $I^c = \{i_1<\ldots<i_q\}$ (the two lists have the same length). Pair $d_p$ with $x_{i_p}$, which is exactly how the rule $x_j \to d_{j-j'}$ pairs them. Whenever $d_p$ belongs in $S_2$ as well (a repeated element which occurs both in $S_1$ and $S_2$), add $x_{i_p}$ to $T$.
These are inverses of each other. Let me go through an elaborate example to show you how this works.
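For concreteness, here is a Python sketch of this reverse construction. As before, it assumes $S = \{1,\ldots,n+m\}$ and encodes $x_i$ as the integer $n+m+i$ (both are my conventions, not the paper's), and the name `inverse` is made up. The leftover indices in $I^c$ are paired with the elements of $S_1$ outside the intermingling set in increasing order, so that the pairing agrees with the rule $x_j \to d_{j-j'}$ of the forward direction.

```python
def inverse(S1, S2):
    """Map a pair (S_1, S_2) to the triple (T, sigma_0, sigma_0').

    Assumption: S = {1, ..., n+m} and x_i is encoded as the integer n+m+i.
    """
    S1, S2 = set(S1), set(S2)
    N = len(S1) + len(S2)
    mingling = set(sorted(S1 | S2)[:len(S1)])   # intermingling set
    c = sorted(mingling & S2)                   # sigma_0 = {c_1 < ... < c_r}
    r = len(c)
    d = sorted(S1 - mingling)                   # elements of sigma_0' in S

    # I: indices i with c_i repeated in S_1; Ic: the remaining indices.
    I = [i for i in range(1, r + 1) if c[i - 1] in S1]
    Ic = [i for i in range(1, r + 1) if c[i - 1] not in S1]

    sigma0 = set(c)
    sigma0p = set(d) | {N + i for i in I}

    # Pair d_p with the p-th index of Ic; that x joins T only if d_p is
    # a repeated element, i.e. also lies in S_2.
    extra = {N + j for j, dp in zip(Ic, d) if dp in S2}
    T = S1 | S2 | sigma0p | extra
    return T, sigma0, sigma0p


T, sigma0, sigma0p = inverse({1, 3, 5, 8}, {1, 4, 5, 6})
print(sorted(T), sorted(sigma0), sorted(sigma0p))
```

In the printed result the values $9$ and $11$ stand for $x_1$ and $x_3$.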
Example 1
Let $S = \{1,2,3,4,5,6,7,8\}$, $S_1 = \{1,3,5,8\}$, $S_2 = \{1,4,5,6\}$. The intermingling coefficient is $3$, and the intermingling set is $\{1,3,4,5\}$.
So we create $S' = \{1,2,3,4,5,6,7,8,x_1,x_2,x_3\}$. Then, $\sigma_0$ is just the intermingling set intersected with $S_2$, which is $\sigma_0 =\{1,4,5\}$.
What about $\sigma_0'$? First we take all elements in $S_1$ not in the intermingling set : that's $\{8\}$. Then we find the "indices" of all elements in $\sigma_0 \cap S_1$ : in this case, that's $1,5$ which are the first and third elements, so $x_1,x_3 \in \sigma_0'$. Hence, $\sigma_0' = \{8,x_1,x_3\}$. Thus , $I=\{1,3\}$.
As for $T$, look at the elements of $\sigma_0'$ that belonged in $S_1$ but not in the intermingling set : that's again, just $\{8\}$. However, this element doesn't belong in $S_2$, so no additional $x_i$ is needed (i.e. $x_i$ for $i \in I^c$ is not going to be included in $T$) and we get $$
T = \{1,3,4,5,6,8,x_1,x_3\}
$$
Let's prove that we can go the other way as well. We have $$
T = \{1,3,4,5,6,8,x_1,x_3\} , \sigma_0 = \{1,4,5\}, \sigma_0' = \{8,x_1,x_3\}
$$
According to definitions, $T_0 = \{1,3,4,5\}, T'_0 = \{6,8,x_1,x_3\}$. Perform the exchange of the $\sigma$s to get $$
T_1 = \{3,8,x_1,x_3\}, T'_1 = \{1,4,5,6\}
$$
Interestingly, we don't need to do any replacement in $T'_1$ : it's already equal to $S_2$. As for $T_1$, recall that we replace each $x_i$ in it with $c_i$, the $i$-th element of $\sigma_0$ arranged in increasing order. That means that $c_1 = 1,c_3 = 5$ so we get $x_1 \to 1, x_3 \to 5$. Finally, $T_1 \to \{1,3,5,8\} = S_1$, as desired.
That's how this bijection works. Let's take a more complicated example by modifying one from the paper itself.
The paper's example
The paper has a large example, $$
S = \{1,2,3,\ldots,19\} \\
S_1 = \{1,2,4,5,6,10,11,12,13,17\} \\
S_2 = \{2,3,5,6,9,13,14,17,19\}
$$
This is with $n=10,m=9$. The intermingling set is $\{1,2,3,4,5,6,9,10,11,12\}$ and of these, $5$ of them lie in $S_2$ so the intermingling coefficient is $5$.
Formally, we create $S'=\{1,2,\ldots,19,x_1,x_2,x_3,x_4,x_5\}$.
To find $\sigma_0$, we intersect the intermingling set with $S_2$, so that gives $\sigma_0 = \{2,3,5,6,9\}$.
Then, to find $\sigma_0'$, we first find the elements of $S_1$ not in the intermingling set , which is $\{13,17\}$. Then we find the "indices" of all elements in $\sigma_0 \cap S_1$, which is $\{2,5,6\}$ i.e. the first, third and fourth elements of $\sigma_0$. Hence, $\sigma_0' = \{13,17,x_1,x_3,x_4\}$ : and $I = \{1,3,4\}$.
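Finally, to $T$. The elements of $\sigma_0'$ that are in $S_1$ and not in the intermingling set are just $13$ and $17$, and both of them also lie in $S_2$. The unused indices are $I^c = \{2,5\}$; pairing them in the order used by the replacement rule $x_j \to d_{j-j'}$ gives $x_2 \to 13$ and $x_5 \to 17$, so these both belong in $T$ and $$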
T = \{1,2,3,4,5,6,9,10,11,12,13,14,17,19,x_1,x_2,x_3,x_4,x_5\}
$$
To recreate $S_1,S_2$, we first create $$
T_0 = \{1,2,3,4,5,6,9,10,11,12\} \\
T_0' = \{13,14,17,19,x_1,x_2,x_3,x_4,x_5\}
$$
Then we exchange as per $\sigma_0,\sigma_0'$ to get $$
T_1= \{1,4,10,11,12,13,17,x_1,x_3,x_4\} \\
T_1' = \{2,3,5,6,9,14,19,x_2,x_5\}
$$
Replacement time. For $x_i \in T_1$, we replace by the corresponding indexed element in $\sigma_0$, so $x_1 \to 2,x_3 \to 5, x_4 \to 6$. This gives $T_1 = S_1$.
For $T_1'$, take $x_2$. For $j=2$, we have $j'=1$ because $x_1 \in \sigma_0'$. Thus, $x_2 \to d_1 = 13$. For $j=5$, we have $j' = 3$ because $x_1,x_3,x_4 \in \sigma_0'$. Thus, $x_5 \to d_2 = 17$. Finally, $T'_1 = S_2$.
Thus, these maps are in fact bijections!
A generalization
It turns out that one doesn't need to start with a set of size $n+m$ : anything bigger will do as well. Exactly the same scheme as above leads to the following conclusion :
Theorem : For $\nu \geq m+n$, $$
\sum_{r = 0}^{\min(n,m)} \binom{n}{r} \binom mr \binom{\nu+r}{n+m} = \binom{\nu}{m} \binom{\nu}n
$$
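The generalized identity is easy to test numerically. The following Python sketch (function names `lhs` and `rhs` are mine) compares both sides for a range of small parameters with $\nu \geq m+n$; it is a sanity check rather than a proof.

```python
from math import comb


def lhs(n, m, nu):
    """Left-hand side: sum over r of C(n,r) C(m,r) C(nu+r, n+m)."""
    return sum(
        comb(n, r) * comb(m, r) * comb(nu + r, n + m)
        for r in range(min(n, m) + 1)
    )


def rhs(n, m, nu):
    """Right-hand side: C(nu,m) C(nu,n)."""
    return comb(nu, m) * comb(nu, n)


print(all(
    lhs(n, m, nu) == rhs(n, m, nu)
    for n in range(6)
    for m in range(6)
    for nu in range(n + m, n + m + 6)
))
```

Taking $\nu = m+n$ recovers the identity proved above, since $\binom{m+n+r}{m+n} = \binom{m+n+r}{r}$.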
It is worth going through the paper of Golomb,
Golomb, S. W., New proof of a classic combinatorial theorem, Am. Math. Mon. 75, 530-531 (1968). ZBL0162.03001.
and the paper of Baer and Brock,
Baer, Robert M.; Brock, P., Natural sorting, J. Soc. Ind. Appl. Math. 10, 284-304 (1962). ZBL0111.15404.
for more such "sorting"/"most-wanted" related arguments.