
(Cross-listed from the math stackexchange)

Let $(X_i)_{i=1}^n$ be iid random variables with common cdf $F$. Recall that the empirical distribution function is: $$ F_n(x) = \frac{1}{n} \sum_{i=1}^n \chi_{(-\infty,x]}(X_i) . $$ Note that $F_n$ is a function-valued random variable. By the Glivenko–Cantelli theorem, the Kolmogorov–Smirnov statistic: $$ \mathrm{KS}_n = \|F-F_n\|_\infty $$ converges almost surely to zero. Even better, if $F$ is continuous, the normalized statistic: $$ K_n = \sqrt{n}\cdot \mathrm{KS}_n $$ converges in distribution to the Kolmogorov distribution (the law of the supremum of the absolute value of a Brownian bridge), which does not depend on the distribution of the $X_i$.
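(For intuition, the distribution-free behavior is easy to see numerically; here is a minimal sketch in Python, assuming only numpy, with two different continuous choices of $F$.)

```python
import numpy as np

def ks_statistic(sample, cdf):
    """KS_n = sup_x |F_n(x) - F(x)| for an iid sample from a continuous F."""
    x = np.sort(sample)
    n = len(x)
    F = cdf(x)
    # the supremum is attained at a sample point, just before or after a jump
    d_plus = np.max(np.arange(1, n + 1) / n - F)
    d_minus = np.max(F - np.arange(0, n) / n)
    return max(d_plus, d_minus)

rng = np.random.default_rng(0)
n = 10_000

# two different continuous distributions; sqrt(n) * KS_n fluctuates at the
# same O(1) scale (the Kolmogorov law) in both cases
u = rng.uniform(size=n)
e = rng.exponential(size=n)
ks_u = ks_statistic(u, lambda t: t)                 # Uniform(0,1) cdf
ks_e = ks_statistic(e, lambda t: 1 - np.exp(-t))    # Exp(1) cdf
print(np.sqrt(n) * ks_u, np.sqrt(n) * ks_e)
```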

Let $G$ be a compact Lie group (semisimple, if that helps). Is there a similar statistic measuring how close a sequence in $G$ is to a given probability distribution, such that the limiting statistic is distribution-free?
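(For concreteness, producing Haar-distributed samples and testing equidistribution against irreducible characters is easy numerically; below is a hedged sketch for $G = \mathrm{SU}(2)$, identified with the unit quaternions, where the Haar average of the character $\chi(g) = \operatorname{tr}(g)$ of the standard representation is $0$.)

```python
import numpy as np

# Haar-random elements of SU(2), identified with unit quaternions in S^3:
# a normalized 4-dimensional Gaussian vector is uniform on S^3, and for
# g = q0 + q1*i + q2*j + q3*k the standard character is tr(g) = 2*q0.
rng = np.random.default_rng(0)
n = 10_000
q = rng.normal(size=(n, 4))
q /= np.linalg.norm(q, axis=1, keepdims=True)
traces = 2.0 * q[:, 0]

# by Schur orthogonality the Haar average of tr(g) is 0 (with variance 1),
# so the empirical character average vanishes at scale 1/sqrt(n)
print(traces.mean(), traces.std())
```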

Any ideas or pointers to a reference would be appreciated!

  • What do you mean by "similar" statistics? And do you mean that the asymptotic distribution of the statistic is independent of $F$, or that the limiting distribution (mainly its moments) is independent of $F$? Even in semisimple Lie groups the notions of limiting distribution and asymptotic distribution do not have to be the same, right? – Henry.L Mar 29 '17 at 15:47
  • I guess the question isn't entirely precise. Given a sequence $\{x_i\}$ of sample points in $G$, drawn from a distribution $\mu$, I'm looking for a function which measures how "close" the empirical measure $\frac{1}{n} \sum_{i=1}^n \delta_{x_i}$ on $G$ is to the true measure $\mu$, hoping that there is such a function which, when normalized, converges as $n\to \infty$ to a distribution which does not depend on $\mu$. Does that help? – Daniel Miller Mar 29 '17 at 15:53
  • Yes. You are talking about asymptotic distributions; there are quite a few results I can add to my answer. Thanks for the clarification! – Henry.L Mar 29 '17 at 15:55
  • Feel free to ask more or start a bounty if you are not satisfied. Great question! – Henry.L Mar 29 '17 at 16:25
  • Daniel - the best choice would be the total variation distance, but this is rather hard to compute and probably stronger than what you have in mind. In many cases you are only interested in weak-* convergence, and restricted to a proper class of test functions (say Lipschitz functions) a result along your lines is called a quantitative equidistribution statement. One possibility is to compute the Fourier–Stieltjes coefficients of your empirical measures and try to conclude something, but that usually works best in the abelian case. – Asaf Mar 29 '17 at 17:02
  • @Asaf So what is the benefit of the sup-distance over Wasserstein in this case, although I think the sup-distance is also a reasonable choice? Could you explain a bit what you are thinking in your comment above? When I consider it, I actually think the Wasserstein distance is easier to compute. If you suggest the sup-distance, then it is simply the KS test the OP proposed, and I do not see any superiority of the sup-distance in the setup of compact groups. Are there any asymptotic results about the sup-distance in this setup? Thanks – Henry.L Mar 29 '17 at 19:21
  • Henry, total variation is the natural distance (even norm) defined on measures (not necessarily probability measures). For example, it implies convergence in the Wasserstein metric; but indeed, it is very hard for measures to converge in the total variation distance, so for various statistical considerations this requirement is usually relaxed, for example to weak-* convergence in equidistribution questions, or to the Wasserstein distance in optimal transport questions and such. The Fourier approach I mentioned in the abelian compact case would yield weak-* convergence. – Asaf Mar 29 '17 at 20:38
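(The Fourier approach mentioned in the comments is easy to make concrete in the abelian compact case. A minimal sketch, assuming numpy, for $G = \mathbb{R}/\mathbb{Z}$: the Fourier–Stieltjes coefficients of the empirical measure are averages of characters, and for Haar-distributed samples each nonzero coefficient fluctuates at scale $1/\sqrt{n}$.)

```python
import numpy as np

# Fourier-Stieltjes coefficients of the empirical measure mu_n of n points
# on the circle R/Z:  hat(mu_n)(k) = (1/n) * sum_j exp(-2*pi*i*k*x_j)
rng = np.random.default_rng(0)
n = 10_000
x = rng.uniform(size=n)            # Haar (uniform) measure on R/Z

ks = np.arange(1, 6)
coeffs = np.exp(-2j * np.pi * np.outer(ks, x)).mean(axis=1)

# hat(mu)(k) = 0 for all k != 0 under Haar measure, and the empirical
# coefficients fluctuate at scale 1/sqrt(n)
print(np.sqrt(n) * np.abs(coeffs))
```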

1 Answer


From the perspective of statisticians, I will suggest two chapters in

Parthasarathy, Kalyanapuram Rangachari. Probability measures on metric spaces. Vol. 352. American Mathematical Soc., 1967.

Parthasarathy explains why it is natural to study probability measures on a Lie group. If you are looking for a probability theory with a more algebraic flavor, this post might be of interest to you.

Your question is two-fold. One part is to find a functional that measures the similarity between an empirical probability measure and another "true" measure; the other is to describe the asymptotic behavior of that functional along the sequence of empirical probability measures.

For the first problem, there are surprisingly few functionals that measure the similarity between two probability measures on a common space. Most such functionals are distances defined on $M(G)$, the space of probability measures on the group $G$ (with $G$ regarded as a manifold). One that I find natural is the Wasserstein distance; the Kolmogorov–Smirnov distance is another. If you are mainly interested in this part of the problem, there is a recent reference
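(To make this part concrete in the simplest case: on the real line $W_1(\mu_n,\mu) = \int |F_n - F|$, so for a uniform sample on $[0,1]$ the distance to the true measure can be approximated directly. A sketch, assuming only numpy.)

```python
import numpy as np

# On the real line, W_1(mu_n, mu) = int |F_n(t) - F(t)| dt.  For the
# uniform measure on [0,1], F(t) = t, and we approximate the integral
# by a Riemann sum over a fine uniform grid.
rng = np.random.default_rng(0)
n = 5_000
x = np.sort(rng.uniform(size=n))

t = np.linspace(0.0, 1.0, 200_001)
F_n = np.searchsorted(x, t, side="right") / n   # empirical cdf on the grid
w1 = np.mean(np.abs(F_n - t))                   # ~ int_0^1 |F_n - t| dt

# del Barrio et al.: sqrt(n) * W_1 converges to int_0^1 |B(t)| dt
# for a Brownian bridge B, so this printout fluctuates at scale O(1)
print(np.sqrt(n) * w1)
```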

Chirikjian, Gregory S. Stochastic Models, Information Theory, and Lie Groups, Volume 2: Analytic Methods and Modern Applications. Vol. 2. Springer Science & Business Media, 2011.

For the second problem, since you are asking about the asymptotic behavior of such an empirical probability measure on a Lie group $G$, one approach is to find an invariant measure that $F_n$ converges to, so that your statistical functional converges as well. For a general treatment of this approach, there is an influential book (especially Chapter 2) that explains how to apply Moore's result (Theorem 2.2.6) to i.i.d. samples from $G$. I think there is a generalization to the exchangeable case, but I cannot remember it for now. @Asaf pointed out that this theorem is for non-compact groups and hence might not be useful in your case; my oversight.

Zimmer, Robert J. Ergodic theory and semisimple groups. Vol. 81. Springer Science & Business Media, 1984.

Another approach to studying the asymptotic behavior of the empirical probability measures is the (notoriously hard) generic chaining technique. This approach, although it does not directly describe the asymptotic behavior of the empirical measures $F_n$, does use another measure (usually sub-Gaussian) to control the behavior of the sequence $F_n$.

Talagrand, Michel. The Generic Chaining. Springer, 2005.

Having separated the two parts of your question, let me return to it: the answer, for the Wasserstein distance mentioned above, is partially positive. However, I doubt that it can be extended to a very general compact Lie group without any clear product structure, even a semisimple one. So if there is any reference, I would be interested to know.

del Barrio, Eustasio, Evarist Giné, and Carlos Matrán. "Central limit theorems for the Wasserstein distance between the empirical and the true distributions." Annals of Probability (1999): 1009-1071.

Henry.L
  • What's the relevance of Moore's theorem here? The group in the OP's question is compact. – Asaf Mar 29 '17 at 16:53
  • @Asaf Oh, yes, I forgot that; what I remembered is that it applies to simple groups. Thanks, edited. – Henry.L Mar 29 '17 at 16:55