191

Do you know of any very important theorems that remain unknown? I mean results that could easily make it into textbooks or research monographs, but that almost nobody knows about. If you provide an answer, please:

  1. State only one theorem per answer. When people vote on your answer, they will be voting on a particular theorem.

  2. Provide a careful statement and all necessary definitions so that a well educated graduate student working in a related area would understand it.

  3. Provide references to the original paper.

  4. Provide references to more recent and related work.

  5. Just make your answer useful so other people in the mathematical community can use it right away.

  6. Add comments: how you discovered it, why it is important etc.

  7. Please, make sure that your answer is written at least as carefully as mine. I did invest quite a lot of time writing my answers.

As an example I will provide three answers to this question. I discovered these results while searching for papers related to the questions I was working on.

Piotr Hajlasz
  • 27,279
  • 2
    Related (although distinct): https://mathoverflow.net/questions/176425/rediscovery-of-lost-mathematics – YCor Mar 27 '18 at 19:00
  • 29
    In the title it is probably better to replace "unknown" by "virtually unknown". – Igor Belegradek Mar 27 '18 at 20:56
  • 20
    @IgorBelegradek The number of humans is finite, so aren't all theorems virtually unknown? :) – Najib Idrissi Mar 28 '18 at 07:20
  • I'd say "widely unknown". "Virtually" sounds senseless here when used in its common meaning. – YCor Mar 28 '18 at 12:07
  • 6
    I am not sure. Most of the results are "widely unknown" because they are known only to specialists in the area. For example one can say that the Hurewicz theorem mod Serre class C is "widely unknown" because it is not a basic result in a standard algebraic topology curriculum. But what I had in mind are the results that are also unknown to the specialists in the area. – Piotr Hajlasz Mar 28 '18 at 15:27
  • 1
    I'm confused about how a theorem can be both important and unknown. If (almost) nobody, even including specialists in its area, knows of the theorem, what can it have achieved that is important? About the only example I can think of is a theorem that shows that some once-popular approach to an important problem couldn't work, which inspired somebody to come up with a new approach that worked or was otherwise fruitful. – David Richerby Mar 29 '18 at 17:22
  • 5
    @DavidRicherby Liu and van Rooij solved a problem posed by Hormander (see my post). This was the only missing case in the result of Hormander. The result of Hormander is in nearly every textbook in harmonic analysis. However, the result of Liu and Van Rooij remains unknown. A result can be important and unknown if it is published in a minor journal. That was the case. – Piotr Hajlasz Mar 29 '18 at 17:37
  • 2
    @DavidRicherby mathematics evolve and there are fashions; so it can occur that some study/result that looks of marginal interest at some date (because it's not fashionable, too original, or even because it's not well spread by its author) appears as important later. – YCor Mar 30 '18 at 13:16
  • @PiotrHajlasz A paper can remain quite unknown even if published in a famous journal. I'm aware of several papers published in major journals with a number of citations close to zero, and mathscinet random browsing yields many examples. – YCor Mar 30 '18 at 13:17
  • 2
    The new question ("What important theorems were less well known before 2018?", replacing "Do you know important theorems that remain unknown?") is far too broad. First, an upper bound on the original year would be a good safeguard. I preferred the "unknown" although not so well-defined (I proposed "widely unknown"). "Less well-known" is as much subjective, and more open-ended. At least the previous formulation was explicitly subjective. In addition, this title will be completely obsolete after 2018, while this question will most likely still reappear in the front list. – YCor Apr 01 '18 at 10:28
  • @PeterLeFanuLumsdaine thanks very much: indeed I didn't notice that the change was done by another user and not by the OP. – YCor Apr 01 '18 at 16:51
  • 1
    isn't this question unanswerable --- the moment an answer appears on MO, the theorem is no longer "unknown" :-) – Suvrit Apr 03 '18 at 00:53
  • "Provide references to the original paper": it's not explicit before in the question that these should have been published. This somewhat narrows the question; there have certainly been results that were known to some community but were not published, and some may have been forgotten, sometimes rediscovered. (I don't mean that narrowing the question in this way is bad -reference to unpublished work would make things even more speculative-, but I just mention this.) – YCor Apr 03 '18 at 14:51
  • 1
    @YCor "Provide references to the original paper" I just wanted to make suggestions about how to write an answer of high quality. I have seen too many posts with vague statements and no references. I wanted to write a clear and concise guideline. Listing all possible exceptions would make my statement long and opaque. I am sure that if someone knows a good, but unpublished result, they will not hesitate to list it here. – Piotr Hajlasz Apr 03 '18 at 22:22
  • 8
    +1 This question has become probably the most informative question I've seen on any stackexchange site, ever. – Yly Apr 05 '18 at 19:03
  • One could say "... unknown or forgotten". – Wlod AA Dec 31 '20 at 22:49
  • Related (but not identical): https://mathoverflow.net/questions/66075/the-half-life-of-a-theorem-or-arnolds-principle-at-work – Gerry Myerson Sep 22 '22 at 05:25

12 Answers

164

Monotonicity is a superfluous hypothesis in the Monotone convergence theorem for the Lebesgue integral. In fact the following is true.

Theorem - Let $(X, \tau, \mu)$ be a measure space, $f_n : X \rightarrow [0,\infty]$ a sequence of measurable functions converging almost everywhere to a function $f$ so that $f_n \leq f$ for all $n$. Then $$\lim_{n\rightarrow \infty} \int_X f_n d\mu = \int_X f d\mu.$$

Proof: By Fatou's lemma (first inequality) and the assumption $f_n \leq f$ (last inequality), $$ \int_X f d\mu = \int_X \underline{\lim} \, f_n d\mu \leq \underline{\lim} \int_X f_n d\mu \leq \overline{\lim} \int_X f_n d\mu \leq \int_X f d\mu. $$

I learnt this result from an article by J.F. Feinstein in the American Mathematical Monthly, but I never saw it in any textbook. Since the Monotone convergence theorem is important, I wish to argue that this is also an important theorem. Here is an illustration.

Let $(X, \tau, \mu)$ be a measure space, $f : X \rightarrow [0,\infty]$ a measurable function. Then $$ \int_X f d\mu = \lim_{r \rightarrow 1, r>1} \sum_{n\in {\bf Z}} r^n \mu\Bigl( f^{-1}([r^n, r^{n+1}))\Bigr). $$ Neither the dominated nor the monotone convergence theorem applies here. Note that this is a way to define the Lebesgue integral of nonnegative functions. Computing integrals by geometrically dividing the $x$ axis is due to Fermat.
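As a sanity check of the geometric-partition formula, here is a minimal numerical sketch for the example $f(x)=x$ on $X=[0,1]$ with Lebesgue measure. This example, the function name `geometric_sum` and the tolerance are assumptions made for illustration; they are not taken from the article. In this case the sum can be computed by hand and equals $1/(r+1)$, which tends to $\int_0^1 x\,dx = 1/2$ as $r\to 1^+$.

```python
import math

def geometric_sum(r, tol=1e-12):
    """Approximate the sum over n of r**n * mu(f^{-1}([r**n, r**(n+1)))).

    Illustrative choice (not from the source): f(x) = x on [0, 1] with
    Lebesgue measure, so each level set is the interval [r**n, r**(n+1))
    intersected with [0, 1].
    """
    if r <= 1.0:
        raise ValueError("r must be greater than 1")
    # Terms with r**n below tol contribute less than tol in total, so stop there.
    n_min = math.floor(math.log(tol) / math.log(r))
    total = 0.0
    for n in range(n_min, 1):
        lo = min(r ** n, 1.0)        # left end of the level set, clipped to [0, 1]
        hi = min(r ** (n + 1), 1.0)  # right end of the level set, clipped to [0, 1]
        total += (r ** n) * (hi - lo)
    return total

# Compare the numerical sum with the closed form 1 / (r + 1).
for r in (2.0, 1.1, 1.01):
    print(r, geometric_sum(r), 1.0 / (r + 1.0))
```

The printed pairs should agree up to rounding, and the second column approaches $1/2$ as $r$ decreases to $1$.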

Piotr Hajlasz
  • 27,279
coudy
  • 18,537
  • 5
  • 74
  • 134
  • 4
    According to this MSE question, this theorem appears in at least one textbook : https://math.stackexchange.com/questions/2713020/why-is-the-monotone-convergence-theorem-more-famous-than-its-stronger-cousin – Arnaud D. Mar 30 '18 at 13:48
  • 1
    Is this an extension of the monotone convergence or the dominated convergence theorem? – lcv Apr 01 '18 at 19:41
  • 2
    @lcv this is an extension of the monotone convergence theorem. The function is assumed to be non-negative. – coudy Apr 01 '18 at 20:15
  • 4
    I am an undergrad just familiar with the Lebesgue measure, not with measure spaces. I am surprised about your example; I haven't had time to digest it, but I was under the impression that the D.C.T. and Fatou's lemma imply the result in your post. If $f$ is integrable, then it is a dominating function and D.C.T. applies. If not, then $\int f = \infty$ and by Fatou's lemma $\int f_n \to \infty$. I am guessing that this is true somehow only with the Lebesgue measure, and not in other measure spaces? Otherwise I don't understand how your example would be possible. – Ovi Apr 25 '18 at 16:11
  • 1
    @lcv DCT has integrable $g$ rather than nonnegative $f$ right? – BCLC May 01 '18 at 16:20
  • 7
    Are you sure Joel's article was in the AMM? According to https://arxiv.org/abs/1412.7702 it was published in the Irish Mathematical Society's Bulletin – Yemon Choi Jul 23 '18 at 04:26
  • 1
    @YemonChoi Indeed, thanks for pointing the correct reference. – coudy Jul 26 '18 at 18:01
  • 1
    Isn't it a consequence of the dominated convergence theorem? – Viktor B Feb 22 '19 at 23:43
  • 9
    @Sinusx $f$ is not assumed to be integrable. – coudy Feb 26 '19 at 17:51
68

The following result of Hörmander [2] (see also Theorem 2.5.6 in [1]) plays a significant role in harmonic analysis since all convolution type operators are translation invariant.

Definition. We say that a bounded linear operator $T:L^p(\mathbb{R}^n)\to L^q(\mathbb{R}^n)$ is translation invariant if $T(\tau_y f)=\tau_y(Tf)$ for all $f\in L^p(\mathbb{R}^n)$ and all $y\in\mathbb{R}^n$, where $(\tau_y f)(x)=f(x+y)$.

Theorem (Hörmander). If $T:L^p(\mathbb{R}^n)\to L^q(\mathbb{R}^n)$, $1\leq p<\infty$, $1\leq q\leq\infty$ is non-zero and translation invariant, then $q\geq p$.

The proof is simple and well known. The argument does not generalize to the case of $p=\infty$. However, the argument still works if we replace $L^\infty$ by $L^\infty_0$ which is the subspace of $L^\infty$ consisting of functions that converge to $0$ at infinity. In that case Hörmander proved the following result:

Theorem (Hörmander). If $T:L^\infty_0(\mathbb{R}^n)\to L^q(\mathbb{R}^n)$ is non-zero and translation invariant, then $q=\infty$.

Hörmander (see p. 97 in [2]) calls this result somewhat incomplete for $p=\infty$.

I was quite curious about the case $p=\infty$ and since I could not find an answer, I discussed it with several mathematicians. As a result of cooperation with M. Bownik, F. L. Nazarov and P. Wojtaszczyk we finally proved the following result:

Theorem. If $T:L^\infty(\mathbb{R}^n)\to L^q(\mathbb{R}^n)$ is non-zero and translation invariant, then $q\geq 2$.
On the other hand, there is a non-zero translation invariant operator $T_1:L^\infty(\mathbb{R}^n)\to L^2(\mathbb{R}^n)\cap L^\infty(\mathbb{R}^n)$. It follows that $T_1:L^\infty(\mathbb{R}^n)\to L^q(\mathbb{R}^n)$ is bounded for all $2\leq q\leq\infty$.

I was very excited about the result and our proof. However, a few days later I got an e-mail from M. Bownik who told me that the result had already been proved by Liu and van Rooij [3]! Despite the fact that this paper solves a problem of Hörmander, it has only one citation according to MathSciNet. Many textbooks in harmonic analysis quote the result of Hörmander, but nobody mentions the beautiful result of Liu and van Rooij!

I cannot resist and I have to recall my e-mail conversation with Nazarov:

Dear Fedja, Bad news. The problem was solved in 1974 by Liu and Van Rooij.....

Dear Piotr, Actually, this is a wonderful news: instead of going through the painstaking and time consuming proofreading and submission process, we can just relax and think of something else :-).

For more details, see the slides: https://sites.google.com/view/piotr-hajasz/research/presentations

[1] L. Grafakos, Classical Fourier analysis. Second edition. Graduate Texts in Mathematics, 249. Springer, New York, 2008.

[2] L. Hörmander, Estimates for translation invariant operators in $L^p$ spaces. Acta Math. 104 (1960), 93-140. https://doi.org/10.1007/BF02547187 https://link.springer.com/article/10.1007%2FBF02547187

[3] T. S. Liu, A. C. M. van Rooij, Translation invariant maps $L^\infty(G)\to L^p(G)$. Nederl. Akad. Wetensch. Proc. Ser. A 77 = Indag. Math. 36 (1974), 306-316. https://doi.org/10.1016/1385-7258(74)90021-3

Piotr Hajlasz
  • 27,279
  • 11
    Are those operators really called translation invariant? Shouldn't they be called equivariant? "Invariant" to me suggests a condition of the form $T\circ\tau_a = T$. – Johannes Hahn Mar 27 '18 at 15:07
  • 8
    In harmonic analysis such operators are called translation-invariant, see reference [1] and many other books in harmonic analysis. – Piotr Hajlasz Mar 27 '18 at 15:11
  • 28
    @JohannesHahn, your terminological objection certainly makes sense, but/and, depending on notation, the visibly reasonable terminology can vary. E.g., from the viewpoint of your comment, $T\circ \tau =\tau\circ T$ makes $T$ look $\tau$ equivariant, indeed. But/and if we declare that the action of $\tau$ is by $\tau\circ T\circ \tau^{-1}$, then $T$ "becomes" $\tau$ invariant. In my own experience this comes up a lot, and I use "equivariant" and "invariant" indiscriminately, in part because of the inevitable ambiguity. (Esp., e.g., on ${\mathrm Hom}(X,Y)$ spaces with bimodules...) – paul garrett Mar 27 '18 at 18:05
  • 1
    I believe this predates Hormander. I've seen this attributed to Littlewood as the "the higher exponents are always on the left" principle. – Mark Lewko Jun 05 '18 at 19:29
  • 4
    @MarkLewko Do you have a reference to the paper of Littlewood? I googled your quote and found it in notes of Terry Tao, but there was no reference to a particular result of Littlewood. – Piotr Hajlasz Jun 05 '18 at 19:45
60

Acknowledgment: Let me acknowledge Theo Buehler, from whom I learnt about the theorem presented below. However, since Theo seems to be absent from MO, let me post it myself.

Lemma (Zabreiko, 1969) Let $X$ be a Banach space and let $p\colon X \to [0,\infty)$ be a seminorm. Suppose that for every absolutely convergent series $\sum_{n=1}^\infty x_n$ in $X$ we have $$ p\left(\sum_{n=1}^\infty x_n\right) \leqslant \sum_{n=1}^\infty p(x_n) \in [0,\infty]. $$ Then $p$ is continuous. That is, there is a constant $C\geqslant 0$ such that $p(x)\leqslant C\Vert x\Vert$ for all $x\in X$.

Now, using Zabreiko's lemma you may easily recover the open mapping theorem, Banach's bounded inverse theorem, the uniform boundedness principle, and the closed graph theorem. For more details see this fantastic post.

Original references:

Tomasz Kania
  • 11,291
  • 4
    A textbook containing the theorem is "An Introduction to Banach Space Theory" by Megginson, where it is also used to prove the mentioned results. – Michael Greinecker Jul 23 '18 at 05:54
  • 1
    Does it hold for Frechet spaces too? – Fedor Petrov Feb 18 '19 at 23:00
  • @FedorPetrov Yes, it does. I just posted an answer to this https://math.stackexchange.com/questions/842870/zabreikos-lemma/4534644#4534644 on MSE which shows that the Zabreiko lemma (ZL) follows very easily from the open mapping theorem (OMT, some people may find this disappointing). Since the OMT holds for Fréchet spaces so does ZL. – Jochen Wengenroth Sep 19 '22 at 13:57
57

Here is one little-known and one completely unknown result.

The little known result is the Mean Motion Theorem. This says that for all real numbers $\lambda_j$ and all complex numbers $a_j$ the following limit exists: $$m:=\lim_{t\to+\infty}\phi(t)/t,\quad\mbox{where}\quad \phi(t)=\arg\sum_{j=1}^na_je^{i\lambda_jt},$$ where $t$ is real. (There is a natural way to define what happens to the $\arg$ at the zeros, but there is not much loss in generality if one assumes for simplicity that the sum has no real zeros).
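To make the statement concrete, here is a minimal numerical sketch that follows a continuous branch of the argument and estimates $\phi(t)/t$ on a finite interval. The amplitudes, frequencies, step size and function name below are illustrative assumptions, not taken from the references. The example uses a dominant term, $|a_1|>|a_2|+|a_3|$, in which case the sum never vanishes and $\phi(t)=\lambda_1 t+O(1)$, so the estimate should be close to $\lambda_1$.

```python
import cmath
import math

def mean_motion_estimate(amps, freqs, t_end=200.0, dt=0.01):
    """Estimate (phi(t_end) - phi(0)) / t_end for F(t) = sum a_j exp(i lam_j t).

    The argument is followed continuously by summing the small phase
    increments arg(F(t + dt) / F(t)). This assumes F has no zeros on
    [0, t_end] and that dt is small enough for each increment to stay
    well below pi in absolute value.
    """
    def F(t):
        return sum(a * cmath.exp(1j * lam * t) for a, lam in zip(amps, freqs))

    steps = int(round(t_end / dt))
    phi = 0.0
    prev = F(0.0)
    for k in range(1, steps + 1):
        cur = F(k * dt)
        phi += cmath.phase(cur / prev)  # principal value of the phase increment
        prev = cur
    return phi / t_end

# Hypothetical data with a dominant first term: |3| > |1| + |1|.
amps = [3.0, 1.0, 1.0]
freqs = [1.0, math.sqrt(2.0), math.pi]
print(mean_motion_estimate(amps, freqs))  # close to freqs[0] = 1.0
```

In the interesting cases, where no single term dominates, the sum may come close to zero and a much smaller step (or a more careful treatment of zeros) would be needed.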

This result was conjectured by Lagrange, coming from celestial mechanics, and was proved in full generality by the combined efforts of H. Weyl, P. Hartman and A. Wintner in the 1930s. The final result, without any restrictions on $\lambda_k,a_k$ is due to B. Jessen and H. Tornehave in 1945. It seems that the subject was forgotten after the 1940s.

The completely unknown result is a much stronger statement for $n=3$ under some additional conditions on $\lambda_j$ and $a_j$, namely that $$\phi(t)=mt+O(1).$$ This is due to Piers Bohl in 1909. I have never seen any reference on this stronger result, or any discussion of possible generalization to larger $n$.

Weyl, Wintner and Hartman refer to Bohl proving the $n=3$ case of their results, the first non-trivial case, but do not discuss the $O(1)$. Favorov's paper from 2008 has Bohl's paper in the reference list but also does not discuss the $O(1)$. In fact I have not seen ANY mention of a more precise error term than $o(t)$ in the literature. A number of papers GENERALIZE the mean motion theorem to infinite sums. But nobody addresses the improvement of the error term. Here is another piece of evidence that the result is "completely unknown": Precise form of the mean motion theorem.

Remark on references. The only book I know which addresses the subject is Sternberg's 1969 book. (This book has a rare distinction: it is not reviewed in Mathscinet:-) The whole first chapter of the book explains the historical background: the problem is evidently related to constructing a calendar:-)

Weyl's 1938 paper is very well written, fortunately in English, and accessible to a non-specialist. If you can read German or Russian, Bohl's paper is also good reading; it is completely elementary. I suspect that nobody reads Bohl since his result has been "superseded" by Weyl and Co. It does not help that it was published in German.

References in chronological order:

Bohl, P., Über ein in der Theorie der säkularen Störungen vorkommendes Problem. J. für Math. 135, 189-283 (1909). (There is a Russian translation which is difficult to obtain, so I post it here for the benefit of this community.)

Weyl, Hermann, Mean motion, Amer. J. Math. 60, 889-896 (1938)

B. Jessen and H. Tornehave, Mean motions and zeros of almost periodic functions. Acta Math. 77, (1945). 137–279.

S. Sternberg, Celestial mechanics, Part 1, W. A. Benjamin, NY, 1969.

Favorov, S. Yu., Lagrange's mean motion problem, Algebra i Analiz 20 (2008), no. 2, 218--225; translation in St. Petersburg Math. J. 20 (2009), no. 2, 319–324. MR2424001

  • What are the "additional conditions on $\lambda_j$ and $a_j$"? – aorq Mar 28 '18 at 01:22
  • One condition is that $|a_j|=\sum_{k\neq j}|a_k|$, (if $>$ then this is easy) the other conditions are more difficult to state, one needs more definitions. He also shows that on a dense set of $\lambda$, the error term is not $O(1)$. – Alexandre Eremenko Mar 28 '18 at 02:24
  • 2
    Could you suggest a reference, ideally a book, monography or survey, where I, as a non-specialist (or even, as a complete tourist), might learn more about this little-known-but-not-completely-unknown Mean Motion Theorem (I mean the form which you say is due to Weyl, Hartman and Wintner)? – Gro-Tsen Mar 28 '18 at 08:48
  • @Gro-Tsen: I edited the answer and addressed this in Remark 2. – Alexandre Eremenko Mar 28 '18 at 13:22
  • 3
    The whole subject of celestial mechanics is a good example! –  Mar 29 '18 at 13:42
  • 3
    @AlexandreEremenko I think you should write an article for the American Mathematical Monthly about this topic. – Piotr Hajlasz Mar 29 '18 at 14:12
  • 5
    @Piotr Hajlas: I have nothing really new to say, and an expository paper was actually written by Favorov (which I recommended him to do). – Alexandre Eremenko Mar 29 '18 at 18:42
  • 2
    @Matt F.: The subject of celestial mechanics is certainly not forgotten, it is used all the time to control the spacecraft, not speaking of astronomical applications. But it lost somewhat of its popularity with pure mathematicians. – Alexandre Eremenko Mar 29 '18 at 18:46
  • I have Sternberg's books (both parts) in an electronic format. If anyone is interested, please contact me (it is easy to find my e-mail if you google my name) and I can send it to you. – Piotr Hajlasz Mar 29 '18 at 18:57
  • Bear in mind AMM receives many submissions. – Hollis Williams Aug 22 '20 at 20:12
55

In 1942 A. Sard [3] (see also [4]) proved the following theorem.

Theorem (Sard). Let $f:M^m\to N^n$ be of class $C^k$, and let $S={\rm crit}\, f$. If $k> \max (m-n, 0)$, then $\mathcal{H}^n(f(S))=0$.

Here $\mathcal H^s$ stands for the $s$-dimensional Hausdorff measure (we shall follow the convention that $\mathcal H^s$ is the counting measure for all $s\leq 0$) and, for a $C^1$ mapping $f\colon M^m\to N^n$, ${\rm crit}\, f$ denotes the set of critical points of $f$.

It is well known that the assumptions of Sard's theorem are optimal within the scale of $C^k$ spaces. Several years after Sard's paper, A. Ya. Dubovitskii [1] obtained a more general result.

Theorem (Dubovitskii). Let $f\colon M^m\to N^n$ be a mapping of class $C^k$. Set $s=m-n-k+1$. Then $$ \mathcal H^s (f^{-1}(y) \cap {\rm crit}\, f) = 0 \quad \text{for $\mathcal H^n$ a.e. $y\in N^n$.} $$

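If $k> \max (m-n, 0)$, then $s=m-n-k+1\leq 0$, so $\mathcal H^s$ is the counting measure and Dubovitskii's theorem says that $f^{-1}(y)\cap {\rm crit}\, f=\emptyset$ for $\mathcal H^n$-a.e. $y\in N^n$, that is, $\mathcal H^n(f({\rm crit}\, f))=0$. Thus Sard's theorem follows from that of Dubovitskii.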

Dubovitskii, like a large number of mathematicians in the Soviet Union of that time, was isolated from the West and from the new results of western mathematics. He does not quote Sard's paper. On pages 398-402 of [1] he gives a variant of Whitney's example for the sharpness of the Sard theorem, and an example of a function $f\in C^k\bigl((0,1)^m,(0,1)^n\bigr)$ such that all sets $f^{-1}(y)\cap {\rm crit}\, f$ have $(m-n-k)$-dimensional Hausdorff measure greater than zero, where $m,k,n$ are positive integers such that $m-n-k>0$. He attributes the first example to Menshov but gives no reference, and acknowledges Menshov, Novikov, Kronrod and Landis in his Introduction.

The result of Dubovitskii remained unknown until 2005 when a new proof and some generalizations were published in [2]. This is very surprising, since his paper was quoted in Milnor's Topology from the Differentiable Viewpoint (p. 10).

[1] A. Ya. Dubovickii, On the structure of level sets of differentiable mappings of an $n$-dimensional cube into a $k$-dimensional cube. (Russian) Izv. Akad. Nauk SSSR. Ser. Mat. 21 (1957), 371-408.

[2] B. Bojarski, P. Hajłasz, P. Strzelecki, Sard's theorem for mappings in Hölder and Sobolev spaces. Manuscripta Math. 118 (2005), no. 3, 383–397.

[3] A. Sard, The measure of the critical values of differentiable maps. Bull. Amer. Math. Soc. 48 (1942), 883-890.

[4] S. Sternberg, Lectures On Differential Geometry. Prentice Hall, 1964.

Personal comment. I proved the result of Dubovitskii as an undergraduate student. When I discovered that it had already been published, I was quite devastated. I waited 15 years before I decided to publish it. I am happy I did publish it. Not because of a new `modern' proof and some generalizations that I and my collaborators were able to obtain, but because the old result of Dubovitskii has been brought to public attention and gained proper recognition. This comment is related to an answer that I gave to another post.

Piotr Hajlasz
  • 27,279
42

My answer is inspired by the one of coudy: how many scientists who deal with the Lebesgue integral on a daily basis know that there exists a necessary and sufficient condition for the passage to the limit under the integral symbol? I learned about it while reading some papers on the history of Italian mathematics written by the late Gaetano Fichera: the result, stated in modern language ([4], Ch. VIII, pp. 110-128), is reported below.

Definition 1. Let $(E,\mathcal{E})$ be a measurable space and $\phi:\mathcal{E}\to\overline{\mathbb{R}}$ a numerical set function: $\phi$ is called exhaustive if $$ \lim_n\phi(A_n)=0 $$ for all families $\{A_n\}$ of pairwise disjoint sets in $\mathcal{E}$.

Definition 2. Let $(E,\mathcal{E})$ be a measurable space and $H$ a set (and thus possibly a family) of numerical set functions defined on $\mathcal{E}$: $H$ is called uniformly exhaustive if the numerical set function $$ A\mapsto\sup_{\phi\in H} \vert\phi(A)\vert\;\text{ is exhaustive.} $$

Cafiero's theorem (on the passage to the limit under the integral). Let $(E,\mathcal{E})$ be a measurable space, $(\mu_n)_{n\geq 1}$ be a sequence of real measures and $(f_n)_{n\geq 1}$ be a sequence of real functions such that $f_n\in\mathcal{L}^1(\vert\mu_n\vert)$ for all $n$ (here $\vert\mu\vert$ denotes the variation of the measure $\mu$). Suppose moreover that the following pointwise limits exist: $$ \begin{split} \lim_{n\to\infty} \mu_n &=\mu\\ \lim_{n\to\infty} f_n &=f \end{split} $$ where $\mu$ and $f$ are respectively a real measure and a real function. Then $$ \lim_{n\to\infty} \int f_n\mathrm{d}\mu_n = \int f \mathrm{d}\mu\iff\text{$(f_n\cdot\mu_n)_{n\geq 1}$ is uniformly exhaustive.} $$

The result was originally proved by Cafiero in [1] (see also book [2], ch. VII, §2 pp. 377-392), who generalized the concept of uniform additivity introduced before and independently by Renato Caccioppoli and Vladimir Dubrovskii: that theorem includes the ones of Nikodym, Vitali, Hahn and Saks and an earlier result of Gaetano Fichera [3], where a necessary and sufficient condition was proved for the integral with respect to a given fixed measure. The work of Cafiero is cited in the bibliography of the treatise on linear operators by Dunford and Schwartz but, to my knowledge, the only English reference discussing (very briefly) his contribution is the recent treatise of Vladimir Bogachev.

[1] Cafiero, F. (1953), "Sul passaggio al limite sotto il segno d'integrale per successioni d'integrali di Stieltjes-Lebesgue negli spazi astratti, con masse variabili con gli integrandi [On the passage to the limit under the integral symbol for sequences of Stieltjes–Lebesgue integrals in abstract spaces, with masses varying jointly with integrands]" (Italian), Rendiconti del Seminario Matematico della Università di Padova, 22: 223–245, MR0057951, Zbl 0052.05003.

[2] Cafiero, F. (1959), Misura e integrazione [Measure and integration] (Italian), Monografie matematiche del Consiglio Nazionale delle Ricerche 5, Roma: Edizioni Cremonese, pp. VII+451, MR0215954, Zbl 0171.01503.

[3] Fichera, G. (1943), "Intorno al passaggio al limite sotto il segno d'integrale" [On the passage to the limit under the integral symbol] (Italian), Portugaliae Mathematica, 4 (1): 1–20, MR0009192, Zbl 0063.01364.

[4] Letta, G. (2013), Argomenti scelti di Teoria della Misura [Selected topics in Measure Theory], (in Italian) Quaderni dell'Unione Matematica Italiana 54, Bologna: Unione Matematica Italiana, pp. XI+183, ISBN 88-371-1880-5, Zbl 1326.28001.

  • what does "(and thus possibly a family)" mean? – mathworker21 Jan 14 '20 at 13:59
  • Hi @mathworker21. It is a clarification, perhaps a little pedantic: the definition of uniformly exhaustive I reported from [4] (Definition 1.1, p. 110) refers to a simple set $H$, while in Cafiero's theorem we have a net $(f_n\cdot\mu_n)_{n\ge1}$ which is a particular kind of family. I pointed out that the definition includes that case, since a net is a special case of a (set theoretical) family and each set is a family with itself taken as the index (Halmos docet). Finally, apologies for the delay in my answer to your comment: I am very busy now. – Daniele Tampieri Jan 15 '20 at 18:41
  • @DanieleTampieri I'm struggling to get my head around the condition $\lim_n \phi(A_n)=0$ for all disjoint $(A_n)_n$, or did you mean $\liminf_n \phi(A_n)$ ? – dohmatob Oct 21 '21 at 14:29
  • 1
    @dohmatob no: we are really taking the full limit. The condition means that for any family $\{A_n\}_n$ such that the set theoretic limit is the empty set $\emptyset$, the numerical set function $\phi$ goes likewise to $0$ (you surely noticed that $\bigcap_{m\ge j\ge n} A_j=\emptyset$ for all $m$). (P.S. I apologize for my delay in answering your comment, and good luck with your research). – Daniele Tampieri Oct 22 '21 at 12:27
  • @DanieleTampieri Thanks for the reply, and more broadly, for Cafiero's theorem. – dohmatob Oct 23 '21 at 09:15
  • The answer of @coudy referenced here. – LSpice Dec 31 '21 at 17:12
  • 1
    @LSpice, thanks. I'll put the link in the answer. – Daniele Tampieri Dec 31 '21 at 17:14
  • @Daniele tempieri. Isn't something missing in your condition: the fact that the sets $A_n$ are pairwise disjoint does not imply what it is supposed, unless the whole space has finite measure. For example, if we take $\mathbf R_+$ and the $A_n = [n, n+1[$, then these sets are pairwise disjoint but their Lebesgue measure is equal to 1. – MikeTeX Mar 06 '23 at 10:26
41

The Ionin–Pestov theorem is not very well known, but it deserves to be included in standard introductory texts on differential geometry of curves. It gives the simplest meaningful example of a local-to-global theorem, which is what differential geometry is about.

Theorem. Assume that a plane region $F$ is bounded by a simple loop with curvature at most $1$. Then $F$ contains a unit disc.
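A quick numerical sanity check of the statement on one family of curves (ellipses are an illustrative choice here, not part of the original reference, and the function names are hypothetical): for the ellipse with semi-axes $a\ge b$, the maximal curvature is $a/b^2$, and the largest disc centred at the origin that fits inside has radius $b$. The sketch below checks both numerically for $a=4$, $b=2$, where the curvature bound is exactly $1$ and the theorem predicts an inscribed disc of radius at least $1$.

```python
import math

def ellipse_max_curvature(a, b, samples=100_000):
    """Maximum curvature of x = a cos t, y = b sin t, sampled on a grid.

    Uses kappa(t) = a*b / (a^2 sin^2 t + b^2 cos^2 t)^(3/2).
    """
    best = 0.0
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        speed_sq = (a * math.sin(t)) ** 2 + (b * math.cos(t)) ** 2
        best = max(best, a * b / speed_sq ** 1.5)
    return best

def ellipse_central_inradius(a, b, samples=100_000):
    """Radius of the largest disc centred at the origin inside the ellipse,
    computed as the minimal distance from the origin to sampled boundary points."""
    return min(
        math.hypot(a * math.cos(2.0 * math.pi * k / samples),
                   b * math.sin(2.0 * math.pi * k / samples))
        for k in range(samples)
    )

a, b = 4.0, 2.0
print(ellipse_max_curvature(a, b))     # curvature bound, a / b**2
print(ellipse_central_inradius(a, b))  # radius of a disc contained in the region
```

For ellipses the conclusion is elementary, since $a\le b^2$ and $a\ge b$ force $b\ge 1$; the content of the theorem is that it holds for every simple loop with curvature at most $1$.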


The original reference:

  • Пестов, Г. Г., Ионин В. К. О наибольшем круге, вложенном в замкнутую кривую // Доклады АН СССР. — 1959. — Т. 127, № 6.

We used it in our textbook What is differential geometry.

  • 16
    This was on my written qualifying exam! Afterward, the person who put it on the exam realized they did not know how to prove it. – Deane Yang Dec 31 '20 at 04:06
  • 2
    Nice. I immediately thought that at any point on the boundary there has to be a disk tangent to it and inside $F$... But of course that is not the case! – Yaakov Baruch Sep 20 '22 at 09:30
  • 3
    Let me add that this theorem has been extended to a bound on the curvature in the viscosity sense (see Theorem 1.6 of 10.1007/s00526-017-1263-0). – Paolo Intuito Oct 28 '22 at 14:22
35

The Lusin theorem says that a measurable function coincides with a continuous function away from a set of measure less than $\varepsilon$. A result of Federer says that an a.e. differentiable function coincides with a $C^1$ function away from a set of measure less than $\varepsilon$. So such functions have a $C^1$ Lusin property. Imomkulov proved an analogous $C^2$ Lusin property for subharmonic functions.

The following fundamental property of subharmonic functions was proved by S. A. Imomkulov [4] in 1992:

Theorem. Let $f(x)$ be a subharmonic function on a domain $D\subset\mathbb{R}^n$. Then for any $\varepsilon>0$ there is $g\in C^2(\mathbb{R}^n)$ such that $ |\{x\in D:\ f(x)\neq g(x)\}|<\varepsilon. $

Unfortunately, the result is not known. According to MathSciNet the paper of Imomkulov has zero citations. Recently, another proof has been obtained in [2] although the authors were not aware of the result of Imomkulov.

Convex functions are subharmonic and in that special case the above result was proved in [1] and in [3]. Note that both of the papers were published after the paper of Imomkulov.

[1] G. Alberti, On the structure of singular sets of convex functions. Calc. Var. Partial Differential Equations 2 (1994), 17–27.

[2] G. Alberti, S. Bianchini, G. Crippa, On the $L^p$-differentiability of certain classes of functions. Rev. Mat. Iberoam. 30 (2014), no. 1, 349–367.

[3] L. C. Evans, W. Gangbo, Differential equations methods for the Monge-Kantorovich mass transfer problem. Mem. Amer. Math. Soc. 137 (1999), no. 653.

[4] S. A. Imomkulov, Twice differentiability of subharmonic functions. (Russian. Russian summary) Izv. Ross. Akad. Nauk Ser. Mat. 56 (1992), no. 4, 877--888; translation in Russian Acad. Sci. Izv. Math. 41 (1993), no. 1, 157–167

Piotr Hajlasz
  • 27,279
  • 5
    This is very nice; a while ago I was interested in a question about stronger versions of Lusin's theorem (https://mathoverflow.net/questions/34518/analogues-of-luzins-theorem); I didn't know about either Federer's result or Imomkulov's. – Vaughn Climenhaga Apr 08 '18 at 03:25
29

The following is an example from computability theory (more specifically, $\lambda$-calculus) so maybe it's a bit at the border of mathematics proper, but both the question and the answer are so simple and natural that I think it deserves to be mentioned.

We consider the simply-typed $\lambda$-calculus, with types $A,B,\ldots$ generated by a single atom $o$: $$A,B ::= o\mathrel{|} A\to B.$$ Let $A$ be an arbitrary type and let $$ \begin{array}{rcl} \mathsf{Str}[A]&:=& (A\to A)\to(A\to A)\to A\to A, \\ \mathsf{Bool}&:=& o\to o\to o. \end{array} $$ These are the standard types of Church binary strings and booleans, respectively. By well-known results, for all $A$, every closed term $M:\mathsf{Str}[A]\to\mathsf{Bool}$ must decide some language $L\subseteq\{0,1\}^\ast$. Let $\mathsf{ST}\lambda$ be the class of such languages, i.e., $$\mathsf{ST}\lambda:=\{L\subseteq\{0,1\}^\ast\mathrel{|}L\text{ is decidable by some }M:\mathsf{Str}[A]\to\mathsf{Bool}\text{ for some type }A\}.$$ It is natural to ask the following:

Question: Does $\mathsf{ST}\lambda$ correspond to a well-known class?

Surprisingly, very few people know the answer. I myself mentioned this question in passing in front of various audiences of $\lambda$-calculus experts, explicitly saying that a precise characterization of $\mathsf{ST}\lambda$ was missing and that it probably did not correspond to anything remarkable (I even wrote this in a CS Theory StackExchange answer). I never got any reaction. So you can imagine how surprised I was when, a few days ago, after discussing this with two colleagues and realizing that the answer might be the well-known class of regular languages, we found out that we were right and that this had actually been known for more than 20 years:

Theorem. $\mathsf{ST}\lambda=\mathsf{REG}$ (the regular languages on $\{0,1\}$).

This is Theorem 3.4 in the following paper (which is as nice as it is unknown):

Gerd G. Hillebrand, Paris C. Kanellakis: On the Expressive Power of Simply Typed and Let-Polymorphic Lambda Calculi. LICS 1996: 253-263.

Very roughly, the proof is based on the fact that the simply-typed $\lambda$-calculus may be interpreted in the category of finite sets (it is cartesian closed) and from this interpretation one may build a finite state automaton for the language decided by a simply-typed $\lambda$-term.
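As a toy illustration of this idea, here is a minimal sketch in Python. Python is untyped, so the simple typing is only a convention here, and the names `church`, `M`, `decide` and `run_dfa` are made up for this example and do not come from the paper. The term `M` instantiates $A$ with $\mathsf{Bool}$ and decides the language "even number of 1s"; the function `run_dfa` is the two-state automaton one reads off from it.

```python
# Church encoding of a binary string w = b1 b2 ... bn:
#   church(w) = lambda f0: lambda f1: lambda x: f_b1(f_b2(... f_bn(x) ...))
def church(w):
    def encoded(f0):
        def with_f1(f1):
            def with_x(x):
                acc = x
                # The innermost application corresponds to the LAST letter.
                for bit in reversed(w):
                    acc = f1(acc) if bit == "1" else f0(acc)
                return acc
            return with_x
        return with_f1
    return encoded

# Church booleans, of type o -> o -> o.
TRUE = lambda a: lambda b: a
FALSE = lambda a: lambda b: b

# Illustrative term M : Str[Bool] -> Bool (an assumption of this sketch):
# the letter 0 acts as the identity, the letter 1 flips the boolean.
IDENT = lambda p: p
NOT = lambda p: lambda a: lambda b: p(b)(a)
M = lambda s: s(IDENT)(NOT)(TRUE)

def decide(term, w):
    """Apply a decider to the Church encoding of w; return a Python bool."""
    return term(church(w))(True)(False)

def run_dfa(w):
    """Two-state automaton for M: states are booleans, start state is True.

    It reads the word from right to left, mirroring the evaluation order
    of the Church string (regular languages are closed under reversal).
    """
    state = True
    for bit in reversed(w):
        state = (not state) if bit == "1" else state
    return state  # the accepting state is True
```

For this particular term the states are just the two booleans. In the general argument the state set would be the finite set interpreting $\mathsf{Str}[A]$, which can be much larger.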

Granted, this is nothing ground-shattering: researchers in the theory of $\lambda$-calculus and programming languages may survive very well (as they do) without knowing it. But I find it doubly surprising that this result is never mentioned: on the one hand, because $\mathsf{ST}\lambda$ ends up being such a simple and universally known class; on the other hand, because this is quite unexpected. Indeed, the class of functions on natural numbers computable by simply-typed $\lambda$-terms is, depending on the convention, either the polynomials with if-then-else (an old result of Schwichtenberg) or a weird subclass of the elementary functions, which includes all towers of exponentials but fails to include, for instance, subtraction (an old result of Statman; if I remember correctly, this class was studied thoroughly by Thierry Joly in his thesis). In both cases, this seems to give the simply-typed $\lambda$-calculus far more power than deciding regular languages.
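To make the arithmetic side concrete, here is a small sketch of addition, multiplication and exponentiation on Church numerals in Python. Again Python does not check types, and the helper names `church_num`, `to_int`, `add`, `mul` and `power` are illustrative only.

```python
def church_num(n):
    """Church numeral: lambda f. lambda x. f(f(...f(x)...)), n copies of f."""
    def numeral(f):
        def apply_n(x):
            for _ in range(n):
                x = f(x)
            return x
        return apply_n
    return numeral

def to_int(c):
    """Decode a Church numeral by counting applications of a successor."""
    return c(lambda k: k + 1)(0)

# m + n: apply f n times, then m more times.
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
# m * n: iterate "apply f n times" m times.
mul = lambda m: lambda n: lambda f: m(n(f))
# m ** n: apply the numeral n to the numeral m.
power = lambda m: lambda n: n(m)
```

In a typed setting, `power` needs its exponent at the type of numerals over $A\to A$ while the base is a numeral over $A$, so it is not available if all numerals are required to share a single type. Iterating it gives towers such as `power(c2)(power(c2)(c2))`, which decodes to $2^{2^2}=16$ for `c2 = church_num(2)`.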

Edit (22 Nov 2021): I'd like to add that, since I wrote this answer, the above theorem of Hillebrand and Kanellakis has become the starting point of a rich theory exploring the connections between automata and $\lambda$-calculi, showing that the result is more than just a curiosity. An account of this line of research may be found in Lê Thành Dũng "Tito" Nguyễn's Ph.D. thesis.

  • 1

    These are the standard types of Church binary strings and booleans

    I think that deserves a big caveat: Church-encoding is well-known to not really work in STLC because you must fix the result type, and it only works properly in either untyped lambda calculus, or in System F (where it’s best called Böhm-Berarducci encoding). Indeed, Church encoding supports naturals, which of course can’t be modeled (correctly) in the category of finite sets.

    – Blaisorblade Dec 06 '20 at 14:29
  • As a consequence, standard presentations of STLC add datatypes such as naturals as primitives with their operations. Even booleans better be primitive if you want if at all types in general — as soon as you have more than 1 base type, o -> o -> o breaks down. – Blaisorblade Dec 06 '20 at 14:35
  • @Blaisorblade I'm not sure I understand your comments. Why would the Church-encoding "not really work in STLC"? I have the impression that you think the Church encoding or the type $o\to o\to o$ don't work because they don't give you the expressiveness you'd like. But this is not about what you like: the question "what is the class $\mathsf{ST}\lambda$" is natural, well posed and has a very nice answer, whether you like it or not. In fact, what's interesting is precisely that these types do not give you everything one would normally expect from a general-purpose programming language. – Damiano Mazza Dec 07 '20 at 21:22
  • 1
    "What does o -> o -> o in pure STLC encode?" is a perfectly sensible question, which your reference discusses. They call that "Church encoding", and I didn't know anybody did that.

    But today, perhaps thanks to that work, we avoid considering that setting or calling that a Church encoding, at least not without strong warnings. I just checked TAPL, and it describes Church encodings for untyped LC and for System F (chapter 23).

    – Blaisorblade Dec 08 '20 at 20:36
  • IMHO that's also why these results are not well-known: they sunk a research area out of interest for most people, except maybe for research like https://dl.acm.org/doi/10.1145/346048.346051 (which, no offense, seems for now a niche even within SIGPLAN or at POPL). – Blaisorblade Dec 08 '20 at 20:39
  • 1
    I've been reading $\lambda$-calculus-related research papers for almost 20 years now. The map $n\mapsto\lambda f.\lambda x.f(\ldots fx\ldots)$ with $n$ occurrences of $f$ is universally known as "the Church encoding" of natural numbers (because it was first used by Church, it would seem). Related encodings (like binary strings) are often called "Church" too. In any case, I have never seen anybody setting an "expressiveness bar" under which it would be illegitimate to call such encodings "Church". – Damiano Mazza Dec 09 '20 at 07:15
  • 1
    About $o\to o\to o$, I don't know who calls that "Church encoding", I most certainly didn't. Now I see that the sentence I wrote may be parsed ambiguously: it should be "These are the standard types of (Church binary strings) and (booleans)", i.e., "Church" only goes with "binary strings". Still, I wouldn't see any problem in calling $\lambda x.\lambda y.x$ and $\lambda x.\lambda y.y$ the "Church Booleans". Regardless of whether it is historically accurate, I stand by my point above: there is no standard minimum expressiveness for this name to be legitimate! – Damiano Mazza Dec 09 '20 at 07:22
  • 1
    Also, you say that the reference I give discusses "what does $o\to o\to o$ in the STLC encode". This is a misunderstanding: that type is basically irrelevant in Hillebrand and Kanellakis's work, any type with a partition of its normal forms into two non-empty sets "True" and "False" would do. The real question is: what languages may be decided by terms taking a Church string and outputting True/False according to some fixed convention? The string encoding being the same (called "Church") makes it interesting to compare the answer in the pure $\lambda$-calculus, System F, STLC, etc. – Damiano Mazza Dec 09 '20 at 09:28
24

I am not sure whether the following satisfies the OP's high standards for a good answer, but I thought the result was very interesting when I first learned about it a few years ago.

Theorem. Let $E \subset \mathbb{C}$ be compact, and let $f$ be a bounded continuous function on the Riemann sphere $\widehat{\mathbb{C}}$ which is analytic on $\widehat{\mathbb{C}} \setminus E$. Then $$f(E)=f(\widehat{\mathbb{C}}).$$

I am not sure where this result was first proved. The only reference I know is the book of A. Browder, Introduction to Function Algebras, Lemma 3.5.4, p. 199.

The proof is not difficult. If I recall correctly, it goes like this:

Proof. We want to show that for $w \in \widehat{\mathbb{C}}$, if there exists $z \in \widehat{\mathbb{C}}$ with $f(z)=w$, then there exists $z \in E$ with $f(z)=w$. Replacing $f$ by $f-w$ if necessary (or by $1/f$ if $w=\infty$), it suffices to show that if $f$ has a zero in $\widehat{\mathbb{C}}$, then it has a zero on $E$. Suppose for a contradiction that $f$ has a zero in $\widehat{\mathbb{C}}$ but does not vanish on $E$. Without loss of generality $\infty\notin E$. Then $f$ has finitely many complex zeros, say $z_1,\dots,z_n$, listed with multiplicities. Let $$g(z)=\frac{f(z)}{(z-z_1)\cdots (z-z_n)}.$$ Then $g$ is continuous and non-vanishing on $\mathbb{C}$ and $g(\infty)=0$. So there is a continuous function $h$ on $\mathbb{C}$, analytic outside $E\cup \{\infty\}$, with $g=e^h$. But this is impossible, since $g$ has a zero of finite order at $\infty$.
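The last step can be spelled out as follows. This is only a sketch: the radius $R$, the order $m\geq 0$ of vanishing of $f$ at $\infty$, and the function $u$ are notation introduced here. Choose $R$ so large that $E$ and all the $z_j$ lie in $\{|z|<R\}$, and write $f(z)=z^{-m}u(z)$ near $\infty$ with $u$ analytic and non-vanishing there, so that $\oint_{|z|=R} u'/u\,dz=0$. If $g=e^h$ with $h$ analytic near the circle $|z|=R$, then $h'=g'/g$ and $$0=\oint_{|z|=R}h'(z)\,dz=\oint_{|z|=R}\frac{f'(z)}{f(z)}\,dz-\sum_{j=1}^n\oint_{|z|=R}\frac{dz}{z-z_j}=-2\pi i\,m-2\pi i\,n.$$ This forces $n=m=0$, so $f$ would have no zero at all on $\widehat{\mathbb{C}}$, contrary to assumption.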

Now, as an example :

Example. Let $\Gamma \subset \mathbb{C}$ be a curve with Hausdorff dimension bigger than one. Then by Frostman's lemma, there is a nontrivial Radon measure $\mu$ supported on $E:=\Gamma$ with growth $\mu(\mathbb{D}(z_0,r)) \leq r^{1+\epsilon}$ for all $z_0 \in \mathbb{C}$, $r>0$, for some small $\epsilon>0$. The growth condition on $\mu$ implies that the Cauchy transform $$f(z) = \int_\Gamma \frac{d\mu(\zeta)}{\zeta-z} \qquad (z \in \widehat{\mathbb{C}}\setminus E)$$ is Hölder continuous (see Theorem 2.10 in M. Younsi, On removable sets for holomorphic functions, EMS Surv. Math. Sci. 2 (2015), no. 2, 219–254), hence in particular extends to be continuous on the whole sphere. By the theorem, the function $f$ maps $\Gamma$ to a space-filling curve!

Fedor Petrov
  • 102,548
Malik Younsi
  • 1,942
  • I have to think about the example carefully since I want to understand all details. I like your post a lot. There is a result of Salem-Zygmund about holomorphic functions on the unit disc whose boundary values form a Peano type curve. Your example fits into this well. If the example is your original contribution you could expand it and publish in Amer. Math. Monthly. – Piotr Hajlasz Apr 05 '18 at 02:41
  • @Kalim I modified the proof a bit, hopefully it became better – Fedor Petrov Feb 20 '19 at 19:19
  • 4
    Shabat's book (Introduction to complex analysis vol.1) has this theorem as a problem. The shortest proof that I know is based on a notion of degree of the map. A map $f\colon S^2\to\mathbb{C}$ is of degree 0. Let $w\in f(S^2)$. If $f$ is holomorphic at $z\in f^{-1}(w)$, then $z$ makes a positive contribution to $\mathrm{deg} f$ since $f$ preserves the orientation. So it must be compensated by contribution of points in $E$. – Oleg Eroshkin Feb 20 '19 at 20:19
  • @OlegEroshkin I was not aware of Shabat's book, thanks for the reference! – Malik Younsi Feb 21 '19 at 20:09
  • @FedorPetrov Thanks! – Malik Younsi Feb 21 '19 at 20:10
  • The condition "bounded" can be omitted, since any continuous function on $\widehat{\mathbb{C}}$ is bounded. For the same reason, the case of $w=\infty$ does not occur, hence it can be disregarded in the proof. Finally, one also needs to assume that $f(\infty)\neq 0$, along with $\infty\not\in E$, in order for the $z_j$'s to lie in $\mathbb{C}$. – GH from MO Nov 22 '21 at 10:33
13

The Lévy Continuity Theorem for random Schwartz distributions due to Fernique:

First let me recall the well known Lévy continuity theorem for Borel probability measures on a finite-dimensional vector space $V$ like $\mathbb{R}^d$.

To a probability measure $\mu$ one can associate the characteristic function $$ \begin{array}{llll} \Phi_{\mu}: & V' & \longrightarrow & \mathbb{C}\\ & \ell & \longmapsto & \Phi_{\mu}(\ell)= \int_V e^{i \ell(v)}\ d\mu(v) \end{array} $$ which is defined on the dual space. By definition, a sequence of probability measures $(\mu_n)$ converges weakly to a probability measure $\mu$, iff for all bounded continuous functions $F:V\rightarrow \mathbb{R}$ (or $\mathbb{C}$), $$ \lim_{n\rightarrow \infty} \int_V F(v)\ d\mu_n(v) = \int_V F(v)\ d\mu(v)\ . $$ For $V$-valued random variables, this corresponds to convergence in law or in distribution.

We now have the following well-known result, say for $V=\mathbb{R}^d$.

Lévy Continuity Theorem:

A sequence of Borel probability measures $(\mu_n)$ on $V$ converges weakly to some (unspecified) Borel probability measure iff the corresponding characteristic functions $\Phi_{\mu_n}$ converge pointwise on $V'$ to some function which is continuous at the origin.
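For instance, with $V=\mathbb{R}$ (a standard illustration, not taken from the references), let $\mu_n=N(0,n)$ be the centered Gaussian with variance $n$. Then $$\Phi_{\mu_n}(t)=e^{-n t^2/2}\longrightarrow \begin{cases}1, & t=0,\\ 0, & t\neq 0,\end{cases}$$ so the characteristic functions converge pointwise, but the limit is discontinuous at the origin. Accordingly $(\mu_n)$ has no weak limit, since $\mu_n([-R,R])\to 0$ for every $R>0$. By contrast, for $\mu_n=N(0,1/n)$ one gets $$\Phi_{\mu_n}(t)=e^{-t^2/(2n)}\longrightarrow 1=\Phi_{\delta_0}(t),$$ and indeed $\mu_n\to\delta_0$ weakly.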

Now the not-so-well-known result I propose in this answer is the analogue for $V=\mathcal{S}'(\mathbb{R}^d)$, the space of temperate Schwartz distributions on $\mathbb{R}^d$.

Lévy-Fernique Continuity Theorem:

A sequence of Borel probability measures $(\mu_n)$ on $V$ converges weakly to some (unspecified) Borel probability measure iff the corresponding characteristic functions $\Phi_{\mu_n}$ converge pointwise on $V'$ to some function which is continuous at the origin.

To clarify, here $V=\mathcal{S}'(\mathbb{R}^d)$ equipped with the strong topology. Also $V'$ is the topological dual equipped with the strong topology. One has $V'\simeq \mathcal{S}(\mathbb{R}^d)$ with its usual topology, i.e., these spaces are reflexive. The definitions of characteristic functions and weak convergence of Borel probability measures are the same as in the finite-dimensional case above.

Comments:

This is important because just about any random object/process can be seen as living inside a space of distributions like $\mathcal{S}'$ or $\mathcal{D}'$.

(Will add more comments later when I find time).

References:

6

The following seems, to me, to be "the nicest theorem that does not have a special name". It is a wonderful blend of Topology, Geometry and Analysis. Moreover, it has a short and simple statement, involving only the notions of Euler characteristic and of a Lie group.

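Theorem: The Euler characteristic of a compact connected nontrivial Lie group is zero!

The proof is also simple, being an application of the Lefschetz fixed point theorem: left translation of a Lie group $G$ by a non-trivial element has no fixed points, so its Lefschetz number vanishes; since $G$ is connected, this translation is homotopic to the identity map; Lefschetz numbers are homotopy invariant, and the Lefschetz number of the identity map is the Euler characteristic of $G$, so the result follows. (Compactness is used in the Lefschetz theorem. A non-compact connected Lie group is homotopy equivalent to a maximal compact subgroup, so the conclusion persists when that subgroup is nontrivial, but it fails for contractible groups such as $\mathbb{R}^n$, whose Euler characteristic is $1$.)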
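As a quick consistency check using standard Betti numbers (not part of the argument above): for the torus $T^n=(S^1)^n$ one has $b_k=\binom{n}{k}$, so $$\chi(T^n)=\sum_{k=0}^{n}(-1)^k\binom{n}{k}=(1-1)^n=0\qquad(n\geq 1),$$ and for $SU(2)\cong S^3$ the Betti numbers are $1,0,0,1$, giving $$\chi(SU(2))=1-0+0-1=0.$$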

Of course, the result (and the nice short proof) is known, but I think it should be much better known, and part of a lot of standard books.

Hexhist
  • 83