
I am interested in knowing about abstract mathematical concepts, tools or methods that have come up in theoretical machine learning. By "abstract" I mean something that is not immediately related to that realm. For instance, a concept from mathematical optimization does not qualify, since optimization is directly related to the training of deep networks. In contrast, Topological Data Analysis strikes me as a non-trivial example of applying algebraic topology to data analysis.

Here are a few examples that I have encountered in the literature (all in the context of deep learning).

  1. Betti numbers have been utilized to introduce a complexity measure that can be used to compare deep and shallow architectures:
    https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2014-44.pdf
  2. A connection between Sharkovsky's Theorem and the expressive power of deep networks: https://arxiv.org/pdf/1912.04378.pdf
  3. An application of Riemannian geometry:
    https://arxiv.org/pdf/1606.05340.pdf
  4. Algebraic geometry naturally comes up in studying neural networks with polynomial activation functions. This paper discusses functional varieties associated with such networks: https://arxiv.org/abs/1905.12207

I think it would be useful to compile a list of such research works on ML that draw on pure math.

KhashF
  • Artificial Intelligence SE and Theoretical Computer Science SE are probably better places to ask this question than here. – nbro Apr 18 '20 at 00:11
  • @nbro it seems to me that the question is math-focussed. – YCor Apr 18 '20 at 00:31
  • @YCor Yes, but learning theory is a central topic in computer science and artificial intelligence. But I agree with you that this question may not be off-topic here. I just think that those other sites may also be appropriate (if not more appropriate) for asking this question. – nbro Apr 18 '20 at 00:33
  • Hard to answer it since every example you gave arises as 'naturally' in machine learning as optimization does. – Piyush Grover Apr 18 '20 at 00:33
  • Possibly relevant https://mathoverflow.net/questions/266028/algebraization-of-bayesian-networks – YCor Apr 18 '20 at 00:38
  • Recently, someone asked a similar (although more general) question on AI SE. See What are some resources on computational learning theory? –  Apr 18 '20 at 00:46
  • @PiyushGrover In that case, please feel free to include anything that you deem appropriate. I am not well versed in ML and I am mostly familiar with deep learning where optimization by gradient descent algorithms is an essential part of the training process. – KhashF Apr 18 '20 at 01:00
  • @nbro The works that I have cited use results from algebraic geometry, algebraic topology and dynamical systems. So I believe that this is the correct forum to ask for similar papers. – KhashF Apr 18 '20 at 01:02
  • @KhashF But your question is still about learning theory, which is an AI topic. It's fine to ask this question here, because here you will find people who know more about algebraic geometry, algebraic topology and dynamical systems, but how many mathematicians care about learning theory? – nbro Apr 18 '20 at 01:15
  • Related: https://mathoverflow.net/questions/204176/group-theory-in-machine-learning – Alexander Chervov Apr 18 '20 at 09:13

1 Answer


Probably one of the most striking examples is UMAP (Uniform Manifold Approximation and Projection), a dimensionality-reduction method in machine learning. The authors used category theory in its discovery. There is some debate about the extent to which category theory is really required (see John Baez's blog and the references therein), but that remains the authors' own account of how the method was discovered. (The algorithm/implementation can be understood without category theory.)

The method became popular very quickly, gaining 748 citations in two years according to Google Scholar. It has found applications in many fields, including bioinformatics (UMAP Nature), and it is capable of producing beautiful images (MO355631).

It is similar to the previously widely used method t-SNE (t-distributed stochastic neighbor embedding), but it often produces better results with less computational effort, thus beating its predecessor in both quality and speed.
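For anyone who wants to experiment, here is a minimal sketch comparing UMAP and t-SNE on the scikit-learn digits dataset. It assumes the umap-learn package is installed (pip install umap-learn); the parameter values are illustrative defaults, not tuned choices:

    # Minimal sketch: UMAP vs. t-SNE on a small dataset.
    # Assumes umap-learn is installed; parameters are illustrative defaults.
    import umap
    from sklearn.datasets import load_digits
    from sklearn.manifold import TSNE

    X, y = load_digits(return_X_y=True)  # 1797 samples, 64 features

    # UMAP: n_neighbors balances local vs. global structure,
    # min_dist controls how tightly the embedded points are packed.
    umap_emb = umap.UMAP(n_neighbors=15, min_dist=0.1,
                         n_components=2).fit_transform(X)

    # t-SNE on the same data, for comparison.
    tsne_emb = TSNE(n_components=2, perplexity=30).fit_transform(X)

    print(umap_emb.shape, tsne_emb.shape)  # (1797, 2) (1797, 2)

On a dataset of this size both run in seconds; the speed advantage of UMAP becomes more pronounced as the number of samples grows.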

The documentation can be found here: UMAP docs.