5

In the same spirit as this question:

How much of mathematical General Relativity depends on the Axiom of Choice?

I want to go radically further and ask what remains of mathematical general relativity if one restricts one's set theory to ZF.

The motivation is this: to produce a departure from common sense such as the Banach-Tarski paradox (BT), the weakest set theory currently known to suffice is ZF+HB, while it is known that BT cannot be deduced from ZF alone.

So I simply want to know which results depend crucially on the Hahn-Banach theorem (axiom) part of ZF+HB, i.e., the results that are provable in ZF+HB but not in ZF.

PS: ZF<ZF+HB<ZF+UFL<ZF+AC

Where

UFL: Ultrafilter lemma

AC: Axiom of choice

HB: Hahn-Banach theorem

  • 4
    'Departure from common sense' is a pretty relative thing. If there are no non-measurable subsets of $\mathbb{R}$, then there is an equivalence relation $\sim$ on $\mathbb{R}$ such that $\mathbb{R}/\sim$ has strictly larger cardinality than $\mathbb{R}$. – James Hanson Sep 07 '23 at 14:07
  • 2
    Furthermore, I'm a pretty firm believer that any 'physically meaningful' theory needs to boil down to a computational model that can in principle actually be implemented on a computer. Under this criterion, we know by absoluteness results that no 'physically meaningful' math can actually depend on things like AC or Hahn-Banach. – James Hanson Sep 07 '23 at 14:07
  • Isn't that just a reflection that the notion of larger and smaller cardinality breaks down without choice? https://mathoverflow.net/questions/260057/axiom-of-choice-banach-tarski-and-reality/260118#260118 – Aaron Bergman Sep 07 '23 at 14:36
  • 3
    @AaronBergman Sure, but I would argue that cardinality not being a linear order in the first place could also be regarded as failing to adhere to 'common sense.'

    Ultimately my point is that there's always going to be counterintuitive things once you start working with infinite sets and the ire directed towards stuff like Banach-Tarski in particular is misguided.

    – James Hanson Sep 07 '23 at 15:12
  • Can we ignore the motivation that I shared, and get to the point of the question? GR within ZF or ZF+ADC – Bastam Tajik Sep 07 '23 at 15:18
  • For what it's worth, if I ever found any physical result that depended on anything more than ZF+DC, I would be very skeptical. That gets you HB for separable Hilbert spaces, for example, which covers everything I've seen in QM, although I know you asked about GR. – Aaron Bergman Sep 07 '23 at 15:24
  • @AaronBergman Can't you prove HB for (arbitrary) Hilbert spaces in ZF alone? Given a linear functional on a closed subspace $X$ of a Hilbert space $H$, you extend by letting the functional be $0$ for anything in $X^\perp$. – James Hanson Sep 07 '23 at 16:36
  • @JamesHanson It's not immediately clear to me how you define the orthogonal projection onto an arbitrary closed subspace of an arbitrary Hilbert space without using DC: all proofs I am aware of use approximating sequences and taking a limit (if the subspace is infinite-dimensional) – Yemon Choi Sep 07 '23 at 23:00
  • 1
    @YemonChoi My intuition for why it should work is that the orthogonal projection is definable in the sense of continuous logic in a Hilbert space expanded with the distance predicate of the subspace. In particular this tells you that there's a very absolute formula that defines the distance to the orthogonal projection.

    A more analytic argument would be to establish that you can build a canonical Cauchy filter whose limit is the orthogonal projection. You need to require that Hilbert spaces are closed under limits of Cauchy filters, but without choice you should be doing that anyway.

    – James Hanson Sep 07 '23 at 23:03
  • I was thinking of the separable Banach space case, which I think uses much less than ZF+DC. As a (former) physicist, I mainly looked into this to make sure I didn’t need to worry about it and then subsequently forgot the details, so definitely not an expert. – Aaron Bergman Sep 08 '23 at 00:03
  • 2
    I think this question is hiding the issue of a misunderstanding of what General Relativity is, making it into a bit of a straw man argument. – Ryan Budney Sep 08 '23 at 01:38
  • @RyanBudney I don't understand your comment. What is the misunderstanding exactly? – James Hanson Sep 08 '23 at 05:57
  • 1
    @JamesHanson: I suspect the author is perhaps thinking of general relativity not as a physical theory, but as some very specific mathematical model encoded in one specific language, which is unspoken. I would say that isn't really GR, but more of a "play model". – Ryan Budney Sep 08 '23 at 22:09
  • 2
    The chain in your PS isn't quite right. HB does not imply ADC. In fact, UFL holds and countable choice fails in the Cohen model with an infinite Dedekind finite set of reals. – Elliot Glazer Sep 13 '23 at 04:54
  • @ElliotGlazer Yes, absolutely. Sorry, I edited the question. But is there an intermediate level between ZF and ZF+HB? – Bastam Tajik Sep 13 '23 at 09:28
  • 1
    There are going to be many intermediate principles. For instance, you could limit HB to Banach spaces of a certain density character. – James Hanson Sep 13 '23 at 14:50
  • @JamesHanson Even full HB does not imply there is an np ultrafilter on $\mathbb{N},$ see https://www.jstor.org/stable/2272118 Thm 2. Incidentally, full UFL can be decomposed into the ordering principle (OP) and compactness for ordered languages (CfO), the latter implying "HB-for-sep-normed-spaces." I've been investigating the conjecture that OP and CfO are separately (but not together) consistent with a total, isometry-invariant extension of Lebesgue measure on $\mathbb{R}^n.$ – Elliot Glazer Sep 13 '23 at 19:31
  • @ElliotGlazer Oh I see, I misread 'non-principal measure' as 'non-principal ultrafilter.' – James Hanson Sep 13 '23 at 20:37

2 Answers

12

This is perhaps more of an extended comment than a real answer, but I do think it goes a long way towards answering these kinds of questions.

The set-theoretic result referred to as Shoenfield absoluteness is actually extremely broad and in fact covers a huge swath of 'ordinary mathematics' once one understands the relevant techniques for coding statements in terms of real numbers. This is to the extent that I actually suspect that the 'dezornification' mentioned in this answer to the referenced question could actually have been done using an absoluteness argument (which isn't to say that there isn't both practical and theoretical value in actually working out the details as is done in that paper).

Okay so what does Shoenfield absoluteness say? Given any model $V$ of $\mathsf{ZF}$ and an inner model $M \subseteq V$ containing the same ordinals, $\mathbf{\Sigma}^1_2$ sentences are absolute between $M$ and $V$ (i.e., $V$ satisfies a given $\mathbf{\Sigma}^1_2$ sentence if and only if $M$ satisfies the same sentence). A $\mathbf{\Sigma}^1_2$ sentence is roughly speaking a sentence of the form "There exists a real number $x$ such that for all real numbers $y$, $(x,y) \in B$." where $B$ is some Borel set. (Technically we need that the 'instructions for building $B$' live inside $M$, but that's not going to be an issue for how we're going to apply this result.) This implies that $\mathbf{\Pi}^1_2$ sentences (for all reals $x$, there is a real $y$ such that...) are similarly absolute and that $\mathbf{\Sigma}^1_3$ sentences (there exists an $x$ such that for all $y$ there exists a $z$...) are upwards absolute (i.e., if they hold in $M$, then they hold in $V$).
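As a schematic summary of the three sentence shapes just described (a sketch only, suppressing the coding of Borel sets): here $B$ is a Borel set whose code lies in $M$, all quantifiers range over reals, and $\varphi$ denotes the sentence in the same row.

$$
\begin{array}{lll}
\mathbf{\Sigma}^1_2: & \exists x\, \forall y\; (x,y) \in B & M \models \varphi \iff V \models \varphi \\
\mathbf{\Pi}^1_2: & \forall x\, \exists y\; (x,y) \in B & M \models \varphi \iff V \models \varphi \\
\mathbf{\Sigma}^1_3: & \exists x\, \forall y\, \exists z\; (x,y,z) \in B & M \models \varphi \implies V \models \varphi
\end{array}
$$

The second row follows from the first because the negation of a $\mathbf{\Pi}^1_2$ sentence is $\mathbf{\Sigma}^1_2$ (the complement of a Borel set is Borel). The third row follows from the second: a witness $x \in M$ for the outer quantifier turns the rest into a $\mathbf{\Pi}^1_2$ sentence with parameter $x$, which then transfers from $M$ to $V$.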

While this may seem relatively specific (since we're talking about just pairs or maybe triples of real numbers), it's important to recognize that since we're working with Borel conditions (rather than, say, merely continuous operations), a huge amount of information can be encoded in a single real number. The intuitive paradigm is this: Any object that can be specified by a countable amount of data can be encoded as a real number. This includes continuous functions, separable metric spaces, separable manifolds, countable algebraic rings, etc., and it also includes any sequences of such objects, since a countable bundle of countable bundles of data is still countable. (We think of the data as coming with an explicit enumeration, so the whole 'countable union of countable sets' issue is irrelevant here.)
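To make the 'countable bundle of countable bundles' remark concrete, here is a minimal sketch (not part of the original argument) using the Cantor pairing function. The names `pair`, `unpair`, `merge`, and `component` are hypothetical, and the sketch assumes each sequence is handed to us as an explicit function of its index, which is exactly the 'explicit enumeration' proviso above.

```python
from math import isqrt

def pair(i: int, j: int) -> int:
    """Cantor pairing: a bijection from N x N onto N."""
    return (i + j) * (i + j + 1) // 2 + j

def unpair(n: int) -> tuple:
    """Inverse of pair."""
    w = (isqrt(8 * n + 1) - 1) // 2  # largest w with w(w+1)/2 <= n
    j = n - w * (w + 1) // 2
    return w - j, j

def merge(family):
    """Merge a sequence of sequences into a single sequence.

    family(i) must return the i-th sequence as a function of j.
    """
    def merged(n: int):
        i, j = unpair(n)
        return family(i)(j)
    return merged

def component(merged, i: int):
    """Recover the i-th sequence from the merged one."""
    return lambda j: merged(pair(i, j))

# Hypothetical example family: the i-th sequence is j -> i*i + j.
merged = merge(lambda i: (lambda j: i * i + j))
third = component(merged, 3)  # third(4) recovers family(3)(4)
```

No choice is involved anywhere: the merged sequence is defined outright from the given enumerations, and every component can be read back off it.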

The other thing that makes this powerful is the fact that given any such 'countably specified object' $x$, there is an inner model $L[x]$ of $V$ (containing the same ordinals) which satisfies a very strong form of the axiom of choice (specifically, global choice) as well as the (generalized) continuum hypothesis and a lot of other strong set-theoretic principles. In particular, this requires no choice at all in $V$.


Let's think about this in the case of a particular toy model (in possibly excruciating detail). Suppose we're thinking about some explicitly defined ordinary differential equation $x' = f(x)$ (where $f$ is some fixed continuous function) and we want to show that for any given initial conditions $x(t_0) = x_0$, there is a solution to this differential equation that exists in some interval around $t_0$ and is maximal (in the sense that it cannot be extended to a larger interval). Suppose moreover that for the life of us we just can't see how to prove this without appealing to Zorn's lemma or (equivalently) transfinite induction and choice. Regardless, we are able to show the following: $\mathsf{ZFC}$ proves that for any $(x_0,t_0)$, there is an interval $I$ and functions $X_0,X_1: I \to \mathbb{R}$ such that

  • $t_0 \in I$,
  • $X_0'(t) = X_1(t)$ for all $t \in I$,
  • $X_0(t_0) = x_0$,
  • $X_1(t) = f(X_0(t))$ for all $t \in I$, and
  • for any larger interval $J \supset I$, no extension $(Y_0,Y_1)$ of $(X_0,X_1)$ to $J$ satisfies the above conditions.

On its face, this is a mess of quantifiers and seems like it might be too complicated to fit into the absoluteness framework I was talking about before, but it actually can. To make the statement a little more sane, we need some preliminary definitions.

A tame function is a function from $\mathbb{R}$ to $\mathbb{R}$ that is continuous and piecewise linear with rational coefficients, defined by finitely many cases on intervals with rational endpoints. Note that a tame function can be coded by a single natural number. Given an open interval $I$ and a sequence of tame functions $(f^i)$, we say that $(f^i)$ represents a continuous function on $I$ if for each closed interval $K \subset I$ with rational endpoints, the sequence $(f^i)$ is a Cauchy sequence in the sup norm computed on rational points in $K$.
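For readers who like to see such definitions executed, here is a minimal sketch of tame functions in exact rational arithmetic. It is an illustration under stated assumptions, not the coding used in the argument: the class `Tame`, the helper `sup_distance`, and the example `dyadic_interpolant` are hypothetical names, and the sketch only covers tame functions that are constant outside their outermost breakpoints.

```python
from fractions import Fraction

class Tame:
    """Continuous piecewise linear function through rational knots (x_k, y_k).

    Assumption for this sketch: the function is constant to the left of
    the first knot and to the right of the last one.
    """
    def __init__(self, knots):
        pts = sorted((Fraction(x), Fraction(y)) for x, y in knots)
        if not pts or any(a[0] == b[0] for a, b in zip(pts, pts[1:])):
            raise ValueError("need at least one knot and distinct x-values")
        self.knots = pts

    def __call__(self, t):
        t = Fraction(t)
        pts = self.knots
        if t <= pts[0][0]:
            return pts[0][1]
        if t >= pts[-1][0]:
            return pts[-1][1]
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= t <= x1:
                return y0 + (y1 - y0) * (t - x0) / (x1 - x0)

def sup_distance(f, g, a, b):
    """Exact sup of |f - g| on [a, b] for tame f, g and rationals a <= b.

    f - g is piecewise linear, so |f - g| attains its maximum at an
    endpoint of [a, b] or at a knot of f or g inside it.
    """
    a, b = Fraction(a), Fraction(b)
    candidates = {a, b}
    for h in (f, g):
        candidates.update(x for x, _ in h.knots if a <= x <= b)
    return max(abs(f(t) - g(t)) for t in candidates)

def dyadic_interpolant(i):
    """Hypothetical example: interpolate t -> t*t at the points k / 2**i in [0, 1]."""
    n = 2 ** i
    return Tame([(Fraction(k, n), Fraction(k * k, n * n)) for k in range(n + 1)])
```

In this example the sup distance between consecutive interpolants on $[0,1]$ is $4^{-(i+1)}$, so the sequence is Cauchy in the sense defined above and represents $t \mapsto t^2$ there. Since the maximum is attained at a rational point, computing the sup norm on rational points loses nothing, and since each `Tame` object is a finite list of rationals, it can in turn be coded by a single natural number.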

For a fixed $x_0$ and $t_0$, the claim can be written like this: There exists $\alpha \in \mathbb{R}$ such that for all $\beta \in \mathbb{R}$ there exists $\gamma \in \mathbb{R}$ such that

  • $\alpha$ encodes an infinite tuple $(I,X_0^0,X_0^1,X_0^2,\dots,X_1^0, X_1^1, X_1^2,\dots)$ where $I$ is an open interval and $(X_0^i)$ and $(X_1^i)$ are sequences of tame functions representing continuous functions on $I$,
  • $t_0 \in I$,
  • if $\beta \in I$, then the derivative of $\lim_i X^i_0(t)$ at $\beta$ is equal to $\lim_i X^i_1(\beta)$,
  • $\lim_i X^i_0(t_0) = x_0$,
  • if $\beta \in I$, then $\lim_i X^i_1(\beta) = f(\lim_i X^i_0(\beta))$, and

IF $\beta$ codes an infinite tuple $(J,Y_0^0, Y_0^1,Y_0^2,\dots,Y_1^0, Y_1^1,Y_1^2,\dots)$ such that $J \supset I$, $(Y^i_0)$ and $(Y^i_1)$ are sequences of tame functions representing continuous functions on $J$, and $\lim_i X^i_j(r) = \lim_i Y^i_j(r)$ for both $j < 2$ and each rational $r \in I$, THEN $\gamma \in J$ and either

  • the derivative of $\lim_i Y^i_0(t)$ at $\gamma$ is not equal to $\lim_i Y^i_1(\gamma)$ or
  • $\lim_i Y^i_1(\gamma) \neq f(\lim_i Y^i_0(\gamma))$.

I claim that everything after the initial $\alpha\beta\gamma$ quantifiers is a Borel condition. The key is that at each point we're only quantifying over countable sets (such as the naturals or the rationals). This means that for a fixed choice of $t_0$ and $x_0$, this is a $\mathbf{\Sigma}^1_3$ sentence (and I actually think we can get away with only quantifying over rational $\gamma$, so it's probably actually only $\mathbf{\Sigma}^1_2$). Since we proved this in $\mathsf{ZFC}$, we know that it holds in the inner model $L[f,t_0,x_0]$ and then by Shoenfield absoluteness it actually holds in $V$. Therefore our choice-y proof can actually be systematically transformed into a proof in $\mathsf{ZF}$ with no choice at all.
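Writing $\varphi(f,t_0,x_0)$ for the $\mathbf{\Sigma}^1_3$ sentence above (a label introduced here only for this summary, with the continuous function $f$ itself coded by a real), the transfer can be laid out schematically as:

$$
\mathsf{ZFC} \vdash \varphi(f,t_0,x_0)
\;\Longrightarrow\;
L[f,t_0,x_0] \models \varphi(f,t_0,x_0)
\;\Longrightarrow\;
V \models \varphi(f,t_0,x_0).
$$

The first implication uses only that $L[f,t_0,x_0]$ is a model of $\mathsf{ZFC}$; the second is the upward absoluteness of $\mathbf{\Sigma}^1_3$ sentences whose parameters lie in the inner model.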

This argument generalizes and shows that if you have any 'explicitly written' $\Pi^1_4$ sentence and you can prove it in $\mathsf{ZFC}$, then you can actually prove it in $\mathsf{ZF}$. Any $\Pi^1_4$ theorem that's actually written down in a paper is going to be explicit in this sense.


Now regarding GR in particular, there's ostensibly a jump in complexity, since not only does GR deal with PDEs instead of ODEs, but it deals with PDEs on differential manifolds that are themselves influenced by the dynamics of the PDE. Nevertheless, I actually claim that the story is not going to be so different from the toy model above (except for a proportionately greater amount of pain in actually coding things). For just 'ordinary' PDEs, assuming we're talking about bona fide pointwise solutions that are continuous functions (rather than, say, weak solutions), there's no difference in terms of the quantifier complexity between specifying an initial condition that is a single real number and specifying an initial condition that is a continuous function defined on $\mathbb{R}^n$ (or indeed a Borel function of bounded Borel rank or a manifold with some specified Riemannian metric). These things are all countable bundles of data and the questions you ask about them are 'simple' enough that they can be expressed in a Borel way. The same is actually true of a (separable) manifold with a given metric. You can think of the manifold as being coded by a specific countable atlas which can ultimately be coded by a countable bundle of data.

In particular, this is why I think that the aforementioned 'dezornification' could have been done in a completely systematic way using Shoenfield absoluteness. There's an additional wrinkle, which is that the result says that the solution is not just maximal but is in fact the unique maximal solution. This is a little bit more technical to state with GR because of the fact that solutions are manifolds with metrics on them rather than functions on a fixed background, but I believe it can still be formalized as an 'explicit' $\Pi^1_4$ sentence: For all initial data $x$, there is a solution $y$ such that [all proper extensions $z$ fail to be solutions] and [for all solutions $w$, there is a manifold embedding $f$ of $w$ into $y$ witnessing that $y$ is an extension of $w$]. With some massaging, this can be brought into the same form as what we did in the toy model (although this will use the fact that for continuous things we can get away with quantifying over rationals instead of arbitrary reals).

The reason I'm not comfortable calling this a proper answer is that I don't have enough background in mathematical GR to actually get a sense for whether all of the major results in it can be expressed as 'explicit' $\Pi^1_4$ sentences (or simpler), but I would actually be really, really surprised if this weren't the case. And in some sense, this is why I think these kinds of set theoretic issues are a bit of a red herring. $\mathsf{ZF}$ is not really any more 'constructive' than $\mathsf{ZFC}$ when it comes to 'tangibly small' objects. To me the more pernicious issue is whether the resulting object is in fact computable. Non-computable objects usually sneak in through compactness arguments (which are used in the kind of analysis that is relevant to applications such as physics), and (again for tangibly small objects) these are already valid in $\mathsf{ZF}$ alone with no choice at all.

James Hanson
  • 8
    Since this is literally the only venue in which this is going to be relevant, I would also like to complain that Wald's textbook General Relativity contains an inaccurate statement about the axiom of choice. In Appendix A, he gives the long line as an example of a non-paracompact manifold and states that one needs the axiom of choice to construct it, but this is just wrong. $\mathsf{ZF}$ already proves that the long line exists. – James Hanson Sep 07 '23 at 21:56
  • 1
    I think I remember you telling me about this error and complaining that you'd never get a chance to say something about it, way back in Madison. :P – Noah Schweber Sep 07 '23 at 22:00
  • 4
    I have a long and ever growing list of complaints that need very specific contexts to be relevant. – James Hanson Sep 07 '23 at 22:02
  • Remember, you only get half points if you engineer the context yourself. – Noah Schweber Sep 07 '23 at 22:02
  • 3
    I didn't ask the question. – James Hanson Sep 07 '23 at 22:03
  • 1
    I know, I was thinking about future instances. :) – Noah Schweber Sep 07 '23 at 22:04
  • 1
    Re: "I think that the aforementioned 'dezornification' could have been done in a completely systematic way using Shoenfield absoluteness." I disagree: I am certain that none of the folks involved there knew anything about Shoenfield absoluteness, so we certainly couldn't have done it. :-) Now if you were to want to write a paper... – Willie Wong Sep 08 '23 at 05:05
  • A quick n00b question: in your fifth paragraph, what is the relation between $x$ and $L[x]$? You asserted that for every object $x$ of a certain type, there is a model $L[x]$ but not how the two are related. – Willie Wong Sep 08 '23 at 05:14
  • @WillieWong Unfortunately defining the parameter-free version $L$ carefully already takes a good chunk of a graduate-level set theory course. Intuitively $L[x]$ is the 'universe of sets that are explicitly definable in terms of $x$ and the ordinals.' – James Hanson Sep 08 '23 at 06:49
  • @JamesHanson But without choice, it’s consistent that the long line is paracompact (this is equivalent to $\omega_1$ being singular). I bet it’s consistent with ZF that all manifolds are paracompact. – Elliot Glazer Sep 12 '23 at 05:47
  • @ElliotGlazer I had wondered about this but the precise phrasing in the book really makes me think Wald didn't have this nuance in mind. He says "the 'long line'...is perhaps the simplest example [of a non-paracompact manifold], although the axiom of choice is required to define it." My guess is that Wald picked up the idea that you need choice to even define $\omega_1$, which people sometimes say (and which certainly isn't true). – James Hanson Sep 12 '23 at 12:43
  • @ElliotGlazer In the next paragraph, Wald talks about how paracompactness is equivalent to admitting a Riemannian metric and also to being second countable (for manifolds). Are these equivalences still true without choice? – James Hanson Sep 12 '23 at 12:55
  • 2
    @JamesHanson They’re inequivalent. It’s a ZF theorem that the long line is not metrizable (if an ordinal $\alpha$ has a metric, one can transfinitely construct a canonical enumeration for each $\beta \le \alpha$). – Elliot Glazer Sep 12 '23 at 14:20
  • 1
    @JamesHanson A second countable manifold is a Polish space, so Shoenfield applies and you get basically the whole ZFC theory, including embeddings into $\mathbb{R}^n.$ I don’t know if being connected metrizable or even a connected Riemannian manifold is enough to prove second countability. – Elliot Glazer Sep 12 '23 at 14:27
  • @ElliotGlazer Do you have an argument that the Prüfer manifold can be paracompact in ZF? – James Hanson Sep 12 '23 at 15:31
  • For a connected Riemannian manifold, I think this works: fix $p,$ an $(n-1)$-sphere $S$ centered at $p,$ and a countable dense $X \subset S.$ For points $q$ of rational distance from $p$ via a geodesic through a point in $X,$ take the balls $B(q, 1/n).$ – Elliot Glazer Sep 12 '23 at 15:50
  • @ElliotGlazer The Prüfer manifold doesn't have a countable dense set. It's the other standard example of a non-paracompact manifold. – James Hanson Sep 12 '23 at 16:53
  • @JamesHanson Sorry, my previous comment was just following up on whether Riemannian manifolds are second countable. I haven’t checked if the Prüfer manifold can be paracompact. – Elliot Glazer Sep 12 '23 at 17:17
6

As Ryan Budney mentioned in a comment, there is some ambiguity about what exactly you mean by "general relativity." General relativity is primarily a physical theory rather than a mathematical theory. So one way to interpret your question is:

Are there any physical predictions or calculations in general relativity that cannot be derived by working entirely in ZF (or some other mathematical foundation that is weaker than ZF)?

This version of the question is what I believe James Hanson is addressing in his answer. The main point is that any "physically meaningful content" of general relativity is highly likely to be mathematically formalizable in some theory that is much weaker than ZF. See my answer to another MO question for a couple of references.

However, you used the phrase, "mathematical general relativity," suggesting that you're primarily interested not in preserving just the "physically meaningful content" of general relativity, but some particular body of mathematics that goes under the label of "general relativity." The difficulty with answering this version of the question is that the boundaries of this particular body of mathematics aren't sharply defined. In the MO question that you linked to, there was a specific theorem being discussed, so the question of whether it can be proved without AC is clearly defined. But if you want to widen the question to encompass all of "mathematical general relativity," then it is no longer clear exactly which theorems you're asking about.

I think that any attempt to make your question more precise is going to result either in a question of the form, "Can this specific theorem, which is widely regarded to be part of mathematical general relativity, be proved without AC?" or a question of the form, "Can we develop from scratch all the physically relevant parts of GR using logically weak axioms?" which basically brings us back to the first interpretation of your question above.

Timothy Chow
  • Maybe I could ask: what are the most general theorems (those involving the least structure) in Lorentzian geometry that are lost when one limits oneself to ZF set theory? – Bastam Tajik Sep 09 '23 at 12:42
  • Hi Timothy, I think a more accurate statement would use "proved" rather than "derived", in particular "proved to exist" for solutions with desired properties, like cosmological solutions with desired curvature, stability of black holes, cosmic censorship,... Because to "derive" is what physicists do all the time, and it is mostly intuitive reasoning, often through formal manipulations carelessly generalizing from a few physical cases of interest, and almost always assuming any required mathematical hypothesis (like existence of solutions) or rather overlooking them. – plm Sep 09 '23 at 12:50
  • 2
    @BastamTajik I think that before addressing that question, one must first grapple with the more fundamental question of how to develop basic analysis on the basis of ZF. See this MO answer for example. For example, there are annoying technicalities associated with developing measure theory purely on the basis of ZF. You might lose some theorems for some "technical" reasons that don't really have anything to do with Lorentzian geometry per se. – Timothy Chow Sep 09 '23 at 15:31