72

Let me begin by formulating a concrete (if not 100% precise) question, and then I'll explain what my real agenda is.

Two key facts about forcing are (1) the definability of forcing; i.e., the existence of a notion $\Vdash^\star$ (to use Kunen's notation) such that $p\Vdash \phi$ if and only if $(p \Vdash^\star \phi)^M$, and (2) the truth lemma; i.e., anything true in $M[G]$ is forced by some $p\in G$.
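
In the standard $P$-name notation (with $\tau$ a name and $\tau^G$ its interpretation in $M[G]$), and stating the truth lemma in its full two-directional form, these two facts read: for every formula $\phi(x_1,\dots,x_n)$,
$$p\Vdash \phi(\tau_1,\dots,\tau_n) \iff \big(p \Vdash^\star \phi(\tau_1,\dots,\tau_n)\big)^M,$$
$$M[G]\models \phi(\tau_1^G,\dots,\tau_n^G) \iff \exists p\in G\ \big(p\Vdash \phi(\tau_1,\dots,\tau_n)\big).$$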

I am wondering if there is a way to "axiomatize" these facts by saying what properties forcing must have, without actually introducing a poset or saying that $G$ is a generic filter or that forcing is a statement about all generic filters, etc. And when I say that forcing "must have" these properties, I mean that by using these axioms, we can go ahead and prove that $M[G]$ satisfies ZFC, and only worry later about how to construct something that satisfies the axioms.


Now for my hidden agenda. As some readers know, I have written A beginner's guide to forcing where I try to give a motivated exposition of forcing. But I am not entirely satisfied with it, and I have recently been having some interesting email conversations with Scott Aaronson that have prompted me to revisit this topic.

I am (and I think Scott is) fairly comfortable with the exposition up to the point where one recognizes that it would be nice if one could add some function $F : \aleph_2^M \times \aleph_0 \to \lbrace 0,1\rbrace$ to a countable transitive model $M$ to get a bigger countable transitive model $M[F]$. It's also easy to grasp, by analogy from algebra, that one also needs to add further sets "generated by $F$." And with some more thought, one can see that adding arbitrary sets to $M$ can create contradictions, and that even if you pick an $F$ that is "safe," it's not immediately clear how to add a set that (for example) plays the role of the power set of $F$, since the "true" powerset of $F$ (in $\mathbf{V}$) is clearly the wrong thing to add. It's even vaguely plausible that one might want to introduce "names" of some sort to label the things you want to add, and to keep track of the relations between them, before you commit to saying exactly what these names are names of. But then there seems to be a big conceptual leap to saying, "Okay, so now instead of $F$ itself, let's focus on the poset $P$ of finite partial functions, and a generic filter $G$. And here's a funny recursive definition of $P$-names." Who ordered all that?

In Cohen's own account of the discovery of forcing, he wrote:

There are certainly moments in any mathematical discovery when the resolution of a problem takes place at such a subconscious level that, in retrospect, it seems impossible to dissect it and explain its origin. Rather, the entire idea presents itself at once, often perhaps in a vague form, but gradually becomes more precise.

So a 100% motivated exposition may be a tad ambitious. However, it occurs to me that the following strategy might be fruitful. Take one of the subtler axioms, such as Comprehension or Powerset. We can "cheat" by looking at the textbook proof that $M[G]$ satisfies the axiom. This proof is actually fairly short and intuitive if you are willing to take for granted certain things, such as the meaningfulness of this funny $\Vdash$ symbol and its two key properties (definability and the truth lemma). The question I have is whether we can actually produce a rigorous proof that proceeds "backwards": We don't give the usual definitions of a generic filter or of $\Vdash$ or even of $M[G]$, but just give the bare minimum that is needed to make sense of the proof that $M[G]$ satisfies ZFC. Then we "backsolve" to figure out that we need to introduce a poset and a generic filter in order to construct something that satisfies the axioms.

If this can be made to work, then I think it would greatly help "ordinary mathematicians" grasp the proof. In ordinary mathematics, expanding a structure $M$ to a larger structure $M[G]$ never requires anything as elaborate as the forcing machinery, so it feels like you're getting blindsided by some deus ex machina. Of course the reason is that the axioms of ZFC are so darn complicated. So it would be nice if one could explain what's going on by first looking at what is needed to prove that $M[G]$ satisfies ZFC, and use that to motivate the introduction of a poset, etc.

By the way, I suspect that in practice, many people learn this stuff somewhat "backwards" already. Certainly, on my first pass through Kunen's book, I skipped the ugly technical proof of the definability of forcing and went directly to the proof that $M[G]$ satisfies ZFC. So the question is whether one can push this backwards approach even further, and postpone even the introduction of the poset until after one sees why a poset is needed.

Timothy Chow
  • 78,129
  • 7
    Here's a slightly tongue-in-cheek answer: if the mathematician knows about sheaves, say you are taking certain sheaves over a site whose objects are the forcing conditions, and then the complicated stuff in forcing comes from (a) passing to a two-valued model by a giant quotienting operation and then (b) simulating sets as well-founded trees in the result. From the point of view of the forcing relation, you don't do (a) and (b) but implicitly work with the sheaves, possibly passing to "dense open subsets" in the site (to abuse some language) to show things hold. – David Roberts Aug 21 '20 at 09:45
  • 5
    @David: Does that mean that someone can explain sheaves to me in a way that I can understand, just because I understand forcing? :-) – Asaf Karagila Aug 21 '20 at 10:18
  • 1
    @DavidRoberts : For such a person, I think the Boolean-valued model approach that I actually took in my paper should be reasonably palatable. Scott Aaronson (among others) found that approach off-putting, so I am trying to find a path that doesn't appeal to multi-valued models. – Timothy Chow Aug 21 '20 at 14:33
  • @TimothyChow yes, that's pretty much the gist of it. But I can imagine that, given Aaronson's area of expertise, sheaves are not exactly the thing that would make it make more sense for him. I must say I appreciated the early version of your writeup when I was trying to grasp the idea of forcing. – David Roberts Aug 21 '20 at 23:45
  • @Asaf maybe :-) That would be a good MO question, I think... – David Roberts Aug 21 '20 at 23:46
  • 2
    I have wondered whether Mac Lane and Moerdijk's book Sheaves in Geometry and Logic has ever helped someone with a background in logic and set theory understand modern geometry. It seemed to be written for people going in the other direction. – Timothy Chow Aug 22 '20 at 00:27
  • 1
    @AsafKaragila: The topos of sheaves on a site "is" the category of sets in the forcing model of IZF (so this is Heyting-valued rather than Boolean-valued forcing) obtained by adjoining a generic flat cover-preserving presheaf on that site. (It's not exactly that because of the opposites of David's (a) and (b).) – Mike Shulman Aug 22 '20 at 08:57
  • 1
    I asked a similar question in https://math.stackexchange.com/questions/3576510/abstracting-the-general-forcing-argument-from-case-specific-arguments – Daniel Schepler Aug 22 '20 at 17:47
  • 1
    On variations of exposition: Did you check this paper by Moore? Perhaps it might have some extra ingredient for your recipe. – Pedro Sánchez Terraf Aug 22 '20 at 22:52
  • 3
    I just wanted to say I already really like the beginner's guide you wrote! Please let us know if there will be an improved version. – Vincent Aug 26 '20 at 07:45

6 Answers

37

This is an expansion of David Roberts's comment. It may not be the sort of answer you thought you were looking for, but I think it is appropriate, among other reasons because it directly addresses your question

if there is a way to "axiomatize" these facts by saying what properties forcing must have.

In fact, modern mathematics has developed a powerful and general language for "axiomatizing properties that objects must have": the use of universal properties in category theory. In particular, universal properties give a precise and flexible way to say what it means to "freely" or "generically" add something to a structure.

For example, suppose we have a ring $R$ and we want to "generically" add a new element. The language of universal properties says that this should be a ring $R[x]$ equipped with a homomorphism $c:R\to R[x]$ and an element $x\in R[x]$ with the following universal property: for any ring $S$ equipped with a homomorphism $f:R\to S$ and an element $s\in S$, there exists a unique homomorphism $h:R[x]\to S$ such that $h\circ c = f$ and $h(x) = s$.

Note that this says nothing about how $R[x]$ might be constructed, or even whether it exists: it's only about how it behaves. But this behavior is sufficient to characterize $R[x]$ up to unique isomorphism, if it exists. And indeed it does exist, but to show this we have to give a construction: in this case we can of course use the ring of formal polynomials $a_n x^n + \cdots + a_1 x + a_0$.
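
To make the "behavior, not construction" point concrete, here is a minimal sketch (the coefficient-list encoding, the map $f$, and the element $s$ are illustrative stand-ins, not anything from the answer): once $h$ is required to agree with $f$ on constants and to send $x$ to $s$, the ring axioms leave no freedom at all; $h$ must be "push the coefficients through $f$ and evaluate at $s$".

```python
# A polynomial in R[x] is encoded as its coefficient list [a0, a1, ..., an] (entries in R).
# Given a ring homomorphism f: R -> S (any callable preserving + and *) and an element s of S,
# the universal property leaves exactly one choice for h: R[x] -> S.

def induced_hom(f, s):
    """Return the unique h with h(constant a) = f(a) and h(x) = s."""
    def h(poly):
        acc = f(poly[-1])
        for a in reversed(poly[:-1]):
            acc = acc * s + f(a)  # Horner's rule, computed entirely inside S
        return acc
    return h

# Example: R = S = the integers, f = identity, s = 2; then h(1 + 3*x^2) = 1 + 3*4 = 13.
h = induced_hom(lambda a: a, 2)
assert h([1, 0, 3]) == 13
```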

From this perspective, if we want to add a function $F : \aleph_2\times \aleph_0 \to 2$ to a model $M$ of ZFC to obtain a new model $M[F]$, the correct thing to do would be to find a notion of "homomorphism of models" such that $M[F]$ can be characterized by a similar universal property: there would be a homomorphism $c:M\to M[F]$ and an $F : \aleph_2\times \aleph_0 \to 2$ in $M[F]$, such that for any model $N$ equipped with a homomorphism $f:M\to N$ and a $G : \aleph_2\times \aleph_0 \to 2$ in $N$, there is a unique homomorphism $h:M[F]\to N$ such that $h\circ c = f$ and $h(F) = G$.

The problem is that the usual phrasing of ZFC, in terms of a collection of things called "sets" with a membership relation $\in$ satisfying a list of axioms in the language of one-sorted first-order logic, is not conducive to defining such a notion of homomorphism. However, there is an equivalent formulation of ZFC, first given by Lawvere in 1964, that works much better for this purpose. (Amusingly, 1964 is exactly halfway between 1908, when Zermelo first proposed his list of axioms for set theory, and the current year 2020.) In Lawvere's formulation, there is a collection of things called "sets" (although they behave differently than the "sets" in the usual presentation of ZFC) and also a separate collection of things called "functions", which together form a category (i.e. functions have sets as domain and codomain, and can be composed), and satisfy a list of axioms written in the language of category theory. (A recent short introduction to Lawvere's theory is this article by Tom Leinster.)

Lawvere's theory is usually called "ETCS+R" (the "Elementary Theory of the Category of Sets with Replacement"), but I want to emphasize that it is really an entirely equivalent formulation of ZFC. That is, there is a bijection between models of ZFC, up to isomorphism, and models of ETCS+R, up to equivalence of categories. In one direction this is exceedingly simple: given a model of ZFC, the sets and functions therein as usually defined form a model of ETCS+R. Constructing an inverse bijection is more complicated, but the basic idea is the Mostowski collapse lemma: well-founded extensional relations can be defined in ETCS+R, and the relations of this sort in any model of ETCS+R form a model of ZFC.

Since a model of ETCS+R is a structured category, there is a straightforward notion of morphism between models: a functor that preserves all the specified structure. However, this notion of morphism has two defects.

The first is that the resulting category of models of ETCS+R is ill-behaved. In particular, the sort of "free constructions" we are interested in do not exist in it! However, this is a problem of a sort that is familiar in modern structural mathematics: when a category is ill-behaved, often it is because we have imposed too many "niceness" restrictions on its objects, and we can recover a better-behaved category by including more "ill-behaved" objects. For instance, the category of manifolds does not have all limits and colimits, but it sits inside various categories of more general "smooth spaces" that do. The same thing happens here: by dropping two of the axioms of ETCS+R we obtain the notion of an elementary topos, and the category of elementary toposes, with functors that preserve all their structure (called "logical functors"), is much better-behaved. In particular, we can "freely adjoin a new object/morphism" to an elementary topos.

(I am eliding here the issue of the replacement/collection axiom, which is trickier to treat correctly for general elementary toposes. But since my main point is that this direction is a blind alley for the purposes of forcing anyway, it doesn't matter.)

The second problem, however, is that these free constructions of elementary toposes do not have very explicit descriptions. This is important because our goal is not merely to freely adjoin an $F:\aleph_2\times \aleph_0 \to 2$, but to show that the existence of such an $F$ is consistent, and for this purpose we need to know that when we freely adjoin such an $F$ the result is nontrivial. Thus, in addition to characterizing $M[F]$ by a universal property, we need some concrete construction of it that we can inspect to deduce its nontriviality.

This problem is solved by imposing a different niceness condition on the objects of our category and changing the notion of morphism. A Grothendieck topos is an elementary topos that, as a category, is complete and cocomplete and has a small generating set. But, as shown by Giraud's famous theorem, it can equivalently be defined as a cocomplete category with finite limits and a small generating set where the finite limits and small colimits interact nicely. This suggests a different notion of morphism between Grothendieck toposes: a functor preserving finite limits and small colimits. Let's call such a functor a Giraud homomorphism (it's the same as a "geometric morphism", but pointing in the opposite direction).

The category of Grothendieck toposes and Giraud homomorphisms is well-behaved, and in particular we can freely adjoin all sorts of structures to a Grothendieck topos -- specifically, any structure definable in terms of finite limits and arbitrary colimits (called "a model of a geometric theory"). (To be precise, this is a 2-category rather than a category, and the universal properties are up to isomorphism, but this is a detail, and unsurprising given the modern understanding of abstract mathematics.) Moreover, the topos $M[G]$ obtained by freely adjoining a model $G$ of some geometric theory to a Grothendieck topos $M$ -- called the classifying topos of the theory of $G$ -- has an explicit description in terms of $M$-valued "sheaves" on the syntax of the theory of $G$. This description allows us to check, in any particular case, that it is nontrivial. But for other purposes, it suffices to know the universal property of $M[G]$. In this sense, the universal property of a classifying topos is an answer to your question:

when I say that forcing "must have" these properties, I mean that by using these axioms, we can go ahead and prove that $M[G]$ satisfies ZFC, and only worry later about how to construct something that satisfies the axioms.

Only one thing is missing: not every Grothendieck topos is a model of ETCS+R, hence $M[G]$ may not itself directly yield a model of ZFC. We solve this in three steps. First, since ZFC satisfies classical logic rather than intuitionistic logic (the natural logic of categories), we force $M[G]$ to become Boolean. Second, by restricting to "propositional" geometric theories we ensure that the result also satisfies the axiom of choice. Finally, we pass to the "internal logic" of the topos, which is to say that we allow "truth values" lying in its subobject classifier rather than in the global poset of truth values $2$. We thereby get an "internal" model of ETCS+R, and hence also an "internal" model of ZFC.

So where does the complicated machinery in the usual presentation of forcing come from? Mostly, it comes from "beta-reducing" this abstract picture, writing out explicitly the meaning of "well-founded extensional relation internal to Boolean sheaves on the syntax of a propositional geometric theory". The syntax of a propositional geometric theory yields, as its Lindenbaum algebra, a poset. The Boolean sheaves on that poset are, roughly, those that satisfy the usual "denseness" condition in forcing. The "internal logic" valued in the subobject classifier corresponds to the forcing relation over the poset. And the construction of well-founded extensional relations translates to the recursive construction of "names".

(Side note: this yields the "Boolean-valued models" presentation of forcing. The other version, where we take $M$ to be countable inside some larger model of ZFC and $G$ to be an actual generic filter living in that larger model, is, at least to first approximation, an unnecessary complication. By comparison (and in jesting reference to Asaf's answer), if we want to adjoin a new transcendental to the field $\mathbb{Q}$, we can simply construct the field of rational functions $\mathbb{Q}(x)$. From the perspective of modern structural mathematics, all we care about are the intrinsic properties of $\mathbb{Q}(x)$; it's irrelevant whether it happens to be embeddable in some given larger field like $\mathbb{R}$ by setting $x=\pi$.)

The final point is that it's not necessary to do this beta-reduction. As usual in mathematics, we get a clearer conceptual picture, and have less work to do, when working at an appropriate level of abstraction. We prove the equivalence of ZFC and ETCS+R once, abstractly. Similarly, we show that we have an "internal" model of ETCS+R in any Grothendieck topos. These proofs are easier to write and understand in category-theoretic language, using the intrinsic characterization of Grothendieck toposes rather than anything to do with sites or sheaves. With that done, the work of forcing for a specific geometric theory is reduced to understanding the relevant properties of its category of Boolean sheaves, which are simple algebraic structures.

Mike Shulman
  • 65,064
  • 2
    This answer gives me a faint, but still better, hope of understanding forcing from a type-theoretic point of view than I've ever had before. Thanks. – Jacques Carette Aug 26 '20 at 12:25
  • 1
    @JacquesCarette: Quite the opposite for me, to be honest. Whereas Timothy was lamenting how non-set theorists get confused and wonder why you need all this complex machinery for forcing, Mike's answer (interesting and illuminating as it may be) makes me wonder why these things are even necessary. Yes, there's a "learning hump" (as with everything else in mathematics), but once you get over it, forcing is pretty easy to understand. If you feel like Mike's answer is "easier", that just means that you don't want to put in the effort of getting over the hump from the set theory side (which is fine). – Asaf Karagila Aug 26 '20 at 15:32
  • 8
    @AsafKaragila The point is that if you instead put in the effort to get over the "universal hump" of learning category theory, you end up at a high enough place that you can see over all other humps without any extra effort. (-:O – Mike Shulman Aug 26 '20 at 20:04
  • Mike, it seems to me that if you're trying to understand forcing like this, then you're not trying to understand forcing, but rather a similar technique in algebraic set theory, which could be perhaps described as "algebraic forcing". This is, at the end, not really Cohen's argument, to the point where forcing is important because it preserves the well-foundedness (and transitivity) of the ground model. This is exactly the point where we care that the model is not abstract, but concrete. – Asaf Karagila Aug 26 '20 at 20:38
  • @AsafKaragila: I am not sure I understand your claims about well-foundedness. Mike's answer explicitly points out that the resulting model is well-founded when he says “well-founded extensional relations can be defined in ETCS+R, and the relations of this sort in any model of ETCS+R form a model of ZFC”. – Dmitri Pavlov Aug 26 '20 at 22:27
  • @DmitriPavlov: But ETCS(+R) is not a theory where one thinks about transitive models, this is in contrast to ZFC. Exactly like if you're a number theorist, $\Bbb Q(\pi)$ is just a field; but if you're doing combinatorial semigroup theory, it is an ordered subfield of $\Bbb R$ (as per Carl-Fredrik's comment below my answer). There is a point where abstraction is not necessarily more illuminating. – Asaf Karagila Aug 26 '20 at 22:48
  • @AsafKaragila The usual/original motivation for forcing is, I believe, to prove that a certain statement is unprovable in ZFC, and for that purpose it doesn't matter what kind of model you end up with. I can believe that for other purposes the details might matter more, although I'd be surprised if there weren't also an algebraic version in that case. Can you explain why one might care that forcing preserves the well-foundedness and transitivity of the ground model? – Mike Shulman Aug 26 '20 at 23:54
  • 4
    @Asaf Maybe one should view it as the difference between knowing about tensor products, their universal property and multilinear algebra, and knowing about tensors as physicists do, all concrete symbol manipulation and efficient calculation. These two viewpoints are inherently different and achieve different viewpoints. Ideally one learns both! – David Roberts Aug 27 '20 at 01:22
  • @Mike: This is very much reflected in the fact that forcing preserves transitivity and does not add ordinals. A striking contrast with other model theoretic methods (ultraproducts, saturated models, compactness methods) which tend to modify the original model completely. There's a reason we do forcing with transitive models, and there's a reason proper forcing is called proper: it is forcing where you can take "nice models of nice fragments of ZFC and the generic extension commutes with the Mostowski collapse", in some sense, that is the proper way to approach forcing. – Asaf Karagila Aug 27 '20 at 08:22
  • @Mike: In other words, constructions that preserve "niceness" are not uncommon in mathematics. You want extensions of a category such that your original one is full in them, or you want compactifications of your space such that the space does not lose too many of its properties. There's a reason why the study of algebraic number theory is on finite algebraic extensions, and not in $\Bbb C$ as a field extension of $\Bbb Q$ (even though we sometimes go there, for clarity of thought), we want to have nice things. Set theorists want to have nice things as well. – Asaf Karagila Aug 27 '20 at 08:25
  • 1
    @Asaf: I can't see how the word 'nice' by itself explains very much. Surely you must be able to finish off this sentence without using it, "And the reason we set theorists look to preserves the well-foundedness (and transitivity) of the ground model is..." – David Corfield Aug 27 '20 at 10:00
  • @DavidCorfield: I can't see why this is a problem. Would you tell analysts to work in some hyperreal extension of size $(2^{\aleph_{\omega_1}})^+$? No. They work with the real numbers because they have nice properties. If you want to be concrete, transitive models are those that agree with the ambient universe on the membership relation and on the actual elements. I understand that in algebra none of this matters, but much to the dismay of some people, I suppose, not all of mathematics is algebra... – Asaf Karagila Aug 27 '20 at 10:03
  • 1
    @AsafKaragila I think analysis is not a good example. There's only one Dedekind-complete ordered field, and that's what analysts work with. Note that you definitely don't always want an extension of your category such that the original one is full in it, since that is not true for the category of sets of a model M and the category of sets of M[G]. :-) – David Roberts Aug 27 '20 at 10:21
  • @DavidR: Hilbert spaces, Banach spaces, fine, you get my point, find a good example. That's not the point. The point is that there is some "nice property" that we are interested in, because it interacts well with the universe, or with how we perceive that the universe "should" interact with our objects. Algebra is not about interactions with the universe, and that's fine, but not all mathematics is about viewing things through that lens. It's just not always that useful. – Asaf Karagila Aug 27 '20 at 10:25
  • @DavidR: Also, why do you need the Dedekind-completeness property? You can just as well work with any field that has sufficient structure if you look at analysis through an algebraic lens. You just want certain types to be realised, and you want to ensure that the types you get from "things that interest you" are included in these certain types. You can recast a lot of analysis in terms that would make sense in higher-up models. So let's ditch these fields and move on to wilder models that lie to the west, and have a real wild west vibe in freshman Calculus I. – Asaf Karagila Aug 27 '20 at 10:33
  • 2
    @AsafKaragila Sure, any series of requests as to why someone wants something will only go so far, ending perhaps with "I just like it". I'm surprised here it's reached so soon. Even a vague "I have the sense that it may be useful for the next step the field should take" adds something more. And now while writing this I see you are adding such content. – David Corfield Aug 27 '20 at 10:33
  • @DavidC: I'm sorry, I'm being a bit defensive here, perhaps because I feel ganged up on by people who try to tell me that forcing should be seen through an algebraic lens. I mean, why is the first thing we prove about forcing that it preserves transitivity and does not add ordinals, if that's not important? You know, I mean, who cares about transitive models? Well, apparently set theorists do. Why? Because they are very nice. That's why we call them "standard models". Why do algebraists care so much about the "universal property" everywhere? Because it's nice and they like it... – Asaf Karagila Aug 27 '20 at 10:36
  • 2
    @AsafKaragila I'm still waiting to hear something concrete that you can do with nonalgebraic forcing that you can't do with algebraic forcing. If it's just an aesthetic preference, there's nothing a priori wrong with that, but I look at the other answers to this question and I see people struggling to use that perspective to give intuition for something that in the algebraic picture is much simpler and obvious, so it's hard for me to see what's "nicer" about it other than that you're used to it or haven't put the effort in to become familiar with the algebraic approach. – Mike Shulman Aug 27 '20 at 12:57
  • 3
    @AsafKaragila "tell me that forcing should be seen through an algebraic lens" I'm not we are telling you that, the question asked for something that would make sense for people who aren't set theorists. This answer is really not meant for people who understand ZF(C) inside out and force six impossible things before breakfast, but for 'generic mathematicians', in the sense that Kevin Buzzard talks about. At best they know what a category is, and what sets are, but are specialists in neither. – David Roberts Aug 27 '20 at 13:28
  • @Mike: How do you prove preservation theorems for countable support iterations of proper forcings? How do you deal with the machinery of proper forcings to begin with? Yes, there is a template that lets you copy-paste standard arguments into algebraic ones, which means it is probably impossible to point at something "you can't do", but if you try to understand forcing with side conditions, suddenly well-founded models become an important tool. – Asaf Karagila Aug 27 '20 at 13:35
  • @DavidR: I'm not complaining that the answer was given. I agree, it's illuminating. But it also feels to me that it is written to people who understand sheaves and toposes much more than "generic mathematicians". Yes, it's probably a new, wider audience whose intersection with set theorists is small, and that's great. I'm not here to take this away from Mike, or anyone else. I'm just trying to point out that it's not exactly "generic mathematicians" either. – Asaf Karagila Aug 27 '20 at 13:39
  • 1
    @Asaf the point at hand is not "how do you prove such-and-such a theorem", but how to explain forcing to someone that isn't already trained in its mysteries. Bringing up the practice of professionals in this space is a red herring. – David Roberts Aug 27 '20 at 14:30
  • 1
    @David: But the question is what does it mean "to explain forcing", when you take the method, change its setting, change its setup, change its environment, and change its language? Theseus had less problems with maintaining his ship. – Asaf Karagila Aug 27 '20 at 14:31
  • 1
    @Asaf see my comment about tensors above. Two approaches to the same thing, in different fields of research, with vastly different notation, practice, aim etc. – David Roberts Aug 27 '20 at 14:40
  • 1
    Maybe we are seeing the same problem at a meta-level. We "algebraists" are used to the idea that isomorphic objects are indistinguishable. So since "algebraic forcing" is isomorphic to ordinary forcing, we don't understand why anyone would feel the need to distinguish between them. Ultimately, this idea of isomorphism-invariance is probably one that can only be learned through experience, so continuing to try to justify it verbally is unlikely to get anywhere. – Mike Shulman Aug 27 '20 at 15:50
  • 1
    @Mike: And this is exactly the problem you'll find yourself in when you're trying to deal with improper forcing and "the usual finitary approach", where we take a countable elementary submodel of some $H(\kappa)$ and force over it. If the forcing is improper, then the generic extension is not going to commute with the transitive collapse. Which exactly tells you that something is not invariant under isomorphism (or that the Mostowski collapse is somehow the wrong isomorphism, or that the generic filter is somehow "wrong", I guess). – Asaf Karagila Aug 27 '20 at 17:30
  • 1
    Well, I don't have the time to actually understand all that right now, and I can't even get the Internet to tell me what "improper forcing" is. But I'll register my skepticism that anything mathematical can actually fail to be invariant under isomorphism, for a correct definition of "isomorphism". (-: – Mike Shulman Aug 27 '20 at 19:01
  • @Asaf I'd be interested to learn more about this phenomenon, but not here. You have my email address... – David Roberts Aug 27 '20 at 22:05
  • Mike, $\Bbb P$ is a proper forcing if for every large enough regular $\kappa$, if $M$ is a countable elementary submodel of $H(\kappa)$ and $\Bbb P\in M$, then every condition in $\Bbb P\cap M$ can be extended to a condition which is $M$-generic, in the sense that every dense open $D\in M$ satisfies that $D\cap M$ is predense below the extension. Equivalently, this means that if we do the Mostowski collapse, add a generic to the model, and undo the collapse, we get "what we'd expect". A forcing is improper if it is not proper. Now, every proper forcing preserves $\omega_1$, for example. – Asaf Karagila Aug 27 '20 at 22:11
  • @DavidR: You can email me as well. Normally that's how it works, if you have a question, you send an email. :-) – Asaf Karagila Aug 27 '20 at 22:58
  • @AsafKaragila Touché. Will do. – David Roberts Aug 27 '20 at 23:38
  • 1
    To wrap up this overly-long comment thread, then, let me emphasize that I didn't mean to deny that the traditional perspective on forcing has important insights and uses (despite my own ignorance of what those might be). My point was that I think the algebraic perspective is a better way to explain forcing to a newcomer (as the original question asked), and in particular solves the specific issue asked about by the OP. After one is no longer a newcomer, one should certainly learn other ways of thinking about it as well. – Mike Shulman Aug 28 '20 at 00:53
  • 1
    I have not attempted to read all the comments; this might be redundant: I'd love to see an article implementing this overview in detail. It would be nice to carry it to the point of looking at a few specific famous examples of posets to force over (e.g. the one Cohen used for $\neg CH$) and see how the properties whose consistency they prove follow from the universal property of an appropriate classifying topos, and "compute" that these posets are the correct ones to use. I think this is almost done in Mac Lane and Moerdijk, but maybe they're not explicit about the universal properties? – Tim Campion Aug 31 '20 at 23:34
  • @TimCampion Yeah, there's definitely room for a nice expository article there. – Mike Shulman Sep 01 '20 at 00:23
  • Sorry for commenting two years later. Still, where can I find a proof of such a bijection between models of ZFC and ETCS+R? Also, is such a construction a functorial isomorphism or just a bijection? – user40276 Apr 30 '22 at 19:09
  • 1
    @user40276 The original proofs were due to Cole, Mitchell, and Osius. Maybe I can be forgiven for pointing to my own https://arxiv.org/abs/1808.05204v2, which includes all those citations, and which does the same constructions in a more general context using a version of the replacement axiom that I think is more useful in practice. – Mike Shulman May 01 '22 at 00:24
  • 1
    Functoriality is a bit tricky to make sense of, since you have to specify what a "map of models of ZFC" is, which isn't necessarily obvious. Mitchell claimed that the constructions are adjoint functors in some sense. – Mike Shulman May 01 '22 at 00:26
  • @MikeShulman Do you expect class forcing to be straightforward to develop in ETCS+R? I don't know that much about class forcing and I don't have explicit experience with ETCS+R, but it seems to me that things involving the large-scale structure of proper classes (e.g., most of inner model theory) would be easier to formalize in a context where the large-scale structure of models is rigidly controlled. – James Hanson Nov 23 '23 at 05:13
  • @JamesHanson I don't know enough about fancier versions of forcing to answer questions like that. David Roberts has thought some about class forcing in topos theory. – Mike Shulman Nov 24 '23 at 00:05
33

I have proposed such an axiomatization. It is published in Comptes Rendus: Mathématique, which has returned to the Académie des Sciences in 2020 and is now completely open access. Here is a link:

https://doi.org/10.5802/crmath.97

The axiomatization I have proposed is as follows:

Let $(M, \mathbb P, R, \left\{\Vdash\phi : \phi\in L(\in)\right\}, C)$ be a quintuple such that:

  • $M$ is a transitive model of $ZFC$.

  • $\mathbb P$ is a partial ordering with maximum.

  • $R$ is a ternary relation that is definable in $M$ and absolute (the $\mathbb P$-membership relation, usually denoted by $M\models a\in_p b$).

  • For each formula $\phi$ with $n$ free variables, $\Vdash\phi$ is a definable $(n+1)$-ary predicate in $M$, called the forcing predicate corresponding to $\phi$.

  • $C$ is a predicate (the genericity predicate).

As usual, we use $G$ to denote a filter satisfying the genericity predicate $C$.

Assume that the following axioms hold:

(1) The downward closedness of forcing: Given a formula $\phi$, for all $\overline{a}$, $p$ and $q$, if $M\models (p\Vdash\phi)[\overline{a}]$ and $q\leq p$, then $M\models (q\Vdash\phi)[\overline{a}]$.

(2) The downward closedness of $\mathbb P$-membership: For all $p$, $q$, $a$ and $b$, if $M\models a\in_p b$ and $q\leq p$, then $M\models a\in_q b$.

(3) The well-foundedness axiom: The binary relation $\exists p; M\models a\in_p b$ is well-founded (in $V$) and well-founded in $M$. In particular, it is left-small in $M$, that is, $\left\{a : \exists p; M\models a\in_p b\right\}$ is a set in $M$.

(4) The generic existence axiom: For each $p\in \mathbb P$, there is a generic filter $G$ containing $p$ as an element.

Let $F_G$ denote the transitive collapse of the well-founded relation $\exists p\in G; M\models a\in_p b$.

(5) The canonical naming for individuals axiom: $\forall a\in M;\exists b\in M; \forall G; F_G(b)=a$.

(6) The canonical naming for $G$ axiom: $\exists c\in M;\forall G; F_G(c)= G$.

Let $M[G]$ denote the direct image of $M$ under $F_G$. The next two axioms are the fundamental duality that you have mentioned:

(7) $M[G]\models \phi[F_G(\overline{a})]$ iff $\exists p\in G; M\models (p\Vdash\phi)[\overline{a}]$, for all $\phi$, $\overline{a}$, $G$.

(8) $M\models (p\Vdash\phi)[\overline{a}]$ iff $\forall G\ni p; M[G]\models \phi[F_G(\overline{a})]$, for all $\phi$, $\overline{a}$, $p$.

Finally, the universality of $\mathbb P$-membership axiom.

(9) Given an individual $a$, if $a$ is a downward closed relation between individuals and conditions, then there is a $\mathbb P$-imitation $c$ of $a$, that is, $M\models b\in_p c$ iff $(b,p)\in a$, for all $b$ and $p$.

It follows that $(M, \mathbb P, R, \left\{\Vdash\phi : \phi\in L(\in)\right\}, C, G)$ represents a standard forcing-generic extension: the usual definitions of the forcing predicates can be recovered, the usual definition of genericity can also be recovered ($G$ intersects every dense set in $M$), and $M[G]$ is a model of $ZFC$ determined by $M$ and $G$, and it is the least such model. (Axiom $(9)$ is used only in the proof that $M[G]$ is a model.)
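
As a first, trivial illustration of how the axioms get used: axioms (5) and (6) already give $M\cup\left\{G\right\}\subseteq M[G]$. Indeed, given $a\in M$, axiom (5) provides $b\in M$ with $F_G(b)=a$ for every generic $G$, and since $M[G]$ is by definition the image of $M$ under $F_G$, we get $a\in M[G]$; likewise, axiom (6) provides $c\in M$ with $F_G(c)=G$, so $G\in M[G]$.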

  • 4
    This is excellent! From an expository point of view, I still desire a further "reversal," namely motivating the axioms by the way they are used to prove that $M[G]$ satisfies ZFC, but your work is a big step in the direction I was hoping for. – Timothy Chow Aug 20 '20 at 20:56
  • 2
    I have tried to motivate the axioms in terms of control: the standard forcing-generic extension is a uniform adjunction of G, ground-controlled by forcing. Axioms (1) to (8) are, I believe, not hard to motivate from this point of view. (M[G] must be controlled from the ground in order to guarantee that G does not encode special properties one can only see from the outside and which would prevent M[G] from being a model.) – Rodrigo Freire Aug 20 '20 at 21:11
  • @TimothyChow A more general axiomatic approach to forcing can be found in the paper "An axiomatic approach to forcing in a general setting", BSL, 28, 3, 2022. https://doi.org/10.1017/bsl.2022.15 – Rodrigo Freire Apr 20 '23 at 14:43
33

Great question! Finally someone asks the simplest questions, which almost invariably are the really critical ones (if I cannot explain a great idea to an intelligent person in minutes, it simply means I do not understand it).

In this case, the idea is one of the greatest in modern history.

Let me start with some historical background: in the 90s I talked with Stan Tennenbaum about forcing, hoping to (finally!) understand it (it did not go too far). Here is what he told me (not verbatim): during those times, the late 50s and very early 60s, several folks were trying their hand at proving independence.

What did they know? They certainly knew that they had to add a set G to the minimal model, and then close up with respect to Gödel constructibility operations. So far nothing mysterious: it is a bit like adding a complex number to Q and forming an extension field.

First blocker: if I add a set G, which certainly exists, to construct the function you described above, how do I know that M[G] is still a model of ZF?

In algebraic number theory I do not have this issue: I simply take the new number and throw it into the pot, but here I do. Sets carry along information, and some of this information can be devastating (simple example: suppose that G is going to tell us that the first ordinal outside of M is in fact reachable; that would be very bad news).

All this was known to the smart folks at the time. What they did not know is: very well, I am in a minefield, so how do I select my G so that it does not create trouble and does what it is supposed to do? That is the fundamental question.

They wanted to find G, describe it, and then add it.

Enter Cohen. In a majestic feat of mathematical innovation, Cohen, rather than going into the mine field outside of M searching for the ideal G, enters M. He looks at the world outside, so to speak, from inside (I like to think of him looking at the starry sky, call it V, from his little M).

Rather than finding the mysterious G which floats freely in the hyperspace outside M, he says: OK, suppose I wanted to build G, brick by brick, inside M. After all, I know what it is supposed to do for me, right? Problem is, I cannot, because if I could, it would be constructible in M, and therefore part of M. Back to square one.

BUT: although G is not constructible in M, all its finite portions are, assuming such a G is available in the outer world. It does not exist in M, but the bricks which make it (in your example, all the finite approximations of the function), all of them, are there. Moreover, these finite fragments can be partially ordered, just like little pieces of information: one is sometimes bigger than the other, etc.

Of course this order is not total. So, he says, let us describe that partial order, call it P. P is INSIDE M, all of it. Cohen has the bricks, and he knows which bricks fit with which others, to form some pieces of walls here and there, but not the full house, not G. Why? Because the glue which attaches these pieces all together in a coherent way is not there. M does not know about the glue. Cohen is almost done: he steps out of the model, and bingo! there is plenty of glue.

If I add an ultrafilter, it will assemble consistently all the pieces of information, and I have my model. I do not need to explicitly describe it; it is enough to know that the glue is real (outside). Now we go back to the last insight of Cohen. How does he know that gluing all the pieces along the ultrafilter will not "mess things up"? Because, and the funny thing is that M knows it, all the information coming with G is already reached at some point of the gluing process, so it is available in M.

Finale

What I just said about the set of fragments of information is entirely codable in M. M knows everything, except the glue. It even knows the "forcing relation"; in other words, it knows that IF M[G] exists, then truth in M[G] corresponds to some piece of information from within forcing it.

LAST NOTE: One of my favorite books in science fiction was written by the set theorist turned writer, Dr. Rudy Rucker. The book is called White Light, and is a big celebration of Cantorian set theory written by an insider. It just misses one pearl, the most glorious one: forcing. Who knows, someone here, perhaps you, will write the sequel to White Light and show the splendor of Cohen's idea not only to "ordinary mathematicians" but to everybody...

ADDENDUM: SHELAH'S LOGICAL DREAM (see the comments by Tim Chow)

Tim, you have no idea how many thoughts your fantastic post has generated in my mind in the last 20 hours. Shelah's dream can be made reality, but it ain't easy, though now at least I have some clue as to how to begin.

It is the "virus control method": suppose you take M and throw in some G which is living in the truncated V cone where M lives. Add G. The very moment you add it, you are forced to add all sets which are G-constructibles in alpha steps, where alpha is any ordinal in M. Now, let us say that the most lethal viral attack perpetrated by G is that one of these new sets is exactly alpha_0, the first ordinal not in M, in other words G or its definable sets code a well order of type alpha_0.

If one carries out the analysis I have just sketched, the conjecture would be that a G which does not cause any damage is a set which is as close as possible to being definable in M already, in some sense to be made precise, but that goes along with Cohen's intuition, namely that although G is not M-constructible, all its fragments are.

If this plan can be implemented, it would show that forcing is indeed unique, unless... unless some other crazy idea comes into play.

Mirco A. Mannucci
  • 1
    I've found that model of construction (having little parts, and gluing them together into a whole, according to some instructions) to be a very useful paradigm for solving problems. The additional idea that some infinite processes/concepts keep some of the properties inherent in finite processes/concepts is also quite useful. (This happens, for instance, in products and coproducts in category theory, all the time. The compactness theorem is another instance of this principle in action.) – Pace Nielsen Aug 20 '20 at 23:04
  • @PaceNielsen, thanks for the appreciation! Actually, I have just learned something from you: indeed forcing belongs to the same order of ideas you mention. It is not by chance that Fitting, formalizing its logic, found out that it is "intuitionistic". Writing this answer helped me clarify a lot of things to myself: for instance, following your lead, how do you complete a category? Assume the job has already been done, and now look back at your old structure. The new guy leaves "traces" in it. The idea is that by patching these finite traces you will assemble what you need... – Mirco A. Mannucci Aug 20 '20 at 23:17
  • 1
    This is a nice account of why forcing works, but I guess I'm trying to ask why something more simple-minded doesn't work. Certainly, pre-Cohen folks can't be faulted for failing to find G, describe it, and add it. That was a hard problem. But Cohen found G and described it. With the benefit of hindsight, why can't we simplify the argument? What is it about the ZFC axioms that seemingly compels us to use such elaborate machinery? – Timothy Chow Aug 21 '20 at 00:59
  • I hear you, man. The key issue is of course that whatever G you add to M, it can carry some "hidden information" which is an obstruction to M[G] being a model. Now things become slippery: what kind of information? Unfortunately ZF is very complicated, and it is not trivial to determine a priori which kind of information G can bring in. For instance, you have replacement, so perhaps G alone can seem pretty harmless, but once it is inside it can be used to define a new well-ordering. My lingering feeling is this: unless there is a totally NEW way of building models, – Mirco A. Mannucci Aug 21 '20 at 09:47
  • forcing is, in a sense to be made precise, inescapable here. Anyway: how about launching a plan to investigate the issue of "virus information"? For instance, suppose I have a weaker set theory; how does that make it less likely that an external G would mess things up when added? I suspect that without full-blown replacement and power set there would be less trouble to think about... – Mirco A. Mannucci Aug 21 '20 at 09:52
  • 2
    I am reminded that in Shelah's "Logical Dreams," one of his dreams is to "show that forcing is the unique method in some non-trivial sense." – Timothy Chow Aug 21 '20 at 12:08
  • 1
    @TimothyChow just added an addendum on that one – Mirco A. Mannucci Aug 21 '20 at 12:36
  • 1
    Cool! By the way, let me mention an idea/question of Scott Aaronson's. Instead of the word "generic" let's try using the word "random." We know that some G might create contradictions. But it seems that producing a contradiction is actually a delicate process. It won't happen at random. So just pick a random G, and with probability 1 (or least with some positive probability) everything will be fine. I think this idea works for $\neg$CH. If I pick a random function $F:\aleph_2^M\times \aleph_0\to\lbrace0,1\rbrace$ then the associated filter will be generic. – Timothy Chow Aug 21 '20 at 12:51
  • 2
    Of course I'm still piggy-backing on all the usual machinery to prove that a random function works, but still, it seems suggestive. The mentality has switched from, "We have to be really really careful to avoid a contradiction" to "The burden of proof is on the contradiction to manifest itself." Is this mentality accurate? If so, can it be pushed further? – Timothy Chow Aug 21 '20 at 12:54
  • 2
    @TimothyChow To the point of switching from "generic" to "random", don't they mean essentially the same thing? Cohen started with Cohen forcing over a countable model, where the generic filter corresponds to a real in a particular comeager set. In other words, any real from that comeager set would have worked for his argument, so a "generic" (aka "typical") real works. It was always my impression that this is where the terminology comes from. Random reals work the same way but with measure one sets. – Miha Habič Aug 21 '20 at 16:06
  • 1
    @MihaHabič : The difference may be mainly psychological, but for example, standardly, only G gets the adjective "generic." Stuff that depends on G doesn't. But if X is a random variable, variables that depend on X are also random variables. They inherit the randomness. But I don't want to get into a semantic debate. It's just an idea that switching terminology might introduce a new perspective. E.g., is avoiding contradiction an excruciatingly delicate process, or is it easy (and it's just the consistency proof that's hard)? – Timothy Chow Aug 21 '20 at 16:19
  • 1
    This is really an aside but I have been thinking lately that maybe we can have better foundations for random variables inspired partly by this forcing stuff (and algebraic geometry). In both places, we want to use generic as saying "almost all" and that is also what the phrase "with probability one" means. However, in AG/Forcing, it seems that having a generic variable is stronger than just thinking of "with probability one" - you can do operations on the generic object directly. Can something similar work in, for instance random graph theory where a random graph would be an actual graph!? – Asvin Aug 22 '20 at 07:44
  • 1
    @Asvin : Do you know about the Rado graph and about graphons? As an aside, in analysis there is a distinction between measure and category ("category" here means Baire category and not morphisms/functors). Generic filters are closely related to category whereas probability theory is related to measure. So one has to be a bit careful about being too literal in one's identification between "almost all" and "generic." – Timothy Chow Aug 22 '20 at 16:08
  • Yes, but I only learnt about graphons very recently. Your point about category vs measure is very relevant, and the Rado graph is "easier" because it's a 0-1 thing. As for graphons, they are indeed very nice, but I would need to think a little more about them before I can say more! – Asvin Aug 22 '20 at 16:13
  • @TimothyChow the last two points you touched upon are, in my modest opinion, both great, but will require some serious thinking to be put to use. Let us start from the first one, namely "generic" as " likely to be a non-virus" ie something that can be safely added to M without causing problems. I may be wrong, but to me it feels as if the situation is just the opposite: if I choose a function outside M which does the job, chances are it WILL mess things up. So, Cohen's methods seems to be: I wanna be as conservative as possible in choosing G, so that the chances of creating trouble are zero. – Mirco A. Mannucci Aug 22 '20 at 16:15
  • on the other hand, let us say you have a P inside M, or equivalently a Boolean algebra B. ANY ultrafilter will squeeze the Boolean model to a "real" model, so in this respect it looks as if there are many possibilities. It would be interesting to ask oneself: suppose I choose TWO ultrafilters G1 and G2, what is the relation between M[G1] and M[G2]? – Mirco A. Mannucci Aug 22 '20 at 16:18
  • I have just eaten my Italian risotto, so I am not especially lucid right now, but my sense is that M[G1] and M[G2] are different models, and yet they agree as far as the job they need to accomplish. Perhaps this is a key: G would then be generic in the sense that it does not matter at all which one I choose... – Mirco A. Mannucci Aug 22 '20 at 16:21
  • Correction: replace "function" with "set" in " if I choose a function outside M which does the job" . – Mirco A. Mannucci Aug 22 '20 at 16:51
  • @Miha: The terminology of "generic" is a bit of an odd duck. Cohen did use it in the sense of "typical" or "random", but once the topological and Boolean-valued approaches were starting to clarify, the term "generic" was taken from the more standard sense of "meeting dense sets". And of course, that fits Cohen's use, but only because the term "generic" in topology/algebraic geometry came from that same sense of the word. I agree that changing "generic" to "random" seems like a confusing reason why forcing would suddenly make sense. – Asaf Karagila Aug 23 '20 at 10:34
  • 1
    @AsafKaragila : I'm still thinking it might be a useful crutch for the beginner. For $\neg$CH, it's easy to motivate adding a function from $\aleph_2^M \times \aleph_0$ to $\{0,1\}$ and it is easy to say what it means to add a random function. No need to define "filter" or "dense" or "generic". Forcing then means "almost surely implies." So this allows one to skip some annoying definitions and get to the main point more quickly. The downside is then you may have to "unlearn" the randomness terminology later. – Timothy Chow Aug 24 '20 at 00:35
  • @Timothy: Easy, then, use random reals. – Asaf Karagila Aug 24 '20 at 00:46
20

This answer is quite similar to Rodrigo's but maybe slightly closer to what you want.

Suppose $M$ is a countable transitive model of ZFC and $P\in M$. We want to find a process for adding a subset $G$ of $P$ to $M$, and in the end we want this process to yield a transitive model $M[G]$ with $M\cup \{G\}\subseteq M[G]$ and $\text{Ord}\cap M = \text{Ord}\cap M[G]$.

Obviously not just any set $G$ can be adjoined to $M$ while preserving ZFC, so our process will only apply to certain "good" sets $G$. We have to figure out what these good sets are.

Let's assume we have a collection $M^P$ of terms for elements of $M[G]$. So for each good $G$, we will have a surjection $i_G : M^P\to M[G]$, interpreting the terms. We will also demand that the definability and truth lemmas hold for the good $G$s. Let's explain our hypotheses on good sets more precisely.

If $\sigma\in M^P$ and $a\in M$, write $p\Vdash \varphi(\sigma,a,\dot G)$ to mean that for all good $G$ with $p\in G$, $M[G]$ satisfies $\varphi(i_G(\sigma),a,G)$.

Definability Hypothesis: for any formula $\varphi$, the class $\{(p,\sigma,a)\in P\times M^P \times M: p\Vdash \varphi(\sigma,a,\dot G)\}$ is definable over $M$.

Truth Hypothesis: for any formula $\varphi$, any good $G$, any $\sigma\in M^P$, and any $a\in M$, if $M[G]\vDash \varphi(i_G(\sigma),a,G)$, then there is some $p\in G$ such that $p\Vdash \varphi(\sigma,a,\dot G)$.

Interpretation Hypothesis: for any set $S\in M$, the set $\{i_G(\sigma) : p\in G\text{ and }(p,\sigma)\in S\}$ belongs to $M[G]$. (This must be true if $M[G]$ is to model ZF assuming $i_G$ is definable over $M[G]$.)

Existence Hypothesis: for any $p\in P$, there is a good $G$ with $p\in G$.

One can use the first three hypotheses to show that $M[G]$ is a model of ZFC.
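
For a flavor of how such an argument goes, here is the Pairing instance (a sketch, under the natural additional assumptions that the terms in $M^P$ are elements of $M$ and that good sets $G$ are nonempty): given $x,y\in M[G]$, pick $\sigma,\tau\in M^P$ with $i_G(\sigma)=x$ and $i_G(\tau)=y$, and let $S = P\times\{\sigma,\tau\}$, a set in $M$. Then $\{i_G(\rho) : p\in G\text{ and }(p,\rho)\in S\} = \{x,y\}$, which belongs to $M[G]$ by the Interpretation Hypothesis.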

Now preorder $P$ by setting $p\leq q$ if $p\Vdash q\in \dot G$. Let $\mathbb P = (P,\leq)$. Fix a good $G$. We claim $G$ is an $M$-generic filter on $\mathbb P$. Let's just check genericity. Let $D\in M$ be a dense subset of $\mathbb P$, and suppose towards a contradiction that $D\cap G = \emptyset$. By the truth hypothesis, there is some $p\in G$ such that $p\Vdash D\cap \dot G = \emptyset$. By density, take $q\leq p$ with $q\in D$. By the existence hypothesis, take a good $H$ with $q\in H$. We have $q\Vdash p\in \dot G$, so $p\in H$. But $p\Vdash D\cap \dot G = \emptyset$, so $D\cap H = \emptyset$. This contradicts that $q\in H$.

  • Yes, thank you, this is in the direction I want! – Timothy Chow Aug 21 '20 at 13:14
  • 1
    As someone who has never found forcing intuitive...I lose you at the word “write”. Why do we say “write $p$...” when the definition does not involve $p$? How are we supposed to pronounce this $\dot G$? Why did we switch from figuring out which sets are good to figuring out which $\sigma$ and $a$ satisfy a criterion for all good sets? –  Aug 22 '20 at 07:56
  • @MattF. : I think I can answer the first question, which is that the intent was to say "for all good $G$ containing $p$." – Timothy Chow Aug 22 '20 at 16:15
  • @MattF Thanks for pointing out that typo. The question asks for a way of recovering the forcing machinery from the proof that forcing preserves ZFC. That's all I tried to accomplish in my answer. $\Vdash$ is motivated by a natural question: what can we conclude about a good extension $M[G]$ given that $p\in G$? We allow terms for elements of $M[G]$, elements of $M$, and a constant for $G$. This notation is used to express hypotheses on good $G$. The first three hypotheses are exactly what's required for the short proof that $M[G]$ satisfies ZFC. And $\dot{G}$ is pronounced "gee dot." – Gabe Goldberg Aug 23 '20 at 00:28

I think there are a few things to unpack here.

1. What is the level of commitment from the reader?

Are we talking about a casual reader, say someone in number theory, who is just curious about forcing? Or are we talking about someone who is learning about forcing as a black box to use in some other mathematical arguments? Or are we talking about a fledgling set theorist who is learning about forcing so they can use it later?

The level of commitment from the reader dictates the clarity of the analogy, and the complexity of the details.

  • To someone just wanting to learn about forcing, understanding what "a model of set theory" is and what basic ideas genericity represents, along with the fact that the generic extension has some sort of a blueprint internal to the ground model, is probably enough.

  • To someone who needs to use forcing as a black box, understanding the forcing relation is probably slightly more important, but the specific construction of $\Bbb P$-names is perhaps not as important.

  • Finally, to a set theorist, understanding the ideas behind $\Bbb P$-names is perhaps the biggest step in understanding forcing. From their conception, to their interactions with the ground model, and their interpretation.

These different levels would necessitate different analogies, or perhaps omitting the analogies completely in favour of examples.

2. Some recent personal experience

Just before lockdown hit the UK, I had to give a short talk about my recent work to a general audience of mathematicians, and I had to make the first part accessible to bachelor students. If you're studying some easily accessible problems, that's great. If your recent work was developing iterations of symmetric extensions and using them to obtain global failures of the axiom of choice from known local failures? Not as easy.

I realised, while preparing for this, that there is an algebraic analogy to forcing. No, not the terrible "$\sqrt2$ is like a generic filter". Instead, if we consider subfields between $\Bbb Q$ and $\Bbb R$, then to understand $\Bbb Q(\pi)$ we need to evaluate the rational functions in $\Bbb Q(x)$ at $\pi$ inside the real numbers.

When developing this analogy I was trying it out on some of the postdocs from representation theory, and two things became apparent:

  1. People in algebra very much resisted the idea that $\Bbb Q(\pi)$ is a subfield of $\Bbb R$. To them it was an abstract field, and it was in fact $\Bbb Q(x)$. It took some tweaking of the exposition to make sure that everyone was on board.

  2. The words "model of set theory" can kill the entire exposition, unless we explain what that means immediately after, or immediately before. The biggest problem with explaining forcing to non-experts is that people see set theory as "the mathematical universe", and when you're forcing you suddenly bring new objects into the universe somehow. And even with people who say that they don't think that way, it is sometimes apparent from their questions that they kind of do.

There are still problems with the analogy, of course. It is only an analogy, after all. For one, the theory of ordered fields is not a particularly strong theory, foundationally speaking, and so it cannot internalise everything (like the polynomials and their fraction field) inside the field itself; this is a sharp contrast with set theory. So what is a model of set theory? It's a set equipped with a binary relation which satisfies some axioms, just like a model of group theory is a set equipped with a binary operation which satisfies some axioms.

But now we can use the idea that every real number in $\Bbb Q(\pi)$ has a "name": some rational function evaluated at $\pi$. It helps you understand why $\Bbb Q(e)$ and $\Bbb Q(\pi)$ are both possible generic extensions, even though they are very different (one contains $\pi$ and the other does not), and it helps you understand why $\Bbb Q(\pi)$ and $\Bbb Q(\pi+1)$ are the same field, even though we used a different generic filter, because there is an automorphism moving one generic to the other.

Here is where we can switch to talking about genericity, give the example of the binary tree, what it means for a branch to be generic over a model, and how density plays a role.

So in this case, we did not go into the specifics. We only talked about the fact that there is a blueprint of the extension, which behaves a bit like $\Bbb Q(x)$, but because set theory is a more complicated theory, this blueprint is found inside the model, rather than being an object derived from the model (the way $\Bbb Q(x)$ is derived from $\Bbb Q$).

3. What to do better?

Well, the above analogy was developed over a short period of time, and I will probably continue developing it over the next few years, every time I explain to someone what forcing is.

Where can we do better? Well, you want to talk about the forcing relation. But that's a tricky bit. My advisor, who is by all accounts a great expositor, had a story about telling some very good mathematician about forcing. Once he uttered "a formula in the language of forcing", the other party seemingly drifted off.

And to be absolutely fair, I too drift off when people talk to me about formulas in the language of forcing. I know what it means, and I understand its importance, but just the phrase is as off-putting to the mind as "salted apple cores for dinner".

I am certain that for the casual reader, this is unnecessary. We don't need to talk about the language of forcing. We simply need to explain that in a model some things are true and others are false. The blueprint that we have can determine some of that, but the elements of the binary tree, or, as they are called, the conditions of the forcing, can tell us more: they give us more information on how the names inside the blueprint behave. Couple this with the opposite direction, that everything that happens in the generic extension happens for a reason, and you've got yourself the fundamental theorem of forcing. Without once mentioning formulas and the language of forcing, or even the forcing relation, in technical terms.

Yes, this is still lacking, and yes this is really just aimed at the casual reader. But it's a first step. It's a way to bring people into the fold, one step at a time. First you have an idea, then you start shaping it, and then you sand off the rough edges, oil, colour, and lacquer, and you've got yourself a cake.

Asaf Karagila
  • These are all very useful ideas and suggestions for explaining forcing, and I will add them to my arsenal of tricks. But as for what audience I have in mind for this specific question, I've had a number of people read my article and "get stuck" at roughly the same point, which to be honest is roughly the same point where I myself get stuck. Namely, why is all this machinery being dragged in? There is of course the a posteriori justification, "because it works." But is there an a priori justification for why it's needed? – Timothy Chow Aug 21 '20 at 12:24
  • I always had similar questions about localisation of rings, to be honest. Why do you need to drag me through these definitions? Just give me the $p$-adic numbers and get it over with. – Asaf Karagila Aug 21 '20 at 12:25
  • Maybe another way to think about it is that the audience is a theory builder who insists on asking annoyingly basic questions and isn't content with just using forcing as a black box. There may not be too many theory builders out there, but my feeling is that if the theory builder's questions can be answered then it could unlock benefits for other people too. – Timothy Chow Aug 21 '20 at 12:27
  • If you want to appeal to theory builders you need to engage with other constructions that they might know, where truth is controlled "from below". Things like limits, or anything continuous really. We understand the truth of the limit as a limit of the truth of the sequence/diagram that led to it (in the infinite case, that is). Here it's the same. We want to understand the generic extension, so we need a blueprint, this is where the rational functions help. This motivates the need for names. Why are they complicated? Well, models of ZF are complicated. If you're a theory builder, you'll see. – Asaf Karagila Aug 21 '20 at 12:53
  • "I was trying it out on some of the postdocs from representation theory". I remember you trying it out on some other unsuspecting victims too... – Carl-Fredrik Nyberg Brodda Aug 21 '20 at 18:46
  • I should add that the idea that $\mathbb{Q}(\pi)$ is a subfield of $\mathbb{R}$ seemed very natural to me from the point of view of combinatorial (semi)group theory, where it's often very useful to make analogous identifications. – Carl-Fredrik Nyberg Brodda Aug 21 '20 at 18:53
  • In number theory at least, people usually think about and talk about "embeddings" of $F$ into $K$ more than "subfields" $F\subseteq K$. That is, there is this abstract field $\mathbb{Q}(x)$ and it can be embedded in $\mathbb R$ in all kinds of ways, with $\mathbb{Q}(\pi)$ being one such embedding. – Timothy Chow Aug 21 '20 at 19:27
  • @TimothyChow: Once you replace "field" by "ordered field", the abstract field is now the blueprint, the $\Bbb P$-names, and by choosing which rationals are smaller than $x$ you get a semblance of genericity. (And right here you can see how talking about this analogy made it better.) – Asaf Karagila Aug 21 '20 at 20:22

Here is currently my favorite way to motivate forcing, and I conjecture that it works for most "real" mathematicians (non-logicians). A proof/disproof is left to the reader.

The forcing relation is indeed daunting at first sight; I was never able to remember the definitions until learning the Boolean-valued model approach. Meanwhile, I'm a material set theorist by training and don't want to go so far as advertising topos-theoretic forcing (at least before I understand how iterated forcing works in that setting), so let me advertise Boolean-valued models. Another reason I like the Boolean approach is the nice analogy with probability theory.

First, one doesn't need to know every axiom of $\mathsf{ZFC}$ in order to understand forcing, but several concepts are especially helpful to be aware of: models, independence and absoluteness. Of course, models are everywhere: groups are models of the group axioms, both $\mathbb{R}^2$ and the Poincaré disk are models of Hilbert's plane geometry axioms, etc. A model of set theory isn't any different: although we don't usually think of it this way, a model of $\mathsf{ZFC}$ is just a well-founded extensional directed graph (of course I am ignoring the subtleties around set models versus class models) that satisfies some extra axioms. Independence is also everywhere: a group may or may not be abelian, a model of plane geometry may or may not satisfy the Parallel Postulate, and it's easy to show the independence of the power set axiom, the replacement axiom, etc., from the rest of $\mathsf{ZFC}$. Absoluteness, and its failure, also has plenty of examples: if $A$ is an abelian group, $a\in A$, $A$ satisfies the statement $\exists x\ x+x=a$ ($a$ can be divided by two in $A$), and $B$ is a subgroup, then $\exists x\ x+x=a$ isn't necessarily true in $B$. Nevertheless, bounded formulas and more generally $\Delta_1$ formulas are absolute between (transitive) models of set theory. This is because $\Delta_1$ formulas represent recursive constructions, and recursive constructions are intuitively absolute: think about $\Delta_1$ formulas in Peano arithmetic. In particular the constructible universe $L$ is absolute.

Now we can talk about forcing. Say we want to create a model of $\mathsf{ZFC}+\lnot\mathsf{CH}$. The method of inner models (like $L$) cannot possibly work, as observed by Shepherdson and Cohen, due to the existence of a minimal model. So let's try the other way round: start with a model $M$ and expand it instead of shrinking it. For simple reasons we should not choose $M$ to be the whole universe $V$ or some level $V_\kappa$ of the von Neumann hierarchy, so maybe let's choose $M$ to be as small as possible, say countable, so that there are many things outside of $M$ that we can potentially throw into $M$.

Let $G\subseteq\omega$ be a set of natural numbers that is not in $M$; we want to adjoin it to $M$ and create a larger model $M[G]$ with the same ordinals. If we can add one then presumably we can add many, and even adding one already shows the independence of $V=L$ (by absoluteness of $L$ and the fact that $M[G]$ has the same ordinals), which is not bad. There are simple examples showing that not every $G$ would work, but let's not worry about that yet, and think about what $M[G]$ should be. It is certainly not $M\cup\{G\}$, since the latter doesn't satisfy any interesting set theory. At the very least, $M[G]$ should contain all the sets "generated by $G$ over $M$ using simple operations", such as $\omega\setminus G$, $G\times G$, $\{n\in\omega: \text{the $n$-th prime is in }G\}$, etc. Note that:

$\omega\setminus G=\{n\in\omega:n\notin G\}$

$G\times G=\{(m,n)\in\omega\times\omega:m\in G\land n\in G\}$

$\{n\in\omega: \text{the $n$-th prime is in }G\}=\{n\in\omega:p_n\in G\}$, where $p_n$ denotes the $n$-th prime.

All these sets have the form $u=\{x\in X:b_x\}$; we can view such a set as a function $u$ that sends $x\in X$ to $b_x$, where $X$ is a set in $M$ and $b_x$ is a Boolean combination of statements of the form $n\in G$. I want to further rewrite these sets as follows. Let $\mathcal{G}$ be a fixed symbol. Consider the set $B$ of all Boolean combinations of the expressions $n\in\mathcal{G}$, such as $(0\in\mathcal{G})\land(1\in\mathcal{G}\lor3\notin\mathcal{G})$. This is the free Boolean algebra with countably many generators $b_n$, where $b_n$ stands for $n\in\mathcal{G}$. A Boolean algebra is a structure $(B,\lor,\land,*,0,1)$ that behaves similarly to union, intersection and complementation of sets; in our example, $b^*$ is the negation of $b$, e.g., $(0\in\mathcal{G})^*=0\notin\mathcal{G}$ and $[(0\in\mathcal{G})\land(1\in\mathcal{G}\lor3\notin\mathcal{G})]^*=[(0\notin\mathcal{G})\lor(1\notin\mathcal{G}\land3\in\mathcal{G})]$.

It should be emphasized that $\mathcal{G}$ is just a symbol intended to make $B$ more suggestive, in contrast to $G$, which is an actual subset of $\omega$. In particular $B\in M$. Any free Boolean algebra on countably many generators might be used as $B$ (as long as it is in $M$).

For a real number $G\subseteq\omega$, say that it satisfies $b\in B$ if $b$ becomes true when we plug $G$ in for $\mathcal{G}$; for example, $G$ satisfies the statement $(0\in\mathcal{G})\land(1\in\mathcal{G}\lor3\notin\mathcal{G})$ iff $0\in G$ and at least one of $1\in G$ and $3\notin G$ holds. For a function $u:X\rightarrow B,\ x\mapsto b_x$, define the interpretation of $u$ under $G$ by $u_G=\{x:G\text{ satisfies } b_x\}$. Now observe that:

$\omega\setminus G=\{(n,b_n^*):n\in\omega\}_G$;

$G\times G=\{((m,n),b_m\land b_n):m,n\in\omega\}_G$;

$\{n\in\omega: \text{the $n$-th prime is in }G\}=\{(n,b_{p_n}):n\in\omega\}_G$.

So they are all of form $u_G$, where $u:X\rightarrow B$ is a function, and most importantly $X$ and $u$ are in $M$. This suggests that a set in $M[G]$, roughly speaking, consists of an $M$-part and a $G$-part. The $M$-part is a function $u:X\rightarrow B,x\mapsto b_x$; we can think of $u$ as a "random subset" of $X$, and $b_x$ as the "probability" of $x\in u$. And once we choose a point $G$ from the "sample space", the random set $u$ is determined to be $u_G$.

These random sets are more commonly called names: imagine that people living in $M$ cannot see $G$ or other sets in the extension $M[G]$, but nevertheless can name them and even reason about them. In particular there is a name for $G$, namely $\dot{G}=\{(n,b_n):n\in\omega\}$; it has the property that $\dot{G}_G=G$ regardless of $G$. If we let $\dot{G}'=\{(n,b_n^*):n\in\omega\}$, then people living in $M$ can see that "$\dot{G}'$ is the complement of $\dot{G}$", although they don't know any particular element of $G$.
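(Here is a minimal Python sketch of this picture, not part of the original answer: it models Boolean values as predicates on subsets of a finite stand-in for $\omega$, names as dictionaries from elements to Boolean values, and interpretation as the map $u\mapsto u_G$. Everything in it, including the cutoff at $10$, is just an illustrative choice.)

```python
# Toy model of names over a finite fragment of omega.
# (Illustration only; in the actual construction B is a free, and later
# complete, Boolean algebra living inside M.)

OMEGA = range(10)          # finite stand-in for omega, so the code runs

def b(n):                  # the generator b_n, i.e. the statement "n is in G"
    return lambda G: n in G

def NOT(x):    return lambda G: not x(G)
def AND(x, y): return lambda G: x(G) and y(G)
def OR(x, y):  return lambda G: x(G) or y(G)

def interpret(name, G):
    """u_G = {x : G satisfies b_x}, for a name u = {x: b_x}."""
    return {x for x, bx in name.items() if bx(G)}

G_dot       = {n: b(n)      for n in OMEGA}   # the canonical name for G
G_dot_prime = {n: NOT(b(n)) for n in OMEGA}   # a name for the complement of G

G = {0, 2, 3, 7}                              # a "real" subset of our toy omega
assert interpret(G_dot, G) == G               # G-dot interprets to G itself
assert interpret(G_dot_prime, G) == set(OMEGA) - G

# "(0 in G) and (1 in G or 3 not in G)" from the text:
example = AND(b(0), OR(b(1), NOT(b(3))))
assert example(G) is False                    # since 1 is not in G and 3 is in G
```

The assertions just confirm that $\dot{G}_G=G$ and that $\dot{G}'$ interprets to the complement of $G$, matching the discussion above.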

Now comes one of the key ideas of forcing: pretend that we are the people living in $M$, who don't know what $G$ is, but have names for $G$ and all the sets it generates. Although we don't know whether $3\in\dot{G}$ or not, we know its probability: it is $b_3$, i.e. the element $3\in\mathcal{G}$ of $B$. More generally, it turns out we can calculate the probability of any statement $\varphi$ about $M^B$, the collection of random sets in $M$; the probability will be a Boolean value, that is, an element $||\varphi||\in B$. And the miracle is that every $\mathsf{ZFC}$ axiom holds with probability $1$ in this "probabilistic model" $M^B$. Incidentally, in this approach it is not important anymore that we start with a countable $M$: we could have started with the whole universe $V$ and formed the probabilistic model $V^B$.

Unfortunately, our previous definition of a random set, namely a function from some set $X$ to $B$, is problematic in two ways. First, there are sets such as $\{n\in\omega:G\text{ contains some element divisible by }n\}$ that definitely should be in $M[G]$; what's the corresponding random set $u$? We would like to define $u(n)$ to be the "sum" $\displaystyle\sum_{n\mid m}b_m$, but by definition $B$ only contains finite combinations of the $b_n$s. If we replace $B$ by its Boolean completion, then there is a natural definition of the sum of an arbitrary set $A\subseteq B$: it's just the supremum $\bigvee A$. So let's assume $B$ is complete from now on.

Another issue is that sets belong to other sets, so we should also allow random sets to belong to other random sets, in a random way. This naturally leads to the probabilistic von Neumann hierarchy: $V^B_0=\emptyset$, $V^B_{\alpha+1}$ is the set of functions from $V^B_{\alpha}$ to $B$ (it's actually a bit more convenient to take partial functions), and at limit ordinals take union. In one sentence, a random set is a random set of random sets. Every set $x\in V$ has a canonical name $\check{x}$ in $V^B$, defined recursively by letting $\check{y}$ belong to $\check{x}$ with probability $1$ for all $y\in x$.
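(In symbols, using the partial-function convention just mentioned:
$$V^B_0=\emptyset,\qquad V^B_{\alpha+1}=\{u : u\text{ is a partial function from }V^B_{\alpha}\text{ to }B\},\qquad V^B_{\lambda}=\bigcup_{\alpha<\lambda}V^B_{\alpha}\ \text{ for limit }\lambda,$$
and the canonical name of $x\in V$ is $\check{x}=\{(\check{y},1):y\in x\}$.)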

The technical part is to actually define the probability, or Boolean value, of an arbitrary statement $\varphi$ in $V^B$; this is the counterpart of the recursive definition of the forcing relation in the poset approach. The gist is really the case of the atomic formulas $||u\in v||$ and $||u=v||$; the propositional connectives and quantifiers are easily handled, thanks to the completeness of $B$. I mentioned above that $u(x)$ can be viewed as the probability of $x\in u$; in fact this is true only for random sets that are simple enough, and in general we should interpret $u(x)=b$ as "$x\in u$ with probability at least $b$". For example, since $V^B$ is supposed to be an extension of $V$, it is reasonable to expect that for any canonical names $\check{x}$ and $\check{y}$, $||\check{x}=\check{y}||$ is $1$ if $x=y$ and $0$ if $x\neq y$, and similarly for $||\check{x}\in\check{y}||$. If $u$ is a random set whose domain $\mathrm{dom}(u)$ is a set of canonical names, it is natural to let $||\check{x}\in u||$ be $u(\check{x})$ if $\check{x}\in\mathrm{dom}(u)$, and $0$ otherwise. Now suppose $x,y,z$ are distinct sets in $V$ and let

$u=\{(\check{x},a),(\check{y},b)\}$,

$v=\{(\check{y},c),(\check{z},d)\}$,

what should $||u=v||$ be? Intuitively, $u=\{y\}$ with probability $a^*\land b$ and $v=\{y\}$ with probability $c\land d^*$; the only other way they could be equal is for both to be empty, which has probability $a^*\land b^*\land c^*\land d^*$. So the probability of $u=v$ is $(a^*\land b\land c\land d^*)\lor(a^*\land b^*\land c^*\land d^*)$. Next consider

$w=\{(u,p),(v,q)\}$.

What is $||u\in w||$? It is certainly at least $p$, but also $u=v$ with probability $||u=v||$ as computed above, and $v\in w$ with probability at least $q$, and in order for $u=v\land v\in w\rightarrow u\in w$ to hold in our Boolean-valued model, $u\in w$ should have probability at least $||u=v||\land q$. Altogether, $||u\in w||$ is at least $p\lor(||u=v||\land q)$; there's no obvious reason it should be any bigger, so we make this the definition of $||u\in w||$. Formally, we define $||u=v||$ and $||u\in v||$ simultaneously by transfinite induction, following this line of thought.
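(For reference, the standard simultaneous recursion this line of thought leads to, as in the book with Scott's preface mentioned below, is
$$||u\in v||=\bigvee_{w\in\mathrm{dom}(v)}\big(v(w)\land||u=w||\big),\qquad ||u=v||=\bigwedge_{w\in\mathrm{dom}(u)}\big(u(w)\Rightarrow||w\in v||\big)\ \land\ \bigwedge_{w\in\mathrm{dom}(v)}\big(v(w)\Rightarrow||w\in u||\big),$$
where $a\Rightarrow b$ abbreviates $a^*\lor b$; plugging the $u$, $v$, $w$ above into these clauses recovers the values just computed.)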

Once we show that this definition indeed gives us a Boolean-valued model, namely that it has properties such as $||u=v||\land||v=w||\leq||u=w||$, it's not difficult to verify that the $\mathsf{ZFC}$ axioms all hold with probability $1$. For example, to construct the subset of $u$ consisting of elements with property $\varphi$, simply "reweight" the elements of $u$ according to their probability of satisfying $\varphi$ (see the sketch below). There is no need to pass to a countable model: one can directly argue with the Boolean-valued model $V^B$ to do independence proofs, using the Boolean version of the soundness theorem. If you insist, though, it is not difficult at all to relativize all of the above to some countable model $M$, choose a generic ultrafilter $G$ of the Boolean algebra $B$, and get a good old generic extension $M[G]$. The proofs of the truth and definability lemmas are also cleaner and somewhat motivated: in order that $||\varphi||\in G$ iff $M[G]\models\varphi$, one naturally wants the filter $G$ to be generic for the arguments to go through.
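(To spell out the "reweighting" just mentioned, as a sketch and suppressing parameters: given a name $u$ and a formula $\varphi$, the name
$$w=\{(x,\,u(x)\land||\varphi(x)||):x\in\mathrm{dom}(u)\}$$
satisfies $||\forall x\,(x\in w\leftrightarrow x\in u\land\varphi(x))||=1$, which is Separation holding with probability $1$.)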

No intent to trivialize Cohen's accomplishments, but (something like) Boolean-valued models were considered long before him, although nobody came up with the idea of using them to prove independence results; see Dana Scott's preface to the book Boolean-Valued Models and Independence Proofs. Another interesting sentence from The Origins of Forcing, an interview of Cohen by Gregory H. Moore:

Cohen recalls that in the eyes of various logicians his forcing results went from being incorrect, to being extremely difficult to understand, to being easy, and finally to being already present in the literature.

I believe this shows that, to some extent, Boolean-valued models do make forcing easier to understand.

Edit: I just realized that the notes you wrote follow exactly the Boolean-valued model approach, so this answer is sort of redundant... Still, I wish to share the journey of how I eventually came to view forcing as an intuitive thing.

There does exist something similar to an "axiomatization" in the Boolean approach, namely: the standard definitions of $||u\in v||$ and $||u=v||$ are the only way to make $V^B$ a Boolean-valued model, subject to the following requirements: (i) $u(v)\leq||v\in u||$; (ii) the Boolean value of a bounded quantification depends only on the domain of the name quantified over.

new account
  • "No intent to trivialize Cohen's accomplishments, but (something like) Boolean-valued models were long considered before him" But Cohen, if memory serves, didn't talk at all about BVMs; those were introduced into forcing by Scott and Solovay, if I have my history right. – Noah Schweber Jul 07 '23 at 18:29
  • @NoahSchweber I meant due to the similarities between forcing and pre-Cohen work on BVMs, some people viewed Cohen's work as essentially present in the literature, so presumably BVMs did make it seem simpler – new account Jul 07 '23 at 18:34