The sum of two nilpotent elements of a commutative ring is nilpotent. This can be checked by a direct calculation using the binomial theorem. In fact, this calculation shows the stronger statement $x^n=y^m=0 \Rightarrow (x+y)^{n+m-1}=0$.
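
For reference, the calculation can be written out as follows (the summation index $k$ is introduced only for this display). In a commutative ring the binomial theorem gives
$$(x+y)^{n+m-1}=\sum_{k=0}^{n+m-1}\binom{n+m-1}{k}\,x^{k}\,y^{\,n+m-1-k}.$$
If $k \geq n$, then $x^k=0$. If $k \leq n-1$, then $n+m-1-k \geq m$, so $y^{\,n+m-1-k}=0$. Hence every summand vanishes.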

But we can also give a more sophisticated proof: If $x,y$ are nilpotent, they are contained in every prime ideal. Hence, the same is true for $x+y$. Hence, $x+y$ is nilpotent: otherwise, the localization at $x+y$ would be non-zero and therefore have a prime ideal, but this corresponds to a prime ideal in the given ring not containing $x+y$. (In short: The set of nilpotent elements is the intersection of all prime ideals, hence closed under addition.)

The general existence of prime ideals is equivalent to the Boolean Prime Ideal Theorem and therefore the proof above is not constructive. The proof shows nothing about the nilpotence exponent of the sum. On the other hand, it is quite elegant and it is really a no-brainer if you are used to commutative algebra. Moreover, it can be made "more constructive" (not really constructive, as Matt F. points out), or at least provable in $\mathsf{ZF}$, as follows:

We restrict our attention to the subring generated by $x,y$. This ring is countable. The same is true for the localization at $x+y$. There is a constructive proof that every non-trivial countable commutative ring has a maximal ideal, hence has a prime ideal. And now we may proceed as before.
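
The idea of enumerating the ring and adding elements one at a time can be illustrated in the finite ring $\mathbb{Z}/n$, where membership in a finitely generated ideal is decidable. The following Python sketch is only an illustration under that assumption; the function name `maximal_ideal_generator` is made up for this example. In a general countable ring, the test "is the enlarged ideal still proper?" is exactly the step that relies on excluded middle.

```python
from math import gcd

def maximal_ideal_generator(n):
    """Return g such that (g) is a maximal ideal of Z/n, for n >= 2."""
    if n < 2:
        raise ValueError("Z/n must be a non-trivial ring, so n >= 2")
    g = n  # (n) is the zero ideal of Z/n; g always divides n
    for a in range(n):          # enumerate the elements of Z/n
        candidate = gcd(g, a)   # generator of the ideal (g, a) in Z/n
        if candidate != 1:      # (g, a) is still proper: keep a
            g = candidate
    return g

print(maximal_ideal_generator(12))  # prints 2
```

Every element that was skipped generates the unit ideal together with the final ideal, so the result is maximal.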

So we have two constructive proofs: (a) the direct calculation using the binomial theorem, (b) the proof using prime ideals. The question is: Assuming that we know proof (b), is there a general method to produce proof (a) from it, perhaps even including the stronger statement about the nilpotence exponent? This is just a toy example for the general question of how to get rid of prime ideals in proofs in commutative algebra where we would expect to have, or already know, more direct proofs. I have picked this toy example only because I hope that the general method can be easily explained with it.
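
As a sanity check on the exponent, one can compute in the universal ring $\mathbb{Z}[x,y]/(x^n,y^m)$. The Python sketch below (the helper names `multiply` and `nilpotence_exponent` and the dictionary representation are ad hoc choices for this illustration) finds the smallest $k$ with $(x+y)^k=0$ there. It only confirms that the bound $n+m-1$ from proof (a) is sharp; it says nothing about how to extract that bound from proof (b).

```python
def multiply(p, q, n, m):
    """Multiply two elements of Z[x,y]/(x^n, y^m), given as {(i, j): coeff}."""
    result = {}
    for (i1, j1), c1 in p.items():
        for (i2, j2), c2 in q.items():
            i, j = i1 + i2, j1 + j2
            if i < n and j < m:  # monomials divisible by x^n or y^m are zero
                result[(i, j)] = result.get((i, j), 0) + c1 * c2
    return {mono: c for mono, c in result.items() if c != 0}

def nilpotence_exponent(n, m):
    """Smallest k with (x + y)^k = 0 in Z[x,y]/(x^n, y^m), for n, m >= 1."""
    x_plus_y = {(1, 0): 1, (0, 1): 1}
    power = {(0, 0): 1}  # (x + y)^0 = 1
    k = 0
    while power:  # an empty dictionary represents 0
        power = multiply(power, x_plus_y, n, m)
        k += 1
    return k

for n in range(1, 5):
    for m in range(1, 5):
        assert nilpotence_exponent(n, m) == n + m - 1
```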

Another toy example: How can one produce the direct proof of $I+J=A \Rightarrow I^n+J^n=A$ for ideals $I,J \subseteq A$ from the proof using prime ideals? A more sophisticated example can be found here, where I have no idea what a direct calculation looks like (perhaps I will ask this in a separate question).
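
For comparison, the direct proof of this second toy example is again a binomial expansion: if $a+b=1$ with $a \in I$, $b \in J$, then every term of $(a+b)^{2n-1}$ contains $a^n$ or $b^n$. A small Python sketch for the special case $A=\mathbb{Z}$, $I=(a)$, $J=(b)$ turns a Bézout relation $ua+vb=1$ into an explicit relation $s\,a^n+t\,b^n=1$. The function names and the restriction to positive integers are assumptions of the sketch.

```python
from math import comb

def bezout(a, b):
    """Extended Euclid for positive integers: (g, u, v) with u*a + v*b == g."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v

def power_certificate(a, b, n):
    """Return (s, t) with s*a**n + t*b**n == 1, assuming gcd(a, b) == 1, n >= 1."""
    g, u, v = bezout(a, b)
    if g != 1:
        raise ValueError("(a) + (b) is not the whole ring")
    N = 2 * n - 1
    # Expand 1 = (u*a + v*b)**N; terms with k >= n contain a**n,
    # terms with k < n contain b**n because N - k >= n.
    s = sum(comb(N, k) * u**k * a**(k - n) * (v * b)**(N - k)
            for k in range(n, N + 1))
    t = sum(comb(N, k) * (u * a)**k * v**(N - k) * b**(N - k - n)
            for k in range(n))
    return s, t

s, t = power_certificate(3, 5, 2)
assert s * 3**2 + t * 5**2 == 1
```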

I know that Thierry Coquand and Henri Lombardi have worked on related questions, but after some skimming through their work I couldn't find an answer to my question.

HeinrichD
  • I don't completely understand your question. But given a multiplicative set $S \subset A$, the existence of a prime ideal $\mathfrak{p}$ such that $\mathfrak{p} \subset A \setminus S$ is equivalent to AC, so for instance "the intersection of all prime ideals is the set of nilpotent elements" cannot be made into a proof without AC. – user40276 Sep 14 '16 at 13:57
  • It is not equivalent to AC; it is weaker. But I only use this for countable $A$ anyway, where one can even give a proof in ZF ("just add elements until you are done"). References are included in the text. – HeinrichD Sep 14 '16 at 14:14
  • The keyword to search is “proof mining”, but I have no idea whether something relevant to prime ideals has been done. – Emil Jeřábek Sep 14 '16 at 14:51
  • The word "otherwise" in proof b gives me pause. Even if "for all $n$, $(x+y)^n \neq 0$" leads to a contradiction, that does not construct $n$ such that $(x+y)^n=0$. So proof b may need more to work constructively. –  Sep 14 '16 at 17:48
  • @MattF. This is a very good point. Thank you. – HeinrichD Sep 14 '16 at 17:53
  • The book https://arxiv.org/abs/1605.04832 by Lombardi and Quitte has its whole Chapter VII devoted to a (not fully formalized, but often rather straightforward-to-follow) method for obtaining a constructive proof from a non-constructive one. Roughly speaking, the idea is to replace the "too-perfect-to-be-true" notions like "prime ideal" or "maximal ideal" or "algebraic closure" by their "dynamic" counterparts. For instance, the "dynamic counterpart" of "maximal ideal" is something like "a prime ideal $I$ that, each time we find two elements $f$ and $g$ satisfying $fg \in I$, can ... – darij grinberg Sep 14 '16 at 21:00
  • ... grow to encompass either $f$ or $g$". The "dynamic counterpart" of "algebraic closure" would be "a field that, every time we have a polynomial over it, can grow by adjoining the roots of this polynomial". The latter example is actually a bit of an oversimplification, since "adjoining the roots" in itself isn't always constructive (it relies on the factorization of the polynomial into irreducibles, which cannot always be computed), so it too gets replaced by the dynamic concept of "adjoining the roots as if the polynomial had a symmetric Galois group and then, ... – darij grinberg Sep 14 '16 at 21:02
  • ... every time the resulting extension ring reveals itself to have zero divisors, getting rid of them by quotienting it by an ideal". As I said, this is not a fully automatic rewriting of a proof (unless the proof is rather limited in its tooling), and some interpretation is required. It also seems to fail whenever some sort of Noetherian or Artinian properties are involved; this is why we constructivists tend to think that the real non-constructive arguments in mathematics are not the "maximal ideal / prime ideal / limit / algebraic closure" kinds of arguments, but the ... – darij grinberg Sep 14 '16 at 21:05
  • ... "Noetherian ring / Artinian ring / compactness" kinds of arguments. Sorry if I misrepresented anything! – darij grinberg Sep 14 '16 at 21:06
  • @Darij Thank you for your comments. I would love to see these turned into an answer, perhaps including the application to the specific example I gave. I couldn't find this in the book by Lombardi and Quitte so far. – HeinrichD Sep 14 '16 at 21:07
  • I'll mark this thread for the future, but I won't have the time to expand this into an answer today. – darij grinberg Sep 14 '16 at 21:09
  • A direct reference is http://www.cse.chalmers.se/~coquand/sitesur.pdf. Also the slides located at http://www.cse.chalmers.se/~coquand/FISCHBACHAU/ are very good. – Ingo Blechschmidt Sep 15 '16 at 09:31
  • @Ingo: The slides were very helpful. Thank you. – HeinrichD Sep 16 '16 at 08:00
  • @HeinrichD: Just saw (part of) your deleted question on characterization of Zariski toposes in my mail feed. If you drop me a mail (iblech@web.de), I can try to give a (very partial) answer. Also I'd be interested in the full text of your question! – Ingo Blechschmidt Oct 13 '16 at 06:29
  • @IngoBlechschmidt: http://mathoverflow.net/questions/252009 :-) – HeinrichD Oct 13 '16 at 06:48

3 Answers

Exactly what kind of answer you get depends on the kind of proof you start with and what exactly you mean by constructive. But the proof mining approach offers an answer to some questions of this kind.

The crucial idea that comes out of it is that, to prove concrete facts, one can generally replace the notion of a prime ideal with a sort of "local primality" notion. Basically, if $P$ is a possibly prime ideal, one often doesn't need it to be truly prime to complete the proof; it might suffice, say, to look at a single pair $f_P, g_P$ with $f_Pg_P\in P$ and to require $f_P\in P$ or $g_P\in P$ to complete the argument. Therefore one can first fix the function $P\mapsto (f_P,g_P)$, and then ask for an ideal $P$ with the property that $f_P\in P$ or $g_P\in P$. (More generally, we might hope to look at only finitely many test cases of this kind.)

At least in suitably restricted settings (say, where we are considering ideals which have reasonable finitary representations of some kind, so that one can address how the ideal $P$ is represented), one can extract constructive proofs along these lines.

William Simmons and I will soon (I hope by the end of the month) be uploading a paper on proof mining in polynomial and differential polynomial rings which tackles some related problems. Working in polynomial rings, however, makes a big difference, because it means ideals are guaranteed to have finite representations.

  • Thank you. Can you say something about how to apply this method to the specific toy example of showing that the sum of two nilpotents is nilpotent? – HeinrichD Sep 14 '16 at 17:33

Here is a method which is very efficient in the case where "constructive" is interpreted as "no axiom of choice at all, not even countable choice, and no law of excluded middle", i.e. essentially "topos logic".

It is possible to construct a very well-behaved "Zariski spectrum" (including its structural sheaf, whose ring of global sections will be exactly $A$) as a locale instead of a topological space: one simply says that $\text{Spec } A$ is the classifying space of the theory of "complements of prime ideals" described below.

One can then construct the structural sheaf and so on, and it is relatively trivial that the set of global sections is $A$ and that an element of $A$ which is nowhere invertible in the structural sheaf is nilpotent.

Points of the spectrum are still the prime ideals, but because it is now a locale instead of a topological space, one does not really care whether points exist or not.

Well, this is not entirely true: technically the points (in the sense of classifying toposes) are the "complements of prime ideals", i.e. subsets $I$ that satisfy $0 \notin I$, $1 \in I$, if $x+y \in I$ then $x\in I$ or $y \in I$, and $yx \in I$ if and only if $x \in I$ and $y \in I$. Assuming the law of excluded middle, this is the same as saying that the complement of $I$ is a prime ideal.
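
To make the four conditions concrete, here is a small Python sketch that checks them by brute force for a subset of $\mathbb{Z}/n$. It is an illustration only: the function name `is_prime_filter` and the choice of the ring $\mathbb{Z}/n$ are assumptions made for the example, and a finite classical check like this does not capture the "universal" $I$ living in the topos.

```python
def is_prime_filter(n, I):
    """Check the four conditions for a subset I of Z/n (elements 0..n-1)."""
    I = {a % n for a in I}
    if 0 in I or 1 % n not in I:
        return False
    for x in range(n):
        for y in range(n):
            # x + y in I  implies  x in I or y in I
            if (x + y) % n in I and x not in I and y not in I:
                return False
            # x*y in I  if and only if  x in I and y in I
            if ((x * y) % n in I) != (x in I and y in I):
                return False
    return True

outside_2 = {a for a in range(12) if a % 2 != 0}  # complement of the prime ideal (2) in Z/12
outside_4 = {a for a in range(12) if a % 4 != 0}  # complement of the non-prime ideal (4)
print(is_prime_filter(12, outside_2), is_prime_filter(12, outside_4))  # prints True False
```

The second set fails because $2 \cdot 2 = 4$ lies outside it although $2$ lies inside.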

Almost all "geometrical arguments" can be made constructive by replacing the ordinary Zariski spectrum by the localic Zariski spectrum, and this includes most proofs that involve using all prime ideals.

This technique is very well known among topos theorists, but I don't know any reference explaining it clearly. Maybe someone will know of one?

In the meantime, I will try to give a little more explanation. Basically, instead of saying "let $\rho$ be a prime ideal" you move to the structural sheaf over the Zariski spectrum. If you know a little bit of internal logic, this amounts to assuming that you have a subset $I$ as above which plays the role of (the complement of) your prime ideal. If you can prove that your element $x$ is never in $I$, then it is nilpotent; if some ideal "always contains an element of $I$", it has to be the whole ring; and so on. Moreover, the structural sheaf is the localization at $I$, and $I$ is exactly the set of elements that are invertible in the structural sheaf.

The drawback is that the rest of the proof has to be performed internally in the topos of sheaves over the Zariski spectrum, hence really has to be constructive (not involving the law of excluded middle), or has to involve working explicitly with sheaves.

Let me illustrate this on your two examples:

The sum of two nilpotents is nilpotent:

($I$ denotes the universal "complement of prime ideal" in the logic of $\text{Spec } A$; it is also the subobject of the structural sheaf consisting of the invertible elements.)

Let $x$ and $y$ be nilpotent. Internally in $\text{Spec } A$, $x$ and $y$ are not invertible (i.e. not elements of $I$), hence $x+y$ is not invertible either (because $x+y \in I \Rightarrow x \in I$ or $y \in I$). Hence $x+y$ is nowhere invertible on the spectrum, hence nilpotent.

The thing about sums of ideals: well, the exact same proof applies, just redefine $V(A)$ to be the corresponding closed subspace of the Zariski spectrum instead of a set of prime ideals...

Simon Henry
  • What value of n does this proof produce for which (x+y)^n=0? –  Sep 14 '16 at 19:47
  • There is at least an explanation of these ideas in Makkai and Reyes, First Order Categorical logic Ch 9.3. – Bas Spitters Sep 14 '16 at 20:55
  • This seems to be the kind of answer I am looking for, but I don't understand it fully yet. You seem to say that $x$ is nilpotent iff $x$ is not contained in any $I$, but I cannot prove this constructively. Probably I have misunderstood something here. I don't understand the proof at the end, but hopefully I can when I have read the basics on the localic Zariski spectrum. I assume that this is the same as the Zariski lattice studied by Lombardi and others? Also, I second Matt's question. – HeinrichD Sep 14 '16 at 21:12
  • @Matt F. : It doesn't produce any explicit value of $n$ (the proof doesn't show that the value of $n$ does not depend on the ring and the element...) – Simon Henry Sep 14 '16 at 21:29
  • @HeinrichD : the trick is that you don't prove it for any $I$ that you can construct in the topos of sets, but for the "universal $I$" that lives in the topos of sheaves over the Zariski spectrum, or equivalently for any $I$ in any topos. – Simon Henry Sep 14 '16 at 21:32
  • @HeinrichD : the relation with the work of Lombardi and others is as follows: what they call the Zariski lattice is the lattice of quasi-compact open subspaces of the Zariski locale. Now there is a duality (similar to Stone duality) between distributive lattices and coherent locales, given by associating to a coherent locale the distributive lattice of its quasi-compact opens. The construction of Lombardi and the one I'm talking about are related by this duality and hence essentially equivalent. – Simon Henry Sep 14 '16 at 21:37
  • Of course it depends on the ring and the element...but a constructive proof that x^m=y^n=0 implies (x+y)^k=0 can always be unwound to exhibit k as a function of m and n (and perhaps some other characteristics of the elements or the ambient ring which are classically less relevant). The simple proof gives k=m+n-1, what function does this proof give? –  Sep 14 '16 at 22:13
  • Well, the only reason I see for such a proof to produce $k$ as a function of $m$ and $n$ is that you can apply it to $\mathbb{Z}[x,y]/(x^n,y^m)$ and actually get a $k$ that works for all rings by universality. But as long as you don't apply it explicitly to that ring, you are not going to get an explicit and "computable" value of $k$: you get such a value if you start from a ring that is "computable" in some sense, which is the case for the ring $\mathbb{Z}[x,y]/(x^n,y^m)$ – Simon Henry Sep 14 '16 at 22:44
  • So yes, by using the universal example and the fact that its operations are nice and computable, you can get (basically by using the above argument internally in some form of effective topos) that $k$ will be a computable function of $n$ and $m$, but the proof being relatively high level, it will be hard to compute this function (I have no idea of even the order of the result)... – Simon Henry Sep 14 '16 at 22:47
  • @SimonHenry Thank you for your comments. I now realize that you didn't mean "not in any $I$", but rather "nowhere in $I$" (which seems to refer to the language internal to the topos of sheaves on $\mathrm{Spec}(A)$). Can you give a precise definition of the locale $\mathrm{Spec}(A)$ and its structure sheaf (or a reference)? Is $I$ defined to be the sheaf of invertible sections? If yes, how to prove $x+y \in I \Rightarrow x \in I \vee y \in I$? (This seems to be the statement that the sheaf is a local ring object). How to prove that $x$ is nilpotent if and only if $x$ nowhere lies in $I$? – HeinrichD Sep 15 '16 at 08:15
  • @SimonHenry And how to write down this proof concretely with elements for some given commutative ring? Do we get the equational proof (a)? What about the toy universal example $\mathbf{Z}[x,y]/(x^2,y^2)$? – HeinrichD Sep 15 '16 at 08:17
  • Some more examples of the technique explained by Simon are in Section 11.4 of these rough notes of mine. – Ingo Blechschmidt Sep 15 '16 at 09:27
  • @HeinrichD : You are correct. For the construction of $\text{Spec } A$ you should have a look at the reference to Makkai & Reyes given by Bas Spitters above and at Ingo's work. The details of the construction will answer all the questions of your first comment. Regarding how to concretely extract a low-level proof with explicit algebraic manipulations for explicit examples: it is theoretically possible by working internally in well-chosen toposes, but it can become very tedious and I would not advise this method if this is the end goal! – Simon Henry Sep 15 '16 at 09:39
  • I haven't understood the localic or topos-theoretic point of view yet, but when we work with the Zariski lattice, the whole argument is circular, at least for my toy example. In a slide by Coquand, he defines the Zariski lattice of $A$ as the free distributive(?) lattice generated by symbols $D(a)$, $a \in A$ modulo the relations $D(0)=0$, $D(1)=1$, $D(ab)=D(a) \wedge D(b)$, $D(a+b) \leq D(a) \vee D(b)$. In order to prove that $D(a)=0$ holds iff $a$ is nilpotent, we need to realize this lattice more concretely, for example as the lattice of radical ideals of f.g. ideals of $A$. Even if we ... – HeinrichD Sep 15 '16 at 14:35
  • ... assume that we didn't already know that these are ideals and pretend that they are just radical sets, we somehow have to prove the relation $D(a+b) \leq D(a) \vee D(b)$ for $D(a) := \sqrt{\langle a \rangle}$. For nilpotent $a,b$, this says exactly that $a+b$ is nilpotent. So it seems that this basic statement is necessary for the construction of the Zariski lattice (and probably, also in the construction of the Zariski locale), but with that lattice we can constructively prove more sophisticated statements. What do others think about this? – HeinrichD Sep 15 '16 at 14:37
  • For a simple example, we can prove $I+J=A \Rightarrow I^n+J^m=A$ for finitely generated ideals $I,J$ (not for arbitrary ideals in this setting, right?) using the Zariski lattice: $D(I^n,J^m)=D(I^n) \vee D(J^m)=D(I) \vee D(J)=D(I,J)=D(1)=1$. Now it should be an exercise to produce from this a direct calculation with elements ... – HeinrichD Sep 15 '16 at 14:43
  • ... or rather with radical ideals: $\sqrt{I^n+J^m}=\sqrt{I^n} \vee \sqrt{J^m} = \sqrt{I} \vee \sqrt{J} = \sqrt{I+J}=\sqrt{A}=A$. So the trick here seems to be that we do not work in the lattice of ideals, but rather in the lattice of radical ideals, where $\vee$ is not the sum of ideals, and the observation $\sqrt{I+J}=\sqrt{I} \vee \sqrt{J}$ (clear without calculation since left adjoints preserve colimits). – HeinrichD Sep 15 '16 at 14:51
  • @HeinrichD : I didn't say it was trivial! Basically what makes things work is the explicit construction of the structural sheaf in this constructive framework, which involves the same sort of work on localization of rings. It is not easy, but it is exactly the same kind of stuff as in the classical framework. If you think about it, proving that an element of a ring is nilpotent if and only if it belongs to all prime ideals is not a simple task (it involves manipulating properties of localizations and Zorn's lemma). (...) – Simon Henry Sep 16 '16 at 11:09
  • When you move to the constructive framework, the part involving Zorn's lemma is somehow replaced by some 'topos theoretic magic', but the part involving algebra and localization of rings is still essentially the same. Unfortunately, a complete treatment of the basic theory of the Zariski spectrum is in my opinion beyond the scope of my answer. That is why I was asking if someone has references. – Simon Henry Sep 16 '16 at 11:11
  • @IngoBlechschmidt: Your notes are very interesting. There you give a constructive proof of generic flatness for f.g. modules on reduced rings, which is "magically simple". What about f.g. algebras? This would answer http://mathoverflow.net/questions/250040/ - thus I would appreciate if you answer there, if you have some ideas. – HeinrichD Sep 19 '16 at 12:57
  • @HeinrichD: I've been following this thread and your other one with much interest. I believe that the techniques explained in my notes indeed suffice to settle the other question; I'll try to write up an answer tomorrow. The key is to use that the structure sheaf of the spectrum of a reduced ring looks like a field from the internal point of view. Therefore finitely generated modules are not not free. – Ingo Blechschmidt Sep 22 '16 at 00:20

Since this is somewhat hidden in the comments, let me give the following answer:

  • The statement that the sum of two nilpotents is nilpotent is so basic that it seems to be used in the construction and the verification of the Zariski locale/topos/lattice. I don't think that constructive algebra can prove this without circular arguments.
  • However, the statement $I+J=A \Rightarrow I^n+J^m=A$ can be proven by working in the lattice of radical ideals: $$\sqrt{I^n+J^m}=\sqrt{I^n} \vee \sqrt{J^m}=\sqrt{I} \vee \sqrt{J}=\sqrt{I+J}=A.$$ Whenever one uses the open subset $D(I)$ in a proof, one may simply replace it by the radical ideal $\sqrt{I}$.
HeinrichD