
This question is cross-posted from academia.stackexchange.com, where it was closed with the advice to post it on MO.


Kevin Buzzard's slides (PDF version) at a recent conference have really unsettled me.

In them, he mentions several examples in what one would imagine are very rigorous areas (e.g., algebraic geometry) where top journals like Annals and Inventiones have published, and never retracted, papers that are now known to be wrong. He also mentions papers relying on unpublished results, taken on trust that those who announced them indeed have a proof.

He writes about his own work:

[...] maybe some of my work in the p-adic Langlands philosophy relies on stuff that is wrong. Or maybe, perhaps less drastically, on stuff which is actually correct, but for which humanity does not actually have a complete proof. If our research is not reproducible, is it science? If my work in pure mathematics is neither useful nor 100 percent guaranteed to be correct, it is surely a waste of time.

He says that, as a result, he has switched to formalizing proofs completely in a proof assistant such as Lean, which guarantees correctness, and thus reusability forever.
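To make "formalizing" concrete: it means writing the proof in a language whose kernel checks every step mechanically. As a minimal sketch (not from Buzzard's talk; `Nat.add_comm` and `rfl` are standard Lean 4 core-library items, and the theorem name here is invented for illustration):

```lean
-- Lean 4: the kernel accepts a theorem only if the supplied proof
-- term type-checks against the stated proposition, so a completed
-- formal proof cannot hide a gap.
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b  -- reuses a library lemma, itself machine-checked

-- Even "obvious" facts need justification; here definitional
-- computation (rfl) suffices.
example : 2 + 2 = 4 := rfl
```

This is also the sense of "reusability forever": once checked, `my_add_comm` can be invoked by any later development without anyone re-auditing its proof.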

Just how widespread is the issue? Are most areas safe, or contaminated? For example, is there some way to track the not-retracted-but-wrong papers?


The answer I accepted on academia.stackexchange before the closure gives a useful general-purpose method, but I'd really appreciate more detailed, area-specific answers. For example, what fraction of your own papers do you expect to rely on a statement "for which humanity does not actually have a complete proof"?

geodude
Archie
  • "If my work in pure mathematics is neither useful nor 100 percent guaranteed to be correct, it is surely a waste of time." There would appear to be a gap in this argument, unless the author has a general proof that "not useful" plus "not 100% guaranteed to be correct" implies "waste of time", in which case there are myriad special cases outside of mathematics, many of them quite surprising and unsettling. – Steven Landsburg Jan 31 '20 at 17:50
  • Just to be clear, this depends on precisely what you mean by "wrong", doesn't it? You have not defined this term yet. I imagine that there are many ways to be wrong, not all of the same kind. For example, does a trivial typographical error count? Grammatical errors? etc. Complete proof? What is that? – Somos Jan 31 '20 at 17:58
  • @Somos: I mean a substantial error making the proof incomplete as such. The paper in Annals mentioned by Buzzard is an example: the opposite statement seems to have been proved later on. – Archie Jan 31 '20 at 18:03
  • There have been substantial discussions on this site about the meaning (and ambiguity) of the word "wrong"; if somebody can track down these discussions and link them, it would be useful. – YCor Jan 31 '20 at 18:05
  • Possible duplicate: https://mathoverflow.net/questions/291890/what-percentage-of-published-mathematics-papers-are-correct/ – Sam Hopkins Jan 31 '20 at 18:20
  • Perhaps relevant: https://en.wikipedia.org/wiki/List_of_incomplete_proofs. – jeq Jan 31 '20 at 19:08
  • @SamHopkins: thanks for this reference, but my question is precisely about papers which have not had an erratum issued but should have one. – Archie Jan 31 '20 at 19:23
  • @StevenLandsburg, I see no gap: after all, the phrases "not useful", "useless", and "waste of time" are ordinarily close to synonymous. The question is just about what gets called "useful", and for some people, glorifying mathematical creativity is useful enough. – Jan 31 '20 at 20:34
  • "What fraction of your own papers do you expect to rely on a statement 'for which humanity does not actually have a complete proof'?" Zero, but the catch is that at least twice I insisted on claiming the proof of the reduction of A to B (B being published by other people and universally accepted as valid) instead of claiming A itself (not that my co-authors were very happy about it, but I find it good practice if one has trouble going through the whole proof tree of B oneself). On the other hand, I'm guilty of publishing at least 2 papers in which some gap was later found (and closed). – fedja Jan 31 '20 at 22:17
  • A complete proof *from what axioms*? fedja just described the perfectly acceptable practice of working in a high-level system where you just take some B on faith, a.k.a. make it part of your axioms. (Likewise, people in Paris used to speak of "axiom Valiron": [everything in this book is true*](//zbmath.org/?q=an:68.0099.03).) Of course, the risk in doing this is that your resulting axiom system might be inconsistent. But then so might ZFC, right? (1/2) – Francois Ziegler Jan 31 '20 at 23:22
  • So I see "relying on stuff for which humanity does not actually have a complete proof" (a contradiction might be found) as hugely different from "relying on stuff that is wrong" (a contradiction has been found). (2/2) – Francois Ziegler Jan 31 '20 at 23:22
  • The actual answers will probably suffice for you, but I still recommend that you look at Thurston's response to the Jaffe–Quinn paper that was mentioned in one of the answers. You may find it helpful in alleviating some of the underlying concerns you seem to have about mathematics as a whole. – Robin Goodfellow Feb 03 '20 at 05:55

2 Answers


"Are most areas safe, or contaminated?"

Most areas are fine. Probably all important areas are fine. Mathematics is fine. The important stuff is 99.99999% likely to be fine because it has been carefully checked. The experts know what is wrong, and the experts are checking the important stuff. The system works. The system has worked for centuries and continues to work.

My talk presents an intentionally highly biased viewpoint, meant to get people talking. It was in a talk in a maths department so I was kind of trolling mathematicians. I think that formal proof verification systems have the potential to offer a lot to mathematicians, and I am very happy to get people talking about them by any means necessary. On the other hand, when I am talking to the formal proofs people, I put on my mathematician's hat and emphasize the paragraph above, saying that we have a human mathematical community which knows what it is doing better than any computer, and that this is why it would be a complete waste of time formalising a proof of Fermat's Last Theorem -- we all know it's true anyway, because Wiles and Taylor proved it, and since then we have generalised the key ideas out of the park.

It is true that there are holes in some proofs. There are plenty of false lemmas in papers. But mathematics is robust in this extraordinary way. More than once in my life I have said to the author of a paper "this proof doesn't work" and their response is "oh I have 3 other proofs, one is bound to work" -- and they're right. Working out what is true is the hard, fun, and interesting part. Mathematicians know well that conjectures are important. But writing down details of an argument is a lot more boring than being imaginative and figuring out how the mathematical world works, and humans generally do a poorer job of this than they could. I am concerned that this will impede progress in the future when computers start to learn to read maths papers (this will happen, I guess, at some point, goodness knows when).

Another thing which I did not stress at all in the Pittsburgh talk but should definitely be mentioned is that although formal proof verification systems are far better when it comes to reliability of proofs, they have a bunch of other problems instead. Formal proofs need to be maintained, it takes gigantic libraries even to do the most basic things (check out Lean's definition of a manifold, for example), different systems are incompatible and systems die out. Furthermore, formal proof verification systems currently have essentially nothing to offer the working mathematician who understands the principles behind their area and knows why the major results in it are true. These are all counterpoints which I didn't talk about at all.

In the future we will find a happy medium, where computers can be used to help humans do mathematics. I am hoping that Tom Hales' Formal Abstracts project will one day start to offer mathematicians something which they actually want (e.g. good search for proofs, or some kind of useful database which actually helps us in practice).

But until then I think we should remember that there's a distinction between "results for which humanity hasn't written down the proof very well, but the experts know how to fill in all of the holes" and "important results which humanity believes and are not actually proved".

I guess one thing that worries me is that perhaps there are areas which are currently fashionable and have holes in them; they will become less fashionable, the experts will leave the area and slowly die out, and then all of a sudden someone will discover a hole which nobody currently alive knows how to fill, even though it might have been the case that experts could once do it.

Kevin Buzzard
  • It is nice to have a reply from the actual author here. – Gerald Edgar Feb 01 '20 at 22:36
  • My perspective on what you say in your fifth paragraph is that writing formal proofs is a lot like programming, so if mathematicians start doing it more they will get to "enjoy" all the nuisances that occur in programming. – Robert Furber Feb 02 '20 at 01:19
  • I enjoyed the talk slides and have been lurking around some of your other online channels for this kind of stuff, but when I see "It was in a talk in a maths department so I was kind of trolling mathematicians" I can't help thinking of "Rex Kramer, Danger Seeker" (look it up on YouTube, as I'd rather not link in case other readers of MO take it the wrong way). – Yemon Choi Feb 02 '20 at 02:05
  • Thank you very much for this answer; I had not understood this was "trolling", I'm quite relieved! – Archie Feb 02 '20 at 06:28
  • For those interested, there's a nice older question about correcting published errors: https://mathoverflow.net/questions/31337/how-do-i-fix-someones-published-error?rq=1 I guess the wiki is still lacking. – Archie Feb 02 '20 at 06:32
  • The only reason that you need a huge library before you can define a manifold is that mathematicians set things up that way. You need three years of undergraduate math before you can teach geometric topology. Formalization just reflects this fact. If we want to do better, then we need to organize math in better ways. Example: homotopy type theory lets you define a circle directly in 5 lines. (But there is a price to pay.) – Andrej Bauer Feb 02 '20 at 07:50
  • Do you really think that only one in ten million important research papers is flawed? That is delusional. – TonyK Feb 02 '20 at 14:57
  • @TonyK: what are you talking about? There obviously aren't 10000000 "important stuff" research papers, so I think we need to read Kevin's "99.99999%" as hyperbole. – Andrej Bauer Feb 02 '20 at 22:39
  • @AndrejBauer: this is MathOverflow. Where numbers are numbers. – TonyK Feb 02 '20 at 23:24
  • @TonyK: Oh please. – Andrej Bauer Feb 02 '20 at 23:59
  • @AndrejBauer: OK, I think you must agree that Kevin Buzzard is saying that the chance that there exists an important research paper with a serious error is vanishingly small. And that in itself is delusional in my opinion. – TonyK Feb 03 '20 at 00:20
  • You're probably right. But you know what the real problem is? It's not that there are probably faulty important papers out there, but rather that we have no idea how many or which ones. – Andrej Bauer Feb 03 '20 at 07:13
  • "... experts will leave the area and slowly die out, and then all of a sudden someone will discover a hole which nobody currently alive knows how to fill, even though it might have been the case that experts could once do it." A natural solution to this problem is to incentivize grad students and advanced undergrads to write papers filling in these gaps, e.g. for a journal of "Trivial Remarks" or "Not Research Level Questions" (or perhaps less self-deprecating venues). MO rep or similar doesn't create this incentive because it doesn't count towards publications. – Elle Najt Feb 03 '20 at 20:28
  • The problem that I have with contemporary research mathematics is that the proofs of important results are usually very complicated and involved. Actually, so complicated and involved that a generic research mathematician working in that area still has a certain chance of not spotting a crucial mistake or gap in the argument. This chance is actually so high that people usually still do not believe in the argument if he/she says that "he/she went through the proof and assured him- or herself that it is correct". Only if an "elder" says this do people actually start believing the argument. – AlexE Dec 10 '20 at 10:08

As Kevin Buzzard himself admits in his answer, he somewhat exaggerated his point for effect.

However, I'd submit that if you were unsettled by his talk, then that's a good thing. I don't think that the proper reaction is to look for reassurance that mathematics really is fine, or that the problems are restricted to some easily quarantined corner.

Rather, I think the proper reaction is to strive for a more accurate view of the true state of the mathematical literature, and refuse to settle for comforting myths that aren't based on reality. Some of the literature is rock-solid and can stand on its own, much more of it is rock-solid provided you have access to the relevant experts, and some of it is gappy but we don't really care. On the other hand, some small percentage of it is gappy or wrong and we do care, but social norms within the mathematical community have caused us to downplay the problems. This last category is important. It is a small percentage, but from a scholarly point of view it is a serious problem, and we should all be aware of it and willing to acknowledge it. If, every time someone brings it up, we try to get them to shut up by repeating some "propaganda" that makes us feel good about mathematics, then we are not solving the problem but perpetuating it.

Some related concerns were raised over 25 years ago by Jaffe and Quinn in their article "Theoretical Mathematics". It generated considerable discussion at the time. Let me quote the first paragraph of Atiyah's response.

I find myself agreeing with much of the detail of the Jaffe–Quinn argument, especially the importance of distinguishing between results based on rigorous proofs and those which have a heuristic basis. Overall, however, I rebel against their general tone and attitude which appears too authoritarian.

My takeaway from this is that Jaffe and Quinn made many valid points, but because this is a sensitive subject, dealing with societal norms, we have to be very careful how we approach it. Given the way that the mathematical community currently works, saying that someone's work has gaps and/or mistakes is often taken to be a personal insult. I think that if, as a community, we were more honest about the fact that proofs are not simply right or wrong, complete or incomplete, but that there is a continuum between the two extremes, then we might be able to patch problems that arise more efficiently, because we wouldn't have to walk on eggshells.

Timothy Chow
  • I have had some push-back against the rather gung-ho approach I have been taking to highlighting the problems I perceive with the way mathematics is currently being done, and I think Tim's post above does a very good job of highlighting the same issues in a more balanced way. Thanks. Some of the issues I highlight in my talk are huge gaps in the literature which the experts have absolutely no doubt can be filled. I personally am uncomfortable about this, but not everyone is. However, there are other gaps, he said, treading on eggshells, where there might be more of a problem. – Kevin Buzzard Feb 07 '20 at 19:08
  • @KevinBuzzard: I do like the way you point out that everyone eventually dies, and that therefore gaps which can be filled only by a tiny number of experts are at risk of becoming permanent gaps. I share your feeling that these sorts of gaps are worrisome. Maybe something can be done about them without ruffling too many feathers, because there's no implicit accusation of an incomplete argument, just a call for the experts to perform "community service." – Timothy Chow Feb 09 '20 at 20:19
  • The classification of finite simple groups could serve as a model. Although you express concern that even the current plan might not be completed successfully, at least it's a huge improvement over the situation as it stood in, say, 1985. Credit should be given to the experts who acknowledged the need for a proper writeup and who dedicated an enormous amount of time and effort to the project. – Timothy Chow Feb 09 '20 at 20:22