What is a proof? by Keith Devlin
This is a copy of a web-page: Devlin’s Angle: What is a proof? by Keith Devlin, from June 2003, originally published in the public domain by the Mathematical association of America, at times it is not available on that association’s website.
What is a proof? The question has two answers. The right wing (“right-or-wrong”, “rule-of-law”) definition is that a proof is a logically correct argument that establishes the truth of a given statement. The left wing answer (fuzzy, democratic, and human centered) is that a proof is an argument that convinces a typical mathematician of the truth of a given statement.
While valid in an idealistic sense, the right wing definition of a proof has the problem that, except for trivial examples, it is not clear that anyone has ever seen such a thing. The traditional examples of correct proofs that have been presented to students for over two thousand years are the geometric arguments Euclid presents in his classic text Elements, written around 350 B.C. But as Hilbert pointed out in the late 19th century, many of those arguments are logically incorrect. Euclid made repeated use of axioms that he had not stated, without which his arguments are not logically valid.
Well, can we be sure that the post-Hilbert versions of Euclid’s arguments are right wing proofs? Like most mathematicians, I would say yes. On what grounds do I make this assertion? Because those arguments convince me and have convinced all other mathematicians I know. But wait a minute, that’s the left wing definition of proof, not the right wing one.
And there you have the problem. Like right wing policies, for all that it appeals to individuals who crave certitude in life, the right wing definition of mathematical proof is an unrealistic ideal that does not survive the first contact with the real world. (Unless you have an army to impose it with force, an approach that mathematicians have hitherto shied away from.)
Even in the otherwise totally idealistic (and right wing) field of mathematics, the central notion of proof turns out to be decidedly left wing the moment you put it to work. In other words, the only notion of proof that makes real sense in mathematics, and which applies to what mathematicians actually do, is the left wing notion.
So much for “What is a proof?” An argument becomes a proof when the mathematical community agrees it is such. How then about the related question “When is proof?” At what point does the community of mathematicians agree that a purported statement has been proved? When does the argument presented become a proof?
In last month’s column, I looked at three recent examples of mathematical proofs that illustrate this question “When is proof?” They are: Andrew Wiles’ 1993 presentation of a proof of Fermat’s Last Theorem, Russian mathematician Grigori Perelman’s 2002 claim to have “possibly proved” the Poincare Conjecture, and Daniel Goldston and Cem Yildirim’s 2003 announcement of a major advance on the Twin Primes Conjecture. All three are far too long and complicated for anyone to seriously believe these are anything but left wing proofs.
In Wiles’ case, a major flaw was discovered in his argument within months of his initial announcement, which took him over a year to fix. (His proof is now accepted as being correct.) Perelman has been guarded in his claim, admitting that it will likely take months before the mathematical community will know for sure whether he is right or not. In the case of the Goldston-Yildirim result, they and the rest of the mathematical community were still sipping their celebratory champagne when Andrew Granville of the University of Montreal and Kannan Soundarajan of the University of Michigan discovered a flaw in the new proof, a flaw that is almost certainly fatal.
Subsequent to the publication of my commentary, Granville sent me an email providing some background on what led him and his colleague to discover the error in the Goldston-Yildirim proof. This episode provides an excellent example of the psychology of doing mathematics.
Like everyone else, when they first read through Goldston and Yildirim’s proof, Granville and Soundarajan thought it was correct. The result was dramatic and surprising, but the argument seemed to work. Like any new mathematical result, the proof was a mixture of old and familiar techniques and some new ideas. Because the result was such a major breakthrough, Granville and Soundarajan, like everyone else, looked especially carefully at the new ideas.
Everything seemed okay.
Goldston and Yildirim did not claim to have proved the Twin Prime Conjecture, that there are infinitely many pairs of primes just 2 apart (such as 3 and 5 or 11 and 13), but they did claim to have made major progress in that direction, showing that there are infinitely many primes p such that the gap to the next prime is very small. (See last month’s column for a precise statement.)
Granville and Soundarajan took Goldston and Yildirim’s argument and extended it to show that there are infinitely many pairs of primes differing by no more than 12. In Granville’s own words, “We were damn close to twin primes !”
Too close, in fact. Not believing their result, the two decided to look again at Goldston and Yildirim’s core lemmas to see if there was some crucial detail that was being too easily glossed over. It took only a couple of hours to home in on one tiny detail that was not fully explained and which they could not prove. And with the discovery of that one tiny flaw, buried deep in the “tried and tested” part of the argument that everyone had accepted as correct, the entire Goldston and Yildirim result fell apart. It wasn’t that the established procedure was in itself wrong; rather, it did not apply under the new circumstances in which Goldston and Yildirim were using it.
Brian Conrey, the director of the American Institute of Mathematics, which played an instrumental role in the research that led Goldston and Yildirim to develop their argument, has commented on how this incident highlights the psychology of breakthrough in mathematics. Goldston and Yildirim’s core lemmas had a familiar flavor and their conclusions were very believable, so everyone believed them. The expectation was that if there was a mistake it would be among the new ideas. So, once several people (including Granville and Soundarajan) had verified that the new ideas were correct, everyone signed off on the new result as being correct. Granville himself arranged to give a series of lectures on the new proof.
But then he and Soundarajan made their own, “gap 12” discovery. As Granville puts it, this took them from the “fantastic” (Goldston and Yildirim’s result) into the realm of “unbelievable”. At that point the psychology changed. Neither Granville nor Soundarajan really believed it could possibly be correct. With that change in belief it became relatively straightforward to pinpoint the error.
But as Granville himself points out, the psychology is important here. They had to have good reason to suspect there was an error before they were able to find it. Conrey has observed that if Granville and Soundarajan had not used the new method to make their own “unbelievable” gap-12 deduction, the Goldston-Yildirim proof would in all probability have been published and the mistake likely not found for some years.
It makes you think, doesn’t it?
It made me wonder about the true status of another highly problematic recent breakthrough in mathematics, University of Michigan mathematician Thomas Hales’ 1998 announcement that after six years of effort he had finally found a proof of Kepler’s Sphere Packing Conjecture.
The problem began with a guess - we can’t really call it more than that - Johannes Kepler made back in 1611 about the most efficient way to pack equal-sized spheres together in a large crate. Is the most efficient packing (i.e. the one that packs most spheres into a given sized crate) the one where the spheres lie in staggered layers, the way greengrocers the world over stack oranges, so that the oranges in each higher layer sit in the hollows made by the four oranges beneath them? (The formal term for this orange-pile arrangement is a face-centered cubic lattice.)
For a small crate, the answer can depend on the actual dimensions of the crates and the spheres. But for a very large crate, you can show, as Kepler did, that the orange-pile arrangement is always more efficient than a number of other regular arrangements. But was it the world champion?
The general problem as considered by Kepler and subsequent mathematicians is formulated not in terms of the number of spheres that can be packed together but the density of the packing, i.e., the total volume of the spheres divided by the total volume of the container into which they are packed. The problem is further generalized by defining the density of a packing (pattern) as the limit of the densities of individual packings (using that pattern) for cubic crates as the volume of the crates approaches infinity.
According to this definition, the orange-pile packing has a density of pi/3sqrt(2) (approximately 0.74). Kepler believed that this is the densest of all arrangements, but was unable to prove it. So were countless succeeding generations of mathematicians.
(In 1993, a highly-respected mathematician at the University of California at Berkeley produced a complicated proof of the Kepler Conjecture which, after several years of debate, most mathematicians decided was incorrect.)
Major progress on the problem was made in the 19th century, when the legendary German mathematician and physicist Karl Friedrich Gauss managed to prove that the orange-pile arrangement was the most efficient among all “lattice packings”. A lattice packing is one where the centers of the spheres are all arranged in a “lattice”, a regular three-dimensional grid, like a lattice fence.
But there are non lattice arrangements that are almost as efficient than the orange-pile, so Gauss’s result did not solve the problem completely.
The next major advance came in 1953, when a Hungarian mathematician, Laszlo Fejes Toth, managed to reduce the problem to a huge calculation involving many specific cases. This opened the door to solving the problem using a computer.
In 1998, after six years work, Hales announced that he had indeed found a computer-aided proof. He posted the entire argument on the Internet. The proof involved hundreds of pages of text and gigabytes of computer programs and data. To “follow” Hales’ argument, you had to download his programs and run them.
Hales admitted at the time that, with a proof this long and complex, involving a great deal of computation, it would be some time before anyone could be absolutely sure it is correct. By posting everything on the world wide web, he was challenging the entire mathematical community to see if they could find anything wrong.
Hales’ result was so important that, soon after he made his announcement, the highly prestigious journal Annals of Mathematics made the unusual step of actively soliciting the paper for publication, and hosted a conference in January 1999 that was devoted to understanding the proof. A panel of 12 referees was assigned to the task of verifying the correctness of the proof, with world expert Toth in charge of the reviewing process.
After four full years of deliberation, Toth returned a report stating that he was 99% certain of the correctness of the proof, but that the team had been unable to completely certify the argument.
In a letter to Hales, Robert MacPherson, the editor of the journal, said of the evaluation process:
The news from the referees is bad, from my perspective. They have not been able to certify the correctness of the proof, and will not be able to certify it in the future, because they have run out of energy to devote to the problem. This is not what I had hoped for.
The referees put a level of energy into this that is, in my experience, unprecedented. They ran a seminar on it for a long time. A number of people were involved, and they worked hard. They checked many local statements in the proof, and each time they found that what you claimed was in fact correct. Some of these local checks were highly non-obvious at first, and required weeks to see that they worked out. The fact that some of these worked out is the basis for the 99% statement of Fejes Toth.
Well, how far have we come in ruling on this proof, five years after it was first announced? Experts who have visited Hales’ website and looked through the material have said that it looks convincing. But no one has yet declared it to be 100% correct. And with the recent episode of Goldston and Yildirim’s incomparably less complicated argument about prime numbers still fresh in our minds, would we be prepared to sign off on Hales’ result even if someone had made such a claim?
Hales himself sees the process of verifying his proof as an active work in progress. He has initiated what he calls the Flyspeck Project, the goal of which is to produce a more detailed (and hence more right wing) proof of the Kepler Conjecture. (He came up with the name ‘flyspeck’ by matching the pattern /f.*p.*k/ against an English dictionary. FPK in turn is an acronym for “The Formal Proof of Kepler.” The word ‘flyspeck’ can mean to examine closely or in minute detail; or to scrutinize. As Hales observes, the term is highly appropriate for a project intended to scrutinize the minute details of a mathematical proof.)
Here is how Hales describes what he means by the term “formal proof” in his project title.
Traditional mathematical proofs are written in a way to make them easily understood by mathematicians. Routine logical steps are omitted. An enormous amount of context is assumed on the part of the reader. Proofs, especially in topology and geometry, rely on intuitive arguments in situations where a trained mathematician would be capable of translating those intuitive arguments into a more rigorous argument.
In a formal proof, all the intermediate logical steps are supplied. No appeal is made to intuition, even if the translation from intuition to logic is routine. Thus, a formal proof is less intuitive, and yet less susceptible to logical errors.
Clearly, what Hales is talking about here is something that could only be carried out on a computer running purposely written software.
Hence the Flyspeck Project, which Hales believes is the only way to tackle the problem of verifying a proof such as his. The idea is to make use of two resources that were not available to previous generations of mathematicians: the Internet and massive amounts of computer power. “It is not the sort of project that can be completed by a single individual” Hales says. “Instead it will require the collective efforts of a large and dedicated team over a long period”.
He is currently recruiting collaborators and team members from around the world to work with him on the project. He estimates that it may take 20 work-years to complete the task. At the end of which, the mathematical community may indeed be able to declare Kepler’s Conjecture as finally proven. But what will such a statement really mean? The computer proof will, I think we will all agree, be a right wing proof. Of something. But then there is the thorny question of deciding whether what the computer has done amounts to a proof of the Kepler Conjecture. And that is a decision that only the mathematical community can make. We will have to decide that the computer program really does what its designers intended, and whether that intention does in fact prove the Conjecture. And those parts of the process are inescapably left wing. In other words, the Flyspeck Project amounts to making use of the process of generating a right wing proof as a method for arriving at a left wing proof.
Toth thinks that this situation will occur more and more often in mathematics. He says it is similar to the situation in experimental science - other scientists acting as referees cannot certify the correctness of an experiment, they can only subject the paper to consistency checks. He thinks that the mathematical community will have to get used to this state of affairs.
When it comes down to it, mathematics, for all that it appears to be the most right wing of disciplines, turns out in practice to be left wing to the core.