# How do mathematicians read and understand long and involved equations/proofs?

You know, those equations or proofs you encounter from time to time that have line after line of dense notation. Do they break them into chunks and put each chunk into words (“…the rate of change of the first variable compared with the second variable…”), or do they perhaps follow what happens to one of the variables, then see what happens to another, and then intuit a relationship? How does it happen?

A previous thread you may find of interest: Can those brilliant professor types really look at those long ass equations on the whiteboard?

Very carefully.

I’m not a mathematician, but I do sometimes have to follow proofs when I’m reviewing a paper. The way to do it is to understand the flow of the proof and what is meant by the components, and to figure out why the transformation in each step is being done and what method is used to make it. (All of which should be explained in the text.) If you see each step as a random collection of variables and operators, you’ll never understand it.

Kind of like how grandmasters can memorize chess positions that mean something far better than positions that are just a random collection of pieces.

You know how they say that you really speak a foreign language once you don’t translate in your head any more?

It’s a lot like that.

Even the most formal proofs can still be broken down step-by-step. Each line must follow from previous lines according to established rules, and if it doesn’t, then the proof is invalid (or at least, incomplete).
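As a small illustration of what "each line follows from previous lines" looks like, here is a sketch of a very short proof that the sum of two even integers is even. The example and the symbols a, b, m, n are mine, not from any particular paper; each line carries the rule that justifies it.

```latex
\begin{align*}
a &= 2m, \quad b = 2n
  && \text{(definition of even, with } m, n \in \mathbb{Z}\text{)} \\
a + b &= 2m + 2n
  && \text{(substitute the line above)} \\
      &= 2(m + n)
  && \text{(distributive law)}
\end{align*}
% Since m + n is an integer, a + b is even by the definition of even.
```

If any one of those justifications were missing or wrong, that is the exact line where a careful reader would stop.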

Now, this doesn’t help much in constructing the proof in the first place, since there are always a large number of possible rules you can apply at each step, and it’s often not clear at first glance which ones are actually going to bring you closer to the conclusion you’re looking for. That’s an entirely different skill set, and having the author of the proof explain how he came up with the particular path he did can make it easier to follow. But even without that understanding, you can still always take it step by step.

Most of the proofs a mathematician reads in his or her own field will involve familiar lines of argument with relatively familiar objects. A well-written proof will not only show you the main goal but subgoals along the way, or the obstacles will be pretty clear. It’s like navigating terrain: I know we want to end up at this point north of here, but between here and there I recognize a mountain, which we’ll either have to go around or through – ah, we’re turning to the east, so we must be preparing to go around it…

It’s when the line of argument or the objects involved are novel that you have to read very slowly and carefully and, with luck, build up some of your own intuition about what is going on. Again, a good writer will help, but intuition is idiosyncratic. When a really novel proof comes along (like Wiles’s proof of Fermat’s Last Theorem) you may get teams of mathematicians poring over it together to try to understand and verify it.

If a mathematician has to read a proof in an unfamiliar field, sometimes it can be almost as hard as it would be for a non-mathematician (or, say, an undergrad math major).

Very much what I was coming in here to say. As a computer scientist, I’m often asked more or less the same thing by non-programmers about reading and writing code. When it’s something I work with on a regular basis, it’s really not all that different from just reading English. I’ve also read a lot of very mathy papers, and the biggest factor in how readable they are has always been how carefully the authors designed their notation and how they laid it out.

In that regard, a lot of the same sorts of practices that make for good code or even writing a good paper in natural language also make for good organization of mathematical concepts. I’ve read a few papers that covered topics I knew very well but were still very difficult reads because they were organized poorly or they chose a manner of notation that didn’t work for me. Other times, I’ve read ones that weren’t so close to my expertise, but weren’t too bad to work through because they were well designed.

For example, one mistake I’ve seen is authors who, perhaps not aware of certain conventions, use a different way of representing well-understood concepts. As a lame example, imagine you’re reading a mathematical paper and the author decides it’d be a good idea to let x be equal to the ratio of a circle’s circumference to its diameter and then writes all kinds of formulae using x instead of pi.
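To make that concrete, here is a quick sketch of two familiar circle formulas written both ways. The choice of x for the circle constant is the hypothetical one from the example above; nothing else changes.

```latex
% Conventional notation: recognized at a glance
\[
  C = 2\pi r, \qquad A = \pi r^{2}
\]

% Same statements with x defined as circumference / diameter
\[
  C = 2xr, \qquad A = xr^{2}
\]
```

Both versions say the same thing, but the second forces the reader to keep translating x back to pi on every line.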

I’ve seen that sort of stuff in computer science papers ALL the time and it makes for a brutal read. But at the same time, when the authors DO use the well-established conventions, and choose logical variables and other ways of representing concepts, it’s remarkably easier to read than you might think. To a layman, seeing all sorts of weird Greek letters and superscripts and subscripts might look needlessly complex, but it’s quite likely that within the field many of those letters stand for well-established concepts, in the same way that pi in a formula needs no explanation for most people: anyone reading it already has an intuitive understanding of the concept it represents.

That all said, sometimes papers are far enough out there, with enough new or esoteric concepts, or are just so complex, that the authors need to come up with new notation or find a way to make certain things consistent, and they’re going to take time to process anyway. In those cases, I don’t think you’ll ever see anyone who wasn’t actually involved in writing the paper just run through the math easily; it’s going to be a difficult read no matter what.

That reminds me of one of my favorite junior high school English teachers, who had started as a math major in college. She told us that an English essay should be as carefully and logically structured as a mathematical proof. It works the other way as well.

Blaster Master makes a good point. Any technical paper should always define all of its terms, and in principle, you can use any terms you want in your definitions, and it’s not invalid. But there are some widespread conventions about definitions, and it’s a good idea to write your definitions to correspond to those. For instance, you can label an angle with any letter you want… But it’s conventional to use theta for an angle (and phi and maybe psi, if you have more than one). If you have only one relevant angle in your work, and it isn’t labeled theta, or if you use theta to represent something that isn’t an angle, you’re making things harder for your reader.

Feynman described a method he used to follow complex steps:

It’s something you pick up in the course of reading and writing proofs. You start to recognize the clever bits and distinguish them from the brute force calculations, to notice why intermediate results or calculations are invoked, and to keep track of a large number of hypotheses and results in your head. Also, if you’re reading a long and involved proof, you’re probably fairly conversant with the subject already; it’s not like being handed a novel on the first day of French 101. In papers and books, results are broken into lemmas, propositions, etc. While you need to understand the proof of Lemma 1.1, the proof of Theorem 2.5 will only refer to Lemma 1.1 itself, not its proof. That allows for a bit of compartmentalization in particularly long papers.

Plus, those lemmas can often be re-used in multiple parts of the proof, or even be interesting results in their own right.

You’ve just got to be careful that you’re being consistent with what’s needed. If you have a lemma that states “Under conditions A, B, C, and D, then Z holds”, and you apply it under conditions A, B, and C, you might be in for a rude surprise if it turns out that you didn’t actually have D.
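A standard textbook case of this, offered as an illustration rather than anything from the thread: the extreme value theorem needs the interval to be closed, and quietly dropping that one condition breaks the conclusion.

```latex
\textbf{Lemma.} If $f$ is continuous on a closed, bounded interval $[a,b]$,
then $f$ attains a maximum on $[a,b]$.

\textbf{Misuse.} Take $f(x) = x$ on the open interval $(0,1)$.
It is continuous and the interval is bounded, but the interval is not closed:
\[
  \sup_{x \in (0,1)} f(x) = 1,
  \qquad \text{yet } f(x) < 1 \text{ for every } x \in (0,1),
\]
so no maximum is attained.
```

Here "closed" plays the role of condition D: everything else checks out, and the conclusion still fails.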

The analogy I was going to draw would be with, say, a piano score, which is just a jumble of dots to most of us, a sequence of notes that I can’t read well enough to do anything with, and a road map for an experienced concert pianist.

First place, let’s admit it, it is hard. Damnably hard. Second, unless the author has taken real pains, it can get to be impossible. I am currently working with a coauthor who is a really clever mathematician. He has done several really brilliant things that blow my mind. Then he writes it up. First thing that happens is he never looks back to ask, “Now what is really going on here?” I have to take his proofs, think seriously about what they mean and then recast them in terms I (and he too) can understand. Sometimes they simplify enormously.

That said, good writing means making it somehow a kind of narrative. You try to explain the idea behind a proof, you then introduce each step by saying something like, “Now we will…” and hope the reader will follow it. Whenever possible, you cast everything in terms of words. But of course, you use symbols when you have to. And you try to build in as much mnemonic information as you can. x will be an element of the object X, which will lie in a category script X, and so on. And f: X --> Y will denote a function. Also, and this may surprise some people, especially physicists who are, I believe, held by referees and editors to notational orthodoxy, we do not have reserved symbols. If there are no circles with radii or diameters anywhere in sight, pi may be used for something unrelated, say a projection operator. Perfectly fair (and even commonplace). Another thing. Often a definition will be given early on and then not mentioned for 20 pages. When it is used again, it is helpful to say something like, “Recall from Section 2 that …”. There are many such devices to help the reader.

Favorite math cartoon: http://star.psy.ohio-state.edu/coglab/Pictures/miracle.gif

All of this is true of physicists, as well. It’s a good idea to follow the standard conventions (and to state them explicitly, in case others don’t agree on what’s standard), but it’s not actually enforced, except informally in that a confusing paper will get fewer citations. And I’ve seen pi used for a variety of things unrelated to three-and-a-bit-more, though if you’re doing that it’s a really good idea to make it explicit what you mean by it.

A mathematician I knew answered a similar question of mine (i.e. “How did you even understand this, let alone solve it so fast?”) with “Oh, I just visualized the problem in an N-dimensional space and it’s intuitive from there”.
I could never figure out whether he was taking the piss.

We got a new Physical Chemistry teacher in college who stumped us by writing on the blackboard

ПΦ=νΡΤ

After some back and forth, we figured he meant PV=nRT and asked him to please, pretty please, define his terms if he was going to use symbols other than the ones every book and article has used for upwards of 100 years for those same things. We were never able to figure out where the heck he had learned to write that equation with Greek letters; when we asked, he just looked confused.

Also, you can very quickly pass judgement on someone as they are standing at the board writing out their proof. That’s why many people hate math: there is nowhere to hide up there, and you will instantly face criticism from your more knowledgeable peers and instructors. You are basically showing the world how your brain functions and are taking a risk of being called out as being illogical (and thus ugly in the math world). Many people can’t hack even helpful criticism, and the further along you get in math, the less helpful and more competitive the criticism becomes.

I once shamed a very bright Vietnamese fellow at the board by correcting his work while I was writing out my proof to another equally complex differential equations problem on the other board. I went over with my chalk and marked up his equation and had a short conversation with him, and then he acted as if I had just beaten him in a fencing match. And then the class cheered.