I tried to read Dirac’s paper on this and soon got lost. Can someone dumb it down for me?
Whenever you have an electric field and a magnetic field in some space, you get an angular momentum associated with that combination of fields (strictly speaking, an angular momentum density). With ordinary dipole magnetic fields, this doesn’t matter, because if you add up all of the pieces of angular momentum from all of space, the total is zero. But if you have both an electric charge and a magnetic monopole, when you add up all of the angular momentum, the total isn’t zero, and it doesn’t depend on where the charges are, or how far apart they are, but is just proportional to the product of the electric charge and the magnetic charge.
But we know that angular momentum is quantized (it comes in half-integer multiples of the reduced Planck constant \hbar). So the product of any amount of magnetic charge and any amount of electric charge must be proportional to an integer times \hbar. But the only way for that to happen is if electric charge and magnetic charge individually both only come in integer multiples of some fundamental unit.
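(For anyone who wants the punchline as an equation: this is a sketch of the standard textbook version of the argument, in Gaussian units, not anything specific to Dirac’s paper. The field angular momentum of a point electric charge e and a magnetic monopole g has magnitude

L = eg/c,

independent of their separation, pointing along the line joining them. Demanding that this be a half-integer multiple of \hbar, i.e. L = n\hbar/2, gives

eg = n\hbar c/2,

which is the same condition Dirac derives by the very different route discussed below.)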
That’s what I thought but I got lost in all of the mathematics.
To be fair, that’s not what Dirac’s paper is about anyway. Dirac’s argument is rather different (and more complicated but arguably more rigorous). If this thread is still a going concern by next week, I can outline Dirac’s treatment. (No time before then, unfortunately.) The semi-classical argument summarized by @Chronos is due to M. N. Saha.
This is my shocked face, that Dirac is both more complicated and more rigorous than me:
Oh, no, wait, that’s just my normal face.
I hereby request such an outline
Here’s a copy of Dirac’s 1931 paper
Prerequisites for the summary below:
Required: Complex numbers
Recommended: Calculus in one variable; some passing notions of quantum mechanical ideas
Ideal but skippable: Vector calculus; Gauss’s law
I had to aim for a fairly technical reader or else this would turn into a book. There are just a lot of tools used in Dirac’s argument. I can break any piece down further, but I’ll wait to see which specific parts people are interested in having broken down.
So, for better or worse, here’s a summary.
Section 1 of the paper is just Dirac preparing the reader for his upcoming stunt-work by drawing analogies to other advances in physics that have come through mathematical abstraction. Nothing essential happens in this section, although it is a fun way to travel back to that time period.
Section 2: In quantum mechanics (QM), the state of a system (say just a single particle) can be described by a complex-valued function of space and time. This section discusses how such QM wavefunctions have some level of arbitrariness to them.
Any complex number can be written in the form of the paper’s Eq. (2), namely Ae^{i\gamma}. If we take the amplitude A and phase \gamma to be functions of space and time, we can decompose any wavefunction \psi into these (real-valued) component functions.
Wavefunctions can be used in QM to calculate what an experiment might measure (i.e., to calculate an observable quantity). Observables in QM always involve a product (i.e., multiplication) of some wavefunction and the complex conjugate of another wavefunction (where the complex conjugate takes i\rightarrow -i). Dirac starts with the simplest case of this type of product: a wavefunction times its own conjugate.
For calculations involving this product, we could multiply the wavefunction of interest by e^{ia}, with a being some constant, and the calculation won’t change. This is because e^{ia} times its conjugate e^{-ia} is precisely 1 and thus it falls out of the calculation. Note that multiplying our wavefunction by e^{ia} in this way is the same as saying we’re adding a to our phase \gamma. So this can be thought of as an indeterminacy (or level of arbitrariness) of the phase of the wavefunction. The phase can be bigger everywhere by a and nothing observable changes.
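Spelled out (with {}^* denoting the complex conjugate):

(\psi e^{ia})^* (\psi e^{ia}) = \psi^* e^{-ia} e^{ia} \psi = \psi^* \psi,

which is exactly what we would have gotten without the extra phase.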
In fact, for calculations involving simply \psi times its own conjugate, we can make the phase much more arbitrary, since the phase perfectly cancels out no matter what we do to it, including having it change by different amounts all over space and time.
But, we should consider other, more general, products that we might encounter when applying QM. In general, an observable is calculated by applying an “operator” (mathematical transformation) to a wavefunction and then multiplying the result of that by the conjugate of possibly some other wavefunction. Thus, you don’t have freedom to completely yank around phases with no consequence. It is natural to ask, though, what freedom you do have for any possible observable and for any possible QM superposition of states.
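To see concretely why operators spoil the easy cancellation (this little example is mine, not the paper’s, but it’s the standard illustration): take the momentum operator -i\hbar\nabla and give the wavefunction a position-dependent phase \gamma(x). Then

-i\hbar\nabla(e^{i\gamma}\psi) = e^{i\gamma}(-i\hbar\nabla\psi + \hbar(\nabla\gamma)\psi),

so even after the e^{i\gamma} cancels against a conjugate’s e^{-i\gamma}, a leftover \hbar\nabla\gamma term survives. Keeping track of leftovers like that is exactly what the kappas below will do.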
(Required aside on integrability.)
Consider a topographical map (elevation map) of some piece of land. If you draw a path between two points on the map, the starting elevation value and the ending one will probably be different. Along the path, your elevation will rise and fall. If you drew your path in a closed loop (same start and end points), you would definitely end at the same elevation that you started at, despite any changes along the way.
You can think of the journey along a path as integrating (in the calculus sense; or “summing” if no calculus) the slopes (derivatives) of the terrain along your path. If you “add up” all the step-by-step elevation changes along the path, you get the net elevation change from start to end. And for a closed path, this net change in elevation will be zero.
This is an example of an “integrable” situation. If you tell me where you start and where you end, I don’t have to physically walk the path to learn what the elevation change will be between start and end. I can rely on the “integral” between the two points being well-defined for any path, so I have everything I need to know about the elevation change just by knowing the start and end points’ elevations.
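In calculus terms, this is the gradient theorem: for an elevation function h,

\int_{path} \nabla h \cdot d\vec{l} = h(end) - h(start),

so the result depends only on the endpoints, and any closed loop gives exactly zero.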
Obviously that’s true for a topo map. For a general mathematical function of space (and, here, time), that doesn’t have to be the case. In a non-integrable case, moving between two points along different paths might lead to a different net integral (or sum) of slopes (changes, derivatives, etc.).
(End of aside.)
As we look to make a wavefunction’s phase more arbitrary, can non-integrability enter the picture? We still need observables to be well-defined, but those always involve products, so maybe a single wavefunction can have non-integrable phases. Dirac shows that this is indeed okay so long as all wavefunctions have the same non-integrability. In particular, this will mean that any residual phase offsets around a closed loop will cancel when multiplying wavefunctions with conjugates of other wavefunctions.
This in turn means that all wavefunctions in the system must share a set of phase shifts that have well-defined derivatives throughout spacetime. (It’s the slopes that are well-defined, not the phases themselves.) These derivatives are the kappas (\kappa) defined in the paper.
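In symbols (paraphrasing the paper’s definitions): if \beta is the possibly non-integrable extra phase, the kappas are its derivatives, \vec{\kappa} = \nabla\beta (plus a time component \partial\beta/\partial t), and it’s \vec{\kappa}, not \beta itself, that has to be a well-defined, single-valued function.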
There is a bit of vector calculus at this stage, but I’ll try to steer around it. You can just note that Stokes’ theorem (stated in Eq. (4)) relates the integral of the kappas around a closed loop to the integral of the curl of the kappa vector across the surface bounded by that loop. If that sentence is gibberish, the takeaway is that we can recast our kappa condition as “all wavefunctions need to have the same curl of the kappa vector.” Curl is a certain multi-dimensional derivative. It shows up centrally in electromagnetism, importantly for us.
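For those who want it in symbols, the statement is (schematically):

\oint \vec{\kappa} \cdot d\vec{l} = \int_S (\nabla \times \vec{\kappa}) \cdot d\vec{S},

i.e., the total of the kappas around the closed loop equals the flux of their curl through any surface S that the loop bounds.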
If you take a “normal” wavefunction that satisfies a free-space QM wave equation and you compare how a “kappa-shifted” version of it looks, the latter looks just like a charged particle moving through an electromagnetic field, and further the magnetic field part is simply the curl of the kappa vector (times physical constants that include electric charge). And, the change in phase around a closed loop now maps onto the surface integral of the magnetic field passing through the surface that the loop bounds (a.k.a. the magnetic flux through the loop). If this isn’t clear, it might be better later.
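Sketching that identification (in Gaussian units; conventions move the constants around): if we relate the kappa vector to the electromagnetic vector potential \vec{A} by \vec{\kappa} = (e/\hbar c)\vec{A}, then its curl is \nabla\times\vec{\kappa} = (e/\hbar c)\vec{B}, and by the Stokes relation above the phase change around a closed loop is

\Delta\gamma = \oint \vec{\kappa}\cdot d\vec{l} = (e/\hbar c)\int_S \vec{B}\cdot d\vec{S} = (e/\hbar c)\Phi_B,

with \Phi_B the magnetic flux through the loop.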
Section 3: But first! The conclusion that all wavefunctions have to have the same change in phase around a closed loop can maybe be softened a little bit more. The change could be arbitrary up to any integer multiple of 2\pi, since e^{i\beta} and e^{i(\beta+2\pi n)} are the same for any integer n. This can’t apply to just any loop, though, since a wavefunction \psi still has to be continuous, and I can make my loops very tiny. If my loop is very tiny, I can’t build up any significant phase change when going around it given the continuity of \psi, certainly not 2\pi's worth of change. But there’s an exception to this, namely cases where our wavefunction goes to zero along a line (a “nodal line”). In this case, the continuity argument doesn’t work all the way to zero loop size, so we could still have phase changes around arbitrarily tiny loops, although they must still be integer multiples of 2\pi for our QM reasons.
A common trick in vector calculus is to sum up lots of adjacent small loops to form one big loop. Doing that allows us to grow our infinitesimal loops into macroscopic loops, and thus we can say that any loop that doesn’t go around a nodal line of a wavefunction doesn’t have the 2\pi n phase arbitrariness, but any loop that does, does.
Returning to the Stokes’ theorem piece: we observe that the change in phase around a closed loop is directly related to the magnetic flux through any surface that the loop bounds. Think of the magnetic flux as the net amount of field lines poking through the surface one way vs. the other. Separately, the total change in phase around the loop is related to the sum of a bunch of 2\pi n arbitrariness factors, one for each wavefunction nodal line that passes through that surface.
That last paragraph leads to Eq. (8). Of interest is when we let the surface “poof” out from the loop, leaving a surface that is a big sphere (say) with a hole where the loop is. Let’s go even further and take the size of the loop to zero to actually form a fully closed ball-like surface. As long as we placed our going-to-zero-size loop not on a nodal line, then the total change in phase in Eq. (8) must be zero, by wavefunction continuity.
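Schematically (I’m paraphrasing the paper’s notation), Eq. (8) reads

\Delta\gamma = 2\pi\sum_i n_i + (e/\hbar c)\int_S \vec{B}\cdot d\vec{S},

and in the closed-surface limit just described, the left side is zero, so the two terms on the right must cancel each other.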
The only way to have net magnetic flux passing through the closed ball-like surface is to have a monopole inside. (This argument is the same as Gauss’s law in electrostatics, if that’s useful. If there’s no monopole inside, then flux lines entering and exiting will perfectly cancel.) And, the only way to cancel such a net flux in Eq. (8) is to have some discretized value for the first term. Thus, a product of electric charge and magnetic flux, which in turn means a product of electric charge and magnetic charge, has to equal some discrete value or values. The only way to ensure that holds for arbitrary configurations is if these charges are quantized.
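Putting numbers on that last step (Gaussian units again, as a sketch): the magnetic version of Gauss’s law says a monopole of charge g has total flux 4\pi g through any closed surface around it, so the cancellation condition becomes

0 = 2\pi\sum_i n_i + (e/\hbar c)\,4\pi g, which rearranges to eg = n\hbar c/2

for some integer n (sign absorbed into n). That’s the Dirac quantization condition, with the same \hbar c/2 unit that appears in the semi-classical argument at the top of the thread.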
I’ll just add that Stokes’ Theorem is, unsurprisingly, very closely related to Gauss’s Theorem.
Beautiful post
@Pasta let me second this, very well written and comprehensible.
When I was studying physics last century, one of my profs told a story about a lecture that Dirac was invited to give to the faculty of a university. At the end of the lecture someone in the audience raised their hand to say “Professor, I don’t understand the derivation you showed to lead to your conclusion.” Apparently Dirac just stood there without responding, and after a moment of awkward silence the host said “Professor, I think that my colleague has asked you a question.” Dirac is supposed to have replied “Oh, I thought that he was making a statement.”
What exactly should a lecturer reply to such a question at the end of a lecture? Presumably he explained it to the best of his ability, and the “question” seems to ask for a complete review. If the “question” were more specific, say “Professor, I don’t understand how you got from equation n to n+1 in the derivation,” then an answer might be forthcoming.
Wow. This post alone may be worthy of an academic paper.
I heard someone did write a paper about it, circa 1931.
It could maybe be fodder for a “special topics” lecture for physics majors or something. I was at first surprised not to find something like this already out there, but reviews of monopoles and quantization tend to skim over Dirac quickly and move on to more modern treatments. But it makes sense, since the mathematical toolkit of physics has evolved enough that advanced treatments are in a way more direct, though less accessible.