Small confusion about the Aharonov-Bohm effect

Question

I am mostly aware of the Aharonov-Bohm effect's (AB effect) physical interpretation, as well as the corresponding mathematical/differential geometric interpretation.

What does confuse me slightly however is the physical part of the derivation leading to it. Namely, in a "heuristic" description, one usually brings up trajectories, namely that either "an electron going one way and another the other way around the cylinder will pick up a phase shift" or an "electron going around the cylinder will pick up a phase shift compared to its original value".

In QM, however, there are no trajectories, though of course there is the path integral point of view, and I know the AB effect can be approached from this perspective too (Sakurai, for example). But in a Hungarian textbook I have seen a particularly simple way to derive the phase shift.

Let $C$ be the (solid) cylinder in $\mathbb R^3$ and let $M=\mathbb R^3\setminus C$. The manifold $M$ is not contractible, so Poincaré's lemma does not apply. In particular, if $\mathbf A$ is the vector potential, $$ \boldsymbol{\nabla}\times\mathbf A=\mathbf B=0 $$ does not imply that there exists a globally defined scalar field $\chi$ such that $\mathbf A=\boldsymbol{\nabla}\chi$.

Let $\psi$ be the wave function satisfying the Schrödinger equation $$ i\hbar\partial_t\psi=-\frac{\hbar^2}{2m}D^2\psi $$ with $$ \mathbf D=\boldsymbol{\nabla}+iq\mathbf A $$ the covariant derivative. I assume the proper interpretation should be that $\psi$ simply describes the state of an electron that is diffracted on the cylinder. Let $\psi_0$ be the wave function corresponding to the case of $\mathbf A=0$.

Now, let us partition $M$ into two halves, $M^+$ and $M^-$ such that both domains are contractible. Since they are contractible, with $\mathbf B=0$, one can choose gauge transformations $\chi^+$ and $\chi^-$ to "turn off" $\mathbf A$. Dropping the $\pm$ signs, this gauge function is given by $$ \chi(\mathbf x)=-\int_{\mathbf {x}_0}^{\mathbf{x}}\mathbf A(\mathbf y)\cdot d\mathbf y $$ where the integral is performed over any curve connecting the arbitrary initial point $\mathbf x_0$ with the target point $\mathbf x$ (as the integral is path-independent).

Since $\chi$ turns off $\mathbf A$ (in one of the $M^\pm$ domains), we have $$ \psi_0(\mathbf x)=e^{-\int^\mathbf x \mathbf A(\mathbf y)\cdot d\mathbf y}\psi(\mathbf x). $$

Reversing, we have $$ \psi(\mathbf x)=e^{\int^\mathbf x \mathbf A(\mathbf y)\cdot d\mathbf y}\psi_0(\mathbf x). $$

Now we perform this procedure on both trivializations and compare them: $$ \psi^+(\mathbf x_1)/\psi^-(\mathbf x_1)=e^{\int_{\gamma^+}^{\mathbf x_1}\mathbf A (\mathbf y)d\mathbf y}e^{-\int_{\gamma^-}^{\mathbf x_1}\mathbf A (\mathbf y)d\mathbf y}=\exp\left(\oint\mathbf A(\mathbf y)\cdot d\mathbf y\right). $$ (I have probably dropped some $q$s and $\hbar$s somewhere, but that doesn't affect the basic method.)

Question:

I am imagining that if the diffraction on the cylinder actually happens, then the diffracted electron is described by a wave function $\psi$, in particular, a wave function that is single-valued.

If the wave function is single-valued, then we should have a well-defined $\psi(\mathbf x_1)$, and we cannot have differing $\psi^+(\mathbf x_1)$ and $\psi^-(\mathbf x_1)$ wave functions.

However, despite what the connection-theoretic background would suggest, we did not actually calculate a parallel transport, but a gauge transformation. So the two wave functions need not agree, as they are in different gauges. But then why do we compare them? Comparing them and saying they differ would be akin to comparing a vector to itself in two different coordinate systems and saying they differ because the components don't agree.

So

If this derivation is "correct", then why do we compare wave functions in different gauges? In particular, why do we expect to get physically meaningful results from that?
If the derivation is incorrect, then what is a simple way to show that the phase shift is given by $\oint A$, that does not rely on path integrals?

This derivation is an example of excess mathematical formalism confusing the physics. It can be made right but at the moment it is physically completely wrong, because it doesn’t use the fact that the charge must move slowly. It’s just doing familiar-looking mathematical operations until the answer falls out by accident. — knzhou, Aug 03 '18 at 19:41
@knzhou Since asking this question I have been able to consult other sources (for example Ballentine's book), which contain the same derivation: set up two overlapping trivializations and gauge transform a "free" wave function separately. No source on the AB effect I have read ever said anything about charges needing to move slowly. Please elaborate? — Bence Racskó, Aug 03 '18 at 20:01

score 16 · Accepted Answer · edited Mar 10 '24 at 16:45

This "derivation" hits a pet peeve of mine, which is that mathematical treatments of topological phases persistently confuse the phase shift resulting from a physical process with abstract, physically meaningless phases computed by blindly plugging equations into each other.

Physical and Formal Phases

The Aharonov-Bohm effect isn't even the worst example; that award goes to anyons. Anyons pick up a phase $e^{i \phi}$ when their positions are physically exchanged, i.e. when two of them are picked up and swapped by an experimentalist, assuming that there are no extra external fields, the anyons are moved slowly, and so on. However, this is persistently confused with the phase that results from formally swapping two variables in the many-body wavefunction, $$\psi(x_1, x_2, \ldots) = e^{i \theta} \psi(x_2, x_1, \ldots).$$ It is trivial to prove this formal phase is always $\pm 1$ in any dimension, leading even very capable mathematicians to assert that anyons cannot exist. The majority of introductory quantum mechanics books that attempt to treat anyons make precisely this mistake, then mumble something incorrect about topology allowing the formal phase to differ from $\pm 1$. It's a mess. (For a good treatment, see this.)

Similarly, the Aharonov-Bohm phase is the extra phase $e^{i \theta}$ that a particle picks up upon being transported around a flux. It is easy to see where both the Aharonov-Bohm and anyon phases come from if you use the path integral. Mathematically minded students often dismiss this argument, based on trajectories, as "heuristic", but this misses the point, because the physics of the situation is explicitly about trajectories. You can't easily see the phase shift between two trajectories if you use the time-independent Schrödinger equation.

If you don't like the path integral, you can also derive these phases with the adiabatic theorem: trap a particle in a box at location $\mathbf{R}$ and transport the box around the flux. The gauge connection $\mathbf{A}$ functions precisely as the Berry connection on the states $|\mathbf{R} \rangle$, and the derivation then proceeds exactly the same way as the formal fiber bundle derivation below. Note that both the path integral argument and the adiabatic theorem argument explicitly require the transport to be slow. In the former case, it's to avoid picking up extra $\int \mathbf{p} \cdot d \mathbf{x}$ phases, and in the latter case it's a condition of the adiabatic theorem.

A Correct Fiber Bundle Derivation

The argument you gave rests on comparing wavefunctions in two different gauges, which is physically meaningless. Here is a correct derivation.

As you know, we may describe the gauge field in terms of a $U(1)$-bundle over $M$. All such bundles are trivial, which is why most courses don't talk about them; it just makes things more complicated. However, suppose we chose to use bundles anyway and covered $M$ with two patches. Then we may compute the phase picked up by transporting a particle around the flux as follows.

Within the first patch, integrate $\int \mathbf{A} \cdot d\mathbf{x}$.
When the particle passes from the first patch to the second, add a phase to account for the transition function between the patches.
Within the second patch, integrate $\int \mathbf{A} \cdot d \mathbf{x}$.
When the particle passes from the second patch back to the first, add another transition function phase.

Since the bundle is trivial, the transition functions can be chosen to be trivial, reducing to the non-bundle formalism. However, we could also choose to gauge away the connection within each patch. Then the particle picks up no phase at all as it is parallel transported through the patches (again, assuming it is moving slowly, with no extra external fields, ignoring dynamical phases, etc.) but does pick up phases from nontrivial transition functions. Of course, since the answer is a physical quantity, it will be the same calculated either way. Your text just showed this explicitly.
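The two ways of bookkeeping described above can be checked numerically. The following is only a minimal sketch under stated assumptions: the solenoid is idealized as a thin flux line on the $z$-axis, the smooth gauge $\mathbf{A} = \frac{\Phi}{2\pi r}\hat{\boldsymbol\theta}$ is used outside it, the two patches are the plane minus the negative $x$-axis and the plane minus the positive $x$-axis, and charge and $\hbar$ are set to 1. All function names are illustrative, not taken from any source.

```python
import numpy as np

def vector_potential(x, y, flux):
    """Smooth-gauge A = flux/(2*pi*r) in the azimuthal direction (assumed form, outside the solenoid)."""
    r2 = x**2 + y**2
    return -flux * y / (2 * np.pi * r2), flux * x / (2 * np.pi * r2)

def loop_integral(xs, ys, flux):
    """Midpoint-rule estimate of the closed line integral of A along a polygonal loop."""
    xm, ym = 0.5 * (xs[1:] + xs[:-1]), 0.5 * (ys[1:] + ys[:-1])
    ax, ay = vector_potential(xm, ym, flux)
    return float(np.sum(ax * np.diff(xs) + ay * np.diff(ys)))

def chi_patch1(x, y, flux):
    """Gauge function on patch 1 (plane minus the negative x-axis), polar angle in (-pi, pi)."""
    return -flux * np.arctan2(y, x) / (2 * np.pi)

def chi_patch2(x, y, flux):
    """Gauge function on patch 2 (plane minus the positive x-axis), polar angle in (0, 2*pi)."""
    return -flux * np.mod(np.arctan2(y, x), 2 * np.pi) / (2 * np.pi)

flux = 0.7
t = np.linspace(0.0, 2 * np.pi, 20001)

# Route 1: trivial transition functions; integrate A along an off-centre loop around the flux.
around = loop_integral(0.3 + 2.0 * np.cos(t), -0.2 + 1.5 * np.sin(t), flux)
# A loop that does not enclose the flux picks up nothing.
away = loop_integral(5.0 + np.cos(t), np.sin(t), flux)

# Route 2: A is gauged away on each patch (A + grad chi = 0), so only the
# transition functions contribute. The patches overlap in the upper and lower
# half-planes; the mismatch of the gauge functions there is the transition phase.
jump_upper = chi_patch1(0.0, 1.0, flux) - chi_patch2(0.0, 1.0, flux)
jump_lower = chi_patch1(0.0, -1.0, flux) - chi_patch2(0.0, -1.0, flux)
from_transitions = float(jump_lower - jump_upper)

print(around, away, from_transitions)
```

Both routes give the enclosed flux for a loop around the solenoid, which illustrates the statement that the physical answer does not depend on whether the phase is attributed to the connection or to the transition functions.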

Using the Fake Derivation

The comparison of wavefunctions in two different gauges has nothing to do with the physical process in the Aharonov-Bohm effect, but your text gets the right answer basically by accident; there's only one answer you could possibly get in this simple situation. Luckily, your text's setup is useful for a different thing: finding the spectrum of particles on a ring.

Suppose a particle is constrained to a ring, through which a flux passes. If there were no flux, the energy eigenstates would be $$\psi_n(\theta) \propto e^{i n \theta}, \quad E_n \propto n^2.$$ Now suppose the flux is turned on, giving an Aharonov-Bohm phase $e^{i \phi}$. Usually, to get the spectrum you have to solve the Schrödinger equation with a vector potential, but using the fiber bundle setup we can just set it to zero on each patch. Supposing we set one of the transition functions to be trivial too, and letting the other patch intersection be at $\theta = 0$, we have $$\lim_{\theta \to 0^+} \psi_n(\theta) = e^{i \phi} \lim_{\theta \to 2\pi^-} \psi_n(\theta)$$ where $\psi_n(\theta)$ satisfies the Schrödinger equation for zero vector potential. (Of course the wavefunction remains single-valued, as long as we remember it only makes sense to compare it to itself within one patch.) Then we have $$\psi_n(\theta) \propto e^{i (n - \phi/2\pi) \theta}, \quad E_n \propto (n - \phi/ 2 \pi)^2$$ which gives a measurable change in the spectrum. This is a case where you do want the time-independent Schrödinger equation, not path integral trajectories, but that is because the physics is completely different.
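The shifted spectrum can be checked with a small numerical sketch. The assumptions here are mine: the ring is discretized into a finite-difference lattice, energies are measured in units where $E_n = (n - \phi/2\pi)^2$, the vector potential is zero everywhere, and the flux enters only through a phase on the single link that crosses the patch intersection at $\theta = 0$. The function names are hypothetical.

```python
import numpy as np

def ring_levels(phi, n_sites=400, n_levels=4):
    """Lowest eigenvalues of -d^2/dtheta^2 on a ring with A = 0 and a phase twist phi at theta = 0."""
    h = 2 * np.pi / n_sites
    ham = np.zeros((n_sites, n_sites), dtype=complex)
    idx = np.arange(n_sites)
    ham[idx, idx] = 2.0 / h**2
    ham[idx[:-1], idx[:-1] + 1] = -1.0 / h**2
    ham[idx[:-1] + 1, idx[:-1]] = -1.0 / h**2
    # The only place the flux enters: the link across the patch intersection
    # carries the transition-function phase (and its conjugate, for Hermiticity).
    ham[n_sites - 1, 0] = -np.exp(1j * phi) / h**2
    ham[0, n_sites - 1] = -np.exp(-1j * phi) / h**2
    return np.linalg.eigvalsh(ham)[:n_levels]

def predicted_levels(phi, n_levels=4, n_max=10):
    """Lowest values of (n - phi/(2*pi))^2 over integer n."""
    n = np.arange(-n_max, n_max + 1)
    return np.sort((n - phi / (2 * np.pi)) ** 2)[:n_levels]

quarter_flux = np.pi / 2  # a quarter of a flux quantum
print(ring_levels(quarter_flux))
print(predicted_levels(quarter_flux))
```

Up to discretization error, the numerical levels follow $(n - \phi/2\pi)^2$, and the spectrum at $\phi = 2\pi$ coincides with the one at $\phi = 0$, as expected for a full flux quantum.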

Thanks for the answer. My only issue is that in the "fiber bundle derivation" you give, I find it nontrivial to motivate that the quantity of interest is $\oint A$. Maybe it is actually trivial, but I cannot see it now. Could you provide a source that treats this matter rigorously, but preferably without the path integral formalism?
To make it more understandable what I am looking for, I am gonna give a backstory. I often help out my supervisor with the practice seminar of a GR course he holds (I do research in GR). His course is not very mathematical, so I often cover some (cont'd) — Bence Racskó, Aug 04 '18 at 21:15
of the differential geometric background in the practice seminar. I know a nice and rigorous proof that the vanishing of the curvature tensor (in an open region) implies trivial holonomy, if the loop is null homotopic. I do not know any simple-to-show GR example of flat holonomy, so to show the consequence of topological obstructions, I intend to show the AB effect as an example. However I prefer an all-or-nothing approach, and I find it is perhaps the most stringent point in all of this is to physically motivate that the phase shift is related to $\oint A$. — Bence Racskó, Aug 04 '18 at 21:18
The reason why I want to avoid path integrals unless absolutely necessary is that our usual QM courses don't cover it, so I do not want to burden the seminar with path integrals as well. — Bence Racskó, Aug 04 '18 at 21:18
@Uldreth Well, $\oint A$ falls out of the classical action (where you account for electromagnetism by adding an $\mathbf{A} \cdot d\mathbf{x}$ term), and as you know the path integral directly uses the classical action in $e^{iS}$. I get why you might not want to use it, but the link between quantum phases and classical action is really really fundamental, so not using it makes everything much harder. However, you can get by, by using the Berry phase instead. — knzhou, Aug 04 '18 at 21:22
@Uldreth I alluded to this briefly above, but the Aharonov-Bohm phase can also be viewed as a Berry phase associated with slowly transporting the particle position $\mathbf{R}$ in a loop, where the Berry connection is the gauge connection. You can look up derivations of this by just googling those keywords. So that might work. — knzhou, Aug 04 '18 at 21:24
@Uldreth In fact, here's possibly an even better way: first argue that the Berry phase is exactly zero when $\mathbf{A} = 0$. (If you're sneaky, you could even tacitly assume this without saying it; most wouldn't notice.) Then the only phase you pick up by transport in a loop is from the transition function. Finally, use the argument from your book in reverse to show that, by undoing the gauge transformations they did (which should not change the physical phase), that transition function phase is precisely $\oint A$. — knzhou, Aug 04 '18 at 21:25
Just wondering, shouldn't it still be possible to derive this from the time-independent Schroedinger equation without using any adiabatic switching (and without the path integral)? — lalala, Jun 14 '21 at 10:00
@knzhou If I may ask, I have a question related to this. In the case of picking up an AB phase factor, shouldn't the boundary conditions be periodic up to a phase difference? While I'm not knowledgeable in fibre bundles, I assume that after going around the loop, the boundary conditions should allow for this extra phase factor, and we shouldn't impose $\psi(0)=\psi(\pi)$ like many authors do. Am I right? — TheQuantumMan, Sep 21 '23 at 23:19
@knzhou I don't see how this can be seen as an answer to the OP. The particle must have a single-valued state function. There are two functions in this answer (on the two overlapping patches), right? Our Hilbert space contains functions, not pairs of functions. Another issue is that I don't know what is meant by "the particle passes" here or there, slowly or quickly. We're in quantum mechanics, so we don't have such notions here. "Parallel transport" is a mathematical concept and has nothing to do with the motion of the particle. — mma, Mar 12 '24 at 04:46

mike stone · Answer 2 · 2018-08-03T22:22:10.607

The original Bohm-Aharonov problem [Y. Aharonov and D. Bohm, Phys. Rev. 115, 485 (1959)] is about electrons scattering from a solenoid. They give a nice solution to the Schrödinger equation as a sum of Bessel functions that is fun to plot:

The image is the real part of the wave function for the case of 1/4 unit of flux through the solenoid, with the wave incoming from the right. The B-A phase shift results in the wavecrests downstream, above and below the solenoid, being offset by 1/4 of a wavelength. The wavefunction itself is everywhere single-valued, though.
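For readers who want to reproduce such a plot, here is a rough sketch of the partial-wave sum. It assumes the commonly quoted form $\psi(r,\theta) = \sum_{m} (-i)^{|m+\alpha|} J_{|m+\alpha|}(kr)\, e^{i m \theta}$ for an idealized thin solenoid, with $\alpha$ the flux in units of the flux quantum; the sign convention for $\alpha$, the truncation `m_max`, and the function name are my assumptions, and SciPy is assumed to be available.

```python
import numpy as np
from scipy.special import jv

def ab_wavefunction(r, theta, alpha, k=1.0, m_max=60):
    """Truncated partial-wave sum for scattering off a thin flux line (assumed form)."""
    m = np.arange(-m_max, m_max + 1)
    order = np.abs(m + alpha)
    # (-i)**order written with an explicit phase, so non-integer orders are unambiguous.
    terms = np.exp(-0.5j * np.pi * order) * jv(order, k * r) * np.exp(1j * m * theta)
    return complex(np.sum(terms))

# With no flux the sum reduces to a plane wave travelling in the -x direction.
no_flux = ab_wavefunction(3.0, 0.8, alpha=0.0)
plane_wave = np.exp(-1j * 3.0 * np.cos(0.8))

# With a quarter unit of flux the sum is still single-valued in theta.
psi_a = ab_wavefunction(3.0, 0.8, alpha=0.25)
psi_b = ab_wavefunction(3.0, 0.8 + 2 * np.pi, alpha=0.25)

print(abs(no_flux - plane_wave), abs(psi_a - psi_b))
```

Evaluating `ab_wavefunction` on a grid of points and plotting the real part should give a picture of the kind described above. Only integer $m$ appears in the angular factors, which is why the sum is single-valued for any $\alpha$.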

Ján Lalinský · Answer 3 · 2018-08-05T19:21:49.483

If the derivation is incorrect, then what is a simple way to show that the phase shift is given by $\oint A$, that does not rely on path integrals?

While the BA shift has been experimentally confirmed many times, I believe all those derivations of physical effect (shift) based on mathematical arguments about $\mathbf A$ outside the solenoid are unconvincing, perhaps entirely invalid.

The major problem is that all those derivations assume that the integral of $\mathbf A$ along the loop $\partial S$ around the solenoid equals the magnetic flux through the surface $S$ bounded by the loop:

$$ \oint_{\partial S} \mathbf A \cdot d\mathbf{l} = \iint_S \mathbf B\cdot d\mathbf S~~~(1) $$

While this is true for the vector potentials considered in most situations discussed in physics textbooks, it is not a necessary property of a vector potential. The only conditions that restrict the vector potential in EM theory are $$ \mathbf B = \nabla \times \mathbf A, $$ $$ \mathbf E + \nabla \varphi = -\partial_t \mathbf A. $$

From the first condition, the formula (1) can be derived, but only if $\mathbf A$ is well behaved on all points of the surface (including the surface and inside of the solenoid). If it is not (if there is a discontinuity or singularity), the derivation fails. Consequently, there are valid functions $\mathbf A(\mathbf x)$ which, when integrated outside the solenoid, do not obey (1).

For example, there is a function $\mathbf A_0(\mathbf x)$ that vanishes outside the solenoid (so it gives $\nabla \times \mathbf A = 0$ trivially) and is non-zero only inside the solenoid. It necessarily has a discontinuity on the surface of the solenoid (or there is another function that is continuous across the surface but has a singularity inside the solenoid). So the relation $\mathbf B=\nabla \times \mathbf A$ fails on the surface of the solenoid, but that is true for all functions, including the standard one, if the current distribution on the surface of the solenoid is infinitely thin.

These details should not influence the solution to the Schroedinger equation if the solenoid is modeled as infinite potential barrier (admittedly, this is not very clear and perhaps there is an effect of the discontinuity or singularity even through the infinite potential wall...).

It is only when we restrict the vector potential to the family that has non-zero loop integral that we can get any effect of magnetic flux on the $\psi$ function.

For these reasons, I think it is good to either 1) seek some argument for why only certain vector potentials are allowed (which does not seem likely to be very fruitful, considering they are just auxiliary tool to get the physical field) or 2) seek other explanations, preferably those that do not rely on special property of vector potential.

There has been some intriguing work on the possibility of classical explanation of the BA shift, see, for example, papers by Timothy Boyer who argues that there is classical EM interaction between the electrons and the metallic solenoid which suggests that the explanation could be much more classical and not require special properties of vector potential:

https://philpapers.org/rec/BOYCEA

https://link.springer.com/article/10.1023%2FA%3A1003602524894

edited Aug 05 '18 at 19:21

answered Aug 04 '18 at 11:10

This sounds like excess mathematical formalism clouding a very simple physical point. The vector potential in an ideal solenoid is not singular. It is often written down explicitly in freshman-level electromagnetism classes. There is no reason it should have to be singular, unlike for the magnetic monopole, for the fiber bundle here is trivial. – knzhou Aug 04 '18 at 11:16
You are free to confuse yourself by forcing yourself to work with artificially singular gauge potentials, but that doesn't invalidate the actual derivation, which is perfectly straightforward. – knzhou Aug 04 '18 at 11:17
Re "artificially singular gauge potentials": do you believe some vector potentials are more correct outside the solenoid than others, even if they all satisfy the definition $\nabla \times \mathbf A = \mathbf B$ in that region? I don't. Vector potential does not have to be differentiable everywhere. – Ján Lalinský Aug 05 '18 at 00:41
I believe everything real is not only differentiable, but infinitely differentiable. So I think going against this principle, even for something unobservable like the gauge potential, is physically unacceptable. – knzhou Aug 05 '18 at 09:09
Differentiable or discontinuous, these are just properties of physics models. Each is useful in some situations. There is a lot of occurrence of non-differentiable functions in physics and it is generally not a problem. We can differentiate even such functions, with help of delta distributions. – Ján Lalinský Aug 05 '18 at 13:13
I'm sure you can choose to represent something smooth with something singular, but this artificially makes things more confusing. If, by making this choice, you inadvertently confuse yourself into thinking the Aharonov-Bohm effect does not exist, that is not a refutation, that's just your own fault. – knzhou Aug 05 '18 at 13:25
Take the example of the Dirac delta function. That comes up in the normalization of position states in QM. If you didn't accept Dirac deltas, you might wrongly argue that means that position states can't exist, so Schrodinger's equation needs to be modified. But that's getting it all wrong; the apparent problem only comes from using singular objects in your model, and you can rephrase all the physical results without using anything singular. – knzhou Aug 05 '18 at 13:27
You keep insisting I am confused, but so far you have given no argument for that. There is no rule in physics or mathematics that forbids the use of functions that have a discontinuity or singularity. That is true for physical quantities like $\mathbf E,\mathbf B,\mathbf j,\rho$. And even more true for unphysical auxiliary functions such as $\mathbf A$. Lots of useful models manifest such discontinuities, such as the Coulomb potential of point, line and surface charge distributions. As to your hypothetical example with the Dirac delta function, I don't follow what you mean. – Ján Lalinský Aug 05 '18 at 14:04
Well, go back to mechanics and consider the equation $F = - dU/dx$. If one chose $U$ to be not differentiable, $F$ cannot be defined, so that must mean Newton's laws are incorrect, right? It means that we cannot predict what will happen in a real mechanical system. But this argument is nonsense, because there is nothing singular about what is actually happening. You can always choose your mathematical description to be worse than it has to be, but that does not concern me. – knzhou Aug 05 '18 at 14:42
Obviously, you must know that discontinuities and singularities in $\mathbf{E}$, $\mathbf{B}$, $\mathbf{j}$ and $\rho$ are all physically nonexistent. In reality everything is smeared out. You don't have to use singular physical objects in electromagnetism, and you certainly don't have to use a singular gauge potential here, so why would you? – knzhou Aug 05 '18 at 14:44
"If one chose $U$ to be not differentiable, $\mathbf F$ cannot be defined, so that must mean Newton's laws are incorrect, right?" No, it means the function is not differentiable and it may or may not be appropriate in the model. Sometimes singularity in $U$ is fine, like singularity in the Newtonian gravitational potential energy $Gm_1m_2/|\mathbf r_1-\mathbf r_2|$. "Obviously, you must know that discontinuities and singularities in $\mathbf E,\mathbf B,\mathbf j$ and $\rho$ are all physically nonexistent. In reality everything is smeared out." Nobody knows that for a fact.
– Ján Lalinský Aug 05 '18 at 15:50
"you certainly don't have to use a singular gauge potential here, so why would you?" we do not have to, but we can, and it changes the result (value of $\oint \mathbf A\cdot d\mathbf l$). There is no rule in EM theory, afaik, that forbids vector potentials that have singularity somewhere outside the region of interest. In the region of interest - outside the solenoid - $\mathbf A=0$ is a perfectly fine vector potential. Such potential does have a singularity inside the solenoid, but it gives magnetic field correctly even there.
– Ján Lalinský Aug 05 '18 at 16:05
Can you give an example of a case where choosing something to be singular, when it does not have to be, has ever produced a nontrivial and correct prediction? – knzhou Aug 05 '18 at 16:09
I mean, I am sure you can choose the vector potential to be singular. You could also choose it to be quaternion-valued, or to be Grassmann-valued, or multivalued, or whatever other mathematical material you want to toss in. Perhaps you could take the ultrafinitist position and get rid of the real line, so the derivative doesn't work. There is an endless amount of extra mathematical complication you can add. Which complications you think are "natural" are a sociological phenomenon stemming from which issues mathematical physicists enjoy worrying about. But the actual physics doesn't care. – knzhou Aug 05 '18 at 16:11
The motivation for a potential with singularity inside the solenoid is that it allows $\mathbf A = 0$ outside the solenoid, which is quite natural solution of the equation $\nabla \times \mathbf A = 0$ there. If you can find quaternion-valued $\mathbf A$ that gives correctly magnetic field outside and inside the solenoid, I think that would be valid too. – Ján Lalinský Aug 05 '18 at 16:19
"Can you give an example of a case where choosing something to be singular, when it does not have to be, has ever produced a nontrivial and correct prediction?" Yes - treating physical objects as point particles. That brings singularity in field/interaction when the particles meet, but it allows tractable models which are useful.
– Ján Lalinský Aug 05 '18 at 16:28
I think we have to agree to disagree, as neither of us is going to convince the other. I still hold that nature does not know or care about any of the functional-analytic niceties that humans have invented. However, I understand how one could think otherwise. – knzhou Aug 05 '18 at 16:42
Actually, I agree that nature does not care about mathematics that humans have invented. That's why I do not like the idea that only one particular choice of potential (continuous, discontinuous), when having no impact on physical fields, is "correct". But ok, let's leave this for now. – Ján Lalinský Aug 05 '18 at 17:13
