In what sense are the equations of motion conserved by symmetries?

Question

I am studying variational principles and I have been reading this set of notes by Townsend. In the first paragraph of Section 9, Townsend defines what it means for a transformation to be a symmetry of a system:

We have $$I[\mathbf Q]=\int_{t_A}^{t_B} L[\mathbf Q, \mathbf {\dot Q}, t]dt=\int_{t_A}^{t_B} L[\mathbf q, \mathbf {\dot q}, t]dt+K(t_B)-K(t_A)=I[\mathbf q]+K(t_B)-K(t_A).$$ However, if $K(t)$ can be any differentiable function and can depend on $\mathbf q(t)$ and its derivatives, I am quite confused as to what it means to say that the equations of motion for $\mathbf Q$ are the same as those for $\mathbf q$ and why this is the case.

Adding a constant to a function does not change the location of its minima. — eyeballfrog, Aug 08 '19 at 17:21
But if K(t) can depend on q(t) then surely K(t_B) - K(t_A) can also depend on our choice of q(t), i.e. K(t_B) - K(t_A) is not a constant? — MB10000, Aug 08 '19 at 17:48
That term is constant as it's evaluated at the endpoints, certainly for different values of the endpoints this number will change, but the term itself is constant. Plus we are talking about the equations of motion which come from finding the extrema of the action so these terms don't contribute — Triatticus, Aug 08 '19 at 18:31
But unless we fix the values of the derivatives of q at the boundary, surely the value of K at the boundary depends on our choice of q(t)? I thought that Hamilton's principle only specified that the value of q, but not necessarily its derivatives, had to be fixed at the boundary. Is this incorrect or am I missing something else? — MB10000, Aug 08 '19 at 19:24
True but usually the assumption is that the paths are smooth functions of the variables in question. — Triatticus, Aug 08 '19 at 21:47

Qmechanic · Answer 1 · 2019-08-09T08:56:23.310

1. Eq. (9.1) defines a finite quasi-symmetry. Note that Noether's theorem only needs an infinitesimal quasi-symmetry of the action to produce a conservation law.
2. Townsend assumes that the two Lagrangians $L$ in eq. (9.1) are the same function (although their arguments are obviously different). Strictly speaking, this is not necessary if we are only interested in the Euler-Lagrange (EL) equations, which OP asks about, although Townsend of course has Noether's theorem in mind a couple of pages later in his notes.
3. Townsend allows $Q(t)$ to depend on $\dot{q}(t)$. If the Lagrangian $L$ is first-order in $Q$, it then typically becomes second-order in $q$. If eq. (9.1) holds, then the second-order terms must hide in the boundary terms. This might have implications for the appropriate boundary conditions (BCs) necessary for the existence of the functional/variational derivative of the action and, in turn, the EL equations.
4. More generally, consider two action functionals that satisfy $$ \widetilde{S}[Q] ~=~ S[q] ~+~ \text{boundary terms}, \tag{A}$$ where $$Q(t)~=~f(q(t),\dot{q}(t),\ddot{q}(t), \ldots ;t).\tag{B}$$ Moreover, assume that we have imposed appropriate boundary conditions (BCs) that ensure that both functional derivatives $$ \frac{\delta S[q]}{\delta q(t)} \qquad \text{and} \qquad \frac{\delta \widetilde{S}[Q]}{\delta Q(t)}\tag{C}$$ exist. Then the statement (which Townsend is aiming at) is that the EL equations $$ \frac{\delta S[q]}{\delta q(t)}~\approx~0\qquad \stackrel{(B)}{\Leftrightarrow} \qquad \frac{\delta \widetilde{S}[Q]}{\delta Q(t)}~\approx~0 \tag{D}$$ are equivalent under the map (B).
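
As a minimal illustration of (A) to (D), under assumptions chosen here and not taken from the notes (a free particle of mass $m$ and a Galilean boost with constant velocity $v$): let $L(q,\dot q)=\tfrac12 m\dot q^2$ and $Q(t)=q(t)+vt$. Then $$\begin{align} L(Q,\dot Q) &= \tfrac12 m(\dot q+v)^2 \\ &= L(q,\dot q) + \frac{d}{dt}\left(mv\,q+\tfrac12 mv^2 t\right), \end{align}$$ so $K(t)=mv\,q(t)+\tfrac12 mv^2 t$ does depend on $q$, but only through its values at $t_A$ and $t_B$, which are held fixed in the variation. The EL equations $m\ddot Q\approx 0$ and $m\ddot q\approx 0$ then coincide, because $\ddot Q=\ddot q$ under the map $Q=q+vt$.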

References:

P.K. Townsend, Variational Principles, Part 1A, Mathematics Tripos, Lecture notes, 2018.

I understand now that 4. is what Townsend is trying to convey but isn't it only true if the values of all of the derivatives of q on which K depends are fixed at the boundary? — MB10000, Aug 08 '19 at 19:14
Morally yes, but the specific details are more complicated. This is why we speak of appropriate BCs. — Qmechanic, Aug 08 '19 at 19:36

CR Drost · Answer 2 · 2019-08-08T20:14:22.273

So the important thing is simply that it take the form of a total derivative, even if that symbolically involves the expressions for $\mathbf q$ and its derivatives. So for example it could take the form $\mathbf q \dot{\mathbf q},$ and that’s fine because it is symbolically a total derivative.

Background: the Lagrangian doesn’t know about paths

Backing up one step to explain this better: we start from a notion of paths, which map a time interval, say $t\in(0, T)$, to a bunch of vectors in a coördinate space, say $\mathbf q(t)$ in $\mathbb R^{3n}$ for $n$ particles in unconstrained 3D space, although one of the great strengths about the Lagrangian formalism is that it does not care about how you parameterize the space and therefore you can impose constraints on the space without messy constraint forces. Call the time interval $\mathsf T$ and the coördinate space $\mathsf C$, paths are functions $\mathsf T \to \mathsf C.$ [If you really want to go streamlined-and-abstract, you can also make time one of the coördinates in the space and take $\mathsf T = (0, 1)$ or so, as a “progress along the path” parameter rather than a “time” parameter.]

We then invent action principles: we say that the laws of physics can somehow be encoded in a function $S: (\mathsf T\to\mathsf C)\to \mathbb R,$ assigning numbers to paths. Then we are saying that of all the paths which a particle can take between two points in $\mathsf C$, the ones that it does take according to physics are the ones where for all path-perturbations $\delta \mathbf{q}$ vanishing at the endpoints of $\mathsf T$, $$ S[\mathbf q + \delta \mathbf q] \approx S[\mathbf q]$$to first order in $\delta \mathbf q.$ Obviously this doesn’t really help us if we don’t have some additional structure, which is why we impose the Lagrangian structure. Now this is important: while $S$ only has one path and has to deal with strange things like “taking derivatives with respect to time” of that path, the Lagrangian doesn’t really know about those.

An $n^\text{th}$ order Lagrangian is just a function from $n+1$ coördinates and one time to the real numbers. It doesn’t know, as a function, that its various coördinates are going to come from different paths or that those paths are connected to each other by time derivatives in the action principle. It’s just a function $L : \mathsf T \times \mathsf C^{n+1} \to \mathbb R.$ The fact that these arguments are symbolic derivatives comes from the fact that we assume that the action principle $S$ can be phrased in terms of $L$ by an expression of the form, $$S[\mathbf q] = \int_\mathsf T dt~L\big(t, \mathbf q(t), \dot{\mathbf q}(t), \ddot {\mathbf q}(t), \dots\big).$$ Note that the logic has us generate $n$ paths from the one path, then we evaluate them at some position upon the path, feed them to the Lagrangian, get a number, and then sum those numbers for all points along the path. Then you know the rest of the major part of this story: we do this path-perturbation procedure and find that assuming $L$ is a nice function then it has partials with respect to all of its $\mathbf q_{0,1,2,\dots n}$ arguments, not knowing that $q_i$ corresponds to a coördinate of the $i^\text{th}$ time derivative of a path; and if the coordinate space is a vector space then we understand these partials as covectors $\mathsf C\to\mathbb R$. 
These partials mean that to first order, $$ \begin{align} S[\mathbf q + \delta\mathbf q] &=\int_\mathsf T dt~L\big(t, \mathbf q(t) + \delta\mathbf q(t), \dot{\mathbf q}(t) + \delta\dot{\mathbf q}(t), \ddot {\mathbf q}(t) + \delta\ddot{\mathbf q}(t), \dots\big)\\ &\approx\int_\mathsf T dt~\left[L\big(t, \mathbf q(t), \dot{\mathbf q}(t), \ddot {\mathbf q}(t), \dots\big) + \sum_{i=0}^{n}\frac{\partial L}{\partial \mathbf {q_i}}\cdot \left(\frac{d~}{dt}\right)^{i}\delta \mathbf q\right] \end{align},$$and we then integrate-by-parts all of these time derivatives away into boundary terms which vanish because $\delta q, \delta \dot q, \dots = 0$ at the boundaries of $\mathsf T,$ getting the Euler-Lagrange equations of motion,$$0 = \sum_{i=0}^{n} (-1)^i ~ \left(\frac{d~}{dt}\right)^{i} \frac{\partial L}{\partial \mathbf {q_i}}.$$

Now, interpreting these equations requires a sort of “dance” in your head!

1. First, we take the partial derivatives of the Lagrangian ignoring the connections of the different derivatives to each other; that is what $\partial L / \partial q_i$ means.
2. Then, we insert into those functions the actual path $q(t)$ and its derivatives $\dot q, \ddot q$.
3. Then, we take the total time derivatives with respect to $t,$ and insert minus signs corresponding to integration-by-parts.
4. And only after all of that is done does the resulting expression need to be equal to zero.
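
As a sketch of these four steps, take a first-order Lagrangian chosen purely for illustration (a harmonic oscillator; the constants $m$ and $k$ are assumptions, not part of the discussion above), $L(t,q_0,q_1)=\tfrac12 m\,q_1^2-\tfrac12 k\,q_0^2$: $$\begin{align} \text{step 1:}\quad & \frac{\partial L}{\partial q_0}=-k\,q_0,\qquad \frac{\partial L}{\partial q_1}=m\,q_1\\ \text{step 2:}\quad & q_0\to q(t),~ q_1\to\dot q(t):\qquad -k\,q(t),\qquad m\,\dot q(t)\\ \text{step 3:}\quad & (-1)^0\big(-k\,q\big)+(-1)^1\frac{d}{dt}\big(m\,\dot q\big)=-k\,q-m\,\ddot q\\ \text{step 4:}\quad & 0=-k\,q-m\,\ddot q \quad\Longleftrightarrow\quad m\,\ddot q=-k\,q \end{align}$$ In step 1 the symbols $q_0$ and $q_1$ are independent arguments; only from step 2 onward are they tied together as a path and its velocity.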

It was very important to me, in resolving the sort of confusion that you are dealing with now, to see that this step 2 sits in the middle of this interpretation. I even had a professor at Cornell who taught me all this confess “I’m actually not completely sure why we take partials and then total derivatives, but that is what the mathematicians and textbooks tell me to do.” It is the same confusion.

But we know about paths

Now, we usually impose our knowledge of the relationship upon the equations. We don’t write $q_{0,1,\dots n}$ but rather $q, \dot q, \ddot q$ as if we were taking derivatives. From the perspective of the Lagrangian these are all just symbols, but we abuse the notation for the sake of our own sanity.

Now we come to this total-time-derivative invariance. The Lagrangian function itself does not know that its arguments are time derivatives, but we know that certain assemblies like $\dot q \ddot q$ or $q \dot q$ or $q \ddot q + \dot q^2$ are all total time derivatives of something.

A total time derivative added to the Lagrangian cannot affect the equations of motion. The proof is really simple: we go back to the place where the equations of motion came from, the principle of least action $S[\mathbf q + \delta \mathbf q]\approx S[\mathbf q].$

If we add a total time derivative of something to our Lagrangian, our action principle looks like, for some symbolic expression $K$, $$\begin{align} S'[\mathbf q] &= \int_\mathsf T dt~\left[L\big(t, \mathbf q(t), \dot{\mathbf q}(t), \ddot {\mathbf q}(t), \dots\big) + \frac{dK}{dt}\right]\\ &= S[\mathbf q] + K[\mathbf q_1,\dot{\mathbf q}_1, \ddot{\mathbf q}_1, \dots] - K[\mathbf q_0,\dot{\mathbf q}_0, \ddot{\mathbf q}_0, \dots] \end{align} $$where subscript $1$ indicates the final value at the end of $\mathsf T$ and subscript $0$ indicates the initial value. You substitute this with $q + \delta q$ and none of these $K$ terms change because after the perturbation, $q_{0,1}, \dot q_{0,1}, \dots$ are all the same: $\delta q$ vanishes for all of these.

So $K$ just vanishes when we try to analyze the actual physics of the system $S[\mathbf q + \delta \mathbf q]\approx S[\mathbf q].$ Whatever number it is, it is the same number on both sides and gets subtracted out.
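
The same conclusion can be checked directly on the Euler-Lagrange expression. As a sketch, take the assembly $F(q_0,q_1,q_2)=q_0 q_2+q_1^2$, which on a path is $\frac{d}{dt}(q\dot q)$, and apply the sum $\sum_i(-1)^i (d/dt)^i\,\partial F/\partial q_i$ to it: $$\begin{align} \frac{\partial F}{\partial q_0}-\frac{d}{dt}\frac{\partial F}{\partial q_1}+\frac{d^2}{dt^2}\frac{\partial F}{\partial q_2} &= \ddot q-\frac{d}{dt}\big(2\dot q\big)+\frac{d^2}{dt^2}\big(q\big)\\ &= \ddot q-2\ddot q+\ddot q = 0. \end{align}$$ The contribution vanishes identically, for every path and not only for solutions, so adding $F$ to $L$ leaves the equations of motion unchanged.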

You can formalize this by saying that you can always add or subtract any expression to/from a Lagrangian that looks like a total time derivative. It does not preserve the identity of the Lagrangian, it does not even preserve the value of the action integral, but it only introduces a boundary term into the results of the action integral and therefore it must disappear when the equations of motion are considered.
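
A small numerical check of this boundary-term statement, as a sketch under assumptions chosen only for illustration: $K = q^2/2$ (so $dK/dt = q\dot q$), a reference path $q(t)=\sin t$ on $[0,1]$, and a perturbation $\delta q = \epsilon \sin(\pi t)$ that vanishes at both endpoints. The helper names `integrate` and `boundary_integral` are hypothetical.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

T = 1.0      # end of the time interval [0, T] (assumed for illustration)
eps = 0.3    # size of the perturbation (assumed for illustration)

# Reference path and its velocity.
def q(t):
    return math.sin(t)

def qdot(t):
    return math.cos(t)

# Perturbation that vanishes at t = 0 and t = T, and its time derivative.
def dq(t):
    return eps * math.sin(math.pi * t / T)

def dqdot(t):
    return eps * (math.pi / T) * math.cos(math.pi * t / T)

def boundary_integral(path, velocity):
    """Integral over [0, T] of dK/dt = q * qdot, where K = q**2 / 2."""
    return integrate(lambda t: path(t) * velocity(t), 0.0, T)

# Contribution of the total-derivative term on the original path ...
unperturbed = boundary_integral(q, qdot)
# ... and on the perturbed path q + dq.
perturbed = boundary_integral(lambda t: q(t) + dq(t),
                              lambda t: qdot(t) + dqdot(t))
# Both should equal K(T) - K(0), which only sees the fixed endpoints.
expected = 0.5 * q(T) ** 2 - 0.5 * q(0.0) ** 2

print(unperturbed, perturbed, expected)
```

The two integrals agree with each other and with $K(T)-K(0)$ up to quadrature error, so the extra term drops out of $S'[\mathbf q+\delta\mathbf q]-S'[\mathbf q]$. For a $K$ that also depends on $\dot q$, the same check would need $\delta\dot q$ to vanish at the endpoints as well, which is the point raised in the comments.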

Thank you for such a clear explanation. I still have one question though. Why can we assume that the values of the derivatives of q are fixed at the boundary of T? Doesn't Hamilton's principle only specify that the value of q must be fixed at the boundary but not necessarily the values of its derivatives? — MB10000, Aug 08 '19 at 19:22
@MB2269 That's a fair question. Let $X^{(n)}$ mean $(d/dt)^n X.$ I think the answer is that, just because I fix $\delta q^{(n)}=0$ for all $n$, that doesn't mean that I fix anything about $q^{(n)}$. So in my preferred understanding, the perturbations $\delta q$ decay to zero like bump functions but still reveal all this rich internal structure; then you discover that the solutions to those equations-of-motion can fail to exist when you over-specify the BCs: so the equations-of-motion push back on you, “I can’t also specify the velocities—sorry!” — CR Drost, Aug 08 '19 at 20:17
Are you saying that we are free to specify the BCs as much as we like, e.g. specifying that d/dt(δq) = 0 on the boundary, provided that we don't over-specify so that there are no longer solutions to the E-L equations? — MB10000, Aug 08 '19 at 20:24
I am saying that $\delta\dot q=0$ on the boundaries doesn't tell you anything about $\dot q$ on the boundaries. Or if you look at how this functions mechanically, it seems to me like the set of solutions for $q$ you get when you assume that $\delta\dot q=0$ on the boundaries is a superset of the solutions when you assume that it is not—boundary terms looking like $q~\delta \dot q$ could be safely neglected in the one case but not the other, for example. So the question to me is really just “how many new solutions are we adding here?” and I am not sure that I should be scared by the answer. — CR Drost, Aug 08 '19 at 20:55

Jan2103 · Answer 3 · answered Aug 08 '19 at 16:58 · score 1

Just have a look at the Euler-Lagrange equations, which are the equations of motion in Lagrangian theory for $\mathbf{Q}$ and $\mathbf{q}$. These equations come from the functional derivative of $I$, and if $I$ is invariant under the transformation $\mathbf{q} \rightarrow \mathbf{Q}$, then the equations of motion are the same.

But how can we say that I is invariant under the transformation if K(t_B) - K(t_A) is not necessarily 0? – MB10000 Aug 08 '19 at 17:50
I only know it in covariant form, where you can add $\partial_\mu F^\mu$ so that, by Gauss's theorem, the boundary terms are zero. – Jan2103 Aug 08 '19 at 20:30
