
In one of my vector mechanics lectures, the lecturer said that the spacetime interval was the dot product of the four-position vector with itself. But then he proceeded to show it was this: $s^2 = \Delta r^2 - c^2\Delta t^2$.

Where did the minus sign come from? All the dot products I have done so far do not have a negative sign. What changes with a four vector? He did go through a proof, but that did not make it any clearer. He used something called the Minkowski metric, but that just happened to have a minus sign there without any given reason.

3 Answers


I think this question is a little more low-level than the one it's been marked as a duplicate of, so I'm going to answer it.

The basic thing that you need to know is that an inner product on the 4-vector space need not have the form it has in Euclidean coordinates. That is, defining $w = ct$, it is not necessarily the case for all 4-D vector spaces that:$$\vec a \cdot \vec b = a_w b_w + a_x b_x + a_y b_y + a_z b_z.$$Probably the easiest one to wrap your head around is a skewed coordinate system where your basis vectors $\hat e_{w,x,y,z}$ are not orthogonal: yes $\vec a = \sum_i a_i \hat e_i$ for some basis vectors, but $\hat e_i \cdot \hat e_j \ne \delta_{ij}$ (where $\delta$ is the Kronecker delta symbol). Then it's obvious that if $C_{ij} = \hat e_i \cdot \hat e_j$ the result has instead a matrix hiding inside of it: $$\vec a \cdot \vec b = \sum_{ij} a_i C_{ij} b_j = \mathbf a^T ~\mathbf C~ \mathbf b.$$This matrix has a special name and it is called the metric or the metric tensor.
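As a quick numeric illustration of the matrix hiding inside the dot product, here is a minimal NumPy sketch (the 2-D skewed basis and the particular component values are just made-up examples):

```python
import numpy as np

# Hypothetical skewed 2-D basis: unit vectors 60 degrees apart,
# so e1 . e2 = cos(60°) = 0.5 rather than 0.
e1 = np.array([1.0, 0.0])
e2 = np.array([0.5, np.sqrt(3) / 2])

# Gram matrix C_ij = e_i . e_j -- the "metric" for this basis.
C = np.array([[e1 @ e1, e1 @ e2],
              [e2 @ e1, e2 @ e2]])

# Components of two vectors expressed in the skewed basis.
a = np.array([2.0, 1.0])
b = np.array([1.0, 3.0])

# The dot product computed two ways: via Cartesian components,
# and via a^T C b with the metric sandwiched in between.
a_cart = a[0] * e1 + a[1] * e2
b_cart = b[0] * e1 + b[1] * e2
assert np.isclose(a_cart @ b_cart, a @ C @ b)
```

The naive component sum $\sum_i a_i b_i$ would give $5$ here, while the true dot product is $8.5$; the metric matrix is what accounts for the difference.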

In turn, we can imagine an "inner product" for vectors in $\mathbb R^4$ with any other metric, and see where it goes. The only usual stipulation is that the matrix be symmetric, $\mathbf C^T = \mathbf C,$ and usually invertible, so that the "dual space" is "isomorphic to" the original vector space -- we'll talk about what these "dual vectors" are in a second.

So in special relativity we have this "Lorentz group" of coordinate transformations, and if we're going to go whole-hog with this relativity business then everything physical as we know it must depend on 4-vectors and other quantities which are unchanged by the Lorentz group, otherwise your experimental predictions will nontrivially depend on what coordinates you use.

Well, it happens to be the case that the Lorentz group preserves "dot products" for a different metric, which is given as either $\pm$ (depending on convention) of the matrix:$$\mathbf {g} =\begin{bmatrix}1&0&0&0\\ 0&-1&0&0\\ 0&0&-1&0\\ 0&0&0&-1\end{bmatrix},$$sometimes also denoted by the symbol $\eta$. Whether you use $+$ or $-$ depends essentially on whether you like to think of time as an imaginary dimension of space or space as an imaginary dimension of time; either way some factors of $\sqrt{-1}$ appear in some expressions but not others. I prefer $+$ because it means that trajectories which stay "inside" a light cone have a positive spacetime interval and the "proper time" is just the square root of the spacetime interval, but other people may have other conventions.

Now the Lorentz group has three sorts of "generators" (different things that can happen that build up the whole group). These are the parity transforms (multiplying the w-component or the whole 4-vector by -1), the rotations of the 3D subspace (x, y, z), and the "Lorentz boosts" of the form $$\mathcal L_x(\beta) = \frac{1}{\sqrt{1 - \beta^2}} ~ \begin{bmatrix}1&-\beta&0&0\\ -\beta&1&0&0\\ 0&0&1&0\\ 0&0&0&1\end{bmatrix}.$$The exact derivation of these boosts I will leave to other tutorials, and since rotations in 3D preserve the Euclidean metric $\mathbf C = \mathbf I$ it shouldn't be too hard to see that they also preserve the 3D $\mathbf C = -\mathbf I$ while doing nothing to the time coordinate, so they preserve $\mathbf g$ and we'll skip that proof. Let's also ignore the $y$ and $z$-directions and focus on the $wx$-mixing Lorentz boost. (There is no loss of generality here: any transform in the Lorentz group can be written as a rotation followed by a Lorentz boost in the $x$ direction followed by another rotation.)

First off, before we start that, you can confirm a basic consistency check of special relativity: $\mathcal L(-\beta)\,\mathcal L(\beta) = I$, that is, boosting by $-\beta$ undoes a boost by $\beta$. That is always a great way to start.

Now our boost $\mathcal L = \mathcal L_x(\beta)$ maps $\mathbf a \mapsto \mathcal L~\mathbf a$ and $\mathbf b \mapsto \mathcal L~\mathbf b$, so our inner product between these two becomes:$$\mathbf a^T ~\mathbf g ~ \mathbf b \mapsto (\mathcal L~\mathbf a)^T ~\mathbf g~(\mathcal L~\mathbf b) = \mathbf a^T (\mathcal L^T ~ \mathbf g ~ \mathcal L) \mathbf b.$$ For this to be a scalar it must be unchanged by the Lorentz boost, hence we need $\mathcal L^T ~ \mathbf g ~ \mathcal L = \mathbf g.$ You can confirm that the following matrix product works out:$$\begin{align}\mathcal L(\beta)^T ~\mathbf g~ \mathcal L(\beta) &= \frac{1}{1 - \beta^2}\begin{bmatrix}1&-\beta\\-\beta&1\end{bmatrix} \begin{bmatrix}1&0\\0&-1\end{bmatrix} \begin{bmatrix}1&-\beta\\-\beta&1\end{bmatrix}\\ &=\frac{1}{1 - \beta^2}\begin{bmatrix}1 - \beta^2&0\\0&\beta^2 - 1\end{bmatrix}\\ &=\begin{bmatrix}1&0\\0&-1\end{bmatrix} = \mathbf g \end{align}$$which proves that Lorentz boosts distinctively preserve this particular dot product from this particular metric for all 4-vectors.
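If you'd rather let a computer grind through the matrix algebra, here is a small NumPy check of both identities, $\mathcal L(-\beta)\,\mathcal L(\beta) = I$ and $\mathcal L^T\,\mathbf g\,\mathcal L = \mathbf g$, for the full $4\times 4$ matrices (the value $\beta = 0.6$ is an arbitrary choice):

```python
import numpy as np

# Minkowski metric in the (+,-,-,-) convention used above.
g = np.diag([1.0, -1.0, -1.0, -1.0])

def boost_x(beta):
    """Lorentz boost along x, mixing the w = ct and x components."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * beta
    return L

L = boost_x(0.6)

# L(-beta) L(beta) = I: boosting back undoes the boost.
assert np.allclose(boost_x(-0.6) @ L, np.eye(4))

# L^T g L = g: the boost preserves the Minkowski inner product.
assert np.allclose(L.T @ g @ L, g)
```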

Similar arguments about $\mathbf A^T ~ \mathbf g ~ \mathbf A$ apply for the parity transforms and for the rotations, of course. Usually we demand an even wider invariance of all of our physical predictions under the "Poincaré group", which takes the Lorentz group of coordinate transformations and adds spacetime-translations to it, but this just means that we always talk about differences in positions, say by explicitly including our spacetime "origin" point in our expressions.

This metric, therefore, is the way that we produce scalar numbers out of 4-vectors that are "invariant" in special relativity, which helps for making physical theories that are "manifestly covariant" -- their predictions do not change with respect to Lorentz boosts or rotations or translations of coordinates.

One more point of notation: when we have a metric which is not the trivial Euclidean metric, people often write the column vector $\mathbf b$ with "upper" indices $b^i$ and the row-vector $\mathbf a^T ~\mathbf g$, often called the "dual" of $\mathbf a$, with "lower" indices $a_i$. This preserves the appearance of the summation formula above; we can always state:$$\vec a \cdot \vec b = \sum_i a_i b^i = \sum_i a^i b_i.$$ It becomes in turn very common to just implicitly sum whenever you see the same symbol for a lowered and raised index, the so called "Einstein summation convention." With the above metric this becomes very easy: whenever you have a 4-vector $(A, \vec b)$ (time component plus space component), the dual vector is $(A, -\vec b)$, and the Lorentz-covariant inner product between two such things is $A_1 A_2 - \vec b_1 \cdot \vec b_2$ for the "ordinary" Euclidean definition of the dot product.
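Here is a minimal NumPy sketch of that index-lowering machinery (the component values are arbitrary): lowering with $\mathbf g$ flips the sign of the spatial part, and the resulting inner product is $A_1 A_2 - \vec b_1 \cdot \vec b_2$.

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])

# Two 4-vectors with "upper" indices, each of the form (A, b_vec).
u = np.array([5.0, 1.0, 2.0, 3.0])
v = np.array([4.0, 2.0, 0.0, 1.0])

# Lowering an index produces the dual vector (A, -b_vec).
u_lower = g @ u                      # [5., -1., -2., -3.]

# sum_i u_i v^i is the Lorentz-covariant inner product:
# time parts multiplied, minus the Euclidean dot of the space parts.
inner = u_lower @ v
assert np.isclose(inner, u[0] * v[0] - u[1:] @ v[1:])
```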

In turn, doing this with the 4-displacement $(c~\Delta t, \Delta \vec r)$ gives a Poincaré-invariant quantity $c^2 (\Delta t)^2 - |\Delta \vec r|^2$. This is a quantity which Lorentz boosts preserve, and it can be thought of as the "dot product" (really it should be called the "Lorentz-covariant inner product") of the displacement 4-vector with itself.

If it is positive, the square root is called the proper time between the two events whose spacetime displacement it measures; it is the time that elapses in the inertial reference frames which regard both events as having happened "at the same place." If it is negative, then $\sqrt{-\sum_i r_i r^i}$ is the "proper distance" between the two events as seen by the reference frames which regard both events as having happened "at the same time." In relativity these are mutually exclusive: things which are objectively space-separated are not objectively time-separated and vice versa.
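A small numeric sketch of the three cases (the displacements are made-up examples, in SI units with $c = 3\times 10^8$ m/s):

```python
import numpy as np

C_LIGHT = 3.0e8  # speed of light, m/s

def interval(dt, dr):
    """Spacetime interval s^2 = (c dt)^2 - |dr|^2, (+,-,-,-) convention."""
    return (C_LIGHT * dt)**2 - np.dot(dr, dr)

# Timelike: s^2 > 0, and sqrt(s^2)/c is the proper time between the events.
s2 = interval(1.0, np.array([1.0e8, 0.0, 0.0]))
assert s2 > 0

# Lightlike (null): light covering 3e8 m in one second gives s^2 = 0.
assert np.isclose(interval(1.0, np.array([C_LIGHT, 0.0, 0.0])), 0.0)

# Spacelike: s^2 < 0, and sqrt(-s^2) is the proper distance in a
# frame where the two events are simultaneous.
assert interval(1.0, np.array([4.0e8, 0.0, 0.0])) < 0
```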

CR Drost

In one of my vector mechanics lectures, the lecturer said that the spacetime interval was the dot product of the four-position vector with itself. But then he proceeded to show it was this: $s^2 = \Delta r^2 - c^2\Delta t^2$. Where did the minus sign come from?

See Einstein's Simple Derivation of the Lorentz Transformation. It's really simple. The light moves a distance x in time t at speed c so x = ct so x - ct = 0. There's the minus sign.

All the dot products I have done so far do not have a negative sign. What changes with a four vector? He did go through a proof, but that did not make it any clearer. He used something called the Minkowski metric, but that just happened to have a minus sign there without any given reason.

I'm not sure what he said. But see this article along with Wikipedia:

"In a light-like interval, the spatial distance between two events is exactly balanced by the time between the two events. The events define a spacetime interval of zero ($S^2 = 0$). Light-like intervals are also known as "null" intervals".

The spacetime interval isn't really the "spacetime distance" people say it is. It denotes your proper time. See the time-like interval on Wikipedia: "The measure of a time-like spacetime interval is described by the proper time interval". If you're a light-like photon, your proper time is always zero, even if you travel a light year. That light year is not a zero distance, or a zero separation between events. If you're just you, sitting at your desk with a parallel-mirror light-clock in front of you, that light clock ticks at a certain rate. But if I snap my magic fingers and send you on a fast trip through space, your light-clock ticks at a slower rate in line with Pythagoras's theorem. See the simple inference of time dilation due to relative velocity. Your proper time is in essence the number of reflections. When you move faster and I watch you through my gedanken telescope, I see a reduced rate of reflections. And if you could move at the speed of light, the light in your light-clock flatlines, and there aren't any reflections. So your proper time is zero.

Note that if I stay at home with my parallel-mirror light clock, and you go on an out-and back trip with yours, the light-path-lengths in both our clocks are the same. That's what underlies the invariant spacetime interval.

John Duffield

Vector spaces with mixed signature metrics (both pluses and minuses in the metric signature) are known and understood mathematical structures, but it's not surprising you might not have heard of them before this point: from an educational standpoint, you need everything you've learned about metric spaces with positive-definite metrics and more to make sense of these mixed-signature spaces. For instance, all nonzero vectors have nonzero norm in Euclidean spaces, but for a mixed-signature space, there are "null vectors" that are nonzero yet have zero norm. Certain facts you might've taken for granted may need tweaking or rethinking altogether.
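A one-line numeric illustration of such a null vector, using the Minkowski metric as the mixed-signature example (NumPy, made-up components):

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])  # a mixed-signature metric
v = np.array([1.0, 1.0, 0.0, 0.0])    # a nonzero "null" vector

# Its Euclidean norm-squared is positive, but its mixed-signature
# norm-squared vanishes: v is null despite being nonzero.
assert v @ v > 0
assert np.isclose(v @ g @ v, 0.0)
```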

Once you understand that this is indeed a consistent and well-understood mathematical structure, you can go about convincing yourself that this is, in turn, a reliable model for the behavior of the real world--for spacetime. Such a process is not about proof, but about identifying physical phenomena with corresponding mathematical structures: for instance, identifying the possible trajectories of light rays with those null vectors I mentioned earlier, or how the relativity of velocity--Lorentz boosts--corresponds to rotation-like operations on a mixed-signature space.

Muphrid
  • Ah right, would you be able to point me towards a resource that explains vector spaces? – Shaurya Bhave Sep 03 '15 at 17:16
  • You said you've done dot products before; a dot product is just something you can do with elements of a normed vector space. So if you already know how dot products work, you've already been working with vector spaces. The big difference here is merely that the dot product is not always nonzero when the vector is nonzero. That has a lot of consequences--all thoroughly explored and derived by now, and many of those have been used to model physical phenomena. Have you worked with dot products before? And in what context? – Muphrid Sep 03 '15 at 17:37