
In elementary linear algebra, we talk about matrices, i.e. rectangular arrays of numbers. In advanced linear algebra, we prefer whenever possible to talk about abstract tensors, such as linear operators or bilinear forms, without going into any particular basis. This framing is considered more elegant, and also allows for easier generalization to vector spaces more abstract than the simple case of $\mathbb{R}^n$ for finite $n$.

In particular, I've found that almost all theorems about matrices generalize fairly easily to more abstract theorems about linear operators and/or bilinear forms. (One strong hint as to which generalization is appropriate: matrix similarity is the natural expression of "equivalence" for linear operators, while matrix congruence is the natural expression of "equivalence" for bilinear forms. For an inner product space, the distinction is often blurred, because there's a natural isomorphism between linear operators and bilinear forms.)

But I'm not quite sure how this works for Sylvester's law of inertia, because it seems to combine elements of both notions without apparently requiring any inner product on the vector space.

Part 1 of Sylvester's law of inertia says that every real symmetric square matrix is congruent to exactly one diagonal matrix with entries 1, -1, and 0 (up to permutation). This pretty clearly seems to generalize to the Wikipedia article's basis-independent statement about real quadratic forms:

For any real quadratic form $Q$, every maximal subspace on which the restriction of $Q$ is positive definite (respectively, negative definite or totally isotropic) has the same dimension.
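For instance, for $Q(x,y) = 2xy$ on $\mathbb{R}^2$, writing $2xy = \tfrac{1}{2}(x+y)^2 - \tfrac{1}{2}(x-y)^2$ shows that the maximal subspaces on which $Q$ is positive definite and negative definite are each one-dimensional (e.g. $\operatorname{span}\{(1,1)\}$ and $\operatorname{span}\{(1,-1)\}$ respectively), so the inertia is $(1,1,0)$.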

But I'm confused about part 2 of Sylvester's law of inertia, which says that two real symmetric square matrices are congruent iff they have the same numbers of positive, negative, and zero eigenvalues (which equal the numbers of entries 1, -1, and 0 defined in part 1). I'm not sure how to state this in a basis-independent way, because matrix congruence is an equivalence relation on representations of bilinear or quadratic forms, but the notion of eigenvalues only makes sense for linear operators. Moreover, the statement of the theorem doesn't seem to require any inner product that would allow you to naturally convert between the two types of tensors. Is part 2 of Sylvester's law of inertia a statement about forms or about linear operators, and how can we generalize it to tensors in a basis-independent way?
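For concreteness, here is a minimal numerical check (using numpy; the specific matrices are arbitrary illustrative choices of mine) of the "only if" direction of the matrix-level statement, i.e. that congruence preserves the counts of positive, negative, and zero eigenvalues. This is the statement I'd like to understand in a basis-independent way.

```python
import numpy as np

# A real symmetric matrix of rank 2, with inertia (1, 1, 1):
# one positive, one negative, and one zero eigenvalue.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])

# Any invertible matrix S gives a congruence A -> S^T A S.
S = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [1.0, 0.0, 1.0]])
assert abs(np.linalg.det(S)) > 1e-12

C = S.T @ A @ S                      # a matrix congruent to A

def eig_sign_counts(M, tol=1e-9):
    """Counts of positive, negative, and (numerically) zero eigenvalues."""
    w = np.linalg.eigvalsh(M)        # eigenvalues of a real symmetric matrix
    return ((w > tol).sum(), (w < -tol).sum(), (np.abs(w) <= tol).sum())

print(eig_sign_counts(A))            # (1, 1, 1)
print(eig_sign_counts(C))            # (1, 1, 1) as well: the eigenvalues
                                     # themselves change, but not their signs
```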

The only solution I can think of, which seems to me to be very much a hack, is to introduce an arbitrary inner product - something like the following:

Let $Q$ be an arbitrary quadratic form on a real vector space $V$, and let $B$ be the naturally associated symmetric bilinear form. Choose an arbitrary inner product $\langle \cdot, \cdot \rangle$ on $V$. Consider the unique linear operator $L_{B,\langle \cdot, \cdot \rangle}$ on $V$ associated with $B$, which maps any vector $v$ to the vector $u$ such that $B(v, w) = \langle u, w \rangle$ for all $w \in V$. Then regardless of the choice of inner product $\langle \cdot, \cdot \rangle$, the numbers of positive, negative, and zero eigenvalues of $L_{B, \langle \cdot, \cdot \rangle}$ equal the indices of inertia of $Q$ defined in part 1.
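In coordinates this proposal looks as follows (a rough numpy sketch of my own; the Gram matrices below are arbitrary illustrative choices). If $B$ is represented by a symmetric matrix and the inner product by a symmetric positive-definite Gram matrix $G$, then $B(v,w) = v^T B w = (Lv)^T G w$ for all $v, w$ forces $L_{B,\langle \cdot, \cdot \rangle} = G^{-1} B$, and the eigenvalue sign counts of $G^{-1} B$ come out the same for every choice of $G$:

```python
import numpy as np

# The symmetric bilinear form B, with inertia (1, 1, 1).
B = np.array([[2.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])

def sign_counts_of_L(B, G, tol=1e-9):
    """Eigenvalue sign counts of L = G^{-1} B, where G is the Gram matrix of
    the chosen inner product.  We use the Cholesky factor G = C C^T, since
    G^{-1} B is similar to the *symmetric* matrix C^{-1} B C^{-T}."""
    C = np.linalg.cholesky(G)
    Cinv = np.linalg.inv(C)
    w = np.linalg.eigvalsh(Cinv @ B @ Cinv.T)
    return ((w > tol).sum(), (w < -tol).sum(), (np.abs(w) <= tol).sum())

G1 = np.eye(3)                       # the standard dot product
G2 = np.array([[4.0, 1.0, 0.0],      # some other positive-definite
               [1.0, 3.0, 1.0],      # inner product
               [0.0, 1.0, 2.0]])

print(sign_counts_of_L(B, G1))       # (1, 1, 1)
print(sign_counts_of_L(B, G2))       # (1, 1, 1), independent of the inner product
```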

To me, this proposition seems ... rather convoluted. Is there a simpler version?

tparker
  • interesting name. is this related to physics? or not really? – BCLC May 03 '21 at 12:26
  • @BCLC https://math.stackexchange.com/questions/6906/why-is-it-called-sylvesters-law-of-inertia. By “inertia” Sylvester wanted to invoke the concept that today would more likely be called “invariants”. – tparker May 03 '21 at 15:21
  • thanks for the info – BCLC May 03 '21 at 15:56

1 Answer


You are correct that eigenvalues only make sense for operators and not for bilinear forms. Maybe the following reinterpretation will look more natural to you:

Let $(V,\left< \cdot, \cdot \right>)$ be a finite dimensional real inner product space. Then we have a bijective correspondence between symmetric bilinear forms $B \colon V \times V \rightarrow \mathbb{R}$ and self-adjoint operators $T \colon V \rightarrow V$. In one direction, it is given by $B \mapsto T_B$ where $T_B$ is the unique linear operator which satisfies $\left< T_B(v), w \right> = B(v,w)$ for all $v, w \in V$.

Now, a priori the eigenvalues of $T_B$ have nothing to do with the inertia of the bilinear form $B$. However, the spectral theorem over $\mathbb{R}$ guarantees that the operator $T_B$ is orthogonally diagonalizable so one can find a $\left< \cdot, \cdot \right>$-orthonormal basis which is also $B$-orthogonal. This implies that under the correspondence $B \mapsto T_B$, the inertia of $B$ corresponds to the number of positive/negative/zero eigenvalues of the operator $T_B$ and so two bilinear forms are congruent iff the corresponding operators have the same number of positive/negative/zero eigenvalues.
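For concreteness, here is a small numerical illustration of this point (numpy; the matrix is just an example I made up). With the standard inner product on $\mathbb{R}^3$, the operator $T_B$ is represented by the matrix of $B$ itself; an orthonormal eigenbasis of $T_B$ then diagonalizes $B$ as a bilinear form, so the eigenvalue signs read off the inertia:

```python
import numpy as np

# Matrix of the symmetric bilinear form B (= matrix of T_B for the standard
# inner product on R^3).
B = np.array([[2.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])

w, P = np.linalg.eigh(B)    # columns of P: an orthonormal eigenbasis of T_B

# Change of basis for the *form*: the same basis is also B-orthogonal,
# so P^T B P is diagonal with the eigenvalues on the diagonal.
print(np.round(P.T @ B @ P, 10))

# Eigenvalue sign counts = inertia of B.
tol = 1e-9
print((w > tol).sum(), (w < -tol).sum(), (np.abs(w) <= tol).sum())
```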

I think the main point is that you can think of the classification of symmetric bilinear forms in two ways:

  1. You start with two symmetric bilinear forms on a finite dimensional real vector space $V$. In this case, there is no need to talk about eigenvalues at all. You define the inertia of a symmetric bilinear form, show that such forms can be diagonalized (in the sense of bilinear forms) and that the inertia can be read off the diagonal representing matrix (a concrete sketch of this procedure appears after this list), and finally that two forms are congruent iff they have the same inertia.
  2. You start with two symmetric bilinear forms on a finite dimensional real inner product space $(V, \left< \cdot, \cdot \right>)$. Then you can use the inner product to translate the forms into self-adjoint operators and ask whether you can deduce something about the congruence of the forms from the operators themselves. In this case, you can rephrase the answer in terms of the eigenvalues associated to the operators.
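To illustrate the first approach, here is a rough numpy sketch (the helper function and the example matrix are just my own illustration, and a floating-point tolerance stands in for the exact rational arithmetic one would really use): simultaneous row/column operations diagonalize the form by congruence, and the inertia is read off the signs of the resulting diagonal, with no eigenvalues computed anywhere.

```python
import numpy as np

def inertia_by_congruence(A, tol=1e-9):
    """Reduce a real symmetric matrix to a diagonal one by simultaneous
    row/column operations.  Every step is a congruence A -> E A E^T, so the
    sign pattern of the final diagonal is the inertia (n+, n-, n0)."""
    A = np.array(A, dtype=float, copy=True)
    n = A.shape[0]
    for k in range(n):
        if abs(A[k, k]) <= tol:
            # Try to swap a later nonzero diagonal entry into the pivot.
            for j in range(k + 1, n):
                if abs(A[j, j]) > tol:
                    A[[k, j], :] = A[[j, k], :]
                    A[:, [k, j]] = A[:, [j, k]]
                    break
        if abs(A[k, k]) <= tol:
            # All remaining diagonal entries vanish; if some off-diagonal
            # entry A[k, j] is nonzero, adding row j to row k and column j
            # to column k makes the new pivot equal to 2 * A[k, j].
            for j in range(k + 1, n):
                if abs(A[k, j]) > tol:
                    A[k, :] += A[j, :]
                    A[:, k] += A[:, j]
                    break
        if abs(A[k, k]) <= tol:
            continue                  # this row/column is (numerically) zero
        # Clear row k and column k beyond the pivot by congruence.
        for i in range(k + 1, n):
            c = A[i, k] / A[k, k]
            A[i, :] -= c * A[k, :]
            A[:, i] -= c * A[:, k]
    d = np.diag(A)
    return ((d > tol).sum(), (d < -tol).sum(), (np.abs(d) <= tol).sum())

B = np.array([[2.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
print(inertia_by_congruence(B))       # (1, 1, 1)
```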
levap
  • Thank you, this subtle reframing of the concept was very helpful. What was bothering me was the asymmetry between the isomorphic spaces of (1) symmetric bilinear forms $B$ over a general real finite-dimensional vector space $V$ (no inner product), and (2) self-adjoint linear operators $T$ over a finite-dimensional real inner product space $V$. But now I see that the better way to think about it is that if you happened to have started with an inner product space, then this isomorphism is natural, ... – tparker May 03 '21 at 23:38
  • ... but if not, then it's most natural to just stick entirely with the space of real symmetric forms and never talk about the eigenvalues at all, rather than artificially inserting an inner product. – tparker May 03 '21 at 23:38
  • But here's a question. Suppose you have explicit matrix representations for two real symmetric forms over a generic (not inner product) space. What's the most efficient way to actually calculate their indices of inertia? If it's to calculate the eigenvalue spectrum, then you really are (morally speaking) doing the rather contrived procedure described in my question - formally defining an arbitrary inner product that depends on your initial choice of basis for the representation, then effectively throwing it away at the end when you apply the $\operatorname{sgn}$ function to the resulting eigenvalues? – tparker May 03 '21 at 23:45
  • If this is in fact the most efficient procedure - and it may not be - then it's an interesting example of taking a shortcut by adding additional intermediate structure and then throwing it away, kind of like when you use complex analysis to prove a result about real numbers. Maybe that's more elegant than I had appreciated. – tparker May 03 '21 at 23:48
  • @tparker: Good question. I'm not sure about "the most efficient", but the standard way to calculate the signature of a symmetric matrix is to perform simultaneous row-column operations until you reach a diagonal matrix. This procedure is very similar to Gaussian elimination, works over any field of characteristic $\neq 2$, and can be done without errors if you can do exact rational arithmetic. In particular, it doesn't require you to try to find the eigenvalues of the matrix (which are the roots of the characteristic polynomial, so you usually can't find them exactly). – levap May 04 '21 at 07:57
  • See this answer: https://math.stackexchange.com/questions/2461652/diagonalization-of-symmetric-matrix/2464340#2464340 and the linked questions/answers. – levap May 04 '21 at 07:57
  • @tparker Here's another viewpoint, referring to your question a few comments back: If you have an explicit matrix representation of a real symmetric form, then a basis has already been chosen. And for this particular basis, one might consider the "associated orthonormalizing inner product" [defined so that $(e_i,e_j)=\delta_{i,j}$] to be canonical in some sense. This is precisely the inner product that corresponds to the familiar entry-wise matrix/dot product operations. – WillG Feb 08 '22 at 22:55
  • I agree that it seems unsatisfying to think in terms of eigenvalues despite them being unnecessary to define the signature/inertia of $B$. However, because every basis has a naturally associated inner product, the procedure might at least not be as "contrived" as you're suggesting. – WillG Feb 08 '22 at 23:38
  • @WillG That's a great point. I guess that's why in elementary linear algebra treatments, which primarily deal with explicit matrices, you can get away with not explicitly specifying an inner product, or even whether there is an inner product at all, because you almost always just implicitly use the natural inner product determined by the basis that you happen to be working in. – tparker Feb 09 '22 at 04:59
  • @WillG I guess that's also why you can get away with being sloppy about covariant vs. contravariant indices, because you're always implicitly working in an orthonormal basis where the inner product form is simply represented by the identity matrix. – tparker Feb 09 '22 at 05:00