
Let $A$ be an $n\times m$ matrix with entries $a_{ij}$ and $$\sigma = \max_{x\neq 0} \frac{\|Ax\|}{\|x\|}.$$ Let $v\in \mathbb{R}^m$ be a maximizer of the above with $\|v\| = 1$ and $Av = \sigma u$ with $\|u\|=1$. How can I prove, without using the singular value decomposition, that if $x^tv = 0$ and $y = Ax$, then $u^ty = 0$?

I have that if $v$ maximizes $\max_{x\neq 0} \frac{\|Ax\|}{\|x\|}$, then since $\langle Ax, Ax\rangle = \langle x, A^tAx\rangle$, by the min-max theorem $v$ is an eigenvector of $A^tA$ (with eigenvalue $\sigma^2$), so $$\sigma\, y^tu = \langle Ax, Av\rangle = \langle x, A^tAv\rangle = \sigma^2\langle x,v\rangle = 0.$$ Is there something simpler than this?
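For what it's worth, the claim is easy to sanity-check numerically. Here is a minimal sketch (assuming NumPy; the matrix size, the random matrix, and the seed are arbitrary), which builds $v$ from the top eigenvector of $A^tA$ as in the argument above:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 3
A = rng.standard_normal((n, m))

# v = unit eigenvector of A^T A for its largest eigenvalue (no SVD call needed);
# np.linalg.eigh returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(A.T @ A)
v = eigvecs[:, -1]
sigma = np.sqrt(eigvals[-1])   # operator norm of A
u = A @ v / sigma              # unit vector with A v = sigma u

# Pick any x orthogonal to v and check that u^t y vanishes for y = A x.
x = rng.standard_normal(m)
x = x - (x @ v) * v            # project out the v-component
y = A @ x
print(u @ y)                   # ~1e-16, i.e. zero up to round-off
```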

  • Related question: https://math.stackexchange.com/questions/1737637/understanding-a-derivation-of-the-svd – littleO Aug 16 '17 at 21:40

1 Answer


So $v$ is a unit vector which is "optimal" in the sense that $\| Av \|_2$ is as large as possible. Let $x$ be a vector that is orthogonal to $v$. We want to show that $Ax$ is orthogonal to $Av$.

If this were not the case, then $v$ would not be optimal, because we could improve $v$ by perturbing it a bit in the direction $x$!

I'll explain the intuitive idea first. When we perturb $v$ in the direction $x$, the norm of $v$ does not change (at least to a very good approximation). Imagine standing on the surface of the earth, and $v$ is the vector from the center of the earth to your current location. If you take a step in a direction orthogonal to $v$, your distance from the center of the earth does not change. You are walking on the surface of the earth.

However, when $v$ is perturbed in the direction of $x$, the vector $Av$ is perturbed in the direction of $Ax$. And if $Av$ is not orthogonal to $Ax$, then the change in $\| Av \|_2$ is non-negligible: it is first order in the size of the perturbation. So, by perturbing $v$ in the direction of $x$ (or perhaps in the direction of $-x$), we obtain a unit vector $\tilde v$ for which $\| A \tilde v\|_2$ is larger than $\| Av \|_2$. This shows that $v$ is not optimal after all, which is a contradiction.


That is the intuition, and it is simple and clear. It remains only to convert this intuition into a rigorous proof.

To get a rigorous proof, we have to deal with the fact that $\tilde v = v + \epsilon x$ is not actually a unit vector, even though its norm is very close to $1$ when $\epsilon$ is tiny. Since only the direction of $x$ matters, we may assume $\| x \|_2 = 1$, and we introduce the normalized vector $$ \hat v(\epsilon) = \frac{v + \epsilon x}{\sqrt{1 + \epsilon^2}}, $$ which is a true unit vector. Let $$ f(\epsilon) = \| A \hat v(\epsilon) \|_2^2. $$ Because $v$ is optimal, $f$ attains its maximum at $\epsilon = 0$, so $f'(0) = 0$. And if we compute $f'(0)$ explicitly (a straightforward calculus exercise), we find that $\langle Av, Ax \rangle = 0$.

Here are the details: \begin{align} f(\epsilon) &= \frac{1}{1 + \epsilon^2} \langle Av + \epsilon Ax, Av + \epsilon Ax \rangle \\ &= \left(\frac{1}{1 + \epsilon^2}\right) \| Av \|_2^2 + \left(\frac{2\epsilon}{1 + \epsilon^2}\right) \langle Av, Ax \rangle + \left(\frac{\epsilon^2}{1 + \epsilon^2}\right) \| Ax \|_2^2. \end{align} Differentiating at $\epsilon = 0$ (the first and third terms contribute nothing there), we get $$ f'(0) = 2\langle Av, Ax \rangle = 0, $$ and hence $\langle Av, Ax \rangle = 0$.
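If you want to double-check the formula for $f'(0)$ numerically, including the factor of $2$ in the cross term, here is a small finite-difference sketch (assuming NumPy; the random matrix and seed are arbitrary). It deliberately uses a generic unit vector $w$ in place of the maximizer $v$, so that $f'(0)$ is typically nonzero and there is something to compare:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 4
A = rng.standard_normal((n, m))

# Generic unit vector w and unit x with x orthogonal to w.
w = rng.standard_normal(m); w /= np.linalg.norm(w)
x = rng.standard_normal(m); x -= (x @ w) * w; x /= np.linalg.norm(x)

# f(eps) = ||A vhat(eps)||^2 with vhat(eps) = (w + eps x) / sqrt(1 + eps^2).
def f(eps):
    return np.linalg.norm(A @ (w + eps * x))**2 / (1 + eps**2)

h = 1e-6
fd = (f(h) - f(-h)) / (2 * h)     # central-difference approximation of f'(0)
print(fd, 2 * (A @ w) @ (A @ x))  # the two numbers agree up to the finite-difference error
```

Replacing $w$ by the actual maximizer $v$ makes both printed numbers drop to round-off level, which is exactly the statement $\langle Av, Ax \rangle = 0$.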

littleO
  • For those who are interested, in the general case over $\mathbb C$, consider $f(t)=|A(v+t\omega x)|^2/|v+t\omega x|^2$ instead, where $\omega$ is any complex number. Then $f'(0)=2\Re\langle Av,\omega Ax\rangle$. In order that $f'(0)=0$ for every $\omega$, we must have $Av\perp Ax$. – user1551 Aug 16 '17 at 21:42
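The complex case described in the comment above can be checked the same way. The sketch below (again assuming NumPy, with an arbitrary random complex matrix and a generic unit vector $w$, and using the inner product $\langle a,b\rangle = a^*b$, i.e. NumPy's `vdot`) compares a finite-difference value of $f'(0)$ with $2\Re\langle Aw, \omega Ax\rangle$ for $\omega = 1$ and $\omega = i$:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 5, 3
A = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))

# Generic unit vector w and unit x orthogonal to w in the complex inner product.
w = rng.standard_normal(m) + 1j * rng.standard_normal(m)
w /= np.linalg.norm(w)
x = rng.standard_normal(m) + 1j * rng.standard_normal(m)
x -= np.vdot(w, x) * w
x /= np.linalg.norm(x)

# f(t) = |A(w + t*omega*x)|^2 / |w + t*omega*x|^2, as in the comment.
def f(t, omega):
    z = w + t * omega * x
    return np.linalg.norm(A @ z)**2 / np.linalg.norm(z)**2

h = 1e-6
for omega in (1, 1j):   # omega = 1 probes the real part of <Aw, Ax>, omega = i the imaginary part
    fd = (f(h, omega) - f(-h, omega)) / (2 * h)
    print(fd, 2 * (omega * np.vdot(A @ w, A @ x)).real)   # the pair agrees for each omega
```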