3

I have a MATLAB program to estimate a vector $x$ from noisy measurements. I use the singular value decomposition (SVD) to solve the linear equation $Ax=0$, where the number of equations is greater than the number of variables. I have read (link below) that in this case the solution is the last column of the $V$ matrix (assuming [U,S,V] = svd(A) and $\|x\|=1$).
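For concreteness, here is a minimal MATLAB sketch of this setup (the dimensions, noise level, and construction of A are illustrative assumptions, not my actual code):

    % Illustrative sketch: build rows satisfying a*x_true = 0, add noise,
    % and recover x as the last right singular vector.
    n = 6; m = 50;
    x_true = randn(n, 1); x_true = x_true / norm(x_true);  % unit "true" vector
    B = randn(m, n);
    A = B - (B * x_true) * x_true';    % each row is orthogonal to x_true
    A = A + 1e-3 * randn(m, n);        % measurement noise
    [U, S, V] = svd(A);
    x_est = V(:, end);                 % estimated x, with norm(x_est) = 1
    % Note: x_est may come out as (approximately) -x_true; the SVD does not
    % fix the overall sign.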

The problem is that when I test the program with the true values of the measured quantities (quantities without noise), the estimated vector is correct. However, when I add the noise, the estimated vector is far from the true one: some of its signs appear to be flipped, which makes the difference between it and the true vector large!

I would be grateful if someone could help me.

http://andrew.gibiansky.com/blog/mathematics/cool-linear-algebra-singular-value-decomposition/

  • Something is wrong here. What are you going to solve? If $AX = 0$ for an $m \times n$ matrix $A$ with $m > n$, then usually $X$ is zero (almost always for random $A$); if $X$ is nonzero, then $Y = XQ$ is also a solution for any $Q$. $X$ is given by the last column of $V$ if $S$ has exactly one zero on the diagonal, but this is not the case here. Is this a regression problem? If yes, then you want to find the least squares solution to $AX = b$ using the SVD method. – Pawel Kowal Jun 22 '16 at 00:25
  • I have added the link in the original post, read the part "Applications: Solving Linear Equations". I am trying to find the least squares fit to the data. – maruchan Jun 23 '16 at 06:19
  • part "Applications: Solving Linear Equations" of this note is broken. – Pawel Kowal Jun 23 '16 at 07:35
  • what do you mean by broken? – maruchan Jun 23 '16 at 08:08
  • I mean that this part is completely wrong, misleading, confusing. See for example this as much better explanation of linear equations. – Pawel Kowal Jun 23 '16 at 08:26
  • I misunderstood your problem; it is not a problem of finding a least squares fit to the data. As I understand it, the problem is: find $x$ such that $\|Ax\|_2$ is minimized under the constraint $\|x\|_2 = 1$. Is that right? – Pawel Kowal Jun 23 '16 at 09:22

3 Answers

1

As I understand it, the problem is: find a vector $x$ such that $\|Ax\|_2$ is minimized under the constraint $\|x\|_2 = 1$, for an $m\times n$ matrix $A$.

Such a problem always has a solution, but the solution need not be unique.

First we need to observe that $\min_{\|x\|_2 = 1} \|Ax\|_2 = \sigma_{min}$, where $\sigma_{min}$ is the smallest singular value of $A$. Let $U$, $S$, $V$ be the SVD factorization of $A$, i.e. $USV' = A$, where $S$ is the diagonal matrix of size $n\times n$ containing the singular values sorted in decreasing order. The minimizing vector $x$ is given by the last column of $V$. To see this, observe that $V'x = e_n$, where $e_n$ is the last column of the $n\times n$ identity matrix. Then $Se_n = \sigma_{min} e_n$, and finally $Ax = \sigma_{min} U_n$, where $U_n$ is the last column of $U$. Thus $\|Ax\|_2 = \sigma_{min} \|U_n\|_2 = \sigma_{min}$.

Hence, the last column of the matrix $V$ is one solution $x$ to the problem. Another is $-x$. Other solutions may exist if the matrix $A$ has several singular values equal to $\sigma_{min}$.
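A quick numerical check of this claim (a sketch with random data; any full-column-rank matrix would do):

    % Verify that x = V(:, end) attains ||A*x|| = sigma_min, and so does -x.
    A = randn(50, 6);                 % random full-column-rank matrix
    [~, ~, V] = svd(A);
    x = V(:, end);
    s = svd(A);                       % singular values, in descending order
    disp([norm(A * x), norm(A * (-x)), s(end)])   % all three values agree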

The singular value decomposition is not uniquely determined. If the $i$-th columns of $U$ and $V$ are both multiplied by $-1$ at the same time, we still have a valid SVD. When the matrix $A$ is perturbed slightly, the matrices $U$ and $V$ can change dramatically, but the matrix $S$ cannot.

Basically, there are two solutions to the problem (ignoring degenerate cases). In order to define a unique solution, one must impose an additional constraint, for example that the element with maximum absolute value must be positive.
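A sketch of such a normalization in MATLAB (one possible convention among many; it assumes x = V(:, end) has been computed from [U,S,V] = svd(A)):

    % Normalize the sign: force the entry of largest magnitude to be positive.
    x = V(:, end);
    [~, k] = max(abs(x));
    if x(k) < 0
        x = -x;                       % -x is an equally valid minimizer
    end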

Pawel Kowal
  • I see. So there is no other way to determine the signs of the solution than imposing additional constraints? – maruchan Jun 23 '16 at 10:29
1

We would like to minimize $\|Ax\|_2$ subject to the equality constraint $\|x\|_2 = 1$. Hence,

$$\begin{array}{ll} \text{minimize} & \|Ax\|_2\\ \text{subject to} & \|x\|_2 = 1\end{array}$$

which can be rewritten as a quadratically constrained quadratic program (QCQP)

$$\begin{array}{ll} \text{minimize} & \|Ax\|_2^2\\ \text{subject to} & \|x\|_2^2 = 1\end{array}$$

Thus,

$$\|Ax\|_2^2 = x^T A^T A x \geq \lambda_{\min} (A^T A) \|x\|_2^2 = \lambda_{\min} (A^T A) = \sigma_{\min}^2 (A)$$

and

$$\|Ax\|_2 \geq \sigma_{\min} (A)$$
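A quick numerical sanity check of this bound (random data, purely illustrative):

    % For any unit vector x, ||A*x|| is bounded below by the smallest singular value.
    A = randn(8, 3);
    x = randn(3, 1); x = x / norm(x);         % arbitrary unit vector
    fprintf('||Ax|| = %.4f >= sigma_min = %.4f\n', norm(A * x), min(svd(A)));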

If $A$ has full column rank, then its SVD is of the form

$$A = U \Sigma V^T = \begin{bmatrix} U_1 & U_2\end{bmatrix} \begin{bmatrix} \hat\Sigma\\ O\end{bmatrix} V^T$$

where the zero matrix may be empty. The eigendecomposition of $A^T A$ is, thus,

$$A^T A = V \Sigma^T U^T U \Sigma V^T = V \Sigma^T \Sigma V^T = V \hat\Sigma^2 V^T$$

In this case, $\sigma_{\min} (A) > 0$, and the minimum is attained at the intersection of the $1$-dimensional vector subspace spanned by the right singular vector associated with $\sigma_{\min} (A)$ with the unit Euclidean sphere. As the singular values are usually listed in descending order, this right singular vector should be the last column of $V$. If the minimum singular value has multiplicity greater than $1$, then the minimum is attained at the intersection of a vector subspace of dimension greater than $1$ with the unit Euclidean sphere, which means that columns of $V$ other than the last could be used.
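In practice, one can check how well separated the minimum singular value is before trusting the last column of $V$; a sketch, assuming A holds the noisy measurements as in the question:

    % If the two smallest singular values are close, the minimizing direction is
    % poorly determined, and noise can rotate V(:, end) within the near-null space.
    s = svd(A);                               % singular values, descending
    fprintf('sigma_n = %.3g, gap to sigma_{n-1} = %.3g\n', s(end), s(end-1) - s(end));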

If $A$ does not have full column rank, then its SVD is of the form

$$A = U \Sigma V^T = \begin{bmatrix} U_1 & U_2\end{bmatrix} \begin{bmatrix} \hat\Sigma & O\\ O & O\end{bmatrix} \begin{bmatrix} V_1^T\\ V_2^T\end{bmatrix}$$

where the zero matrices may be empty. The eigendecomposition of $A^T A$ is, thus,

$$A^T A = V \Sigma^T U^T U \Sigma V^T = V \Sigma^T \Sigma V^T = \begin{bmatrix} V_1 & V_2\end{bmatrix} \begin{bmatrix} \hat\Sigma^2 & O\\ O & O\end{bmatrix} \begin{bmatrix} V_1^T\\ V_2^T\end{bmatrix}$$

In this case, $\sigma_{\min} (A) = 0$, and the minimum is attained at the intersection of the null space of $A$ with the unit Euclidean sphere. As the columns of $V_2$ span the null space of $A$, any column of $V_2$ would do.
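A short sketch of this case (the construction of A is an illustrative assumption, chosen so that the null space has dimension at least $2$):

    % Build a 10-by-5 matrix of rank at most 3, then read the null space off
    % the trailing columns of V.
    A = randn(10, 3) * randn(3, 5);
    [~, ~, V] = svd(A);
    r = rank(A);
    N = V(:, r+1:end);                % columns of V_2: a basis for null(A)
    disp(norm(A * N))                 % essentially zero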

  • Thank you for editing my post and for your answer. However, my problem is not about how to solve this kind of system. My problem is, as I stated in the post, that the solution I get has its signs flipped. Maybe this is, as Sto mentioned below, due to the fact that the SVD has a sign ambiguity. – maruchan Jun 23 '16 at 15:11
  • @maruchan Check your latest comment on this question. Do you know what you want? If you know what you're doing, the signs should not be flipped. http://math.stackexchange.com/a/1805239/339790 – Rodrigo de Azevedo Jun 23 '16 at 15:19
  • What do you mean by "the signs should not be flipped"? Could you explain more? From the other two answers, I would say that the SVD has a sign ambiguity, since "if the $i$-th columns of U and V are multiplied by $-1$ at the same time, then we still have a valid SVD decomposition". Isn't that right? – maruchan Jun 23 '16 at 15:30
  • @maruchan The key is "same time". – Rodrigo de Azevedo Jun 23 '16 at 15:48
  • I don't know what I am doing wrong. I have a large number of noisy measurements stored in the matrix A. I perform the SVD on A using the MATLAB function svd, then I take the last column of the V matrix as my solution, and I find that this solution's absolute values are close to the true vector's absolute values; just its signs are flipped. I have no idea how to fix that! – maruchan Jun 23 '16 at 16:02
  • @maruchan What is it whose signs are flipped? If you take the norm of the last column, you should get something close to $1$, as $V$ is an orthogonal matrix. – Rodrigo de Azevedo Jun 23 '16 at 16:12
  • I mean the solution vector (last column of V) have some of its components with opposite signs or even wrong signs compared to the true vector components. – maruchan Jun 23 '16 at 16:19
  • @maruchan Take the inner product of the last column of $V$ with the so-called "true vector". Perhaps they are orthogonal. – Rodrigo de Azevedo Jun 23 '16 at 16:24
  • The problem is not to minimize $\|Ax\|_2$ but to find $\arg\min \|Ax\|_2$. If $x$ is a solution, then $-x$ is also a solution, and the algorithm switches randomly between the two. Both solutions are valid; why would we prefer one of them? – Pawel Kowal Jun 23 '16 at 16:24
  • @PawelKowal, I don't know what $\arg\min \|Ax\|_2$ means, but yes, the sign of the solution is my problem. I am estimating the inertia matrix; that is why I need a specific solution with the same signs as the true inertia. – maruchan Jun 23 '16 at 16:30
  • @maruchan What are the "right signs"? Must all entries of the vector be positive? Nonnegative? – Rodrigo de Azevedo Jun 23 '16 at 16:31
  • @Rodrigo de Azevedo I have 6 components, one is negative, the others are positive. – maruchan Jun 23 '16 at 16:35
  • @maruchan You could have mentioned that. One can solve the QCQP with nonnegativity constraints. – Rodrigo de Azevedo Jun 23 '16 at 16:40
  • @maruchan So why don't you postprocess the solution to obtain correct signs? – Pawel Kowal Jun 23 '16 at 16:48
  • @RodrigodeAzevedo, I also don't know QCQP. So you are saying I can solve this with nonnegativity constraints on some components of the $x$ vector, since not all of the components are positive? – maruchan Jun 23 '16 at 16:48
  • @PawelKowal, postprocess? Could you please explain more? – maruchan Jun 23 '16 at 16:50
  • @maruchan What is the rank of $A$? Does the matrix have full column rank? – Rodrigo de Azevedo Jun 23 '16 at 16:51
  • @RodrigodeAzevedo, yes A has full column rank. – maruchan Jun 23 '16 at 16:54
  • @maruchan If you know, that sign of the solution $x$ is wrong, then just multiply by -1; $-x$ is also a valid solution to the problem. – Pawel Kowal Jun 23 '16 at 16:59
  • @PawelKowal, yes, but you see, the signs are not always the opposite of the true vector's. As I stated earlier, the $x$ vector has 6 elements; the third element should be negative and the others positive. However, in the solution I sometimes get all positive elements, so in that case multiplying by $-1$ will not do it. – maruchan Jun 23 '16 at 17:05
  • @maruchan That is not possible. It means that your "true solution" does not solve the problem you stated. You may have a bug in the code. I don't know how you define the "true solution"; maybe the perturbations are too large, or the smallest singular values are very close to each other. – Pawel Kowal Jun 23 '16 at 17:10
  • @PawelKowal, hmmm... I think my code is correct, because when I eliminate all of the perturbations and noise, the solution I get is always true. But maybe it is the perturbations, or the smallest singular values being close to each other, as you suggested. Are 0.3862 and 0.01865 considered close, for example? – maruchan Jun 23 '16 at 17:18
  • Also I wonder if this problem is related to ill-conditioning. – maruchan Jun 23 '16 at 17:21
  • @maruchan Signs of the solution can change even when the perturbations are very small, if one component of the solution is close to zero. In this problem you cannot require anything about the signs. If the signs are important, then you need to formulate the problem as a QCQP with inequality constraints, as was suggested. But that is something completely different. – Pawel Kowal Jun 23 '16 at 17:38
  • @maruchan What is the multiplicity of the minimum singular value? – Rodrigo de Azevedo Jun 24 '16 at 20:32
  • Many thanks for that great explanation @RodrigodeAzevedo. I would like to ask two questions regarding your proof. How can we state that 1. $\mathbf{x}^\top\mathbf{A}^\top\mathbf{A}\mathbf{x} \ge \lambda_{\min}(\mathbf{A}^\top\mathbf{A})\,\|\mathbf{x}\|_2^2$ and 2. $\lambda_{\min}(\mathbf{A}^\top\mathbf{A}) = \sigma_{\min}^2(\mathbf{A})$? Many thanks in advance! – ecjb Nov 23 '19 at 18:35
  • @ecjb For a proof of the inequality, take a look at this. The equality follows from the definition of singular value. – Rodrigo de Azevedo Nov 24 '19 at 10:27
0

The SVD inherently has a sign ambiguity, so you will need to use some auxiliary information to resolve the direction of the vector.

You may find this answer helpful: https://mathoverflow.net/questions/41756/making-matlab-svd-robust-to-transpose-operation

This paper defines a convention for identifying the true direction of the vectors: https://www.researchgate.net/publication/227677444_Resolving_the_sign_ambiguity_in_the_singular_value_decomposition
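A simplified sketch of that convention (my paraphrase of the idea, not the paper's full algorithm; it assumes A holds the measurements, as in the question): orient each right singular vector toward the majority of the data rows, weighted by squared projection. Note that for the smallest singular vector the projections $Av_k$ are nearly zero, so the sign can remain unstable there, which is consistent with the problem described in the question.

    % Simplified sign convention in the spirit of the paper (illustrative only).
    [U, S, V] = svd(A);
    for k = 1:size(V, 2)
        p = A * V(:, k);              % projections of data rows onto V(:, k)
        if sum(sign(p) .* p.^2) < 0
            V(:, k) = -V(:, k);
            U(:, k) = -U(:, k);       % flip U(:, k) too, so U*S*V' still equals A
        end
    end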

  • My understanding is that the sign ambiguity might affect any singular vector, and that you would use the original data and the singular vectors to infer which direction is the true one. – Sto Forest Jun 20 '16 at 12:15
  • So if my vector (the last column of the V matrix) is not affected at all every time I try to resolve the sign ambiguity issue, does that mean it is the right vector and nothing is wrong with its signs? – maruchan Jun 20 '16 at 15:23
  • No. It's just an artifact of the numerical conditioning of the data. Have you read the paper here? https://www.researchgate.net/publication/227677444_Resolving_the_sign_ambiguity_in_the_singular_value_decomposition – Sto Forest Jun 20 '16 at 15:59
  • Yes, I have; that is why I asked the question. I do not know if I understand it correctly, but I think it says that as long as my vector is not affected, the estimate from the SVD and the flow of the data are in the same direction. Is that right? – maruchan Jun 20 '16 at 16:16
  • Sort of. The direction of the vector may be arbitrary in sign. You should choose its final direction by looking at the data, as per Figure 3 in the paper, and aligning the vector with the data's primary direction :) – Sto Forest Jun 20 '16 at 16:55
  • Thank you very much for your help. I will keep searching. – maruchan Jun 21 '16 at 06:06