
Consider a function $f(X)$ that is quadratic in $X\in\mathbb{R}^{n\times d}$,

$$f(X) = \frac{1}{2}\operatorname{tr}(X^TAX) - \operatorname{tr}(Y^TBX),$$

for positive definite weighted Laplacian matrices $A, B\in\mathbb{R}^{n\times n}$, i.e. the off-diagonal entries $-a_{ij}$ are nonpositive and each diagonal entry equals the negated sum of the off-diagonal entries in its row plus a constant $c_i>0$, which makes the matrix strictly diagonally dominant:

$$A=\left(\begin{array}{ccccc} c_1+\sum_s a_{1s} & & & & \\ & \ddots & & -a_{ij} & \\ & & \ddots & & \\ & -a_{ij} & & \ddots & \\ & & & & c_n+\sum_s a_{ns} \end{array}\right) \in\mathbb{R}^{n\times n},$$

and an arbitrary $Y\in\mathbb{R}^{n\times d}$. In order to obtain its minimum, I set the gradient to $0$,

$$\nabla f(X) = AX-BY \tag 1$$

so, the solution to the following linear system

$$AX=BY \tag 2$$

with the unknown $X$ would give me the minimizer of $f(X)$. Since $A$ is strictly diagonally dominant, the system can be solved by the Jacobi iteration, which is known to converge to the exact solution in this case.

However, I'm interested in whether a *single* iteration of the Jacobi method, started from an arbitrary initialization $X_0$, already yields a result $X_1$ that satisfies $f(X_1)<f(X_0)$.
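For concreteness, here is a minimal numerical sketch of what I mean (the particular random construction of $A$, $B$, $Y$ and $X_0$ below is just an illustrative assumption of mine, not part of the problem):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 3

def weighted_laplacian(n, rng):
    """Symmetric weighted Laplacian plus a positive diagonal shift,
    hence strictly diagonally dominant and positive definite."""
    W = rng.uniform(0.1, 1.0, size=(n, n))
    W = (W + W.T) / 2
    np.fill_diagonal(W, 0.0)
    c = rng.uniform(0.5, 1.5, size=n)          # the constants c_i
    return np.diag(W.sum(axis=1) + c) - W

A = weighted_laplacian(n, rng)
B = weighted_laplacian(n, rng)
Y = rng.standard_normal((n, d))
X0 = rng.standard_normal((n, d))

def f(X):
    return 0.5 * np.trace(X.T @ A @ X) - np.trace(Y.T @ B @ X)

# One Jacobi sweep for A X = B Y with the splitting A = D + R.
D = np.diag(np.diag(A))
R = A - D
X1 = np.linalg.solve(D, B @ Y - R @ X0)

print(f(X0), f(X1), f(X1) < f(X0))   # does a single sweep decrease f?
```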

Note that I consulted several books on the Jacobi method, but I was unable to make the connection with this optimization problem. The Jacobi method is said to yield "progressively better" approximations to the solution of the linear system, but I'm not sure what this means precisely, or what implications it has for the optimization attempt above. In one place I found that the Jacobi iterates $\{X_0, X_1, \dots, X_{k-1}, X_{k}\}$ satisfy

$$\lVert X-X_k\rVert_2 < \lVert X-X_{k-1} \rVert_2 \tag 3$$

where $X$ is the true solution of (2). Does the convergence proof for the Jacobi method actually imply (3)? And, if so, does that imply progressively lower values of $f(X_k)$, $k\in\{0, 1, \dots\}$?
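As a purely empirical probe of these two questions (not a proof, and using the Frobenius norm for the matrix iterates), one could run a few sweeps and track both quantities; the test matrices below are again an arbitrary construction of mine:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 8, 3

def weighted_laplacian(n, rng):
    W = rng.uniform(0.1, 1.0, size=(n, n))
    W = (W + W.T) / 2
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(axis=1) + 1.0) - W    # diagonal shift c_i = 1

A, B = weighted_laplacian(n, rng), weighted_laplacian(n, rng)
Y = rng.standard_normal((n, d))

def f(X):
    return 0.5 * np.trace(X.T @ A @ X) - np.trace(Y.T @ B @ X)

X_true = np.linalg.solve(A, B @ Y)             # exact solution of (2)
D = np.diag(np.diag(A))
R = A - D

Xk = rng.standard_normal((n, d))
for k in range(10):
    Xk1 = np.linalg.solve(D, B @ Y - R @ Xk)
    err_drops = np.linalg.norm(X_true - Xk1) < np.linalg.norm(X_true - Xk)  # probes (3)
    f_drops = f(Xk1) < f(Xk)                                                # probes monotone f
    print(k, err_drops, f_drops)
    Xk = Xk1
```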

One way I tried is to show $f(X_1)-f(X_0)<0$ directly, substituting the Jacobi update $$X_1=D^{-1}(BY-RX_0)$$

with the splitting $A=D+R$, where $D=\operatorname{diag}(A)$, but I was unable to regroup the terms.

I would appreciate it if the answers contained supporting references, so that I could explore the connection more deeply.

usero
  • If there is a way to show that, given $$f(x)=\frac{1}{2}x^TAx -b^Tx,$$ we have $f(x_1)-f(x_0)<0$, where $x_1=D^{-1}(b-Rx_0)$ with the splitting $A=D+R$ as above, then perhaps that could somehow be translated to the actual problem. If it helps, assume the diagonal entries of $A$ are its only positive entries. Intuition says that each Jacobi iteration should decrease the function value, but the problem is to state this formally. – usero Mar 01 '12 at 12:23

1 Answer


Your conjecture is true. Let us define $\Delta x_k = x_{k+1}-x_k$, $x=A\backslash b$, $e_k=x-x_k$, and $v_k = D^{-1/2}Ae_k$. By the definition of the iteration, $\Delta x_k = D^{-1}Ae_k$. So
$$\begin{aligned}
f(x_{k+1}) - f(x_k)
&= x_k^TA\Delta x_k + \tfrac{1}{2}\Delta x_k^TA\Delta x_k - (Ax)^T \Delta x_k\\
&= -e_k^TA\Delta x_k + \tfrac{1}{2}\Delta x_k^T A \Delta x_k\\
&= \tfrac{1}{2}e_k^T A D^{-1}AD^{-1} A e_k - e_k^T A D^{-1}A e_k\\
&= \tfrac{1}{2}v_k^T D^{-1/2}AD^{-1/2} v_k - v_k^Tv_k\\
&= v_k^T\left(\tfrac{1}{2}D^{-1/2}AD^{-1/2}-I\right)v_k.
\end{aligned}$$
Then, from the convergence condition of the Jacobi iteration (the spectral radius of $I-D^{-1}A$ is less than one), you can deduce your conjecture. To this end, we note that:

1. $\frac{1}{2}D^{-1/2}AD^{-1/2}-I$ is symmetric, so its definiteness is equivalent to its spectrum lying entirely on one side of zero.

2. $D^{-1/2}AD^{-1/2}$ has the same spectrum as $D^{-1}A$, since the spectrum of a product is unchanged when the factors are swapped: if $MNx=\lambda x$, then $NM(Nx)=\lambda (Nx)$.

3. The spectrum of $D^{-1}A$ lies in $(0,2)$.
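For illustration only, here is a small numerical sanity check of the identity above and of hints 2. and 3.; the strictly diagonally dominant symmetric $A$ below is an arbitrary construction of mine, so this merely illustrates the argument rather than proving it:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6

# Symmetric, strictly diagonally dominant A with positive diagonal (hence SPD),
# arbitrary right-hand side b and starting point x0.
W = rng.uniform(0.1, 1.0, size=(n, n))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)
A = np.diag(W.sum(axis=1) + 1.0) - W
b = rng.standard_normal(n)
x0 = rng.standard_normal(n)

def f(x):
    return 0.5 * x @ A @ x - b @ x

D = np.diag(np.diag(A))
R = A - D
D_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(A)))     # D^{-1/2}

x = np.linalg.solve(A, b)                           # x = A \ b
x1 = np.linalg.solve(D, b - R @ x0)                 # one Jacobi step
e0 = x - x0
v0 = D_inv_sqrt @ A @ e0                            # v_k = D^{-1/2} A e_k
M = 0.5 * D_inv_sqrt @ A @ D_inv_sqrt - np.eye(n)

# The identity f(x_{k+1}) - f(x_k) = v_k^T (D^{-1/2} A D^{-1/2}/2 - I) v_k, at k = 0.
print(np.isclose(f(x1) - f(x0), v0 @ M @ v0))

# Hint 2: D^{-1/2} A D^{-1/2} and D^{-1} A have the same spectrum.
s1 = np.sort(np.linalg.eigvalsh(D_inv_sqrt @ A @ D_inv_sqrt))
s2 = np.sort(np.linalg.eigvals(np.linalg.solve(D, A)).real)
print(np.allclose(s1, s2))

# Hint 3: that spectrum lies in (0, 2), so the matrix M above is negative definite.
print(s1.min() > 0, s1.max() < 2)
```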

Hui Zhang
  • Thanks. I guess you're implying that the matrix within the brackets in the last line is negative semi-definite. How to prove that? – usero Mar 01 '12 at 15:38
  • I have added more hints. Hope it helps. – Hui Zhang Mar 01 '12 at 15:51
  • Yes, thanks. Just a reference on 2. would be appreciated; you're not implying $D^{-1/2}AD^{-1/2}=D^{-1}A$ ? – usero Mar 01 '12 at 16:19
  • No, the two matrices are not equal. But their spectra coincide. You can prove it by the following argument: if $ABx=\lambda x$, then $BA(Bx)=\lambda (Bx)$. – Hui Zhang Mar 01 '12 at 16:49
  • Is the spectrum from 3. covering $(0, 2)$? – usero Mar 02 '12 at 12:21
  • No, it is $(-1,-\frac{1}{2})$. – Hui Zhang Mar 02 '12 at 22:20
  • You wrote: $i)$ the spectrum of $I-D^{-1}A$ is less than one; $ii)$ the spectrum of $D^{-1}A$ lies in $(0, 1)$; $iii)$ the spectrum of $D^{-1}A$ is $(-1, -1/2)$. Aren't some of those contradictory? – usero Mar 03 '12 at 11:42
  • Hi, I did not claim your iii). In my last comment, the matrix is $\frac{1}{2}D^{-1/2}AD^{-1/2}-I$. – Hui Zhang Mar 03 '12 at 13:12
  • Understood, but that implies that the spectrum of $D^{-1}A$ is in the range $(0, 1)$, the same as for $D^{-1/2}AD^{-1/2}$. I was wondering whether it covers $(0, 2)$, since the spectral radius satisfies $\rho(I-D^{-1}A)<1$? – usero Mar 03 '12 at 13:27
  • Sorry, I misunderstood. You are right! I have corrected in the answer. Thank you! – Hui Zhang Mar 03 '12 at 13:39
  • It's fine, doesn't change the final result. – usero Mar 03 '12 at 13:51