
I have a large matrix $A \in \mathbb{R}^{N\times N}$ which is supposedly positive-definite, but numerically low rank. Instead of $A$, I have its incomplete Cholesky factor $G$, such that $A \simeq GG^\top$. $A$ is very large, so I can't keep it in memory, but $G$ is small enough. I need to numerically solve:

$$ (A^{-1} + D)^{-1} v $$

where $D$ is a diagonal matrix with positive entries, and $v$ is a vector. I want to do so without forming an $N \times N$ matrix. Is this possible without further approximation?

(Context: this is the Newton step in an optimization for variational Bayes under a GP model. $A$ is a Gram matrix, any entry of which I can evaluate on demand if needed.)

Using the matrix inversion lemma, $(A^{-1} + D)^{-1} = A - A (D^{-1} + A)^{-1} A$, we can rewrite it as

$$ (A - A (D^{-1} + A)^{-1} A)v \simeq G(G^\top v) - GG^\top (D^{-1} + A)^{-1} G(G^\top v)$$

But the $(D^{-1} + A)^{-1}$ term is still troublesome. (Perhaps a low-rank approximation of it could help?)

EDIT: If only I had the incomplete Cholesky of the inverse, $A^{-1} = L L^\top$, things would be much easier. In that case, the matrix inversion lemma gives

$$ \left(D^{-1} - D^{-1} L (I + L^\top D^{-1} L)^{-1} L^\top D^{-1}\right) v $$

which can all be computed in the low-rank dimension. Now, note that the transposed pseudo-inverse of $G$ gives $L$, since $A^{-1} \simeq (G^+)^\top G^+$. Hence, an economy-size SVD of $G$ would solve it: with $G = U S V^\top$, take $L = U S^{-1}$. Is there a better way?
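For concreteness, here is a minimal NumPy sketch of this SVD route. It assumes $G$ is a dense $N \times k$ array and $D$ is stored as the vector `D_diag` of its positive diagonal entries; since $A$ is numerically low rank, tiny singular values are truncated before inversion. The name `solve_via_svd` is just illustrative:

```python
import numpy as np

def solve_via_svd(G, D_diag, v):
    """Apply (A^{-1} + D)^{-1} to v with A ~ G G^T, via an economy SVD of G."""
    # Economy SVD: G = U S V^T, so A ~ U S^2 U^T and L = U S^{-1}
    # satisfies A^{-1} ~ L L^T on the range of G.
    U, s, _ = np.linalg.svd(G, full_matrices=False)
    keep = s > s[0] * max(G.shape) * np.finfo(s.dtype).eps  # drop the numerically null directions
    U, s = U[:, keep], s[keep]
    L = U / s                                  # columns of U scaled by 1/s_i
    Dinv = 1.0 / D_diag
    # Woodbury: (L L^T + D)^{-1} = D^{-1} - D^{-1} L (I + L^T D^{-1} L)^{-1} L^T D^{-1}
    DinvL = Dinv[:, None] * L
    small = np.eye(L.shape[1]) + L.T @ DinvL   # small k x k system
    y = np.linalg.solve(small, L.T @ (Dinv * v))
    return Dinv * v - DinvL @ y
```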

Memming
  • Would an iterative solver count as "further approximation"? Even if the relative error of the linear solver is kept smaller than the "truncation errors" incurred in the incomplete Cholesky decomposition? – GoHokies Sep 23 '15 at 17:21
  • @GoHokies what kind of iterative solver do you have in mind? I'll be happy with any fast numerical solution. – Memming Sep 23 '15 at 18:11
  • Related answers: http://scicomp.stackexchange.com/a/10631/1128 – Memming Sep 23 '15 at 20:36
  • 1
    If $D$ is positive definite, I'd suggest you use CG on $(GG^T+D^{-1})$. Otherwise, SYMMLQ or MINRES. – GoHokies Sep 24 '15 at 08:58
  • @GoHokies thanks for the suggestion. $D$ indeed has positive diagonal elements. SYMMLQ and MINRES both seem to work with scaled identity matrices. Is it straightforward to modify them to arbitrary diagonal matrices? I don't have enough memory for $GG^\top + D^{-1}$. – Memming Sep 24 '15 at 12:39
  • 2
    Try Conjugate Gradients (CG). With CG (or other Lanczos-type iterative solvers) you do not need to form $GG^T + D^{-1}$ explicitly, just to calculate its action on a vector: $(GG^T + D^{-1}) y = G(G^T y) + D^{-1}y$. Total (asymptotic) cost of such a vector product: ${\cal O}(N^2)$ operations (the real cost is even lower if your Cholesky factor is sparse). One further observation: your linear solve is the inner loop of a Newton iteration, so it may pay off (computationally) to do less exact solves at the beginning, and tighten the tol-factor as you approach the optimal (Newton) solution. – GoHokies Sep 24 '15 at 13:22
  • The Sherman–Morrison formula may help: https://en.wikipedia.org/wiki/Sherman%E2%80%93Morrison_formula – Anthony Scemama Sep 26 '15 at 16:22
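A minimal matrix-free sketch of the CG approach from the comments, assuming $G$ is a dense $N \times k$ NumPy array and $D$ is stored as the vector `D_diag` of its positive diagonal entries (`apply_inverse` is an illustrative name, not from the thread):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def apply_inverse(G, D_diag, v):
    """Evaluate (A^{-1} + D)^{-1} v ~ A v - A (D^{-1} + A)^{-1} A v,
    with A ~ G G^T, using matrix-free CG for the inner solve."""
    N = G.shape[0]
    Dinv = 1.0 / D_diag  # D has positive diagonal entries

    def matvec(y):
        # Action of (G G^T + D^{-1}) on y, never forming the N x N matrix.
        return G @ (G.T @ y) + Dinv * y

    op = LinearOperator((N, N), matvec=matvec, dtype=G.dtype)
    Av = G @ (G.T @ v)           # A v
    z, info = cg(op, Av)         # z ~ (D^{-1} + A)^{-1} A v
    if info != 0:
        raise RuntimeError("CG did not converge (info=%d)" % info)
    return Av - G @ (G.T @ z)
```

Per the Newton-iteration remark above, the CG tolerance can be kept loose in early outer iterations and tightened as the optimization converges.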

1 Answer


Y.Z. found a simpler solution without an explicit pseudo-inverse.

Using the matrix inversion lemma twice (first $(A^{-1} + D)^{-1} = A - A(D^{-1} + A)^{-1}A$, then $(D^{-1} + GG^\top)^{-1} = D - DG(I + G^\top D G)^{-1}G^\top D$), he got: $$(A^{-1} + D)^{-1}v = GG^\top v - GG^\top D GG^\top v + GG^\top D G (I + G^\top D G)^{-1} G^\top D GG^\top v$$ which can be evaluated without ever forming a large $N \times N$ matrix.
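A minimal NumPy sketch of this final expression, assuming $G$ is a dense $N \times k$ array and $D$ is stored as the vector `D_diag` of its diagonal (the name `newton_apply` is illustrative):

```python
import numpy as np

def newton_apply(G, D_diag, v):
    """Evaluate (A^{-1} + D)^{-1} v with A ~ G G^T; only a k x k solve is needed."""
    t1 = G @ (G.T @ v)                       # G G^T v
    Dt1 = D_diag * t1                        # D G G^T v
    t2 = G @ (G.T @ Dt1)                     # G G^T D G G^T v
    # Small k x k system: (I + G^T D G) y = G^T D G G^T v
    GtDG = G.T @ (D_diag[:, None] * G)
    y = np.linalg.solve(np.eye(G.shape[1]) + GtDG, G.T @ Dt1)
    t3 = G @ (G.T @ (D_diag * (G @ y)))      # G G^T D G y
    return t1 - t2 + t3
```

Each application costs ${\cal O}(Nk^2)$ work (dominated by forming $G^\top D G$) plus one ${\cal O}(k^3)$ dense solve; no $N \times N$ matrix is ever formed.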

Memming