
I want an estimate guaranteeing that my solution has an error of, say, less than 1e-8.

Usually I stop the Gauss-Seidel algorithm when the residual is "small enough", and this is exactly the problem: how do I know when the residual is small enough? Even when the residual is small, the solution may still have too much error, so this is not a good method.

What do you use as a stopping criterion?

On another website (math-linux.com) I found the stopping criterion $$ \|r\|/\|b\|\leq \epsilon. $$ But again, what theory is behind that?

By the way, this is the code I used in my last project, just to show how I did it:

void relax(double epsilon, vector<double> &x, SparseMatrix &A, const vector<double> &f) {
    int maxIter = 100;
    int iter = 0;
    double residual = 1.0;
    double minResidual = 0.000001; // I also tried 1e-14

    while (iter < maxIter && residual >= minResidual) {
        // One Gauss-Seidel sweep: each x[i] is updated in place, so new
        // values are used as soon as they are available.
        for (int i = 0; i < A.dim; ++i) {
            double ls = A.lineScalar(i, x); // sum over j != i of A(i,j)*x[j]
            x[i] = (1.0 / A(i, i)) * (f[i] - ls);
        }
        // L2 and the vector difference are helper functions of my project.
        vector<double> temp = A.multiply(x);
        residual = L2(temp - f);
        ++iter; // count the sweep, otherwise the loop bound has no effect
    }
}
vanCompute

2 Answers


One cannot conclude from a residual how accurate the solution is. Between the best and the worst case in norm there is a factor of exactly the condition number. More precisely, if the residual norm is $r$ and the error norm is $e$, then $\|A\|^{-1}\le e/r \le \|A^{-1}\|$, and both bounds are attainable; taking the quotient of the bounds proves the claim. (The bounds follow from the identity $A(x_{\text{exact}}-\hat x)=b-A\hat x$, which links the error vector to the residual vector.)
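To make this concrete, here is a small self-contained illustration (my addition, not from the answer): for the diagonal matrix $A = \mathrm{diag}(1, 10^{-8})$ with condition number $10^8$, a perturbation of the exact solution along the weak direction changes the error by $1$ but the residual by only $10^{-8}$.

#include <cmath>
#include <cstdio>

// Illustration: a small residual does not imply a small error.
// A = diag(1, 1e-8) has condition number 1e8. Perturbing the exact
// solution in the second component by 1 changes the residual by only 1e-8.
int main() {
    const double a11 = 1.0, a22 = 1e-8;          // diagonal test matrix
    const double xExact[2]  = {1.0, 1.0};
    const double b[2]       = {a11 * xExact[0], a22 * xExact[1]};
    const double xApprox[2] = {1.0, 2.0};        // error of size 1

    const double r[2] = {b[0] - a11 * xApprox[0],
                         b[1] - a22 * xApprox[1]};
    const double err = std::hypot(xApprox[0] - xExact[0],
                                  xApprox[1] - xExact[1]);
    const double res = std::hypot(r[0], r[1]);

    std::printf("error = %g, residual = %g\n", err, res); // prints 1 vs 1e-8
    return 0;
}

Here $e/r = 10^8 = \|A^{-1}\|$, so the upper bound of the inequality is attained.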

The order of magnitude of the norm of the inverse can often be estimated from a few steps of the Lanczos/Arnoldi method. Thus one can apply the bounds.
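As an illustration of the cheap part of such an estimate (my sketch, not the answer's code, reusing the SparseMatrix interface from the question's snippet): power iteration approximates the dominant eigenvalue of a symmetric $A$, and hence $\|A\|_2$; the smallest eigenvalue, which determines $\|A^{-1}\|$, is the harder part (compare the comments below).

#include <cmath>
#include <vector>
using std::vector;

// Sketch: power iteration estimates the dominant eigenvalue of a
// symmetric matrix, i.e. ||A||_2. It assumes the dim and multiply()
// members of the question's SparseMatrix. Estimating ||A^{-1}|| would
// need the smallest eigenvalue, e.g. from a Lanczos tridiagonalization.
double estimateMatrixNorm(SparseMatrix &A, int steps) {
    vector<double> v(A.dim, 1.0);
    double nv = std::sqrt(static_cast<double>(A.dim));
    for (int i = 0; i < A.dim; ++i) v[i] /= nv;   // normalize the start vector
    double lambda = 0.0;
    for (int k = 0; k < steps; ++k) {
        vector<double> w = A.multiply(v);
        double norm = 0.0;
        for (int i = 0; i < A.dim; ++i) norm += w[i] * w[i];
        norm = std::sqrt(norm);
        lambda = norm;                            // ||A v|| with ||v|| = 1
        for (int i = 0; i < A.dim; ++i) v[i] = w[i] / norm;
    }
    return lambda;
}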

However, model equations are generally not exact anyway, and getting the residual smaller than the accuracy with which the uncertain data determine it is considered the scientifically correct procedure. Depending on the data, this may happen far earlier than stagnation.

The theory behind this is called backward error analysis. For example, if the only source of uncertainty is a measurement error of 1e-4 (in some norm) in the right-hand side, and your residual norm is smaller than 1e-4, then it cannot be distinguished from the residual that the exact solution of the problem you solved would have when used as an approximation in a problem with a different right-hand side of the same accuracy.
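In formulas (this is the standard normwise backward error; my addition for concreteness): if only the right-hand side is considered uncertain, the smallest relative perturbation of $b$ that makes the approximation $\hat x$ an exact solution is $$ \min\{\epsilon : A\hat x = b+\Delta b,\ \|\Delta b\|\le\epsilon\|b\|\} = \frac{\|b-A\hat x\|}{\|b\|} = \frac{\|r\|}{\|b\|}, $$ which is precisely the math-linux.com criterion quoted in the question: stopping when $\|r\|/\|b\|\le\epsilon$ means $\hat x$ exactly solves a system whose right-hand side agrees with $b$ to relative accuracy $\epsilon$.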

Arnold Neumaier
  • As one needs $\|A^{-1}\|$, this approach is not feasible, since one has to know the solution to calculate the error. Might it be better to check whether the residual stagnates? – vanCompute Jul 25 '12 at 12:45
  • The order of magnitude of the inverse norm can often be estimated from a few steps of the Lanczos/Arnoldi method. - The fact that one cannot conclude from a residual how accurate the solution is does not depend on knowing the value of the inverse norm. - Generally, model equations are not really exact anyway, and getting the residual smaller than the accuracy with which uncertain data determine the residual is considered the scientifically correct procedure. Depending on the data, this may be far earlier than stagnation. – Arnold Neumaier Jul 25 '12 at 13:03
  • Ok, if my matrix A comes from a PDE and the right-hand side is a measurement with accuracy 1e-4 (for example), then I would stop if the residual is smaller than 1e-4? But what theory is behind that? – vanCompute Jul 25 '12 at 13:26
  • The theory behind it is called backward error analysis. If your residual is smaller than 1e-4, it cannot be distinguished from the residual that the exact solution of your particular problem would have when used as an approximation in a problem with a different rhs of the same accuracy. – Arnold Neumaier Jul 25 '12 at 13:51
  • Ok, I will look it up this evening. Thank you for the hint. – vanCompute Jul 25 '12 at 14:24
  • "order of magnitude of the inverse norm can often be estimated from a few steps" -- A "few steps" is very unreliable; it's easy to be orders of magnitude off after 100 iterations. Arnoldi is worse than Lanczos, but in either case, you really have to be converged to some modest tolerance to have a useful estimate. In contrast, a few iterations (less than 10) almost always give you an accurate estimate of the largest singular value of $A$. – Jed Brown Jul 25 '12 at 15:41
  • @JedBrown: I agree. That's why I had added the qualifier "often". It depends on the kind of matrix and the distribution of the singular values. – Arnold Neumaier Jul 25 '12 at 16:02
  • I did some research on backward error analysis and found this link. With that you can calculate a bound for the residual. – vanCompute Jul 25 '12 at 20:32
  • @vanCompute: You can calculate which bound is compatible with your data uncertainty. In the case of no errors in the matrix (as we both assumed for illustrative purposes) it reduces to my formula. – Arnold Neumaier Jul 26 '12 at 12:58

Usually, when I'm approximating the solution of a system of linear equations, I save the previous approximation $\vec x^{\,n-1}$ and use it to compute $x_{\mathrm{err}}=\max_i|x_i^n-x_i^{n-1}|$ over the components $i$. This is faster than calculating the residual, since it doesn't require a matrix-vector product. However, if your system is ill-conditioned, it still may not give you the correct solution for a given tolerance $\epsilon$.
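A minimal sketch of this test (my addition), grafted onto the relax() loop from the question, with the same assumed SparseMatrix interface:

#include <algorithm>
#include <cmath>
#include <vector>
using std::vector;

// Sketch: stop when the largest change of any component between two
// successive Gauss-Seidel sweeps falls below epsilon. No matrix-vector
// product is needed; the change is accumulated during the sweep.
void relaxDiff(double epsilon, vector<double> &x, SparseMatrix &A,
               const vector<double> &f) {
    const int maxIter = 100;
    for (int iter = 0; iter < maxIter; ++iter) {
        double xErr = 0.0;                  // max_i |x_i^n - x_i^{n-1}|
        for (int i = 0; i < A.dim; ++i) {
            double xOld = x[i];
            double ls = A.lineScalar(i, x); // sum over j != i of A(i,j)*x[j] (assumed)
            x[i] = (f[i] - ls) / A(i, i);
            xErr = std::max(xErr, std::fabs(x[i] - xOld));
        }
        if (xErr < epsilon) break;          // small change; beware: may also be stagnation
    }
}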

Paul
  • Quite interesting. It's like a Cauchy sequence, but the problem I have here is that I cannot bound the error; I only know that the algorithm is still converging, if I am right. – vanCompute Jul 25 '12 at 20:35
  • This is bad! Now you cannot distinguish between stagnation and convergence. Most iterative methods either embed an explicit residual evaluation or have one available through a recurrence relation, therefore you can use much better methods at no extra cost. In any case, it is never okay to simply declare convergence when your method is converging slowly. It is easy to construct problems for which that approach is arbitrarily bad. – Jed Brown Jul 25 '12 at 20:37
  • @JedBrown: True, but the same can be said of residual evaluation: one can construct problems for which the residual fails to distinguish between the two. – Paul Jul 25 '12 at 20:43
  • Yes, if I am looking at the differences between successive approximations, I could run into "slow convergence" too. Is that what you mean, Paul? – vanCompute Jul 25 '12 at 20:50
  • @JedBrown: What would be a method with no extra cost? – vanCompute Jul 25 '12 at 20:52
  • @vanCompute In GS, you locally compute a residual, so you can sum them up. It doesn't represent a global residual, but that sweeping snapshot is still better than just looking at the change in successive iterates. GMRES provides the $l^2$ norm of the residual as part of the iteration (the vector is not computed until the end, but the norm is available). CG explicitly computes the residual vector (using the recurrence). – Jed Brown Jul 25 '12 at 21:03
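For completeness, here is a sketch (my addition) of the sweep-local residual Jed Brown describes, with the same assumed SparseMatrix interface as the question: the quantity $f_i - ls - A(i,i)x_i$ is the $i$-th residual component just before $x_i$ is overwritten, so its norm is available for free during the sweep.

#include <cmath>
#include <vector>
using std::vector;

// Sketch: one Gauss-Seidel sweep that accumulates the sweep-local
// residual as a byproduct. It mixes already-updated and not-yet-updated
// components, so it is a snapshot, not the true residual of any single
// iterate, but it costs nothing extra.
double sweepWithLocalResidual(vector<double> &x, SparseMatrix &A,
                              const vector<double> &f) {
    double res2 = 0.0;
    for (int i = 0; i < A.dim; ++i) {
        double ls = A.lineScalar(i, x);          // off-diagonal part of row i (assumed)
        double ri = f[i] - ls - A(i, i) * x[i];  // residual component before the update
        res2 += ri * ri;
        x[i] = (f[i] - ls) / A(i, i);            // Gauss-Seidel update
    }
    return std::sqrt(res2);                      // l2 norm of the snapshot
}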