Newton-Raphson method fails!

Question

I am trying to solve an equation like $R(x) = 0$, using Newton-Raphson method. To obtain the $x$ increment in each iteration I solve $dx = -(A)^{-1}\cdot R$ where $A = dR/dx$. But the convergence criteria, here $||R||$ reduces to a certain amount, depending on the input, e.g. 1.e-8, and it remains so. After much trying I did not find any special bug, so i am starting to have second thoughts about my implementation.

I am wondering now that the approach to calculate the inverse of matrix $A$ might cause inaccuracy so that the convergence fails. What do you think?

EDIT: The code is written in fortran; I calculate the inverse using MKL's routines: dgetrf and dgetri. I am using double-precision variables.

The problem is to solve $R(\sigma)$ in n steps using Backward-Euler method. So functions $Z^p$ and $Z^d$ are also dependent on $\sigma$. In each step $\Delta\epsilon_{n+1}$ is updated and we solve an implicit problem to update $\sigma$

where for a plane strain problem the stress tensor reads

$$ {\boldsymbol\sigma} = \left[ \begin{array}{c} \sigma_{11}\\ \sigma_{22}\\ \sigma_{33}\\ \sigma_{12} \end{array} \right] $$

The function $R$ is: $$ \mathbf{R}_{\boldsymbol\sigma} = {\boldsymbol\sigma}_{n+1} - {\boldsymbol\sigma}_n - \mathbf{D}\Delta{\boldsymbol\epsilon_{n}} + \Delta t\,\mathbf{D}\left[\mathbf{Z}^p({\boldsymbol\sigma}^{\rm{dev}}) + \mathbf{Z}^d({\boldsymbol\sigma}^{\rm{dev}})\right] $$ where $$ \begin{align} \left[\mathbf{Z}^d({\boldsymbol\sigma}^{\rm{dev}})\right]_i &= \alpha\,\left[{\boldsymbol\sigma}^{\rm{dev}}\right]_i \\ \left[\mathbf{Z}^p({\boldsymbol\sigma}^{\rm{dev}})\right]_i &= \beta\,||{\boldsymbol\sigma}^{\rm{dev}}||^{\frac{1-m}{m}} \left[{\boldsymbol\sigma}^{\rm{dev}}\right]_i \\ \left[{\boldsymbol\sigma}^{\rm{dev}}\right]_i &= \left[\mathbf{I}^{\rm{dev}}\right]_{ij}\,\left[{\boldsymbol\sigma}\right]_j \end{align} $$

$I^{dev}$ is defined as:

$$ I_{ij}^{dev} = \delta_{ij} - \frac{1}{3}m_im_j \quad\text{where}\quad \mathbf{m} = \left[ \begin{array}{c} 1\\ 1\\ 1\\ 0 \end{array} \right] $$

and $\delta$ is the Kronecker delta.

The tangent modulus is given by $$ d\mathbf{R}_{\boldsymbol{\sigma}} = \frac{\partial \mathbf{R}_{\boldsymbol{\sigma}}}{\partial {\boldsymbol{\sigma}}}\,d{\boldsymbol{\sigma}} = \left[\mathbf{I} + \Delta t\,\mathbf{D}\left(\frac{\partial \mathbf{Z}^d({\boldsymbol\sigma}^{\rm{dev}})}{\partial \boldsymbol{\sigma}} + \frac{\partial \mathbf{Z}^p({\boldsymbol\sigma}^{\rm{dev}})}{\partial \boldsymbol{\sigma}}\right) \right]\,d{\boldsymbol{\sigma}} = \mathbf{A}\,d{\boldsymbol{\sigma}} $$ If we expand $\mathbf{A}$ we get $$ \mathbf{A} = \mathbf{I} + \Delta t\,\mathbf{D}\left[\left(\alpha+\beta\,||{\boldsymbol\sigma}^{\rm{dev}}||^{\frac{1-m}{m}}\right)\,\mathbf{I}^{\rm{dev}} + \frac{(1-m)\beta}{m}\,||{\boldsymbol\sigma}^{\rm{dev}}||^{\frac{1-3m}{m}}{\boldsymbol\sigma}^{\rm{dev}}\otimes{\boldsymbol\sigma}^{\rm{dev}}\right] $$

Here $x=\sigma_{n+1}$. Also, $\alpha, \beta, \Delta t, I_{ij}^{dev}, D$ and $m$ are constant($m=0.3$ and $i,j=1,...,4$.). $\sigma_n$ and $\Delta{\boldsymbol\epsilon_{n}}$ are constant in step n.

Here you can see the results for $R$ and $\sigma_{n+1}$.

What is your expression for $A$? Newton iterations often don't converge (or converge slowly) if the Jacobian has been derived from continuum mechanics considerations but is not consistent with the numerical integration method. — Biswajit Banerjee, Apr 11 '14 at 03:47
I am not sure I get your point. Can you explain more? (I added $A$ to the post.) — Vahid, Apr 11 '14 at 07:56
The primary reason for Newton iterations failing to converge, in my experience, is that the Jacobian is wrong. If the Jacobian is correct, then the cause is an instability in which case a line search or arc-length method can fix the problem. In your case, you haven't explained what your quantities are - matrices, vectors, what size etc. You have also not defined $\sigma^{dev}$ and $I^{dev}$. So it's hard to say what's wrong. For example, if $\sigma$ is the stress tensor you have to consider all nine components when you take derivatives. Have you done that? — Biswajit Banerjee, Apr 13 '14 at 02:05
stress and strain are rank-1 tensors with dimension of 4 for the case of plane strain. I don't believe that is an issue. Maybe the problem is the instability. — Vahid, Apr 13 '14 at 07:51
I always like to think of stress and strain as rank-2 tensors for physical reasons - independently of whether a 2D assumption is used. You model is relatively simple and should be easy to test. One ways of finding out the source of the problem is to test your model for a 1D state of stress (uniaxial stress) to find out whether there actually is an instability. Also, you can look at what characteristic the acoustic tensor has. The initial rate of convergence seems to be quadratic. Do you have a convergence plot? What are the eigenvalues of A? — Biswajit Banerjee, Apr 13 '14 at 21:45
@Vahid: Welcome to SciComp! You probably want to try the strategies in http://scicomp.stackexchange.com/a/31/276 first. — Geoff Oxberry, Apr 13 '14 at 21:55
@Biswajit, I am already testing with the simplest case as you mentioned. I have recognized whenever $\Delta\epsilon_{n+1}$ becomes greater than a certain value, the method doesn't converge. So I think the reason was a bad initial point. What do you think? By the way, by a look at the matrix, the eigen values of A are reasonable and finite. I have yet another question, in some cases the convergence rate becomes greater than 2, is that common? — Vahid, Apr 17 '14 at 14:47
@Vahid: Super-quadratic convergence is typically a sign that something is not quite right. This usually happens if the n-dimensional curve that you are trying to intersect moves around too much during Newton iterations. That may also be the reason that smaller strain increments are leading to convergence. Are you following the standard procedure, e.g. the one described here? — Biswajit Banerjee, Apr 22 '14 at 00:01

score 1 · Answer 1 · answered Apr 13 '14 at 21:58

I suggest you do the following to check your implementation:

Take your nearly-converged sigma vector and then calculate the 4x4 Jacobian matrix numerically, column-by-column using a central difference approximation. You may have to experiment a bit with the step size to obtain an accurate result. This numerical Jacobian should be very close to your analytical one.

Biswajit Banerjee · Answer 2 · 2014-04-14T02:49:55.993

As a check we can look at the problem in terms tensors. Let $$ \boldsymbol{\sigma}^{\text{dev}} = \boldsymbol{\sigma} - \frac{1}{3}(\boldsymbol{\sigma}:\mathbf{I})\,\mathbf{I} $$ where $\mathbf{I}$ is the second-order identity tensor. Then the fourth-order tensor $\mathsf{I}^{\rm dev}$ is given by $$ \mathsf{I}^{\rm dev} = \mathsf{I}^{(4s)} - \frac{1}{3} \mathbf{I}\otimes\mathbf{I} $$ where $\mathsf{I}^{(4s)}$ is the symmetric fourth-order identity tensor with components $1/2(\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk})$ and $\delta_{ij}$ is the Kronecker delta.

We can also express $\mathbf{Z}^p$ in tensor form as $$ \boldsymbol{Z}^p = \beta\,\left\|\boldsymbol{\sigma}^{\rm dev}\right\|^{\frac{1-m}{m}} \boldsymbol{\sigma}^{\rm dev} \,. $$ Then $$ \frac{\partial \boldsymbol{Z}^p}{\partial\boldsymbol{\sigma}} = \beta\left[\frac{\partial}{\partial \boldsymbol{\sigma}}\left(\left\|\boldsymbol{\sigma}^{\rm dev}\right\|^{\frac{1-m}{m}}\right)\otimes\boldsymbol{\sigma}^{\rm dev} + \left\|\boldsymbol{\sigma}^{\rm dev}\right\|^{\frac{1-m}{m}} \frac{\partial \boldsymbol{\sigma}^{\rm dev}}{\partial\boldsymbol{\sigma}}\right] $$ Let us look at the first term $$ \frac{\partial}{\partial \boldsymbol{\sigma}}\left(\left\|\boldsymbol{\sigma}^{\rm dev}\right\|^{\frac{1-m}{m}}\right) = \frac{1-m}{m}\left\|\boldsymbol{\sigma}^{\rm dev}\right\|^{\frac{1-2m}{m}} \frac{\partial\left\|\boldsymbol{\sigma}^{\rm dev}\right\|}{\partial \boldsymbol{\sigma}} $$ Next look at the part $$ \frac{\partial\left\|\boldsymbol{\sigma}^{\rm dev}\right\|}{\partial \boldsymbol{\sigma}} = \frac{1}{2\left\|\boldsymbol{\sigma}^{\rm dev}\right\|}\,\frac{\partial}{\partial\boldsymbol{\sigma}}(\boldsymbol{\sigma}^{\rm dev}:\boldsymbol{\sigma}^{\rm dev}) $$ Finally, we look at the last bit above, using $\boldsymbol{\sigma}^{\rm dev}:\mathbf{I} = 0$, $$ \frac{\partial}{\partial\boldsymbol{\sigma}}(\boldsymbol{\sigma}^{\rm dev}:\boldsymbol{\sigma}^{\rm dev}) = \mathsf{I}^{\rm dev}:\boldsymbol{\sigma}^{\rm dev} + \boldsymbol{\sigma}^{\rm dev}:\mathsf{I}^{\rm dev} = 2\boldsymbol{\sigma}^{\rm dev} $$ Therefore, $$ \frac{\partial \boldsymbol{Z}^p}{\partial\boldsymbol{\sigma}} = \frac{\beta(1-m)}{m}\left\|\boldsymbol{\sigma}^{\rm dev}\right\|^{\frac{1-3m}{m}} \boldsymbol{\sigma}^{\rm dev}\otimes\boldsymbol{\sigma}^{\rm dev} + \beta\,\left\|\boldsymbol{\sigma}^{\rm dev}\right\|^{\frac{1-m}{m}}\mathsf{I}^{\rm dev} \,. $$ ~~This is different from the expression you have used. So you will need to recheck your algebra.~~ So the tangent modulus appears to be correct.

Thank you first very much for taking your time to answer to this post. However, I got the same result and it was included in the post before the edit. I can't edit my post. The page doesn't load. — Vahid, Apr 13 '14 at 07:38

Newton-Raphson method fails!

2 Answers2