2

In a personal project of mine, I've derived a couple of equations.

$$ \sum_{j=1}^Nu_{ij}^*(\theta_1,...,\theta_N,J) - k_{i} = 0 \qquad \forall 1\leq i \leq N$$ $$\sum_{i<j}^N\big(u_{ij}^*(\theta_1,...,\theta_N,J)\big)^2 - O^* = 0$$

where $u_{ij}^*(\theta_1,...,\theta_N,J)$ is a solution to the equation

$$u_{ij} =\frac{1}{2} + \frac{1}{2}\tanh{\left( - \frac{\theta_i + \theta_j}{2} + 2\cdot J\cdot M \cdot u_{ij}\right)} $$

where $M,N,k_i^*, O^*$ are known parameters, and $\theta_1,...,\theta_N,J$ are $N+1$ unknown parameters. I thought this system should be solvable since the top two equations give us $N+1$ equations as well.

I think the difficulty in this problem is due to the presence of some unknown/hidden function $u_{ij}^*$.

My first try was to construct a quadratic cost function, and to minimize this quadratic cost function using gradient descent (involving the partial derivatives of $u_{ij}^*$ (which can be obtained implicitly), however this method did not work out.

Does anyone have any idea on how to solve this or know of some tools which I can use to solve this?

Or maybe you could point me to a different place?

nicoguaro
  • 8,500
  • 6
  • 23
  • 49
hoolaboris
  • 21
  • 1
  • 1
    What do you mean "this method did not work out"? Was the method inadequate, or was there a bug in your code? – Wolfgang Bangerth Jun 20 '17 at 15:28
  • I think that you have more equations than unknowns. Your unknowns are $\mathbf{x} = (u_{1,1}, u_{1,2}, \cdots, u_{i,j},\cdots, u_{N-1, N}, u_{N,N}, \theta_{1}, \theta_{2}, \cdots, \theta_{N})$, that add up to $N^2 + N$. While you have $N^2 + N + 1$ equations. – nicoguaro Jun 20 '17 at 16:48
  • I solved using scipy.optimize.root (neglecting the equation that involves the $O*$) and using your approach with scipy.optimize.minimize for $N=2$, and it worked. Of course, some of the components of the solution vector are different. – nicoguaro Jun 20 '17 at 17:05
  • @WolfgangBangerth , depending on my initial parameter value choice ( in the case of convergence) it converged to local minima of the cost function. – hoolaboris Jun 21 '17 at 10:03
  • I see, I missed $J$. I will post an answer considering that. – nicoguaro Jun 21 '17 at 17:39
  • @hoolaboris -- these are difficult points, but you will get trapped in the same points if you tried to use a Newton method or similar for root finding. You may have to come up with a way to get a good initial guess. – Wolfgang Bangerth Jun 23 '17 at 02:14

1 Answers1

2

First, let's rewrite your system in the form of a vector function for which you want to find a root.

$$\mathbf{F}(\mathbf{x}) = 0$$

with

\begin{align} F_i(\mathbf{x}) &= \sum_{j=1}^N u_{ij} - 1 \qquad \forall 1\leq i \leq N\\ F_i(\mathbf{x}) &= \sum_{i<j}^N u_{ij}^2 k_i^2 - R \qquad i = N + 1\\ F_i(\mathbf{x}) &= u_{ij} -\frac{1}{2} - \frac{1}{2}\tanh{\left( - \frac{\theta_i + \theta_j}{2} + 2 J u_{ij}\right)} \qquad \forall i >N +1 \land i\leq N^2 + N + 1\, , \end{align}

and

$$\mathbf{x} = (u_{1,1},u_{1,2},\cdots,u_{i,j},\cdots, u_{N-1,N},u_{N,N},θ_1,θ_2,\cdots,θ_N,J)\, .$$

The parameters $k_i$ are absorbed as part of the new variables $u_{ij}$, and the same happened to $M$. That way, you have $N + 1$ less parameters to worry about (I also renamed $O^*$ as $R$).

Then, you can use a root finding algorithm. If you can compute the Jacobian, you can use something like Newton's method.

The option that you propose would be something like

$$\min_{\mathbf{x}} \Vert \mathbf{F}(\mathbf{x})\Vert^2\, ,$$

what you can plug into a minimization algorithm

Below, I implemented both approaches in Python using scipy.optimize. Notice that basin-hopping is a global minimization approach and it is not a fair comparison. Also, note that basin-hopping will run several initial points and it will take more time that just using a single initial point. If you have a good guess, then it might improve.

from __future__ import division, print_function
import numpy as np
from numpy.linalg import norm
from scipy.optimize import root, basinhopping


def fun(x, R, N, k):
    u = x[:N**2].copy()
    u.shape = (N, N)
    f = np.zeros_like(x)
    f[:N] = -np.ones((N))
    f[N] = -R
    for row in range(N):
        for col in range(N):
            f[row] += u[row, col]
            f[N + 1 + N*col + row] = u[row, col] - 0.5 -\
                0.5*np.tanh(-0.5*(x[N**2 + row] + x[N**2 + col]) +
                2*x[-1]*u[row, col])
            if row < col:
                f[N] += u[row, col]**2*k[row]**2
    return f


def opt_fun(x, R, N, k):
    return norm(fun(x, R, N, k))**2


N = 2
R = 1
size = N**2 + N + 1
x0 = np.random.normal(loc=1,size=size)
k = np.ones((N))
res_root = root(fun, x0, args=(R, N, k))
res_min = basinhopping(opt_fun, x0,
                       minimizer_kwargs={'args': (R, N, k)}, seed=1)
print("Root finding algorithm norm: {:.6g}".format(norm(res_root.fun)))
print("Minimization algorithm norm: {:.6g}".format(norm(res_min.fun)))

For $N=2$ the root finding works better. For higher orders it seems that the optimization procedure works better. For example for $N=10$, this is the output

Root finding algorithm solution norm: 0.772523
Minimization algorithm solution norm: 1.67062e-13
nicoguaro
  • 8,500
  • 6
  • 23
  • 49
  • Thanks for the answer. There are some mistakes in your code where you fill in the vector function, but I understand the general idea. I'm however not sure about your redefinition of $u_{ij}$, as it does not seem consistent between the first $N$ equations and the $N+1$'th equation – hoolaboris Jun 22 '17 at 12:04
  • What are the mistakes? And what is the inconsistency in $u_{ij}$? If you point them out I can correct them or discuss them with you. – nicoguaro Jun 22 '17 at 12:39
  • According to your equations posted above the code: in the code you're adding -1 to the first $N$ equations for every column, i think it should only be added before the 2nd loop begins. For the $N+1$'th equation, you are again adding $R$ for every column/row, while it should only be added once. For the equation $i$ where $(N+1)< i \leq N^2+N+1$, the index is wrong. I think it should be $N+1 + Ncol + row$ instead of $N+1+colrow + row$. – hoolaboris Jun 22 '17 at 13:33
  • For the inconsistency on $u_{ij}$, i'll get back to you later :) But i think the redefinition is fine in the first $N$ equations, but in the $(N+1)$'th equation you can't use the same redefinition for $u_{ij}$, since you now sum over $i$ aswell (which kind of changes the system of equations), but I'll think about it some more later and get back to you! – hoolaboris Jun 22 '17 at 13:39
  • @hoolaboris, I think that you are right. I have fixed it. I also changed the initial guess for a random one. – nicoguaro Jun 22 '17 at 15:51