
I am trying to understand how to set up sequential optimization problems. I first want to understand how to treat a basic problem like

\begin{equation} \max_{x \in \Omega} f(x,m(x)) \quad \text{with} \quad m(x) = \max_{y}g(x,y) \end{equation}

In my actual problem, I have very complicated nonlinear vector functions and several constraints defining the set $\Omega$.

But, let us consider an example with \begin{eqnarray} f(x,m(x)) &=& \sin(x)\,m(x) \ , \\ m(x) &=& \max_{y} g(x,y) \ , \\ g(x,y) &=& -(y-\sin(x))^2+\cos(x) \ , \\ \Omega &=& \{x \ | \ 0 \leq x \leq \pi \} \end{eqnarray} Naturally, the inner maximum is attained at $y = \sin(x)$, so $m(x) = \cos(x)$, $f = \sin(x)\cos(x) = \sin(2x)/2$, and $\max_{x \in \Omega}f = 1/2$ holds at $x=\pi/4 \approx 0.785398$. But suppose you don't have an analytical solution for the maximum of $g$ with respect to $y$ (due to high nonlinearity) and you want to treat the problem numerically.

How would you approach this problem numerically?

My attempt is to define all functions with numerical arguments and use NMaxValue and NMaximize, but even for this 1D problem this already takes forever (around 17 minutes on my laptop).

g[x_?NumericQ, y_?NumericQ] := -(y - Sin[x])^2 + Cos[x];
m[x_?NumericQ] := NMaxValue[g[x, y], y]; (* inner maximization *)
Omega = ImplicitRegion[0 <= x <= Pi, x]; (* not actually used below *)
f[x_?NumericQ] := Sin[x]*m[x];
t1 = DateString[]
NMaximize[{f[x], 0 <= x <= Pi}, x] (* outer maximization *)
t2 = DateString[]
DateDifference[t1, t2, {"Minute", "Second"}]

"Sat 30 Dec 2017 15:17:01"

{0.5, {x -> 0.785398}}

"Sat 30 Dec 2017 15:33:54"

16 min 53 s

EDIT: I just learned that this kind of problem is referred to as bilevel optimization (see Wikipedia). How do you treat these problems with Mathematica?

Mauricio Fernández
  • I am curious what your actual optimization problem is. Would you mind elaborating on it a bit? – Henrik Schumacher Jan 01 '18 at 11:22
  • Sadly, right now I only have my phone with me. I will try to find a compact form for my problem this week and post it. – Mauricio Fernández Jan 01 '18 at 13:27
  • @HenrikSchumacher I won't be able to post my actual problem here. I have been trying to find a compact description of my actual problem, but it contains several high-order tensors, f is a very large expression, and I would have to introduce too much material. Further, my g functions are not concave and have several maxima. But I think I have just come up with an artificial constraint which helps me implicitly define the maxima for my special g. Still, thank you for the help. – Mauricio Fernández Jan 02 '18 at 14:37

1 Answer


Assuming that the maximum of $g$ lies in the interior and not on the boundary, one can use the constraint $\frac{\partial g}{\partial y}=0$. Then the maximum can be obtained with:

Maximize[{f[x, g[x, y]], Derivative[0, 1][g][x, y] == 0}, {x, y}] //TeXForm

$\left\{\frac{1}{2},\left\{x\to -2 \left(24 \pi -\tan ^{-1}\left(\frac{4+3 \sqrt{2}}{-2-\sqrt{2}}\right)\right),y\to \frac{2 \left(-1-\sqrt{2}\right)}{1+\left(-1-\sqrt{2}\right)^2}\right\}\right\}$
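For reference, a purely numerical version of the same idea (a sketch; the two-argument definitions of f and g below are assumptions, since the answer does not list them explicitly):

g[x_, y_] := -(y - Sin[x])^2 + Cos[x]; (* assumed symbolic definition *)
f[x_, m_] := Sin[x] m;                 (* assumed: f takes the inner value m as its second argument *)
(* the stationarity constraint in y replaces the inner NMaxValue call *)
NMaximize[{f[x, g[x, y]], D[g[x, y], y] == 0 && 0 <= x <= Pi}, {x, y}]
(* should give {0.5, {x -> 0.785398, y -> 0.707107}} almost instantly *)

This avoids running a global optimizer for the inner problem at every evaluation of the outer objective, which is what makes the NMaxValue-inside-NMaximize approach so slow.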

Carl Woll
  • Thank you for the advice. I will check if I can use the stationarity condition as an additional constraint in my actual problem. Do you have any idea how to treat such problems if conditions are also imposed on $y$ and the maximum is obtained on the boundary of the corresponding constraint set for $y$, say $\Omega_y$? I will try to create an example for that. – Mauricio Fernández Dec 31 '17 at 10:18
  • @MauricioLobos If g is convex with respect to y and if the domain of definition of g is also concave (and some constraint qualification is satisfied) then the Karush-Kuhn-Tucker-conditions are not only necessary but also sufficient. These can be used similarly to $\frac{\partial g}{\partial y} = 0$ as constraints. – Henrik Schumacher Jan 01 '18 at 00:29
  • Together with optimality of f, this would lead to a system of semismooth equations that can be solved, e.g., with the semismooth Newton method (not hard to implement, actually I have some code for it somewhere). If the optimization problem for g is nonconvex in y, I had to think about it again... – Henrik Schumacher Jan 01 '18 at 00:30
  • @MauricioLobos Sorry: If it's about maximization then g being concave would make things much easier. – Henrik Schumacher Jan 01 '18 at 00:36
  • @HenrikSchumacher thanks! I will check if my g is concave and if the KKT conditions are applicable to my case. Do you happen to have a reference for the application of semismooth Newton in bilevel optimization? – Mauricio Fernández Jan 01 '18 at 11:13
  • @MauricioLobos Actually no. =0/ It just came to my mind yesterday because it can be a very efficient way to solve the KKT conditions. – Henrik Schumacher Jan 01 '18 at 11:16
  • @CarlWoll, am I missing something, or why is x so small in the result (in g it seems to be somewhere around 1)? I also cannot make this code work in version 11. Could you provide f in your code, as the OP's f seems different? – garej May 24 '18 at 05:17
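To illustrate the KKT suggestion from Henrik Schumacher's comments above: if the inner variable is itself constrained, say $a \leq y \leq b$, the single stationarity constraint can be replaced by the full KKT system (stationarity of the Lagrangian, complementarity, and feasibility). A minimal sketch for the toy problem; the bounds a = 0, b = 2 and the multiplier names mua, mub are made up for illustration and are not from the thread:

g[x_, y_] := -(y - Sin[x])^2 + Cos[x];
f[x_, m_] := Sin[x] m;
{a, b} = {0, 2};
kkt = D[g[x, y], y] + mua - mub == 0 &&       (* stationarity of the Lagrangian *)
      mua (y - a) == 0 && mub (b - y) == 0 && (* complementarity *)
      mua >= 0 && mub >= 0 && a <= y <= b;    (* dual and primal feasibility *)
NMaximize[{f[x, g[x, y]], kkt && 0 <= x <= Pi}, {x, y, mua, mub}]

Complementarity constraints are nonsmooth, so on harder problems NMaximize may need method tuning, or a dedicated approach such as the semismooth Newton method mentioned in the comments.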