Confusion related to convex optimization

Question

I have been reading about convex optimization. We have:

minimize $f(x)$ s.t. $h(x) = 0$, $g(x) \le 0$, $x \in X$

Its Lagrangian dual is:

maximize $\phi(\lambda,\mu)$ s.t. $\mu \ge 0$, where $\phi(\lambda,\mu) = \inf_{x \in X}[f(x) + \lambda' h(x) + \mu' g(x)]$

I don't understand why $\mu$ must be greater than or equal to zero. Can anyone please explain?

brent.payne · Accepted Answer · 2013-03-09T15:12:24.293

For this post: $\mathcal{L} = \mathcal{L}(\lambda, \mu) = f(x) + \lambda h(x) + \mu g(x)$

You want $\mathcal{L}(\lambda,\mu) \le f(x)$ for all valid (i.e., feasible) $x$.
A valid $x$ is one that satisfies $h(x) = 0$ and $g(x) \le 0$.

If $x$ satisfies $h(x) = 0$, then the term $\lambda h(x)$ in $\mathcal{L}$ is zero regardless of $\lambda$.

If $x$ satisfies $g(x) \le 0$, then the term $\mu g(x)$ in $\mathcal{L}$ is negative or zero as long as $\mu \ge 0$.

This is required for $\mathcal{L} = f(x) + \lambda h(x) + \mu g(x) \le f(x)$ for valid $x$. This makes the Lagrangian a lower bound on $f(x)$ for valid $x$.
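As a quick numerical sanity check of this bound, here is a minimal Python sketch. The toy problem ($f(x) = x^2$, $g(x) = 1 - x$, no equality constraint) and all function names are assumptions chosen for illustration; they are not from the question.

```python
# Toy problem (an assumption for illustration): minimize x^2 subject to 1 - x <= 0.

def f(x):
    return x ** 2

def g(x):
    return 1.0 - x  # feasible exactly when x >= 1

def lagrangian(x, mu):
    return f(x) + mu * g(x)

# A grid of feasible points: 1.0, 1.25, ..., 3.0
feasible_xs = [1.0 + 0.25 * k for k in range(9)]

def is_lower_bound(mu):
    """True if L(x, mu) <= f(x) at every feasible grid point."""
    return all(lagrangian(x, mu) <= f(x) for x in feasible_xs)

bound_holds_nonneg = all(is_lower_bound(mu) for mu in [0.0, 0.5, 2.0, 10.0])
bound_holds_neg = is_lower_bound(-1.0)

print(bound_holds_nonneg, bound_holds_neg)  # prints: True False
```

With $\mu = -1$ and the feasible point $x = 2$, the Lagrangian is $4 + (-1)(-1) = 5 > 4 = f(2)$, so it is no longer a lower bound.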

score 2 · Answer 2 · answered Mar 09 '13 at 03:55

Ignore the equality constraint for a moment and only think of the inequality constrained problem (assuming all functions are at least once continuously differentiable). Now think about what it means for a point $x^\ast$ to be a minimizer of your problem:

If it lies in the interior of the domain described by $g(x)\le 0$ (i.e., $g(x^\ast)<0$), then $x^\ast$ can only be a minimizer if $\nabla f(x^\ast)=0$, because otherwise going in direction $-\nabla f$ would lower the function value. So, in this case, the optimality condition $\nabla f(x^\ast) + \mu^\ast \nabla g(x^\ast)=0$ is satisfied only if $\mu^\ast=0$.
Then consider the case where the minimizer $x^\ast$ lies on the boundary of the feasible domain, i.e., $g(x^\ast)=0$. In this case, for $x^\ast$ to be a minimizer, the gradient of $f$ at $x^\ast$ must be perpendicular to the boundary and point into the feasible set, since otherwise moving a little bit into the domain or along the boundary would lower the function value. On the other hand, the direction perpendicular to the boundary and pointing outward is given by $\nabla g(x^\ast)$. Consequently, we can express the optimality condition as saying that $\nabla f(x^\ast)$ must be equal to a nonpositive multiple of $\nabla g(x^\ast)$. In other words, $\nabla f(x^\ast) + \mu \nabla g(x^\ast) = 0$ with some multiplier $\mu \ge 0$ (strictly positive unless $\nabla f(x^\ast)=0$).

Makes sense? It's actually a pretty intuitive argument if you think of it in terms of geometry.
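To make the two cases concrete, here is a small one-dimensional sketch in Python. The problem (minimize $(x-c)^2$ subject to $g(x) = x - 1 \le 0$), the closed-form minimizer `min(c, 1.0)`, and the function name `solve` are all assumptions made for this illustration only.

```python
def solve(c):
    """Minimize (x - c)^2 subject to x - 1 <= 0 (toy problem, assumed for illustration).

    Returns (x_star, mu_star), where mu_star satisfies
    grad_f(x_star) + mu_star * grad_g(x_star) = 0.
    """
    # For this toy problem the minimizer is the unconstrained minimizer c
    # clipped to the feasible set x <= 1.
    x_star = min(c, 1.0)
    grad_f = 2.0 * (x_star - c)  # derivative of (x - c)^2 at x_star
    grad_g = 1.0                 # derivative of x - 1; points out of the feasible set
    active = (x_star == 1.0)     # is the constraint active, g(x_star) = 0?
    # Interior minimizer: multiplier is zero. Boundary minimizer: solve stationarity.
    mu_star = -grad_f / grad_g if active else 0.0
    return x_star, mu_star

print(solve(0.0))  # interior case: (0.0, 0.0)
print(solve(2.0))  # boundary case: (1.0, 2.0)
```

For $c = 2$ the gradient of $f$ at $x^\ast = 1$ is $-2$, which points into the feasible set $x \le 1$, while $\nabla g = 1$ points outward, so the multiplier comes out positive.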
