
It is written in a book I'm reading that $$\nabla f(x) = \left( \frac{\partial f(x)}{\partial x_1}, \frac{\partial f(x)}{\partial x_2},...,\frac{\partial f(x)}{\partial x_n}\right)$$ and $$\nabla^2 f(x)_{ij} = \frac{\partial^2 f(x)}{\partial x_i ~\partial x_j}, \qquad \forall i,j=1,...,n.$$

According to the second-order condition, a twice differentiable function $f$ is convex if and only if $$\nabla^2 f(x) \ge 0, \qquad \forall x \in \mathrm{dom} f.$$

However, the function $f(x,y) = \sqrt{x^2+y^2}$ is convex but does not seem to meet the second-order condition: $$ \begin{aligned} \frac{\partial^2 }{\partial x^2} \sqrt{x^2+y^2} &= \frac{y^2}{(x^2+y^2)^{\frac{3}{2}}} \ge 0,\\ \frac{\partial^2 }{\partial x ~ \partial y} \sqrt{x^2+y^2} &= - \frac{x y}{(x^2+y^2)^{\frac{3}{2}}} \le 0 \quad \text{for } xy \ge 0. \end{aligned} $$ Can anyone explain this?

Christian Clason
Kevin
  • This notation is a bit misleading: for a matrix $A\in\mathbb{R}^{n\times n}$, writing $A\geq 0$ usually does not mean that all entries $a_{ij}$ are nonnegative, but that the matrix is positive semi-definite, i.e., $x^TAx\geq 0$ for all $x\in\mathbb{R}^n$. – Christian Clason Jan 20 '16 at 12:51
  • Do you mean that the matrix $D= [\frac{\partial^2 f}{\partial x^2},\frac{\partial^2 f}{\partial x \partial y}; \frac{\partial^2 f}{\partial y \partial x},\frac{\partial^2 f}{\partial y^2} ]$ must satisfy $\det(D) \ge 0$? – Kevin Jan 20 '16 at 13:30
  • Not quite: all eigenvalues must be nonnegative (they are automatically real because the Hessian is symmetric). This is sufficient, but not necessary, for the determinant to be nonnegative. Put another way, $\det(A)\geq 0$ is only necessary, not sufficient, for convexity. – Christian Clason Jan 20 '16 at 13:31
  • But how do I check the eigenvalues of the matrix $D$, since its entries are functions of $x$ and $y$ rather than fixed real numbers? – Kevin Jan 20 '16 at 13:33
  • By the way, what does $A \succeq 0$ mean? – Kevin Jan 20 '16 at 13:52
  • By Sylvester's criterion a $2\times2$ matrix is p.d. iff $A_{11}>0$ and $\det A>0$. That's usually easier than computing eigenvalues. – Kirill Jan 20 '16 at 16:53
  • @ChristianClason Yes, you're right of course. It's all principal minors (not just leading principal minors) for p.s.d. So $A_{11},A_{22}\geq 0$, $\det A\geq 0$ would be sufficient. – Kirill Jan 20 '16 at 19:12
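
The principal-minor test from the last two comments can be sketched in a few lines of Python. This is only an illustration, assuming NumPy is available; the helper name `is_psd_2x2`, the tolerance `tol`, and the counterexample matrix are made up for this sketch and do not come from the thread.

```python
import numpy as np

def is_psd_2x2(A, tol=1e-12):
    """Test a symmetric 2x2 matrix for positive semi-definiteness
    using all principal minors: A11 >= 0, A22 >= 0 and det(A) >= 0.
    (Hypothetical helper; tol absorbs floating-point rounding.)"""
    A = np.asarray(A, dtype=float)
    det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
    return bool(A[0, 0] >= -tol and A[1, 1] >= -tol and det >= -tol)

# Checking only the leading minors (A11 >= 0, det >= 0) would wrongly
# accept this matrix, whose eigenvalues are 0 and -1.
counterexample = np.array([[0.0, 0.0], [0.0, -1.0]])

# Hessian of sqrt(x^2 + y^2) evaluated at (x, y) = (1, 2).
r3 = (1.0**2 + 2.0**2) ** 1.5
hessian = np.array([[4.0, -2.0], [-2.0, 1.0]]) / r3

print(is_psd_2x2(counterexample))  # False
print(is_psd_2x2(hessian))         # True
```

The counterexample shows why the second diagonal entry has to be checked as well in the semi-definite case.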

2 Answers


Consolidating my comments (so that they can be cleaned up): This is a misunderstanding of the notation $\nabla^2 f(x)\geq 0$.

A twice (continuously!) differentiable function $f:\mathbb{R}^n\to \mathbb{R}$ is convex if and only if the Hessian $\nabla^2 f(x)\in\mathbb{R}^{n\times n}$ is positive semi-definite at every $x\in \mathbb{R}^n$. (This definition makes sense since the Hessian is symmetric by Schwarz' theorem if the second derivatives are continuous.) This is sometimes written as $$\nabla^2 f(x) \succeq 0 \qquad\text{for all } x\in\mathbb{R}^n$$ (and more rarely -- since it can lead to misunderstandings -- as $\nabla^2 f(x)\geq 0$).

As @nicoguaro points out in his answer, this is equivalent to the condition that all eigenvalues of $\nabla^2 f(x)$ -- as a function of $x$ -- are nonnegative for every $x\in \mathbb{R}^n$. An equivalent (and often easier to verify, especially for large $n$) condition is that $$d^T\nabla^2 f(x)d \geq 0 \qquad\text{for all } d\in\mathbb{R}^n \text{ and }x\in\mathbb{R}^n.$$

(This condition is also easier to work with if you want to rule out convexity: It's sufficient to find a single $d$ such that $d^T \nabla^2 f(x) d<0$.)
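
To illustrate ruling out convexity with a single direction, here is a minimal Python sketch, assuming NumPy. The function $g(x,y) = x^2 - y^2$ and the helper `quadratic_form` are assumptions chosen for illustration; they are not part of the question.

```python
import numpy as np

def quadratic_form(H, d):
    """Return d^T H d for a square matrix H and a direction d."""
    d = np.asarray(d, dtype=float)
    return float(d @ np.asarray(H, dtype=float) @ d)

# Illustrative function (an assumption, not from the question):
# g(x, y) = x^2 - y^2, whose Hessian is the same at every point.
H_g = np.array([[2.0, 0.0], [0.0, -2.0]])

# A single direction with negative curvature is enough.
d = np.array([0.0, 1.0])
value = quadratic_form(H_g, d)
print(value)  # -2.0, so g is not convex
```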


In your example (with $x_1 = x$ and $x_2 = y$), this would yield $$ \begin{aligned} \begin{pmatrix} d_1 & d_2 \end{pmatrix} \begin{pmatrix} \frac{x_2^2}{(x_1^2 + x_2^2)^\frac{3}{2}} & \frac{-x_1\,x_2}{(x_1^2 + x_2^2)^\frac{3}{2}} \\ \frac{-x_1\,x_2}{(x_1^2 + x_2^2)^\frac{3}{2}} & \frac{x_1^2}{(x_1^2 + x_2^2)^\frac{3}{2}} \end{pmatrix} \begin{pmatrix} d_1 \\ d_2 \end{pmatrix} &= \frac{1}{(x_1^2 + x_2^2)^\frac{3}{2}}\left(d_1^2x_2^2 - 2 d_1 x_1x_2d_2 + d_2^2x_1^2\right)\\ &= \frac{1}{(x_1^2 + x_2^2)^\frac{3}{2}}\left(d_1x_2-d_2x_1\right)^2\\ &\geq 0 \end{aligned} $$ for all $d\in\mathbb{R}^2$ and all $x\in\mathbb{R}^2\setminus\{0\}$. (At $x=0$, $f$ is not differentiable; since $f$ is the Euclidean norm, convexity there follows from the triangle inequality.) Hence, $f$ is convex.
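
The identity above can also be checked numerically. The following is a rough sketch, assuming NumPy; the helper `hessian_norm`, the sampling ranges, and the tolerances are choices made for this illustration only. Points are sampled away from the origin, where the Hessian is not defined.

```python
import numpy as np

def hessian_norm(x1, x2):
    """Hessian of f(x1, x2) = sqrt(x1^2 + x2^2), valid away from the origin."""
    r3 = (x1**2 + x2**2) ** 1.5
    return np.array([[x2**2, -x1 * x2], [-x1 * x2, x1**2]]) / r3

rng = np.random.default_rng(0)
all_match = True
min_form = np.inf
for _ in range(1000):
    # Random point with |x1|, |x2| in [0.5, 2] and random signs.
    x = rng.uniform(0.5, 2.0, size=2) * rng.choice([-1, 1], size=2)
    d = rng.normal(size=2)
    lhs = d @ hessian_norm(*x) @ d                        # d^T H d
    rhs = (d[0] * x[1] - d[1] * x[0]) ** 2 / (x @ x) ** 1.5  # closed form
    all_match = all_match and bool(np.isclose(lhs, rhs, rtol=1e-9, atol=1e-9))
    min_form = min(min_form, lhs)

print(all_match)  # whether both expressions agree at every sample
print(min_form)   # smallest observed value of d^T H d
```

A numerical check like this is no proof, but it is a quick way to catch sign or index errors in a hand-computed Hessian.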

Christian Clason

The comments already mention what you are not considering. For the particular example you mention we have

$$\nabla^2 f(x,y) = \begin{pmatrix} \frac{y^2}{(x^2 + y^2)^\frac{3}{2}} & \frac{-x\,y}{(x^2 + y^2)^\frac{3}{2}}\\ \frac{-x\,y}{(x^2 + y^2)^\frac{3}{2}} & \frac{x^2}{(x^2 + y^2)^\frac{3}{2}} \end{pmatrix}$$

The eigenvalues are $0$ and $\frac{1}{\sqrt{x^2 + y^2}}$, which are both nonnegative for all $(x,y) \neq (0,0)$.
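
These eigenvalues can be confirmed numerically at a few sample points. The sketch below assumes NumPy and uses `numpy.linalg.eigvalsh`, which returns the eigenvalues of a symmetric matrix in ascending order; the helper `hessian_norm` and the sample points are arbitrary choices for illustration.

```python
import numpy as np

def hessian_norm(x, y):
    """Hessian of f(x, y) = sqrt(x^2 + y^2), valid for (x, y) != (0, 0)."""
    r3 = (x**2 + y**2) ** 1.5
    return np.array([[y**2, -x * y], [-x * y, x**2]]) / r3

# Sample points in several quadrants (arbitrary choices).
points = [(1.0, 2.0), (-3.0, 0.5), (0.0, -4.0), (-1.0, -1.0)]
checks = []
for x, y in points:
    eigs = np.linalg.eigvalsh(hessian_norm(x, y))      # ascending order
    expected = np.array([0.0, 1.0 / np.hypot(x, y)])  # 0 and 1/sqrt(x^2+y^2)
    checks.append(bool(np.allclose(eigs, expected)))
    print((x, y), eigs)

print(all(checks))  # True
```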

nicoguaro