4

Consider

$$E := \{ x \in \Bbb R^n \mid x^T D x = 1 \}$$

an ellipsoid defined by the diagonal matrix $D = \mbox{diag}(d_1, d_2, \dots, d_n)$ with $d_i > 0,\ \forall i \in [n]$. Suppose that $z$ is strictly inside the ellipsoid, i.e., $z^T D z < 1$. What is the projection of $z$ onto the surface of the ellipsoid $E$? In other words, does the following nonconvex problem have a closed-form solution? \begin{equation} \min_{x\in \mathbb{R}^n} \ \|x-z\|_2^2 \qquad \textrm{subject to} \qquad x^TDx=1. \end{equation}

Having worked through the standard KKT conditions, I doubt that a closed-form solution exists. While searching for effective numerical methods, I found that there are methods for the following related problem, for $z$ with $z^TDz>1$:

\begin{equation} \min_{x\in\mathbb R^n}\ \|x-z\|_2^2 \qquad \mbox{subject to} \qquad x^TDx\le 1. \end{equation}

See this paper, for example. Any ideas on how I could adapt these methods? Or are there papers that discuss how to project a point inside an ellipsoid onto its surface?

I appreciate any useful information.

Sam
  • 364

2 Answers

3

The fact that the matrix $D$ is diagonal makes the problem much easier. The Lagrangian is $$L(x,y) = \|x-z\|_2^2 + y\left(x^TDx-1\right),$$ so, after dividing the gradient by 2, the KKT conditions are $$(x_i-z_i) + yd_i x_i = 0 \quad \forall i,$$ $$x^TDx = 1.$$ The stationarity condition can also be expressed as $$x_i = \frac{z_i}{1+yd_i}.$$ This simple expression is only possible because $D$ is diagonal, which gives rise to the term $yd_ix_i$. By symmetry, $x_i^*$ has the same sign as $z_i$, so $1+yd_i \geq 0$ for all $i$, i.e., $y\geq -1/(\max_i d_i)$. Note that $y=0$ is impossible (it leads to $x^T D x = z^TDz <1$), and that $y>0$ implies $|x_i| \leq |z_i| \; \forall i$, which is also impossible (it gives $x^TDx \leq z^TDz < 1$), so $y<0$. All that's left is finding $y$ for which $-1/(\max_i d_i) \leq y < 0$ and $$\sum_i d_i \left( \frac{z_i}{1+y d_i}\right)^2 = 1.$$ Since the left-hand side is monotonically decreasing in $y$ on this interval, you can use bisection search.
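The bisection step can be sketched in Python as follows. This is only an illustration of the idea, not a tuned implementation: the function name `project_onto_ellipsoid`, the tolerances, and the final feasibility check are assumptions made for this example. It handles the generic case in which the scalar equation has a root in the open interval $(-1/\max_i d_i,\ 0)$, and raises an error otherwise.

```python
def project_onto_ellipsoid(z, d, rel_tol=1e-13, max_iter=200):
    """Sketch: closest point on {x : sum_i d_i * x_i**2 = 1} to an interior z.

    z and d are sequences of floats of equal length (hypothetical interface).
    """
    if any(di <= 0 for di in d):
        raise ValueError("all d_i must be positive")
    if sum(di * zi * zi for zi, di in zip(z, d)) >= 1.0:
        raise ValueError("z must lie strictly inside the ellipsoid")

    def constraint_value(y):
        # Left-hand side of the scalar equation: sum_i d_i * (z_i / (1 + y*d_i))**2
        return sum(di * (zi / (1.0 + y * di)) ** 2 for zi, di in zip(z, d))

    lo = -1.0 / max(d)  # left end of the bracket; never evaluated directly
    hi = 0.0            # constraint_value(0) = z^T D z < 1
    for _ in range(max_iter):
        if hi - lo <= rel_tol * abs(lo):
            break
        mid = 0.5 * (lo + hi)
        if constraint_value(mid) > 1.0:
            lo = mid    # value too large, so the root lies to the right
        else:
            hi = mid    # value too small (or exact), so the root lies to the left

    y = 0.5 * (lo + hi)
    x = [zi / (1.0 + y * di) for zi, di in zip(z, d)]

    # Assumed safeguard: if no root exists in the open interval
    # (e.g. z_i = 0 for every index attaining max d_i), report it.
    if abs(sum(di * xi * xi for xi, di in zip(x, d)) - 1.0) > 1e-8:
        raise ValueError("no root found in the bracket; degenerate case")
    return x


if __name__ == "__main__":
    # Unit circle: the projection of z is z / ||z||.
    print(project_onto_ellipsoid([0.3, 0.4], [1.0, 1.0]))
```

Each evaluation of the scalar function costs $O(n)$, so the overall cost is $O(n)$ times the number of bisection steps.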

LinAlg
  • 19,822
  • This is fantastic. I really appreciate your time and effort. Voting up now, and I will most probably accept it as the solution later. I am looking forward to other possible solutions until the deadline, but I am not really sure how they would beat yours! Thank you so much again. – Sam Jul 31 '20 at 21:23
  • Could you please explain how symmetry results in having the same sign for $x_i$ and $z_i$? I am basically thinking about a scenario in which the problem has multiple solutions, maybe that does not happen!? – Sam Jul 31 '20 at 22:01
  • I can see that the solution is not necessarily unique, at least for $z=0$. – Sam Jul 31 '20 at 22:24
  • Symmetry: you can work out all the $+/-$ cases, but intuitively, if you put $-|x_i^*|$ and $|x_i^*|$ on the real line (both are feasible), $|x_i^*|$ is closer to positive numbers and $-|x_i^*|$ is closer to negative numbers. Since the squared distance is in the objective, $x_i^* = |x_i^*|$ has a better objective value when $z_i > 0$ (and $x_i^* = -|x_i^*|$ is better when $z_i<0$). – LinAlg Jul 31 '20 at 23:11
  • When $z_i=0$ for $i \in I$ with $I = \text{argmax}_i d_i$, the problem is a bit harder because $x_i=0$ is clearly not optimal (think 2d with $z=(1,0)$ and $d=(0.01,1)$). Setting $y=-1/(\max_i d_i)$ satisfies stationarity for $i\in I$ and determines $x_i$ for $i \not\in I$ from the stationarity conditions; the remaining $x_i$, $i \in I$, then follow from $x^T Dx = 1$. – LinAlg Aug 01 '20 at 00:22
  • Thank you for your clarification. Nicely done! – Sam Aug 01 '20 at 22:27
2

One way to look at this problem is from a bounding perspective, although it only gives insight into the optimal distance $\|x^*-z\|_2$, and not necessarily localization information of $x^*$ itself in general.

In particular, note that we can define a lifted variable $X=xx^\top$. Then the left side of the constraint can be rewritten as \begin{equation*} x^\top Dx = \text{tr}(x^\top Dx) = \text{tr}(Dxx^\top) = \text{tr}(DX). \end{equation*} Similarly, the objective can be written as \begin{equation*} \|x-z\|_2^2 = x^\top x - 2z^\top x + z^\top z = \text{tr}(X)-2z^\top x + z^\top z. \end{equation*} Therefore, the projection problem is equivalent to the following: \begin{equation*} \begin{aligned} &\underset{x\in\mathbb{R}^n,X\in\mathbb{S}^n}{\text{minimize}} && \text{tr}(X)-2z^\top x + z^\top z \\ &\text{subject to} && \text{tr}(DX)=1, \\ &&& X=xx^\top. \end{aligned} \end{equation*} Under this reformulation, the objective is affine, and the old equality constraint is also affine. However, the nonconvexity has been absorbed into the new constraint $X=xx^\top$. If you relax this constraint to $X\succeq xx^\top$, the problem becomes convex, since $f\colon\mathbb{R}^n\times\mathbb{S}^n\to\mathbb{S}^n$ defined by $f(x,X)=xx^\top-X$ is cone-convex with respect to the positive semidefinite cone. Indeed, using Schur complements, we can further rewrite the condition $X-xx^\top\succeq 0$ as \begin{equation*} \begin{bmatrix} 1 & x^\top \\ x & X \end{bmatrix} \succeq 0. \end{equation*} Since we've introduced a relaxation of your original problem, we conclude that the optimal value of the following (convex) semidefinite program lower bounds that of your original problem: \begin{equation*} \begin{aligned} &\underset{x\in\mathbb{R}^n,X\in\mathbb{S}^n}{\text{minimize}} && \text{tr}(X)-2z^\top x + z^\top z \\ &\text{subject to} && \text{tr}(DX)=1, \\ &&& \begin{bmatrix} 1 & x^\top \\ x & X \end{bmatrix} \succeq 0. \end{aligned} \end{equation*} Note that if the relaxed constraint is tight at the optimum, i.e., $X^*=x^*x^{*\top}$, you can conclude that $x^*$ solves the original nonconvex problem.

For the other side of things, you can upper bound the optimal value by looking at the eigenvalues of $D$. In particular, the eigenvalues of $D$ are precisely the diagonal elements of $D$ (since it is a diagonal matrix per your assumption). Without loss of generality, let us assume $d_1\ge d_2\ge \cdots \ge d_n$. Then the eigenvector associated with eigenvalue $d_i$ is $e_i$, the $i$th standard basis vector. Let $x=\frac{1}{\sqrt{d_i}}e_i$. Then we remark that $x$ is feasible for your original optimization problem, since \begin{equation*} x^\top Dx = \frac{1}{d_i}e_i^\top De_i = \frac{1}{d_i}e_i^\top (d_i e_i) = e_i^\top e_i = 1. \end{equation*} The corresponding objective value is \begin{equation*} \|x-z\|_2^2 = \left\|\frac{1}{\sqrt{d_i}}e_i - z\right\|_2^2. \end{equation*} This value trivially upper bounds the optimal objective value of the minimization problem. Since this holds for all $i\in\{1,2,\dots,n\}$, we conclude that the following upper bounds the optimal value of the problem: \begin{equation*} \min_{i\in\{1,2,\dots,n\}}\left\|\frac{1}{\sqrt{d_i}}e_i - z\right\|_2^2. \end{equation*}
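As a rough illustration, the upper bound above can be evaluated in a few lines of Python. The function name `axis_point_upper_bound` is made up for this sketch, and only the candidates $x=\frac{1}{\sqrt{d_i}}e_i$ from the text are tried.

```python
import math


def axis_point_upper_bound(z, d):
    """Sketch: min over i of || e_i / sqrt(d_i) - z ||_2^2 for diagonal entries d > 0."""
    z_norm_sq = sum(zj * zj for zj in z)
    best = math.inf
    for zi, di in zip(z, d):
        # Candidate x = e_i / sqrt(d_i): only coordinate i differs from -z,
        # so swap the i-th term of ||z||^2 for (1/sqrt(d_i) - z_i)^2.
        value = z_norm_sq - zi * zi + (1.0 / math.sqrt(di) - zi) ** 2
        best = min(best, value)
    return best


if __name__ == "__main__":
    # Unit circle with z = (0.3, 0.4); the true squared distance is 0.25.
    print(axis_point_upper_bound([0.3, 0.4], [1.0, 1.0]))
```

This costs $O(n)$ and can serve as a quick sanity check on the value returned by any exact or iterative method.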

With a bit more work, it may be possible to tighten these bounds, or even reformulate your problem differently so as to find an exact solution. I hope this helps give you some ideas.

brenderson
  • 1,462
  • Thank you so much for your response. I was actually aware of the lifting procedure, and I should have mentioned in my question that I am not interested in this relaxation because it scales as $\mathcal O(n^3)$, which is costly for my setting. I was hoping to find something of a smaller order. Nevertheless, I sincerely appreciate your time and effort. Voting up! – Sam Jul 31 '20 at 06:39
  • Okay! I'll try to think of some alternate approaches in my free time. – brenderson Jul 31 '20 at 15:29
  • That is very kind of you. Thank you. – Sam Jul 31 '20 at 21:24
  • Looks good! Did you use \underset to make the changes? – brenderson Aug 04 '20 at 16:11
  • @brenderson Indeed, I did. – Rodrigo de Azevedo Aug 05 '20 at 07:03