A Hermitian matrix is positive definite if the scalar $z^{\dagger} M z$ is strictly positive for every nonzero $z$.

But what does the expression $z^{\dagger}Mz$ mean? I am not getting any intuition here. The spectral decomposition has a similar type of expression. Is there any relation to that?

shul

1 Answer

An $n\times n$ complex matrix $A$ is called positive definite if every nonzero complex vector $x \in \mathbb C^n$ fulfills the condition below, where $x^\dagger$ denotes the conjugate transpose of $x$:

$$\Re\left[x^\dagger Ax \right]>0$$

Generally, this concept is applied to Hermitian matrices (symmetric matrices in the real case).

For example, given the matrix

$$A = \begin{bmatrix} 3+3i & -5+0i & 2-9i \\ -5+0i & 6+2i & 7-1i \\ 2-9i & 7-1i & -1+0i \end{bmatrix}$$

and an example of a complex vector

$$x = \begin{bmatrix} 9-1i \\0-3i \\0-1i\end{bmatrix}$$

with conjugate transpose

$$x^\dagger = \begin{bmatrix} 9+1i &0+3i &0+1i\end{bmatrix}$$

we get

$$x^\dagger A x = 315+240i$$

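with $\Re(x^\dagger A x) = 315 >0$ for this particular $x.$ A single vector only illustrates the computation: positive definiteness requires the inequality for every nonzero $x,$ and this $A$ fails it for $x = e_3,$ since its last diagonal entry is $-1.$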

In the case of a Hermitian matrix, $x^\dagger A x$ is always real. For example, with the same $x$ as above and

$$A =\begin{bmatrix} 3 & -5-3i & 2-1i\\ -5+3i & 6 & 7-1i\\ 2+1i & 7+1i & -1 \end{bmatrix}$$

$$x^\dagger A x = 135$$
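The two worked values above can be checked numerically. The following is a minimal sketch using NumPy; the helper name `quad_form` is mine, not part of any library. It also evaluates the Hermitian matrix at the third basis vector to show that a single positive value does not establish positive definiteness.

```python
import numpy as np

# Complex symmetric (not Hermitian) matrix from the first example
A_sym = np.array([[3+3j, -5+0j, 2-9j],
                  [-5+0j, 6+2j, 7-1j],
                  [2-9j, 7-1j, -1+0j]])

# Hermitian matrix from the second example
A_herm = np.array([[3, -5-3j, 2-1j],
                   [-5+3j, 6, 7-1j],
                   [2+1j, 7+1j, -1]])

x = np.array([9-1j, 0-3j, 0-1j])

def quad_form(A, v):
    """Return v^dagger A v (np.vdot conjugates its first argument)."""
    return np.vdot(v, A @ v)

q_sym = quad_form(A_sym, x)    # complex in general
q_herm = quad_form(A_herm, x)  # imaginary part vanishes for Hermitian A

# One positive value is not a proof: the third basis vector picks out
# the diagonal entry -1 of the Hermitian matrix.
e3 = np.array([0, 0, 1])
q_e3 = quad_form(A_herm, e3)

print(q_sym, q_herm, q_e3)
```

So neither example matrix is actually positive definite; they only illustrate how the scalar $x^\dagger A x$ is computed.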

The intuition is that of a quadratic function, which lends itself to finding a global minimum when the result is consistently positive (picture a bowl).

From Prof. Strang's lectures:

$$x^\top A x = \begin{bmatrix}x_1&x_2\end{bmatrix}\begin{bmatrix}a&b\\b&c\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix}=ax_1^2 + 2 b x_1 x_2 +cx_2^2$$

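The idea is that $x_1$ and $x_2$ are variables. Both $x_1^2$ and $x_2^2$ are nonnegative, so when $a$ and $c$ are positive the only term that can turn negative is the cross term $2bx_1x_2.$ The matrix is positive definite exactly when $a>0$ and $ac-b^2>0,$ that is, when $b$ is small enough relative to $a$ and $c.$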
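As a rough illustration of the $2\times 2$ case, the sketch below tests positive definiteness in two ways: through the leading principal minors ($a>0$ and $ac-b^2>0$) and through the eigenvalues of the symmetric matrix. The function names and the sample triples $(a, b, c)$ are my own choices for illustration.

```python
import numpy as np

def is_pd_minors(a, b, c):
    """Check [[a, b], [b, c]] via its leading principal minors."""
    return a > 0 and a * c - b * b > 0

def is_pd_eig(a, b, c):
    """Check the same matrix via the signs of its eigenvalues."""
    M = np.array([[a, b], [b, c]], dtype=float)
    return bool(np.all(np.linalg.eigvalsh(M) > 0))

# (a, b, c) triples: small cross term, then cross terms that are too large
cases = [(2, -1, 2), (1, 2, 1), (2, 3, 2)]
results = [(is_pd_minors(*t), is_pd_eig(*t)) for t in cases]

for t, r in zip(cases, results):
    print(t, r)
```

Only the first triple passes both checks: with $a=c=2$ the cross term coefficient $b=-1$ is small enough, while $b=2$ against $a=c=1$ and $b=3$ against $a=c=2$ are not.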

The most widespread application is in ordinary least squares (OLS) parameter estimation, where given a model matrix $X$ and dependent variable realization $y,$ a vector of parameters or coefficients $\hat \beta$ is calculated in closed form to minimize the sum of squared residuals:

$$\begin{align} e^\top e &= (y - X\hat \beta)^\top(y - X\hat\beta)\\ &=y^\top y - 2 \hat\beta^\top X^\top y + \color{red}{\hat \beta^\top X^\top X\hat \beta} \end{align}$$

by differentiating:

$$\frac{\partial e^\top e}{\partial \hat \beta}=-2 X^\top y + 2 X^\top X \hat \beta =0$$

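$\color{red}{ X^\top X}$ is a Gramian matrix and hence positive semidefinite; it is positive definite when $X$ has full column rank, which makes the minimizer $\hat\beta = (X^\top X)^{-1}X^\top y$ unique.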
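The OLS steps above can be sketched numerically. This is a minimal example on made-up data (a four-row model matrix with an intercept column, and $y$ chosen so that the fit is exact); it solves the normal equations $X^\top X \hat\beta = X^\top y$ and compares the result with NumPy's least-squares routine.

```python
import numpy as np

# Hypothetical data: intercept column plus one regressor, y = 1 + 2 * t
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

gram = X.T @ X                          # Gramian matrix X^T X
eigenvalues = np.linalg.eigvalsh(gram)  # all positive when X has full column rank

# Solve the normal equations X^T X beta = X^T y
beta_hat = np.linalg.solve(gram, X.T @ y)

# Cross-check against NumPy's least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(eigenvalues, beta_hat, beta_lstsq)
```

Solving the linear system is used here instead of forming $(X^\top X)^{-1}$ explicitly, which is the usual numerical preference; the two are mathematically equivalent.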

Also, a function $f : \mathbb R^n \to \mathbb R$ of the form $f(x) = x^\top Ax = \sum_{i,j=1}^n A_{ij} x_ix_j$ is called a quadratic form.
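To make the definition concrete, here is a small sketch that evaluates the quadratic form both as the double sum $\sum_{i,j} A_{ij}x_ix_j$ and as the matrix product $x^\top A x.$ The matrix and vector are arbitrary values picked for illustration.

```python
import numpy as np

A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])
x = np.array([1.0, 3.0])

n = len(x)

# Quadratic form written out as the double sum over i and j
double_sum = sum(A[i, j] * x[i] * x[j] for i in range(n) for j in range(n))

# Same quantity as a matrix product
matrix_form = x @ A @ x

print(double_sum, matrix_form)
```

Both expressions give $2\cdot 1 - 2\cdot 3 + 2\cdot 9 = 14$ for these values.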