PSD

differential equation

Any matrix $A_{d \times d}$ is said to be PSD if $x^TAx \geq$ 0 $\forall$ vector $x_{d \times 1}$.

Examples

Identity matrix $I_{d \times d}$:

$$ A = \begin{bmatrix} 1 & 0 & 0 & 0 \newline 0 & 1 & 0 & 0 \newline 0 & 0 & 1 & 0 \newline 0 & 0 & 0 & 1 \end{bmatrix} $$

Then

\begin{align} x^TAx &= \sum_{i = 1}^d \sum_{i = 1}^d A_{i,j}x_ix_j \newline &= \sum_{i = 1}^d A_{i,i} x_i^2\newline &= \sum_{i = 1}^d x_i^2 \newline &\geq 0 \end{align}

Simple Matrix

$$ A = \begin{bmatrix} 2 & -1 \newline -1 & 1 \newline \end{bmatrix} $$

\begin{align} x^TAX &= \begin{bmatrix}x_1 & x_2\end{bmatrix} \begin{bmatrix}2 & -1 \newline -1 & 1\end{bmatrix} \begin{bmatrix}x_1 \newline x_2\end{bmatrix}\newline &= 2x_1^2 + x_2^2 - 2x_1x_2 \newline &= (x_1-x_2)^2 + x_1^2 \newline &\geq 0 \end{align}

Diagonal Matrix

$$ A = \begin{bmatrix} A_{11} & … & 0 \newline 0 & A_{22} & … \newline … & … & … \newline 0 & 0 & A_{dd} \end{bmatrix} $$

$$x^TAX = \sum_{i=1}^d A_{ii}x_i^2$$

If we conisder vector x to be \begin{bmatrix} 1 \newline 0 \newline … \newline 0 \newline \end{bmatrix}

$A_{11}$ has to be $\geq$ 0, same logic applies to other A’s.

So if All $A_{ii} \geq$ 0, then A is PSD.

Properties of PSD

  1. For any matrix A that is PSD, cA is also PSD, if c $\geq$ 0.
  2. Given two PSD matrices, A and B, (A + B) is also PSD
    • Proof: $x^T(A + B)X = x^TAx + x^TBX \geq 0 \forall x \subseteq R^d$
  3. A matrix M is PSD iff M can be written as $M = UU^T$ for some matrix U.
    • Proof: $UU^T$ is square
    • $x^TMx = x^TUU^Tx = (U^Tx)^T U^Tx = ||U^Tx||^2 \geq 0$

Convexity

  1. convexity: a function f: R $\rightarrow$ R is convex iff its second derivative $\geq$ 0 everywhere
  2. A function f: R $\rightarrow$ R is convex iff its Hessian is PSD

Examples

Example 1

Let $x \in R^d$$

$$f(x) = ||x||^2 = x^T x$$

$$\nabla f = 2x$$

$$\nabla^2 f = 2I$$

$\therefore$ it is PSD, since c $\geq$ 0 and I is PSD.

Example 2

Let $x = M_{1 \times d}$ and $u = M_{d \times 1}$

$$f(x) = (ux)^2$$

$$\nabla f = 2uxu$$

$$\nabla^2 f = 2uu^T$$

Since $uu^T$ is PSD, $2uu^T$ is also PSD.

Example 3

\begin{align} L(w) &= \sum_{i=1}^n (y^{(i)} - (w \cdot x^{(i)}))^2 \newline &= (y - xw)^T(y - xw) \newline &= y^Ty - 2y^Txw + w^Tx^Txw \end{align}

Then

$$\nabla f = 2y^Tx + 2w^Tx^Tx$ using $\frac {dx^TAx}{dx} = x^T(A + A^T)$$

$$\nabla^2 f = 2x^Tx$$

Since $x^Tx$ is PSD, $2x^Tx$ is also PSD.

Example 4

$$L(w) = \sum_{i=1}^n \ln (1 + e^{-y^{(i)} w \cdot x^{(i)}})$$

\begin{align} \frac {\partial L(w)} {\partial w_j} &= \sum_{i=1}^n \frac{e^{-y^{(i)} w \cdot x^{(i)}}(-y^{(i)}x_j^{(i)})}{1 + e^{-y^{(i)} w \cdot x^{(i)}}} \newline &= \sum_{i=1}^n \frac{-y^{(i)}x_j^{(i)}}{1 + e^{y^{(i)} w \cdot x^{(i)}}} \end{align}

\begin{align} \frac {\partial L(w)} {\partial w_j w_k} &= \sum_{i=1}^n \frac{e^{y^{(i)} w \cdot x^{(i)}}y^{(i)}x_j^{(i)}y^{(i)}x_k^{(i)}}{(1 + e^{y^{(i)} w \cdot x^{(i)}})^2} \newline &= \sum_{i=1}^n \frac{e^{y^{(i)} w \cdot x^{(i)}}y^{(i)2} x_j^{(i)}x_k^{(i)}}{(1 + e^{y^{(i)} w \cdot x^{(i)}})^2} \newline \end{align}

So Hessian

$$H = \sum_{i=1}^n \frac{e^{y^{(i)} w \cdot x^{(i)}}y^{(i)2}(x^{(i)})^T(x^{(i)})}{(1 + e^{y^{(i)} w \cdot x^{(i)}})^2}$$

$\therefore$ H is PSD.

Example 5

Let $u = M_{d \times 1}$

$$f(x) = e^{u \cdot w}$$

$$\nabla f = e^{u \cdot w}u$$

$$\nabla^2 f = e^{u \cdot w}u^Tu$$

Since $u^Tu$ is PSD, H is PSD.

Example 6

$$L(x) = \sum_{i=1}^n (y^{(i)} - (w \cdot x^{(i)}))^2 + \lambda ||w||^2$$

$$\frac{\partial L}{\partial w_j} = -2 \sum_{i=1}^n (y^{(i)} - (w \cdot x^{(i)}))x_j^{(i)} + 2 \lambda w_j$$

$$\frac{\partial L}{\partial w_j \partial w_k} = \begin{cases} 2 \sum_{i=1}^n x_j^{(i)2} + 2\lambda \text{, if k = j} \newline 2 \sum_{i=1}^n x_j^{(i)}x_k^{(i)} \text{, if k != j} \end{cases} $$

$$H = 2 x^Tx + 2\lambda I$$

$\therefore$ H is PSD.