Positive Semi Definite
PSD
Any matrix $A_{d \times d}$ is said to be PSD if $x^TAx \geq$ 0 $\forall$ vector $x_{d \times 1}$.
Examples
Identity matrix $I_{d \times d}$:
$$ A = \begin{bmatrix} 1 & 0 & 0 & 0 \newline 0 & 1 & 0 & 0 \newline 0 & 0 & 1 & 0 \newline 0 & 0 & 0 & 1 \end{bmatrix} $$
Then
\begin{align} x^TAx &= \sum_{i = 1}^d \sum_{i = 1}^d A_{i,j}x_ix_j \newline &= \sum_{i = 1}^d A_{i,i} x_i^2\newline &= \sum_{i = 1}^d x_i^2 \newline &\geq 0 \end{align}
Simple Matrix
$$ A = \begin{bmatrix} 2 & -1 \newline -1 & 1 \newline \end{bmatrix} $$
\begin{align} x^TAX &= \begin{bmatrix}x_1 & x_2\end{bmatrix} \begin{bmatrix}2 & -1 \newline -1 & 1\end{bmatrix} \begin{bmatrix}x_1 \newline x_2\end{bmatrix}\newline &= 2x_1^2 + x_2^2 - 2x_1x_2 \newline &= (x_1-x_2)^2 + x_1^2 \newline &\geq 0 \end{align}
Diagonal Matrix
$$ A = \begin{bmatrix} A_{11} & … & 0 \newline 0 & A_{22} & … \newline … & … & … \newline 0 & 0 & A_{dd} \end{bmatrix} $$
$$x^TAX = \sum_{i=1}^d A_{ii}x_i^2$$
If we conisder vector x to be \begin{bmatrix} 1 \newline 0 \newline … \newline 0 \newline \end{bmatrix}
$A_{11}$ has to be $\geq$ 0, same logic applies to other A’s.
So if All $A_{ii} \geq$ 0, then A is PSD.
Properties of PSD
- For any matrix A that is PSD, cA is also PSD, if c $\geq$ 0.
- Given two PSD matrices, A and B, (A + B) is also PSD
- Proof: $x^T(A + B)X = x^TAx + x^TBX \geq 0 \forall x \subseteq R^d$
- A matrix M is PSD iff M can be written as $M = UU^T$ for some matrix U.
- Proof: $UU^T$ is square
- $x^TMx = x^TUU^Tx = (U^Tx)^T U^Tx = ||U^Tx||^2 \geq 0$
Convexity
- convexity: a function f: R $\rightarrow$ R is convex iff its second derivative $\geq$ 0 everywhere
- A function f: R $\rightarrow$ R is convex iff its Hessian is PSD
Examples
Example 1
Let $x \in R^d$$
$$f(x) = ||x||^2 = x^T x$$
$$\nabla f = 2x$$
$$\nabla^2 f = 2I$$
$\therefore$ it is PSD, since c $\geq$ 0 and I is PSD.
Example 2
Let $x = M_{1 \times d}$ and $u = M_{d \times 1}$
$$f(x) = (ux)^2$$
$$\nabla f = 2uxu$$
$$\nabla^2 f = 2uu^T$$
Since $uu^T$ is PSD, $2uu^T$ is also PSD.
Example 3
\begin{align} L(w) &= \sum_{i=1}^n (y^{(i)} - (w \cdot x^{(i)}))^2 \newline &= (y - xw)^T(y - xw) \newline &= y^Ty - 2y^Txw + w^Tx^Txw \end{align}
Then
$$\nabla f = 2y^Tx + 2w^Tx^Tx$ using $\frac {dx^TAx}{dx} = x^T(A + A^T)$$
$$\nabla^2 f = 2x^Tx$$
Since $x^Tx$ is PSD, $2x^Tx$ is also PSD.
Example 4
$$L(w) = \sum_{i=1}^n \ln (1 + e^{-y^{(i)} w \cdot x^{(i)}})$$
\begin{align} \frac {\partial L(w)} {\partial w_j} &= \sum_{i=1}^n \frac{e^{-y^{(i)} w \cdot x^{(i)}}(-y^{(i)}x_j^{(i)})}{1 + e^{-y^{(i)} w \cdot x^{(i)}}} \newline &= \sum_{i=1}^n \frac{-y^{(i)}x_j^{(i)}}{1 + e^{y^{(i)} w \cdot x^{(i)}}} \end{align}
\begin{align} \frac {\partial L(w)} {\partial w_j w_k} &= \sum_{i=1}^n \frac{e^{y^{(i)} w \cdot x^{(i)}}y^{(i)}x_j^{(i)}y^{(i)}x_k^{(i)}}{(1 + e^{y^{(i)} w \cdot x^{(i)}})^2} \newline &= \sum_{i=1}^n \frac{e^{y^{(i)} w \cdot x^{(i)}}y^{(i)2} x_j^{(i)}x_k^{(i)}}{(1 + e^{y^{(i)} w \cdot x^{(i)}})^2} \newline \end{align}
So Hessian
$$H = \sum_{i=1}^n \frac{e^{y^{(i)} w \cdot x^{(i)}}y^{(i)2}(x^{(i)})^T(x^{(i)})}{(1 + e^{y^{(i)} w \cdot x^{(i)}})^2}$$
$\therefore$ H is PSD.
Example 5
Let $u = M_{d \times 1}$
$$f(x) = e^{u \cdot w}$$
$$\nabla f = e^{u \cdot w}u$$
$$\nabla^2 f = e^{u \cdot w}u^Tu$$
Since $u^Tu$ is PSD, H is PSD.
Example 6
$$L(x) = \sum_{i=1}^n (y^{(i)} - (w \cdot x^{(i)}))^2 + \lambda ||w||^2$$
$$\frac{\partial L}{\partial w_j} = -2 \sum_{i=1}^n (y^{(i)} - (w \cdot x^{(i)}))x_j^{(i)} + 2 \lambda w_j$$
$$\frac{\partial L}{\partial w_j \partial w_k} = \begin{cases} 2 \sum_{i=1}^n x_j^{(i)2} + 2\lambda \text{, if k = j} \newline 2 \sum_{i=1}^n x_j^{(i)}x_k^{(i)} \text{, if k != j} \end{cases} $$
$$H = 2 x^Tx + 2\lambda I$$
$\therefore$ H is PSD.