Knapsack

Problem: Each request has value $v_i$ and weight $w_i$, and we have the constraint that the total weight of selected requests is $\leq W$. Greedy doesn't work: sorting the requests by weight in either decreasing or increasing order doesn't produce the optimal solution, e.g. weights {W/2 + 1, W/2, W/2} (decreasing order fails) and {1, W/2, W/2} (increasing order fails). Using dynamic programming, we can again reduce this problem to whether or not each request belongs to the optimal solution O.……
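A minimal sketch of the DP idea, assuming the standard 0/1 knapsack recurrence over "in O / not in O" (function and variable names are mine, not from the post):

```python
def knapsack(values, weights, W):
    """0/1 knapsack via dynamic programming.

    M[i][w] = best value achievable using the first i requests
    with remaining capacity w.
    """
    n = len(values)
    M = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v, wt = values[i - 1], weights[i - 1]
        for w in range(W + 1):
            M[i][w] = M[i - 1][w]  # request i not in O
            if wt <= w:            # request i in O, if it fits
                M[i][w] = max(M[i][w], v + M[i - 1][w - wt])
    return M[n][W]

print(knapsack([60, 100, 120], [10, 20, 30], 50))  # 220 (take the 100 and 120)
```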

Read More

Segmented Least Square

Problem: Data $P = (x_1,y_1), (x_2, y_2) \dots (x_n, y_n)$ with $x_1 < x_2 < \dots < x_n$. Given a line $y = ax + b$, the error is $$Error(L, P) = \sum_{i=1}^n(y_i - ax_i - b)^2$$ and the minimizing line has the closed-form solution $$a = \frac{n\sum_i x_iy_i - (\sum_i x_i)(\sum_i y_i)}{n\sum_i x_i^2 - (\sum_i x_i)^2}$$ and $$b = \frac {\sum_i y_i - a \sum_i x_i}{n}$$ However, if the data looks as shown below (perhaps better fitted with two lines), we can't just use the above formula.……
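For the single-line case, the closed-form solution translates directly into code; a minimal sketch (the name `fit_line` is mine):

```python
def fit_line(points):
    """Closed-form least-squares line y = a*x + b for a list of (x, y) pairs."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

print(fit_line([(0, 1), (1, 3), (2, 5)]))  # (2.0, 1.0): recovers y = 2x + 1
```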

Read More

Weighted Interval Scheduling

Problem: Instead of just fitting as many tasks as possible, as in the regular interval scheduling problem, each interval now has an associated weight, and we want to maximize the total value. So: $f_i$ is the finish time (intervals sorted in non-decreasing order of finish time), $v_i$ is the value, and $s_i$ is the start time; we want a compatible set $S \subseteq \{1 \dots n\}$. We also define $p(j)$ as the largest $i < j$ that is disjoint from (compatible with) $j$. Algorithm Idea:……
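A minimal sketch of the standard recurrence $OPT(j) = \max(OPT(j-1),\ v_j + OPT(p(j)))$, which I'm assuming is where the post's "Algorithm Idea" is headed (names are mine; compatibility here means finish time $\leq$ start time):

```python
import bisect

def weighted_interval(intervals):
    """Max total value of a compatible subset; intervals = [(s, f, v), ...]."""
    intervals = sorted(intervals, key=lambda t: t[1])  # sort by finish time f_i
    finishes = [f for _, f, _ in intervals]
    n = len(intervals)
    OPT = [0] * (n + 1)
    for j in range(1, n + 1):
        s_j, _, v_j = intervals[j - 1]
        # p(j): largest 1-indexed i < j with f_i <= s_j, 0 if none
        p_j = bisect.bisect_right(finishes, s_j, 0, j - 1)
        # Either skip interval j, or take v_j plus the best compatible prefix
        OPT[j] = max(OPT[j - 1], v_j + OPT[p_j])
    return OPT[n]

print(weighted_interval([(0, 3, 2), (2, 5, 4), (4, 7, 4)]))  # 6 (intervals 1 and 3)
```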

Read More

Gradient Descent

Gradient Descent: A simple algorithm that steps "downhill" along the negative gradient of the function. Algebraically: $$w_{t+1} = w_t - \eta \nabla f(w_t)$$ where $\eta$ is called the learning rate or step size. Step size: if $\eta$ is too small, convergence is slow; if $\eta$ is too large, the solution will bounce around. In practice: set $\eta$ to be a small constant, or use backtracking line search (works when $\nabla f$ is continuous) with parameters $\bar{\alpha}$, $c \in (0,1)$, $\rho \in (0,1)$.……
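A minimal sketch of the update rule with backtracking line search, using the $\bar{\alpha}$, $c$, $\rho$ parameters named above; I'm assuming the standard Armijo sufficient-decrease condition, which is the usual choice for this parameterization:

```python
import numpy as np

def gradient_descent(f, grad, w0, alpha_bar=1.0, c=1e-4, rho=0.5,
                     tol=1e-8, max_iter=1000):
    """Gradient descent with backtracking (Armijo) line search."""
    w = np.asarray(w0, dtype=float)
    for _ in range(max_iter):
        g = grad(w)
        if np.linalg.norm(g) < tol:
            break
        eta = alpha_bar
        # Shrink eta by rho until the sufficient-decrease condition holds:
        # f(w - eta*g) <= f(w) - c * eta * ||g||^2
        while f(w - eta * g) > f(w) - c * eta * g.dot(g):
            eta *= rho
        w = w - eta * g
    return w

# Example: minimize f(w) = ||w||^2, whose minimum is at the origin
w_star = gradient_descent(lambda w: w.dot(w), lambda w: 2 * w, [3.0, -4.0])
print(w_star)  # approximately [0. 0.]
```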

Read More

Convex Optimization

Gradient && Hessian: The gradient of a function $f$, a $d \times 1$ vector, can be represented as follows: $$ \nabla f(x) = \begin{bmatrix} \frac {\partial f(x)} {\partial x_1} \newline \vdots \newline \frac {\partial f(x)} {\partial x_d} \end{bmatrix} $$ and the Hessian, a $d \times d$ matrix, can be represented as $$ \nabla^2 f(x) = \begin{bmatrix} \frac {\partial^2 f(x)} {\partial x_1^2} & \frac {\partial^2 f(x)} {\partial x_1 \partial x_2} & \dots & \frac {\partial^2 f(x)} {\partial x_1 \partial x_d} \newline \vdots & \vdots & \ddots & \vdots \newline \frac {\partial^2 f(x)} {\partial x_d \partial x_1} & \frac {\partial^2 f(x)} {\partial x_d \partial x_2} & \dots & \frac {\partial^2 f(x)} {\partial x_d^2} \end{bmatrix} $$……
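Hand-derived gradients are easy to get wrong, so a finite-difference check against these definitions is handy; a minimal sketch using central differences (the step $h$ and names are my choices):

```python
import numpy as np

def numerical_gradient(f, x, h=1e-5):
    """Central-difference estimate of the d x 1 gradient vector."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

# f(x) = x1^2 + 3*x1*x2 has gradient (2*x1 + 3*x2, 3*x1)
f = lambda x: x[0]**2 + 3 * x[0] * x[1]
print(numerical_gradient(f, [1.0, 2.0]))  # approximately [8. 3.]
```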

Read More

Positive Semi Definite

PSD Definition: Any matrix $A_{d \times d}$ is said to be PSD if $x^TAx \geq 0$ $\forall$ vectors $x_{d \times 1}$. Examples: the identity matrix $I_{d \times d}$, e.g. for $d = 4$: $$ A = \begin{bmatrix} 1 & 0 & 0 & 0 \newline 0 & 1 & 0 & 0 \newline 0 & 0 & 1 & 0 \newline 0 & 0 & 0 & 1 \end{bmatrix} $$ Then \begin{align} x^TAx &= \sum_{i = 1}^d \sum_{j = 1}^d A_{i,j}x_ix_j \newline &= \sum_{i = 1}^d A_{i,i} x_i^2 \newline &= \sum_{i = 1}^d x_i^2 \newline &\geq 0 \end{align}……
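In code, a common way to test the $x^TAx \geq 0$ condition is via the eigenvalues of the symmetric part of $A$ (since $x^TAx = x^T \frac{A + A^T}{2} x$); a minimal sketch, with a tolerance of my choosing:

```python
import numpy as np

def is_psd(A, tol=1e-10):
    """Check x^T A x >= 0 for all x: a symmetric matrix is PSD
    iff all of its (real) eigenvalues are nonnegative."""
    S = (A + A.T) / 2  # x^T A x == x^T S x, so testing S suffices
    return bool(np.all(np.linalg.eigvalsh(S) >= -tol))

print(is_psd(np.eye(4)))                    # True: the identity example above
print(is_psd(np.array([[1, 0], [0, -1]])))  # False: x = (0, 1) gives x^T A x = -1
```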

Read More

Logistic Regression

Uncertainty in Prediction: Related to Linear Regression. The available features $x$ do not contain enough information to perfectly predict $y$; for example, $x$ = the medical record of a patient at risk for a disease, $y$ = whether they will contract the disease in the next 5 years. Model: we are still going to use a linear model for conditional probability estimation: $$w_1x_1 + w_2x_2 + \dots + w_dx_d + b = w \cdot x + b$$……
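The excerpt stops at the linear score $w \cdot x + b$; in logistic regression that score is passed through the sigmoid to turn it into a probability in $(0, 1)$. A minimal sketch (function name is mine):

```python
import numpy as np

def predict_proba(w, b, x):
    """Model P(y = 1 | x) as sigmoid(w . x + b).

    The sigmoid squashes the unbounded linear score into (0, 1),
    which the raw score alone cannot guarantee.
    """
    z = np.dot(w, x) + b
    return 1.0 / (1.0 + np.exp(-z))

# z = 0.5*1 - 0.2*2 + 0.1 = 0.2, so the probability is sigmoid(0.2) ~ 0.55
print(predict_proba(np.array([0.5, -0.2]), 0.1, np.array([1.0, 2.0])))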

Read More

Matrix

Determinant Calculation: Determinant of a 2x2 matrix $$ A= \begin{bmatrix} a & b \newline c & d \end{bmatrix} $$ $|A| = \det(A) = ad - bc$. Determinant of a 3x3 matrix, also called expansion of the determinant by the first row. Link. $$ B= \begin{bmatrix} a & b & c \newline d & e & f \newline g & h & k \end{bmatrix} $$ $|B| = \det(B) = a\begin{vmatrix} e & f \newline h & k \end{vmatrix} -b\begin{vmatrix} d & f \newline g & k \end{vmatrix} +c\begin{vmatrix} d & e \newline g & h \end{vmatrix}$……
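Both formulas translate directly into code; a minimal sketch of the first-row expansion (function names are mine):

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

def det3(m):
    """3x3 determinant via expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, k) = m
    return a * det2(e, f, h, k) - b * det2(d, f, g, k) + c * det2(d, e, g, h)

print(det3([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))  # -3
```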

Read More

Linear Regression

Basic Idea: Fit a line to a bunch of points. Example: without extra information, we predict the mean GPA of 2.47, and the average squared error $\mathbb{E} [(studentGPA - predictedGPA)^2]$ equals the variance. If we have SAT scores, then we can fit a line; predicting based on this line, the MSE drops to 0.43. This is a regression problem with predictor variable SAT score and response variable college GPA. Formula for $\mathbb{R}$: $$y = ax + b$$……
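A small sketch of the mean-vs-line comparison; the (SAT, GPA) numbers below are made up for illustration and are not the post's data:

```python
import numpy as np

# Hypothetical (SAT, GPA) pairs, for illustration only
sat = np.array([1000.0, 1100.0, 1200.0, 1300.0, 1400.0])
gpa = np.array([2.0, 2.3, 2.5, 2.8, 3.1])

mse_mean = np.mean((gpa - gpa.mean()) ** 2)     # predicting the mean: MSE = variance
a, b = np.polyfit(sat, gpa, deg=1)              # least-squares line y = a*x + b
mse_line = np.mean((gpa - (a * sat + b)) ** 2)  # MSE after fitting the line

print(mse_mean, mse_line)  # the line's MSE is smaller than the variance
```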

Read More