Knapsack

Problem: Each request has value $v_i$ and weight $w_i$, and we have the constraint that the total weight of selected requests is $\leq W$. Greedy doesn't work: sorting the requests by weight in either decreasing or increasing order doesn't produce the optimal solution, e.g. weights {W/2 + 1, W/2, W/2} (decreasing order fails) and {1, W/2, W/2} (increasing order fails). Using dynamic programming, we can again reduce this problem to whether or not each request belongs to the optimal solution O.……
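A minimal sketch of the DP idea, assuming the standard 0/1 knapsack recurrence over "in O / not in O" (function and variable names are mine, not from the post):

```python
def knapsack(values, weights, W):
    """0/1 knapsack via dynamic programming.

    M[i][w] = best value achievable using the first i requests
    with remaining capacity w.
    """
    n = len(values)
    M = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v, wt = values[i - 1], weights[i - 1]
        for w in range(W + 1):
            M[i][w] = M[i - 1][w]  # request i not in O
            if wt <= w:            # request i in O, if it fits
                M[i][w] = max(M[i][w], v + M[i - 1][w - wt])
    return M[n][W]

print(knapsack([60, 100, 120], [10, 20, 30], 50))  # 220 (take the 100 and 120)
```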

Read More

Segmented Least Square

Problem: Data $P = (x_1,y_1), (x_2, y_2) \dots (x_n, y_n)$ with $x_1 < x_2 < \dots < x_n$. Given a line $y = ax + b$, the error is $$Error(L, P) = \sum_{i=1}^n(y_i - ax_i - b)^2$$ and the minimizing line has the closed-form solution $$a = \frac{n\sum_i x_iy_i - (\sum_i x_i)(\sum_i y_i)}{n\sum_i x_i^2 - (\sum_i x_i)^2}$$ and $$b = \frac {\sum_i y_i - a \sum_i x_i}{n}$$ However, if the data looks as shown below (perhaps better fitted with two lines), we can't just use the above formula.……
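For the single-line case, the closed-form solution translates directly into code; a minimal sketch (the name `fit_line` is mine):

```python
def fit_line(points):
    """Closed-form least-squares line y = a*x + b for a list of (x, y) pairs."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

print(fit_line([(0, 1), (1, 3), (2, 5)]))  # (2.0, 1.0): recovers y = 2x + 1
```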

Read More

Weighted Interval Scheduling

Problem: Instead of just fitting as many tasks as possible, as in the regular interval scheduling problem, each interval now has an associated weight, and we want to maximize the total value. So: $f_i$ is the finish time (intervals sorted in non-decreasing order of finish time), $v_i$ is the value, and $s_i$ is the start time; we want a compatible set $S \subseteq \{1 \dots n\}$. We also define $p(j)$ as the largest $i < j$ that is disjoint from (compatible with) $j$. Algorithm Idea:……
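A minimal sketch of the standard recurrence $OPT(j) = \max(OPT(j-1),\ v_j + OPT(p(j)))$, which I'm assuming is where the post's "Algorithm Idea" is headed (names are mine; compatibility here means finish time $\leq$ start time):

```python
import bisect

def weighted_interval(intervals):
    """Max total value of a compatible subset; intervals = [(s, f, v), ...]."""
    intervals = sorted(intervals, key=lambda t: t[1])  # sort by finish time f_i
    finishes = [f for _, f, _ in intervals]
    n = len(intervals)
    OPT = [0] * (n + 1)
    for j in range(1, n + 1):
        s_j, _, v_j = intervals[j - 1]
        # p(j): largest 1-indexed i < j with f_i <= s_j, 0 if none
        p_j = bisect.bisect_right(finishes, s_j, 0, j - 1)
        # Either skip interval j, or take v_j plus the best compatible prefix
        OPT[j] = max(OPT[j - 1], v_j + OPT[p_j])
    return OPT[n]

print(weighted_interval([(0, 3, 2), (2, 5, 4), (4, 7, 4)]))  # 6 (intervals 1 and 3)
```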

Read More

Gradient Descent

Gradient Descent: A simple algorithm that steps "downhill" along the negative gradient of the function. Algebraically: $$w_{t+1} = w_t - \eta \nabla f(w_t)$$ where $\eta$ is called the learning rate or step size. Step size: if $\eta$ is too small, convergence is slow; if $\eta$ is too large, the solution will bounce around. In practice: set $\eta$ to be a small constant, or use backtracking line search (works when $\nabla f$ is continuous) with parameters $\bar{\alpha}$, $c \in (0,1)$, $\rho \in (0,1)$.……
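A minimal sketch of the update rule with backtracking line search, using the $\bar{\alpha}$, $c$, $\rho$ parameters named above; I'm assuming the standard Armijo sufficient-decrease condition, which is the usual choice for this parameterization:

```python
import numpy as np

def gradient_descent(f, grad, w0, alpha_bar=1.0, c=1e-4, rho=0.5,
                     tol=1e-8, max_iter=1000):
    """Gradient descent with backtracking (Armijo) line search."""
    w = np.asarray(w0, dtype=float)
    for _ in range(max_iter):
        g = grad(w)
        if np.linalg.norm(g) < tol:
            break
        eta = alpha_bar
        # Shrink eta by rho until the sufficient-decrease condition holds:
        # f(w - eta*g) <= f(w) - c * eta * ||g||^2
        while f(w - eta * g) > f(w) - c * eta * g.dot(g):
            eta *= rho
        w = w - eta * g
    return w

# Example: minimize f(w) = ||w||^2, whose minimum is at the origin
w_star = gradient_descent(lambda w: w.dot(w), lambda w: 2 * w, [3.0, -4.0])
print(w_star)  # approximately [0. 0.]
```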

Read More

Convex Optimization

Gradient && Hessian: The gradient of a function $f$, a $d \times 1$ vector, can be represented as follows: $$ \nabla f(x) = \begin{bmatrix} \frac {\partial f(x)} {\partial x_1} \newline \vdots \newline \frac {\partial f(x)} {\partial x_d} \end{bmatrix} $$ and the Hessian, a $d \times d$ matrix, can be represented as $$ \nabla^2 f(x) = \begin{bmatrix} \frac {\partial^2 f(x)} {\partial x_1^2} & \frac {\partial^2 f(x)} {\partial x_1 \partial x_2} & \dots & \frac {\partial^2 f(x)} {\partial x_1 \partial x_d} \newline \vdots & \vdots & \ddots & \vdots \newline \frac {\partial^2 f(x)} {\partial x_d \partial x_1} & \frac {\partial^2 f(x)} {\partial x_d \partial x_2} & \dots & \frac {\partial^2 f(x)} {\partial x_d^2} \end{bmatrix} $$……
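Hand-derived gradients are easy to get wrong, so a finite-difference check against these definitions is handy; a minimal sketch using central differences (the step $h$ and names are my choices):

```python
import numpy as np

def numerical_gradient(f, x, h=1e-5):
    """Central-difference estimate of the d x 1 gradient vector."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

# f(x) = x1^2 + 3*x1*x2 has gradient (2*x1 + 3*x2, 3*x1)
f = lambda x: x[0]**2 + 3 * x[0] * x[1]
print(numerical_gradient(f, [1.0, 2.0]))  # approximately [8. 3.]
```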

Read More

Positive Semi Definite

PSD Definition: Any matrix $A_{d \times d}$ is said to be PSD if $x^TAx \geq 0$ $\forall$ vectors $x_{d \times 1}$. Examples: the identity matrix $I_{d \times d}$, e.g. for $d = 4$: $$ A = \begin{bmatrix} 1 & 0 & 0 & 0 \newline 0 & 1 & 0 & 0 \newline 0 & 0 & 1 & 0 \newline 0 & 0 & 0 & 1 \end{bmatrix} $$ Then \begin{align} x^TAx &= \sum_{i = 1}^d \sum_{j = 1}^d A_{i,j}x_ix_j \newline &= \sum_{i = 1}^d A_{i,i} x_i^2 \newline &= \sum_{i = 1}^d x_i^2 \newline &\geq 0 \end{align}……
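In code, a common way to test the $x^TAx \geq 0$ condition is via the eigenvalues of the symmetric part of $A$ (since $x^TAx = x^T \frac{A + A^T}{2} x$); a minimal sketch, with a tolerance of my choosing:

```python
import numpy as np

def is_psd(A, tol=1e-10):
    """Check x^T A x >= 0 for all x: a symmetric matrix is PSD
    iff all of its (real) eigenvalues are nonnegative."""
    S = (A + A.T) / 2  # x^T A x == x^T S x, so testing S suffices
    return bool(np.all(np.linalg.eigvalsh(S) >= -tol))

print(is_psd(np.eye(4)))                    # True: the identity example above
print(is_psd(np.array([[1, 0], [0, -1]])))  # False: x = (0, 1) gives x^T A x = -1
```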

Read More

Logistic Regression

Uncertainty in Prediction: Related to Linear Regression. The available features $x$ do not contain enough information to perfectly predict $y$; for example, $x$ = the medical record of a patient at risk for a disease, $y$ = whether they will contract the disease in the next 5 years. Model: we are still going to use a linear model for conditional probability estimation: $$w_1x_1 + w_2x_2 + \dots + w_dx_d + b = w \cdot x + b$$……
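The excerpt stops at the linear score $w \cdot x + b$; in logistic regression that score is passed through the sigmoid to turn it into a probability in $(0, 1)$. A minimal sketch (function name is mine):

```python
import numpy as np

def predict_proba(w, b, x):
    """Model P(y = 1 | x) as sigmoid(w . x + b).

    The sigmoid squashes the unbounded linear score into (0, 1),
    which the raw score alone cannot guarantee.
    """
    z = np.dot(w, x) + b
    return 1.0 / (1.0 + np.exp(-z))

# z = 0.5*1 - 0.2*2 + 0.1 = 0.2, so the probability is sigmoid(0.2) ~ 0.55
print(predict_proba(np.array([0.5, -0.2]), 0.1, np.array([1.0, 2.0])))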

Read More

Matrix

Determinant Calculation: Determinant of a 2x2 matrix $$ A= \begin{bmatrix} a & b \newline c & d \end{bmatrix} $$ $|A| = \det(A) = ad - bc$. Determinant of a 3x3 matrix, also called expansion of the determinant by the first row. Link. $$ B= \begin{bmatrix} a & b & c \newline d & e & f \newline g & h & k \end{bmatrix} $$ $|B| = \det(B) = a\begin{vmatrix} e & f \newline h & k \end{vmatrix} -b\begin{vmatrix} d & f \newline g & k \end{vmatrix} +c\begin{vmatrix} d & e \newline g & h \end{vmatrix}$……
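Both formulas translate directly into code; a minimal sketch of the first-row expansion (function names are mine):

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

def det3(m):
    """3x3 determinant via expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, k) = m
    return a * det2(e, f, h, k) - b * det2(d, f, g, k) + c * det2(d, e, g, h)

print(det3([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))  # -3
```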

Read More

Linear Regression

Basic Idea: Fit a line to a bunch of points. Example: without extra information, we predict the mean GPA of 2.47, and the average squared error $\mathbb{E} [(studentGPA - predictedGPA)^2]$ equals the variance. If we have SAT scores, then we can fit a line; predicting based on this line, the MSE drops to 0.43. This is a regression problem with predictor variable SAT score and response variable college GPA. Formula for $\mathbb{R}$: $$y = ax + b$$……
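A small sketch of the mean-vs-line comparison; the (SAT, GPA) numbers below are made up for illustration and are not the post's data:

```python
import numpy as np

# Hypothetical (SAT, GPA) pairs, for illustration only
sat = np.array([1000.0, 1100.0, 1200.0, 1300.0, 1400.0])
gpa = np.array([2.0, 2.3, 2.5, 2.8, 3.1])

mse_mean = np.mean((gpa - gpa.mean()) ** 2)     # predicting the mean: MSE = variance
a, b = np.polyfit(sat, gpa, deg=1)              # least-squares line y = a*x + b
mse_line = np.mean((gpa - (a * sat + b)) ** 2)  # MSE after fitting the line

print(mse_mean, mse_line)  # the line's MSE is smaller than the variance
```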

Read More