Bayes Optimal Classifier

Background: Marginal Distribution. Three ways to sample from $P$: (1) draw $(x, y)$ jointly; (2) draw $y$ according to its marginal distribution, then $x$ according to the conditional distribution of $x \mid y$; (3) draw $x$ according to its marginal distribution, then $y$ according to the conditional distribution of $y \mid x$. Define: $\mu$ = distribution on $X$; $\eta$ = conditional distribution of $y \mid x$. Classifier: a classifier is a map $h : X \rightarrow Y$, with risk $R(h) = \Pr_{(x,y) \sim P}(h(x) \neq y)$……
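The risk $R(h)$ above can be estimated empirically as the error rate of $h$ on samples drawn from $P$. A minimal sketch; the threshold classifier and the sample points are made-up assumptions, not from the post:

```python
# Estimate R(h) = Pr_{(x,y)~P}(h(x) != y) by the error rate on a sample.
def empirical_risk(h, samples):
    # samples: list of (x, y) pairs; count disagreements and average
    return sum(h(x) != y for x, y in samples) / len(samples)

h = lambda x: 1 if x >= 0 else 0  # toy threshold classifier (assumption)
samples = [(-1.0, 0), (0.5, 1), (2.0, 1), (-0.2, 1)]
print(empirical_risk(h, samples))  # 1 error out of 4 -> 0.25
```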

Read More

Marginal Distribution

Marginal Distribution: a function that gives the probability of only a subset of the variables. Mathematically, for the discrete case $$\Pr(X = x) = \sum_y \Pr(X = x, Y = y) = \sum_y \Pr(X = x \mid Y = y) \Pr(Y = y)$$ and for the continuous case $$p_X(x) = \int_y p_{X,Y}(x,y)\, dy = \int_y p_{X|Y}(x \mid y) p_Y(y)\, dy$$ which can also be written as an expected value: $$p_X(x) = \int_y p_{X|Y}(x \mid y) p_Y(y)\, dy = \mathbb{E}_Y[p_{X|Y}(x \mid Y)]$$……
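The discrete formula is just a sum of the joint probabilities over the marginalized variable. A minimal sketch with a made-up joint table (the numbers are assumptions for illustration):

```python
# Discrete marginal: Pr(X = x) = sum_y Pr(X = x, Y = y)
joint = {
    (0, 0): 0.1, (0, 1): 0.3,  # made-up joint probabilities
    (1, 0): 0.2, (1, 1): 0.4,
}

def marginal_x(joint, x):
    # Sum the joint table over all y values for the given x
    return sum(p for (xi, _y), p in joint.items() if xi == x)

print(marginal_x(joint, 0))  # 0.1 + 0.3 = 0.4
print(marginal_x(joint, 1))  # 0.2 + 0.4 = 0.6
```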

Read More

lp norm

Families of Distance Functions: the $l_p$ norm. The most common one is the $l_2$ norm (Euclidean distance): $$||x - z||_2 = \sqrt{\sum_{i=1}^{m}(x_i - z_i)^2}$$ Note: the subscript 2 is sometimes dropped. For $p \geq 1$, the $l_p$ distance is: $$||x - z||_p = \left(\sum_{i=1}^{m}|x_i - z_i|^p\right)^{1/p}$$ Special cases: the $l_1$ distance: $$||x - z||_1 = \sum_{i=1}^{m}|x_i - z_i|$$ and the $l_\infty$ distance: $$||x - z||_\infty = \max_i |x_i - z_i|$$ Metric space: let $X$ be the space in which the data lie.……
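The formulas above translate directly into code. A minimal sketch; the vectors are made-up examples:

```python
def lp_distance(x, z, p):
    # l_p distance: (sum_i |x_i - z_i|^p)^(1/p), for p >= 1
    return sum(abs(a - b) ** p for a, b in zip(x, z)) ** (1 / p)

def linf_distance(x, z):
    # l_inf distance: max_i |x_i - z_i|
    return max(abs(a - b) for a, b in zip(x, z))

x, z = [1.0, 2.0, 3.0], [4.0, 6.0, 3.0]
print(lp_distance(x, z, 2))  # l_2: sqrt(9 + 16) = 5.0
print(lp_distance(x, z, 1))  # l_1: 3 + 4 = 7.0
print(linf_distance(x, z))   # l_inf: 4.0
```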

Read More

Nearest Neighbor Classification

Nearest Neighbor Classification Procedure: assemble a data set (training set). How do we classify a new image $x$? Find its closest neighbor in the training set and give $x$ the same label. Notes: a training set of 60000 images and a test set of 10000 images. How do we determine whether two data points (images) are closest? With a 28 x 28 image, we can stretch it into a vector of length 784.……
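The procedure above can be sketched as a 1-nearest-neighbor classifier with Euclidean distance; the tiny training set here is a made-up stand-in, not the MNIST data from the post:

```python
import math

def euclidean(x, z):
    # l_2 distance between two equal-length vectors
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, z)))

def nearest_neighbor(train, x):
    # train: list of (vector, label); return the label of the closest vector
    vec, label = min(train, key=lambda vl: euclidean(vl[0], x))
    return label

train = [([0.0, 0.0], "a"), ([5.0, 5.0], "b")]
print(nearest_neighbor(train, [1.0, 0.5]))  # closer to [0, 0] -> "a"
```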

Read More

Fast Fourier Transform

Problem: given two vectors $a = (a_0, a_1, \ldots, a_{n-1})$ and $b = (b_0, b_1, \ldots, b_{n-1})$. The convolution $a * b$ is a vector with $2n - 1$ coordinates, where coordinate $k$ is $\sum_{(i,j): i+j=k;\; i,j < n} a_i b_j$, which can be written as $$a * b = (a_0 b_0,\; a_0 b_1 + a_1 b_0,\; a_0 b_2 + a_1 b_1 + a_2 b_0,\; \ldots,\; a_{n-2} b_{n-1} + a_{n-1} b_{n-2},\; a_{n-1} b_{n-1})$$ Or as an $n \times n$ table whose $(i, j)$ entry is $a_i b_j$……
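The definition above gives a direct $O(n^2)$ computation (the FFT approach in the post improves on this); a minimal sketch:

```python
def convolve(a, b):
    # Coordinate k of a*b is the sum of a_i * b_j over all i + j = k
    # with 0 <= i, j < n, giving 2n - 1 coordinates in total.
    n = len(a)
    out = [0] * (2 * n - 1)
    for i in range(n):
        for j in range(n):
            out[i + j] += a[i] * b[j]
    return out

print(convolve([1, 2], [3, 4]))  # [1*3, 1*4 + 2*3, 2*4] = [3, 10, 8]
```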

Read More

Integer Multiplication

Problem (Wiki Explanation). Algorithm: let's say we define $x = x_1 2^{n/2} + x_0$ (and similarly $y = y_1 2^{n/2} + y_0$); then $xy$ becomes \begin{align} xy &= (x_1 2^{n/2} + x_0)(y_1 2^{n/2} + y_0) \newline &= x_1 y_1 2^n + (x_1 y_0 + x_0 y_1)2^{n/2} + x_0 y_0 \end{align} So we have $$T(n) \leq 4T(n/2) + cn$$ which gives $$T(n) = O(n^{\log_2 4}) = O(n^2)$$ However, we can reduce the number of multiplications by observing that $(x_1 + x_0)(y_1 + y_0) = x_1y_1 + x_1y_0 + x_0y_1 + x_0y_0$.……
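That observation yields the three-multiplication (Karatsuba) scheme: compute $x_1 y_1$, $x_0 y_0$, and $(x_1 + x_0)(y_1 + y_0)$, then recover the middle term by subtraction. A minimal sketch, splitting on bit length:

```python
def karatsuba(x, y):
    # Three recursive multiplications instead of four:
    # xy = a*2^(2h) + ((x1+x0)(y1+y0) - a - c)*2^h + c,
    # where a = x1*y1 and c = x0*y0.
    if x < 10 or y < 10:          # small base case: multiply directly
        return x * y
    half = max(x.bit_length(), y.bit_length()) // 2
    x1, x0 = x >> half, x & ((1 << half) - 1)
    y1, y0 = y >> half, y & ((1 << half) - 1)
    a = karatsuba(x1, y1)
    c = karatsuba(x0, y0)
    b = karatsuba(x1 + x0, y1 + y0) - a - c   # middle term by subtraction
    return (a << (2 * half)) + (b << half) + c

print(karatsuba(1234, 5678))  # 7006652, same as 1234 * 5678
```

This gives the recurrence $T(n) \leq 3T(n/2) + cn$, hence $T(n) = O(n^{\log_2 3})$.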

Read More

Closest Point

Problem: given $n$ points in the plane, find the pair that is closest together. Brute force takes $O(n^2)$. Algorithm: let $d(p_i, p_j)$ denote the Euclidean distance. In 1-d, we can simply sort the points and compute the distance between each point and the next, giving $O(n \log n)$ complexity. In 2-d, we can't apply the same approach, so we use divide and conquer: find the closest pair on the left and the closest pair on the right, hoping to combine the results in linear time.……
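The 1-d case mentioned above is short enough to sketch: after sorting, the closest pair must be adjacent, so one linear scan suffices. The point values are made-up examples:

```python
def closest_pair_1d(points):
    # Sort, then the closest pair must be a pair of adjacent points;
    # scan adjacent pairs and keep the one with the smallest gap.
    pts = sorted(points)
    return min(zip(pts, pts[1:]), key=lambda pq: pq[1] - pq[0])

print(closest_pair_1d([7.0, 1.5, 9.0, 2.0, 40.0]))  # (1.5, 2.0)
```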

Read More

Counting Inversions

Problem: applications in ranking, also called collaborative filtering. We compare two rankings and decide how similar they are, i.e., how many pairs are out of order. To quantify this, we count the number of inversions: an inversion is a pair of indices $i < j$ with $a_i > a_j$. Algorithm: brute force gives $T(n) = O(n^2)$. Modified merge sort: by leveraging the merge step of merge sort, we can count the inversions. Basically, when an element from A is appended, there is no inversion.……
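The modified merge sort can be sketched as follows: whenever an element from the right half is placed before remaining elements of the left half, each of those remaining left elements forms one inversion with it.

```python
def count_inversions(arr):
    # Returns (sorted array, inversion count) via a modified merge sort.
    if len(arr) <= 1:
        return arr, 0
    mid = len(arr) // 2
    left, a = count_inversions(arr[:mid])
    right, b = count_inversions(arr[mid:])
    merged, i, j, inv = [], 0, 0, a + b
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1      # no inversion
        else:
            merged.append(right[j]); j += 1
            inv += len(left) - i                # right[j] inverts with all remaining left elements
    merged += left[i:] + right[j:]
    return merged, inv

print(count_inversions([2, 4, 1, 3, 5])[1])  # 3 inversions: (2,1), (4,1), (4,3)
```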

Read More

Merge Sort

Problem: sort the elements. Abstract behavior: 1. divide the input into two pieces of equal size, O(n); 2. solve the two subproblems on these pieces separately by recursion; 3. combine the two results into an overall solution, O(n). Recurrence time complexity (q = 2): $T(n) \leq 2T(n/2) + cn$. To analyze this recurrence, consider the recursion tree: at level $j$ we have $2^j$ nodes, each of size $n/2^j$. Each node takes $cn/2^j$, so level $j$ takes $(cn/2^j) \times 2^j = cn$. There are $\log n$ levels, so $T(n) = O(n \log n)$. General case: for $q > 2$, $T(n) \leq qT(n/2) + cn$……
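The three steps above (divide, recurse, combine) can be sketched directly:

```python
def merge_sort(arr):
    # 1. divide into two halves; 2. recurse; 3. merge in O(n)
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left, right = merge_sort(arr[:mid]), merge_sort(arr[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # merge step
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]          # append the leftover tail

print(merge_sort([5, 2, 4, 1, 3]))  # [1, 2, 3, 4, 5]
```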

Read More

Margin of Error

Z-Score vs T-Score. Z-Score (Link): the z-score's formula is $$z = \frac{X - \mu}{\sigma}$$ where $X$ = sample mean, $\mu$ = population mean, $\sigma$ = population standard deviation. We use the z-score when the sample size is >= 30 or we know the population's mean and SD. (Z Table) T-Score: the t-score's formula is $$T = \frac{X - \mu}{s / \sqrt{n}}$$ where $X$ = sample mean, $\mu$ = population mean, $s$ = sample standard deviation, and $n$ = sample size.……
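The two formulas can be sketched directly; the numeric inputs below are made-up illustrative values, not from the post:

```python
import math

def z_score(x, mu, sigma):
    # z = (X - mu) / sigma
    return (x - mu) / sigma

def t_score(x_bar, mu, s, n):
    # T = (X - mu) / (s / sqrt(n))
    return (x_bar - mu) / (s / math.sqrt(n))

print(z_score(110, 100, 15))      # 10 / 15 ≈ 0.667
print(t_score(105, 100, 10, 25))  # 5 / (10/5) = 2.5
```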

Read More