Articles in the statistical learning series

Prototype Selection

Background, kNN prototype selection, Summary. kNN has a couple of drawbacks: high storage cost for the data, expensive computation of the decision boundary, and intolerance to noise. A couple of methods address these issues: a better similarity metric or distance function, k-d trees or R-trees, and storage reduction techniques (prototype selection). Prototype selection methods fall into three groups: 1. editing methods - remove noise 2. condensation methods - remove superfluous data 3. hybrid methods - achieve elimination of both noise and superfluous data at the same time……
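To make the editing idea concrete, here is a minimal sketch of one editing method (Wilson's edited nearest neighbor). The excerpt above does not name a specific algorithm, so the function name, the use of scikit-learn, and the parameter `k` are illustrative assumptions rather than the post's own implementation.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def edited_nearest_neighbor(X, y, k=3):
    """Editing step: drop any point that is misclassified by its k nearest
    neighbors (computed with the point itself left out)."""
    X, y = np.asarray(X), np.asarray(y)
    keep = np.ones(len(X), dtype=bool)
    for i in range(len(X)):
        mask = np.arange(len(X)) != i           # leave the i-th point out
        knn = KNeighborsClassifier(n_neighbors=k).fit(X[mask], y[mask])
        if knn.predict(X[i:i + 1])[0] != y[i]:  # neighbors disagree -> treat as noise
            keep[i] = False
    return X[keep], y[keep]

# Usage sketch: prune noisy points first, then train the final kNN on the
# smaller prototype set.
# X_proto, y_proto = edited_nearest_neighbor(X_train, y_train, k=3)
```

Condensation methods work in the opposite direction, keeping only the points needed to preserve the decision boundary, and hybrid methods combine both steps.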

Read More

Probability

Discrete Random Variables. A random variable is a number whose value depends on the outcome of a random experiment, for example tossing a coin 10 times and letting X be the number of heads. A discrete random variable X takes countably many values $x_i$, $i = 1, 2, \dots$, and $p(x_i) = P(X = x_i)$ is called its probability mass function. A probability mass function has the following properties: for all $i$, $p(x_i) > 0$; for any set $B$, $P(X \in B) = \sum_{x_i \in B} p(x_i)$; and $\sum_{i} p(x_i) = 1$. There are many types of discrete random variable……
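As a worked example of these properties (assuming the fair-coin model suggested by the excerpt; the full post may set it up differently): with $X$ the number of heads in 10 tosses of a fair coin, $p(k) = P(X = k) = \binom{10}{k}\left(\tfrac{1}{2}\right)^{10}$ for $k = 0, 1, \dots, 10$. Each $p(k) > 0$, and by the binomial theorem $\sum_{k=0}^{10} p(k) = \frac{1}{2^{10}}\sum_{k=0}^{10}\binom{10}{k} = \frac{2^{10}}{2^{10}} = 1$.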

Read More