Prototype Selection
Background
A summary of prototype selection for kNN.
There are a couple of drawbacks to kNN:
- high storage cost, since the full training set must be kept
- costly computation of the decision boundary (every query scans the stored data)
- intolerance to noise
There are a couple of methods that address these issues:
- a better similarity metric or distance function
- k-d trees or R-trees as the storage/search structure (see the sketch after this list)
- reduction techniques (prototype selection)
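
As a quick illustration of the storage/search point above, the neighbour search can be backed by a k-d tree instead of brute force. This is only a sketch: it assumes scikit-learn is available, and the Iris data and parameters are stand-ins.

```python
# Sketch: kNN with a k-d tree as the underlying search structure
# (assumes scikit-learn; the dataset and parameters are placeholders).
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
clf = KNeighborsClassifier(n_neighbors=5, algorithm="kd_tree")
clf.fit(X, y)
print(clf.score(X, y))  # accuracy on the training data, just to show the call pattern
```

Note that the tree only speeds up the neighbour search; it still stores every training point, which is why reduction techniques such as prototype selection are listed separately.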
Prototype selection methods fall into three groups:
1. edition methods - remove noise
2. condensation methods - remove superfluous instances
3. hybrid methods - eliminate noisy and superfluous instances at the same time
Methods
Edition Method
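The notes give no detail here. One common editing rule in this family (Wilson's edited nearest neighbour, not taken from these notes) discards a point when its k nearest neighbours disagree with its label. A hedged sketch, assuming NumPy arrays `X` (features) and non-negative integer labels `y`; the function name `edit` and the default `k=3` are illustrative.

```python
import numpy as np

def edit(X, y, k=3):
    """Return indices of points kept after kNN-based noise editing."""
    keep = []
    for i in range(len(y)):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                      # exclude the point itself
        neighbours = np.argsort(d)[:k]     # k nearest neighbours
        # Keep the point only if the neighbourhood majority agrees with its label
        # (assumes integer class labels for np.bincount).
        votes = np.bincount(y[neighbours], minlength=y.max() + 1)
        if votes.argmax() == y[i]:
            keep.append(i)
    return np.array(keep)
```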
Condensation Method
CNN
- $S = \{t_1, t_2, \ldots, t_c\}$
- $T = \text{Training Set} \setminus S$
- while there is a point in $T$ misclassified by 1-NN on $S$:
  - for $t$ in $T \setminus S$:
    - if $t$ is misclassified, $S = S \cup \{t\}$ and $T = T \setminus \{t\}$
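
Below is a minimal Python sketch of this loop, assuming NumPy arrays `X` (features) and `y` (integer labels) and a 1-NN rule. The function name `condense` and the choice of seeding $S$ with one instance per class are illustrative, not taken from a specific reference.

```python
import numpy as np

def condense(X, y, rng=None):
    """Return indices of a condensed prototype set S selected from (X, y)."""
    rng = np.random.default_rng(rng)
    # Seed S with one randomly chosen instance per class (a common choice).
    S = [rng.choice(np.flatnonzero(y == c)) for c in np.unique(y)]
    changed = True
    while changed:                          # repeat until a full pass moves nothing
        changed = False
        for i in rng.permutation(len(y)):
            if i in S:
                continue
            # Classify point i with 1-NN over the current prototype set S.
            d = np.linalg.norm(X[S] - X[i], axis=1)
            nearest = S[int(np.argmin(d))]
            if y[nearest] != y[i]:          # misclassified -> absorb into S
                S.append(i)
                changed = True
    return np.array(S)
```

By construction, once the loop finishes, 1-NN on `X[S]`, `y[S]` classifies every training point correctly, which is exactly the consistency property that condensation aims for.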