Background

kNN prototype selection Summary List

kNN has a few drawbacks:

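  1. high storage cost: the whole training set has to be kept
  2. high computation cost: classifying a query means comparing it against the stored points
  3. intolerance to noise: noisy or mislabeled points directly affect the prediction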

Several methods address these issues:

  1. better similarity metric or better distance function
  2. k-d trees or R-trees as storage
  3. reduction technique (prototype selection)

Prototype Selection

  1. edition method - remove noise
  2. condensation method - remove superfluous data points
  3. hybrid method - achieve elimination of noise and superfluous points at the same time

Methods

Edition Method

Condensation Method

  1. CNN

    • $S = \{t_1, t_2, \ldots, t_c\}$
    • $T = \text{Training Set} \setminus S$
    • while some point in $T$ is misclassified by kNN using $S$ as the training set
      • for $t$ in $T$
        • if $t$ is misclassified, $S = S \cup \{t\}$, $T = T \setminus \{t\}$
  2. Prototype Selection by Clustering
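
As an illustration of the CNN steps in item 1 above, here is a minimal Python sketch. It rests on several assumptions that are not in the notes: the classifier is 1-NN with Euclidean distance, the initial set S is seeded with the first example seen for each class (one possible reading of t_1 … t_c), and the names `nearest_label` and `condense` are made up for this example.

```python
import math


def nearest_label(x, prototypes):
    """Return the label of the prototype closest to x (1-NN, Euclidean)."""
    _, label = min(prototypes, key=lambda p: math.dist(x, p[0]))
    return label


def condense(training_set):
    """Condense a list of (point, label) pairs into a prototype subset S."""
    # Assumption: seed S with the first example encountered for each class.
    S, T, seen = [], [], set()
    for point, label in training_set:
        if label not in seen:
            seen.add(label)
            S.append((point, label))
        else:
            T.append((point, label))

    changed = True
    while changed:  # stop once a full pass over T moves nothing
        changed = False
        remaining = []
        for point, label in T:
            if nearest_label(point, S) != label:
                S.append((point, label))  # S = S ∪ {t}
                changed = True
            else:
                remaining.append((point, label))
        T = remaining  # misclassified points have left T
    return S


# Toy data: two well-separated classes plus one "a" point near the "b" cluster.
data = [
    ((0.0, 0.0), "a"), ((0.1, 0.0), "a"), ((0.2, 0.1), "a"),
    ((1.0, 1.0), "b"), ((0.9, 1.0), "b"),
    ((0.6, 0.6), "a"),
]
prototypes = condense(data)
```

On this toy data the redundant points near each seed stay out of S, while the "a" point that sits closer to the "b" seed is pulled in. When the loop ends, every training point is classified correctly by 1-NN over S alone (assuming no two identical points carry different labels).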