Trinity College Dublin, Department of Computer Science
Pasquier, François-Xavier; Delany, Sarah Jane; Cunningham, Pádraig. 'Blame-Based Noise Reduction: An Alternative Perspective on Noise Reduction for Lazy Learning'. Dublin, Trinity College Dublin, Department of Computer Science, TCD-CS-2005-29, 2005, 17 pp.
Computer Science Technical Report TCD-CS-2005-29
In this paper we present a new perspective on noise reduction for nearest-neighbour classifiers.
Classic noise reduction algorithms such as Repeated Edited Nearest Neighbour remove cases from
the training set if they are misclassified by their nearest neighbours in a leave-one-out cross-validation.
In the approach presented here, cases are identified for deletion based on their propensity
to cause misclassifications. This approach was originally identified in a case-based spam filtering
application where it became clear that certain training examples were damaging to the accuracy of
the system. In this paper we evaluate the general applicability of the approach on a wide variety of
datasets and show that it generally outperforms the classic approach. We also compare the two techniques
on artificial noise and show that both are far from perfect at removing noise and that there remains
scope for further research in this area.
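
The contrast drawn in the abstract can be illustrated with a minimal Python sketch. It is not the procedure from the report: the function names, the Euclidean distance, the majority vote, and the `k` and `min_blame` parameters are all assumptions made for illustration. `renn` repeatedly removes cases that their neighbours misclassify in a leave-one-out pass, while `blame_based` removes cases that repeatedly vote for the wrong class when some other case is misclassified.

```python
from collections import Counter
from math import dist


def knn_indices(X, i, k, active):
    """Indices of the k nearest active cases to case i, excluding i itself."""
    others = [j for j in active if j != i]
    return sorted(others, key=lambda j: dist(X[i], X[j]))[:k]


def predict(y, neighbours):
    """Majority vote over the neighbours' labels."""
    return Counter(y[j] for j in neighbours).most_common(1)[0][0]


def renn(X, y, k=3):
    """Classic style: repeatedly drop cases misclassified by their neighbours."""
    active = list(range(len(X)))
    while True:
        keep = [i for i in active
                if predict(y, knn_indices(X, i, k, active)) == y[i]]
        # Stop when a pass removes nothing or too few cases remain to vote.
        if len(keep) == len(active) or len(keep) < 2:
            return keep
        active = keep


def blame_based(X, y, k=3, min_blame=2):
    """Blame style (assumed rule): drop cases that contributed a wrong-class
    vote to at least `min_blame` leave-one-out misclassifications."""
    active = list(range(len(X)))
    blame = Counter()
    for i in active:
        neighbours = knn_indices(X, i, k, active)
        if predict(y, neighbours) != y[i]:
            blame.update(j for j in neighbours if y[j] != y[i])
    return [i for i in active if blame[i] < min_blame]


# Toy data: case 4 carries label 'b' but sits inside the 'a' cluster.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
     (5, 5), (5, 6), (6, 5), (6, 6)]
y = ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'b']

kept_renn = renn(X, y, k=1)
kept_blame = blame_based(X, y, k=1, min_blame=2)
```

On this toy set with `k=1`, the sketch of the classic approach discards the four correctly labelled 'a' cases along with the noisy one, because each of them is misclassified by case 4, whereas the blame sketch discards only case 4. This is a hand-built illustration of the difference in perspective, not evidence about either method's accuracy.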
Items in TARA are protected by copyright, with all rights reserved, unless otherwise indicated.