Blame-Based Noise Reduction: An Alternative Perspective on Noise Reduction for Lazy Learning
Delany, Sarah Jane
Citation: Pasquier, Francois-Xavier; Delany, Sarah Jane; Cunningham, Padraig. 'Blame-Based Noise Reduction: An Alternative Perspective on Noise Reduction for Lazy Learning'. Dublin, Trinity College Dublin, Department of Computer Science, TCD-CS-2005-29, 2005, pp17
In this paper, we present a new perspective on noise reduction for nearest-neighbour classifiers. Classic noise reduction algorithms such as Repeated Edited Nearest Neighbour remove cases from the training set if they are misclassified by their nearest neighbours in a leave-one-out cross-validation. In the approach presented here, cases are instead identified for deletion based on their propensity to cause misclassifications. This approach was originally identified in a case-based spam filtering application, where it became clear that certain training examples were damaging to the accuracy of the system. In this paper, we evaluate the general applicability of the approach on a large variety of datasets and show that it generally beats the classic approach. We also compare the two techniques on artificial noise and show that both are far from perfect at removing noise, and that there remains scope for further research in this area.
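The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the algorithm from the report, whose details are not given in this record. It only assumes a plain k-nearest-neighbour majority vote with Euclidean distance, and the function names (`misclassified_cases`, `blame_counts`) and the toy data are hypothetical. The first function flags cases that are misclassified by their neighbours under leave-one-out, which is a single pass of the classic idea. The second counts how often each case votes for a wrong prediction on some other case.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def nearest(points, i, k):
    """Indices of the k nearest neighbours of case i, excluding i (leave-one-out)."""
    others = [j for j in range(len(points)) if j != i]
    return sorted(others, key=lambda j: dist(points[i], points[j]))[:k]


def vote(labels, neighbours):
    """Majority label among the given neighbours."""
    return Counter(labels[j] for j in neighbours).most_common(1)[0][0]


def misclassified_cases(points, labels, k=3):
    """Classic view (one pass): cases that their own neighbours misclassify."""
    return [i for i in range(len(points))
            if vote(labels, nearest(points, i, k)) != labels[i]]


def blame_counts(points, labels, k=3):
    """Blame view (illustrative assumption): for every misclassified case,
    blame each neighbour that voted for the wrong predicted label."""
    blame = Counter()
    for i in range(len(points)):
        neighbours = nearest(points, i, k)
        predicted = vote(labels, neighbours)
        if predicted != labels[i]:
            for j in neighbours:
                if labels[j] == predicted:
                    blame[j] += 1
    return blame


# Hypothetical toy data: case 2 carries a noisy "b" label inside the "a" cluster.
points = [(0.0,), (1.0,), (1.4,), (2.0,), (10.0,), (11.0,), (12.0,)]
labels = ["a", "a", "b", "a", "b", "b", "b"]

print(misclassified_cases(points, labels, k=1))        # [1, 2, 3]
print(blame_counts(points, labels, k=1).most_common())  # [(2, 2), (1, 1)]
```

On this toy data with k=1, the misclassification criterion flags cases 1, 2 and 3, two of which are correctly labelled, while the blame count singles out case 2 as the case responsible for the most errors. How blame is turned into a deletion decision is left open here, since this record does not describe that step.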
Science Foundation Ireland
Publisher: Trinity College Dublin, Department of Computer Science
Series/Report no: Computer Science Technical Report