Overfitting in Wrapper-Based Feature Subset Selection: The Harder You Try the Worse it Gets
File Type:
PDF
Item Type:
Technical Report
Date:
2005-01-28
Citation:
Cunningham, Padraig; Loughrey, John. 'Overfitting in Wrapper-Based Feature Subset Selection: The Harder You Try the Worse it Gets'. - Dublin, Trinity College Dublin, Department of Computer Science, TCD-CS-2005-17, 2005, pp11
Download Item:
TCD-CS-2005-17.pdf (PDF) 239.0Kb
Abstract:
In wrapper-based feature selection, the more states that are
visited during the search phase of the algorithm, the greater the
likelihood of finding a feature subset that has a high internal accuracy
while generalizing poorly. When this occurs, we say that the algorithm
has overfitted to the training data. We outline a set of experiments to
show this, and we introduce a modified genetic algorithm that addresses
this overfitting problem by stopping the search before overfitting occurs.
This new algorithm, called GAWES (Genetic Algorithm With Early
Stopping), reduces the level of overfitting and yields feature subsets that
have better generalization accuracy.
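
The idea in the abstract can be illustrated with a small sketch. This is not the GAWES algorithm from the report, whose details are not given here; it is a generic genetic search over feature masks that stops once accuracy on a held-out validation set has not improved for a few generations. Everything in it is an assumption made for illustration: the synthetic data, the nearest-centroid classifier, the use of training-set accuracy as the "internal accuracy", and the hypothetical names `make_data`, `accuracy`, `ga_with_early_stopping`, and `patience`.

```python
import random

def make_data(rng, n, n_informative=2, n_noise=8):
    """Synthetic two-class data: a few informative features plus pure noise."""
    rows = []
    for k in range(n):
        label = k % 2  # alternate labels so both classes are always present
        informative = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        rows.append((informative + noise, label))
    return rows

def accuracy(mask, train, test):
    """Accuracy of a nearest-centroid classifier using only the masked features."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty subset cannot classify anything
    centroids = {}
    for label in (0, 1):
        members = [x for x, y in train if y == label]
        centroids[label] = [sum(x[i] for x in members) / len(members) for i in idx]
    correct = 0
    for x, y in test:
        dist = {c: sum((x[i] - m) ** 2 for i, m in zip(idx, centroids[c]))
                for c in (0, 1)}
        correct += (min(dist, key=dist.get) == y)
    return correct / len(test)

def ga_with_early_stopping(train, valid, n_features, rng,
                           pop_size=20, max_generations=50, patience=5):
    """Genetic search over feature masks, halted when validation accuracy stalls."""
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best_mask, best_valid, stale = None, -1.0, 0
    history = []  # (internal accuracy, validation accuracy) per generation
    for _ in range(max_generations):
        # Internal fitness (assumption): accuracy on the training data itself.
        ranked = sorted(population, key=lambda m: accuracy(m, train, train),
                        reverse=True)
        champion = ranked[0]
        valid_acc = accuracy(champion, train, valid)
        history.append((accuracy(champion, train, train), valid_acc))
        if valid_acc > best_valid:
            best_mask, best_valid, stale = champion[:], valid_acc, 0
        else:
            stale += 1
            if stale >= patience:
                break  # early stopping: more search is no longer helping
        # Keep the fitter half, refill with one-point crossover and a one-bit flip.
        parents = ranked[:pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)
            child = a[:cut] + b[cut:]
            flip = rng.randrange(n_features)
            child[flip] = 1 - child[flip]
            children.append(child)
        population = parents + children
    return best_mask, best_valid, history

rng = random.Random(0)
train, valid = make_data(rng, 40), make_data(rng, 200)
mask, valid_acc, history = ga_with_early_stopping(train, valid, 10, rng)
print("selected features:", [i for i, bit in enumerate(mask) if bit])
print("validation accuracy:", valid_acc, "generations run:", len(history))
```

The function returns the mask that scored best on the validation set rather than the last one visited, so a subset whose internal accuracy kept rising while its validation accuracy fell is not selected. Comparing the two columns of `history` gives a rough view of how far internal and held-out accuracy diverge as the search proceeds.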
Sponsor: Science Foundation Ireland
Author: Cunningham, Padraig; Loughrey, John
Publisher:
Trinity College Dublin, Department of Computer Science
Type of material:
Technical Report
Collections:
Series/Report no:
Computer Science Technical Report
TCD-CS-2005-17
Availability:
Full text available
Keywords:
Computer Science
Licences: