Trinity College Dublin, Department of Computer Science
Citation:
Cunningham, Pádraig; Loughrey, John. 'Overfitting in Wrapper-Based Feature Subset Selection: The Harder You Try the Worse it Gets'. - Dublin, Trinity College Dublin, Department of Computer Science, TCD-CS-2005-17, 2005, 11 pp.
Series/Report no.:
Computer Science Technical Report TCD-CS-2005-17
Abstract:
In wrapper-based feature selection, the more states that are
visited during the search phase of the algorithm, the greater the
likelihood of finding a feature subset that has a high internal accuracy
but generalizes poorly. When this occurs, we say that the algorithm
has overfitted to the training data. We outline a set of experiments to
show this, and we introduce a modified genetic algorithm that addresses this
overfitting problem by stopping the search before overfitting occurs.
This new algorithm, called GAWES (Genetic Algorithm With Early
Stopping), reduces the level of overfitting and yields feature subsets that
have better generalization accuracy.
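
The idea in the abstract can be illustrated with a minimal sketch. It is not the GAWES algorithm from the report: the abstract does not say how the stopping point is chosen, so this sketch assumes a held-out validation split and a patience counter. The 1-nearest-neighbour evaluator, the synthetic data, and every function and parameter name below are hypothetical choices made for illustration only.

```python
import random

def nn_accuracy(train, test, mask):
    """Accuracy of a 1-nearest-neighbour classifier restricted to features where mask is 1."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx or not test:
        return 0.0
    correct = 0
    for x, y in test:
        nearest = min(train, key=lambda row: sum((row[0][i] - x[i]) ** 2 for i in idx))
        correct += nearest[1] == y
    return correct / len(test)

def internal_accuracy(data, mask, folds=3):
    """Cross-validated accuracy on the search data; this is the wrapper's fitness."""
    scores = []
    for k in range(folds):
        test = data[k::folds]
        train = [row for j, row in enumerate(data) if j % folds != k]
        scores.append(nn_accuracy(train, test, mask))
    return sum(scores) / folds

def ga_with_early_stopping(search, valid, n_features, pop_size=10,
                           max_gens=20, patience=3, seed=0):
    """Genetic search over feature masks that halts when validation accuracy stalls.

    Assumption: the stopping rule (validation split plus patience) is ours,
    not taken from the report.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best_mask, best_valid, best_gen, stale = None, -1.0, 0, 0
    for gen in range(1, max_gens + 1):
        # Rank by internal (search-set) accuracy, as a plain wrapper would.
        ranked = sorted(pop, key=lambda m: internal_accuracy(search, m), reverse=True)
        champion = ranked[0]
        # Check the champion on data the search never sees.
        v = nn_accuracy(search, valid, champion)
        if v > best_valid:
            best_mask, best_valid, best_gen, stale = champion[:], v, gen, 0
        else:
            stale += 1
            if stale >= patience:
                break  # further search no longer improves generalization
        parents = ranked[: pop_size // 2]
        children = [champion[:]]  # elitism: carry the champion forward unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = [bit ^ (rng.random() < 0.05) for bit in a[:cut] + b[cut:]]  # bit-flip mutation
            children.append(child)
        pop = children
    return best_mask, best_valid, best_gen

def make_data(n, n_features, rng):
    """Synthetic data: feature 0 carries the class signal, the rest are noise."""
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [y + rng.gauss(0, 0.5)] + [rng.gauss(0, 1) for _ in range(n_features - 1)]
        rows.append((x, y))
    return rows

data = make_data(60, 6, random.Random(1))
search, valid = data[:40], data[40:]
mask, valid_acc, best_gen = ga_with_early_stopping(search, valid, n_features=6)
print(mask, valid_acc, best_gen)
```

The sketch returns the mask that scored best on the validation split rather than the last champion, so a subset that only looks good on the search data is not what gets reported.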
Items in TARA are protected by copyright, with all rights reserved, unless otherwise indicated.