The University of Dublin | Trinity College -- Ollscoil Átha Cliath | Coláiste na Tríonóide
Trinity's Access to Research Archive

Please use this identifier to cite or link to this item: http://hdl.handle.net/2262/13500

Title: Dynamic Integration with Random Forests
Authors: Tsymbal, Alexey; Cunningham, Pádraig
Sponsor: Science Foundation Ireland
Keywords: Computer Science
Issue Date: 2006
Publisher: Trinity College Dublin, Department of Computer Science
Citation: Tsymbal, Alexey; Cunningham, Pádraig. 'Dynamic Integration with Random Forests'. Dublin, Trinity College Dublin, Department of Computer Science, TCD-CS-2006-23, 2006, 10 pp.
Series/Report no.: Computer Science Technical Report TCD-CS-2006-23
Abstract: Random Forests are a successful ensemble prediction technique that combines two sources of randomness to generate base decision trees: bootstrapping instances for each tree and considering a random subset of features at each node. In his introductory paper on Random Forests, Breiman claims that they are more robust than boosting with respect to overfitting noise and can compete with boosting in terms of predictive performance. Multiple recently published empirical studies in various application domains confirm these claims. Random Forests use simple majority voting to combine the predictions of the trees. However, each decision tree in a random forest may contribute differently to classifying a given instance. In this paper, we demonstrate that the predictive performance of Random Forests may still be improved in some domains by replacing the combination function. Dynamic integration, which is based on local performance estimates of base predictors, can be used instead of majority voting. We conduct experiments on a selection of classification datasets, analysing the resulting accuracy, the margin, and the bias and variance components of error. The experiments demonstrate that dynamic integration increases accuracy on some datasets. Even when the accuracy remains the same, dynamic integration always increases the margin. A bias/variance decomposition demonstrates that dynamic integration decreases the error by significantly decreasing the bias component while leaving the variance unchanged or increasing it only insignificantly. The experiments also demonstrate that the intrinsic similarity measure of Random Forests is better than the commonly used Heterogeneous Euclidean/Overlap Metric for finding a neighbourhood for local estimates in this context.
URI: https://www.cs.tcd.ie/publications/tech-reports/reports.06/TCD-CS-2006-23.pdf
http://hdl.handle.net/2262/13500
Appears in Collections:Computer Science Technical Reports
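
The abstract above contrasts simple majority voting with dynamic integration, in which each base predictor is judged by how well it performs in the neighbourhood of the instance being classified. The following is a minimal sketch of that general idea only, not the report's implementation. It makes several assumptions that do not come from the record: the base predictors are hypothetical hand-written functions rather than trees of a random forest, the neighbourhood is found with plain Euclidean distance rather than the Random Forest similarity or the Heterogeneous Euclidean/Overlap Metric, and each vote is weighted by local accuracy over the k nearest labelled instances. All names (`majority_vote`, `dynamic_vote`, `LABELLED`, `PREDICTORS`) are illustrative.

```python
from collections import Counter
import math


def majority_vote(predictors, x):
    """Plain majority voting: every base predictor gets one equal vote."""
    votes = Counter(p(x) for p in predictors)
    return votes.most_common(1)[0][0]


def local_accuracy(predictor, neighbours):
    """Fraction of the neighbouring labelled instances the predictor gets right."""
    correct = sum(1 for xi, yi in neighbours if predictor(xi) == yi)
    return correct / len(neighbours)


def dynamic_vote(predictors, x, labelled, k=3):
    """Weight each predictor's vote by its accuracy on the k nearest instances.

    Assumption: Euclidean distance defines the neighbourhood here; this is a
    stand-in chosen for brevity, not one of the measures named in the abstract.
    """
    neighbours = sorted(labelled, key=lambda item: math.dist(item[0], x))[:k]
    scores = Counter()
    for p in predictors:
        scores[p(x)] += local_accuracy(p, neighbours)
    return scores.most_common(1)[0][0]


# Hypothetical toy data: 1-D points, class 0 near the origin, class 1 near 10.
LABELLED = [((0.0,), 0), ((1.0,), 0), ((2.0,), 0),
            ((8.0,), 1), ((9.0,), 1), ((10.0,), 1)]

# Hypothetical base predictors: two always answer 0, one uses a threshold.
PREDICTORS = [
    lambda x: 0,
    lambda x: 0,
    lambda x: 1 if x[0] > 5 else 0,
]

query = (9.5,)
majority_label = majority_vote(PREDICTORS, query)
dynamic_label = dynamic_vote(PREDICTORS, query, LABELLED)
```

In this toy setting the two constant predictors outvote the threshold predictor under majority voting, while the locally weighted vote favours the one predictor that is accurate near the query point. The example is deliberately contrived to show the mechanism and says nothing about the size of the effect on real datasets.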

Files in This Item:

File: TCD-CS-2006-23.pdf
Size: 136.72 kB
Format: Adobe PDF


This item is protected by original copyright