Bayesian kernel classification for high dimensional data with variable selection
Citation:
Katarina Domijan, 'Bayesian kernel classification for high dimensional data with variable selection', [thesis], Trinity College (Dublin, Ireland). School of Computer Science & Statistics, 2009, pp 195

Abstract:
High dimensional data sets, where the dimension of the measurements exceeds the number of samples, arise in many application domains. In particular, the development of genomic and proteomic technologies in the last decade has seen a rapid emergence of such ‘high-throughput’ data and has generated much interest in the statistical community, as analysis of such data requires novel statistical techniques. One area where this has arisen is classification of high dimensional data. This challenging problem is the central focus of this thesis. Models for classification are developed based on reproducing kernel Hilbert space theory and are set in the fully Bayesian framework. MCMC techniques are employed in order to sample from the posterior distributions of the model parameters. The proposed classification approaches are applied to microarray, image processing and near-infrared spectroscopy data sets. However, the methods are general and can be used for a variety of classification settings and data spaces of varying structure. Computational efficiency of the algorithms set in the Bayesian framework is an important consideration, and is approached by kernel dimensionality reduction. One of the most interesting aspects of modeling high dimensional data is identifying subsets of measurements that are relevant for classification. Due to the complexity of the data structures and an insufficient number of samples to properly characterize those structures, this is a challenging but important problem. Novel approaches to feature selection based on Bayesian decision theory are proposed and investigated.
Author: Domijan, Katarina
Advisor: Wilson, Simon
Qualification name: Doctor of Philosophy (Ph.D.)
Publisher: Trinity College (Dublin, Ireland). School of Computer Science & Statistics
Note: TARA (Trinity’s Access to Research Archive) has a robust takedown policy. Please contact us if you have any concerns: rssadmin@tcd.ie
Type of material: thesis
Availability: Full text available
Keywords: Statistics, Ph.D., Ph.D. Trinity College Dublin
Licences: