Topics in unsupervised learning
Citation:
Paul David McNicholas, 'Topics in unsupervised learning', [thesis], Trinity College (Dublin, Ireland). School of Computer Science & Statistics, 2007, pp 174
Download Item:
McNicholas TCD THESIS 8215 Topics in.pdf (PDF) 78.37Mb
Abstract:
Two topics in unsupervised learning are reviewed and developed; namely, model-based
clustering and association rule mining. A new family of Gaussian mixture models, with a parsimonious covariance structure, is introduced. The mixtures of factor analysers and mixtures of principal component analysers models are special cases of this new family of models. The members of this family share the feature that their number of covariance parameters grows linearly with the dimensionality of the data, which leads to relatively fast computation time. These models perform excellently, compared to popular model-based clustering techniques, when applied to real data.
A new family of Gaussian mixture models with a Cholesky-decomposed covariance structure
is also introduced. Four members of this family are developed and applied to real data.
This family of models has great potential for further development in future work.
A novel approach, via association rules, is taken to the analysis of college applications
data. This analysis contributes to the discussion about the existence of a 'points race'. A
new method of quantifying and visualising the interestingness of an association rule is also
introduced and an argument for the inclusion of negations in the association rule mining
process is given.
Author: McNicholas, Paul David
Advisor:
Murphy, Brendan
O'Regan, Myra
Qualification name:
Doctor of Philosophy (Ph.D.)
Publisher:
Trinity College (Dublin, Ireland). School of Computer Science & Statistics
Note:
TARA (Trinity's Access to Research Archive) has a robust takedown policy. Please contact us if you have any concerns: rssadmin@tcd.ie
Type of material:
thesis
Availability:
Full text available
Keywords:
Statistics, Ph.D., Ph.D. Trinity College Dublin
Licences: