Producing Accurate Interpretable Clusters from High-Dimensional Data

File Type: PDF
Item Type: Technical Report
Date: 2005-05-19
Citation: Greene, Derek; Cunningham, Padraig. 'Producing Accurate Interpretable Clusters from High-Dimensional Data'. Dublin, Trinity College Dublin, Department of Computer Science, TCD-CS-2005-42, 2005, 12 pp.
Abstract:
The primary goal of cluster analysis is to produce clusters
that accurately reflect the natural groupings in the data. A second objective
that is important for high-dimensional data is to identify features
that are descriptive of the clusters. In addition to these requirements, we
often wish to allow objects to be associated with more than one cluster.
In this paper we present a technique, based on the spectral co-clustering
model, that is effective in meeting these objectives. Our evaluation on a
range of text clustering problems shows that the proposed method yields
accuracy superior to that afforded by existing techniques, while producing
cluster descriptions that are amenable to human interpretation.
Author: Greene, Derek; Cunningham, Padraig
Publisher: Trinity College Dublin, Department of Computer Science
Type of material: Technical Report
Series/Report no: Computer Science Technical Report, TCD-CS-2005-42
Availability: Full text available
Keywords: Computer Science