A Theoretical Analysis of Density Peaks Clustering and the Component-wise Peak-Finding Algorithm
Citation:
Joshua Tobin and Mimi Zhang, A Theoretical Analysis of Density Peaks Clustering and the Component-wise Peak-Finding Algorithm, IEEE Transactions on Pattern Analysis and Machine Intelligence, 46, 2, 2024, 1109 - 1120
Abstract:
Density peaks clustering detects modes as points with high density and large distance to points of higher density. Each
non-mode point is assigned to the same cluster as its nearest neighbor of higher density. Density peaks clustering has proved capable
in applications, yet little work has been done to understand its theoretical properties or the characteristics of the clusterings it produces.
Here, we prove that it consistently estimates the modes of the underlying density and correctly clusters the data with high probability.
However, noise in the density estimates can lead to erroneous modes and incoherent cluster assignments. A novel clustering
algorithm, Component-wise Peak-Finding (CPF), is proposed to remedy these issues. The improvements are twofold: (1) the
assignment methodology is improved by applying the density peaks methodology within level sets of the estimated density; (2) the
algorithm is not affected by spurious maxima of the density and hence is competent at automatically deciding the correct number of
clusters. We present novel theoretical results, proving the consistency of CPF, as well as extensive experimental results demonstrating
its exceptional performance. Finally, a semi-supervised version of CPF is presented, integrating clustering constraints to achieve
excellent performance for an important problem in computer vision.
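The density peaks procedure described in the first two sentences of the abstract can be illustrated with a minimal sketch. This is not the authors' CPF algorithm or their implementation: the k-nearest-neighbor density estimate, the density-times-distance score used to pick modes, and the names density_peaks, k and n_clusters are all assumptions made for illustration only.

```python
import numpy as np

def density_peaks(X, k=5, n_clusters=2):
    """Toy density peaks clustering (illustrative sketch, not the paper's CPF)."""
    n = len(X)
    # Pairwise Euclidean distances between all points.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Assumed density estimate: inverse distance to the k-th nearest neighbor
    # (column 0 of the sorted row is the point itself).
    kth = np.sort(D, axis=1)[:, k]
    density = 1.0 / (kth + 1e-12)
    # Visit points from highest to lowest density.
    order = np.argsort(-density, kind="stable")
    parent = np.full(n, -1)
    delta = np.full(n, np.inf)  # distance to nearest point of higher density
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]
        j = higher[np.argmin(D[i, higher])]
        parent[i] = j
        delta[i] = D[i, j]
    # Modes: high density and large distance to any point of higher density.
    # The densest point keeps delta = inf, so it is always selected.
    gamma = density * delta
    modes = np.argsort(-gamma)[:n_clusters]
    labels = np.full(n, -1)
    labels[modes] = np.arange(n_clusters)
    # Each non-mode point inherits the label of its nearest
    # higher-density neighbor, which has already been labeled.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, modes
```

The sketch takes the number of clusters as an input, whereas the abstract states that CPF decides the number of clusters automatically and assigns points within level sets of the estimated density; neither refinement is attempted here.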
Author's Homepage:
http://people.tcd.ie/zhangm3
Description:
PUBLISHED
Author: Zhang, Mimi
Type of material:
Journal Article
Series/Report no:
IEEE Transactions on Pattern Analysis and Machine Intelligence
46
2
Availability:
Full text available