Probabilistic consensus clustering using evidence accumulation
Lourenço, A.; Rota Bulò, S.; Rebagliati, N.; Fred, A. L. N.; Figueiredo, M. A. T.; Pelillo, M.
Machine Learning Vol. 98, Nº 1-2, pp. 331 - 357, April, 2015.
ISSN (print): 0885-6125
ISSN (online): 1573-0565
Scimago Journal Ranking: 1,26 (in 2015)
Digital Object Identifier: 10.1007/s10994-013-5339-6
Abstract
Clustering ensemble methods produce a consensus partition of a set of data points by combining the results of a collection of base clustering algorithms. In the evidence accumulation clustering (EAC) paradigm, the clustering ensemble is transformed into a pairwise co-association matrix, thus avoiding the label correspondence problem, which is intrinsic to other clustering ensemble schemes. In this paper, we propose a consensus clustering approach based on the EAC paradigm, which is not limited to crisp partitions and fully exploits the nature of the co-association matrix. Our solution determines probabilistic assignments of data points to clusters by minimizing a Bregman divergence between the observed co-association frequencies and the corresponding co-occurrence probabilities expressed as functions of the unknown assignments. We additionally propose an optimization algorithm to find a solution under any double-convex Bregman divergence. Experiments on both synthetic and real benchmark data show the effectiveness of the proposed approach.
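As a rough illustration of the two ingredients the abstract names, the sketch below builds a pairwise co-association matrix from a small ensemble of label vectors and evaluates a fit between those frequencies and the co-occurrence probabilities implied by a soft assignment matrix. It is not the authors' implementation. The function names, the toy ensemble, the modelling of the co-occurrence probability as the inner product of two assignment vectors, and the choice of squared Euclidean distance (one particular Bregman divergence) are all assumptions made for this example, and no optimization step is shown.

```python
import numpy as np

def co_association(ensemble):
    """Fraction of base partitions that place each pair of points together.

    ensemble: array-like of shape (n_partitions, n_points) holding cluster
    labels. Labels are compared only within a partition, so no label
    correspondence across partitions is needed.
    """
    ensemble = np.asarray(ensemble)
    # same[p, i, j] is True when partition p puts points i and j together.
    same = ensemble[:, :, None] == ensemble[:, None, :]
    return same.mean(axis=0)

def squared_loss(C, Y):
    """Sum over pairs i < j of (C_ij - y_i . y_j)^2.

    Y has shape (n_points, n_clusters); each row is a probabilistic
    assignment (non-negative, summing to 1). Treating y_i . y_j as the
    co-occurrence probability is an assumption of this sketch.
    """
    P = Y @ Y.T
    i, j = np.triu_indices(C.shape[0], k=1)  # strict upper triangle
    return float(np.sum((C[i, j] - P[i, j]) ** 2))

# Toy ensemble (hypothetical data): 3 base partitions of 4 points.
ensemble = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
C = co_association(ensemble)

# A crisp assignment written as a special case of a soft one.
Y = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
loss = squared_loss(C, Y)
```

In this toy case, points 0 and 1 are grouped together by every base partition, so their co-association is 1, while points 2 and 3 are together in two of three partitions. A soft assignment with fractional rows in `Y` can be scored with the same function.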