Combining Data Clusterings with Instance Level Constraints

Duarte, J. ; Fred, A. L. N. ; Duarte, J. F.

Combining Data Clusterings with Instance Level Constraints, Proc. Workshop on Pattern Recognition in Information Systems, Milan, Italy, Vol. 1, pp. 49-60, May 2009.

Digital Object Identifier: 10.5220/0002260300490060

Recent work has focused on incorporating a priori knowledge, in the form of pairwise constraints, into the data clustering process, aiming to improve clustering quality and to find clustering solutions appropriate to specific tasks or interests. In this work, we integrate must-link and cannot-link constraints into the cluster ensemble framework. Two algorithms for combining multiple data partitions with instance level constraints are proposed. The first is a modification of Evidence Accumulation Clustering; the second uses a genetic algorithm to maximize both constraint satisfaction and the similarity between the cluster ensemble and the target consensus partition. Experimental results show that the proposed constrained clustering combination methods outperform the unconstrained Evidence Accumulation Clustering.
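
To make the terms in the abstract concrete, the following minimal Python sketch builds the co-association matrix that Evidence Accumulation Clustering is based on (the fraction of partitions in which two objects share a cluster) and shows one simple way pairwise constraints could be injected into it. The abstract does not describe the actual modification used in the paper, so the override rule in `apply_constraints`, the `constraint_satisfaction` score, and all function names here are illustrative assumptions rather than the authors' method.

```python
def coassociation(partitions, n):
    """Fraction of partitions in which each pair of objects shares a cluster."""
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def apply_constraints(co, must_link, cannot_link):
    """Assumed rule (not from the paper): force must-link pairs to 1.0
    and cannot-link pairs to 0.0, keeping the matrix symmetric."""
    out = [row[:] for row in co]
    for i, j in must_link:
        out[i][j] = out[j][i] = 1.0
    for i, j in cannot_link:
        out[i][j] = out[j][i] = 0.0
    return out


def constraint_satisfaction(labels, must_link, cannot_link):
    """Hypothetical score: share of constraints a partition satisfies."""
    satisfied = sum(labels[i] == labels[j] for i, j in must_link)
    satisfied += sum(labels[i] != labels[j] for i, j in cannot_link)
    total = len(must_link) + len(cannot_link)
    return satisfied / total if total else 1.0


# Toy ensemble: three partitions of four objects (made-up labels).
partitions = [[0, 0, 1, 1], [0, 0, 0, 1], [0, 1, 1, 1]]
must_link = [(0, 1)]
cannot_link = [(1, 2)]

co = coassociation(partitions, 4)
constrained = apply_constraints(co, must_link, cannot_link)
scores = [constraint_satisfaction(p, must_link, cannot_link) for p in partitions]
```

In a sketch like this, a consensus partition would then be extracted from `constrained` (for example by hierarchical clustering on `1 - constrained` as a distance), while a score such as `constraint_satisfaction` is the kind of term a genetic algorithm could combine with ensemble similarity in its fitness function.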