
Observer biased fuzzy clustering

Fazendeiro, P.; Oliveira, J.

IEEE Transactions on Fuzzy Systems, Vol. 23, No. 1, pp. 85-97, February 2015.

ISSN (print): 1063-6706
ISSN (online): 1941-0034

Scimago Journal Ranking: 4.55 (in 2015)

Digital Object Identifier: 10.1109/TFUZZ.2014.2306434

Abstract
As generated by clustering algorithms, clusterings (or partitions) are hypotheses on data explanation which are better evaluated by experts from the application domain. In general, clustering algorithms allow only limited use of domain knowledge in the cluster formation process. In this study, we propose both a design technique and a new partitioning-based clustering algorithm which can be used to assist the data analyst while looking for a set of meaningful clusters, i.e., clusters that actually correspond to the underlying data structure. Following an observer metaphor according to which the perception of a group of objects depends on the observer position – the closer an observer is to an image, the more details (s)he perceives – we resort to shrinkage to incorporate a regularization term, accounting for the observation point, within the objective function of an otherwise unbiased clustering algorithm. This technique allows our resulting biased algorithm to generate a set of reasonable partitions, i.e., partitions validated by a given cluster validity index, corresponding to views of data with different levels of granularity (levels of detail) in different regions of the data space. To illustrate the design technique, we adopted the FCM algorithm as the unbiased clustering algorithm, and we include a convergence theorem assuring that changing the point of observation in the corresponding biased algorithm (FCMFP) does not jeopardize its convergence. Experimental studies on both synthetic and real data are included to illustrate the usefulness of the approach. In addition, and as a convenient side effect of using shrinkage, the experimental results suggest that our biased algorithm (FCMFP) not only seems to scale better than successive runs of the unbiased one (FCM) but also, on average, seems to produce clusters exhibiting higher validity index values.
Also, less sensitivity to initialization was observed for the biased algorithm when compared to the unbiased one.
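
For reference, the unbiased baseline named in the abstract, fuzzy c-means (FCM), can be sketched as below. This is a minimal illustration of the standard textbook FCM updates only. It is not the authors' FCMFP: the abstract does not give the form of the regularization term, so none is reproduced here. The function name `fcm`, its parameters, and the defaults (fuzzifier `m=2.0`, tolerance, seed) are assumptions chosen for illustration.

```python
import numpy as np

def fcm(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy c-means (illustrative sketch, not FCMFP).

    X: (n, d) data array; c: number of clusters; m > 1: fuzzifier.
    Returns (V, U): V is the (c, d) array of cluster centers,
    U is the (c, n) membership matrix whose columns sum to 1.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Random initial memberships, normalized so each point's
    # memberships over the c clusters sum to 1.
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)

    for _ in range(max_iter):
        # Center update: mean of the data weighted by u_ik^m.
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)

        # Squared Euclidean distances from every center to every point.
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
        d2 = np.fmax(d2, 1e-12)  # guard against division by zero

        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)

        # Stop when the memberships no longer change appreciably.
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    return V, U
```

Calling `fcm(X, c=3)` on an `(n, d)` array returns the centers and the membership matrix. Running it repeatedly for different values of `c` corresponds to the "successive runs of the unbiased one" that the abstract compares against.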