Comparison and Weighted Summation Type of Fuzzy Cluster Validity Indices

  • Kaile Zhou, Hefei University of Technology
  • Shuai Ding, Hefei University of Technology
  • Chao Fu, School of Management, Hefei University of Technology, Hefei 230009, China
  • Shanlin Yang, Hefei University of Technology

Abstract

Finding the optimal number of clusters and validating the partition results of a data set are difficult tasks because clustering is an unsupervised learning process. A cluster validity index (CVI) is a criterion function for evaluating clustering results and determining the optimal number of clusters. In this paper, we present an extensive comparison of ten well-known CVIs for fuzzy clustering. We then extend traditional single CVIs by introducing a weighting method and propose a weighted summation type of CVI (WSCVI). Experiments on nine synthetic data sets and four real-world UCI data sets demonstrate that no single CVI outperforms the others on all data sets. Nevertheless, the proposed WSCVI is more effective when its weights are set properly.

Published
2014-04-04
How to Cite
ZHOU, Kaile et al. Comparison and Weighted Summation Type of Fuzzy Cluster Validity Indices. INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL, [S.l.], v. 9, n. 3, p. 370-378, apr. 2014. ISSN 1841-9844. Available at: <http://univagora.ro/jour/index.php/ijccc/article/view/237>. Date accessed: 13 july 2020. doi: https://doi.org/10.15837/ijccc.2014.3.237.

Keywords

fuzzy clustering, fuzzy c-means (FCM), cluster validity indices (CVIs), WSCVI