Comparison and Weighted Summation Type of Fuzzy Cluster Validity Indices

Kaile Zhou, Shuai Ding, Chao Fu, Shanlin Yang

Abstract


Finding the optimal number of clusters and validating the partition results
of a data set are difficult tasks because clustering is an unsupervised learning process.
A cluster validity index (CVI) is a criterion function for evaluating clustering
results and determining the optimal number of clusters. In this paper, we present an
extensive comparison of ten well-known CVIs for fuzzy clustering. We then extend
traditional single CVIs by introducing a weighting method and propose a weighted
summation type of CVI (WSCVI). Experiments on nine synthetic data sets and four
real-world UCI data sets demonstrate that no single CVI outperforms the others on all
data sets. Nevertheless, the proposed WSCVI is more effective when its weights are
set properly.
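
The idea of a weighted summation of several CVIs can be illustrated with a minimal sketch. The abstract does not give the exact WSCVI formula, so everything below is an assumption made for illustration: the min-max normalization, the flipping of "smaller is better" indices, the requirement that weights sum to 1, the function names, and the index values (labeled PC for the partition coefficient and XB for the Xie-Beni index) are all hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Sequence


def min_max(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def weighted_sum_cvi(
    scores: Dict[str, Sequence[float]],
    larger_is_better: Dict[str, bool],
    weights: Dict[str, float],
) -> List[float]:
    """Combine several CVIs into one score per candidate cluster number.

    Assumed scheme (not the paper's definition): normalize each index,
    orient it so that larger means better, then take the weighted sum.
    """
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights are assumed to sum to 1")
    n = len(next(iter(scores.values())))
    combined = [0.0] * n
    for name, values in scores.items():
        normed = min_max(values)
        if not larger_is_better[name]:
            # Flip indices where a smaller raw value indicates a better partition.
            normed = [1.0 - v for v in normed]
        for k in range(n):
            combined[k] += weights[name] * normed[k]
    return combined


# Hypothetical index values for candidate cluster numbers c = 2..5.
candidates = [2, 3, 4, 5]
scores = {
    "PC": [0.80, 0.90, 0.70, 0.60],  # partition coefficient: larger is better
    "XB": [0.30, 0.10, 0.40, 0.50],  # Xie-Beni index: smaller is better
}
larger_is_better = {"PC": True, "XB": False}
weights = {"PC": 0.5, "XB": 0.5}

combined = weighted_sum_cvi(scores, larger_is_better, weights)
best_c = candidates[combined.index(max(combined))]
print(best_c)
```

In this sketch the selected cluster number is the candidate with the largest combined score. Changing the entries of `weights` shifts how much each index influences that choice, which is the role the abstract attributes to properly set weights.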


Keywords


fuzzy clustering, fuzzy c-means (FCM), cluster validity indices (CVIs), WSCVI






DOI: https://doi.org/10.15837/ijccc.2014.3.237



Copyright (c) 2017 Kaile Zhou, Shuai Ding, Chao Fu, Shanlin Yang

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.



INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL (IJCCC), With Emphasis on the Integration of Three Technologies (C & C & C),  ISSN 1841-9836.
