Feature Clustering based MIM for a New Feature Extraction Method

Sabra El Ferchichi, Salah Zidi, Salah Maouche, Kaouther Laabidi, Moufida Ksouri

Abstract


In this paper, a new unsupervised feature extraction approach is presented, which is based on a feature clustering algorithm. Applying a divisive clustering algorithm, the method searches for a compression of the information contained in the original set of features. It investigates the use of Mutual Information Maximization (MIM) to find an appropriate transformation of the clustered features. Experiments on UCI datasets show that the proposed method often outperforms the conventional unsupervised methods PCA and ICA in terms of classification accuracy.
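The abstract does not say how mutual information is estimated or how features are compared before clustering. Purely as an illustration, the sketch below uses a simple histogram (plug-in) estimator to score the mutual information between pairs of features, which is one quantity a feature clustering step could plausibly rely on. The function names `mutual_information` and `pairwise_mi`, the `bins` parameter, and the synthetic data are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of I(X;Y) in nats.

    Assumption: equal-width binning; the paper may use another estimator.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()              # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)    # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)    # marginal of y, shape (1, bins)
    nz = pxy > 0                           # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def pairwise_mi(X, bins=8):
    """Symmetric matrix of estimated mutual information between columns of X."""
    d = X.shape[1]
    M = np.zeros((d, d))
    for i in range(d):
        for j in range(i, d):
            M[i, j] = M[j, i] = mutual_information(X[:, i], X[:, j], bins)
    return M


# Synthetic illustration: features 0 and 1 are near-duplicates, feature 2 is
# independent noise, so the (0, 1) entry should dominate the (0, 2) entry.
rng = np.random.default_rng(0)
a = rng.normal(size=2000)
X = np.column_stack([a, a + 0.1 * rng.normal(size=2000), rng.normal(size=2000)])
M = pairwise_mi(X)
print(np.round(M, 3))
```

A matrix of this kind could serve as a similarity measure between features. How the proposed method actually splits the feature set and transforms each cluster is described in the full paper, not here.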


Keywords


feature extraction, Mutual Information Maximization (MIM), similarity measure, clustering






DOI: https://doi.org/10.15837/ijccc.2013.5.644



Copyright (c) 2017 Sabra El Ferchichi, Salah Zidi, Salah Maouche, Kaouther Laabidi, Moufida Ksouri

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.



INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL (IJCCC), With Emphasis on the Integration of Three Technologies (C & C & C), ISSN 1841-9836.
