Improving a SVM Meta-classifier for Text Documents by using Naive Bayes

Authors

  • Daniel Morariu, "Lucian Blaga" University of Sibiu, Engineering Faculty, Computer Science Department, E. Cioran Street, No. 4, 550025 Sibiu, ROMANIA
  • Radu Crețulescu, "Lucian Blaga" University of Sibiu, Engineering Faculty, Computer Science Department, E. Cioran Street, No. 4, 550025 Sibiu, ROMANIA
  • Lucian Vințan, "Lucian Blaga" University of Sibiu, Engineering Faculty, Computer Science Department, E. Cioran Street, No. 4, 550025 Sibiu, ROMANIA

Keywords:

Meta-classification, Support Vector Machine, Naive Bayes, Text documents, Performance evaluation

Abstract

Text categorization is the problem of classifying text documents into a set of predefined classes. In this paper, we investigate two approaches: a) developing a classifier for text documents based on Naive Bayes theory, and b) integrating this classifier into a meta-classifier in order to increase the classification accuracy. The basic idea is to learn a meta-classifier that optimally selects the best component classifier for each data point. The experimental results show that combining classifiers can significantly improve the classification accuracy and that our improved meta-classification strategy gives better results than each individual classifier. For Reuters2000 text documents we obtained classification accuracies of up to 93.87%.
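
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a multinomial Naive Bayes model over token counts with Laplace smoothing, and it stands in for the learned meta-classifier with a much simpler rule that picks the component classifier reporting the highest confidence for a given document. The class name `MultinomialNaiveBayes`, the helper `select_most_confident`, and the toy training data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class MultinomialNaiveBayes:
    """Bag-of-words Naive Bayes with Laplace (add-one) smoothing. Illustrative only."""

    def fit(self, documents, labels):
        # documents: list of token lists; labels: one class name per document.
        self.classes = sorted(set(labels))
        self.class_doc_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for tokens, label in zip(documents, labels):
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        return self

    def log_scores(self, tokens):
        # log P(c) + sum over tokens of log P(token | c), ignoring unseen tokens.
        vocab_size = len(self.vocabulary)
        scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.class_doc_counts[c] / self.total_docs)
            for t in tokens:
                if t in self.vocabulary:
                    smoothed = (self.word_counts[c][t] + 1) / (total_words + vocab_size)
                    score += math.log(smoothed)
            scores[c] = score
        return scores

    def predict_proba(self, tokens):
        # Normalize log scores into posterior probabilities (stable softmax).
        scores = self.log_scores(tokens)
        top = max(scores.values())
        unnormalized = {c: math.exp(s - top) for c, s in scores.items()}
        total = sum(unnormalized.values())
        return {c: v / total for c, v in unnormalized.items()}

    def predict(self, tokens):
        proba = self.predict_proba(tokens)
        return max(proba, key=proba.get)


def select_most_confident(classifiers, tokens):
    """Assumed selection rule: trust the component with the highest posterior."""
    best_label, best_confidence = None, -1.0
    for clf in classifiers:
        proba = clf.predict_proba(tokens)
        label = max(proba, key=proba.get)
        if proba[label] > best_confidence:
            best_label, best_confidence = label, proba[label]
    return best_label, best_confidence


# Toy data (hypothetical, not from the Reuters corpus).
train_docs = [
    ["stocks", "market", "shares"],
    ["market", "profit", "shares"],
    ["match", "goal", "team"],
    ["team", "coach", "goal"],
]
train_labels = ["finance", "finance", "sport", "sport"]

model = MultinomialNaiveBayes().fit(train_docs, train_labels)
label, confidence = select_most_confident([model], ["goal", "coach"])
print(label, round(confidence, 3))
```

In a setup closer to the one described above, the list passed to `select_most_confident` would also contain the SVM components (wrapped so that they expose a comparable confidence value), and the selection rule itself would be learned rather than fixed.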

Published

2010-09-01
