Semantic Graph Based Convolutional Neural Network for Spam e-mail Classification in Cybercrime Applications

S. Rahmath Nisha; S. Muthurajkumar

doi:10.15837/ijccc.2023.1.4478

Authors

S. Rahmath Nisha, Department of Computer Science and Engineering, K. Ramakrishnan College of Technology, Tamilnadu, India
S. Muthurajkumar Department of Computer Technology, Madras Institute of Technology (MIT) Campus, Anna University, Tamilnadu, India

DOI:

https://doi.org/10.15837/ijccc.2023.1.4478

Keywords:

Spam E-mail classification, Convolutional Neural Network, Semantic Graph, Graph Neural Network

Abstract

Spam consists of unwanted, junk E-mails. As the volume of unsolicited E-mail grows, it is increasingly important for mail users to have a trustworthy spam filter. Existing spam classifiers are limited by their growing inability to handle large volumes of relevant messages and to detect spam effectively. The large number of features involved in spam classification is also problematic. Feature selection is one of the most widely used and successful techniques for feature reduction, which makes it a crucial task in identifying keyword content: features that are unnecessary and irrelevant, and that may harm efficiency, are removed. In this study, we present SGNN-CNN (Semantic Graph Neural Network with CNN) to tackle the difficult task of mail identification. By projecting E-mails onto a graph and classifying them with the SGNN-CNN model, this technique transforms the E-mail classification problem into a graph classification problem. Because the E-mail features are derived from the semantic network, there is no need to embed the words into a representation. The technique's effectiveness is evaluated on several public datasets, and the experiments demonstrate the high accuracy of the proposed approach for classifying E-mails. In terms of spam classification, its performance is superior to state-of-the-art deep learning-based methods.
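
The abstract does not say how an E-mail is projected onto a graph. As a minimal sketch only, assuming a sliding-window word co-occurrence graph (a common choice for text graphs, not necessarily the authors' construction), the projection step might look like the following. The function name `build_cooccurrence_graph`, the `window` parameter, and the tokenization rule are all hypothetical.

```python
from collections import defaultdict
import re


def build_cooccurrence_graph(text, window=3):
    """Map an e-mail body to a weighted, undirected word graph.

    Assumption: nodes are lowercase word tokens, and an edge links two
    distinct words that fall inside the same sliding window of `window`
    tokens. The edge weight counts how often that happens.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    edges = defaultdict(int)
    for i, left in enumerate(tokens):
        # Pair the current token with the next (window - 1) tokens.
        for right in tokens[i + 1 : i + window]:
            if left != right:
                # Sort the pair so (a, b) and (b, a) share one undirected edge.
                edges[tuple(sorted((left, right)))] += 1
    return sorted(set(tokens)), dict(edges)


nodes, edges = build_cooccurrence_graph("Win a free prize now, claim your free prize")
```

Each E-mail then becomes one graph (`nodes`, `edges`), and a graph classifier would label the whole graph as spam or not spam. Here the repeated phrase "free prize" yields a heavier edge between those two nodes than between any other pair.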
