Model of Network Topic Detection Based on Web Usage Behaviour Mode Analysis and Mining Technology

  • Mo Chen

Abstract

With the arrival of the big data era, characterized by semi-structured and unstructured text, accurate network topic detection has attracted wide attention from researchers. This paper proposes a model of network topic detection based on web usage behaviour mode analysis and mining technology, taking Web news as the object of research. The author elaborates the main functions and methods proposed in this model, which include the analysis module of Web news instance clicking mode, the analysis module of Web news instance retrieval mode, the analysis module of Web news seed instances, and the analysis module of similar Web news instances supporting topics. Based on these functions and methods, the author elaborates the main algorithms proposed in this model, which include the mining algorithm for Web news seed instances and the mining algorithm for similar Web news instances supporting topics. These algorithms are applied in the processing module of the model and focus on how to detect network topics efficiently from a large volume of web usage behaviour towards Web news instances, in order to explore a research method for network topic detection. The experimental analysis consists of three steps. First, the author analyses the precision of topic detection under different methods. Second, the author analyses how the number of Web news instances concerned and the seed threshold affect the quality of Web news topic detection. Finally, the author analyses how the number of Web news instances concerned and the probability threshold affect the quality of the mined Web news instances supporting topics. The results show the feasibility, validity and superiority of the model design, and the model plays an important role in constructing a topic-focused Web news corpus that provides a real-time data source for topic evolution tracking.

Published

2017-03-01

How to Cite

CHEN, Mo. Model of Network Topic Detection Based on Web Usage Behaviour Mode Analysis and Mining Technology. INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL, [S.l.], v. 12, n. 2, p. 183-200, mar. 2017. ISSN 1841-9844. Available at: <http://univagora.ro/jour/index.php/ijccc/article/view/2599>. Date accessed: 16 july 2020. doi: https://doi.org/10.15837/ijccc.2017.2.2599.

Keywords

web usage behaviour, network topic detection, clicking mode analysis, retrieval mode analysis