Research on Key Technology of Web Hierarchical Topic Detection and Evolution Based on Behaviour Tracking Analysis

Mo Chen


In the era of big data, Web hierarchical topic detection and evolution over semi-structured and unstructured data has attracted wide attention from academics. Taking network big data as the research object, this paper proposes an approach to Web hierarchical topic detection and evolution based on behaviour tracking analysis. It describes the main implementation methods: instance analysis of the usage mode, instance analysis of the seed, analysis of the sets of similar instances supporting topics, analysis of the sets of similar instances supporting events, and evolution analysis of events. It then presents the algorithm for Web hierarchical topic detection and evolution based on behaviour tracking analysis. The experimental analysis is organized in three parts. First, it evaluates the quality of topic detection: how the accuracy rate varies with the number of instances concerned and the seed threshold, and with the number of instances concerned and the probability threshold. Second, it evaluates the quality of topic evolution: how the accuracy rate varies with parameter adjustment, and with the number of instances concerned and the similarity threshold. Finally, it compares the time consumed by different methods in solving the main research problem, and the qualitative results of topic detection and evolution on different data sets. The experimental results show that the approach is feasible, verifiable and superior, and that it can play a major role in reconfiguring the Web hierarchical topic corpus and in providing an intelligent big data warehouse for network information evolution applications.


Web hierarchical topic, topic detection, event evolution, behaviour tracking analysis

 Impact Factor in JCR2017 (Clarivate Analytics/SCI Expanded/ISI Web of Science): IF=1.29 (Q3). Scopus: CiteScore2017=1.04 (Q2); Editors-in-Chief: Ioan DZITAC & Florin Gheorghe FILIP.