Mining Temporal Sequential Patterns Based on Multi-granularities

Authors

  • Naiqian Li, Department of Computer Science, Baoji University of Arts and Sciences, Baoji 721016, Shaanxi, China
  • Xinhui Yao, Department of Computer Science, Baoji University of Arts and Sciences, Baoji 721016, Shaanxi, China
  • Dongpin Tian, Department of Computer Science, Baoji University of Arts and Sciences, Baoji 721016, Shaanxi, China

Keywords:

Data Mining Algorithm, Sequential Pattern Mining, Sequential Data, Time Granularity, Temporal Patterns

Abstract

Sequential pattern mining is an important data mining problem: it extracts frequent subsequences from a set of sequences. However, the times between successive items in a sequence are typically used only as user-specified constraints, either to pre-process the input data or to prune the pattern search space. In either case, the times cannot be used to identify the item intervals of sequential patterns. In this paper, we introduce multi-granularity sequential patterns: sequential patterns in which each transition time is annotated with a multi-granularity boundary interval and an average time derived from the source data, rather than with a user-predetermined time interval or a single typical time. We then present MG-PrefixSpan, a novel algorithm based on PrefixSpan that discovers all such patterns. Empirical evaluation shows that MG-PrefixSpan scales linearly with the size of the database and scales well with respect to sequence length and transaction size.
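
To make the idea of an annotated transition time concrete, the following is a minimal sketch and not the MG-PrefixSpan algorithm itself. It assumes timestamps in seconds, a hypothetical set of two granularities (hour and day), and a hypothetical helper named annotate_transitions that handles a single two-item pattern. For each sequence it takes the gap between the first occurrence of one item and the next occurrence of another, then reports the boundary interval (minimum and maximum) and the average gap at each granularity, all derived from the data rather than supplied by the user.

```python
# Hypothetical granularities, expressed in seconds (an assumption for this sketch).
GRANULARITIES = {"hour": 3600, "day": 86400}


def annotate_transitions(sequences, first, second, min_support=2):
    """Summarise the time gaps of the two-item pattern <first, second>.

    Each sequence is a list of (item, timestamp_in_seconds) pairs sorted by time.
    Returns None when fewer than `min_support` sequences contain the pattern.
    """
    gaps = []
    for seq in sequences:
        start = None
        for item, ts in seq:
            if start is None and item == first:
                start = ts  # first occurrence of `first`
            elif start is not None and item == second:
                gaps.append(ts - start)  # gap to the next `second`
                break
    if len(gaps) < min_support:
        return None

    summary = {"support": len(gaps)}
    for name, unit in GRANULARITIES.items():
        in_units = [g // unit for g in gaps]  # truncate to whole units
        summary[name] = {
            "min": min(in_units),
            "max": max(in_units),
            "avg": sum(gaps) / len(gaps) / unit,
        }
    return summary
```

A full miner would compute such annotations for every frequent pattern while growing prefixes; this sketch only shows what the annotation for one transition might contain.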

Author Biography

Naiqian Li, Department of Computer Science, Baoji University of Arts and Sciences, Baoji 721016, Shaanxi, China

Department of Mathematics and Computer Science

References

Constantinescu, Z, Marinoiu, C, Vladoiu, M, "Driving Style Analysis Using Data Mining Techniques," International Journal of Computers Communications and Control, Vol.5, No.5, pp. 654-663, Dec 2010.

Andonie, R, "Extreme Data Mining: Inference from Small Datasets," International Journal of Computers Communications and Control, Vol.5, No.3, pp. 280-291, Sep 2010.

R. Agrawal and R. Srikant, "Mining sequential patterns," Proc. of the 7th International Conference on Data Engineering (ICDE'95), pp. 3-14, March, 1995. http://dx.doi.org/10.1109/ICDE.1995.380415

R. Srikant and R. Agrawal, "Mining sequential patterns: Generalizations and performance improvements," Proc. of the 5th International Conference on Extending Database Technology, pp. 3-17, March, 1996.

M. Zaki, "An Efficient Algorithm for Mining Frequent Sequences," Machine Learning, Vol. 40, pp. 31-60, 2000.

J. Pei, J. Han and H. Pinto et al, "Mining Sequential Patterns by Pattern-Growth: The PrefixSpan Approach," IEEE Transactions on Knowledge and Data Engineering, Vol.16, No.11, pp. 1424-1440, 2004. http://dx.doi.org/10.1109/TKDE.2004.77

H. Mannila, H. Toivonen, and A.I. Verkamo, "Discovery of frequent episodes in event sequences," Data Mining and Knowledge Discovery, Vol. 1, No. 3, pp. 259-289, 1997.

M.N. Garofalakis, R. Rastogi and K. Shim, "SPIRIT: Sequential Pattern Mining with Regular Expression Constraints," Proc. of the 25th International Conference on Very Large Data Bases (VLDB'99), pp. 223-234, September, 1999.

J. Pei, J. Han and W. Wang, "Constraint-based Sequential Pattern Mining: The Pattern-growth Methods," Journal of Intelligent Information Systems, Volume 28, Issue 2, pp. 133-160, April, 2007. http://dx.doi.org/10.1007/s10844-006-0006-z

M. Yoshida et al., "Mining sequential patterns including time intervals," Proc. of SPIE Conf. - DMKD, pp. 213-220, April, 2000.

Y.-L. Chen, M.-C. Chiang and M.-T. Ko, "Discovering time-interval sequential patterns in sequence databases," Expert Systems with Applications, Volume 25, Issue 3, pp. 343-354, October, 2003. http://dx.doi.org/10.1016/S0957-4174(03)00075-7

R. Agrawal and R. Srikant, "Fast algorithms for mining association rules in large databases," Proc. of the 20th International Conference on Very Large Data Bases (VLDB'94), pp. 487-499, September 1994.

Y. Hirate, H. Yamana, "Generalized Pattern Mining with Item Intervals," Journal of Computers, Vol.1, No.3, pp. 51-60, June, 2006. http://dx.doi.org/10.4304/jcp.1.3.51-60

F. Giannotti, M. Nanni, and D. Pedreschi. "Efficient Mining of Temporally Annotated Sequences," Proc. of the 6th SIAM International Conference on Data Mining, pp. 346-357, April, 2006.

C. Bettini, X.S. Wang and S. Jajodia et al, "Discovering Temporal Relationships with Multiple Granularities in Time Sequences," IEEE Transactions on Knowledge and Data Engineering, Vol. 10 (2), pp. 222-237, 1998. http://dx.doi.org/10.1109/69.683754

Published

2014-09-18
