A New Rymon Tree Based Procedure for Mining Statistically Significant Frequent Itemsets
Keywords:
frequent itemset mining, association analysis, Apriori algorithm, Rymon tree
Abstract
In this paper we propose a new method for frequent itemset mining that is more efficient than the well-known Apriori algorithm. The method is based on a special structure called a Rymon tree. For its implementation, we suggest a modified sort-merge-join algorithm. Finally, we explain how the support measure, which is used in the Apriori algorithm, yields statistically significant frequent itemsets.
References
Agrawal, R., Srikant, R., Fast Algorithms for Mining Association Rules, Proceedings of VLDB-94, 487-499, Santiago, Chile (1994)
Coenen, F.P., Leng, P., Ahmed, S., T-Trees, Vertical Partitioning and Distributed Association Rule Mining, Proceedings ICDM-2003, 513-516 (2003) http://dx.doi.org/10.1109/icdm.2003.1250965
Coenen, F.P., Leng, P., Ahmed, S., Data Structures for Association Rule Mining: T-trees and P-trees, IEEE Transactions on Knowledge and Data Engineering, Vol. 16, No. 6, 774-778 (2004) http://dx.doi.org/10.1109/TKDE.2004.8
Coenen, F.P., Leng, P., Goulbourne, G., Tree Structures for Mining Association Rules, Journal of Data Mining and Knowledge Discovery Vol. 8, No. 1, 25-51 (2004) http://dx.doi.org/10.1023/B:DAMI.0000005257.93780.3b
Goulbourne, G., Coenen, F., Leng, P., Algorithms for Computing Association Rules Using a Partial-Support Tree, Journal of Knowledge-Based Systems Vol. 13, 141-149 (1999) http://dx.doi.org/10.1016/S0950-7051(00)00055-1
Grahne, G., Zhu, J., Efficiently Using Prefix-trees in Mining Frequent Itemsets, Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations (2003)
Han, J., Pei, J., Yu, P.S., Mining Frequent Patterns without Candidate Generation, Proceedings of the ACM SIGMOD Conference on Management of Data, 1-12 (2000) http://dx.doi.org/10.1145/342009.335372
Rymon, R., Search Through Systematic Set Enumeration, Proceedings of 3rd International Conference on Principles of Knowledge Representation and Reasoning, 539-550 (1992)
Silberschatz, A., Korth, H. F., Sudarshan, S., Database System Concepts, McGraw-Hill, New York (2006)
Simovici, D. A., Djeraba, C., Mathematical Tools for Data Mining (Set Theory, Partial Orders, Combinatorics), Springer-Verlag London Limited (2008)
Stanisic, P., Tomovic, S., Apriori Multiple Algorithm for Mining Association Rules, Information Technology and Control Vol. 37, No. 4, 311-320 (2008)
Tan, P.N., Steinbach, M., Kumar, V., Introduction to Data Mining, Addison-Wesley (2006)
License
ONLINE OPEN ACCESS: Access to the full text of each article and each issue is allowed for free under the Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0).
You are free to:
-Share: copy and redistribute the material in any medium or format;
-Adapt: remix, transform, and build upon the material.
The licensor cannot revoke these freedoms as long as you follow the license terms.
DISCLAIMER: The author(s) of each article appearing in International Journal of Computers Communications & Control is/are solely responsible for the content thereof; the publication of an article shall not constitute or be deemed to constitute any representation by the Editors or Agora University Press that the data presented therein are original, correct or sufficient to support the conclusions reached or that the experiment design or methodology is adequate.