SQL based frequent pattern mining without candidate generation

Xuequn Shang, Kai Uwe Sattler, Ingolf Geist

Research output: Contribution to conference, Paper, Peer-reviewed

10 citations (Scopus)

Abstract

Scalable data mining in large databases is one of today's real challenges in database research. Integrating data mining with database systems is essential for any successful large-scale data mining application. A fundamental component of data mining tasks is finding frequent patterns in a given dataset. Most previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there are prolific patterns and/or long patterns. In this study we present an evaluation of SQL-based frequent pattern mining with a novel frequent pattern growth (FP-growth) method, which is efficient and scalable for mining both long and short patterns without candidate generation. We examine some techniques to improve performance. In addition, we report a performance evaluation on a commercial DBMS (IBM DB2 UDB EEE V8).

Original language: English
Pages: 618-619
Number of pages: 2
Publication status: Published - 2004
Externally published: Yes
Event: Applied Computing 2004 - Proceedings of the 2004 ACM Symposium on Applied Computing - Nicosia, Cyprus
Duration: 14 Mar 2004 to 17 Mar 2004

Conference

Conference: Applied Computing 2004 - Proceedings of the 2004 ACM Symposium on Applied Computing
Country/Territory: Cyprus
City: Nicosia
Period: 14/03/04 to 17/03/04
