Processing sequential patterns in relational databases

Xuequn Shang, Kai Uwe Sattler

Research output: Contribution to journal, conference article, peer-reviewed

1 citation (Scopus)

Abstract

Database integration of data mining has gained popularity, and its significance is well recognized. However, SQL-based data mining is known to perform worse than specialized implementations because of the prohibitive cost of extracting knowledge and the lack of suitable declarative query language support. Recent studies have found that, for association rule mining and sequential pattern mining, carefully tuned SQL formulations can achieve performance comparable to systems that cache the data in files outside the DBMS. However, most previous pattern mining methods follow the Apriori approach, which still encounters problems when the sequential database is large and/or when the sequential patterns to be mined are numerous and long. In this paper, we present a novel SQL-based approach that we recently proposed, called Prospad (PROjection Sequential PAttern Discovery). Prospad differs fundamentally from Apriori-like candidate generation-and-test approaches: it is a pattern-growth approach without candidate generation. It grows longer patterns from shorter ones by successively projecting the sequential table into subsequential tables. Since the projected table for a sequential pattern i contains all and only the information necessary for mining the sequential patterns that can grow from i, the size of the projected table usually shrinks quickly as mining proceeds to longer patterns. Moreover, a depth-first approach is used to facilitate the projection process and to avoid the cost of creating and dropping some temporary tables.
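The projection idea described in the abstract can be illustrated with a minimal in-memory sketch. This is not the authors' SQL implementation of Prospad: Python lists stand in for the projected tables, each sequence element is assumed to be a single item, and the function name `mine`, the sample data, and the support threshold are hypothetical choices made only for illustration.

```python
from collections import Counter

def mine(sequences, min_support, prefix=()):
    """Depth-first pattern growth over projected sequence lists (illustrative only)."""
    patterns = {}
    # Count each item at most once per sequence to get its support.
    counts = Counter()
    for seq in sequences:
        counts.update(set(seq))
    for item, support in sorted(counts.items()):
        if support < min_support:
            continue
        pattern = prefix + (item,)
        patterns[pattern] = support
        # Projection: keep only the suffix after the first occurrence of item.
        projected = [seq[seq.index(item) + 1:] for seq in sequences if item in seq]
        projected = [s for s in projected if s]  # drop exhausted sequences
        # Recurse immediately (depth first), so only one branch is live at a time.
        patterns.update(mine(projected, min_support, pattern))
    return patterns

# Hypothetical toy data: four sequences over the items a, b, c.
sequences = [("a", "b", "c"), ("a", "c"), ("b", "c"), ("a", "b")]
patterns = mine(sequences, min_support=2)
for pattern, support in sorted(patterns.items()):
    print(pattern, support)
```

Each recursive call sees only the suffixes that can still extend the current pattern, which mirrors the abstract's point that projected tables tend to shrink as patterns grow longer.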

Original language: English
Pages (from-to): 438-447
Number of pages: 10
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3589
DOI
Publication status: Published - 2005
Externally published: Yes
Event: 7th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2005 - Copenhagen, Denmark
Duration: 22 Aug 2005 to 26 Aug 2005
