Depth-first frequent itemset mining in relational databases

Xuequn Shang, Kai Uwe Sattler

Research output: Contribution to conference › Paper › peer-review

7 Scopus citations

Abstract

Data mining on large relational databases has gained popularity, and its significance is well recognized. However, the performance of SQL-based data mining is known to fall behind that of specialized implementations, owing to the prohibitive cost of extracting knowledge and the lack of suitable declarative query language support. We investigate SQL-based approaches to the problem of finding frequent patterns in a transaction table, including an algorithm that we recently proposed, called Propad (Projection PAttern Discovery). Propad differs fundamentally from the Apriori-like candidate generation-and-test approach: it successively projects the transaction table into frequent itemsets, which avoids both multiple passes over the large original transaction table and the generation of a huge set of candidates. We evaluated performance on a DBMS (IBM DB2 UDB EEE V8) and compared the results with the K-Way join approach proposed in [11] and the SQL-based FP-tree approach proposed in [13]. The experimental results show that our algorithm performs efficiently.
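
The projection idea in the abstract can be illustrated with a small sketch. This is not the authors' Propad implementation: it is a minimal depth-first, projection-based miner written against SQLite through Python's standard `sqlite3` module, and the table name `txn`, the columns `tid` and `item`, the scratch tables `proj_<depth>`, and the function names are all assumptions made for illustration. Each recursion level counts item supports in the current table with one `GROUP BY` query, then builds a smaller projected table for every frequent item instead of generating and testing candidate sets.

```python
import sqlite3


def _mine(conn, table, prefix, min_support, out, depth):
    """Depth-first mining over `table`, a projection for the itemset `prefix`."""
    # One pass over the (projected) table finds the locally frequent items.
    rows = conn.execute(
        f"SELECT item, COUNT(DISTINCT tid) FROM {table} "
        "GROUP BY item HAVING COUNT(DISTINCT tid) >= ? ORDER BY item",
        (min_support,),
    ).fetchall()
    for item, support in rows:
        pattern = prefix + (item,)
        out[pattern] = support
        # Project: keep only transactions containing `item`, and only the
        # items ordered after it, so each itemset is enumerated once.
        proj = f"proj_{depth}"  # hypothetical scratch-table name
        conn.execute(f"CREATE TEMP TABLE {proj} (tid INTEGER, item TEXT)")
        conn.execute(
            f"INSERT INTO {proj} SELECT tid, item FROM {table} "
            f"WHERE item > ? AND tid IN (SELECT tid FROM {table} WHERE item = ?)",
            (item, item),
        )
        _mine(conn, proj, pattern, min_support, out, depth + 1)
        conn.execute(f"DROP TABLE {proj}")


def frequent_itemsets(transactions, min_support):
    """Return {itemset tuple: support} for a dict of tid -> list of items."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE txn (tid INTEGER, item TEXT)")
    conn.executemany(
        "INSERT INTO txn VALUES (?, ?)",
        [(tid, item) for tid, items in transactions.items() for item in items],
    )
    out = {}
    _mine(conn, "txn", (), min_support, out, 0)
    conn.close()
    return out


if __name__ == "__main__":
    data = {1: ["a", "b", "c"], 2: ["a", "b"], 3: ["a", "c"],
            4: ["b", "c"], 5: ["a", "b", "c"]}
    for itemset, support in sorted(frequent_itemsets(data, 3).items()):
        print(itemset, support)
```

Because every projected table holds only the transactions and items still relevant to the current prefix, the original table is scanned once at the top level and deeper levels work on progressively smaller data. A production SQL implementation would likely differ in how it materializes and indexes these projections.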

Original language: English
Pages: 1112-1117
Number of pages: 6
State: Published - 2005
Externally published: Yes
Event: 20th Annual ACM Symposium on Applied Computing - Santa Fe, NM, United States
Duration: 13 Mar 2005 → 17 Mar 2005

Conference

Conference: 20th Annual ACM Symposium on Applied Computing
Country/Territory: United States
City: Santa Fe, NM
Period: 13/03/05 → 17/03/05

Keywords

  • Data mining
  • Database mining
  • Frequent pattern mining
  • Mining algorithms in SQL
