1. Jung Hoon Kim
N5, Room 2239
E-mail: junghoon.kim@kaist.ac.kr
2014.01.07
KAIST Knowledge Service Engineering
Data Mining Lab.
2. Introduction
Frequent pattern and association rule mining is one of
the few exceptions to emerge from machine learning
Apriori algorithm
AprioriTid algorithm
AprioriAll algorithm
FP-Tree algorithm
4. Principle
Downward closure property:
If an itemset is frequent,
then all of its subsets must
also be frequent.
If an itemset is not frequent,
none of its supersets can be
frequent.
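The pruning that this property allows can be sketched as follows. This is a minimal illustration, not the full Apriori algorithm: the function name `generate_candidates` and the sample 2-itemsets are assumptions made for the example.

```python
from itertools import combinations

def generate_candidates(frequent_prev, k):
    """Join frequent (k-1)-itemsets into k-itemsets, then drop every
    candidate that has at least one infrequent (k-1)-subset."""
    frequent_prev = set(frequent_prev)
    candidates = set()
    for a in frequent_prev:
        for b in frequent_prev:
            union = a | b
            if len(union) != k:
                continue
            # Downward closure: every (k-1)-subset must itself be frequent.
            if all(frozenset(s) in frequent_prev
                   for s in combinations(union, k - 1)):
                candidates.add(union)
    return candidates

# Hypothetical frequent 2-itemsets, used only for illustration.
frequent_2 = {frozenset("ab"), frozenset("ac"), frozenset("bc"), frozenset("bd")}
candidates_3 = generate_candidates(frequent_2, 3)
# {a, b, d} is pruned because {a, d} is not frequent;
# {b, c, d} is pruned because {c, d} is not frequent.
```

Only {a, b, c} survives, so only that candidate needs to be counted in the next database scan.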
7. Discussion
Too many database scans make the computation expensive.
minsup and minconf must be specified in advance.
A hash tree is used to store the candidate itemsets.
Some implementations adopt a trie structure to store the itemsets instead.
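To make the role of the two thresholds concrete, here is a small sketch of how support and confidence could be computed and checked against minsup and minconf. The transactions, the threshold values, and the function names are assumptions chosen for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent).
    Assumes the antecedent occurs in at least one transaction."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

# Hypothetical basket data and thresholds.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
MINSUP, MINCONF = 0.5, 0.6

sup = support({"bread", "milk"}, transactions)         # 2 of 4 baskets
conf = confidence({"bread"}, {"milk"}, transactions)   # 0.5 / 0.75
rule_accepted = sup >= MINSUP and conf >= MINCONF
```

With these values the rule bread => milk is accepted; raising MINCONF to 0.7 would reject it, which is why the choice of thresholds matters.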
12. FP-Growth
To avoid scanning the database multiple times:
the cost of repeated database scans is too high.
To avoid generating lots of candidates:
in the Apriori algorithm, the bottleneck is candidate
generation.
How can we solve these problems?
13. FP-Growth
The algorithm is quite simple:
1. Scan the database once, find frequent 1-itemsets
(single item patterns)
2. Sort the frequent items in descending order of
frequency to get the F-list (F-list = f-c-a-b-m-p)
3. Scan the DB again, construct the FP-tree
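The three steps can be sketched as below. The five transactions and minsup = 3 are assumptions chosen so that the frequent items are f, c, a, b, m, p; ties in frequency are broken alphabetically here, so tied items may appear in a different order than in the F-list shown on the slide. The header table and node-links of a full FP-tree are omitted to keep the sketch short.

```python
from collections import Counter

class FPNode:
    def __init__(self, item=None, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(transactions, minsup):
    # Scan 1: count items and keep the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    # Sort by descending frequency (ties broken alphabetically: an assumption).
    flist = sorted((i for i, c in counts.items() if c >= minsup),
                   key=lambda i: (-counts[i], i))
    rank = {item: r for r, item in enumerate(flist)}
    # Scan 2: insert each transaction, restricted to frequent items and
    # ordered by the F-list, so that shared prefixes share tree nodes.
    root = FPNode()
    for t in transactions:
        node = root
        for item in sorted((i for i in set(t) if i in rank), key=rank.get):
            if item not in node.children:
                node.children[item] = FPNode(item, node)
            node = node.children[item]
            node.count += 1
    return root, flist

# Assumed toy database; each string is one transaction of single-letter items.
transactions = ["facdgimp", "abcflmo", "bfhjo", "bcksp", "afcelpmn"]
root, flist = build_fp_tree(transactions, minsup=3)
```

Four of the five transactions share the first node under the root, which is where the compression of the FP-tree comes from.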
18. Mine an FP-Tree
I. forming conditional pattern bases
II. constructing conditional FP-trees
III. recursively mining conditional FP-trees
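The recursion behind these three steps can be sketched compactly. As a simplification, this version keeps each conditional pattern base as a plain list of (prefix path, count) pairs instead of compressing it into a conditional FP-tree, so it shows the pattern-growth idea rather than an efficient implementation. The transactions, minsup = 3, and the name `pattern_growth` are assumptions for illustration.

```python
from collections import defaultdict

def pattern_growth(base, minsup, suffix=()):
    """base: list of (path, count); every path follows the same item order."""
    counts = defaultdict(int)
    for path, n in base:
        for item in path:
            counts[item] += n
    patterns = {}
    for item, sup in counts.items():
        if sup < minsup:
            continue
        pattern = (item,) + suffix
        patterns[frozenset(pattern)] = sup
        # I.  Conditional pattern base: the prefix before `item` in each path.
        cond_base = [(path[:path.index(item)], n)
                     for path, n in base if item in path]
        # II./III. Recurse on the conditional base (kept uncompressed here).
        patterns.update(pattern_growth(cond_base, minsup, pattern))
    return patterns

FLIST = "fcabmp"  # item order taken from the F-list above
# Assumed toy database; each string is one transaction.
transactions = ["facdgimp", "abcflmo", "bfhjo", "bcksp", "afcelpmn"]
# Keep only frequent items, ordered by the F-list.
base = [(tuple(i for i in FLIST if i in t), 1) for t in transactions]
patterns = pattern_growth(base, minsup=3)
```

Each frequent itemset is produced exactly once, starting from its last item in F-list order, and no candidate itemsets are generated along the way.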
19. Conditional pattern base
The set of prefix paths that co-occur with a
frequent item as the suffix pattern
for example
m : <f, c, a> : support 2
m : <f, c, a, b> : support 1
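The example for m can be reproduced with a short sketch. It reads the prefixes directly from the F-list-ordered transactions instead of following node-links in the tree, which gives the same prefix paths and counts. The five transactions are an assumed toy database consistent with the F-list f-c-a-b-m-p.

```python
from collections import Counter

FLIST = "fcabmp"
# Assumed toy database; each string is one transaction.
transactions = ["facdgimp", "abcflmo", "bfhjo", "bcksp", "afcelpmn"]

def conditional_pattern_base(item, transactions, flist):
    """Count the prefix paths that precede `item` in F-list order."""
    base = Counter()
    for t in transactions:
        if item not in t:
            continue
        ordered = [i for i in flist if i in t]
        prefix = tuple(ordered[:ordered.index(item)])
        if prefix:  # an empty prefix contributes nothing to the base
            base[prefix] += 1
    return base

m_base = conditional_pattern_base("m", transactions, FLIST)
# m_base maps ('f', 'c', 'a') to 2 and ('f', 'c', 'a', 'b') to 1
```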
20. Conditional pattern tree
{m}’s conditional pattern tree
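As a sketch of how this tree follows from m's conditional pattern base: items that are infrequent within the base are dropped and the remaining paths are merged. A minimum support count of 3 is an assumption here; with it, b is dropped and a single path f-c-a remains, so every combination of its items together with m is a frequent pattern.

```python
from collections import Counter
from itertools import combinations

# m's conditional pattern base, as given in the example above.
m_base = {("f", "c", "a"): 2, ("f", "c", "a", "b"): 1}
MINSUP = 3  # assumed minimum support count

# Count each item inside the conditional base.
counts = Counter()
for path, n in m_base.items():
    for item in path:
        counts[item] += n

# Drop infrequent items and merge identical paths.
cond_paths = Counter()
for path, n in m_base.items():
    kept = tuple(i for i in path if counts[i] >= MINSUP)
    cond_paths[kept] += n

# Here the conditional tree is a single path, so patterns can be enumerated
# directly. Using the path count as the support of every pattern is valid
# only because all nodes on this path carry the same count.
assert len(cond_paths) == 1
single_path, path_count = next(iter(cond_paths.items()))
patterns = {frozenset(combo) | {"m"}: path_count
            for r in range(len(single_path) + 1)
            for combo in combinations(single_path, r)}
```

This yields eight patterns ending in m, from {m} up to {f, c, a, m}, each with support 3.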
22. Conclusion
In data mining, association rules are useful for analyzing
and predicting customer behavior. They play an
important part in shopping basket data analysis, product
clustering, catalog design and store layout.