Introduction to Machine Learning
Lecture 15
Advanced Topics in Association Rules Mining

Albert Orriols i Puig
http://www.albertorriols.net
aorriols@salle.url.edu

Artificial Intelligence – Machine Learning
Enginyeria i Arquitectura La Salle
Universitat Ramon Llull
Recap of Lecture 13-14
        Ideas come from market basket analysis (MBA)
                Let’s go shopping!
                          Customer1: milk, eggs, sugar, bread
                          Customer2: milk, eggs, cereal, bread
                          Customer3: eggs, sugar
                What do my customers buy? Which products are bought together?
                Aim: Find associations and correlations between the different
                items that customers place in their shopping basket
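
As a reminder of the two basic measures, here is a minimal sketch that computes support and confidence over the three baskets above. The function names `support` and `confidence` are illustrative choices, not part of the lecture.

```python
# Baskets from the slide, one set per customer.
baskets = [
    {"milk", "eggs", "sugar", "bread"},    # Customer1
    {"milk", "eggs", "cereal", "bread"},   # Customer2
    {"eggs", "sugar"},                     # Customer3
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= b for b in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """support(antecedent U consequent) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)

print(support({"milk", "bread"}, baskets))       # 2 of the 3 baskets
print(confidence({"eggs"}, {"sugar"}, baskets))  # 2 of the 3 baskets with eggs
```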
                                                                          Slide 2
Artificial Intelligence                Machine Learning
Recap of Lecture 13-14
        Apriori
                Will find all the associations with minimum support and
                confidence
                However:
                          Scans the database multiple times
                          Most often, there is a high number of candidates
                          Support counting for candidates can be time-consuming

        FP-growth
                Will obtain the same rules as Apriori
                Avoids candidate generation by building an FP-tree
                Counts the support of candidates more efficiently



                                                                                  Slide 3
Today’s Agenda
        Continuing our journey through some advanced
        topics in ARM
                Mining frequent patterns without candidate
                generation
                Multiple Level AR
                Sequential Pattern Mining
                Quantitative association rules
                Mining class association rules
                Beyond support & confidence
                Applications

                                                             Slide 4
Acknowledgments
        Part of this lecture is based on the work by




                                                       Slide 5
Why Multiple Level AR?
        Aim: Find associations between items

        But wait!
                There are many different diapers:
                          Dodot, Huggies …

                There are many different beers:
                          Heineken, Desperados, Kingfisher … in bottle/can …

        Which rule do you prefer?
                diapers ⇒ beer
                Dodot diapers M ⇒ Dam beer in can

        Which will have greater support?

                                                                                Slide 6
Concept Hierarchy
        Create is-a hierarchies
                Clothes
                          Outwear
                                    Jackets
                                    Ski Pants
                          Shirts
                Footwear
                          Shoes
                          Hiking Boots

        Assume we found the rule: Outwear ⇒ Hiking Boots
        Then
                Jackets ⇒ Hiking Boots may not have minimum support
                Clothes ⇒ Hiking Boots may not have minimum confidence
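
One simple way to hold such an is-a hierarchy in code is a child-to-parent map. The sketch below assumes each item has a single parent (a tree); the names `PARENT` and `ancestors` are illustrative, and a DAG would need a set of parents per item instead.

```python
# Child -> parent map for the two taxonomies on the slide (singular item names).
PARENT = {
    "Jacket": "Outwear",
    "Ski Pants": "Outwear",
    "Outwear": "Clothes",
    "Shirt": "Clothes",
    "Shoes": "Footwear",
    "Hiking Boots": "Footwear",
}

def ancestors(item, parent=PARENT):
    """Return all ancestors of `item`, nearest first, by walking up the map."""
    result = []
    while item in parent:
        item = parent[item]
        result.append(item)
    return result

print(ancestors("Jacket"))        # ['Outwear', 'Clothes']
print(ancestors("Hiking Boots"))  # ['Footwear']
```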


                                                                                 Slide 7
Concept Hierarchy
        This means that
                Rules at lower levels may not have enough support to be part of any
                frequent itemset
                However, rules at a lower level of the hierarchy which are overspecific
                may denote a strong association
                          Jackets ⇒ Hiking boots

        So, which rules do you want?
                Users are interested in generating rules that span different levels of
                the taxonomy
                Rules of lower levels may not have minimum support
                Taxonomy can be used to prune uninteresting or redundant rules
                Multiple taxonomies may be present
                For example: category, price (cheap, expensive), “items-on-sale”, etc
                Multiple taxonomies may be modeled as a forest, or a DAG

                                                                                    Slide 8
Notation

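        [Figure: taxonomy tree. Each edge is an is_a relationship. For the
        children c1 and c2, node p is the parent and node z above it is an
        ancestor (ancestors are marked with ^); the nodes below a node are
        its descendants.]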

                                                                     Slide 9
Notation
      Formalizing the problem
               I = {i1, i2, …, im}: the set of items
               T: a transaction, a set of items T ⊆ I
               D: the set of transactions
               T supports item x if x is in T or x is an ancestor of some item in T
               T supports X ⊆ I if it supports every item in X
               Generalized association rule: X ⇒ Y
                          if X ⊂ I, Y ⊂ I, X ∩ Y = ∅, and no item in Y is an ancestor of any
                          item in X.
                              Otherwise the rule is redundant: jacket ⇒ clothes is trivially true
                          The rule X ⇒ Y has confidence c in D if c% of transactions in D
                          that support X also support Y
                          The rule X ⇒ Y has support s in D if s% of transactions in D
                          support X ∪ Y
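
The definitions above translate almost directly into code. The following is a sketch under stated assumptions: the taxonomy is a tree stored as a child-to-parent map, the helper names (`supports`, `is_valid_rule`, `rule_stats`) are illustrative, and the two-transaction database at the end is a made-up toy, not data from the lecture.

```python
PARENT = {"Jacket": "Outwear", "Ski Pants": "Outwear", "Outwear": "Clothes",
          "Shirt": "Clothes", "Shoes": "Footwear", "Hiking Boots": "Footwear"}

def ancestors(item):
    """Set of all ancestors of `item` in the taxonomy."""
    out = set()
    while item in PARENT:
        item = PARENT[item]
        out.add(item)
    return out

def supports(transaction, itemset):
    """T supports X if every x in X is in T or is an ancestor of an item in T."""
    extended = set(transaction)
    for item in transaction:
        extended |= ancestors(item)
    return set(itemset) <= extended

def is_valid_rule(X, Y):
    """X => Y needs X and Y disjoint and no item of Y an ancestor of an item of X."""
    X, Y = set(X), set(Y)
    if not X or not Y or X & Y:
        return False
    return not any(Y & ancestors(x) for x in X)

def rule_stats(X, Y, D):
    """Return (support, confidence) of X => Y as fractions of the database D."""
    n_x = sum(supports(t, X) for t in D)
    n_xy = sum(supports(t, set(X) | set(Y)) for t in D)
    return n_xy / len(D), (n_xy / n_x if n_x else 0.0)

toy_db = [{"Jacket", "Hiking Boots"}, {"Shirt"}]   # hypothetical toy database
print(is_valid_rule({"Jacket"}, {"Clothes"}))      # False: Clothes is an ancestor
print(rule_stats({"Outwear"}, {"Footwear"}, toy_db))
```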
                                                                                    Slide 10
So, Let’s Re-state the Problem
        New aim: find all generalized association rules that have
        support and confidence greater than the user-specified
        minimum support (called minsup) and minimum confidence
        (called minconf), respectively

                Clothes
                          Outwear
                                    Jackets
                                    Ski Pants
                          Shirts
                Footwear
                          Shoes
                          Hiking Boots

           Antecedent and consequent may have items of any level of the hierarchy
           Do you see any potential problem?
                    I can find many redundant rules!

                                                                                  Slide 11
Mining the Example
        Database D

                Transaction   Items Bought
                100           Shirt
                200           Jacket, Hiking Boots
                300           Ski Pants, Hiking Boots
                400           Shoes
                500           Shoes
                600           Jacket

        Frequent Itemsets (minsup = 30%)

                Itemset                    Support (count)
                {Jacket}                   2
                {Outwear}                  3
                {Clothes}                  4
                {Shoes}                    2
                {Hiking Boots}             2
                {Footwear}                 4
                {Outwear, Hiking Boots}    2
                {Clothes, Hiking Boots}    2
                {Outwear, Footwear}        2
                {Clothes, Footwear}        2

        Rules (minconf = 60%)

                Rule                       Support   Confidence
                Outwear ⇒ Hiking Boots     33%       66.6%
                Outwear ⇒ Footwear         33%       66.6%
                Hiking Boots ⇒ Outwear     33%       100%
                Hiking Boots ⇒ Clothes     33%       100%
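
The tables above can be reproduced with a few lines of brute-force code. This is only a sketch for checking the numbers, not an efficient miner: it enumerates all 1- and 2-itemsets (enough for this tiny database), skips itemsets that contain an item together with one of its ancestors, and all helper names are illustrative.

```python
from itertools import combinations

PARENT = {"Jacket": "Outwear", "Ski Pants": "Outwear", "Outwear": "Clothes",
          "Shirt": "Clothes", "Shoes": "Footwear", "Hiking Boots": "Footwear"}
D = [{"Shirt"}, {"Jacket", "Hiking Boots"}, {"Ski Pants", "Hiking Boots"},
     {"Shoes"}, {"Shoes"}, {"Jacket"}]
MINSUP, MINCONF = 0.30, 0.60

def ancestors(item):
    out = set()
    while item in PARENT:
        item = PARENT[item]
        out.add(item)
    return out

# Extend every transaction with the ancestors of its items.
extended = [set(t).union(*(ancestors(i) for i in t)) for t in D]
items = sorted(set().union(*extended))

frequent = {}  # frozenset -> support count
for k in (1, 2):
    for cand in combinations(items, k):
        if any(ancestors(i) & set(cand) for i in cand):
            continue  # skip itemsets such as {Jacket, Outwear}
        n = sum(set(cand) <= t for t in extended)
        if n / len(D) >= MINSUP:
            frequent[frozenset(cand)] = n

rules = []  # (antecedent, consequent, support, confidence)
for itemset, n in frequent.items():
    if len(itemset) == 2:
        for x in itemset:
            (y,) = itemset - {x}
            conf = n / frequent[frozenset([x])]
            if conf >= MINCONF:
                rules.append((x, y, n / len(D), conf))

for x, y, s, c in sorted(rules):
    print(f"{x} => {y}: support={s:.1%}, confidence={c:.1%}")
```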

                                                                                            Slide 12
Mining the Example
        Observation 1
                If the set {x, y} has minimum support, so do {x^, y}, {x, y^} and
                {x^, y^}
                E.g.:
                          if {Jacket, Shoes} has minsup, then
                          {Outwear, Shoes}, {Jacket, Footwear}, and {Outwear,
                          Footwear} also have minimum support




                                                                                Slide 13
Mining the Example
        Observation 2
                If the rule x ⇒ y has minimum support and confidence, then
                x ⇒ y^ is guaranteed to have both minsup and minconf.
                E.g.:
                          The rule Outwear ⇒ Hiking Boots has minsup and minconf.
                          Hence the rule Outwear ⇒ Footwear has both minsup and minconf
                However, while the rules x^ ⇒ y and x^ ⇒ y^ will have minsup, they
                may not have minconf.
                E.g.:
                          Clothes ⇒ Hiking Boots
                          Clothes ⇒ Footwear
                             have minsup, but not minconf
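
Observation 2 can be checked numerically on the example database. In this sketch the six transactions are written out already extended with their ancestors, and `stats` is an illustrative helper for single-item rules only.

```python
# Transactions 100..600 of the example, already extended with their ancestors.
T = [
    {"Shirt", "Clothes"},
    {"Jacket", "Outwear", "Clothes", "Hiking Boots", "Footwear"},
    {"Ski Pants", "Outwear", "Clothes", "Hiking Boots", "Footwear"},
    {"Shoes", "Footwear"},
    {"Shoes", "Footwear"},
    {"Jacket", "Outwear", "Clothes"},
]

def stats(x, y):
    """Return (support, confidence) of the single-item rule x => y."""
    n_x = sum(x in t for t in T)
    n_xy = sum(x in t and y in t for t in T)
    return n_xy / len(T), n_xy / n_x

for x, y in [("Outwear", "Hiking Boots"),   # x  => y
             ("Outwear", "Footwear"),       # x  => y^
             ("Clothes", "Hiking Boots"),   # x^ => y
             ("Clothes", "Footwear")]:      # x^ => y^
    s, c = stats(x, y)
    print(f"{x} => {y}: support={s:.2f}, confidence={c:.2f}")
```

Generalizing the antecedent keeps the support at 2 of 6 transactions but enlarges the denominator of the confidence (4 transactions support Clothes, against 3 for Outwear), which is why confidence drops from about 67% to 50%.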



                                                                              Slide 14
Interesting Rules
        So, in which rules are we interested?
                Up to now, the interest of a rule was measured by
                          How much the support of the rule was more than the expected
                          support based on the support of the antecedent and the
                          consequent
                          But this does not consider the taxonomy
                          It prunes poorly… But now, I need to prune a lot!
                Srikant and Agrawal proposed a different approach
                          Consider that
                              Milk ⇒ cereal [s=0.08, c=0.70]
                          And that
                              Skim milk ⇒ cereal [s=0.02, c=0.70]
                          where Milk [s = ] is the parent of 2% Milk [s = ] and
                          Skim Milk [s = ] in the taxonomy
                          So, do you think that the second rule
                          is important?
                              Maybe not!
                                                                                        Slide 15
Interesting Rules
        A rule X ⇒ Y is R-interesting w.r.t.
        an ancestor X^ ⇒ Y^ (written \hat{X} ⇒ \hat{Y} in the formulas) if:

        \[ \text{real } s(X \Rightarrow Y) > R \cdot \text{expected } s(X \Rightarrow Y) \text{ based on } (\hat{X} \Rightarrow \hat{Y}) \]

        or

        \[ \text{real } c(X \Rightarrow Y) > R \cdot \text{expected } c(X \Rightarrow Y) \text{ based on } (\hat{X} \Rightarrow \hat{Y}) \]

                Aim: Interesting rules will be those whose support is more than
                R times the expected value or whose confidence is more than
                R times the expected value, for some user-specified constant R




                                                                          Slide 16
Interesting Rules
        What’s the expected value?
                A method is defined to compute the expected value

                \[ E_{\hat{Z}}[\Pr(Z)] = \frac{\Pr(z_1)}{\Pr(\hat{z}_1)} \times \dots \times \frac{\Pr(z_j)}{\Pr(\hat{z}_j)} \times \Pr(\hat{Z}) \]

                          Where Z^ (\hat{Z} in the formula) is an ancestor of Z
                Go to the papers for the details
        Now, we aim at:
                finding all generalized R-interesting association rules (R is a
                user-specified minimum interest called min-interest) that have
                support and confidence greater than minsup and minconf,
                respectively
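
To make the expected-value formula concrete, here is a small sketch applied to the milk example. The individual item supports (Milk 0.20, Skim Milk 0.05) and the threshold R = 1.1 are hypothetical values chosen for illustration; the slides leave them blank. Only the support test is shown, and the function names are assumptions.

```python
from math import prod, isclose

def expected_support(pairs, prob, ancestor_itemset_support):
    """E[Pr(Z)] = prod_j Pr(z_j) / Pr(z_j^) * Pr(Z^).

    pairs: (item, ancestor) tuples for the items of Z that were specialised.
    prob:  support of each single item, as a fraction.
    """
    ratio = prod(prob[z] / prob[z_hat] for z, z_hat in pairs)
    return ratio * ancestor_itemset_support

def is_r_interesting(real, expected, R):
    """Support (or confidence) must exceed R times its expected value."""
    return real > R * expected

# Hypothetical item supports: skim milk accounts for a quarter of milk sales.
prob = {"Milk": 0.20, "Skim Milk": 0.05}

# Ancestor rule Milk => cereal has s = 0.08; Skim milk => cereal has s = 0.02.
exp_s = expected_support([("Skim Milk", "Milk")], prob, 0.08)
print(exp_s)                                  # about 0.02
print(is_r_interesting(0.02, exp_s, R=1.1))   # False: the rule adds nothing new
```

Under these assumed numbers, the specialised rule has exactly the support one would predict from its ancestor, so it would be pruned as uninteresting.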

                                                                                  Slide 17
Algorithms to Mine General AR
        Follow three steps:
                1. Find all itemsets whose support is greater than minsup.
                   These itemsets are called frequent itemsets.
                2. Use the frequent itemsets to generate the desired rules:
                          if ABCD and AB are frequent, then
                          conf(AB ⇒ CD) = support(ABCD)/support(AB)
                3. Prune all uninteresting rules from this set

        Different algorithms for this purpose
                Basic
                Cumulate
                EstMerge
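
Step 2 above (rule generation from frequent itemsets) can be sketched as follows. The function name `generate_rules` and the toy counts are illustrative assumptions; step 3 (interest-based pruning) is not included here.

```python
from itertools import combinations

def generate_rules(frequent, n_transactions, minconf):
    """Generate rules X => Y from frequent itemsets.

    frequent: dict mapping frozenset -> support count; it must contain every
              subset of every frequent itemset (true for Apriori-style output).
    Returns a list of (X, Y, support, confidence) tuples.
    """
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                X = frozenset(antecedent)
                conf = count / frequent[X]   # support(X u Y) / support(X)
                if conf >= minconf:
                    rules.append((X, itemset - X, count / n_transactions, conf))
    return rules

# Hypothetical counts over 10 transactions.
toy = {frozenset("A"): 5, frozenset("B"): 4, frozenset("AB"): 3}
for X, Y, s, c in generate_rules(toy, 10, minconf=0.7):
    print(set(X), "=>", set(Y), s, c)
```

With these counts only B ⇒ A passes the threshold (confidence 3/4), while A ⇒ B (3/5) is dropped.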
                                                                            Slide 18
Basic Algorithm
        Follow the steps:
                Is itemset X frequent?
                Does transaction T support X?
                (X contains items from different levels of the taxonomy, T contains only
                leaves)
                T’ = T + ancestors(T);
                Answer: T supports X ↔ X ⊆ T’
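
The support test of the Basic algorithm reduces to a subset check on the extended transaction T’. A minimal sketch, assuming a tree-shaped taxonomy stored as a child-to-parent map (the names `extend` and `supports` are illustrative):

```python
PARENT = {"Jacket": "Outwear", "Ski Pants": "Outwear", "Outwear": "Clothes",
          "Shirt": "Clothes", "Shoes": "Footwear", "Hiking Boots": "Footwear"}

def extend(T):
    """T' = T + ancestors(T): add every ancestor of every item in T."""
    extended = set(T)
    for item in T:
        while item in PARENT:
            item = PARENT[item]
            extended.add(item)
    return extended

def supports(T, X):
    """T supports X exactly when X is a subset of T'."""
    return set(X) <= extend(T)

T = {"Jacket", "Hiking Boots"}                  # contains only leaves
print(sorted(extend(T)))
print(supports(T, {"Outwear", "Hiking Boots"}))  # True: X mixes two levels
print(supports(T, {"Outwear", "Shoes"}))         # False: no Shoes in T'
```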




                                                                                  Slide 19
Details of the Basic Algorithm
        1. Count item occurrences
        2. Generate new k-itemset candidates
        3. Add all ancestors of each item in t to t, removing any
           duplication
        4. Find the support of all the candidates
        5. Take only those with support over minsup
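
The steps above can be sketched as a level-wise, Apriori-style loop. This is an illustrative simplification rather than the authors' exact procedure: transactions are extended once up front instead of on every pass, candidates are generated by a plain join-and-prune, and the name `basic` and its parameters are assumptions.

```python
from itertools import combinations

def basic(D, parent, minsup):
    """Return {frozenset: count} of frequent generalized itemsets.

    D: list of sets of leaf items; parent: child -> parent map; minsup: fraction.
    """
    def ancestors(item):
        out = set()
        while item in parent:
            item = parent[item]
            out.add(item)
        return out

    # Add all ancestors of each item in t to t (sets remove duplicates).
    extended = [set(t).union(*(ancestors(i) for i in t)) for t in D]
    min_count = minsup * len(D)

    def frequent_among(candidates):
        counts = {c: sum(c <= t for t in extended) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_count}

    # k = 1: count item occurrences.
    level = frequent_among({frozenset([i]) for t in extended for i in t})
    frequent = dict(level)
    k = 2
    while level:
        # Join: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, k - 1))}
        level = frequent_among(candidates)
        frequent.update(level)
        k += 1
    return frequent

PARENT = {"Jacket": "Outwear", "Ski Pants": "Outwear", "Outwear": "Clothes",
          "Shirt": "Clothes", "Shoes": "Footwear", "Hiking Boots": "Footwear"}
D = [{"Shirt"}, {"Jacket", "Hiking Boots"}, {"Ski Pants", "Hiking Boots"},
     {"Shoes"}, {"Shoes"}, {"Jacket"}]
result = basic(D, PARENT, minsup=0.30)
print(result[frozenset({"Outwear", "Hiking Boots"})])
```

Note that this unoptimized version also reports redundant itemsets such as {Jacket, Outwear}, which contain an item together with its own ancestor.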




                                                                   Slide 20
Can You Optimize It?
        Optimization 1: Filtering the ancestors added to
        transactions
                We only need to add to transaction t the ancestors that are in
                one of the candidates.
                If the original item is not in any itemset, it can be dropped from
                the transaction.

                Clothes
                          Outwear
                                    Jackets
                                    Ski Pants
                          Shirts

                Example:
                          Candidates: {clothes, shoes}.
                          Transaction t: {Jacket, …} can be replaced with {clothes, …}
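
A possible sketch of this filtering step: extend the transaction, then keep only what appears in some candidate. The function name `filter_transaction` is illustrative, and the second item in the sample transaction (Hiking Boots) is a hypothetical stand-in for the elided part of the slide's example.

```python
def filter_transaction(t, ancestors_of, candidates):
    """Extend t with ancestors, keeping only items that occur in a candidate."""
    needed = set().union(*candidates)        # every item used by any candidate
    extended = set(t)
    for item in t:
        extended |= ancestors_of.get(item, set())
    return extended & needed                 # drops unused items and ancestors

# Pre-computed ancestors for the items used below.
ancestors_of = {
    "Jacket": {"Outwear", "Clothes"},
    "Hiking Boots": {"Footwear"},
}
candidates = [frozenset({"Clothes", "Shoes"})]

t = {"Jacket", "Hiking Boots"}   # second item is a hypothetical filler
print(filter_transaction(t, ancestors_of, candidates))   # {'Clothes'}
```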

                                                                                  Slide 21
Can You Optimize It?
        Optimization 2: Pre-computing ancestors
                Rather than finding ancestors for each item by traversing the
                taxonomy graph, we can pre-compute the ancestors for each
                item
                We can drop ancestors that are not contained in any of the
                candidates at the same time

                Clothes
                          Outwear
                                    Jackets
                                    Ski Pants
                          Shirts
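
One way to pre-compute the ancestor table is a memoized walk over the child-to-parent map, followed by a restriction to the items that occur in the current candidates. This is a sketch assuming a tree-shaped taxonomy; `precompute_ancestors` and `restrict` are illustrative names.

```python
def precompute_ancestors(parent):
    """Map every item of the taxonomy to the set of all its ancestors."""
    table = {}

    def resolve(item):
        if item not in table:
            p = parent.get(item)
            # A root has no ancestors; otherwise reuse the parent's entry.
            table[item] = set() if p is None else {p} | resolve(p)
        return table[item]

    for item in set(parent) | set(parent.values()):
        resolve(item)
    return table

def restrict(table, candidates):
    """Drop ancestors that do not appear in any candidate itemset."""
    needed = set().union(*candidates)
    return {item: anc & needed for item, anc in table.items()}

PARENT = {"Jacket": "Outwear", "Ski Pants": "Outwear",
          "Outwear": "Clothes", "Shirt": "Clothes"}
table = precompute_ancestors(PARENT)
print(sorted(table["Jacket"]))                                      # ['Clothes', 'Outwear']
print(restrict(table, [frozenset({"Clothes", "Shoes"})])["Jacket"])  # {'Clothes'}
```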




                                                                                 Slide 22
Can You Optimize It?
        Optimization 3: Prune itemsets containing an item and
        its ancestor
                If we have {Jacket} and {Outwear}, we will have the candidate
                {Jacket, Outwear}, which is not interesting.
                s({Jacket}) = s({Jacket, Outwear})
                Deleting {Jacket, Outwear} at k=2 ensures that no such itemset
                arises for k>2 (because of the prune step of the candidate
                generation method)
                Therefore, we only need to prune the itemsets containing an item
                and its ancestor for k=2; in the next steps no candidate will
                include an item together with its ancestor
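
The k=2 pruning can be sketched as a filter during pair generation. The function name `candidate_pairs` is illustrative; the frequent items are the six frequent 1-itemsets of the example database.

```python
from itertools import combinations

def candidate_pairs(frequent_items, ancestors_of):
    """Build 2-itemset candidates, skipping any pair of an item and its ancestor."""
    pairs = []
    for a, b in combinations(sorted(frequent_items), 2):
        if a in ancestors_of.get(b, set()) or b in ancestors_of.get(a, set()):
            continue   # e.g. {Jacket, Outwear}: same support as {Jacket}
        pairs.append(frozenset((a, b)))
    return pairs

ancestors_of = {
    "Jacket": {"Outwear", "Clothes"},
    "Outwear": {"Clothes"},
    "Shoes": {"Footwear"},
    "Hiking Boots": {"Footwear"},
}
frequent_items = ["Jacket", "Outwear", "Clothes", "Shoes", "Hiking Boots", "Footwear"]

pairs = candidate_pairs(frequent_items, ancestors_of)
print(len(pairs))   # 10 of the 15 possible pairs remain
```

Because larger candidates are only built from surviving smaller ones, an item-plus-ancestor pair removed here can never reappear inside a candidate for k>2.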




                                                                             Slide 23
Summary
        Importance of hierarchy in real-world applications
        How?
                Build a DAG
                Redefine the problem of ARM
                Get association rules
        Don’t take these ideas in isolation!
                Applicable to all the advances we will see in the next classes
                Real-world problems usually require the mixing of many ideas




                                                                            Slide 24
Next Class



        Advanced topics in association rule mining




                                                     Slide 25
Artificial Intelligence      Machine Learning
Introduction to Machine
       Learning
                      Lecture 15
 Advanced Topics in Association Rules Mining

                     Albert Orriols i Puig
                 http://www.albertorriols.net
                 htt //       lb t i l      t
                    aorriols@salle.url.edu

          Artificial Intelligence – Machine Learning
                            g                      g
              Enginyeria i Arquitectura La Salle
                     Universitat Ramon Llull

More Related Content

What's hot

New Challenges in Learning Classifier Systems: Mining Rarities and Evolving F...
New Challenges in Learning Classifier Systems: Mining Rarities and Evolving F...New Challenges in Learning Classifier Systems: Mining Rarities and Evolving F...
New Challenges in Learning Classifier Systems: Mining Rarities and Evolving F...Albert Orriols-Puig
 
Lecture9 - Bayesian-Decision-Theory
Lecture9 - Bayesian-Decision-TheoryLecture9 - Bayesian-Decision-Theory
Lecture9 - Bayesian-Decision-TheoryAlbert Orriols-Puig
 
CCIA'2008: Can Evolution Strategies Improve Learning Guidance in XCS? Design ...
CCIA'2008: Can Evolution Strategies Improve Learning Guidance in XCS? Design ...CCIA'2008: Can Evolution Strategies Improve Learning Guidance in XCS? Design ...
CCIA'2008: Can Evolution Strategies Improve Learning Guidance in XCS? Design ...Albert Orriols-Puig
 
HIS'2008: Genetic-based Synthetic Data Sets for the Analysis of Classifiers B...
HIS'2008: Genetic-based Synthetic Data Sets for the Analysis of Classifiers B...HIS'2008: Genetic-based Synthetic Data Sets for the Analysis of Classifiers B...
HIS'2008: Genetic-based Synthetic Data Sets for the Analysis of Classifiers B...Albert Orriols-Puig
 
CCIA'2008: On the dimensions of data complexity through synthetic data sets
CCIA'2008: On the dimensions of data complexity through synthetic data setsCCIA'2008: On the dimensions of data complexity through synthetic data sets
CCIA'2008: On the dimensions of data complexity through synthetic data setsAlbert Orriols-Puig
 
IWLCS'2008: First Approach toward Online Evolution of Association Rules wit...
IWLCS'2008: First Approach toward Online Evolution of Association Rules wit...IWLCS'2008: First Approach toward Online Evolution of Association Rules wit...
IWLCS'2008: First Approach toward Online Evolution of Association Rules wit...Albert Orriols-Puig
 
GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter S...
GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter S...GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter S...
GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter S...Albert Orriols-Puig
 
HIS'2008: New Crossover Operator for Evolutionary Rule Discovery in XCS
HIS'2008: New Crossover Operator for Evolutionary Rule Discovery in XCSHIS'2008: New Crossover Operator for Evolutionary Rule Discovery in XCS
HIS'2008: New Crossover Operator for Evolutionary Rule Discovery in XCSAlbert Orriols-Puig
 
Creating Documentation Your Users Will Love
Creating Documentation Your Users Will LoveCreating Documentation Your Users Will Love
Creating Documentation Your Users Will LoveEna Arel
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learningbutest
 

What's hot (20)

Lecture23
Lecture23Lecture23
Lecture23
 
Lecture24
Lecture24Lecture24
Lecture24
 
Lecture18
Lecture18Lecture18
Lecture18
 
Lecture1 - Machine Learning
Lecture1 - Machine LearningLecture1 - Machine Learning
Lecture1 - Machine Learning
 
Lecture4 - Machine Learning
Lecture4 - Machine LearningLecture4 - Machine Learning
Lecture4 - Machine Learning
 
Lecture21
Lecture21Lecture21
Lecture21
 
Lecture5 - C4.5
Lecture5 - C4.5Lecture5 - C4.5
Lecture5 - C4.5
 
Lecture8 - From CBR to IBk
Lecture8 - From CBR to IBkLecture8 - From CBR to IBk
Lecture8 - From CBR to IBk
 
New Challenges in Learning Classifier Systems: Mining Rarities and Evolving F...
New Challenges in Learning Classifier Systems: Mining Rarities and Evolving F...New Challenges in Learning Classifier Systems: Mining Rarities and Evolving F...
New Challenges in Learning Classifier Systems: Mining Rarities and Evolving F...
 
Lecture9 - Bayesian-Decision-Theory
Lecture9 - Bayesian-Decision-TheoryLecture9 - Bayesian-Decision-Theory
Lecture9 - Bayesian-Decision-Theory
 
Lecture11 - neural networks
Lecture11 - neural networksLecture11 - neural networks
Lecture11 - neural networks
 
Lecture22
Lecture22Lecture22
Lecture22
 
CCIA'2008: Can Evolution Strategies Improve Learning Guidance in XCS? Design ...
CCIA'2008: Can Evolution Strategies Improve Learning Guidance in XCS? Design ...CCIA'2008: Can Evolution Strategies Improve Learning Guidance in XCS? Design ...
CCIA'2008: Can Evolution Strategies Improve Learning Guidance in XCS? Design ...
 
HIS'2008: Genetic-based Synthetic Data Sets for the Analysis of Classifiers B...
HIS'2008: Genetic-based Synthetic Data Sets for the Analysis of Classifiers B...HIS'2008: Genetic-based Synthetic Data Sets for the Analysis of Classifiers B...
HIS'2008: Genetic-based Synthetic Data Sets for the Analysis of Classifiers B...
 
CCIA'2008: On the dimensions of data complexity through synthetic data sets
CCIA'2008: On the dimensions of data complexity through synthetic data setsCCIA'2008: On the dimensions of data complexity through synthetic data sets
CCIA'2008: On the dimensions of data complexity through synthetic data sets
 
IWLCS'2008: First Approach toward Online Evolution of Association Rules wit...
IWLCS'2008: First Approach toward Online Evolution of Association Rules wit...IWLCS'2008: First Approach toward Online Evolution of Association Rules wit...
IWLCS'2008: First Approach toward Online Evolution of Association Rules wit...
 
GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter S...
GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter S...GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter S...
GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter S...
 
HIS'2008: New Crossover Operator for Evolutionary Rule Discovery in XCS
HIS'2008: New Crossover Operator for Evolutionary Rule Discovery in XCSHIS'2008: New Crossover Operator for Evolutionary Rule Discovery in XCS
HIS'2008: New Crossover Operator for Evolutionary Rule Discovery in XCS
 
Creating Documentation Your Users Will Love
Creating Documentation Your Users Will LoveCreating Documentation Your Users Will Love
Creating Documentation Your Users Will Love
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 

Similar to Lecture15 - Advances topics on association rules PART II

Transformative iPad Use in Elementary School
Transformative iPad Use in  Elementary SchoolTransformative iPad Use in  Elementary School
Transformative iPad Use in Elementary SchoolSilvia Rosenthal Tolisano
 
Play& Learn Booklet
Play& Learn BookletPlay& Learn Booklet
Play& Learn Bookletmariebrouder
 
Artificial Intelligence Explained: What Are Generative Adversarial Networks (...
Artificial Intelligence Explained: What Are Generative Adversarial Networks (...Artificial Intelligence Explained: What Are Generative Adversarial Networks (...
Artificial Intelligence Explained: What Are Generative Adversarial Networks (...Bernard Marr
 
Just the basics_strata_2013
Just the basics_strata_2013Just the basics_strata_2013
Just the basics_strata_2013Ken Mwai
 
Gigwalk talk at re think conference 4 18 12
Gigwalk talk at re think conference 4 18 12Gigwalk talk at re think conference 4 18 12
Gigwalk talk at re think conference 4 18 12Gigwalk
 
AI Is Changing The Way We Look At Data Science
AI Is Changing The Way We Look At Data ScienceAI Is Changing The Way We Look At Data Science
AI Is Changing The Way We Look At Data ScienceAbe
 
The internet economy slides november 2017
The internet economy slides november 2017The internet economy slides november 2017
The internet economy slides november 2017Ville Saarikoski
 
Lecture15 - Advanced topics on association rules PART II

  • 1. Introduction to Machine Learning
      Lecture 15: Advanced Topics in Association Rules Mining
      Albert Orriols i Puig
      http://www.albertorriols.net
      aorriols@salle.url.edu
      Artificial Intelligence – Machine Learning
      Enginyeria i Arquitectura La Salle, Universitat Ramon Llull
  • 2. Recap of Lecture 13-14
      Ideas come from market basket analysis (MBA)
        Let's go shopping!
          Customer1: milk, eggs, sugar, bread
          Customer2: milk, eggs, cereal, bread
          Customer3: eggs, sugar
        What do my customers buy? Which products are bought together?
        Aim: find associations and correlations between the different items that customers place in their shopping basket
  • 3. Recap of Lecture 13-14
      Apriori
        Will find all the associations with minimum support and confidence
        However:
          Scans the database multiple times
          Most often, there is a high number of candidates
          Support counting for candidates can be time-expensive
      FP-growth
        Will obtain the same rules as Apriori
        Avoids candidate generation by building an FP-tree
        Counts support more efficiently
  • 4. Today's Agenda
      Continuing our journey through some advanced topics in ARM
        Mining frequent patterns without candidate generation
        Multiple-level AR
        Sequential pattern mining
        Quantitative association rules
        Mining class association rules
        Beyond support & confidence
        Applications
  • 5. Acknowledgments
      Part of this lecture is based on the work by
  • 6. Why Multiple Level AR?
      Aim: find associations between items
        But wait!
          There are many different diapers: Dodot, Huggies …
          There are many different beers: Heineken, Desperados, Kingfisher … in bottle/can …
        Which rule do you prefer?
          diapers ⇒ beer
          Dodot diapers M ⇒ Dam beer in can
        Which will have greater support?
  • 7. Concept Hierarchy
      Create is-a hierarchies
        Clothes
          Outwear
            Jackets
            Ski Pants
          Shirts
        Footwear
          Shoes
          Hiking Boots
      Assume we found the rule: Outwear ⇒ Hiking boots
      Then
        Jackets ⇒ Hiking boots may not have minimum support
        Clothes ⇒ Hiking boots may not have minimum confidence
  • 8. Concept Hierarchy
      This means that
        Items at lower levels may not have enough support to be part of any frequent itemset
        However, a specific rule at a lower level of the hierarchy may still denote a strong association, e.g. Jackets ⇒ Hiking boots
      So, which rules do you want?
        Users are interested in generating rules that span different levels of the taxonomy
        Rules at lower levels may not have minimum support
        The taxonomy can be used to prune uninteresting or redundant rules
        Multiple taxonomies may be present, for example: category, price (cheap, expensive), "items-on-sale", etc.
        Multiple taxonomies may be modeled as a forest or a DAG
  • 9. Notation
      [Figure: taxonomy graph with nodes labeled z, p, c1, c2. An edge denotes an is_a relationship between a parent and a child; the nodes above an item are its ancestors (marked with ^) and the nodes below it are its descendants.]
  • 10. Notation
      Formalizing the problem
        I = {i1, i2, …, im}: items
        T: transaction, a set of items, T ⊆ I
        D: set of transactions
        T supports item x if x is in T or x is an ancestor of some item in T
        T supports X ⊆ I if it supports every item in X
        Generalized association rule: X ⇒ Y if X ⊂ I, Y ⊂ I, X ∩ Y = ∅, and no item in Y is an ancestor of any item in X. This excludes rules such as jacket ⇒ clothes, which are trivially true
        The rule X ⇒ Y has confidence c in D if c% of the transactions in D that support X also support Y
        The rule X ⇒ Y has support s in D if s% of the transactions in D support X ∪ Y
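The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the `PARENT` map encodes the Clothes/Footwear taxonomy from the concept hierarchy slide as an assumed child-to-parent dictionary (a forest, not a general DAG), and the names `ancestors`, `supports_item`, `supports`, and `is_generalized_rule` are hypothetical.

```python
# Assumed encoding of the example taxonomy: child -> parent (roots have no entry).
PARENT = {
    "Jacket": "Outwear", "Ski Pants": "Outwear",
    "Outwear": "Clothes", "Shirt": "Clothes",
    "Shoes": "Footwear", "Hiking Boots": "Footwear",
}

def ancestors(item):
    """Return all proper ancestors of an item by walking up the taxonomy."""
    found = set()
    while item in PARENT:
        item = PARENT[item]
        found.add(item)
    return found

def supports_item(transaction, x):
    """T supports x if x is in T or x is an ancestor of some item in T."""
    return x in transaction or any(x in ancestors(i) for i in transaction)

def supports(transaction, itemset):
    """T supports X if it supports every item in X."""
    return all(supports_item(transaction, x) for x in itemset)

def is_generalized_rule(x, y):
    """X => Y is allowed if X and Y are disjoint and no item of Y is an ancestor of an item of X."""
    x, y = set(x), set(y)
    if x & y:
        return False
    ancestors_of_x = set().union(*(ancestors(i) for i in x))
    return not (y & ancestors_of_x)

t = {"Jacket", "Hiking Boots"}
print(supports(t, {"Outwear", "Footwear"}))           # both are ancestors of items in t
print(is_generalized_rule({"Jacket"}, {"Clothes"}))   # consequent is an ancestor of the antecedent
```

Checking the definition directly, without materializing the ancestors in the transaction, keeps the sketch close to the wording on the slide.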
  • 11. So, Let's Re-state the Problem
      New aim: find all generalized association rules that have support and confidence greater than the user-specified minimum support (called minsup) and minimum confidence (called minconf), respectively
      Taxonomy: Clothes → {Outwear → {Jackets, Ski Pants}, Shirts}; Footwear → {Shoes, Hiking Boots}
      Antecedent and consequent may have items of any level of the hierarchy
      Do you see any potential problem?
        I can find many redundant rules!
  • 12. Mining the Example
      Database D
        Transaction   Items Bought
        100           Shirt
        200           Jacket, Hiking Boots
        300           Ski Pants, Hiking Boots
        400           Shoes
        500           Shoes
        600           Jacket
      minsup = 30%, minconf = 60%
      Frequent Itemsets
        Itemset                    Support
        {Jacket}                   2
        {Outwear}                  3
        {Clothes}                  4
        {Shoes}                    2
        {Hiking Boots}             2
        {Footwear}                 4
        {Outwear, Hiking Boots}    2
        {Clothes, Hiking Boots}    2
        {Outwear, Footwear}        2
        {Clothes, Footwear}        2
      Rules
        Rule                       Support   Confidence
        Outwear ⇒ Hiking Boots     33%       66.6%
        Outwear ⇒ Footwear         33%       66.6%
        Hiking Boots ⇒ Outwear     33%       100%
        Hiking Boots ⇒ Clothes     33%       100%
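The tables above can be reproduced with a brute-force sketch. It is an assumption-laden illustration rather than any of the algorithms named later in the lecture: it enumerates every itemset of up to three items instead of generating candidates level by level, skips itemsets that mix an item with its own ancestor (the slide does not list them), and only builds rules with one item on each side, which is enough for this database. All names (`PARENT`, `DATABASE`, `ancestors`, `count`, `mixes_item_and_ancestor`) are hypothetical.

```python
from itertools import combinations

# Assumed encoding of the example taxonomy: child -> parent.
PARENT = {
    "Jacket": "Outwear", "Ski Pants": "Outwear",
    "Outwear": "Clothes", "Shirt": "Clothes",
    "Shoes": "Footwear", "Hiking Boots": "Footwear",
}
DATABASE = {
    100: {"Shirt"},
    200: {"Jacket", "Hiking Boots"},
    300: {"Ski Pants", "Hiking Boots"},
    400: {"Shoes"},
    500: {"Shoes"},
    600: {"Jacket"},
}
MINSUP, MINCONF = 0.30, 0.60

def ancestors(item):
    """Return all proper ancestors of an item."""
    found = set()
    while item in PARENT:
        item = PARENT[item]
        found.add(item)
    return found

# Extend every transaction with the ancestors of its items.
extended = [t.union(*(ancestors(i) for i in t)) for t in DATABASE.values()]
n = len(extended)
items = sorted(set().union(*extended))

def count(itemset):
    """Number of extended transactions containing the whole itemset."""
    return sum(1 for t in extended if set(itemset) <= t)

def mixes_item_and_ancestor(itemset):
    """True if the itemset contains some item together with one of its ancestors."""
    return any(ancestors(i) & set(itemset) for i in itemset)

# Brute force: test every itemset of size 1 to 3 against minsup.
frequent = {}
for k in (1, 2, 3):
    for itemset in combinations(items, k):
        if not mixes_item_and_ancestor(itemset) and count(itemset) / n >= MINSUP:
            frequent[frozenset(itemset)] = count(itemset)

# Rules x => y from the frequent pairs, kept when confidence reaches minconf.
rules = []
for itemset, c in frequent.items():
    if len(itemset) == 2:
        for x in itemset:
            (y,) = itemset - {x}
            conf = c / frequent[frozenset([x])]
            if conf >= MINCONF:
                rules.append((x, y, c / n, conf))

for x, y, sup, conf in sorted(rules):
    print(f"{x} => {y}: support {sup:.1%}, confidence {conf:.1%}")
```

On this database the sketch finds the ten frequent itemsets and the four rules listed on the slide.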
  • 13. Mining the Example
      Observation 1
        If the set {x, y} has minimum support, so do {x^, y}, {x, y^} and {x^, y^}
        E.g.: if {Jacket, Shoes} has minsup, then {Outwear, Shoes}, {Jacket, Footwear}, and {Outwear, Footwear} also have minimum support
  • 14. Mining the Example
      Observation 2
        If the rule x ⇒ y has minimum support and confidence, then x ⇒ y^ is guaranteed to have both minsup and minconf
          E.g.: the rule Outwear ⇒ Hiking Boots has minsup and minconf, so the rule Outwear ⇒ Footwear has both minsup and minconf
        However, the rules x^ ⇒ y and x^ ⇒ y^ will have minsup, but they may not have minconf
          E.g.: Clothes ⇒ Hiking Boots and Clothes ⇒ Footwear have minsup, but not minconf
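Both observations can be checked numerically on the example database from slide 12. The sketch below hard-codes each transaction already extended with its ancestors; `EXTENDED`, `support`, and `confidence` are hypothetical names introduced for this illustration.

```python
# Example transactions 100..600, each already extended with its ancestors.
EXTENDED = [
    {"Shirt", "Clothes"},
    {"Jacket", "Outwear", "Clothes", "Hiking Boots", "Footwear"},
    {"Ski Pants", "Outwear", "Clothes", "Hiking Boots", "Footwear"},
    {"Shoes", "Footwear"},
    {"Shoes", "Footwear"},
    {"Jacket", "Outwear", "Clothes"},
]

def support(itemset):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(1 for t in EXTENDED if set(itemset) <= t) / len(EXTENDED)

def confidence(x, y):
    """conf(X => Y) = support(X u Y) / support(X)."""
    return support(set(x) | set(y)) / support(x)

# Observation 1: replacing an item by its ancestor never lowers the support.
s_xy = support({"Jacket", "Hiking Boots"})
s_xhat_y = support({"Outwear", "Hiking Boots"})
s_xhat_yhat = support({"Outwear", "Footwear"})

# Observation 2: generalizing the consequent keeps the confidence at least as high,
# while generalizing the antecedent can lower it.
c_x_y = confidence({"Outwear"}, {"Hiking Boots"})
c_x_yhat = confidence({"Outwear"}, {"Footwear"})
c_xhat_y = confidence({"Clothes"}, {"Hiking Boots"})

print(s_xy, s_xhat_y, s_xhat_yhat)
print(c_x_y, c_x_yhat, c_xhat_y)
```

With minconf = 60%, the last value (Clothes ⇒ Hiking Boots, 2 of 4 transactions) falls below the threshold, which matches the example on the slide.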
  • 15. Interesting Rules
      So, in which rules are we interested?
        Up to now, interest was measured by how much the support of a rule exceeded the expected support based on the support of the antecedent and the consequent
          But this does not consider the taxonomy
          I have poor pruning… But now, I need to prune a lot!
        Srikant and Agrawal proposed a different approach
          Consider that Milk ⇒ cereal [s=0.08, c=0.70]
          And that Skim milk ⇒ cereal [s=0.02, c=0.70]
          Taxonomy: Milk [s = ] → {2% Milk [s = ], Skim Milk [s = ]}
          So, do you think that the second rule is important? Maybe not!
  • 16. Interesting Rules
      A rule X ⇒ Y is R-interesting w.r.t. an ancestor X^ ⇒ Y^ if:
        real s(X ⇒ Y) > R · expected s(X ⇒ Y), based on (X^ ⇒ Y^)
        or
        real c(X ⇒ Y) > R · expected c(X ⇒ Y), based on (X^ ⇒ Y^)
      Aim: interesting rules will be those whose support is more than R times the expected value, or whose confidence is more than R times the expected value, for some user-specified constant R
  • 17. Interesting Rules
      What's the expected value?
        A method is defined to compute the expected value:
          $E_{\hat{Z}}[\Pr(Z)] = \frac{\Pr(z_1)}{\Pr(\hat{z}_1)} \times \cdots \times \frac{\Pr(z_j)}{\Pr(\hat{z}_j)} \times \Pr(\hat{Z})$
        where $\hat{Z}$ (written Z^ on the other slides) is an ancestor of Z, and z_1, …, z_j are the items of Z that are replaced by their ancestors in $\hat{Z}$
        Go to the papers for the details
      Now, we aim at finding all generalized R-interesting association rules (R is a user-specified minimum interest called min-interest) that have support and confidence greater than minsup and minconf, respectively
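As a rough sketch of the support test, the expected-value formula can be written as a small function. Everything numeric below is made up for illustration: the slides leave the supports of Milk and Skim Milk blank, so the values 0.20 and 0.05 and the threshold R = 1.1 are assumptions, and only the rule supports 0.08 and 0.02 come from the Milk ⇒ cereal example. The function names are hypothetical, and the confidence test is not covered.

```python
def expected_support(item_supports, ancestor_supports, ancestor_itemset_support):
    """E_Z^[Pr(Z)] = Pr(z_1)/Pr(z_1^) * ... * Pr(z_j)/Pr(z_j^) * Pr(Z^).

    item_supports and ancestor_supports list Pr(z_i) and Pr(z_i^) for the
    items of Z that are replaced by an ancestor in Z^.
    """
    expected = ancestor_itemset_support
    for pr_item, pr_ancestor in zip(item_supports, ancestor_supports):
        expected *= pr_item / pr_ancestor
    return expected

def is_r_interesting_by_support(real_support, expected, r):
    """Support test only: real support must exceed R times the expected support."""
    return real_support > r * expected

# Hypothetical item supports (NOT given on the slides).
pr_skim_milk = 0.05
pr_milk = 0.20
# Rule supports from the slide: Milk => cereal 0.08, Skim milk => cereal 0.02.
expected = expected_support([pr_skim_milk], [pr_milk], 0.08)
interesting = is_r_interesting_by_support(0.02, expected, r=1.1)
print(expected, interesting)
```

Under these assumed numbers the specialized rule has exactly the support one would expect from its ancestor, so it adds no information and is not R-interesting by the support test.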
  • 18. Algorithms to Mine General AR
      Follow three steps:
        1. Find all itemsets whose support is greater than minsup. These itemsets are called frequent itemsets.
        2. Use the frequent itemsets to generate the desired rules: if ABCD and AB are frequent, then conf(AB ⇒ CD) = support(ABCD)/support(AB)
        3. Prune all uninteresting rules from this set
      Different algorithms for this purpose
        Basic
        Cumulate
        EstMerge
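Step 2 can be sketched as follows, assuming step 1 has already produced a dictionary of support counts that contains every frequent itemset together with all of its subsets. The function name `generate_rules` and the `SUPPORT` data are hypothetical; the counts are taken from the example on slide 12. Step 3 (pruning uninteresting rules) is not included.

```python
from itertools import combinations

def generate_rules(support, minconf):
    """Build rules A => B with conf = support(A u B) / support(A) >= minconf.

    `support` maps frozenset itemsets to support counts and is assumed to
    contain every subset of every frequent itemset.
    """
    rules = []
    for itemset, sup in support.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                a = frozenset(antecedent)
                conf = sup / support[a]
                if conf >= minconf:
                    rules.append((a, itemset - a, conf))
    return rules

# Support counts from the example database (6 transactions).
SUPPORT = {
    frozenset({"Outwear"}): 3,
    frozenset({"Hiking Boots"}): 2,
    frozenset({"Outwear", "Hiking Boots"}): 2,
}

for antecedent, consequent, conf in generate_rules(SUPPORT, minconf=0.60):
    print(sorted(antecedent), "=>", sorted(consequent), f"{conf:.1%}")
```

Because a ratio of counts equals the ratio of the corresponding percentages, the function works with either counts or relative supports.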
  • 19. Basic Algorithm
      Follow the steps:
        Is itemset X frequent?
        Does transaction T support X? (X contains items from different levels of the taxonomy, T contains only leaves)
        T' = T + ancestors(T)
        Answer: T supports X ↔ X ⊆ T'
  • 20. Details of the Basic Algorithm
      Count item occurrences
      Generate new k-itemset candidates
      Add all ancestors of each item in t to t, removing any duplicates
      Find the support of all the candidates
      Take only those with support over minsup
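The steps above, together with the test X ⊆ T' from the previous slide, can be put together in a level-wise sketch. It is a simplified reading of the Basic algorithm under stated assumptions: the taxonomy is a child-to-parent dictionary, transactions are extended once up front rather than on every pass, candidates are generated with a plain join-and-prune step, and an itemset is kept when its relative support is at least minsup. The names `ancestors`, `generate_candidates`, and `basic` are hypothetical.

```python
from itertools import combinations

def ancestors(item, parent):
    """Return all proper ancestors of an item in a child -> parent taxonomy."""
    found = set()
    while item in parent:
        item = parent[item]
        found.add(item)
    return found

def generate_candidates(prev_frequent, k):
    """Join frequent (k-1)-itemsets and keep unions whose (k-1)-subsets are all frequent."""
    prev = set(prev_frequent)
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) == k and all(frozenset(s) in prev for s in combinations(union, k - 1)):
                candidates.add(union)
    return candidates

def basic(database, parent, minsup):
    """Return {itemset: count} for all frequent generalized itemsets."""
    # T' = T + ancestors(T); using sets removes duplicate ancestors.
    extended = [set(t).union(*(ancestors(i, parent) for i in t)) for t in database]
    n = len(extended)
    # First pass: count item occurrences.
    counts = {}
    for t in extended:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c / n >= minsup}
    result = dict(frequent)
    k = 2
    while frequent:
        candidates = generate_candidates(frequent, k)
        # T supports X exactly when X is a subset of T'.
        counts = {c: sum(1 for t in extended if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c / n >= minsup}
        result.update(frequent)
        k += 1
    return result

PARENT = {
    "Jacket": "Outwear", "Ski Pants": "Outwear",
    "Outwear": "Clothes", "Shirt": "Clothes",
    "Shoes": "Footwear", "Hiking Boots": "Footwear",
}
DATABASE = [
    {"Shirt"}, {"Jacket", "Hiking Boots"}, {"Ski Pants", "Hiking Boots"},
    {"Shoes"}, {"Shoes"}, {"Jacket"},
]
result = basic(DATABASE, PARENT, minsup=0.30)
print(len(result), result[frozenset({"Outwear", "Hiking Boots"})])
```

Without further pruning, the result also contains redundant itemsets such as {Jacket, Outwear}, whose support always equals that of {Jacket}.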
  • 21. Can You Optimize It?
      Optimization 1: Filtering the ancestors added to transactions
        We only need to add to transaction t the ancestors that are in one of the candidates
        If the original item is not in any itemset, it can be dropped from the transaction
        Taxonomy: Clothes → {Outwear → {Jackets, Ski Pants}, Shirts}
        Example:
          Candidates: {clothes, shoes}
          Transaction t: {Jacket, …} can be replaced with {clothes, …}
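A minimal sketch of this filtering step, assuming the taxonomy is a child-to-parent dictionary and the candidates are given as a list of sets. The helper name `filter_transaction` is hypothetical.

```python
# Assumed encoding of the example taxonomy: child -> parent.
PARENT = {
    "Jacket": "Outwear", "Ski Pants": "Outwear",
    "Outwear": "Clothes", "Shirt": "Clothes",
    "Shoes": "Footwear", "Hiking Boots": "Footwear",
}

def ancestors(item):
    """Return all proper ancestors of an item."""
    found = set()
    while item in PARENT:
        item = PARENT[item]
        found.add(item)
    return found

def filter_transaction(transaction, candidates):
    """Extend a transaction with ancestors, then keep only items used by some candidate."""
    candidate_items = set().union(*candidates)
    extended = set(transaction)
    for item in transaction:
        extended |= ancestors(item)
    # Ancestors and original items that appear in no candidate cannot affect any count.
    return extended & candidate_items

candidates = [{"Clothes", "Shoes"}]
print(filter_transaction({"Jacket", "Hiking Boots"}, candidates))
```

In the slide's example the Jacket transaction shrinks to the single item Clothes, since neither Jacket, Outwear, nor any footwear item other than Shoes occurs in a candidate.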
  • 22. Can You Optimize It?
      Optimization 2: Pre-computing ancestors
        Rather than finding the ancestors of each item by traversing the taxonomy graph, we can pre-compute the ancestors of each item
        At the same time, we can drop the ancestors that are not contained in any of the candidates
        Taxonomy: Clothes → {Outwear → {Jackets, Ski Pants}, Shirts}
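One way to sketch the pre-computation is a lookup table built once per taxonomy and then restricted to the current candidates. The single-parent dictionary is an assumption (the slides allow a DAG), and `precompute_ancestors` and `restrict_to_candidates` are hypothetical names.

```python
def precompute_ancestors(parent):
    """Build {item: set of all its ancestors} once, reusing already computed entries."""
    table = {}

    def resolve(item):
        if item not in table:
            p = parent.get(item)
            table[item] = set() if p is None else {p} | resolve(p)
        return table[item]

    for item in set(parent) | set(parent.values()):
        resolve(item)
    return table

def restrict_to_candidates(table, candidates):
    """Drop ancestors that do not occur in any candidate itemset."""
    candidate_items = set().union(*candidates)
    return {item: anc & candidate_items for item, anc in table.items()}

PARENT = {
    "Jacket": "Outwear", "Ski Pants": "Outwear",
    "Outwear": "Clothes", "Shirt": "Clothes",
}
table = precompute_ancestors(PARENT)
restricted = restrict_to_candidates(table, [{"Clothes", "Shoes"}])
print(table["Jacket"], restricted["Jacket"])
```

During counting, extending a transaction then becomes a dictionary lookup per item instead of a walk up the taxonomy.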
  • 23. Can You Optimize It?
      Optimization 3: Prune itemsets containing an item and its ancestor
        If we have {Jacket} and {Outwear}, we will have the candidate {Jacket, Outwear}, which is not interesting:
          s({Jacket}) = s({Jacket, Outwear})
        Deleting {Jacket, Outwear} at k=2 ensures that it will not arise for k>2 (because of the prune step of the candidate generation method)
        Therefore, we only need to prune the itemsets containing an item and its ancestor for k=2; in the next steps, no candidate will include an item together with its ancestor
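The pruning rule can be sketched as a filter over the 2-item candidates. The taxonomy encoding and the function name `prune_item_ancestor_pairs` are assumptions made for this illustration.

```python
# Assumed encoding of the example taxonomy: child -> parent.
PARENT = {
    "Jacket": "Outwear", "Ski Pants": "Outwear",
    "Outwear": "Clothes", "Shirt": "Clothes",
    "Shoes": "Footwear", "Hiking Boots": "Footwear",
}

def ancestors(item):
    """Return all proper ancestors of an item."""
    found = set()
    while item in PARENT:
        item = PARENT[item]
        found.add(item)
    return found

def prune_item_ancestor_pairs(candidates):
    """Drop every candidate that contains an item together with one of its ancestors."""
    return {c for c in candidates if not any(ancestors(i) & c for i in c)}

candidates_k2 = {
    frozenset({"Jacket", "Outwear"}),         # item + parent: pruned
    frozenset({"Jacket", "Clothes"}),         # item + grandparent: pruned
    frozenset({"Outwear", "Hiking Boots"}),   # unrelated items: kept
    frozenset({"Hiking Boots", "Footwear"}),  # item + parent: pruned
}
kept = prune_item_ancestor_pairs(candidates_k2)
print(kept)
```

Applying the filter only at k=2 is enough when larger candidates are generated from the surviving pairs, because any larger itemset with an item and its ancestor would need the pruned pair as a frequent subset.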
  • 24. Summary
      Importance of hierarchy in real-world applications
      How?
        Build a DAG
        Redefine the problem of ARM
        Get association rules
      Don't take these ideas in isolation!
        Applicable to all the advances we will see in the next classes
        Real-world problems usually require the mixing of many ideas
  • 25. Next Class
      Advanced topics in association rule mining