Predicting Defects in SAP Java Code
An Experience Report
by Tilman Holschuh (SQS AG)
   Markus Päuser (SAP AG)
   Kim Herzig (Saarland University)
   Thomas Zimmermann (Microsoft Research)
   Rahul Premraj (Vrije University Amsterdam)
   Andreas Zeller (Saarland University)
Motivation
Quality Manager
Problems: Resources, Time, Knowledge
Where do we put the most effort?
Replicated 2 Studies
1   Input from source code: McCabe, FanOut, LoC, Coupling
2   Input from source code: Dependencies
In both studies: Version archive and Bug database → Component Quality → Predictor
The Product

‣   SAP Standard Software
‣   Large-scale Java software system ( > 10M LoC )
‣   Separated into projects
‣   Service pack release cycles
Defect Distribution
20% of the code contains ~75% of defects
Upper bound for prediction
[Treemap figure] graphic created with TreeMap (University of Maryland), see http://www.cs.umd.edu/hcil/treemap
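The concentration figure above can be computed from per-component defect counts. The following is a minimal sketch, assuming that "20% of the code" is read as the top 20% of components by defect count; the slides do not say whether the share is measured in components or in lines of code, and the counts below are made up for illustration.

```python
def defect_share_of_top(defects_per_component, fraction=0.20):
    """Share of all defects found in the most defect-prone `fraction` of components."""
    ranked = sorted(defects_per_component, reverse=True)
    # Number of components in the top segment (at least one).
    k = max(1, round(len(ranked) * fraction))
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


# Hypothetical defect counts for ten components (not data from the study).
example_counts = [40, 35, 5, 5, 4, 3, 3, 2, 2, 1]
share = defect_share_of_top(example_counts, 0.20)
print(f"Top 20% of components hold {share:.0%} of defects")
```

Under this reading, the share is also the best result any predictor could reach when it is asked to pick the same fraction of components.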
Basics
Input → Predictor Model → Output
How to collect Input Data?
1   McCabe, FanOut, LoC, Coupling
2   Dependencies
Collecting Metric Data
1   McCabe, FanOut, LoC, Coupling
‣ Metric tools: ckjm, JDepend, ephyra
‣ Static code checkers: PMD, FindBugs
‣ Change frequency
Collecting Dependency Data
2   Dependencies
‣ extracting package import relations
‣ Tool: JDepend
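To illustrate what "package import relations" look like as data, here is a rough sketch that reads them from Java source text with regular expressions. It is not how JDepend works and is not the tooling used in the study; the function name `package_dependencies` and the sample sources are hypothetical. Static imports are skipped, and import-like text inside comments would be picked up by mistake.

```python
import re
from collections import defaultdict

PACKAGE_RE = re.compile(r"^\s*package\s+([\w.]+)\s*;", re.MULTILINE)
# Matches "import a.b.C;" and "import a.b.*;" but not "import static ...".
IMPORT_RE = re.compile(r"^\s*import\s+([\w.]+)(\.\*)?\s*;", re.MULTILINE)


def package_dependencies(sources):
    """Map each declared package to the set of other packages it imports from."""
    deps = defaultdict(set)
    for text in sources:
        declared = PACKAGE_RE.search(text)
        if not declared:
            continue  # file in the default package: ignored in this sketch
        own = declared.group(1)
        for name, wildcard in IMPORT_RE.findall(text):
            # For "a.b.*" the name is already the package; otherwise drop the class name.
            target = name if wildcard else name.rpartition(".")[0]
            if target and target != own:
                deps[own].add(target)
    return dict(deps)


# Hypothetical source files, for illustration only.
SRC_UI = """package com.example.ui;
import com.example.core.Order;
import java.util.*;
import static java.lang.Math.max;
class View {}
"""
SRC_CORE = """package com.example.core;
import java.util.List;
class Order {}
"""
deps = package_dependencies([SRC_UI, SRC_CORE])
```

Each entry of `deps` is one row of the dependency input: a package and the packages it imports from.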
How to measure Component Quality?
Input ✔ → Predictor Model → Output
Component Quality
Bug database:        Bug 42233: FileSystemPreferences lockFile() should close ...
Version archive:     v1.17, v1.18, v1.19
Maintenance branch:  v1.17, v1.18, v1.19
Change "Fixed Bug 42233" → #defects + 1
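The mapping sketched on this slide, from a fix in the version archive back to a report in the bug database, can be illustrated as follows. This is a simplified sketch under assumptions: commit messages are taken to mention the bug number in the form "Bug 42233", each commit comes with the list of components it touched, and each bug is counted once per component. The function name, the message pattern, and the component names are hypothetical.

```python
import re
from collections import Counter

# Assumed message convention: "bug" followed by an optional "#" and a number.
BUG_ID_RE = re.compile(r"\bbug\s*#?(\d+)", re.IGNORECASE)


def count_defects(commits, known_bug_ids):
    """Count fixed bugs per component.

    commits: iterable of (message, components_changed) pairs.
    known_bug_ids: set of IDs present in the bug database.
    """
    defects = Counter()
    seen = set()  # (bug id, component) pairs already counted
    for message, components in commits:
        mentioned = {int(n) for n in BUG_ID_RE.findall(message)}
        for bug_id in mentioned & known_bug_ids:
            for component in components:
                if (bug_id, component) not in seen:
                    seen.add((bug_id, component))
                    defects[component] += 1  # "#defects + 1"
    return defects


# Hypothetical commits; only bug 42233 exists in the assumed bug database.
commits = [
    ("Fixed Bug 42233", ["prefs"]),
    ("Refactoring, no bug", ["prefs"]),
    ("fixed bug 42233 (follow-up)", ["prefs", "io"]),
    ("Fixed Bug 99999", ["io"]),
]
defects = count_defects(commits, known_bug_ids={42233})
```

The resulting per-component counts play the role of the "Component Quality" output that the predictor is trained on.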
How to build Predictor Models?
Linear Regression (Y = Xβ + ε): McCabe, FanOut, LoC, Coupling
Support Vector Machine: McCabe, FanOut, LoC, Coupling, Dependencies
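For the linear regression model Y = Xβ + ε, the coefficients can be estimated by ordinary least squares. The sketch below solves the normal equations in plain Python so that it has no dependencies; it is an illustration only, not the statistical setup used in the study. The two metric columns and the defect values are made up, and the data are noise-free so the fitted coefficients are easy to check.

```python
def fit_ols(X, y):
    """Least-squares estimate of beta for Y = X beta + eps (intercept column added)."""
    rows = [[1.0] + [float(v) for v in r] for r in X]
    p = len(rows[0])
    # Augmented matrix of the normal equations (X^T X | X^T y).
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot_row = max(range(c, p), key=lambda i: abs(A[i][c]))
        if abs(A[pivot_row][c]) < 1e-12:
            raise ValueError("metrics are collinear; X^T X is singular")
        A[c], A[pivot_row] = A[pivot_row], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for i in range(p):
            if i != c:
                factor = A[i][c]
                A[i] = [a - factor * b for a, b in zip(A[i], A[c])]
    return [A[i][p] for i in range(p)]


def predict(beta, row):
    """Predicted defect count for one component's metric values."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))


# Hypothetical components: columns are (LoC in hundreds, McCabe).
X = [[10, 2], [20, 1], [30, 4], [40, 3], [15, 5]]
# Made-up targets generated as 1 + 2*LoC + 0.5*McCabe.
y = [22.0, 41.5, 63.0, 82.5, 33.5]
beta = fit_ols(X, y)
```

With real data, components would then be ranked by `predict(beta, row)` and the top of the ranking compared against the actual defect counts.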
Forward Prediction
[Timeline figure: releases V1 and V2 on the time axis t; legend: static analysis, training bug data, test bug data]
Results
Metric Correlations

Metric               Aggregation   Package level (Project 2)   Class level (Project 4)
LoC                  Sum           0.583                       0.377
LoC                  Max           0.587                       n/a
McCabe               Sum           0.583                       0.299
McCabe               Max           0.588                       0.261
Efferent Coupling                  0.608                       n/a
Design Rules         Sum           0.557                       0.264
Design Rules         Max           0.578                       n/a
Changes              Sum           0.308                       0.403
Changes              Max           0.240                       n/a

Prediction is more precise at higher granularity levels
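The slides do not name the correlation coefficient behind these values. As one plausible way to relate a metric to defect counts, the sketch below computes a Spearman rank correlation in plain Python; treating the values as rank correlations is an assumption, and the sample numbers are invented.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


# Hypothetical per-package values: summed LoC and defect counts.
loc_sum = [120, 450, 300, 900, 60]
defect_counts = [1, 2, 3, 7, 0]
rho = spearman(loc_sum, defect_counts)
```

The "Sum" and "Max" rows in the table correspond to different ways of aggregating a metric before it is correlated; the same function applies to either.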
Hit Rate
          actual   predicted
             1         4
             2         9    Hit rate = 50%
             3         2
Top 20%      4        11
             5         6
             6         1
             7         3
             8         5
             9        10
            10         8
            11         7
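The example above can be reproduced with a few lines of code. The sketch reads the two columns as component IDs ordered by actual and by predicted defect-proneness, and takes the top segment to be the first four rows; both readings are assumptions, chosen because they yield the 50% shown on the slide (components 2 and 4 appear in both top segments).

```python
def hit_rate(actual_order, predicted_order, k):
    """Fraction of the actual top-k components that also appear in the predicted top-k."""
    actual_top = set(actual_order[:k])
    predicted_top = set(predicted_order[:k])
    return len(actual_top & predicted_top) / k


# Component IDs in rank order, copied from the slide.
actual = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
predicted = [4, 9, 2, 11, 6, 1, 3, 5, 10, 8, 7]
rate = hit_rate(actual, predicted, k=4)
print(f"Hit rate = {rate:.0%}")
```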
Predictions using Linear Regression
Input: McCabe, FanOut, LoC, Coupling

               Top 5%   Top 20%
All projects    46%      55%
Group 1         47%      63%
Project 1       21%      43%
Project 2       42%      64%
Project 3       41%      55%
Predicting from Dependencies
Input: Dependencies, Model: Support Vector Machine

               Top 5%   Top 20%
Group 1         26%      43%
Project 1       38%      50%
Project 2       36%      46%
Project 3       46%      49%

Stable prediction results across projects
Compare Results
[Bar chart: hit rate (0% to 80%) for Dependencies vs. Metrics across Group 1, Project 1, Project 2, Project 3]
Complexity metrics have higher predictive power
Lessons Learned

                             Nagappan et al.   Schröter et al.   our study
metrics defect correlation         ✔                 n/a             ✔
prediction possible                ✔                 ✔               ✔
forward prediction                 ✘                 ✘               ✔
universal predictor                ✘                 ✘               ✘

Predictions based on static code features provide limited results and depend on the project context

Software archives are a reliable and easily accessible source of defect data

Defects have many sources, and code is just one of them
SQS Software Quality Systems AG

Stollwerckstraße 11
51149 Cologne, Germany
Phone: + 49 22 03 91 54 - 7149
Fax: + 49 22 03 91 54 - 15
Email: tilman.holschuh@sqs.de

Internet: www.sqs-group.com
Thank you!
Predicting Defects in SAP Java Code: An Experience Report

  • 1. Predicting Defects in SAP Java Code: An Experience Report, by Tilman Holschuh (SQS AG), Markus Päuser (SAP AG), Kim Herzig (Saarland University), Thomas Zimmermann (Microsoft Research), Rahul Premraj (Vrije University Amsterdam), Andreas Zeller (Saarland University)
  • 8. Motivation ‣ Quality Manager ‣ Problems: Resources, Time, Knowledge ‣ Where do we put the most effort?
  • 14. Replicated 2 Studies (study 1) ‣ Source code: McCabe, FanOut, LoC, Coupling ‣ Version archive and bug database: Component Quality ‣ Predictor
  • 16. Replicated 2 Studies (study 2) ‣ Source code: McCabe, FanOut, LoC, Coupling, Dependencies ‣ Version archive and bug database: Component Quality ‣ Predictor
  • 17. The Product ‣ SAP Standard Software ‣ Large-scale Java software system (> 10M LoC) ‣ Separated into projects ‣ Service pack release cycles
  • 21. Defect Distribution ‣ 20% of the code contains ~75% of the defects ‣ Upper bound for prediction ‣ Graphic created with TreeMap (University of Maryland), see http://www.cs.umd.edu/hcil/treemap
  • 22. Basics: Input → Predictor Model → Output
  • 23. How to collect Input Data? ‣ 1: McCabe, FanOut, LoC, Coupling ‣ 2: Dependencies
  • 27. Collecting Metric Data (1: McCabe, FanOut, LoC, Coupling) ‣ Metric tools: ckjm, JDepend, ephyra ‣ Static code checkers: PMD, FindBugs ‣ Change frequency
  • 30. Collecting Dependency Data (2: Dependencies) ‣ Extracting package import relations ‣ Tool: JDepend
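To illustrate what "extracting package import relations" can mean in practice, here is a minimal sketch in Python. It is not how JDepend works (JDepend is the tool the slides name); it simply scans Java source text for `package` and `import` statements. The function names and the rule "leading lowercase segments form the package name" are assumptions made for this illustration, based on common Java naming conventions.

```python
import re
from collections import defaultdict

# Match "package a.b.c;" and "import [static] a.b.C;" / "import a.b.*;"
PACKAGE_RE = re.compile(r"^\s*package\s+([\w.]+)\s*;", re.MULTILINE)
IMPORT_RE = re.compile(
    r"^\s*import\s+(?:static\s+)?([\w.]+)(?:\.\*)?\s*;", re.MULTILINE
)


def imported_package(qualified_name):
    """Keep the leading lowercase segments (assumed to be the package)."""
    segments = []
    for segment in qualified_name.split("."):
        if not segment or not segment[0].islower():
            break
        segments.append(segment)
    return ".".join(segments)


def package_dependencies(sources):
    """Map each declaring package to the set of other packages it imports.

    `sources` is an iterable of Java source file contents (strings).
    Files without a package declaration are skipped in this sketch.
    """
    deps = defaultdict(set)
    for text in sources:
        match = PACKAGE_RE.search(text)
        if match is None:
            continue
        package = match.group(1)
        targets = deps[package]  # make sure the package is listed
        for name in IMPORT_RE.findall(text):
            target = imported_package(name)
            if target and target != package:
                targets.add(target)
    return dict(deps)


# Hypothetical input, invented for illustration only.
example_sources = [
    "package com.example.app;\n"
    "import java.util.List;\n"
    "import com.example.core.*;\n"
    "import static com.example.util.Strings.join;\n"
    "class A {}\n",
    "package com.example.core;\n"
    "import java.util.Map.Entry;\n"
    "class B {}\n",
]
example_deps = package_dependencies(example_sources)
```

A text scan like this can be fooled by import-like lines inside comments or strings, which is one reason to prefer a bytecode- or parser-based tool for real measurements.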
  • 31. How to measure Component Quality? Input ✔ Predictor Model Output
  • 35. Component Quality ‣ Bug database: Bug 42233, FileSystemPreferences lockFile() should close ... ‣ Version archive: v1.17, v1.18, v1.19 ‣ Fixed Bug 42233
  • 39. Component Quality ‣ Fixed Bug 42233 ‣ Version archive: v1.17, v1.18, v1.19 ‣ Maintenance branch: v1.17, v1.18, v1.19 ‣ #defects + 1
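The Component Quality slides link a bug report to the version archive through a fix message ("Fixed Bug 42233") and then increment the defect count of the affected component. A minimal sketch of that idea in Python follows. The data layout (commit message plus list of changed components), the regular expression, and the choice to count each bug at most once per component, even if the fix appears on several branches, are all assumptions for this illustration and are not taken from the slides.

```python
import re
from collections import Counter

# Assumed message convention: the word "bug" followed by a numeric ID.
BUG_ID_RE = re.compile(r"\bbug\s+#?(\d+)", re.IGNORECASE)


def count_defects(commits, known_bug_ids):
    """Count defects per component from fix commits.

    `commits` is an iterable of (message, changed_components) pairs;
    `known_bug_ids` is the set of IDs present in the bug database.
    A bug is counted once per component, however many commits mention it.
    """
    counted = set()
    defects = Counter()
    for message, components in commits:
        for raw_id in BUG_ID_RE.findall(message):
            bug_id = int(raw_id)
            if bug_id not in known_bug_ids:
                continue  # ID not in the bug database: ignore
            for component in components:
                if (bug_id, component) not in counted:
                    counted.add((bug_id, component))
                    defects[component] += 1  # "#defects + 1"
    return defects


# Hypothetical commit log, reusing the bug ID shown on the slides.
example_commits = [
    ("Fixed Bug 42233", ["FileSystemPreferences"]),
    ("Fixed Bug 42233 on maintenance branch", ["FileSystemPreferences"]),
    ("Refactoring, no bug reference", ["Other"]),
    ("Fix bug 99999", ["Other"]),
]
example_defects = count_defects(example_commits, known_bug_ids={42233})
```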
  • 40. How to build Predictor Models? ‣ Linear Regression: Y = Xβ + ε ‣ Support Vector Machine ‣ Inputs: McCabe, FanOut, LoC, Coupling; Dependencies
  • 41. Forward Prediction (timeline t with releases V1 and V2): static analysis, training bug data, test bug data
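As a rough illustration of the linear model and the forward-prediction setup, the sketch below fits an ordinary least squares line on data from one release and applies it to the next. It uses a single predictor (LoC), which is the one-variable special case of Y = Xβ + ε; the slides use several metrics. All numbers are hypothetical and the function names are invented for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, xs):
    """Apply a fitted (intercept, slope) model to new predictor values."""
    intercept, slope = model
    return [intercept + slope * x for x in xs]


# Hypothetical training data from release V1: LoC and defect counts.
v1_loc = [120, 450, 900, 300]
v1_defects = [1, 4, 9, 3]
model = fit_line(v1_loc, v1_defects)

# Hypothetical static-analysis data from release V2; the predictions
# would later be compared against V2's actual (test) bug data.
v2_loc = [200, 1000]
v2_predicted = predict(model, v2_loc)

# Rank V2 components by predicted defects, most defect-prone first.
v2_ranking = sorted(range(len(v2_loc)), key=lambda i: -v2_predicted[i])
```

The point of the forward setup is that the model never sees bug data from the release it is evaluated on.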
  • 43. Metric Correlations
      Metric              Sum/Max   Package level (Project 2)   Class level (Project 4)
      LoC                 Sum       0.583                       0.377
      LoC                 Max       0.587                       n/a
      McCabe              Sum       0.583                       0.299
      McCabe              Max       0.588                       0.261
      Efferent Coupling             0.608                       n/a
      Design Rules        Sum       0.557                       0.264
      Design Rules        Max       0.578                       n/a
      Changes             Sum       0.308                       0.403
      Changes             Max       0.240                       n/a
  • 44. Metric Correlations: Prediction is more precise at higher granularity levels
  • 45. Hit Rate (Top 20%): Hit rate = 50%
      actual   predicted
      1        4
      2        9
      3        2
      4        11
      5        6
      6        1
      7        3
      8        5
      9        10
      10       8
      11       7
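The hit rate compares the top of the predicted ranking with the top of the actual ranking. The sketch below is one plausible reading, not a definition taken from the slides: it treats the two columns as component IDs ordered by actual and by predicted defect-proneness, and counts how many of the predicted top k also appear in the actual top k. Under that assumption, a cutoff of four components reproduces the 50% shown on the slide (a strict 20% of 11 components would be closer to two, so the slide's bracket appears to be illustrative).

```python
def hit_rate(actual_ranking, predicted_ranking, k):
    """Fraction of the predicted top k that is also in the actual top k."""
    actual_top = set(actual_ranking[:k])
    hits = sum(1 for c in predicted_ranking[:k] if c in actual_top)
    return hits / k


# Component IDs in rank order, read off the slide (interpretation assumed).
actual = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
predicted = [4, 9, 2, 11, 6, 1, 3, 5, 10, 8, 7]

rate_top4 = hit_rate(actual, predicted, 4)  # components 4 and 2 are hits
```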
  • 46. Predictions using Linear Regression (inputs: McCabe, FanOut, LoC, Coupling)
                     Top 5%   Top 20%
      All projects   46%      55%
      Group 1        47%      63%
      Project 1      21%      43%
      Project 2      42%      64%
      Project 3      41%      55%
  • 47. Predicting from Dependencies (Support Vector Machine)
                  Top 5%   Top 20%
      Group 1     26%      43%
      Project 1   38%      50%
      Project 2   36%      46%
      Project 3   46%      49%
  • 48. Predicting from Dependencies: Stable prediction results across projects
  • 49. Compare Results [bar chart: hit rate (0% to 80%) of Dependencies vs. Metrics for Group 1, Project 1, Project 2, Project 3]
  • 50. Compare Results: Complexity metrics have higher predictive power
  • 51. Lessons Learned
                                   Nagappan et al.   Schröter et al.   our study
      metrics defect correlation   ✔                 n/a               ✔
      prediction possible          ✔                 ✔                 ✔
      forward prediction           ✘                 ✘                 ✔
      universal predictor          ✘                 ✘                 ✘
  • 55. Lessons Learned ‣ Predictions based on static code features provide limited results and depend on the project context ‣ Software archives are a reliable and easily accessible source of defect data ‣ Defects have many sources, and code is just one of them
  • 57. Thank you! SQS Software Quality Systems AG, Stollwerckstraße 11, 51149 Cologne, Germany. Phone: + 49 22 03 91 54 - 7149, Fax: + 49 22 03 91 54 - 15, Email: tilman.holschuh@sqs.de, Internet: www.sqs-group.com