SlideShare a Scribd company logo
1 of 72
Download to read offline
Automatic Identification of
  Bug-Introducing Changes

Sunghun Kim
Kai Pan
E. James Whitehead, Jr.
                            University of California, Santa Cruz, USA
Tom Zimmermann
                          Saarland University, Saarbrücken, Germany
Motivation



   Bugs
Motivation
Statistics




             Bugs
Motivation
Statistics
                    Detection


             Bugs
Motivation
Statistics
                     Detection


             Bugs

                    Prediction
Motivation
Statistics
                     Detection


             Bugs

                    Prediction
  Analysis
Motivation
              Statistics
                                   Detection


                           Bugs
Observation


                                  Prediction
                Analysis
So far: Focus on fixes
teicher    2003-10-29 16:11:01
fixes issues mentioned in bug 45635: [hovering] rollover hovers
- mouse exit detection is safer and should not allow for
  loopholes any more, except for shell deactiviation
- hovers behave like normal ones:
      - tooltips pop up below the control
      - they move with subjectArea
      - once a popup is showing, they will show up instantly
So far: Focus on fixes
teicher    2003-10-29 16:11:01
fixes issues mentioned in bug 45635: [hovering] rollover hovers
- mouse exit detection is safer and should not allow for
  loopholes any more, except for shell deactiviation
- hovers behave like normal ones:
      - tooltips pop up below the control
      - they move with subjectArea
      - once a popup is showing, they will show up instantly


      Fixes give only the location of a defect,
           not when it was introduced.
Bug-introducing changes

BUG-INTRODUCING
...
if (foo==null) {
    foo.bar();
...
Bug-introducing changes

BUG-INTRODUCING                 FIX
...                             ...
                   later fixed
if (foo==null) {                if (foo!=null) {
    foo.bar();                      foo.bar();
...                             ...
Bug-introducing changes

BUG-INTRODUCING                    FIX
...                                ...
                      later fixed
if (foo==null) {                   if (foo!=null) {
    foo.bar();                         foo.bar();
...                                ...



       Bug-introducing changes are changes that
      lead to problems as indicated by later fixes.
Life-cycle of a “bug”
Life-cycle of a “bug”




BUG-INTRODUCING
CHANGE
Life-cycle of a “bug”

         BUG REPORT
         fixes issues mentioned in bug 45635: [hovering] rollover hovers
         - mouse exit detection is safer and should not allow for
           loopholes any more, except for shell deactiviation
         - hovers behave like normal ones:
               - tooltips pop up below the control
               - they move with subjectArea
               - once a popup is showing, they will show up instantly




BUG-INTRODUCING
CHANGE
Life-cycle of a “bug”

         BUG REPORT
         fixes issues mentioned in bug 45635: [hovering] rollover hovers
         - mouse exit detection is safer and should not allow for
           loopholes any more, except for shell deactiviation
         - hovers behave like normal ones:
               - tooltips pop up below the control
               - they move with subjectArea
               - once a popup is showing, they will show up instantly




BUG-INTRODUCING                                                           FIX
CHANGE                                                                    CHANGE
The SZZ algorithm




                    1.18

                FIXED BUG
                42233
The SZZ algorithm
    $ cvs annotate -r 1.17 Foo.java
    ...
20: 1.11 (john 12-Feb-03):     return i/0;
    ...
40: 1.14 (kate 23-May-03):     return 42;
    ...
60: 1.16 (mary 10-Jun-03):     int i=0;




                                                1.18

                                             FIXED BUG
                                             42233
The SZZ algorithm
    $ cvs annotate -r 1.17 Foo.java
    ...
20: 1.11 (john 12-Feb-03):     return i/0;
    ...
40: 1.14 (kate 23-May-03):     return 42;
    ...
60: 1.16 (mary 10-Jun-03):     int i=0;




                                                1.18

                                             FIXED BUG
                                             42233
The SZZ algorithm
    $ cvs annotate -r 1.17 Foo.java
    ...
20: 1.11 (john 12-Feb-03):     return i/0;
    ...
40: 1.14 (kate 23-May-03):     return 42;
    ...
60: 1.16 (mary 10-Jun-03):     int i=0;




 1.11                 1.14          1.16        1.18

                                             FIXED BUG
                                             42233
The SZZ algorithm
    $ cvs annotate -r 1.17 Foo.java
    ...
20: 1.11 (john 12-Feb-03):     return i/0;
    ...
40: 1.14 (kate 23-May-03):     return 42;
    ...
60: 1.16 (mary 10-Jun-03):     int i=0;




 1.11                 1.14          1.16        1.18

 BUG                                         FIXED BUG
INTRO                                        42233
The SZZ algorithm
    $ cvs annotate -r 1.17 Foo.java
    ...
20: 1.11 (john 12-Feb-03):     return i/0;
    ...
40: 1.14 (kate 23-May-03):     return 42;
    ...
60: 1.16 (mary 10-Jun-03):     int i=0;




 1.11                 1.14          1.16        1.18

 BUG                 BUG                     FIXED BUG
INTRO               INTRO                    42233
The SZZ algorithm
    $ cvs annotate -r 1.17 Foo.java
    ...
20: 1.11 (john 12-Feb-03):     return i/0;
    ...
40: 1.14 (kate 23-May-03):     return 42;
    ...
60: 1.16 (mary 10-Jun-03):     int i=0;




 1.11                 1.14          1.16        1.18

 BUG                 BUG           BUG       FIXED BUG
INTRO               INTRO         INTRO      42233
The SZZ algorithm




 1.11        1.14    1.16      1.18

 BUG         BUG     BUG    FIXED BUG
INTRO       INTRO   INTRO   42233
The SZZ algorithm
         submitted                                                              closed
               BUG REPORT
               fixes issues mentioned in bug 45635: [hovering] rollover hovers
               - mouse exit detection is safer and should not allow for
                 loopholes any more, except for shell deactiviation
               - hovers behave like normal ones:
                     - tooltips pop up below the control
                     - they move with subjectArea
                     - once a popup is showing, they will show up instantly




 1.11        1.14                                  1.16                              1.18

 BUG         BUG                                BUG                               FIXED BUG
INTRO       INTRO                              INTRO                              42233
The SZZ algorithm
         submitted                                                              closed
               BUG REPORT
               fixes issues mentioned in bug 45635: [hovering] rollover hovers
               - mouse exit detection is safer and should not allow for
                 loopholes any more, except for shell deactiviation
               - hovers behave like normal ones:
                     - tooltips pop up below the control
                     - they move with subjectArea
                     - once a popup is showing, they will show up instantly




 1.11        1.14                                  1.16                              1.18

             BUGREMOVE
 BUG                   BUG                                                        FIXED BUG
INTRO       INTRO POSITIVES
                      INTRO                                                       42233
            FALSE
Drawbacks of SZZ

    Annotation by SCMs is insufficient.
(line number in bug-introducing revision is missing)
Drawbacks of SZZ

    Annotation by SCMs is insufficient.
(line number in bug-introducing revision is missing)



       Not all modifications are fixes.
           (blank lines, comments, etc.)
False negatives and positives
                  Correct   False negatives

False positives




BUG-INTRODUCING                     BUGS
An example
BUG-INTRODUCING
void bar() {
  if (val==null) {
    println(val);
  }
...



Revision 7: tom
- introduces the defects
An example
BUG-INTRODUCING                                        BUG FIX
void bar() {                void foo() {               void foo() {
  if (val==null) {              // print val              // print value
    println(val);               if (val==null)            if (val!=null)
  }                             {                         {
...                                println(val);             println(val);
                                }                         }
                            ...                        ...
Revision 7: tom             Revision 23: jim           Revision 42: kim
- introduces the defects    - inserts a comment        - changes comment
                            - reformats if statement   - corrects defect
An example
BUG-INTRODUCING                                        BUG FIX
void bar() {                void foo() {               void foo() {
  if (val==null) {              // print val              // print valueval
    println(val);               if (val==null)            if (val!=null)
  }                             {                         {
...                                println(val);             println(val);
                                }                         }
                            ...                        ...
Revision 7: tom             Revision 23: jim           Revision 42: kim
- introduces the defects    - inserts a comment        - changes comment
                            - reformats if statement   - corrects defect
An example
BUG-INTRODUCING                                        BUG FIX
void bar() {                void foo() {               void foo() {
  if (val==null) {              // print val              // print valueval
    println(val);               if (val==null)            if (val!=null)
  }                             {                         {
...                                println(val);             println(val);
                                }                         }
                            ...                        ...
Revision 7: tom             Revision 23: jim           Revision 42: kim
- introduces the defects    - inserts a comment        - changes comment
                            - reformats if statement   - corrects defect
An example
BUG-INTRODUCING                                        BUG FIX
void bar() {                void foo() {               void foo() {
  if (val==null) {              // print val              // print valueval
    println(val);               if (val==null)            if (val!=null)
  }                             {                         {
...                                println(val);             println(val);
                                }                         }
                            ...                        ...
Revision 7: tom             Revision 23: jim           Revision 42: kim
- introduces the defects    - inserts a comment        - changes comment
                            - reformats if statement   - corrects defect


       The originial SZZ algorithm has too many false
        positives (rev 23) and false negatives (rev 7).
Our study

                        COLUMBA          ECLIPSE (jdt.core)
Project
Software type            Email client           IDE
Investigated period    11/2002-06/2003     06/2001-03/2002
Number of revisions          500                1000
Number of fixes            143 (29%)           158 (16%)
Average LOC                48,135             111,059
Annotation graphs
Annotation graphs



                      fixed
revisions
Annotation graphs


                      foo()

                              fixed
revisions
Annotation graphs

            bar() renamed
               to foo()
                            foo()

                                    fixed
revisions
Annotation graphs
bar()

                    bar() renamed
                       to foo()
                                    foo()

                                            fixed
        revisions
Annotation graphs
bar()

                                   bar() renamed
                                      to foo()
                                                     foo()

                                                             fixed
            revisions

        SZZ reports a bug-introducing change for foo
        (false positive) but not for bar (false negative).
Annotation graphs
bar()

                                 bar() renamed
                                    to foo()
                                                 foo()

                                                         fixed
           revisions

        Using annotation graphs we can remove 2% as
         false positives and identifies further 1%~4%.
Comments & blank lines
  public void notifySourceElementRequestor() {
-
+ if (reportReferenceInfo) {
+     notifyAllUnknownReferences();
+}
  // collect the top level ast nodes
  int length = 0;
Comments & blank lines
  public void notifySourceElementRequestor() {
-
+ if (reportReferenceInfo) {
+     notifyAllUnknownReferences();
+}
  // collect the top level ast nodes
  int length = 0;



      Ignoring comments and blank lines
     removes 14%~20% as false positives.
Format changes

BUG-INTRODUCING
if ( a==true ) return;
Format changes

BUG-INTRODUCING                         BUG FIX
if ( a==true ) return;   if (a==true)   if (a==false)
                             return;        return;
Format changes

BUG-INTRODUCING                         BUG FIX
if ( a==true ) return;   if (a==true)   if (a==false)
                             return;        return;

                          FALSE
                         POSITIVE
Format changes

BUG-INTRODUCING                         BUG FIX
if ( a==true ) return;   if (a==true)   if (a==false)
                             return;        return;

     FALSE                FALSE
    NEGATIVE             POSITIVE
Format changes

BUG-INTRODUCING                          BUG FIX
if ( a==true ) return;   if (a==true)    if (a==false)
                             return;         return;

     FALSE                FALSE
    NEGATIVE             POSITIVE

    Ignoring format changes removes 18%~25% as
    false positives and identifies further 13%~14%.
Fixes that affect many files

Most large fixes are refactoring
- public boolean visit(TypeDeclaration
-   typeDeclaration, BlockScope scope){
+ public boolean visit(LocalTypeDeclaration
+ typeDeclaration, BlockScope scope){
Fixes that affect many files

Most large fixes are refactoring
- public boolean visit(TypeDeclaration
-   typeDeclaration, BlockScope scope){
+ public boolean visit(LocalTypeDeclaration
+ typeDeclaration, BlockScope scope){



       Ignoring fixes that affect many files
      (=more than five times the median)
      removes 7%~16% as false positives
Manual inspection of fixes
Two judges check whether a fix is actually a fix.
    deleteResources(actualNonJavaResources,fForce);
-   IResource[] remaingFiles;
+   IResource[] remainingFiles; 
    try {
-        remaingFiles=((IFolder)res).members();
+        remainingFiles=((IFolder)res).members();
    }
Manual inspection of fixes
Two judges check whether a fix is actually a fix.
    deleteResources(actualNonJavaResources,fForce);
-   IResource[] remaingFiles;
+   IResource[] remainingFiles; 
    try {
-        remaingFiles=((IFolder)res).members();
+        remainingFiles=((IFolder)res).members();
    }



          Manual inspection of fixes removes
            only 4%~5% as false positives.
Putting it together




                      ASE

       SZZ

                            BUGS
Putting it together
       False negatives
        of SZZ: 14%




                         ASE

       SZZ

                               BUGS
Putting it together
                  False negatives
                   of SZZ: 14%


False positives
of SZZ: 38%~51%
                                    ASE

                  SZZ

                                          BUGS
Putting it together
                       False negatives
                        of SZZ: 14%


False positives
of SZZ: 38%~51%
                                         ASE

                       SZZ
Automatically we can
remove 36%~48% as
                                               BUGS
false positive.
Don’t program on Fridays ;-)
  Monday                                           11.3%


  Tuesday                                      10.4%


Wednesday                                         11.1%


 Thursday                                                 12.1%


    Friday                                                 12.2%


 Saturday                                              11.7%


   Sunday                                              11.6%


             0%   2.5%   5.0%   7.5%     10.0%         12.5%       15.0%
                                Percentage of bug-introducing changes
Don’t program on Fridays ;-)
  Monday                                           11.3%


  Tuesday                                      10.4%


Wednesday                                         11.1%


 Thursday                                                 12.1%


    Friday                                                 12.2%


 Saturday                                              11.7%


   Sunday                                              11.6%


             0%   2.5%   5.0%   7.5%     10.0%         12.5%       15.0%
                                Percentage of bug-introducing changes
Defect-prone authors
   Authors
(anonymized)

       Fry


      Leela


    Bender


      Amy

              0%    10%   20%   30%   40%   50%      60%      70%        80%
                                                                      Frequency


                                            Bug-Introducing changes
                                            Fix changes
Risk awareness

“Safe” Location
    (green)



Risky Location
  (dark red)
Change classification
Change classification


bug-introducing (“bad”)

X      X       X    X
Change classification
    BUILD A CLASSIFIER
bug-introducing (“bad”)

X       X      X    X
Change classification
    BUILD A CLASSIFIER
bug-introducing (“bad”)

X       X      X    X
                          new change
Change classification
    BUILD A CLASSIFIER
bug-introducing (“bad”)

X       X      X    X
                                 new change


                          PREDICT QUALITY
Conclusions
Conclusions

• Bug-introducing changes tell when a defect
  was introduced, not only its location.
Conclusions

• Bug-introducing changes tell when a defect
  was introduced, not only its location.
• We can automatically identify
  – 36%~48% of SZZ as false positives and
  – further 14% of missed bug-introductions.
Conclusions

• Bug-introducing changes tell when a defect
  was introduced, not only its location.
• We can automatically identify
  – 36%~48% of SZZ as false positives and
  – further 14% of missed bug-introductions.
• Bug-introducing changes are useful for
  defect prediction and software evolution.
More
                                       minin
                                             g @A
                                   Frida             SE:
                                        y mo
           Conclusions                       rning
                                                   .



• Bug-introducing changes tell when a defect
  was introduced, not only its location.
• We can automatically identify
  – 36%~48% of SZZ as false positives and
  – further 14% of missed bug-introductions.
• Bug-introducing changes are useful for
  defect prediction and software evolution.

More Related Content

What's hot

Dalvik Source Code Reading
Dalvik Source Code ReadingDalvik Source Code Reading
Dalvik Source Code Readingkishima7
 
5 Cool Things About PLSQL
5 Cool Things About PLSQL5 Cool Things About PLSQL
5 Cool Things About PLSQLConnor McDonald
 
BSides Algiers - Stuxnet - Sofiane Talmat
BSides Algiers - Stuxnet - Sofiane TalmatBSides Algiers - Stuxnet - Sofiane Talmat
BSides Algiers - Stuxnet - Sofiane TalmatShellmates
 
openFrameworks – パーティクルを動かす、静的配列と動的配列 - 多摩美メディアアートII
openFrameworks – パーティクルを動かす、静的配列と動的配列 - 多摩美メディアアートIIopenFrameworks – パーティクルを動かす、静的配列と動的配列 - 多摩美メディアアートII
openFrameworks – パーティクルを動かす、静的配列と動的配列 - 多摩美メディアアートIIAtsushi Tadokoro
 
I am the 100% [*] by Chris Evans & Natalie Silvanovich
I am the 100% [*] by Chris Evans & Natalie SilvanovichI am the 100% [*] by Chris Evans & Natalie Silvanovich
I am the 100% [*] by Chris Evans & Natalie SilvanovichShakacon
 
Tetcon2016 160104
Tetcon2016 160104Tetcon2016 160104
Tetcon2016 160104Bordeaux I
 

What's hot (7)

Dalvik Source Code Reading
Dalvik Source Code ReadingDalvik Source Code Reading
Dalvik Source Code Reading
 
5 Cool Things About PLSQL
5 Cool Things About PLSQL5 Cool Things About PLSQL
5 Cool Things About PLSQL
 
BSides Algiers - Stuxnet - Sofiane Talmat
BSides Algiers - Stuxnet - Sofiane TalmatBSides Algiers - Stuxnet - Sofiane Talmat
BSides Algiers - Stuxnet - Sofiane Talmat
 
openFrameworks – パーティクルを動かす、静的配列と動的配列 - 多摩美メディアアートII
openFrameworks – パーティクルを動かす、静的配列と動的配列 - 多摩美メディアアートIIopenFrameworks – パーティクルを動かす、静的配列と動的配列 - 多摩美メディアアートII
openFrameworks – パーティクルを動かす、静的配列と動的配列 - 多摩美メディアアートII
 
Flappy bird
Flappy birdFlappy bird
Flappy bird
 
I am the 100% [*] by Chris Evans & Natalie Silvanovich
I am the 100% [*] by Chris Evans & Natalie SilvanovichI am the 100% [*] by Chris Evans & Natalie Silvanovich
I am the 100% [*] by Chris Evans & Natalie Silvanovich
 
Tetcon2016 160104
Tetcon2016 160104Tetcon2016 160104
Tetcon2016 160104
 

More from Thomas Zimmermann

Software Analytics = Sharing Information
Software Analytics = Sharing InformationSoftware Analytics = Sharing Information
Software Analytics = Sharing InformationThomas Zimmermann
 
Predicting Method Crashes with Bytecode Operations
Predicting Method Crashes with Bytecode OperationsPredicting Method Crashes with Bytecode Operations
Predicting Method Crashes with Bytecode OperationsThomas Zimmermann
 
Analytics for smarter software development
Analytics for smarter software development Analytics for smarter software development
Analytics for smarter software development Thomas Zimmermann
 
Characterizing and Predicting Which Bugs Get Reopened
Characterizing and Predicting Which Bugs Get ReopenedCharacterizing and Predicting Which Bugs Get Reopened
Characterizing and Predicting Which Bugs Get ReopenedThomas Zimmermann
 
Data driven games user research
Data driven games user researchData driven games user research
Data driven games user researchThomas Zimmermann
 
Not my bug! Reasons for software bug report reassignments
Not my bug! Reasons for software bug report reassignmentsNot my bug! Reasons for software bug report reassignments
Not my bug! Reasons for software bug report reassignmentsThomas Zimmermann
 
Empirical Software Engineering at Microsoft Research
Empirical Software Engineering at Microsoft ResearchEmpirical Software Engineering at Microsoft Research
Empirical Software Engineering at Microsoft ResearchThomas Zimmermann
 
Security trend analysis with CVE topic models
Security trend analysis with CVE topic modelsSecurity trend analysis with CVE topic models
Security trend analysis with CVE topic modelsThomas Zimmermann
 
Analytics for software development
Analytics for software developmentAnalytics for software development
Analytics for software developmentThomas Zimmermann
 
Characterizing and predicting which bugs get fixed
Characterizing and predicting which bugs get fixedCharacterizing and predicting which bugs get fixed
Characterizing and predicting which bugs get fixedThomas Zimmermann
 
Changes and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development ActivitiesChanges and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development ActivitiesThomas Zimmermann
 
Cross-project defect prediction
Cross-project defect predictionCross-project defect prediction
Cross-project defect predictionThomas Zimmermann
 
Changes and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development ActivitiesChanges and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development ActivitiesThomas Zimmermann
 
Predicting Defects using Network Analysis on Dependency Graphs
Predicting Defects using Network Analysis on Dependency GraphsPredicting Defects using Network Analysis on Dependency Graphs
Predicting Defects using Network Analysis on Dependency GraphsThomas Zimmermann
 
Quality of Bug Reports in Open Source
Quality of Bug Reports in Open SourceQuality of Bug Reports in Open Source
Quality of Bug Reports in Open SourceThomas Zimmermann
 
Predicting Subsystem Defects using Dependency Graph Complexities
Predicting Subsystem Defects using Dependency Graph Complexities Predicting Subsystem Defects using Dependency Graph Complexities
Predicting Subsystem Defects using Dependency Graph Complexities Thomas Zimmermann
 
Got Myth? Myths in Software Engineering
Got Myth? Myths in Software EngineeringGot Myth? Myths in Software Engineering
Got Myth? Myths in Software EngineeringThomas Zimmermann
 

More from Thomas Zimmermann (20)

Software Analytics = Sharing Information
Software Analytics = Sharing InformationSoftware Analytics = Sharing Information
Software Analytics = Sharing Information
 
MSR 2013 Preview
MSR 2013 PreviewMSR 2013 Preview
MSR 2013 Preview
 
Predicting Method Crashes with Bytecode Operations
Predicting Method Crashes with Bytecode OperationsPredicting Method Crashes with Bytecode Operations
Predicting Method Crashes with Bytecode Operations
 
Analytics for smarter software development
Analytics for smarter software development Analytics for smarter software development
Analytics for smarter software development
 
Characterizing and Predicting Which Bugs Get Reopened
Characterizing and Predicting Which Bugs Get ReopenedCharacterizing and Predicting Which Bugs Get Reopened
Characterizing and Predicting Which Bugs Get Reopened
 
Klingon Countdown Timer
Klingon Countdown TimerKlingon Countdown Timer
Klingon Countdown Timer
 
Data driven games user research
Data driven games user researchData driven games user research
Data driven games user research
 
Not my bug! Reasons for software bug report reassignments
Not my bug! Reasons for software bug report reassignmentsNot my bug! Reasons for software bug report reassignments
Not my bug! Reasons for software bug report reassignments
 
Empirical Software Engineering at Microsoft Research
Empirical Software Engineering at Microsoft ResearchEmpirical Software Engineering at Microsoft Research
Empirical Software Engineering at Microsoft Research
 
Security trend analysis with CVE topic models
Security trend analysis with CVE topic modelsSecurity trend analysis with CVE topic models
Security trend analysis with CVE topic models
 
Analytics for software development
Analytics for software developmentAnalytics for software development
Analytics for software development
 
Characterizing and predicting which bugs get fixed
Characterizing and predicting which bugs get fixedCharacterizing and predicting which bugs get fixed
Characterizing and predicting which bugs get fixed
 
Changes and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development ActivitiesChanges and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development Activities
 
Cross-project defect prediction
Cross-project defect predictionCross-project defect prediction
Cross-project defect prediction
 
Changes and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development ActivitiesChanges and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development Activities
 
Predicting Defects using Network Analysis on Dependency Graphs
Predicting Defects using Network Analysis on Dependency GraphsPredicting Defects using Network Analysis on Dependency Graphs
Predicting Defects using Network Analysis on Dependency Graphs
 
Quality of Bug Reports in Open Source
Quality of Bug Reports in Open SourceQuality of Bug Reports in Open Source
Quality of Bug Reports in Open Source
 
Meet Tom and his Fish
Meet Tom and his FishMeet Tom and his Fish
Meet Tom and his Fish
 
Predicting Subsystem Defects using Dependency Graph Complexities
Predicting Subsystem Defects using Dependency Graph Complexities Predicting Subsystem Defects using Dependency Graph Complexities
Predicting Subsystem Defects using Dependency Graph Complexities
 
Got Myth? Myths in Software Engineering
Got Myth? Myths in Software EngineeringGot Myth? Myths in Software Engineering
Got Myth? Myths in Software Engineering
 

Recently uploaded

CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 

Recently uploaded (20)

CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 

Automatic Identification of Bug-Introducing Changes

  • 1. Automatic Identification of Bug-Introducing Changes Sunghun Kim Kai Pan E. James Whitehead, Jr. University of California, Santa Cruz, USA Tom Zimmermann Saarland University, Saarbrücken, Germany
  • 4. Motivation Statistics Detection Bugs
  • 5. Motivation Statistics Detection Bugs Prediction
  • 6. Motivation Statistics Detection Bugs Prediction Analysis
  • 7. Motivation Statistics Detection Bugs Observation Prediction Analysis
  • 8. So far: Focus on fixes teicher 2003-10-29 16:11:01 fixes issues mentioned in bug 45635: [hovering] rollover hovers - mouse exit detection is safer and should not allow for loopholes any more, except for shell deactiviation - hovers behave like normal ones: - tooltips pop up below the control - they move with subjectArea - once a popup is showing, they will show up instantly
  • 9. So far: Focus on fixes teicher 2003-10-29 16:11:01 fixes issues mentioned in bug 45635: [hovering] rollover hovers - mouse exit detection is safer and should not allow for loopholes any more, except for shell deactiviation - hovers behave like normal ones: - tooltips pop up below the control - they move with subjectArea - once a popup is showing, they will show up instantly Fixes give only the location of a defect, not when it was introduced.
  • 11. Bug-introducing changes BUG-INTRODUCING FIX ... ... later fixed if (foo==null) { if (foo!=null) { foo.bar(); foo.bar(); ... ...
  • 12. Bug-introducing changes BUG-INTRODUCING FIX ... ... later fixed if (foo==null) { if (foo!=null) { foo.bar(); foo.bar(); ... ... Bug-introducing changes are changes that lead to problems as indicated by later fixes.
  • 13. Life-cycle of a “bug”
  • 14. Life-cycle of a “bug” BUG-INTRODUCING CHANGE
  • 15. Life-cycle of a “bug” BUG REPORT fixes issues mentioned in bug 45635: [hovering] rollover hovers - mouse exit detection is safer and should not allow for loopholes any more, except for shell deactiviation - hovers behave like normal ones: - tooltips pop up below the control - they move with subjectArea - once a popup is showing, they will show up instantly BUG-INTRODUCING CHANGE
  • 16. Life-cycle of a “bug” BUG REPORT fixes issues mentioned in bug 45635: [hovering] rollover hovers - mouse exit detection is safer and should not allow for loopholes any more, except for shell deactiviation - hovers behave like normal ones: - tooltips pop up below the control - they move with subjectArea - once a popup is showing, they will show up instantly BUG-INTRODUCING FIX CHANGE CHANGE
  • 17. The SZZ algorithm 1.18 FIXED BUG 42233
  • 18. The SZZ algorithm $ cvs annotate -r 1.17 Foo.java ... 20: 1.11 (john 12-Feb-03): return i/0; ... 40: 1.14 (kate 23-May-03): return 42; ... 60: 1.16 (mary 10-Jun-03): int i=0; 1.18 FIXED BUG 42233
  • 19. The SZZ algorithm $ cvs annotate -r 1.17 Foo.java ... 20: 1.11 (john 12-Feb-03): return i/0; ... 40: 1.14 (kate 23-May-03): return 42; ... 60: 1.16 (mary 10-Jun-03): int i=0; 1.18 FIXED BUG 42233
  • 20. The SZZ algorithm $ cvs annotate -r 1.17 Foo.java ... 20: 1.11 (john 12-Feb-03): return i/0; ... 40: 1.14 (kate 23-May-03): return 42; ... 60: 1.16 (mary 10-Jun-03): int i=0; 1.11 1.14 1.16 1.18 FIXED BUG 42233
  • 21. The SZZ algorithm $ cvs annotate -r 1.17 Foo.java ... 20: 1.11 (john 12-Feb-03): return i/0; ... 40: 1.14 (kate 23-May-03): return 42; ... 60: 1.16 (mary 10-Jun-03): int i=0; 1.11 1.14 1.16 1.18 BUG FIXED BUG INTRO 42233
  • 22. The SZZ algorithm $ cvs annotate -r 1.17 Foo.java ... 20: 1.11 (john 12-Feb-03): return i/0; ... 40: 1.14 (kate 23-May-03): return 42; ... 60: 1.16 (mary 10-Jun-03): int i=0; 1.11 1.14 1.16 1.18 BUG BUG FIXED BUG INTRO INTRO 42233
  • 23. The SZZ algorithm $ cvs annotate -r 1.17 Foo.java ... 20: 1.11 (john 12-Feb-03): return i/0; ... 40: 1.14 (kate 23-May-03): return 42; ... 60: 1.16 (mary 10-Jun-03): int i=0; 1.11 1.14 1.16 1.18 BUG BUG BUG FIXED BUG INTRO INTRO INTRO 42233
  • 24. The SZZ algorithm 1.11 1.14 1.16 1.18 BUG BUG BUG FIXED BUG INTRO INTRO INTRO 42233
  • 25. The SZZ algorithm submitted closed BUG REPORT fixes issues mentioned in bug 45635: [hovering] rollover hovers - mouse exit detection is safer and should not allow for loopholes any more, except for shell deactiviation - hovers behave like normal ones: - tooltips pop up below the control - they move with subjectArea - once a popup is showing, they will show up instantly 1.11 1.14 1.16 1.18 BUG BUG BUG FIXED BUG INTRO INTRO INTRO 42233
  • 26. The SZZ algorithm submitted closed BUG REPORT fixes issues mentioned in bug 45635: [hovering] rollover hovers - mouse exit detection is safer and should not allow for loopholes any more, except for shell deactiviation - hovers behave like normal ones: - tooltips pop up below the control - they move with subjectArea - once a popup is showing, they will show up instantly 1.11 1.14 1.16 1.18 BUGREMOVE BUG BUG FIXED BUG INTRO INTRO POSITIVES INTRO 42233 FALSE
  • 27. Drawbacks of SZZ Annotation by SCMs is insufficient. (line number in bug-introducing revision is missing)
  • 28. Drawbacks of SZZ Annotation by SCMs is insufficient. (line number in bug-introducing revision is missing) Not all modifications are fixes. (blank lines, comments, etc.)
  • 29. False negatives and positives Correct False negatives False positives BUG-INTRODUCING BUGS
  • 30. An example BUG-INTRODUCING void bar() { if (val==null) { println(val); } ... Revision 7: tom - introduces the defects
  • 31. An example BUG-INTRODUCING BUG FIX void bar() { void foo() { void foo() { if (val==null) { // print val // print value println(val); if (val==null) if (val!=null) } { { ... println(val); println(val); } } ... ... Revision 7: tom Revision 23: jim Revision 42: kim - introduces the defects - inserts a comment - changes comment - reformats if statement - corrects defect
  • 32. An example BUG-INTRODUCING BUG FIX void bar() { void foo() { void foo() { if (val==null) { // print val // print valueval println(val); if (val==null) if (val!=null) } { { ... println(val); println(val); } } ... ... Revision 7: tom Revision 23: jim Revision 42: kim - introduces the defects - inserts a comment - changes comment - reformats if statement - corrects defect
  • 33. An example BUG-INTRODUCING BUG FIX void bar() { void foo() { void foo() { if (val==null) { // print val // print valueval println(val); if (val==null) if (val!=null) } { { ... println(val); println(val); } } ... ... Revision 7: tom Revision 23: jim Revision 42: kim - introduces the defects - inserts a comment - changes comment - reformats if statement - corrects defect
  • 34. An example BUG-INTRODUCING BUG FIX void bar() { void foo() { void foo() { if (val==null) { // print val // print valueval println(val); if (val==null) if (val!=null) } { { ... println(val); println(val); } } ... ... Revision 7: tom Revision 23: jim Revision 42: kim - introduces the defects - inserts a comment - changes comment - reformats if statement - corrects defect The originial SZZ algorithm has too many false positives (rev 23) and false negatives (rev 7).
  • 35. Our study COLUMBA ECLIPSE (jdt.core) Project Software type Email client IDE Investigated period 11/2002-06/2003 06/2001-03/2002 Number of revisions 500 1000 Number of fixes 143 (29%) 158 (16%) Average LOC 48,135 111,059
  • 37. Annotation graphs fixed revisions
  • 38. Annotation graphs foo() fixed revisions
  • 39. Annotation graphs bar() renamed to foo() foo() fixed revisions
  • 40. Annotation graphs bar() bar() renamed to foo() foo() fixed revisions
  • 41. Annotation graphs bar() bar() renamed to foo() foo() fixed revisions SZZ reports a bug-introducing change for foo (false positive) but not for bar (false negative).
  • 42. Annotation graphs bar() bar() renamed to foo() foo() fixed revisions Using annotation graphs we can remove 2% as false positives and identifies further 1%~4%.
  • 43. Comments & blank lines public void notifySourceElementRequestor() { - + if (reportReferenceInfo) { + notifyAllUnknownReferences(); +} // collect the top level ast nodes int length = 0;
  • 44. Comments & blank lines public void notifySourceElementRequestor() { - + if (reportReferenceInfo) { + notifyAllUnknownReferences(); +} // collect the top level ast nodes int length = 0; Ignoring comments and blank lines removes 14%~20% as false positives.
  • 46. Format changes BUG-INTRODUCING BUG FIX if ( a==true ) return; if (a==true) if (a==false) return; return;
  • 47. Format changes BUG-INTRODUCING BUG FIX if ( a==true ) return; if (a==true) if (a==false) return; return; FALSE POSITIVE
  • 48. Format changes BUG-INTRODUCING BUG FIX if ( a==true ) return; if (a==true) if (a==false) return; return; FALSE FALSE NEGATIVE POSITIVE
  • 49. Format changes BUG-INTRODUCING BUG FIX if ( a==true ) return; if (a==true) if (a==false) return; return; FALSE FALSE NEGATIVE POSITIVE Ignoring format changes removes 18%~25% as false positives and identifies further 13%~14%.
  • 50. Fixes that affect many files Most large fixes are refactoring - public boolean visit(TypeDeclaration - typeDeclaration, BlockScope scope){ + public boolean visit(LocalTypeDeclaration + typeDeclaration, BlockScope scope){
  • 51. Fixes that affect many files Most large fixes are refactoring - public boolean visit(TypeDeclaration - typeDeclaration, BlockScope scope){ + public boolean visit(LocalTypeDeclaration + typeDeclaration, BlockScope scope){ Ignoring fixes that affect many files (=more than five times the median) removes 7%~16% as false positives
  • 52.
  • 53. Manual inspection of fixes Two judges check whether a fix is actually a fix. deleteResources(actualNonJavaResources,fForce); - IResource[] remaingFiles; + IResource[] remainingFiles;  try { - remaingFiles=((IFolder)res).members(); + remainingFiles=((IFolder)res).members(); }
  • 54. Manual inspection of fixes Two judges check whether a fix is actually a fix. deleteResources(actualNonJavaResources,fForce); - IResource[] remaingFiles; + IResource[] remainingFiles;  try { - remaingFiles=((IFolder)res).members(); + remainingFiles=((IFolder)res).members(); } Manual inspection of fixes removes only 4%~5% as false positives.
  • 55. Putting it together ASE SZZ BUGS
  • 56. Putting it together False negatives of SZZ: 14% ASE SZZ BUGS
  • 57. Putting it together False negatives of SZZ: 14% False positives of SZZ: 38%~51% ASE SZZ BUGS
  • 58. Putting it together False negatives of SZZ: 14% False positives of SZZ: 38%~51% ASE SZZ Automatically we can remove 36%~48% as BUGS false positive.
  • 59. Don’t program on Fridays ;-) Monday 11.3% Tuesday 10.4% Wednesday 11.1% Thursday 12.1% Friday 12.2% Saturday 11.7% Sunday 11.6% 0% 2.5% 5.0% 7.5% 10.0% 12.5% 15.0% Percentage of bug-introducing changes
  • 60. Don’t program on Fridays ;-) Monday 11.3% Tuesday 10.4% Wednesday 11.1% Thursday 12.1% Friday 12.2% Saturday 11.7% Sunday 11.6% 0% 2.5% 5.0% 7.5% 10.0% 12.5% 15.0% Percentage of bug-introducing changes
  • 61. Defect-prone authors Authors (anonymized) Fry Leela Bender Amy 0% 10% 20% 30% 40% 50% 60% 70% 80% Frequency Bug-Introducing changes Fix changes
  • 62. Risk awareness “Safe” Location (green) Risky Location (dark red)
  • 65. Change classification BUILD A CLASSIFIER bug-introducing (“bad”) X X X X
  • 66. Change classification BUILD A CLASSIFIER bug-introducing (“bad”) X X X X new change
  • 67. Change classification BUILD A CLASSIFIER bug-introducing (“bad”) X X X X new change PREDICT QUALITY
  • 69. Conclusions • Bug-introducing changes tell when a defect was introduced, not only its location.
  • 70. Conclusions • Bug-introducing changes tell when a defect was introduced, not only its location. • We can automatically identify – 36%~48% of SZZ as false positives and – further 14% of missed bug-introductions.
  • 71. Conclusions • Bug-introducing changes tell when a defect was introduced, not only its location. • We can automatically identify – 36%~48% of SZZ as false positives and – further 14% of missed bug-introductions. • Bug-introducing changes are useful for defect prediction and software evolution.
  • 72. More minin g @A Frida SE: y mo Conclusions rning . • Bug-introducing changes tell when a defect was introduced, not only its location. • We can automatically identify – 36%~48% of SZZ as false positives and – further 14% of missed bug-introductions. • Bug-introducing changes are useful for defect prediction and software evolution.