This is a talk I gave at the 2009 Working Conference on Reverse Engineering (WCRE) in Lille, France, about our work on how inconsistent changes to code clones affect software quality when observed at release level.
An Empirical Study on Inconsistent Changes to Code Clones at Release Level
1. Nicolas Bettenburg, Walid Ibrahim, Ahmed E. Hassan, Weiyi Shang, Bram Adams, Ying Zou
3. 2
Code Clones: Recent Research in the Field
“Cloning Considered Harmful” Considered Harmful. Cory Kapser and Michael W. Godfrey, Software Architecture Group (SWAG), David R. Cheriton School of Computer Science, University of Waterloo, {cjkapser, migod}@uwaterloo.ca. Callout: Cloning as Engineering Tool
Do Code Clones Matter? Elmar Juergens, Florian Deissenboeck, Benjamin Hummel, Stefan Wagner, Institut für Informatik, Technische Universität München, Boltzmannstr. 3, 85748 Garching b. München, Germany, {juergens,deissenb,hummelb,wagnerst}@in.tum.de. Callout: Inconsistent Clones, Single Snapshots
A Study of Consistent and Inconsistent Changes to Code Clones. Jens Krinke, FernUniversität in Hagen, Germany, krinke@acm.org. Callout: Inconsistent Clones, Weekly Snapshots
6. 3
Code Clones: Inconsistent Changes
“During the evolution of a system, code clones
should be changed consistently to prevent bugs.”
Demonstrated to be true at a micro-level!
9. 4
Revision Level vs. Release Level Analysis
[Diagram: Amount of Code Clones and Inconsistent Changes over Time. Revisions: r2014 ... r2209 ... r2351 ... r2682. Releases: 2.1, 2.2, 2.3, 2.4, 3.0. Labels: Transient Effects; Experimentation, Refactoring, Bug-Fixing.]
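The difference between the two granularities can be sketched in a few lines of Java. This is a minimal illustration, not the tooling used in the study: the class `ReleaseLevelView`, its input format, and the choice of which revisions count as release snapshots are all assumptions made for the example. It shows how a clone pair that becomes inconsistent at one revision and is brought back in sync at a later one can look consistent at every release snapshot.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: observing a clone pair only at release snapshots. */
final class ReleaseLevelView {
    private ReleaseLevelView() {}

    /**
     * @param consistentAfterRevision revision number -> whether the clone pair is
     *                                consistent after that revision (assumed input format)
     * @param snapshotRevisions       revisions at which the pair is observed
     * @return the pair's consistency as seen at each observed revision
     */
    static List<Boolean> stateAtReleases(TreeMap<Integer, Boolean> consistentAfterRevision,
                                         List<Integer> snapshotRevisions) {
        List<Boolean> seen = new ArrayList<>();
        for (int snapshot : snapshotRevisions) {
            // Latest recorded state at or before the observed revision.
            Map.Entry<Integer, Boolean> latest = consistentAfterRevision.floorEntry(snapshot);
            // Assumption: with no earlier record, the pair counts as consistent.
            seen.add(latest == null || latest.getValue());
        }
        return seen;
    }
}
```

With a hypothetical history in which the pair diverges at r2209 and is re-synchronised at r2351, observing only r2014 and r2682 reports the pair as consistent both times, so the transient inconsistency never shows up at release level. Observing all four revisions exposes it.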
16. 5
Study Design: Subject Systems
22 releases over 1 year; 51 days / release; 15k lines of code
50 releases over 4 years; 36 days / release; 90k lines of code
22. 8
Research Questions
Q1: What are the characteristics of long-lived clone genealogies at release level?
Q2: What is the effect of inconsistent changes on code quality when measured at release level?
Q3: What type of cloning patterns do we observe at release level?
23. 9
Research Questions
Q1: What are the characteristics of long-lived clone genealogies at release level?
Life-Time, Group Size
24. Lifetime of Clone Groups 10
[Plot: Number of Releases (axis ticks 1, 2, 5, 10, 20, 50) and Number of Genealogies, for Apache Mina and jEdit. Callout: Long-lived clone groups]
28. 11
Size of Clone Groups
[Plot: Number of Clones (axis ticks 1, 2, 5, 10, 20, 50, 100, 200) and Number of Genealogies, for Apache Mina and jEdit. Callout: Mostly small clone groups]
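The two measures plotted for Q1, lifetime and group size, can be illustrated with a small Java sketch. The slides do not spell out how either measure was computed, so the definitions below are one plausible reading and the class `GenealogyStats` is hypothetical: a genealogy is given as the number of clones in the group at each release snapshot, and the group counts as alive in a release when it has at least two clones.

```java
import java.util.List;

/** Hypothetical sketch of per-genealogy measures at release level. */
final class GenealogyStats {
    private GenealogyStats() {}

    /** Number of releases in which the group exists, i.e. has at least two clones. */
    static int lifetimeInReleases(List<Integer> clonesPerRelease) {
        int lifetime = 0;
        for (int size : clonesPerRelease) {
            if (size >= 2) {
                lifetime++;
            }
        }
        return lifetime;
    }

    /** Largest number of clones the group reaches in any release. */
    static int largestGroupSize(List<Integer> clonesPerRelease) {
        int largest = 0;
        for (int size : clonesPerRelease) {
            largest = Math.max(largest, size);
        }
        return largest;
    }
}
```

For a group with sizes 0, 2, 3, 3, 0 over five releases, this reading gives a lifetime of three releases and a largest group size of three clones.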
31. 12
Research Questions
Q2: What is the effect of inconsistent changes on code quality when measured at release level?
[Diagram: Inconsistent Change Reports for releases 2.1, 2.2, 2.3; Inspection]
32. 13
Research Question Q2
jEdit 4.0.2
org.gjt.sp.jedit.jEdit.newView(View, Buffer)
{
    ...
    // show tip of the day
    if(newView == viewsFirst)
    {
        // Don't show the welcome message if jEdit was started
        // with the -nosettings switch
        if(settingsDirectory != null
            && getBooleanProperty("firstTime"))
            new HelpViewer("jeditresource:/doc/welcome.html");
        ...
org.gjt.sp.jedit.jEdit.newView(View, String)
{
    ...
    // show tip of the day
    if(newView == viewsFirst)
    {
        // Don't show the welcome message if jEdit was started
        // with the -nosettings switch
        if(settingsDirectory != null
            && getBooleanProperty("firstTime"))
            new HelpViewer("jeditresource:/doc/welcome.html");
        ...
33. 14
Research Question Q2
jEdit releases 4.0.2 and 4.0.3
[Code listing identical to the previous slide: both newView clones still call new HelpViewer("jeditresource:/doc/welcome.html").]
34. 15
Research Question Q2
jEdit releases 4.0.3 and 4.0.4
org.gjt.sp.jedit.jEdit.newView(View, Buffer)
{
    ...
    // show tip of the day
    if(newView == viewsFirst)
    {
        // Don't show the welcome message if jEdit was started
        // with the -nosettings switch
        if(settingsDirectory != null
            && getBooleanProperty("firstTime"))
            new HelpViewer();
        ...
org.gjt.sp.jedit.jEdit.newView(View, String)
{
    ...
    // show tip of the day
    if(newView == viewsFirst)
    {
        // Don't show the welcome message if jEdit was started
        // with the -nosettings switch
        if(settingsDirectory != null
            && getBooleanProperty("firstTime"))
            new HelpViewer("jeditresource:/doc/welcome.html");
        ...
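In the jEdit listing, only newView(View, Buffer) switches to new HelpViewer() while newView(View, String) keeps the welcome URL. The Java sketch below shows how such a change could be flagged between two release snapshots. It is an assumption-laden illustration, not the tool used in the study: the class `InconsistentChangeCheck`, the map-based input (clone name to source text), and the exact-text comparison are all hypothetical simplifications.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: classifying how a clone group changed between two releases. */
final class InconsistentChangeCheck {
    private InconsistentChangeCheck() {}

    enum Kind { UNCHANGED, CONSISTENT, INCONSISTENT }

    /**
     * @param before clone name -> source text in the earlier release
     * @param after  clone name -> source text in the later release
     */
    static Kind classify(Map<String, String> before, Map<String, String> after) {
        int shared = 0;
        int changed = 0;
        Set<String> resultingTexts = new HashSet<>();
        for (Map.Entry<String, String> member : before.entrySet()) {
            String later = after.get(member.getKey());
            if (later == null) {
                continue; // clone no longer exists; ignored in this sketch
            }
            shared++;
            resultingTexts.add(later);
            if (!later.equals(member.getValue())) {
                changed++;
            }
        }
        if (changed == 0) {
            return Kind.UNCHANGED;
        }
        // Assumption: a consistent change touches every clone and leaves them identical.
        boolean allChangedAlike = changed == shared && resultingTexts.size() == 1;
        return allChangedAlike ? Kind.CONSISTENT : Kind.INCONSISTENT;
    }
}
```

Feeding in the two newView clones with the texts shown for 4.0.3 and 4.0.4 yields INCONSISTENT, because only one of the two clones changed. A flag like this only marks a candidate: deciding whether the change is a bug or deliberate still needs manual inspection.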
35. 16
Research Question Q2
• 748 inconsistent changes flagged by our tool
• Manual inspection of reports and source code
• Only 7 inconsistent changes related to bugs
• Inconsistent changes seem to be carried out on purpose.
Only a fraction of inconsistent changes to long-lived clones introduce bugs!
37. 17
Research Questions
Q3: What type of cloning patterns do we observe at release level?
[Diagram: Clone Reports for releases 2.1, 2.2, 2.3, 2.4, 3.0; Classification; Clone Patterns]