On the Relationship Between Change Coupling and Software Defects
1. On the Relationship Between Change Coupling and Software Defects
Marco D’Ambros
Michele Lanza
Romain Robbes
REVEAL @ University of Lugano, CH
2. Change Coupling
[Figure: change timelines of two classes, c1 and c2, over times t1 to t6]
Ball 1997, Gall 1998
“The implicit relationship of two software artifacts that frequently change together”
[Background image: first page of Harald Gall, Karin Hajek, and Mehdi Jazayeri, “Detection of Logical Coupling Based on Product Release History”, Proceedings of the International Conference on Software Maintenance 1998 (ICSM ’98), Technical University of Vienna, Distributed Systems Group]
3. Previous research on Change Coupling
Gall et al., IWPSE ’03
Change coupling
points to architectural
weaknesses
4. Previous research on Change Coupling
Pinzger et al., SoftVis ’05
Change coupling
facilitates the detection
of refactoring candidates
5. Previous research on Change Coupling
Beyer et al., WCRE ’06
Change coupling helps
the comprehension of
system modularization
6. Previous research on Change Coupling
D’Ambros et al., TSE ’09
Change coupling helps
in spotting misplaced
software components
7. Previous research on Change Coupling
D’Ambros et al., TSE ’09
Change coupling helps
in spotting misplaced
software components
What about change coupling and software defects?
8. Previous research on Change Coupling
D’Ambros et al., TSE ’09
Change coupling helps
in spotting misplaced
software components
What about change coupling and software defects?
Q1 Does change coupling correlate with software defects?
9. Previous research on Change Coupling
D’Ambros et al., TSE ’09
Change coupling helps
in spotting misplaced
software components
What about change coupling and software defects?
Q1 Does change coupling correlate with software defects?
Q2 Does it improve existing defect prediction techniques?
11. Change Coupling metrics
Class   #defects   Class-level change coupling metrics
Foo     1          7
Bar     10         4
Boo     2          0
?: relationship between #defects and the coupling metrics
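The question mark on this slide asks whether defect counts and coupling metrics move together. One way to check is Spearman rank correlation, the statistic reported on the later result slides. The sketch below is a plain-Python version applied to the three toy rows above. It assumes each row reads as (class, #defects, metric value), and the helper names `ranks` and `spearman` are illustrative. Three data points are far too few to conclude anything; the example only shows the mechanics.

```python
from statistics import mean


def ranks(values):
    """Assign 1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for position in range(i, j + 1):
            result[order[position]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = ranks(xs), ranks(ys)
    mx, my = mean(rx), mean(ry)
    covariance = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return covariance / (var_x * var_y) ** 0.5


# Toy rows from the slide, read as (class, #defects, coupling metric): an assumption.
defects = [1, 10, 2]   # Foo, Bar, Boo
coupling = [7, 4, 0]   # Foo, Bar, Boo

rho = spearman(defects, coupling)
print(f"Spearman correlation on the toy rows: {rho:.2f}")
```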
12. n-coupled classes
[Figure: change timelines of classes c1 to c5 over times t1 to t6]
Two classes are n-coupled if they changed together at least n times
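The definition can be sketched in a few lines of Python. The change history below is hypothetical: the slide's figure did not survive extraction, so the transactions were invented for illustration. The names `HISTORY`, `co_change_counts`, and `n_coupled_pairs` are also assumptions, not the authors' tooling. Each transaction is modeled as the set of classes changed together at one point in time.

```python
from collections import Counter
from itertools import combinations

# Hypothetical history, oldest first: the set of classes changed at each time.
HISTORY = [
    {"c1", "c2", "c5"},        # t1
    {"c2", "c3", "c5"},        # t2
    {"c1", "c2", "c5"},        # t3
    {"c1", "c2", "c4", "c5"},  # t4
    {"c3", "c4"},              # t5
    {"c1", "c2", "c5"},        # t6
]


def co_change_counts(history):
    """Count, for every pair of classes, how many transactions changed both."""
    counts = Counter()
    for changed in history:
        for a, b in combinations(sorted(changed), 2):
            counts[(a, b)] += 1
    return counts


def n_coupled_pairs(history, n):
    """Return the pairs of classes that changed together at least n times."""
    return {pair for pair, count in co_change_counts(history).items() if count >= n}


print(sorted(n_coupled_pairs(HISTORY, 4)))
print(sorted(n_coupled_pairs(HISTORY, 5)))
```

Raising n from 4 to 5 shrinks the set of coupled pairs, which is the effect of the threshold: a higher n keeps only the strongest co-change relationships.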
13. n-coupled classes n=4
[Figure: change timelines of classes c1 to c5 over times t1 to t6]
Two classes are n-coupled if they changed together at least n times
16. n-coupled classes n=5
[Figure: change timelines of classes c1 to c5 over times t1 to t6]
Two classes are n-coupled if they changed together at least n times
17. NOCC(class, n): Number Of Coupled Classes
NOCC(c2, 4) = 1
[Figure: change timelines of classes c1 to c5 over times t1 to t6; c1 is marked as coupled with c2]
18. NOCC(class, n): Number Of Coupled Classes
NOCC(c2, 4) = 1 + 1 = 2
[Figure: change timelines of classes c1 to c5 over times t1 to t6; c1 and c5 are each marked as coupled with c2]
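A minimal sketch of NOCC follows, under the same caveat as any reconstruction of this slide: the figure is lost, so the history is a hypothetical one chosen so that c2 is 4-coupled with exactly two classes (c1 and c5), matching the count on the slide. The function name `nocc` and the data layout are assumptions.

```python
from collections import Counter

# Hypothetical history, oldest first: the set of classes changed at each time.
HISTORY = [
    {"c1", "c2", "c5"},        # t1
    {"c2", "c3", "c5"},        # t2
    {"c1", "c2", "c5"},        # t3
    {"c1", "c2", "c4", "c5"},  # t4
    {"c3", "c4"},              # t5
    {"c1", "c2", "c5"},        # t6
]


def nocc(history, cls, n):
    """Number Of Coupled Classes: how many classes are n-coupled with cls."""
    shared = Counter()
    for changed in history:
        if cls in changed:
            for other in changed - {cls}:
                shared[other] += 1
    # Each class that shares at least n transactions with cls contributes 1.
    return sum(1 for count in shared.values() if count >= n)


print(nocc(HISTORY, "c2", 4))  # c1 and c5 each contribute 1
```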
19. SOC(class, n): Sum Of Coupling
SOC(c2, 4) =
[Figure: change timelines of classes c1 to c5 over times t1 to t6]
20. SOC(class, n): Sum Of Coupling
SOC(c2, 4) = 4
[Figure: change timelines of classes c1 to c5 over times t1 to t6; the changes shared by c1 and c2 are numbered 1 to 4]
21. SOC(class, n): Sum Of Coupling
SOC(c2, 4) = 4 + 5 = 9
[Figure: change timelines of classes c1 to c5 over times t1 to t6; the changes shared by c5 and c2 are numbered 1 to 5]
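Where NOCC counts coupled classes, SOC adds up the shared changes themselves. The sketch below uses a hypothetical history (the figure is lost) built so that c2 shares 4 changes with c1 and 5 with c5, reproducing the 4 + 5 on the slide. The function name `soc` is an assumption.

```python
from collections import Counter

# Hypothetical history, oldest first: the set of classes changed at each time.
HISTORY = [
    {"c1", "c2", "c5"},        # t1
    {"c2", "c3", "c5"},        # t2
    {"c1", "c2", "c5"},        # t3
    {"c1", "c2", "c4", "c5"},  # t4
    {"c3", "c4"},              # t5
    {"c1", "c2", "c5"},        # t6
]


def soc(history, cls, n):
    """Sum Of Coupling: total shared changes with all classes n-coupled to cls."""
    shared = Counter()
    for changed in history:
        if cls in changed:
            for other in changed - {cls}:
                shared[other] += 1
    # Only classes that reach the threshold n contribute their shared changes.
    return sum(count for count in shared.values() if count >= n)


print(soc(HISTORY, "c2", 4))  # 4 shared with c1 plus 5 shared with c5
```

With n = 1 every co-change counts, so SOC grows as the threshold drops. In this sketch the single changes c2 shares with c3 and c4 only enter the sum once n falls to 1.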
22. Metrics with linear and exponential decay
EWSOC: Exponentially Weighted Sum Of Coupling
LWSOC: Linearly Weighted Sum Of Coupling
[Figure: change timelines of c1 and c2 over times t1 to t6; positions are labeled k=5 to k=1, with k=1 nearest the current release]
23. Metrics with linear and exponential decay
EWSOC: Exponentially Weighted Sum Of Coupling
LWSOC: Linearly Weighted Sum Of Coupling
Exponential weight = 1/2^4
Linear weight = 1/5
[Figure: change timelines of c1 and c2 over times t1 to t6; positions are labeled k=5 to k=1, with k=1 nearest the current release]
24. Metrics with linear and exponential decay
EWSOC: Exponentially Weighted Sum Of Coupling
LWSOC: Linearly Weighted Sum Of Coupling
Exponential weight = 1/2^4
Linear weight = 1/5
Exponential weight = 1/2^0
Linear weight = 1/1
[Figure: change timelines of c1 and c2 over times t1 to t6; positions are labeled k=5 to k=1, with k=1 nearest the current release]
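The two decayed variants can be sketched as SOC with a per-change weight. The weights on the slide (1/2^4 with 1/5, and 1/2^0 with 1/1) suggest an exponential weight of 1/2^(k-1) and a linear weight of 1/k, where k=1 is the change closest to the current release. That reading is an assumption. The slide also does not say whether k indexes every transaction or only the shared ones, so this sketch numbers all transactions in a hypothetical history. `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction

# Hypothetical history, oldest first: the set of classes changed at each time.
HISTORY = [
    {"c1", "c2", "c5"},        # t1
    {"c2", "c3", "c5"},        # t2
    {"c1", "c2", "c5"},        # t3
    {"c1", "c2", "c4", "c5"},  # t4
    {"c3", "c4"},              # t5
    {"c1", "c2", "c5"},        # t6
]


def weighted_soc(history, cls, n, weight):
    """Sum weight(k) over changes shared with classes n-coupled to cls.

    k counts transactions back from the newest one (k=1), an assumed convention.
    """
    total = len(history)
    ages = {}  # other class -> list of k values of the shared transactions
    for index, changed in enumerate(history):
        if cls in changed:
            k = total - index
            for other in changed - {cls}:
                ages.setdefault(other, []).append(k)
    return sum(weight(k) for ks in ages.values() if len(ks) >= n for k in ks)


def ewsoc(history, cls, n):
    """Exponentially weighted variant: assumed weight 1 / 2^(k-1)."""
    return weighted_soc(history, cls, n, lambda k: Fraction(1, 2 ** (k - 1)))


def lwsoc(history, cls, n):
    """Linearly weighted variant: assumed weight 1 / k."""
    return weighted_soc(history, cls, n, lambda k: Fraction(1, k))


print(ewsoc(HISTORY, "c2", 4), lwsoc(HISTORY, "c2", 4))
```

Both variants are smaller than the unweighted SOC of 9 for the same history, because old shared changes contribute only a fraction of a recent one.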
33. Eclipse JDT Core - All bugs
[Plot: Spearman correlation (y-axis, 0.3 to 0.9) versus n (x-axis: 1, 3, 5, 8, 10, 15, 20, 30). Series: Fan out]
34. Eclipse JDT Core - All bugs
[Plot: Spearman correlation (y-axis, 0.3 to 0.9) versus n (x-axis: 1, 3, 5, 8, 10, 15, 20, 30). Series: #Changes, Fan out]
35. Eclipse JDT Core - All bugs
[Plot: Spearman correlation (y-axis, 0.3 to 0.9) versus n (x-axis: 1, 3, 5, 8, 10, 15, 20, 30). Series: #Changes, SOC, NOCC, LWSOC, EWSOC, Fan out]
36. Eclipse JDT Core - All bugs
[Plot: Spearman correlation (y-axis, 0.3 to 0.9) versus n (x-axis: 1, 3, 5, 8, 10, 15, 20, 30). Series: #Changes, SOC, NOCC, LWSOC, EWSOC, Fan out]
Q1 Change coupling does correlate with software defects, more than all source code metrics, worse than #changes
37. Eclipse JDT Core - All bugs
[Plot: Spearman correlation (y-axis, 0.3 to 0.9) versus n (x-axis: 1, 3, 5, 8, 10, 15, 20, 30). Series: #Changes, SOC, NOCC, LWSOC, EWSOC, Fan out]
Q1 Change coupling does correlate with software defects, more than all source code metrics, worse than #changes
Decay models do not work
40. Mylyn - Severe Bugs
[Plot: Spearman correlation (y-axis, 0.20 to 0.40) versus n (x-axis: 1, 3, 5, 8, 10, 15, 20, 30). Series: LOC]
41. Mylyn - Severe Bugs
[Plot: Spearman correlation (y-axis, 0.20 to 0.40) versus n (x-axis: 1, 3, 5, 8, 10, 15, 20, 30). Series: #Changes, LOC]
42. Mylyn - Severe Bugs
[Plot: Spearman correlation (y-axis, 0.20 to 0.40) versus n (x-axis: 1, 3, 5, 8, 10, 15, 20, 30). Series: SOC, #Changes, EWSOC, NOCC, LWSOC, LOC]
43. Mylyn - Severe Bugs
[Plot: Spearman correlation (y-axis, 0.20 to 0.40) versus n (x-axis: 1, 3, 5, 8, 10, 15, 20, 30). Series: SOC, #Changes, EWSOC, NOCC, LWSOC, LOC]
Q1* Change coupling correlates less with severe defects, but is better than all source code metrics and #changes
44. Regression Analysis
Q2 Does change coupling improve
defect prediction techniques?
45. Regression Analysis
Q2 Does change coupling improve
defect prediction techniques?
Q2* Is the improvement greater
for severe defects?
47. Regression Models
Source code metrics + one of:
    #changes
    NOCC(n)
    SOC(n)
    EWSOC(n)
    LWSOC(n)
    NOCC all (for all n)
48. Regression Models
Source code metrics + one of:
    #changes
    NOCC(n)
    SOC(n)
    EWSOC(n)
    LWSOC(n)
    NOCC all (for all n)
    All CC measures (for all n)
49. Regression Models
Source code metrics + one of:
    #changes
    NOCC(n)
    SOC(n)
    EWSOC(n)
    LWSOC(n)
    NOCC all (for all n)
    All CC measures (for all n)
We measure explanative and predictive power of the models
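The comparison behind these models can be sketched as two nested regressions: one on source code metrics alone, one with a change coupling metric added. The sketch fits ordinary least squares in plain Python and reports R^2 as a stand-in for explanative power. It is not the authors' procedure, which the slides do not spell out. The per-class values (`METRICS`, `SOC_VALUES`, `DEFECTS`) are invented for illustration, and the helper names are assumptions.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (no collinear features).
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def r_squared(features, target):
    """Fit least squares with an intercept via the normal equations; return R^2."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    beta = solve(xtx, xty)
    predicted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(target) / len(target)
    ss_res = sum((y - p) ** 2 for y, p in zip(target, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in target)
    return 1 - ss_res / ss_tot


# Invented per-class data: (KLOC, fan out), SOC value, and defect count.
METRICS = [(0.12, 3), (0.45, 9), (0.08, 2), (0.30, 4),
           (0.61, 12), (0.22, 6), (0.15, 1), (0.39, 8)]
SOC_VALUES = [4, 9, 0, 12, 15, 3, 7, 5]
DEFECTS = [1, 6, 0, 7, 10, 2, 3, 4]

baseline = r_squared(METRICS, DEFECTS)
extended = r_squared([m + (s,) for m, s in zip(METRICS, SOC_VALUES)], DEFECTS)
print(f"metrics only: R^2 = {baseline:.3f}")
print(f"metrics + SOC: R^2 = {extended:.3f}")
```

Adding a predictor can never lower in-sample R^2, so a gain here shows only explanative power. Judging predictive power requires scoring the model on classes it was not fitted on, for example by correlating predicted and actual defect counts on held-out data.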
63. Eclipse JDT Core - All Bugs - Predictive Power
[Plot: Spearman correlation (y-axis, 0.30 to 0.67) versus n (x-axis: 1, 3, 5, 8, 10, 15, 20, 30). Series: NOCC All, Metrics + SOC, Metrics + #Changes, All CC measures, Metrics + LWSOC, Metrics + NOCC, Metrics + EWSOC, Metrics]
65. Eclipse JDT Core - All Bugs - Predictive Power
[Plot: Spearman correlation (y-axis, 0.30 to 0.67) versus n (x-axis: 1, 3, 5, 8, 10, 15, 20, 30). Series: NOCC All, Metrics + SOC, Metrics + #Changes, All CC measures, Metrics + LWSOC, Metrics + NOCC, Metrics + EWSOC, Metrics]
Q2 Change coupling improves defect prediction techniques based on source code metrics and #changes (slightly)
66. Eclipse JDT Core - All Bugs - Predictive Power
[Plot: Spearman correlation (y-axis, 0.30 to 0.67) versus n (x-axis: 1, 3, 5, 8, 10, 15, 20, 30). Series: NOCC All, Metrics + SOC, Metrics + #Changes, All CC measures, Metrics + LWSOC, Metrics + NOCC, Metrics + EWSOC, Metrics]
Q2 Change coupling improves defect prediction techniques based on source code metrics and #changes (slightly)
Q2* The overall results are worse for severe defects, but the improvement over existing approaches is greater
68. Conclusion
No study on change
coupling and software
defects
69. Conclusion
No study on change coupling and software defects
Definitions of different class-level change coupling metrics
70. Conclusion
No study on change coupling and software defects
Definitions of different class-level change coupling metrics
Change coupling does correlate with software defects
71. Conclusion
No study on change coupling and software defects
Definitions of different class-level change coupling metrics
Change coupling does correlate with software defects
Change coupling can improve existing defect prediction techniques
72. Conclusion
No study on change coupling and software defects
Definitions of different class-level change coupling metrics
Change coupling does correlate with software defects
Change coupling can improve existing defect prediction techniques
Change coupling is harmful!