1. Think Locally, Act Globally
Improving Defect and Effort Prediction Models
Nicolas Bettenburg • Meiyappan Nagappan • Ahmed E. Hassan
Queen’s University • Kingston, ON, Canada
SOFTWARE ANALYSIS & INTELLIGENCE LAB
Saturday, 2 June, 12
4. Data Modelling in Empirical SE
Observations: measured from project data
Model: describe observations mathematically
Prediction: guide decision making
Understanding: guide process optimizations and future research
17. In the Field
"We ran 622 cross-project predictions and found that only 3.4% actually worked." (Tom Zimmermann)
"Rather than focus on generalities, empirical SE should focus more on context-specific principles." (Tim Menzies)
Taking local properties of data into consideration leads to better models!
21. Using Locality in Statistical Models
1. Does this principle work for statistical models?
2. Does it work for prediction?
3. Can we do better?
22. Building Local Models
Whole Dataset -> Training Data -> Learned Model (M)
Testing Data -> Predictions (Y)
26. Building Local Models
Whole Dataset -> Training Data -> Cluster Data -> Learn Multiple Models -> Learned Models (M1, M2, M3)
Testing Data -> Predict Individually -> Predictions (Y, Y, Y) -> Compare
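The pipeline on this slide (cluster the training data, learn one model per cluster, predict each test point with its cluster's model, then compare) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it swaps in a tiny one-dimensional 2-means clustering and a per-cluster least-squares line, and all data points and function names are made up for the example.

```python
from statistics import mean

def fit_line(points):
    """Least-squares fit of y = a + b*x to a list of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:                       # degenerate cluster: predict the mean
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in points) / sxx
    return my - b * mx, b

def nearest(x, centroids):
    """Index of the centroid closest to x."""
    return min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))

def two_means(xs, iterations=20):
    """Tiny 1-D k-means with k = 2 (an assumed stand-in for the clustering step)."""
    centroids = [min(xs), max(xs)]
    for _ in range(iterations):
        groups = [[], []]
        for x in xs:
            groups[nearest(x, centroids)].append(x)
        centroids = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
    return centroids

# Hypothetical toy data: two regions with opposite trends.
train = [(0, 0), (1, 1), (2, 2), (3, 3), (7, 3), (8, 2), (9, 1), (10, 0)]
test = [(0.5, 0.5), (9.5, 0.5)]

# Global approach: one model for the whole training set.
global_model = fit_line(train)

# Local approach: cluster first, then learn one model per cluster.
centroids = two_means([x for x, _ in train])
local_models = [
    fit_line([(x, y) for x, y in train if nearest(x, centroids) == i])
    for i in range(len(centroids))
]

def predict_global(x):
    a, b = global_model
    return a + b * x

def predict_local(x):
    a, b = local_models[nearest(x, centroids)]   # model of the matching cluster
    return a + b * x

def mean_abs_error(predict, data):
    return mean(abs(predict(x) - y) for x, y in data)

# Compare step: error of the single global model vs. the local models.
global_error = mean_abs_error(predict_global, test)
local_error = mean_abs_error(predict_local, test)
print(global_error, local_error)
```

On this contrived data the two regions cancel each other out in the global fit, so the local models predict the held-out points much more closely. Real datasets will rarely separate this cleanly.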
30. Global Statistical Model
[Figure: f(X) plotted against X (0 to 6). Source caption: "Figure 2.1: A linear spline function with knots at a = 1, b = 3, c = 5." (Chapter 2, General Aspects of Fitting Regression Models, p. 34)]
Model fit leaves much room for improvement!
34. Local Statistical Model
[Figure: f(X) plotted against X (0 to 6), with two local fits labelled Model 1 and Model 2. Source caption: "Figure 2.1: A linear spline function with knots at a = 1, b = 3, c = 5."]
Improved Fit!
35. How can we use this approach to get an even better fit?
40. Be Even More Local!
[Figure: f(X) plotted against X (0 to 6). Source caption: "Figure 2.1: A linear spline function with knots at a = 1, b = 3, c = 5."]
Great Fit!
BUT: Risk of Overfitting the Data!!
48. Optimize Local Fit wrt. Minimizing Global Overfit
[Figure 2.1: A linear spline function with knots at a = 1, b = 3, c = 5; f(X) plotted against X (0 to 6). From Chapter 2, General Aspects of Fitting Regression Models, p. 34.]
$$C(Y \mid X) = f(X) = X\beta,$$
where
$$X\beta = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4,$$
and
$$X_1 = X, \quad X_2 = (X - a)_+, \quad X_3 = (X - b)_+, \quad X_4 = (X - c)_+.$$
Multivariate Adaptive Regression Splines (MARS): create local knowledge that optimizes process globally
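To make the linear spline on this slide concrete, the sketch below evaluates f(X) = β0 + β1·X + β2·(X − a)+ + β3·(X − b)+ + β4·(X − c)+ with the knots a = 1, b = 3, c = 5 from the figure caption. The coefficient values are invented for illustration. The knots are fixed by hand here, whereas MARS selects its terms from the data, so this shows only the basis functions, not the MARS fitting procedure.

```python
def hinge(x, knot):
    """Truncated linear term (x - knot)+ : zero to the left of the knot."""
    return max(0.0, x - knot)

def spline_basis(x, knots=(1.0, 3.0, 5.0)):
    """Basis row [1, X1, X2, X3, X4] with X1 = x and Xk = (x - knot)+."""
    return [1.0, float(x)] + [hinge(x, k) for k in knots]

def spline_value(x, beta, knots=(1.0, 3.0, 5.0)):
    """Evaluate f(x) as the dot product of the coefficients and the basis row."""
    return sum(b * term for b, term in zip(beta, spline_basis(x, knots)))

# Hypothetical coefficients beta_0 .. beta_4 (not taken from the slides).
beta = [0.0, 1.0, -2.0, 3.0, -2.0]

# Each hinge term changes the slope only to the right of its knot,
# so the segment slopes here are 1, -1, 2 and 0.
values = [spline_value(x, beta) for x in range(7)]
print(values)
```

Because every term after the first two is zero until its knot is reached, each coefficient only affects one region of X while the function stays continuous. That is the sense in which one global equation can carry local structure.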
52. Case Study
Xalan 2.6, Lucene 2.4: Post-Release Defects per Class, 20 CK Metrics
CHINA: Total Development Effort in Hours, 14 FP Metrics
NasaCoc: Development Length in Months, 24 COCOMO-II Metrics
58. Results: Goodness of Fit
[Figure 3: Number of clusters generated by MCLUST in each run of the 10-fold cross validation. Y axis: Number of Clusters (0 to 8); X axis: Fold01 to Fold10; one series per dataset (CHINA, Lucene 2.4, NasaCoc, Xalan 2.6).]
Paper excerpt, left column: "... term for each additional prediction variable entering the regression model [23]. For practical purposes, we use a publicly available implementation of BIC-based model selection, contained in the R package: BMA. The input to the BMA implementation is the dataset itself, as well as a list of all dependent and independent variables that should be considered. In our case study, we always supply a list of all independent variables that were left after VIF analysis. The output of the BMA ..."
Paper excerpt, right column: "... is too small to continue or until a maximum number of terms is reached. In our case study, the maximum number of terms is automatically determined by the implementation, and is based on the amount of independent variables we give as input. For MARS models, we use all independent variables in a dataset after VIF analysis. The first phase often builds a model that suffers from overfitting. As a result, the second phase, called the backward phase, prunes the model, to increase the model's gen- ..."
59. Results: Goodness of Fit
Rank-Correlation (0 = worst fit, 1 = optimal fit)

Dataset      Global   Local (Clustered)   MARS
Xalan 2.6    0.33     0.52                0.69
Lucene 2.4   0.32     0.60                0.83
CHINA        0.83     0.89                0.89
NasaCOC      0.93     0.97                0.99

UP TO 2.5x BETTER FIT WHEN USING DATA LOCALITY!
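The table scores each model by rank correlation between actual and predicted values. The slides do not say which rank-correlation coefficient was used. As one plausible reading, the sketch below computes Spearman's coefficient (Pearson correlation of the ranks) in plain Python. The sample values are invented for the example and are not from the case study.

```python
def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(xs, ys):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def spearman(actual, predicted):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    return pearson(ranks(actual), ranks(predicted))

# Hypothetical defect counts and two sets of predictions.
actual = [0, 1, 3, 7, 2]
same_order = [0.2, 0.9, 2.5, 4.0, 1.1]    # orders the classes exactly like `actual`
other_order = [4.0, 2.5, 1.1, 0.9, 0.2]   # orders them differently

print(spearman(actual, same_order))
print(spearman(actual, other_order))
```

Spearman's coefficient can also be negative, so the slide's "0 = worst fit" scale is a simplification under this reading. For the headline claim, the largest ratio in the table is the Lucene 2.4 row, where 0.83 / 0.32 is roughly 2.6.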
65. Model Interpretation
[Figure 6: "Global models report general trends, while global models with local c..." (caption truncated). Panel (a): part of a global model learned on the Xalan 2.6 dataset. One plot per metric: avg_cc, ca, cam, cbm, ce, dam, dit, ic, lcom, lcom3, loc, max_cc, mfa, moa, noc, npm. Each plot "describes the response (in this case bugs) while keeping all other prediction variab..." (truncated). Also visible: Fold 9, Cluster 1 (ic, npm, mfa).]
Traditional Global Model: General Trends
One Curve per metric, run corp on that curve
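The plots on this slide show one curve per metric: the predicted response as that metric varies while every other prediction variable is held at its median value. A rough sketch of how such a curve could be computed for any fitted model is given below. The function `effect_curve`, the toy model and its coefficients are assumptions for illustration only. Only the metric names (loc, ic, npm) come from the slides.

```python
from statistics import median

def effect_curve(model, rows, metric, grid):
    """Predicted response as `metric` sweeps over `grid`,
    with all other metrics fixed at their median over `rows`."""
    baseline = {name: median(row[name] for row in rows) for name in rows[0]}
    curve = []
    for value in grid:
        point = dict(baseline)      # copy so the baseline stays untouched
        point[metric] = value       # vary only the metric of interest
        curve.append((value, model(point)))
    return curve

def toy_model(metrics):
    """Hypothetical fitted model; the coefficients are made up."""
    return 0.1 + 0.02 * metrics["loc"] - 0.5 * metrics["ic"] + 0.3 * metrics["npm"]

# Hypothetical per-class metric values.
rows = [
    {"loc": 10, "ic": 0, "npm": 1},
    {"loc": 50, "ic": 1, "npm": 3},
    {"loc": 200, "ic": 2, "npm": 8},
]

curve = effect_curve(toy_model, rows, "ic", [0, 1, 2, 3])
for value, response in curve:
    print(value, round(response, 3))
```

Running the same function against a model trained on a single cluster, rather than on the whole dataset, would give the per-cluster curves that a local analysis compares.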
66. Model Interpretation
[Figure 6: "Global models report general trends, while global models with local considerations give insight..." (caption truncated). Each plot "describes the response (in this case bugs) while keeping all other prediction variables at their median value".]
[Figure 7: Example of contradicting trends in local models (Xalan 2.6, Cluster 1 and Cluster 6 in Fold 9). Panels: Fold 9, Cluster 1 (ic, npm, mfa) and Fold 9, Cluster 6 (ic, npm, mfa).]
Paper excerpt, left column: "... model already partition the data into regions with individual properties. For example, we observe that an increase of ic (measuring the inheritance coupling through parent classes) is predicted to only have a negative effect on bug-proneness ..."
Paper excerpt, right column (each line truncated): "prediction models lead... Our findings thus co... who observed a simil... WHICH machine-lear... have practical implic... using regression mod... are more insightful th... general trends across... demonstrated that such... particular parts of the... in the Xalan 2.6 def... sets of classes are infl... as inheritance, cohes... reinforce the recomm... the use of a "one-size... model, when trying to..."
B. Act Globally
"When the goal is carry... understanding, local m..."