Think Locally, Act Globally - Improving Defect and Effort Prediction Models

Think Locally, Act Globally
Improving Defect and Effort Prediction Models

Nicolas Bettenburg • Meiyappan Nagappan • Ahmed E. Hassan
Queen’s University • Kingston, ON, Canada

SOFTWARE ANALYSIS & INTELLIGENCE LAB
Saturday, 2 June, 12

Data Modelling in Empirical SE

Observations: measured from project data
Model: describe observations mathematically
Prediction: guide decision making
Understanding: guide process optimizations and future research


Model Building Today

Whole Dataset: split into Training Data and Testing Data
Training Data: used to learn the model M
Testing Data: M produces the predictions Y
Compare: predictions Y against the actual values in the Testing Data
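The workflow on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the setup used in the talk: the data is synthetic, the learned model is a one-variable least-squares line, and `fit_line`, `predict`, and `mean_abs_error` are hypothetical helper names.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b

def predict(model, x):
    a, b = model
    return a + b * x

def mean_abs_error(model, xs, ys):
    return sum(abs(predict(model, x) - y) for x, y in zip(xs, ys)) / len(xs)

# Synthetic "whole dataset" (assumption): y is linear in x plus noise.
rng = random.Random(0)
data = [(x, 2.0 + 0.5 * x + rng.gauss(0, 0.1)) for x in range(100)]

# Split the whole dataset into training data and testing data.
rng.shuffle(data)
train, test = data[:80], data[80:]

# Learn one global model M on the training data.
M = fit_line([x for x, _ in train], [y for _, y in train])

# Predict Y for the testing data and compare with the actual values.
error = mean_abs_error(M, [x for x, _ in test], [y for _, y in test])
print(f"slope={M[1]:.2f}, test MAE={error:.3f}")
```

The single model `M` is fitted once on all training rows, which is what the later slides call a global model.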


Much Research Effort on
new metrics and new models!



Maybe we need to look more at the data part


In the Field

"We ran 622 cross-project predictions and found that only 3.4% actually worked."
Tom Zimmermann

"Rather than focus on generalities, empirical SE should focus more on context-specific principles."
Tim Menzies

Taking local properties of data into consideration leads to better models!

Using Locality in Statistical Models



1 Does this principle work for statistical models?

2 Does it work for Prediction?

3 Can we do better?


Building Local Models

Whole Dataset: split into Training Data and Testing Data
Cluster Data: partition the Training Data into clusters
Learn Multiple Models: one learned model per cluster (M1, M2, M3)
Predict Individually: M1, M2, and M3 each produce their own predictions Y
Compare: predictions against the actual values in the Testing Data
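The cluster, learn, predict, compare steps above can be sketched as follows. This is an illustration only: a plain one-dimensional k-means stands in for the clustering step (the paper excerpt later in the deck mentions MCLUST), the data is synthetic, and all function names are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b

def nearest(centers, x):
    """Index of the cluster center closest to x."""
    return min(range(len(centers)), key=lambda i: abs(x - centers[i]))

def kmeans_1d(xs, k, iters=20):
    """Plain k-means on one predictor (a stand-in, not MCLUST)."""
    lo, hi = min(xs), max(xs)
    centers = [lo + (i + 0.5) * (hi - lo) / k for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            groups[nearest(centers, x)].append(x)
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

def mae(pred, xs, ys):
    return sum(abs(pred(x) - y) for x, y in zip(xs, ys)) / len(xs)

# Synthetic data (assumption): the trend changes direction at x = 3.
rng = random.Random(1)
def truth(x):
    return x if x < 3 else 9 - 2 * x
xs = [rng.uniform(0, 6) for _ in range(200)]
data = [(x, truth(x) + rng.gauss(0, 0.1)) for x in xs]
train, test = data[:160], data[160:]
train_x, train_y = [x for x, _ in train], [y for _, y in train]
test_x, test_y = [x for x, _ in test], [y for _, y in test]

# Global model: one line for all training data.
ga, gb = fit_line(train_x, train_y)
global_mae = mae(lambda x: ga + gb * x, test_x, test_y)

# Local models: cluster the training data, then learn one line per cluster.
centers = kmeans_1d(train_x, k=2)
models = []
for i in range(len(centers)):
    pts = [(x, y) for x, y in train if nearest(centers, x) == i]
    models.append(fit_line([p[0] for p in pts], [p[1] for p in pts]))

def local_predict(x):
    """Predict with the model of the cluster that x falls into."""
    a, b = models[nearest(centers, x)]
    return a + b * x

local_mae = mae(local_predict, test_x, test_y)
print(f"global MAE={global_mae:.3f}, local MAE={local_mae:.3f}")
```

Each testing point is routed to the model of its nearest cluster, so the comparison against the global model uses exactly the same testing data.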


Global Statistical Model

[Figure: f(X) plotted against X for X from 0 to 6, with a single global model fitted. The plotted function is the linear spline of Figure 2.1, with knots at a = 1, b = 3, c = 5.]

Model fit leaves much room for improvement!


Local Statistical Model

[Figure: f(X) plotted against X for X from 0 to 6, fitted by two local models, Model 1 and Model 2]

Improved Fit!


How can we use this approach to get an
even better fit?


Be Even More Local!

[Figure: f(X) plotted against X for X from 0 to 6, with an even more local fit]

Great Fit!
BUT: Risk of Overfitting the Data!!


Clustering independent of Fit


[Figure 2.1: A linear spline function with knots at a = 1, b = 3, c = 5; f(X) plotted against X for X from 0 to 6]

$$C(Y \mid X) = f(X) = X\beta,$$

where

$$X\beta = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4,$$

and

$$X_1 = X, \quad X_2 = (X - a)_+, \quad X_3 = (X - b)_+, \quad X_4 = (X - c)_+.$$

Optimize Local Fit wrt. Minimizing Global Overfit

Multivariate Adaptive Regression Splines (MARS)
create local knowledge that optimizes process globally
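The linear spline from the figure can be evaluated directly from its truncated terms. The sketch below assumes the knots a = 1, b = 3, c = 5 from the caption; the coefficient values and the names `hinge` and `linear_spline` are made up for illustration. MARS differs in that it selects knots and terms from the data, whereas here they are fixed by hand.

```python
def hinge(x, knot):
    """Truncated linear term (x - knot)+, i.e. max(0, x - knot)."""
    return max(0.0, x - knot)

def linear_spline(x, beta, knots=(1.0, 3.0, 5.0)):
    """f(X) = b0 + b1*X + b2*(X-a)+ + b3*(X-b)+ + b4*(X-c)+."""
    terms = [1.0, float(x)] + [hinge(x, k) for k in knots]
    return sum(b * t for b, t in zip(beta, terms))

# Hypothetical coefficients: the slope changes at every knot.
beta = (0.0, 1.0, -2.0, 2.5, -3.0)

for x in range(7):
    print(x, linear_spline(x, beta))
```

Because each hinge term is zero to the left of its knot, a coefficient only affects the curve to the right of that knot. This is how one global formula can still describe local behaviour.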

Case Study

Xalan 2.6, Lucene 2.4: Post-Release Defects per Class (20 CK Metrics)
CHINA: Total Development Effort in Hours (14 FP Metrics)
NasaCoc: Development Length in Months (24 COCOMO-II Metrics)


Results: Goodness of Fit

Rank-Correlation (0 = worst fit, 1 = optimal fit)

Dataset       Global   Local (Clustered)   MARS
Xalan 2.6     0.33     0.52                0.69
Lucene 2.4    0.32     0.60                0.83
CHINA         0.83     0.89                0.89
NasaCOC       0.93     0.97                0.99

[Figure 3: Number of clusters generated by MCLUST in each run of the 10-fold cross validation. Y-axis: Number of Clusters (0 to 8); x-axis: Fold01 to Fold10; one series per dataset (CHINA, Lucene 2.4, NasaCoc, Xalan 2.6).]

Paper excerpt shown alongside Figure 3:

... term for each additional prediction variable entering the regression model [23]. For practical purposes, we use a publicly available implementation of BIC-based model selection, contained in the R package: BMA. The input to the BMA implementation is the dataset itself, as well as a list of all dependent and independent variables that should be considered. In our case study, we always supply a list of all independent variables that were left after VIF analysis. The output of the BMA ...

... is too small to continue or until a maximum number of terms is reached. In our case study, the maximum number of terms is automatically determined by the implementation, and is based on the amount of independent variables we give as input. For MARS models, we use all independent variables in a dataset after VIF analysis. The first phase often builds a model that suffers from overfitting. As a result, the second phase, called the backward phase, prunes the model, to increase the model's gen...

UP TO 2.5x BETTER FIT WHEN USING DATA LOCALITY!
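A rank correlation between actual and fitted values can be computed as sketched below. The slide does not say which rank correlation was used; this sketch assumes Spearman's, and the defect counts and fitted values are invented for illustration.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def spearman(actual, fitted):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(actual), ranks(fitted))

actual = [0, 1, 0, 3, 2, 5]              # hypothetical defect counts
fitted = [0.2, 0.9, 0.1, 2.5, 2.6, 4.0]  # hypothetical model output
rho = spearman(actual, fitted)
print(f"rho={rho:.3f}")
```

Only the ordering of the fitted values matters here, so a model scores well when it ranks classes or projects correctly, even if its absolute values are off.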


Results: Prediction Error

Dataset       Global   Local     MARS
Xalan 2.6     0.64     0.52      0.40
Lucene 2.4    1.15     1.15      0.94
CHINA         765      552.85    234.43
NasaCoC       3.26     2.14      1.63

Up to 4x lower prediction error with Local Models!


Model Interpretation?


Model Interpretation

[Figure 6(a): Part of a global model learned on the Xalan 2.6 dataset. One panel per metric: 1 avg_cc, 2 ca, 3 cam, 4 cbm, 5 ce, 6 dam, 7 dit, 8 ic, 9 lcom, 10 lcom3, 11 loc, 12 max_cc, 13 mfa, 14 moa, 15 noc, 16 npm.]

Figure 6: Global models report general trends, while global models with local considerations give insight ...

Traditional Global Model: General Trends
One curve per metric: each curve describes the response (in this case bugs) while keeping all other prediction variables at their median value.

[Figure 7: Example of contradicting trends in local models (Xalan 2.6, Cluster 1 and Cluster 6 in Fold 9). Panels for ic, npm, and mfa in each cluster.]

Paper excerpt shown with the figures:

... model already partition the data into regions with individual properties. For example, we observe that an increase of ic (measuring the inheritance coupling through parent classes) is predicted to only have a negative effect on bug-proneness ...
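The "one curve per metric" plots can be reproduced for any fitted model by varying a single metric and holding the others at their median. The sketch below is an assumption-laden illustration: `effect_curve` is a hypothetical helper, `toy_model` is an invented formula rather than a model from the case study, and the three data rows are made up. Only the metric names `loc` and `ic` come from the slides.

```python
from statistics import median

def effect_curve(model, rows, metric, grid):
    """Predicted response as `metric` varies, all other metrics at their median."""
    baseline = {name: median(r[name] for r in rows) for name in rows[0]}
    curve = []
    for value in grid:
        point = dict(baseline)   # copy so the baseline stays unchanged
        point[metric] = value
        curve.append((value, model(point)))
    return curve

def toy_model(m):
    """Invented stand-in for a fitted defect model (not from the talk)."""
    return 0.1 + 0.002 * m["loc"] + 0.05 * m["ic"]

# Made-up metric values for three classes.
rows = [
    {"loc": 100, "ic": 0},
    {"loc": 400, "ic": 1},
    {"loc": 900, "ic": 3},
]

curve = effect_curve(toy_model, rows, "ic", grid=[0, 1, 2, 3])
for value, response in curve:
    print(value, round(response, 3))
```

Running the same helper on a model fitted to a single cluster gives the per-cluster curves, which is where contradicting trends such as those in Figure 7 become visible.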
