Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]
1. Searching for Configurations in Clone Evaluation: A Replication Study
C. Ragkhitwetsagul, M. Paixao, M. Adham, S. Busari, J. Krinke, J. H. Drake
CENTRE FOR RESEARCH ON EVOLUTION, SEARCH AND TESTING
DEPARTMENT OF COMPUTER SCIENCE
UNIVERSITY COLLEGE LONDON
2. Code Clone
3. Clone Detectors
Two code fragments:
if (x==0) then y=y+1;
if (check==0) then count=count+1;
With every word token replaced by $p, both become:
$p ($p==0) $p $p=$p+1;
Both also map to the same syntax tree:
if_s: if ( cond_e ) then assign_e
Tools: Deckard, CCFinder, Simian, NiCad
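The normalisation step illustrated on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not how any of the listed tools is implemented: the function name `normalise` and the rule "replace every word token with `$p`" are assumptions chosen to reproduce the normalised form shown on the slide.

```python
import re

# Assumption: a "word token" is a letter or underscore followed by word characters.
WORD_TOKEN = re.compile(r"[A-Za-z_]\w*")

def normalise(fragment: str) -> str:
    """Replace every word token (keywords and identifiers alike) with $p."""
    return WORD_TOKEN.sub("$p", fragment)

first = normalise("if (x==0) then y=y+1;")
second = normalise("if (check==0) then count=count+1;")

# Two fragments count as clone candidates when their normalised forms match.
print(first)
print(first == second)
```

Real detectors typically keep keywords and normalise only identifiers and literals; the slide's example replaces keywords too, so the sketch does the same.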
4. Oracle Problem in Code Clone
In the absence of any way to establish a ground truth, we cannot
know whether code is actually cloned
5. Agreement
6. Parameter Tuning
7. EvaClone
T. Wang, M. Harman, Y. Jia, & J. Krinke. Searching for Better
Configurations: A Rigorous Approach to Clone Evaluation. In FSE'13
6 clone detectors: PMD, iClones, ConQAT, Simian, NiCad, CCFinder
8 software projects: weltab, cook, snns, psql, javadoc, ant, jdtcore, swing
15 years
8. Maximising Agreement
Figure: maximise the agreement among the clone detectors C, D, N and S
9. EvaClone (cont.)
EvaClone favours recall over precision,
so more clone candidates are reported.
10. Replication Study
11. Fitness Function
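$$\text{fitness} = \frac{4 \times L_4 + 3 \times L_3 + 2 \times L_2 + 1 \times L_1}{4 \times (\text{All clone lines})}$$
Here $L_i$ is assumed to be the number of lines reported as cloned by exactly $i$ of the 4 tools; the operands were graphics on the slide.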
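A minimal Python sketch of an agreement score of this shape follows. It assumes that each line is weighted by the number of tools that report it as cloned, which is one reading of the slide's formula; the function name `agreement_fitness` and the `(file, line)` representation of a clone line are hypothetical.

```python
from collections import Counter

def agreement_fitness(reports):
    """reports: one set of (file, line) pairs per tool, the lines it flags as cloned.

    Returns sum_i(i * L_i) / (n_tools * all_clone_lines), where L_i is the
    number of lines flagged by exactly i tools (assumed reading of the slide).
    """
    n_tools = len(reports)
    # votes[line] = number of tools that flag this line
    votes = Counter(line for lines in reports for line in set(lines))
    if not votes:
        return 0.0
    # Summing the votes over all lines equals the sum over i of i * L_i.
    return sum(votes.values()) / (n_tools * len(votes))

example = [
    {("A.java", 1), ("A.java", 2), ("A.java", 3)},  # tool 1
    {("A.java", 2), ("A.java", 3)},                  # tool 2
    {("A.java", 3)},                                 # tool 3
    {("A.java", 3), ("A.java", 4)},                  # tool 4
]
print(agreement_fitness(example))  # 8 votes / (4 tools * 4 lines) = 0.5
```

Under this reading the score is 1.0 only when every tool reports exactly the same set of lines, and it drops as tools report lines that the others miss.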
12. Replication Study (cont.)
Deckard, CCFinder, Simian, NiCad: 25 parameters
Population size: 100
No. of generations: 100
Crossover: 0.8
Mutation: 0.1
Elitism: 0.25
2 × 10^12
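To show how settings like these fit together, here is a small genetic algorithm sketch in Python. It is not the EvaClone implementation. Several details are assumptions made for illustration: the elitism value is read as the fraction of the population copied unchanged into the next generation, the mutation value as a per-parameter probability, and the selection scheme (binary tournament), the function `evolve` and the toy fitness are all hypothetical.

```python
import random

POPULATION_SIZE = 100
GENERATIONS = 100
CROSSOVER_RATE = 0.8
MUTATION_RATE = 0.1   # assumed: probability of resetting each parameter
ELITISM_RATE = 0.25   # assumed: fraction of the population kept unchanged

def evolve(fitness, bounds, rng,
           population_size=POPULATION_SIZE, generations=GENERATIONS):
    """Search integer configurations; bounds holds one inclusive (low, high) per parameter."""
    def random_individual():
        return [rng.randint(lo, hi) for lo, hi in bounds]

    population = [random_individual() for _ in range(population_size)]
    n_elite = int(ELITISM_RATE * population_size)
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        next_gen = [ind[:] for ind in population[:n_elite]]  # elitism
        while len(next_gen) < population_size:
            # Binary tournament selection of two parents.
            a = max(rng.sample(population, 2), key=fitness)
            b = max(rng.sample(population, 2), key=fitness)
            child = a[:]
            if len(bounds) > 1 and rng.random() < CROSSOVER_RATE:
                cut = rng.randint(1, len(bounds) - 1)  # one-point crossover
                child = a[:cut] + b[cut:]
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < MUTATION_RATE:
                    child[i] = rng.randint(lo, hi)
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)

def toy_fitness(config):
    """Stand-in for the agreement score: best when every parameter equals 7."""
    return -sum(abs(value - 7) for value in config)

best = evolve(toy_fitness, [(0, 20)] * 5, random.Random(0),
              population_size=20, generations=10)
print(best, toy_fitness(best))
```

In the study's setting, the fitness call would run the clone detectors with the candidate configuration and score their agreement, which is far more expensive than the toy function used here.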
13. Mockito releases
Ver.       0.9   1.0   1.1   1.2   1.3   1.4   1.5   1.6   1.7   1.8   1.9   1.10  2.0.0  2.0.44
SLOC (k)   5.5   6.7   6.78  6.82  7.2   7.6   8.4   8.9   10.1  12.4  17.9  22.8  23.6   25.3
%Inc       N/A   21%   2%    1%    6%    5%    11%   7%    13%   23%   44%   28%   3%     8%
Note: two complete libraries (cglib and asm) are embedded in releases 1.5 to 1.9; they were removed before the analysis.
14. RQ1: Optimised Agreement
How do the default parameters perform in terms of
clone agreement on each Mockito release compared
to the optimised ones?
Figure: fitness value (y-axis, 0.30 to 0.50) per Mockito release (x-axis, 0.9 to 2.0.44) for three series: Default, EvaClone Highest, EvaClone Lowest.
Comparison of the optimised tool agreement (the highest and the lowest in 20 runs) with the default agreement over 14 Mockito releases
17. RQ3: Clones over Releases
How many clones in Mockito are reported with the
highest agreement over releases?
Figure panels: Default, EvaClone
18. Maximising Agreement
Figure: maximise the agreement among the clone detectors C, D, N and S
19. Open Challenge
A better fitness function for EvaClone is needed.
It must not rely only on the number of cloned lines, but also include other aspects:
How often is a line found to be cloned elsewhere?
Precision vs. recall
Location of clones