Searching for Configurations in Clone Evaluation: A Replication Study
C. Ragkhitwetsagul, M. Paixao, M. Adham, S. Busari, J. Krinke, J. H. Drake
CENTRE FOR RESEARCH ON EVOLUTION, SEARCH AND TESTING
DEPARTMENT OF COMPUTER SCIENCE
UNIVERSITY COLLEGE LONDON
Searching for Configurations in Clone Evaluation: A Replication Study — C. Ragkhitwetsagul, M. Paixao, M. Adham, S. Busari, J. Krinke, J. H. Drake
Code Clone
Clone Detectors
if (x==0) then y=y+1;
if (check==0) then count=count+1;
$p ($p==0) $p $p=$p+1;
$p ($p==0) $p $p=$p+1;
if_s
if ( cond_e ) then assign_e
if_s
if ( cond_e ) then assign_e
Deckard
CCFinder
Simian
NiCad
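The token-level view on this slide, where both `if` statements collapse to the same `$p` sequence, can be imitated in a few lines. This is only a sketch of the idea, not the normalisation any of the listed tools actually performs: the regular expression and the `normalise` helper are assumptions made for illustration, and keywords such as `if` and `then` are replaced as well because the slide shows them as `$p`.

```python
import re

# Assumption: every identifier-like token, keywords included, becomes the
# placeholder $p, mirroring the two normalised lines shown on the slide.
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def normalise(line: str) -> str:
    """Replace every identifier-like token with the placeholder $p."""
    return IDENTIFIER.sub("$p", line)


first = normalise("if (x==0) then y=y+1;")
second = normalise("if (check==0) then count=count+1;")

print(first)            # normalised form of the first statement
print(first == second)  # the two statements match after normalisation
```

After normalisation the two statements are textually identical, which is why a token-based detector can report them as clones even though the variable names differ.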
Oracle Problem in Code Clone
In the absence of any way to establish a ground truth, we do
not know whether code is actually cloned
Agreement
Parameter Tuning
EvaClone
T. Wang, M. Harman, Y. Jia, and J. Krinke. Searching for Better
Configurations: A Rigorous Approach to Clone Evaluation. In FSE'13.
6 clone detectors: PMD, iClones, ConQAT, Simian, NiCad, CCFinder
8 software projects: weltab, cook, snns, psql, javadoc, ant, jdtcore, swing
15 years
Maximising Agreement
[Figure: maximise the agreement among the clone detectors C, D, N and S]
EvaClone (cont.)
EvaClone favors recall over precision, and more candidates will be reported.
Replication Study
Fitness Function
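$$\text{fitness} = \frac{1 \times L_1 + 2 \times L_2 + 3 \times L_3 + 4 \times L_4}{4 \times (\text{All clone lines})}$$
where $L_i$ is the number of clone lines reported by exactly $i$ of the four tools.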
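A small sketch may help make the fitness value concrete. It assumes that the 1x to 4x terms on the slide weight each clone line by the number of tools that report it, and that the denominator is four times the number of distinct clone lines. The `agreement` function, the `(file, line)` representation and the sample reports are hypothetical and are not taken from EvaClone itself.

```python
from collections import Counter
from typing import Dict, Set, Tuple

Line = Tuple[str, int]  # (file path, line number); an assumed representation


def agreement(reports: Dict[str, Set[Line]]) -> float:
    """Weighted agreement: a line reported by i tools contributes i."""
    n_tools = len(reports)
    # votes[line] = number of tools that report this line as cloned
    votes = Counter(line for lines in reports.values() for line in lines)
    if n_tools == 0 or not votes:
        return 0.0
    # Summing the votes equals 1*L1 + 2*L2 + ... + n*Ln.
    weighted = sum(votes.values())
    return weighted / (n_tools * len(votes))


# Hypothetical reports from four tools on one file.
reports = {
    "CCFinder": {("A.java", 1), ("A.java", 2), ("A.java", 3)},
    "Deckard": {("A.java", 1), ("A.java", 2)},
    "NiCad": {("A.java", 1)},
    "Simian": {("A.java", 1), ("A.java", 4)},
}
print(agreement(reports))  # 8 votes over 4 tools x 4 lines -> 0.5
```

Under this reading the value is 1.0 only when every tool reports exactly the same lines, and it drops as more lines are reported by a single tool only.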
Replication Study (cont.)
Deckard, CCFinder, Simian, NiCad: 25 parameters
Population size: 100
No. of generations: 100
Crossover: 0.8
Mutation: 0.1
Elitism: 0.25
2 x 10^12
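The search settings on this slide can be tried out in a small genetic algorithm sketch. Only the five numeric settings and the count of 25 parameters come from the slide. Everything else is an assumption made for illustration: the integer genome, the tournament selection, the one-point crossover, reading the mutation rate as a per-parameter probability, reading elitism as the fraction of the population carried over unchanged, and above all `toy_fitness`, which stands in for running the four clone detectors and measuring their agreement.

```python
import random

POPULATION_SIZE = 100
GENERATIONS = 100
CROSSOVER_RATE = 0.8
MUTATION_RATE = 0.1   # assumed to apply per parameter
ELITISM = 0.25        # assumed: fraction of the population kept unchanged
N_PARAMS = 25

# Hypothetical target configuration used only by the toy fitness below.
TARGET = [i % 10 for i in range(N_PARAMS)]


def toy_fitness(genome):
    """Stand-in for tool agreement: share of parameters matching TARGET."""
    return sum(g == t for g, t in zip(genome, TARGET)) / N_PARAMS


def crossover(a, b, rng):
    """One-point crossover, applied with probability CROSSOVER_RATE."""
    if rng.random() < CROSSOVER_RATE:
        cut = rng.randrange(1, N_PARAMS)
        return a[:cut] + b[cut:]
    return a[:]


def mutate(genome, rng):
    """Resample each parameter with probability MUTATION_RATE."""
    return [rng.randrange(10) if rng.random() < MUTATION_RATE else g
            for g in genome]


def tournament(pop, rng):
    """Binary tournament selection (an assumed selection scheme)."""
    a, b = rng.sample(pop, 2)
    return a if toy_fitness(a) >= toy_fitness(b) else b


def evolve(seed=0):
    rng = random.Random(seed)
    pop = [[rng.randrange(10) for _ in range(N_PARAMS)]
           for _ in range(POPULATION_SIZE)]
    n_elite = int(ELITISM * POPULATION_SIZE)
    history = []  # best fitness seen in each generation
    for _ in range(GENERATIONS):
        pop.sort(key=toy_fitness, reverse=True)
        history.append(toy_fitness(pop[0]))
        nxt = [g[:] for g in pop[:n_elite]]  # elites survive unchanged
        while len(nxt) < POPULATION_SIZE:
            child = crossover(tournament(pop, rng), tournament(pop, rng), rng)
            nxt.append(mutate(child, rng))
        pop = nxt
    best = max(pop, key=toy_fitness)
    history.append(toy_fitness(best))
    return best, history


best, history = evolve()
print(history[0], history[-1])
```

Because the elites are copied unchanged, the best fitness never decreases from one generation to the next. In the real setting each fitness evaluation would mean running all four detectors, which is where the cost of the search comes from.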
Ver.      0.9   1.0   1.1   1.2   1.3   1.4   1.5   1.6   1.7   1.8   1.9   1.10  2.0.0  2.0.44
SLOC (k)  5.5   6.7   6.78  6.82  7.2   7.6   8.4   8.9   10.1  12.4  17.9  22.8  23.6   25.3
%Inc      N/A   21%   2%    1%    6%    5%    11%   7%    13%   23%   44%   28%   3%     8%
Note: two complete libraries (cglib and asm) are embedded in releases 1.5 to 1.9; they were removed before the analysis.
RQ1: Optimised Agreement
How do the default parameters perform in terms of
clone agreement on each Mockito release compared
to the optimised ones?
[Figure: fitness value (0.30 to 0.50) for each Mockito release (0.9 to 2.0.44), with series Default, EvaClone Highest and EvaClone Lowest]
Comparison of the optimised tools' agreement (the highest and the lowest in 20 runs) with the default agreement over 14 Mockito releases
RQ2: Stability of Optimised Parameters
Are there noticeable differences in the values of
optimised parameters over releases?
DF = default value; the remaining columns give the optimised value for each release.
Tool      Parameter   DF    0.9   1.0   1.1   1.2   1.3   1.4   1.5   1.6   1.7   1.8   1.9   1.10  2.0.0  2.0.44
CCFinder  MinToken    50    10    70    70    70    80    80    80    80    10    10    10    10    10     10
CCFinder  TKS         12    10    16    18    19    18    18    19    20    14    17    10    10    10     10
Deckard   MinToken    30    30    50    50    50    50    50    50    50    50    50    50    50    50     50
Deckard   Stride      5     inf   8     8     8     8     8     8     8     16    5     inf   inf   inf    inf
Deckard   Similarity  0.9   0.9   1.0   1.0   1.0   1.0   1.0   1.0   1.0   0.95  1.0   0.9   0.9   0.9    0.9
NiCad     MinLine     6     5     7     7     7     6     6     6     7     6     5     5     5     5      5
NiCad     MaxLine     1K    200   100   100   400   400   200   200   200   200   100   100   100   200    200
NiCad     UPI         0.3   0.3   0.0   0.1   0.0   0.0   0.1   0.1   0.0   0.3   0.1   0.3   0.3   0.3    0.3
NiCad     Blind       0     1     0     0     0     0     0     0     1     1     1     1     1     1      1
NiCad     Abstract    0     4     6     6     6     6     5     5     6     6     2     4     4     4      4
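One simple way to look at the stability question is to count how often an optimised value changes from one release to the next. The sketch below does this for three of the parameters in the table above, using the optimised values for releases 0.9 to 2.0.44. The `changes` measure is a hypothetical illustration, not a metric used in the study.

```python
# Optimised values per release (0.9 ... 2.0.44), copied from the table.
OPTIMISED = {
    ("CCFinder", "MinToken"): [10, 70, 70, 70, 80, 80, 80, 80, 10, 10, 10, 10, 10, 10],
    ("CCFinder", "TKS"): [10, 16, 18, 19, 18, 18, 19, 20, 14, 17, 10, 10, 10, 10],
    ("Deckard", "MinToken"): [30, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50],
}


def changes(values):
    """Number of consecutive release pairs whose optimised value differs."""
    return sum(a != b for a, b in zip(values, values[1:]))


for (tool, parameter), values in OPTIMISED.items():
    print(tool, parameter, changes(values))
```

With 14 releases there are 13 consecutive pairs, so a count close to 13 indicates a parameter that rarely keeps its optimised value, while a count near 0 indicates a stable one.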
RQ2: Stability of Optimised Parameters
DF = default value; the remaining columns give the optimised value for each release.
Tool    Parameter              DF    0.9   1.0   1.1   1.2   1.3   1.4   1.5   1.6   1.7   1.8   1.9   1.10  2.0.0  2.0.44
Simian  ignoreCurlyBraces      0     0     0     0     0     0     0     0     0     0     1     0     0     0      0
Simian  ignoreIdentifiers      0     1     0     0     0     0     0     0     0     1     1     1     1     1      1
Simian  ignoreIdentifierCase   0     ✱     ✱     ✱     ✱     ✱     ✱     ✱     ✱     ✱     ✱     ✱     ✱     ✱      ✱
Simian  ignoreStrings          0     1     0     0     0     0     0     0     0     1     0     ✱     ✱     ✱      ✱
Simian  ignoreStringCase       1     ✱     1     1     0     0     0     0     0     ✱     0     ✱     ✱     ✱      ✱
Simian  ignoreNumbers          0     1     0     1     0     1     1     0     1     1     0     ✱     ✱     ✱      ✱
Simian  ignoreCharacters       0     0     0     1     0     0     0     1     0     0     1     ✱     ✱     ✱      ✱
Simian  ignoreCharacterCase    1     0     0     ✱     1     1     0     ✱     1     1     ✱     ✱     ✱     ✱      ✱
Simian  ignoreLiterals         0     0     0     0     0     0     0     0     0     0     0     1     1     1      1
Simian  ignoreSubtypeNames     0     1     0     0     0     0     0     0     0     0     0     0     0     1      1
Simian  ignoreModifiers        1     1     1     0     1     0     0     0     0     0     0     1     1     1      1
Simian  ignoreVariableNames    0     1     0     0     0     0     0     0     0     1     1     0     0     0      1
Simian  balanceParentheses     0     0     1     1     1     1     1     1     1     0     0     0     0     0      0
Simian  balanceSquareBrackets  0     1     0     0     0     1     1     0     1     1     1     1     1     1      0
Simian  MinLine                6     5     6     6     6     6     6     6     6     7     7     5     5     5      5
Are there noticeable differences in the values of
optimised parameters over releases?
RQ3: Clones over Releases
How many clones in Mockito are reported with the
highest agreement over releases?
[Figure panels: Default, EvaClone]
Maximising Agreement
[Figure: maximise the agreement among the clone detectors C, D, N and S]
Open Challenge
A better fitness function for EvaClone is needed.
It must not rely only on the number of cloned lines, but also take other aspects into account:
How often is a line found to be cloned to other places?
Precision vs. recall?
Location of clones
Optimised parameters vs. default parameters [Figure: fitness value per Mockito release for Default, EvaClone Highest and EvaClone Lowest]
Optimised parameters are not stable over releases [Table: default and optimised parameter values of CCFinder, Deckard and NiCad per Mockito release, as shown under RQ2]
Fitness function needs improvements [Figure panels: Default, EvaClone]

Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]
