SlideShare a Scribd company logo
1 of 35
On the “Naturalness” of Buggy Code
Baishakhi Ray, Vincent Hellendoorn, Saheel Godhane,
Zhaopeng Tu, Alberto Bacchelli, Premkumar Devanbu.
published in ICSE 2016
Jinhan Kim
2018.2.9
Naturalness
• Real software tends to be natural, like speech or natural
language.
• It tends to be highly repetitive and predictable.
Naturalness of Software1
[1] A. Hindle, E. Barr, M. Gabel, Z. Su, and P. Devanbu. On the naturalness of software. In ICSE, pages 837–847,
2012.
What does it mean when a code is considered
“unnatural”?
Research Questions
• Are buggy lines less “natural” than non-buggy lines?
• Are buggy lines less “natural" than bug-fix lines?
• Is “naturalness" a good way to direct inspection effort?
Background
Language Model
• Language model assign a probability to every sequence of
words.
• Given a code token sequence, 𝑆 = 𝑡1 𝑡2 … 𝑡 𝑁
ngram Language Model
• Using only the preceding n - 1 tokens.
• ℎ = 𝑡1 𝑡2 … 𝑡𝑖−1
$gram Language Model2
• Improving language model by deploying an additional cache-
list of ngrams extracted from the local context, to capture the
local regularities.
[2] Z. Tu, Z. Su, and P. Devanbu. On the localness of software. In SIGSOFT FSE, pages 269–280, 2014.
Study
Study Subject
Phase-1 (during active development)
• They chose to analyze each project for the period of one-year
which contained the most bug fixes in that project’s history.
• Then, extract snapshots at 1-month intervals.
Data Collection
Phase-2 (after release)
Entropy Measurement
• $gram
• The line and file entropies are computed by averaging over all
the tokens belong to a line and all lines corresponding to a file
respectively.
Entropy Measurement
• Package, class and method declarations
• previously unseen identifiers – higher entropy scores
• For-loop statements and catch clauses
• being often repetitive – lower entropy scores
Abstract-syntax-based line-types
and computing a syntax-sensitive
entropy score
Syntax-sensitive Entropy Score
• Matching between line and AST node.
• Then, compute how much a line’s entropy deviated from the
mean entropy of its line-type.
• => $gram+type
Relative bug-proneness
• => $gram+wType
Evaluation
RQ1: Are buggy lines less “natural" than
non-buggy lines?
Are buggy lines less “natural" than non-buggy lines?
Bug Duration
Bug Duration
Bugs that stay longer in a repository tend to have lower entropy than the
short-lived bugs
RQ2: Are buggy lines less “natural" than
bug-fix lines?
Are buggy lines less “natural" than bug-fix lines?
Example 1
Example 2
Example 3
Counterexample
RQ3: Is “naturalness" a good way to direct
inspection effort?
DP: Defect Prediction
• Two classifier
• Logistic Regression(LR)
• Random Forest(RF)
• Process metrics
• # of developers
• # of file-commit
• Code churn
• Previous bug history
SBF: Static Bug Finder
• SBF uses syntactic and semantic properties of source code.
• For this study, PMD and FindBugs are used.
• NBF: Naturalness Bug Finder
• AUCEC: Area Under the Cost-Effectiveness Curve
Detecting Buggy Files
Detecting Buggy Lines
Result
• Buggy lines, on average, have higher entropies, i.e. are “less
natural”, than non-buggy lines.
• Entropy of the buggy lines drops after bug-fixes with statistical
significance.
• Entropy can be used to guide bug-finding efforts at both the file-
level and the line-level.
Appendix

More Related Content

What's hot

Ilsvrc2015 deep residual_learning_kaiminghe
Ilsvrc2015 deep residual_learning_kaimingheIlsvrc2015 deep residual_learning_kaiminghe
Ilsvrc2015 deep residual_learning_kaiminghepramod naik
 
BERT Finetuning Webinar Presentation
BERT Finetuning Webinar PresentationBERT Finetuning Webinar Presentation
BERT Finetuning Webinar Presentationbhavesh_physics
 
Deep learning on mobile - 2019 Practitioner's Guide
Deep learning on mobile - 2019 Practitioner's GuideDeep learning on mobile - 2019 Practitioner's Guide
Deep learning on mobile - 2019 Practitioner's GuideAnirudh Koul
 
Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...
Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...
Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...Universitat Politècnica de Catalunya
 
[PR12] Generative Models as Distributions of Functions
[PR12] Generative Models as Distributions of Functions[PR12] Generative Models as Distributions of Functions
[PR12] Generative Models as Distributions of FunctionsJaeJun Yoo
 
Natural Language Processing in AI
Natural Language Processing in AINatural Language Processing in AI
Natural Language Processing in AISaurav Shrestha
 
20160409午後チーム関西background question 配布版
20160409午後チーム関西background question   配布版20160409午後チーム関西background question   配布版
20160409午後チーム関西background question 配布版SR WS
 
Spell checker using Natural language processing
Spell checker using Natural language processing Spell checker using Natural language processing
Spell checker using Natural language processing Sandeep Wakchaure
 
1 2. 직선과 평면에서의 벡터 방정식
1 2. 직선과 평면에서의 벡터 방정식1 2. 직선과 평면에서의 벡터 방정식
1 2. 직선과 평면에서의 벡터 방정식jdo
 
Machine Learning and Data Mining: 04 Association Rule Mining
Machine Learning and Data Mining: 04 Association Rule MiningMachine Learning and Data Mining: 04 Association Rule Mining
Machine Learning and Data Mining: 04 Association Rule MiningPier Luca Lanzi
 
Flow based generative models
Flow based generative modelsFlow based generative models
Flow based generative models수철 박
 
Action Recognition (Thesis presentation)
Action Recognition (Thesis presentation)Action Recognition (Thesis presentation)
Action Recognition (Thesis presentation)nikhilus85
 

What's hot (18)

Visual reasoning
Visual reasoningVisual reasoning
Visual reasoning
 
Text summarization
Text summarizationText summarization
Text summarization
 
Ilsvrc2015 deep residual_learning_kaiminghe
Ilsvrc2015 deep residual_learning_kaimingheIlsvrc2015 deep residual_learning_kaiminghe
Ilsvrc2015 deep residual_learning_kaiminghe
 
BERT Finetuning Webinar Presentation
BERT Finetuning Webinar PresentationBERT Finetuning Webinar Presentation
BERT Finetuning Webinar Presentation
 
Deep learning on mobile - 2019 Practitioner's Guide
Deep learning on mobile - 2019 Practitioner's GuideDeep learning on mobile - 2019 Practitioner's Guide
Deep learning on mobile - 2019 Practitioner's Guide
 
Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...
Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...
Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...
 
[PR12] Generative Models as Distributions of Functions
[PR12] Generative Models as Distributions of Functions[PR12] Generative Models as Distributions of Functions
[PR12] Generative Models as Distributions of Functions
 
NP Complete Problems -- Internship
NP Complete Problems -- InternshipNP Complete Problems -- Internship
NP Complete Problems -- Internship
 
Natural Language Processing in AI
Natural Language Processing in AINatural Language Processing in AI
Natural Language Processing in AI
 
20160409午後チーム関西background question 配布版
20160409午後チーム関西background question   配布版20160409午後チーム関西background question   配布版
20160409午後チーム関西background question 配布版
 
Spell checker using Natural language processing
Spell checker using Natural language processing Spell checker using Natural language processing
Spell checker using Natural language processing
 
Text Summarization
Text SummarizationText Summarization
Text Summarization
 
1 2. 직선과 평면에서의 벡터 방정식
1 2. 직선과 평면에서의 벡터 방정식1 2. 직선과 평면에서의 벡터 방정식
1 2. 직선과 평면에서의 벡터 방정식
 
Human Action Recognition
Human Action RecognitionHuman Action Recognition
Human Action Recognition
 
Machine Learning and Data Mining: 04 Association Rule Mining
Machine Learning and Data Mining: 04 Association Rule MiningMachine Learning and Data Mining: 04 Association Rule Mining
Machine Learning and Data Mining: 04 Association Rule Mining
 
Flow based generative models
Flow based generative modelsFlow based generative models
Flow based generative models
 
Word Embedding
Word EmbeddingWord Embedding
Word Embedding
 
Action Recognition (Thesis presentation)
Action Recognition (Thesis presentation)Action Recognition (Thesis presentation)
Action Recognition (Thesis presentation)
 

Similar to Review: On the Naturalness of Buggy Code

Do characters abuse more than words?
Do characters abuse more than words?Do characters abuse more than words?
Do characters abuse more than words?Tharushi Ruwandika
 
A Panorama of Natural Language Processing
A Panorama of Natural Language ProcessingA Panorama of Natural Language Processing
A Panorama of Natural Language ProcessingTed Xiao
 
Code Mixing computationally bahut challenging hai
Code Mixing computationally bahut challenging haiCode Mixing computationally bahut challenging hai
Code Mixing computationally bahut challenging haiIIIT Hyderabad
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingToine Bogers
 
Spell checker for Kannada OCR
Spell checker for Kannada OCRSpell checker for Kannada OCR
Spell checker for Kannada OCRdbpublications
 
PacMin @ AMPLab All-Hands
PacMin @ AMPLab All-HandsPacMin @ AMPLab All-Hands
PacMin @ AMPLab All-Handsfnothaft
 
Computational linguistics
Computational linguisticsComputational linguistics
Computational linguisticsshrey bhate
 
Artificial Intelligence Notes Unit 4
Artificial Intelligence Notes Unit 4Artificial Intelligence Notes Unit 4
Artificial Intelligence Notes Unit 4DigiGurukul
 
Cross-Language Information Retrieval
Cross-Language Information RetrievalCross-Language Information Retrieval
Cross-Language Information RetrievalSumin Byeon
 
Automated Abstracts and Big Data
Automated Abstracts and Big DataAutomated Abstracts and Big Data
Automated Abstracts and Big DataSameer Wadkar
 
Authorship attribution
Authorship attributionAuthorship attribution
Authorship attributionReza Ramezani
 
NLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language ModelNLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language ModelHemantha Kulathilake
 
Noun Paraphrasing Based on a Variety of Contexts
Noun Paraphrasing Based on a Variety of ContextsNoun Paraphrasing Based on a Variety of Contexts
Noun Paraphrasing Based on a Variety of ContextsTomoyuki Kajiwara
 
BibleTech2013.pptx
BibleTech2013.pptxBibleTech2013.pptx
BibleTech2013.pptxAndi Wu
 
Natural language processing (Python)
Natural language processing (Python)Natural language processing (Python)
Natural language processing (Python)Sumit Raj
 
Natural language processing
Natural language processingNatural language processing
Natural language processingBasha Chand
 
Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)VenkateshMurugadas
 
2013 siam-cse-big-data
2013 siam-cse-big-data2013 siam-cse-big-data
2013 siam-cse-big-datac.titus.brown
 

Similar to Review: On the Naturalness of Buggy Code (20)

Do characters abuse more than words?
Do characters abuse more than words?Do characters abuse more than words?
Do characters abuse more than words?
 
A Panorama of Natural Language Processing
A Panorama of Natural Language ProcessingA Panorama of Natural Language Processing
A Panorama of Natural Language Processing
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Code Mixing computationally bahut challenging hai
Code Mixing computationally bahut challenging haiCode Mixing computationally bahut challenging hai
Code Mixing computationally bahut challenging hai
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Spell checker for Kannada OCR
Spell checker for Kannada OCRSpell checker for Kannada OCR
Spell checker for Kannada OCR
 
Language models
Language modelsLanguage models
Language models
 
PacMin @ AMPLab All-Hands
PacMin @ AMPLab All-HandsPacMin @ AMPLab All-Hands
PacMin @ AMPLab All-Hands
 
Computational linguistics
Computational linguisticsComputational linguistics
Computational linguistics
 
Artificial Intelligence Notes Unit 4
Artificial Intelligence Notes Unit 4Artificial Intelligence Notes Unit 4
Artificial Intelligence Notes Unit 4
 
Cross-Language Information Retrieval
Cross-Language Information RetrievalCross-Language Information Retrieval
Cross-Language Information Retrieval
 
Automated Abstracts and Big Data
Automated Abstracts and Big DataAutomated Abstracts and Big Data
Automated Abstracts and Big Data
 
Authorship attribution
Authorship attributionAuthorship attribution
Authorship attribution
 
NLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language ModelNLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language Model
 
Noun Paraphrasing Based on a Variety of Contexts
Noun Paraphrasing Based on a Variety of ContextsNoun Paraphrasing Based on a Variety of Contexts
Noun Paraphrasing Based on a Variety of Contexts
 
BibleTech2013.pptx
BibleTech2013.pptxBibleTech2013.pptx
BibleTech2013.pptx
 
Natural language processing (Python)
Natural language processing (Python)Natural language processing (Python)
Natural language processing (Python)
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)
 
2013 siam-cse-big-data
2013 siam-cse-big-data2013 siam-cse-big-data
2013 siam-cse-big-data
 

Recently uploaded

(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdfKamal Acharya
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGMANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGSIVASHANKAR N
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdfKamal Acharya
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesPrabhanshu Chaturvedi
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...ranjana rawat
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Bookingdharasingh5698
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...Call Girls in Nagpur High Profile
 

Recently uploaded (20)

(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdf
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGMANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and Properties
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 

Review: On the Naturalness of Buggy Code

  • 1. On the “Naturalness” of Buggy Code Baishakhi Ray, Vincent Hellendoorn, Saheel Godhane, Zhaopeng Tu, Alberto Bacchelli, Premkumar Devanbu. published in ICSE 2016 Jinhan Kim 2018.2.9
  • 2. Naturalness • Real software tends to be natural, like speech or natural language. • It tends to be highly repetitive and predictable. Naturalness of Software1 [1] A. Hindle, E. Barr, M. Gabel, Z. Su, and P. Devanbu. On the naturalness of software. In ICSE, pages 837–847, 2012.
  • 3. What does it mean when a code is considered “unnatural”?
  • 4. Research Questions • Are buggy lines less “natural” than non-buggy lines? • Are buggy lines less “natural" than bug-fix lines? • Is “naturalness" a good way to direct inspection effort?
  • 6. Language Model • Language model assign a probability to every sequence of words. • Given a code token sequence, 𝑆 = 𝑡1 𝑡2 … 𝑡 𝑁
  • 7. ngram Language Model • Using only the preceding n - 1 tokens. • ℎ = 𝑡1 𝑡2 … 𝑡𝑖−1
  • 8. $gram Language Model2 • Improving language model by deploying an additional cache- list of ngrams extracted from the local context, to capture the local regularities. [2] Z. Tu, Z. Su, and P. Devanbu. On the localness of software. In SIGSOFT FSE, pages 269–280, 2014.
  • 11. Phase-1 (during active development) • They chose to analyze each project for the period of one-year which contained the most bug fixes in that project’s history. • Then, extract snapshots at 1-month intervals.
  • 14. Entropy Measurement • $gram • The line and file entropies are computed by averaging over all the tokens belong to a line and all lines corresponding to a file respectively.
  • 15. Entropy Measurement • Package, class and method declarations • previously unseen identifiers – higher entropy scores • For-loop statements and catch clauses • being often repetitive – lower entropy scores Abstract-syntax-based line-types and computing a syntax-sensitive entropy score
  • 16. Syntax-sensitive Entropy Score • Matching between line and AST node. • Then, compute how much a line’s entropy deviated from the mean entropy of its line-type. • => $gram+type
  • 19. RQ1: Are buggy lines less “natural" than non-buggy lines?
  • 20. Are buggy lines less “natural" than non-buggy lines?
  • 22. Bug Duration Bugs that stay longer in a repository tend to have lower entropy than the short-lived bugs
  • 23. RQ2: Are buggy lines less “natural" than bug-fix lines?
  • 24. Are buggy lines less “natural" than bug-fix lines?
  • 29. RQ3: Is “naturalness" a good way to direct inspection effort?
  • 30. DP: Defect Prediction • Two classifier • Logistic Regression(LR) • Random Forest(RF) • Process metrics • # of developers • # of file-commit • Code churn • Previous bug history
  • 31. SBF: Static Bug Finder • SBF uses syntactic and semantic properties of source code. • For this study, PMD and FindBugs are used. • NBF: Naturalness Bug Finder • AUCEC: Area Under the Cost-Effectiveness Curve
  • 34. Result • Buggy lines, on average, have higher entropies, i.e. are “less natural”, than non-buggy lines. • Entropy of the buggy lines drops after bug-fixes with statistical significance. • Entropy can be used to guide bug-finding efforts at both the file- level and the line-level.