Mining Software Repositories: Using Humans to Better Software
Marat Akhin
15/06/2015
Marat Akhin Mining Software Repositories: Using Humans to Better Software 15/06/2015 1 / 18
What is MSR?
Mining software repositories
Understand empirical aspects of software development
Use the past to guide the future
MSR data
Historical data
  Version control systems: CVS, SVN, Git, Mercurial
  Bug trackers: Bugzilla, JIRA, YouTrack
  Communication: e-mails, chat logs, wiki pages
Execution data
  Execution traces
  Deployment logs
  Crash dumps
Source code data
  Source code itself
MSR methods
Classification
aka Supervised learning
Clustering
aka Unsupervised learning
Statistical hypothesis testing
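
As a rough illustration of hypothesis testing on mined data, the sketch below computes a Pearson chi-square statistic for a 2x2 table. The scenario (changes committed on Fridays versus other days, split into fix-inducing and clean) and all counts are made up for the example; only the formula and the 3.841 critical value (1 degree of freedom, 5% level) are standard.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CRITICAL_VALUE = 3.841

# Hypothetical counts, NOT taken from any study:
# rows = Friday changes / other-day changes, columns = fix-inducing / clean.
stat = chi_square_2x2(30, 70, 20, 80)
significant = stat > CRITICAL_VALUE
print(f"chi-square = {stat:.3f}, significant at 5%: {significant}")
```

With these invented counts the statistic stays below the critical value, so the apparent difference between the two groups would not be treated as significant.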
MSR insights
Quality assurance
Architecture analysis
Bug prediction
Developer feedback
You-name-it!
Can we predict bugs?
Don’t code on Fridays [1]
Eclipse / Mozilla repositories and bug trackers
Link bug fixes to source code changes
Find interesting correlations
[1] Jacek Śliwerski, Thomas Zimmermann, and Andreas Zeller. When do changes induce fixes? (MSR’05)
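
The linking step can be sketched roughly as follows. This is a simplified illustration, not the authors' tooling: the commit tuples, the bug IDs, the message pattern, and the "rewritten" field (standing in for what a blame/annotate step would report) are all hypothetical.

```python
import re
from collections import Counter
from datetime import date

# Assumed convention: fix commits mention the bug as "bug 77", "Fix #42", etc.
BUG_REF = re.compile(r"\b(?:bug|fix(?:es|ed)?)\s*#?(\d+)", re.IGNORECASE)
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def linked_bug_ids(message, known_bug_ids):
    """Bug IDs mentioned in a commit message that also exist in the tracker."""
    return {int(m) for m in BUG_REF.findall(message)} & known_bug_ids


def fix_inducing_by_weekday(commits, known_bug_ids):
    """Count, per weekday, the earlier commits whose lines a bug fix rewrote."""
    dates = {cid: date.fromisoformat(day) for cid, day, _, _ in commits}
    inducing = set()
    for _, _, message, rewritten in commits:
        if linked_bug_ids(message, known_bug_ids):
            inducing.update(rewritten)
    return Counter(WEEKDAYS[dates[cid].weekday()] for cid in inducing)


# Hypothetical history: (id, ISO date, message, commits whose lines it rewrote).
commits = [
    ("c1", "2015-06-05", "Add cache layer", []),
    ("c2", "2015-06-08", "Refactor parser", []),
    ("c3", "2015-06-10", "Fix #42: stale cache entries", ["c1"]),
    ("c4", "2015-06-12", "Fixed bug 77 in parser", ["c2", "c1"]),
]
counts = fix_inducing_by_weekday(commits, known_bug_ids={42, 77})
print(counts)
```

Intersecting with the known bug IDs is a cheap guard against numbers in commit messages that only look like bug references.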
Reopened bugs stay [2]
Eclipse / Apache / OpenOffice
Build decision trees by different criteria
Analyze the results
[2] Emad Shihab et al. Studying re-opened bugs in open source software (ESE’12)
Code reviews: yay or nay?
More reviews == fewer bugs [3]
Qt / ITK / VTK
Collect review metrics
Build regression models for bug prediction
[3] Shane McIntosh et al. The impact of code review coverage and code review participation on software quality: a case study of the Qt, VTK, and ITK projects. (MSR’14)
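
To make the regression step concrete, here is a minimal one-variable least-squares sketch. It is not the model from the cited study: the single predictor (review coverage per component), the defect counts, and the function name fit_line are assumptions chosen for illustration, and the data is invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented data, one point per component:
# fraction of changes that were reviewed vs. post-release defect count.
review_coverage = [0.0, 0.25, 0.5, 0.75, 1.0]
defects = [9, 7, 5, 3, 1]

slope, intercept = fit_line(review_coverage, defects)
print(f"defects ~ {slope:.1f} * coverage + {intercept:.1f}")
```

A negative slope would be read as "more review coverage goes with fewer defects"; a real analysis would add further predictors and check the fit before drawing that conclusion.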
Code clones: what is that smell?
Clones are better than other code [4]
Apache / Evolution / GIMP / Nautilus
Detect clones and link them to bugs
Analyze clone-to-bug ratio
[4] Foyzur Rahman et al. Clones: what is that smell? (ESE’12)
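
Clone detection can be approximated very crudely by looking for identical windows of whitespace-normalized lines. The sketch below is a toy stand-in for a real clone detector, not the tool used in the cited study; the file names, contents, and window size are hypothetical.

```python
from collections import defaultdict


def normalize(line):
    """Collapse whitespace so formatting differences do not hide a clone."""
    return " ".join(line.split())


def find_clones(files, window=3):
    """Group locations whose `window` consecutive non-blank lines are identical.

    `files` maps a file name to its source text; each location is a
    (file name, 1-based start line) pair.
    """
    seen = defaultdict(list)
    for name, text in files.items():
        lines = [(no, normalize(raw))
                 for no, raw in enumerate(text.splitlines(), start=1)]
        lines = [(no, line) for no, line in lines if line]  # drop blank lines
        for k in range(len(lines) - window + 1):
            chunk = tuple(line for _, line in lines[k:k + window])
            seen[chunk].append((name, lines[k][0]))
    return [locations for locations in seen.values() if len(locations) > 1]


# Hypothetical inputs for illustration only.
files = {
    "a.py": "x = 1\ny = 2\nz = x + y\nprint(z)\n",
    "b.py": "q = 0\nx = 1\ny  =  2\nz = x + y\n",
}
clones = find_clones(files)
print(clones)
```

Once clone locations are known, they can be joined with bug-fix locations (for instance from a fix-linking step) to compare how often cloned and non-cloned code is touched by fixes.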
What next?
More data to explore
OSS source code doubles every year
Active use of *aaS platforms
MSR has access to vast amounts of development data
More insights coming next week!
