SlideShare a Scribd company logo
1 of 35
Download to read offline
Measuring Architecture Quality
by
Structure Plus History Analysis
Presented by
Naveed Kamran
naveed@datenc.com
About Authors
• Robert Schwanke
• Siemens Corporation, Corporate Technology
• Princeton, New Jersey, USA
• robert.schwanke@siemens.com
• Lu Xiao, Yuanfang Cai
• Computer Science Department, Drexel University
• Philadelphia, Pennsylvania, USA
• {lx52,yc349}@drexel.edu
2
Agenda
• Terms used in Research Paper
• Research Paper Details
I. Motivation and Background
II. Related Work
III. Case Study Questions
IV. Case Study Procedure
V. Case Study and Results
VI. Discussion of Results
VII. Conclusion
• Problems
3
Terms used in paper
Fan-in
• The number of modules that depend on a given module
Fan Out
• The number of modules that the given module depends on
Correlation
• A mutual relationship or connection between two or more things (for
example modules/functions etc.).
Outlier
• A module or function situated away or detached from the main body or
system.
4
Abstract
• A case study paper
• Combines
• Known software structure analysis techniques
• Known revision history analysis techniques
• To predict bug-related change frequency, and uncover architecture-
related risks in an agile industrial software development project.
5
(I). Motivation and Background
• Architecture often receives inadequate attention, especially in agile software
development
• Only a few principles kept in the heads of developers
• As small projects grow up, they tend to postpone both architecture specification
and necessary restructuring
• The experience of restructuring is hard to justify until there is a crisis.
• Architectural concerns are debts
• These debts tends to hurt future project sustainability much more than it does
current functionality or quality
• Development team needs ways to prioritize the backlog of restructuring tasks,
selecting which ones to include in the architecture renovation plan
• This is a case study of measuring architecture quality and identifying architecture
issues by combining analyses of dependency structure and revision history
6
Is your business ready to scale changes in
rapid manner?
7
(II). Related Work
• Size Measures
• Number of lines of code (LOC)
• Number of non-commented lines of code (NCLOC)
• Number of executable lines of code (ELOC)
• Kilo-SLOC (Thousand Source/Software Lines of Code) (KSLOC)
• Complexity Measures
• Fan-In
• Fan-Out
• Change Measures
• LOC added/deleted and number of sub systems affected
• Effort and Impact Measures
• Counting tasks, bug reports, file versions, change sets, or staff hours
8
Related Work (2)
• Types of Changes
• Bug/Task
• Software Fault Prediction and Project Repositories
• Which files will have faults in future
• Prediction Performance
• Correlation, accuracy, precision and recall
• Correlation and Regression Analysis of Count Data
• Architecture Violation Detection
• Find dependencies that violate the architecture structure
9
(III). Case Study Questions
1. Does this combined
structure/history measurement
reveal critical architecture issues
that are worth fixing?
2. Are there any structure-based
measures that can be used to
predict quality variation in the
absence of adequate revision history
data?
3. What measurements could help the
developers make important
architectural decisions, and how?
10
Developer Concerns
• “What can I do with this information?
There will always be some modules
with high complexity, coupling, fanout,
or size. What will I gain by
reorganizing them? Why do I care if
the code violates the architecture, as
long as it works?”
11
(IV). Case study procedure
• Data collection
• Structure and history measurement
• Validation
• Prediction
• Uncovering architecture problems
• Presenting findings
• Investigating developer concerns
12
(v). Case Study and Results
(a) The project and its data
• About Team
• Team of 20 developers
• About Project
• Project name System J (code name)
• Service oriented structure
• 300 KSLOC in 900 files and 165 java packages
• Development Methodology
• Agile project development
• Project manager is also the customer proxy
• Sprints are usually two weeks long
• Rebuild and automatically tested at least nightly
• VCS
• Mercurial
• Bug and Tracking system
• JIRA
13
VCS (Subversion/TortoiseSVN)
14
Project Support Tool (BunnyCRM)
15
Case Study and Results
(b) Measure of structure and history
• Single-file measures
1. File size
2. Fan-in
3. Fan-out
4. Change frequency
5. Ticket Frequency
6. Bug Change Frequency
• File pair measure
1. Pair Change Frequency
16
Case Study and Results
(b) Measure of structure and history
17
Case Study and Results
(b) Measure of structure and history
18
Case Study and Results
(c) Exploratory data analysis
• Distribution of each measure’s values
• Scatter plots of relationships between measure
• Comparing Releases
19
Case Study and Results
(c) Exploratory data analysis - Distribution of each
measure’s values
20
Case Study and Results
(c) Exploratory data analysis - Scatter plots of
relationships between measure
21
Case Study and Results
(c) Exploratory data analysis - Comparing Releases
22
Case Study and Results
(d) Validation on project data
• Faulty file recall. An ‘instance’ is a file that is changed (in the future)
at least once due to any bug ticket, but additional bug tickets and
additional changes don’t count extra.
• Fault recall. An ‘instance’ is a pair <file, bug ticket>,where the file is
changed at least once due to that bug ticket. Additional changes due
to the same ticket don’t count extra.
• Fault impact recall. An ‘instance’ is a triple <file, change set, bug
ticket>, where the file is a member of the change set and the change
set is associated with the bug ticket. If a file is checked in several
times due to the same bug ticket, each time is a separate instance.
23
Case Study and Results
(d) Validation on project data
24
Case Study and Results
(e) Uncovering architecture problems
• Outlier Analysis – Individual Files
• Lists the predicted fault-prone files “worst first”, sorting them according to the median of their
ranks by fan-out, size, and change frequency.
• Structure Browsing with Understand™
• A static code analysis tool called Understand™. It creates a database of code structure and
crossreference information, processes queries on this information, and renders the results using
Graphviz, a popular, free graph drawing engine
• Using this tool, we would analyze a file by creating a diagram that included the file itself and all of
its immediate neighbors in the cross-reference model, including the packages in which these files
are located. Showing the enclosing packages provides guidance for analysts who are unfamiliar
with some of the files.
• Outlier Analysis – Pairs and Clusters
• In addition to measuring individual files, we investigated the structure and history of pairs and
clusters of files.
• According to Parnas, the decomposition of a system should be based on mature, stable design
decisions and should encapsulate known future variabilities, so that the impact of each kind of
future change is as limited as practical.
25
Case Study and Results
(f) Investigating a Developer Concern
• A senior developer noticed that most of the fault-prone files belong
to the same entry point package.
• Authors used the layered dependency model to investigate and
illustrate the problem.
• The root cause was inadequate architecture awareness by junior
developers.
• The senior developer wrote a small renovation proposal, using this
data and diagrams to justify the work of architecture changes.
26
(VI) Discussion of Results
a) Answering the case study questions
1. Does this combined structure/history measurement reveal critical
architecture issues that are worth fixing?
• There are key interface files that are supposed to be stable and
correct, but actually have faults and change frequently.
• Frequent, distant change pairs often correspond to important
architecture weaknesses or violations.
• Structure gives us a visual context for analyzing change associations
found in the history, discovering important, but undocumented
shared assumptions that should be made explicit.
27
(VI) Discussion of Results
a) Answering the case study questions
• 2. Are there any structure-based measures that can be used, without
history, to predict quality variation?
• High fan-out, it should be examined for accidental complexity and
architecture violations. Fan-in was not a good predictor of quality
• High fan-in files did tend to be architecturally- significant
infrastructure classes.
28
(VI) Discussion of Results
a) Answering the case study questions
• 3. What measurements could help the developers make important
architectural decisions, and how?
• Outlier Investigation was a good way to find low-hanging, bad
smelling fruit.
• Predicting fault-proneness helped to prioritize architecture risks.
• Clustering distant change pairs seemed to bring together locality
violations with the same root cause.
29
(VI) Discussion of Results
• b) Visualizing structural concerns
• Interactive, visual architecture models are essential for analyzing and communicating
architecture issues.
• C) Overcoming dirty data
• Check correlations between change types, developer habits, and other measurable
attributes, then tailor the prediction functions to account for them
• d). Novelty and effectiveness
• Approach is repeatable and on other projects using similar tools
• e). Assumption and limitations
• we only studied one project, and we cannot claim that the result can be generalized
to other projects with different domains or using different tools or languages.
30
Conclusion
• This case study reported measuring code structure and history to
identify architecture problems in an industrial project.
• It demonstrated the effectiveness of the combination by enabling
developers to visualize the architectural roles and impact scope of
files with high fault and change frequency, which justified and
supported a successful renovation task.
31
Problems
• They have not considered following types of changes:
• Incident (A non producible defect)
• Lead
32
Problems - Why release 1.05 and 1.0 skipped?
33
Problems (3)
• The author of this paper self explained that
• They considered only one project and a single business domain. Its not
enough to be applied on all kinds of projects and business domains.
• They have not use bug severity. A change in a file can be because of
label change or a very minor cosmetic change in the user interface of
application. But all kinds of bugs are treated with single severity
34
Thank you
35

More Related Content

What's hot

Data collection for software defect prediction
Data collection for software defect predictionData collection for software defect prediction
Data collection for software defect predictionAmmAr mobark
 
Industry - The Evolution of Information Systems. A Case Study on Document Man...
Industry - The Evolution of Information Systems. A Case Study on Document Man...Industry - The Evolution of Information Systems. A Case Study on Document Man...
Industry - The Evolution of Information Systems. A Case Study on Document Man...ICSM 2011
 
Animated Visualization of Software History Using Software Evolution Storyboards
Animated Visualization of Software History Using Software Evolution StoryboardsAnimated Visualization of Software History Using Software Evolution Storyboards
Animated Visualization of Software History Using Software Evolution StoryboardsSAIL_QU
 
Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class...
Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class...Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class...
Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class...Feng Zhang
 
Towards Software Sustainability Guides for Industrial Software Systems
Towards Software Sustainability Guides for Industrial Software SystemsTowards Software Sustainability Guides for Industrial Software Systems
Towards Software Sustainability Guides for Industrial Software SystemsHeiko Koziolek
 
Source code comprehension on evolving software
Source code comprehension on evolving softwareSource code comprehension on evolving software
Source code comprehension on evolving softwareSung Kim
 
A Survey on Automatic Software Evolution Techniques
A Survey on Automatic Software Evolution TechniquesA Survey on Automatic Software Evolution Techniques
A Survey on Automatic Software Evolution TechniquesSung Kim
 
User-Perceived Source Code Quality Estimation based on Static Analysis Metrics
User-Perceived Source Code Quality Estimation based on Static Analysis MetricsUser-Perceived Source Code Quality Estimation based on Static Analysis Metrics
User-Perceived Source Code Quality Estimation based on Static Analysis MetricsISSEL
 
Challenges in Assessing Technical Debt based on Dynamic Runtime Data
Challenges in Assessing Technical Debt based on Dynamic Runtime DataChallenges in Assessing Technical Debt based on Dynamic Runtime Data
Challenges in Assessing Technical Debt based on Dynamic Runtime DataQAware GmbH
 
A Multi-Objective Refactoring Approach to Introduce Design Patterns and Fix A...
A Multi-Objective Refactoring Approach to Introduce Design Patterns and Fix A...A Multi-Objective Refactoring Approach to Introduce Design Patterns and Fix A...
A Multi-Objective Refactoring Approach to Introduce Design Patterns and Fix A...Ali Ouni
 
Software Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled DatasetsSoftware Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled DatasetsSung Kim
 
ProDebt's Lessons Learned from Planning Technical Debt Strategically
ProDebt's Lessons Learned from Planning Technical Debt StrategicallyProDebt's Lessons Learned from Planning Technical Debt Strategically
ProDebt's Lessons Learned from Planning Technical Debt StrategicallyQAware GmbH
 
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)Sung Kim
 
A defect prediction model based on the relationships between developers and c...
A defect prediction model based on the relationships between developers and c...A defect prediction model based on the relationships between developers and c...
A defect prediction model based on the relationships between developers and c...Vrije Universiteit Brussel
 
Defect Prediction Over Software Life Cycle in Automotive Domain
Defect Prediction Over Software Life Cycle   in Automotive DomainDefect Prediction Over Software Life Cycle   in Automotive Domain
Defect Prediction Over Software Life Cycle in Automotive DomainRAKESH RANA
 
Reverse Engineering for Documenting Software Architectures, a Literature Review
Reverse Engineering for Documenting Software Architectures, a Literature ReviewReverse Engineering for Documenting Software Architectures, a Literature Review
Reverse Engineering for Documenting Software Architectures, a Literature ReviewEditor IJCATR
 
Presentation
PresentationPresentation
PresentationSATYALOK
 

What's hot (20)

Data collection for software defect prediction
Data collection for software defect predictionData collection for software defect prediction
Data collection for software defect prediction
 
Industry - The Evolution of Information Systems. A Case Study on Document Man...
Industry - The Evolution of Information Systems. A Case Study on Document Man...Industry - The Evolution of Information Systems. A Case Study on Document Man...
Industry - The Evolution of Information Systems. A Case Study on Document Man...
 
Animated Visualization of Software History Using Software Evolution Storyboards
Animated Visualization of Software History Using Software Evolution StoryboardsAnimated Visualization of Software History Using Software Evolution Storyboards
Animated Visualization of Software History Using Software Evolution Storyboards
 
Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class...
Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class...Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class...
Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class...
 
Towards Software Sustainability Guides for Industrial Software Systems
Towards Software Sustainability Guides for Industrial Software SystemsTowards Software Sustainability Guides for Industrial Software Systems
Towards Software Sustainability Guides for Industrial Software Systems
 
Source code comprehension on evolving software
Source code comprehension on evolving softwareSource code comprehension on evolving software
Source code comprehension on evolving software
 
A Survey on Automatic Software Evolution Techniques
A Survey on Automatic Software Evolution TechniquesA Survey on Automatic Software Evolution Techniques
A Survey on Automatic Software Evolution Techniques
 
User-Perceived Source Code Quality Estimation based on Static Analysis Metrics
User-Perceived Source Code Quality Estimation based on Static Analysis MetricsUser-Perceived Source Code Quality Estimation based on Static Analysis Metrics
User-Perceived Source Code Quality Estimation based on Static Analysis Metrics
 
Software bug prediction
Software bug prediction Software bug prediction
Software bug prediction
 
Icsm19.ppt
Icsm19.pptIcsm19.ppt
Icsm19.ppt
 
Challenges in Assessing Technical Debt based on Dynamic Runtime Data
Challenges in Assessing Technical Debt based on Dynamic Runtime DataChallenges in Assessing Technical Debt based on Dynamic Runtime Data
Challenges in Assessing Technical Debt based on Dynamic Runtime Data
 
A Multi-Objective Refactoring Approach to Introduce Design Patterns and Fix A...
A Multi-Objective Refactoring Approach to Introduce Design Patterns and Fix A...A Multi-Objective Refactoring Approach to Introduce Design Patterns and Fix A...
A Multi-Objective Refactoring Approach to Introduce Design Patterns and Fix A...
 
Software Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled DatasetsSoftware Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled Datasets
 
ProDebt's Lessons Learned from Planning Technical Debt Strategically
ProDebt's Lessons Learned from Planning Technical Debt StrategicallyProDebt's Lessons Learned from Planning Technical Debt Strategically
ProDebt's Lessons Learned from Planning Technical Debt Strategically
 
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
 
A defect prediction model based on the relationships between developers and c...
A defect prediction model based on the relationships between developers and c...A defect prediction model based on the relationships between developers and c...
A defect prediction model based on the relationships between developers and c...
 
Defect Prediction Over Software Life Cycle in Automotive Domain
Defect Prediction Over Software Life Cycle   in Automotive DomainDefect Prediction Over Software Life Cycle   in Automotive Domain
Defect Prediction Over Software Life Cycle in Automotive Domain
 
Reverse Engineering for Documenting Software Architectures, a Literature Review
Reverse Engineering for Documenting Software Architectures, a Literature ReviewReverse Engineering for Documenting Software Architectures, a Literature Review
Reverse Engineering for Documenting Software Architectures, a Literature Review
 
Presentation
PresentationPresentation
Presentation
 
Technical report jada
Technical report jadaTechnical report jada
Technical report jada
 

Viewers also liked

Infografía: Cuatro Grandes Habilidades Lingüísticas
Infografía: Cuatro Grandes Habilidades Lingüísticas Infografía: Cuatro Grandes Habilidades Lingüísticas
Infografía: Cuatro Grandes Habilidades Lingüísticas Diana Ortiz
 
Javaユーザに知ってほしい Processing入門
Javaユーザに知ってほしいProcessing入門Javaユーザに知ってほしいProcessing入門
Javaユーザに知ってほしい Processing入門chickenJr
 
The franchising challenges forecast for international brands analysis
The franchising challenges forecast for international brands analysis The franchising challenges forecast for international brands analysis
The franchising challenges forecast for international brands analysis FranGlobal
 
UML and Data Modeling - A Reconciliation
UML and Data Modeling - A ReconciliationUML and Data Modeling - A Reconciliation
UML and Data Modeling - A Reconciliationdmurph4
 
Meglio un marketing interno o un'agenzia
Meglio un marketing interno o un'agenzia Meglio un marketing interno o un'agenzia
Meglio un marketing interno o un'agenzia Federico Simonetti
 
Domain-Driven Data
Domain-Driven DataDomain-Driven Data
Domain-Driven DataDATAVERSITY
 
Literatura trovadoresca
Literatura trovadoresca Literatura trovadoresca
Literatura trovadoresca Lurdes Augusto
 

Viewers also liked (11)

Infografía: Cuatro Grandes Habilidades Lingüísticas
Infografía: Cuatro Grandes Habilidades Lingüísticas Infografía: Cuatro Grandes Habilidades Lingüísticas
Infografía: Cuatro Grandes Habilidades Lingüísticas
 
Javaユーザに知ってほしい Processing入門
Javaユーザに知ってほしいProcessing入門Javaユーザに知ってほしいProcessing入門
Javaユーザに知ってほしい Processing入門
 
Ashford bus 318
Ashford bus 318Ashford bus 318
Ashford bus 318
 
The franchising challenges forecast for international brands analysis
The franchising challenges forecast for international brands analysis The franchising challenges forecast for international brands analysis
The franchising challenges forecast for international brands analysis
 
UML and Data Modeling - A Reconciliation
UML and Data Modeling - A ReconciliationUML and Data Modeling - A Reconciliation
UML and Data Modeling - A Reconciliation
 
El desastre de Chernóbyl
El desastre de ChernóbylEl desastre de Chernóbyl
El desastre de Chernóbyl
 
Parque de Bomberos de Cazalla
Parque de Bomberos de CazallaParque de Bomberos de Cazalla
Parque de Bomberos de Cazalla
 
Meglio un marketing interno o un'agenzia
Meglio un marketing interno o un'agenzia Meglio un marketing interno o un'agenzia
Meglio un marketing interno o un'agenzia
 
CFO Role - riesgos
CFO Role - riesgosCFO Role - riesgos
CFO Role - riesgos
 
Domain-Driven Data
Domain-Driven DataDomain-Driven Data
Domain-Driven Data
 
Literatura trovadoresca
Literatura trovadoresca Literatura trovadoresca
Literatura trovadoresca
 

Similar to naveed-kamran-software-architecture-agile

Good Slides on Architecture.ppt
Good Slides on Architecture.pptGood Slides on Architecture.ppt
Good Slides on Architecture.pptpoleshan
 
Connecting the dots mbse process dec02 2015
Connecting the dots mbse process dec02 2015Connecting the dots mbse process dec02 2015
Connecting the dots mbse process dec02 2015loydbakerjr
 
Scalable and Cost-Effective Model-Based Software Verification and Testing
Scalable and Cost-Effective Model-Based Software Verification and TestingScalable and Cost-Effective Model-Based Software Verification and Testing
Scalable and Cost-Effective Model-Based Software Verification and TestingLionel Briand
 
An Introduction to Clinical Study Migrations
An Introduction to Clinical Study MigrationsAn Introduction to Clinical Study Migrations
An Introduction to Clinical Study MigrationsPerficient, Inc.
 
Software Engineering Research: Leading a Double-Agent Life.
Software Engineering Research: Leading a Double-Agent Life.Software Engineering Research: Leading a Double-Agent Life.
Software Engineering Research: Leading a Double-Agent Life.Lionel Briand
 
Model Based Test Validation and Oracles for Data Acquisition Systems
Model Based Test Validation and Oracles for Data Acquisition SystemsModel Based Test Validation and Oracles for Data Acquisition Systems
Model Based Test Validation and Oracles for Data Acquisition SystemsLionel Briand
 
Intern Project Showcase.pptx
Intern Project Showcase.pptxIntern Project Showcase.pptx
Intern Project Showcase.pptxritikgarg48
 
Reproducibility of computational workflows is automated using continuous anal...
Reproducibility of computational workflows is automated using continuous anal...Reproducibility of computational workflows is automated using continuous anal...
Reproducibility of computational workflows is automated using continuous anal...Kento Aoyama
 
Mixing d ps building architecture on the cross cutting example
Mixing d ps building architecture on the cross cutting exampleMixing d ps building architecture on the cross cutting example
Mixing d ps building architecture on the cross cutting examplecorehard_by
 
Software development PROCESS
Software development PROCESSSoftware development PROCESS
Software development PROCESSIvano Malavolta
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalLionel Briand
 
A presentation on forward engineering
A presentation on forward engineeringA presentation on forward engineering
A presentation on forward engineeringGTU
 
Lecture 3 software_engineering
Lecture 3 software_engineeringLecture 3 software_engineering
Lecture 3 software_engineeringmoduledesign
 
Lecture 3 software_engineering
Lecture 3 software_engineeringLecture 3 software_engineering
Lecture 3 software_engineeringmoduledesign
 
Software Analytics - Achievements and Challenges
Software Analytics - Achievements and ChallengesSoftware Analytics - Achievements and Challenges
Software Analytics - Achievements and ChallengesTao Xie
 

Similar to naveed-kamran-software-architecture-agile (20)

Good Slides on Architecture.ppt
Good Slides on Architecture.pptGood Slides on Architecture.ppt
Good Slides on Architecture.ppt
 
Connecting the dots mbse process dec02 2015
Connecting the dots mbse process dec02 2015Connecting the dots mbse process dec02 2015
Connecting the dots mbse process dec02 2015
 
Seminar on Project Management by Rj
Seminar on Project Management by RjSeminar on Project Management by Rj
Seminar on Project Management by Rj
 
Scalable and Cost-Effective Model-Based Software Verification and Testing
Scalable and Cost-Effective Model-Based Software Verification and TestingScalable and Cost-Effective Model-Based Software Verification and Testing
Scalable and Cost-Effective Model-Based Software Verification and Testing
 
An Introduction to Clinical Study Migrations
An Introduction to Clinical Study MigrationsAn Introduction to Clinical Study Migrations
An Introduction to Clinical Study Migrations
 
Software Engineering Research: Leading a Double-Agent Life.
Software Engineering Research: Leading a Double-Agent Life.Software Engineering Research: Leading a Double-Agent Life.
Software Engineering Research: Leading a Double-Agent Life.
 
Model Based Test Validation and Oracles for Data Acquisition Systems
Model Based Test Validation and Oracles for Data Acquisition SystemsModel Based Test Validation and Oracles for Data Acquisition Systems
Model Based Test Validation and Oracles for Data Acquisition Systems
 
Intern Project Showcase.pptx
Intern Project Showcase.pptxIntern Project Showcase.pptx
Intern Project Showcase.pptx
 
Design For Testability
Design For TestabilityDesign For Testability
Design For Testability
 
Reproducibility of computational workflows is automated using continuous anal...
Reproducibility of computational workflows is automated using continuous anal...Reproducibility of computational workflows is automated using continuous anal...
Reproducibility of computational workflows is automated using continuous anal...
 
Ch 9-design-engineering
Ch 9-design-engineeringCh 9-design-engineering
Ch 9-design-engineering
 
Architecture Haiku
Architecture HaikuArchitecture Haiku
Architecture Haiku
 
Architectural design
Architectural designArchitectural design
Architectural design
 
Mixing d ps building architecture on the cross cutting example
Mixing d ps building architecture on the cross cutting exampleMixing d ps building architecture on the cross cutting example
Mixing d ps building architecture on the cross cutting example
 
Software development PROCESS
Software development PROCESSSoftware development PROCESS
Software development PROCESS
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 
A presentation on forward engineering
A presentation on forward engineeringA presentation on forward engineering
A presentation on forward engineering
 
Lecture 3 software_engineering
Lecture 3 software_engineeringLecture 3 software_engineering
Lecture 3 software_engineering
 
Lecture 3 software_engineering
Lecture 3 software_engineeringLecture 3 software_engineering
Lecture 3 software_engineering
 
Software Analytics - Achievements and Challenges
Software Analytics - Achievements and ChallengesSoftware Analytics - Achievements and Challenges
Software Analytics - Achievements and Challenges
 

naveed-kamran-software-architecture-agile

  • 1. Measuring Architecture Quality by Structure Plus History Analysis Presented by Naveed Kamran naveed@datenc.com
  • 2. About Authors • Robert Schwanke • Siemens Corporation, Corporate Technology • Princeton, New Jersey, USA • robert.schwanke@siemens.com • Lu Xiao, Yuanfang Cai • Computer Science Department, Drexel University • Philadelphia, Pennsylvania, USA • {lx52,yc349}@drexel.edu 2
  • 3. Agenda • Terms used in Research Paper • Research Paper Details I. Motivation and Background II. Related Work III. Case Study Questions IV. Case Study Procedure V. Case Study and Results VI. Discussion of Results VII. Conclusion • Problems 3
  • 4. Terms used in paper Fan-in • The number of modules that depend on a given module Fan Out • The number of modules that the given module depends on Correlation • A mutual relationship or connection between two or more things (for example modules/functions etc.). Outlier • A module or function situated away or detached from the main body or system. 4
  • 5. Abstract • A case study paper • Combines • Known software structure analysis techniques • Known revision history analysis techniques • To predict bug-related change frequency, and uncover architecture- related risks in an agile industrial software development project. 5
  • 6. (I). Motivation and Background • Architecture often receives inadequate attention, especially in agile software development • Only a few principles kept in the heads of developers • As small projects grow up, they tend to postpone both architecture specification and necessary restructuring • The experience of restructuring is hard to justify until there is a crisis. • Architectural concerns are debts • These debts tends to hurt future project sustainability much more than it does current functionality or quality • Development team needs ways to prioritize the backlog of restructuring tasks, selecting which ones to include in the architecture renovation plan • This is a case study of measuring architecture quality and identifying architecture issues by combining analyses of dependency structure and revision history 6
  • 7. Is your business ready to scale changes in rapid manner? 7
  • 8. (II). Related Work • Size Measures • Number of lines of code (LOC) • Number of non-commented lines of code (NCLOC) • Number of executable lines of code (ELOC) • Kilo-SLOC (Thousand Source/Software Lines of Code) (KSLOC) • Complexity Measures • Fan-In • Fan-Out • Change Measures • LOC added/deleted and number of sub systems affected • Effort and Impact Measures • Counting tasks, bug reports, file versions, change sets, or staff hours 8
  • 9. Related Work (2) • Types of Changes • Bug/Task • Software Fault Prediction and Project Repositories • Which files will have faults in future • Prediction Performance • Correlation, accuracy, precision and recall • Correlation and Regression Analysis of Count Data • Architecture Violation Detection • Find dependencies that violate the architecture structure 9
  • 10. (III). Case Study Questions 1. Does this combined structure/history measurement reveal critical architecture issues that are worth fixing? 2. Are there any structure-based measures that can be used to predict quality variation in the absence of adequate revision history data? 3. What measurements could help the developers make important architectural decisions, and how? 10
  • 11. Developer Concerns • “What can I do with this information? There will always be some modules with high complexity, coupling, fanout, or size. What will I gain by reorganizing them? Why do I care if the code violates the architecture, as long as it works?” 11
  • 12. (IV). Case study procedure • Data collection • Structure and history measurement • Validation • Prediction • Uncovering architecture problems • Presenting findings • Investigating developer concerns 12
  • 13. (v). Case Study and Results (a) The project and its data • About Team • Team of 20 developers • About Project • Project name System J (code name) • Service oriented structure • 300 KSLOC in 900 files and 165 java packages • Development Methodology • Agile project development • Project manager is also the customer proxy • Sprints are usually two weeks long • Rebuild and automatically tested at least nightly • VCS • Mercurial • Bug and Tracking system • JIRA 13
  • 15. Project Support Tool (BunnyCRM) 15
  • 16. Case Study and Results (b) Measure of structure and history • Single-file measures 1. File size 2. Fan-in 3. Fan-out 4. Change frequency 5. Ticket Frequency 6. Bug Change Frequency • File pair measure 1. Pair Change Frequency 16
  • 17. Case Study and Results (b) Measure of structure and history 17
  • 18. Case Study and Results (b) Measure of structure and history 18
  • 19. Case Study and Results (c) Exploratory data analysis • Distribution of each measure’s values • Scatter plots of relationships between measure • Comparing Releases 19
  • 20. Case Study and Results (c) Exploratory data analysis - Distribution of each measure’s values 20
  • 21. Case Study and Results (c) Exploratory data analysis - Scatter plots of relationships between measure 21
  • 22. Case Study and Results (c) Exploratory data analysis - Comparing Releases 22
  • 23. Case Study and Results (d) Validation on project data • Faulty file recall. An ‘instance’ is a file that is changed (in the future) at least once due to any bug ticket, but additional bug tickets and additional changes don’t count extra. • Fault recall. An ‘instance’ is a pair <file, bug ticket>,where the file is changed at least once due to that bug ticket. Additional changes due to the same ticket don’t count extra. • Fault impact recall. An ‘instance’ is a triple <file, change set, bug ticket>, where the file is a member of the change set and the change set is associated with the bug ticket. If a file is checked in several times due to the same bug ticket, each time is a separate instance. 23
  • 24. Case Study and Results (d) Validation on project data 24
  • 25. Case Study and Results (e) Uncovering architecture problems • Outlier Analysis – Individual Files • Lists the predicted fault-prone files “worst first”, sorting them according to the median of their ranks by fan-out, size, and change frequency. • Structure Browsing with Understand™ • A static code analysis tool called Understand™. It creates a database of code structure and crossreference information, processes queries on this information, and renders the results using Graphviz, a popular, free graph drawing engine • Using this tool, we would analyze a file by creating a diagram that included the file itself and all of its immediate neighbors in the cross-reference model, including the packages in which these files are located. Showing the enclosing packages provides guidance for analysts who are unfamiliar with some of the files. • Outlier Analysis – Pairs and Clusters • In addition to measuring individual files, we investigated the structure and history of pairs and clusters of files. • According to Parnas, the decomposition of a system should be based on mature, stable design decisions and should encapsulate known future variabilities, so that the impact of each kind of future change is as limited as practical. 25
  • 26. Case Study and Results (f) Investigating a Developer Concern • A senior developer noticed that most of the fault-prone files belong to the same entry point package. • Authors used the layered dependency model to investigate and illustrate the problem. • The root cause was inadequate architecture awareness by junior developers. • The senior developer wrote a small renovation proposal, using this data and diagrams to justify the work of architecture changes. 26
  • 27. (VI) Discussion of Results a) Answering the case study questions 1. Does this combined structure/history measurement reveal critical architecture issues that are worth fixing? • There are key interface files that are supposed to be stable and correct, but actually have faults and change frequently. • Frequent, distant change pairs often correspond to important architecture weaknesses or violations. • Structure gives us a visual context for analyzing change associations found in the history, discovering important, but undocumented shared assumptions that should be made explicit. 27
  • 28. (VI) Discussion of Results a) Answering the case study questions • 2. Are there any structure-based measures that can be used, without history, to predict quality variation? • High fan-out, it should be examined for accidental complexity and architecture violations. Fan-in was not a good predictor of quality • High fan-in files did tend to be architecturally- significant infrastructure classes. 28
  • 29. (VI) Discussion of Results a) Answering the case study questions • 3. What measurements could help the developers make important architectural decisions, and how? • Outlier Investigation was a good way to find low-hanging, bad smelling fruit. • Predicting fault-proneness helped to prioritize architecture risks. • Clustering distant change pairs seemed to bring together locality violations with the same root cause. 29
  • 30. (VI) Discussion of Results • b) Visualizing structural concerns • Interactive, visual architecture models are essential for analyzing and communicating architecture issues. • C) Overcoming dirty data • Check correlations between change types, developer habits, and other measurable attributes, then tailor the prediction functions to account for them • d). Novelty and effectiveness • Approach is repeatable and on other projects using similar tools • e). Assumption and limitations • we only studied one project, and we cannot claim that the result can be generalized to other projects with different domains or using different tools or languages. 30
  • 31. Conclusion • This case study reported measuring code structure and history to identify architecture problems in an industrial project. • It demonstrated the effectiveness of the combination by enabling developers to visualize the architectural roles and impact scope of files with high fault and change frequency, which justified and supported a successful renovation task. 31
  • 32. Problems • They have not considered following types of changes: • Incident (A non producible defect) • Lead 32
  • 33. Problems - Why release 1.05 and 1.0 skipped? 33
  • 34. Problems (3) • The author of this paper self explained that • They considered only one project and a single business domain. Its not enough to be applied on all kinds of projects and business domains. • They have not use bug severity. A change in a file can be because of label change or a very minor cosmetic change in the user interface of application. But all kinds of bugs are treated with single severity 34