An Ecosystemic and Socio-Technical View on Software Maintenance & Evolution
Tom Mens @tom_mens
COMPLEXYS Research Institute
University of Mons, Belgium
-1999: PhD @VUB
1999-2003: Postdoc @VUB
2003-now: (full) professor
Research topics: OO design & refactoring; MDSE, model transformation; empirical research of software ecosystems
(Timeline labels: 1994-2004, 1998-2004, 2010-now; 2004, 2008)
Research Collaborators
Research Context
2012-2017 ongoing research project
“Ecological Studies of Open Source Software Ecosystems”
- Interdisciplinary research
- Use ideas from biological ecology to understand and
improve evolution of software ecosystems
A software ecosystem is a collection
of software projects that are
developed and evolve together in
the same environment.
Mircea Lungu
(PhD, 2008)
Software Ecosystem Examples
Gnome
CRAN
Debian, Ubuntu, KDE
JavaScript, Ruby
When things go wrong…
CRAN
Credits: http://www.designandanalytics.com/cran-gephi/
Package dependency graph
> 9K active packages
> 21K dependencies
in April 2016
CRAN
• Increasing number of R packages hosted on GitHub
“non-transparent nature of the CRAN submission / rejection
process”
“CRAN […] is revealing some limitations of the current design. One
such problem is the general lack of dependency versioning in the
infrastructure.”
• Problems with breaking dependencies
“It is more and more of a pain if the package I’m depending on
breaks”
“One recent example was the forced roll-back of the ggplot2
update to version 0.9.0, because the introduced changes caused
several other packages to break.”
Decan et al. “When GitHub Meets CRAN: An Analysis of Inter-Repository Package
Dependency Problems.” SANER 2016
JavaScript
> 317K packages
> 728K dependencies
in June 2016
JavaScript
• Deliberate desire to distribute micropackages
• Lots of dependencies to micropackages
Example: isarray
(150 direct, 77K transitive incoming dependencies in Aug 2016)
var toString = {}.toString;
module.exports = Array.isArray || function (arr) {
  return toString.call(arr) == '[object Array]';
};
David Haney’s code blog, March 2016
http://www.haneycodes.net/npm-left-pad-have-we-forgotten-how-to-program/
Example: leftpad
• Package leftpad
function leftpad (str, len, ch) {
  str = String(str);
  var i = -1;
  if (!ch && ch !== 0) ch = ' ';
  len = len - str.length;
  while (++i < len) { str = ch + str; }
  return str;
}
• What happened?
– Its developer unpublished all his modules from npm
“This impacted many thousands of projects. [...] We
began observing hundreds of failures per minute, as
dependent projects – and their dependents, and their
dependents... – all failed when requesting the now-
unpublished package.”
http://blog.npmjs.org/post/141577284765/kik-left-pad-and-npm
Example: leftpad
Departure of a
central contributor
• All bug handling became concentrated in 1 contributor
• Contributor suddenly left project, being dissatisfied
• Lasting negative impact on bug handling performance
Zanetti et al. “The rise and fall of a central contributor: Dynamics of social
organization and performance in the Gentoo community.” CHASE 2013
Strict policy and tools for ensuring
backward compatibility
• “Prime Directive: When evolving the Component API
from release to release, do not break existing clients”
Bogart et al. “How to break an API: Cost negotiation and community values in
three software ecosystems.” FSE 2016
May lead to stagnation
and drive away developers
– Coordination around synchronized yearly releases
slows down development
“If you have hip things, then you get people who create new
APIs on top of that […] These things don’t happen on the
Eclipse platform anymore.”
“you have to be very patient and know who to talk with […] in
order to get your patches accepted, and I think it’s very
intimidating for some new people to come on.”
Bogart et al. “How to break an API: Cost negotiation and community values in
three software ecosystems.” FSE 2016
Socio-Technical View
• Software ecosystems suffer
from problems because of
technical factors, social
reasons, or both.
• A socio-technical view
is therefore essential for
software ecosystem evolution
research.
Socio-Technical View
• Socio-technical analyses can benefit from
mixed method research
– Combine quantitative and qualitative methods
into a single study
• Empirical analysis of objective data
• User surveys and interviews
– Exploiting their complementarity increases
confidence of the findings
Johnson et al. Mixed methods research: A research paradigm whose time has
come. Educational Researcher 33(7): 14–26, 2004
Software Ecosystem (SECO)
Research Challenges
Understanding SECOs
• How are SECOs structured?
• What are their tools, habits, values, boundaries?
• How do they emerge and evolve over time?
• What are the mechanisms driving their dynamics?
• How do different SECOs compare?
• How to face technical challenges?
Serebrenik et al. “Challenges in Software Ecosystems Research.” IWSECO-WEA 2015
Software Ecosystem
Research Challenges
Supporting SECO communities
• How can they be made more sustainable and
resilient?
• How can we predict their evolution?
• How can we improve the SECO?
– In terms of productivity, quality, diversity,
maintainability, survival, popularity, …
Serebrenik et al. “Challenges in Software Ecosystems Research.” IWSECO-WEA 2015
Supporting SECOs
Increasing resilience & sustainability
Can the SECO
• resist major disturbances?
• return to a stable equilibrium after a major
disturbance?
Possible approach:
• Estimate, predict and reduce risk of bus factor
Bus factor
Social view
Specific activity concentrated in few persons.
Examples:
– A single person responsible for bug handling in Gentoo
– Only one developer knows some part of the code
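A rough way to estimate the social bus factor is to ask how many of the most active contributors can leave before more than half of the files lose their main author. The sketch below is a simplified greedy heuristic in that spirit, not the algorithm of any particular tool; the `ownership` mapping, the file names and the 50% threshold are made-up illustration values.

```python
from collections import Counter

def bus_factor(ownership, threshold=0.5):
    """Greedy estimate: remove the author owning the most files until
    more than `threshold` of all files have lost their owner."""
    remaining = dict(ownership)          # file -> main author
    total = len(ownership)
    removed = 0
    while remaining and (total - len(remaining)) / total <= threshold:
        top_author, _ = Counter(remaining.values()).most_common(1)[0]
        remaining = {f: a for f, a in remaining.items() if a != top_author}
        removed += 1
    return removed

# Hypothetical data: one contributor owns most of the bug-handling code.
ownership = {
    "bugs/triage.py": "alice",
    "bugs/assign.py": "alice",
    "bugs/close.py": "alice",
    "docs/guide.md": "bob",
    "build/make.py": "carol",
}
print(bus_factor(ownership))
```

A low value signals that a specific activity is concentrated in very few persons.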
Bus factor
Technical view
Too many software components depend on a
single software component.
– Makes components more brittle to future changes
– npm leftpad example
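The technical bus factor can be explored by counting how many packages directly or transitively depend on a single package. A minimal sketch, assuming the dependency data is available as a plain mapping; the package names other than leftpad are hypothetical.

```python
from collections import defaultdict, deque

def transitive_dependents(dependencies, target):
    """Return all packages that directly or indirectly depend on `target`."""
    reverse = defaultdict(set)           # package -> packages that depend on it
    for pkg, deps in dependencies.items():
        for dep in deps:
            reverse[dep].add(pkg)
    seen, queue = set(), deque([target])
    while queue:
        current = queue.popleft()
        for dependent in reverse[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen

# Hypothetical dependency lists (package -> packages it requires).
dependencies = {
    "app": ["framework"],
    "framework": ["utils"],
    "utils": ["leftpad"],
    "cli": ["leftpad"],
    "leftpad": [],
}
print(sorted(transitive_dependents(dependencies, "leftpad")))
```

Unpublishing the package at the bottom of this toy graph would break every other package in it.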
Bus factor
Active area of research
At least 4 GitHub projects compute (social) bus
factor.
Cosentino et al. “Assessing the bus factor of Git repositories.”
SANER 2015
Avelino et al. “A novel approach for estimating truck factors.”
ICPC 2016
Bus factor
Experimental support on GitHub
https://libraries.io/bus-factor
Bus factor
https://dependencyci.com
Supporting SECOs
Improving quality
By increasing technical wealth
through reducing technical debt
“a concept in programming that reflects the extra
development work that arises when code that is
easy to implement in the short run is used instead
of applying the best overall solution”
(Ward Cunningham, 1992)
http://legacycoderocks.libsyn.com/technical-wealth-with-declan-wheelan
Implementation of SQALE model in SonarQube
Supporting SECOs
Improving quality
Social view: Reducing social debt
“Unforeseen project cost connected to sub-optimal
organizational-social structures”
Supporting SECOs
Improving quality
Reducing social debt by removing community smells
– Organisational silo
• High decoupling and lack of communication between tasks
– Black cloud
• Lack of people able to bridge the knowledge and experience gap
between distinct communities
– Prima-donnas
• Seemingly condescending and egotistical behaviour, irreceptiveness to
collaboration
– Sharing villainy
• Lack of knowledge exchange incentives
– Organisational skirmish
• Misalignment of organisational cultures between distinct communities
– …
Interdisciplinary research
“Many challenges we face are not solvable by people
remaining in their single discipline silos”…
www.newscientist.com/article/mg20928002-100-open-your-mind-to-interdisciplinary-research/
Interdisciplinary research
“bringing […] disciplines together in the long term
is what provides the big, big breakthroughs”
Interdisciplinary research
Social Network Analysis (SNA)
Social Network Analysis
Social network centrality measures
Degree
Number of in- or outgoing dependencies of a node.
Betweenness
Quantifies number of times a node acts as a bridge along the
shortest path between two other nodes.
Closeness
The more central a node, the lower its total distance from all
other nodes.
Eigenvector centrality and PageRank
Measures the influence of a node in a network.
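Two of these measures are easy to compute by hand. The sketch below covers degree and closeness only, on an undirected and connected toy network; betweenness, eigenvector centrality and PageRank are left out for brevity. The developer names are hypothetical.

```python
from collections import deque

def bfs_distances(graph, source):
    """Shortest-path lengths (in hops) from `source` in an undirected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def degree(graph):
    """Number of neighbours of each node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}

def closeness(graph):
    """(n - 1) / total distance to all other nodes; assumes a connected graph."""
    result = {}
    for node in graph:
        total = sum(bfs_distances(graph, node).values())
        result[node] = (len(graph) - 1) / total if total else 0.0
    return result

# Hypothetical communication network: "dev_a" talks to everyone.
graph = {
    "dev_a": {"dev_b", "dev_c", "dev_d"},
    "dev_b": {"dev_a"},
    "dev_c": {"dev_a"},
    "dev_d": {"dev_a"},
}
print(degree(graph))
print(closeness(graph))
```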
Social Network Analysis
Social network centrality measures
Social Network Analysis
Can be used to
– detect social debt
– identify social bus factor
– predict software failures
– … and many more …
Social Network Analysis
Social bus factor in Gentoo Linux
– All bug handling became concentrated in one contributor
– Measured by significant increase of centralization and
performance.
Zanetti et al. “The rise and fall of a central contributor: Dynamics of social
organization and performance in the Gentoo community.” CHASE 2013
Social Network Analysis
Social bus factor in Gentoo Linux
– Contributor suddenly left the project, being
dissatisfied
– Sentiment analysis showed correlation with negative
emotions
– Lasting negative impact on the bug handling
performance of the community.
Zanetti et al. “The rise and fall of a central contributor: Dynamics of social
organization and performance in the Gentoo community.” CHASE 2013
Social Network Analysis
Use of SNA to better predict software failures
– By combining program dependency information
with social network information
Bird et al. “Putting it All Together: Using Socio-Technical Networks
to Predict Failures.” ISSRE 2009
Pinzger et al. “Can developer-module networks predict failures?”
FSE 2008
Mirroring hypothesis
Conway’s law
Software structure tends to mirror the
organisational/social structure
A.k.a. socio-technical congruence
alignment between technical dependencies and
social coordination in a project
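One simple way to quantify this alignment is the fraction of technical dependencies that are matched by actual coordination. The sketch below assumes this ratio as the measure; the developer names and both pair lists are hypothetical.

```python
def congruence(technical_deps, coordination):
    """Fraction of technical dependencies between developers' work that is
    matched by actual coordination between those developers."""
    needed = {frozenset(pair) for pair in technical_deps}
    actual = {frozenset(pair) for pair in coordination}
    if not needed:
        return 1.0                       # nothing to coordinate on
    return len(needed & actual) / len(needed)

# Hypothetical data: pairs of developers whose modules depend on each other,
# and pairs who actually communicated (e.g. on a mailing list).
technical_deps = [("ann", "ben"), ("ben", "cem"), ("ann", "dee")]
coordination = [("ben", "ann"), ("cem", "dee")]
print(congruence(technical_deps, coordination))
```

Here only one of the three required coordination links is present, so the congruence is one third.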
Mirroring hypothesis
Conway’s law
• Evidence in favor: commercial “in-house” development
• Evidence against: “community-based” development
More modular software
=> emergent “complex network” structure?
MacCormack et al. “Exploring the duality between product and
organizational architectures: A test of the mirroring hypothesis.” Research
Policy, 2012.
Colfer et al. “The mirroring hypothesis: Theory, evidence and
exceptions.” Harvard Business School, 2010.
Interdisciplinary research
Complex Systems
Interdisciplinary research
Complexity Theory
Interdisciplinary research
Complex Systems
“A new approach to science that investigates how
relationships between parts give rise to the
collective behaviors of a system and how the
system interacts and forms relationships with its
environment.”
Emergence: process whereby larger entities,
patterns, and regularities arise through interactions
among smaller or simpler entities that themselves
do not exhibit such properties.
Complexity Theory
Network Theory
Citation from Mitchell’s book:
“network thinking is providing novel ways to think
about difficult problems such as how to do efficient
search on the Web, […] how to manage large
organisations, how to preserve ecosystems, […]
and, more generally, what kind of resilience and
vulnerabilities are intrinsic to natural, social, and
technological networks, and how to exploit and
protect such systems.”
Complexity Theory
Network Theory
Some characteristics of complex networks:
Small-world property
• Low average path length between any two nodes
• Highly-clustered components linked through hubs
Skewed distributions (power law behaviour)
• Few nodes with very high in-degree (resp. out-degree),
many nodes with very small in-degree (resp. out)
Complexity Theory
Network Theory
Some characteristics of complex networks:
Scale-freeness
• Observed degree distribution is very similar
regardless of the scale of the observation
Scale-free networks are resilient
• Robust to deletion of random (non-hub) nodes
• Vulnerable to the deletion of hubs
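The difference between deleting a random node and deleting a hub can be illustrated on a toy hub-and-spoke network by measuring the largest connected component that survives. This is a minimal sketch with a hypothetical six-node graph, not a real scale-free network.

```python
def largest_component(graph, removed=frozenset()):
    """Size of the largest connected component after removing some nodes."""
    seen, best = set(removed), 0
    for start in graph:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:
            node = stack.pop()
            size += 1
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        best = max(best, size)
    return best

# Hypothetical hub-and-spoke network: one hub, five leaves.
graph = {"hub": {"a", "b", "c", "d", "e"}}
for leaf in "abcde":
    graph[leaf] = {"hub"}

print(largest_component(graph))                   # 6
print(largest_component(graph, removed={"a"}))    # 5
print(largest_component(graph, removed={"hub"}))  # 1
```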
Complexity Theory
Network Theory
Examples of complex networks exhibiting these
characteristics
– World-Wide Web
– (Technical) software dependency graphs
– Social networks (e.g. Facebook)
– (Socio-technical) software ecosystems
Complexity Theory
Network Theory
Examples of software
system dependency
networks
Network Theory
Possible applications for SECOs
• Provide prediction/forecasting models
– of how SECOs emerge
– of how SECOs grow/evolve
• Estimate the resilience and sustainability of
SECOs after major disturbances
• Assess risk of deleting hub nodes => bus factor!
Network Theory
Possible applications for SECOs
How do SECOs emerge and grow?
A popular model is preferential attachment
Over time, nodes with higher degree receive more links
than nodes with lower degree.
Extensions of this model have been proposed to
simulate the growth of complex software systems
By mimicking the principle of coupling & cohesion
Barabasi et al. Emergence of Scaling in Random Networks.
Science 286, 1999
Li et al. Multi-Level Formation of Complex Software Systems.
Entropy 18(178), 2016
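Preferential attachment can be simulated in a few lines. The sketch below is the basic single-link variant (each new node attaches to one existing node), not the multi-level extension for software systems; the network size and the random seed are arbitrary choices.

```python
import random
from collections import Counter

def preferential_attachment(n_nodes, seed=0):
    """Grow a network one node at a time; each new node links to one
    existing node chosen with probability proportional to its degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    endpoints = [0, 1]       # each node appears once per incident edge
    for new in range(2, n_nodes):
        target = rng.choice(endpoints)   # degree-proportional choice
        edges.append((new, target))
        endpoints += [new, target]
    return edges

edges = preferential_attachment(1000)
degrees = Counter(node for edge in edges for node in edge)
print("max degree:", max(degrees.values()))
print("nodes with degree 1:", sum(1 for d in degrees.values() if d == 1))
```

Sampling from the list of edge endpoints is what makes well-connected nodes more likely to receive the next link.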
Network Theory
Possible applications for SECOs
Interdisciplinary research
Ecology and natural ecosystems
Ecology and natural ecosystems
Biodiversity of species
E.g. hosts – parasites / plants – pollinators
Mutual dependency and
functional redundancy
Disappearing species may be
compensated by others if there is
sufficient diversity in both layers.
Ecology and natural ecosystems
Diversity metrics
• species richness = number of different species in the ecosystem
• species evenness (entropy) = relative abundance of the
population of each species in the ecosystem
• Shannon diversity index (relative entropy) = specialisation of a
given species in relation to the species in the other level
• Simpson index = degree of concentration when individuals are
classified into species
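These metrics can be computed directly from abundance counts. The sketch below uses the common textbook formulas (Shannon entropy, Pielou evenness, Simpson concentration), which may differ from the exact variants meant above; the commit counts are hypothetical and the input is assumed to be non-empty.

```python
import math

def diversity(counts):
    """Richness, Shannon entropy, Pielou evenness and Simpson index
    for a mapping species -> number of individuals."""
    total = sum(counts.values())
    shares = [c / total for c in counts.values() if c > 0]
    richness = len(shares)
    shannon = -sum(p * math.log(p) for p in shares)
    # Evenness is undefined for a single species; 1.0 is used by convention here.
    evenness = shannon / math.log(richness) if richness > 1 else 1.0
    simpson = sum(p * p for p in shares)
    return richness, shannon, evenness, simpson

# Hypothetical data: commits per contributor in one project.
commits = {"ann": 50, "ben": 50}
print(diversity(commits))
```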
Software Ecosystems
Diversity in software ecosystems
Mutual dependency and
functional redundancy
Disappearance of projects or
contributors may be
compensated by others.
Software Ecosystems
Diversity
Are software project teams diverse?
– In terms of code ownership, types of activity,
gender balance, seniority, …
How does this diversity affect …
– defect-proneness?
– productivity?
– …
Software Ecosystems
Diversity
Success story of diversity measures:
Assess defect-proneness in software projects
• More focused developers introduce fewer defects.
• Modules receiving narrowly focused activity
are more likely to contain defects.
Posnett et al. Dual Ecological Measures of Focus in Software development.
ICSE 2013
Software Ecosystems
Gender Diversity
Effect of gender diversity on productivity?
Women underrepresented in programming
– industry: 16-18% female developers
– open source: ~10%
– social coding platforms:
• GitHub: ~9%
• StackOverflow: ~7%
Vasilescu et al. Gender and tenure diversity in GitHub teams. CHI 2015
A Data Set for Social Diversity Studies of GitHub Teams (MSR’15)
Software Ecosystems
Gender Diversity
Success story of diversity measures:
– Gender and tenure diversity are positive and
significant predictors of productivity
– Teams that are more balanced in terms of gender
and seniority have higher productivity rates
Vasilescu et al. Gender and tenure diversity in GitHub teams. CHI 2015
Interdisciplinary research
Survival Analysis
Statistical technique used in many disciplines to
analyze the time until the occurrence of an
event of interest
• Medicine
– Effect of treatment or medicine to cure disease
– Effect of disease on patient mortality
• Sociology
– Factors influencing marriage or divorce
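A standard tool for this kind of analysis is the Kaplan-Meier estimator, which handles subjects whose event has not yet been observed (censoring). The sketch below is a minimal hand-written version; the project lifetimes are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each time with an event.
    observed[i] is True if the event happened, False if censored."""
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival, curve, i = 1.0, [], 0
    while i < len(data):
        t = data[i][0]
        events = leaving = 0
        while i < len(data) and data[i][0] == t:
            events += data[i][1]         # True counts as 1
            leaving += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving               # events and censored both leave
    return curve

# Hypothetical data: project lifetimes in months; False = still active
# at the end of the observation period (censored).
durations = [6, 12, 12, 24, 36]
observed = [True, True, False, True, False]
print(kaplan_meier(durations, observed))
```

The survival probability drops only at times where an event occurred, while censored subjects merely shrink the population at risk.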
Interdisciplinary research
Survival Analysis
Interdisciplinary research
Survival Analysis
Success story:
OSS project survival
Factors positively
influencing survival:
#contributors
Project age
Basis for prediction
models
Samoladas et al. Survival analysis on the duration of open source projects.
IST 2010
SECO Research Challenges
continued…
Understanding SECOs
• How do different SECOs compare?
• How to face technical challenges?
– Big data
– Privacy versus reproducibility
Serebrenik et al. “Challenges in Software Ecosystems Research.” IWSECO-WEA 2015
Research Challenge
Comparing SECOs
• Each software ecosystem
– has specific habits, expectations, change policies
– uses specific tools
• Taking into account these differences is
important
– to support SECO maintenance and evolution
– to generalise research findings across SECOs
Bogart et al. “How to break an API: Cost negotiation and community values in
three software ecosystems.” FSE 2016
Decan et al. “On the topology of package dependency networks – A
comparison of three programming language ecosystems.” WEA 2016
Research Challenge
Big Data
The 4 Vs: Volume, Velocity, Variety, Veracity
Research Challenge
Privacy Reproducibility
Research Challenge
Privacy vs reproducibility
How to preserve privacy of individuals?
– EU 2016/679 regulation on the protection of natural
persons with regard to the processing of personal data
and on the free movement of such data
“The principles of data protection should apply to any information
concerning an identified or identifiable natural person.”
– Appropriate anonymisation and privacy-preserving data
mining techniques needed
Fung et al. Privacy-preserving data publishing: A survey of recent
developments. ACM Computing Surveys 2010
Malik et al. Privacy preserving data mining techniques: Current scenario and
future prospects. IC3T 2012
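One common building block is to replace developer identities by keyed hashes before publishing a dataset. Note that this only yields pseudonyms, which is weaker than full anonymisation, and it is not a technique taken from the surveys cited above. The key and the e-mail addresses in this sketch are hypothetical.

```python
import hashlib
import hmac

def pseudonymise(identity, secret_key):
    """Replace an identity by a keyed hash, so the same person always maps
    to the same pseudonym but cannot be looked up without the key."""
    digest = hmac.new(secret_key, identity.lower().encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]

# Hypothetical key and commit authors; the key must be kept out of the
# published dataset.
key = b"keep-this-secret"
authors = ["Alice@example.org", "bob@example.org", "alice@example.org"]
print([pseudonymise(a, key) for a in authors])
```

Because the mapping is stable, analyses that count activity per contributor can still be reproduced on the pseudonymised data.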
Research Challenge
Privacy vs reproducibility
• Increase/ensure reproducible research results
– Awareness is increasing
– Solutions are being put into place
– Big data problems remain an issue
• How to reconcile privacy with reproducibility?
Gonzalez-Barahona et al. On the reproducibility of empirical software
engineering studies based on data retrieved from development repositories.
Emp. Softw. Eng. 2012
Wrap-up
Research on SECO evolution requires
– A socio-technical view
– Mixed method research
– Interdisciplinary research
Many technical challenges need to be faced
Are you willing to take up the challenge?
ICSME 2016 keynote: An ecosystemic and socio-technical view on software maintenance and evolution

More Related Content

What's hot

ECOS: Ecological Studies of Open Source Software Ecosystems (@ CSMR-WCRE 2014...
ECOS: Ecological Studies of Open Source Software Ecosystems (@ CSMR-WCRE 2014...ECOS: Ecological Studies of Open Source Software Ecosystems (@ CSMR-WCRE 2014...
ECOS: Ecological Studies of Open Source Software Ecosystems (@ CSMR-WCRE 2014...Tom Mens
 
Survival analysis of database technologies in open source Java projects
Survival analysis of database technologies in open source Java projectsSurvival analysis of database technologies in open source Java projects
Survival analysis of database technologies in open source Java projectsTom Mens
 
Research software susainability
Research software susainabilityResearch software susainability
Research software susainabilityDaniel S. Katz
 
Kuchinsky_Cytoscape_BOSC2009
Kuchinsky_Cytoscape_BOSC2009Kuchinsky_Cytoscape_BOSC2009
Kuchinsky_Cytoscape_BOSC2009bosc
 
'Scikit-project': How open source is empowering open science – and vice versa
'Scikit-project': How open source is empowering open science – and vice versa'Scikit-project': How open source is empowering open science – and vice versa
'Scikit-project': How open source is empowering open science – and vice versaNathan Shammah
 
HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William Enck
HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William EnckHotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William Enck
HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William EnckTao Xie
 

What's hot (6)

ECOS: Ecological Studies of Open Source Software Ecosystems (@ CSMR-WCRE 2014...
ECOS: Ecological Studies of Open Source Software Ecosystems (@ CSMR-WCRE 2014...ECOS: Ecological Studies of Open Source Software Ecosystems (@ CSMR-WCRE 2014...
ECOS: Ecological Studies of Open Source Software Ecosystems (@ CSMR-WCRE 2014...
 
Survival analysis of database technologies in open source Java projects
Survival analysis of database technologies in open source Java projectsSurvival analysis of database technologies in open source Java projects
Survival analysis of database technologies in open source Java projects
 
Research software susainability
Research software susainabilityResearch software susainability
Research software susainability
 
Kuchinsky_Cytoscape_BOSC2009
Kuchinsky_Cytoscape_BOSC2009Kuchinsky_Cytoscape_BOSC2009
Kuchinsky_Cytoscape_BOSC2009
 
'Scikit-project': How open source is empowering open science – and vice versa
'Scikit-project': How open source is empowering open science – and vice versa'Scikit-project': How open source is empowering open science – and vice versa
'Scikit-project': How open source is empowering open science – and vice versa
 
HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William Enck
HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William EnckHotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William Enck
HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William Enck
 

Similar to ICSME 2016 keynote: An ecosystemic and socio-technical view on software maintenance and evolution

Social Debt Analytics for Improving the Management of Software Evolution Tasks
Social Debt Analytics for Improving the Management of Software Evolution TasksSocial Debt Analytics for Improving the Management of Software Evolution Tasks
Social Debt Analytics for Improving the Management of Software Evolution TasksFabio Palomba
 
Implementing Web Applications as Social Machines Composition: a Case Study
Implementing Web Applications as Social Machines Composition: a Case StudyImplementing Web Applications as Social Machines Composition: a Case Study
Implementing Web Applications as Social Machines Composition: a Case StudyKellyton Brito
 
Cultivating Sustainable Software For Research
Cultivating Sustainable Software For ResearchCultivating Sustainable Software For Research
Cultivating Sustainable Software For ResearchNeil Chue Hong
 
Is software engineering research addressing software engineering problems?
Is software engineering research addressing software engineering problems?Is software engineering research addressing software engineering problems?
Is software engineering research addressing software engineering problems?Gail Murphy
 
Visualization for Software Analytics
Visualization for Software AnalyticsVisualization for Software Analytics
Visualization for Software AnalyticsMargaret-Anne Storey
 
2015-11-11 research seminar
2015-11-11 research seminar2015-11-11 research seminar
2015-11-11 research seminarifi8106tlu
 
Automating environmental impact analyses to improve urban planning in New Yor...
Automating environmental impact analyses to improve urban planning in New Yor...Automating environmental impact analyses to improve urban planning in New Yor...
Automating environmental impact analyses to improve urban planning in New Yor...mysociety
 
Leveraging the Crowd: Supporting Newcomers to Build an OSS Community
Leveraging the Crowd: Supporting Newcomers to Build an OSS CommunityLeveraging the Crowd: Supporting Newcomers to Build an OSS Community
Leveraging the Crowd: Supporting Newcomers to Build an OSS CommunityMarco Aurelio Gerosa
 
AudrisMockus_MSR22.pdf
AudrisMockus_MSR22.pdfAudrisMockus_MSR22.pdf
AudrisMockus_MSR22.pdfTapajitDey1
 
Summer school bz_fp7research_20100708
Summer school bz_fp7research_20100708Summer school bz_fp7research_20100708
Summer school bz_fp7research_20100708Sandro D'Elia
 
Micro patterns in agile software
Micro patterns in agile softwareMicro patterns in agile software
Micro patterns in agile softwareUjjwal Joshi
 
Personal dashboards for individual learning and project awareness in social s...
Personal dashboards for individual learning and project awareness in social s...Personal dashboards for individual learning and project awareness in social s...
Personal dashboards for individual learning and project awareness in social s...Wolfgang Reinhardt
 
ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...
ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...
ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...IJITE
 
Non-software OSS projects
Non-software OSS projectsNon-software OSS projects
Non-software OSS projectsguest214454
 

Similar to ICSME 2016 keynote: An ecosystemic and socio-technical view on software maintenance and evolution (20)

Social Debt Analytics for Improving the Management of Software Evolution Tasks
Social Debt Analytics for Improving the Management of Software Evolution TasksSocial Debt Analytics for Improving the Management of Software Evolution Tasks
Social Debt Analytics for Improving the Management of Software Evolution Tasks
 
Implementing Web Applications as Social Machines Composition: a Case Study
Implementing Web Applications as Social Machines Composition: a Case StudyImplementing Web Applications as Social Machines Composition: a Case Study
Implementing Web Applications as Social Machines Composition: a Case Study
 
lecture 1-5.pdf
lecture 1-5.pdflecture 1-5.pdf
lecture 1-5.pdf
 
Cultivating Sustainable Software For Research
Cultivating Sustainable Software For ResearchCultivating Sustainable Software For Research
Cultivating Sustainable Software For Research
 
Is software engineering research addressing software engineering problems?
Is software engineering research addressing software engineering problems?Is software engineering research addressing software engineering problems?
Is software engineering research addressing software engineering problems?
 
Visualization for Software Analytics
Visualization for Software AnalyticsVisualization for Software Analytics
Visualization for Software Analytics
 
2015-11-11 research seminar
2015-11-11 research seminar2015-11-11 research seminar
2015-11-11 research seminar
 
Automating environmental impact analyses to improve urban planning in New Yor...
Automating environmental impact analyses to improve urban planning in New Yor...Automating environmental impact analyses to improve urban planning in New Yor...
Automating environmental impact analyses to improve urban planning in New Yor...
 
Cloudengine at SEDA 2011
Cloudengine at SEDA 2011Cloudengine at SEDA 2011
Cloudengine at SEDA 2011
 
Leveraging the Crowd: Supporting Newcomers to Build an OSS Community
Leveraging the Crowd: Supporting Newcomers to Build an OSS CommunityLeveraging the Crowd: Supporting Newcomers to Build an OSS Community
Leveraging the Crowd: Supporting Newcomers to Build an OSS Community
 
Nonsoftwareoss
NonsoftwareossNonsoftwareoss
Nonsoftwareoss
 
AudrisMockus_MSR22.pdf
AudrisMockus_MSR22.pdfAudrisMockus_MSR22.pdf
AudrisMockus_MSR22.pdf
 
Howison si2 keynote
Howison si2 keynoteHowison si2 keynote
Howison si2 keynote
 
Summer school bz_fp7research_20100708
Summer school bz_fp7research_20100708Summer school bz_fp7research_20100708
Summer school bz_fp7research_20100708
 
Micro patterns in agile software
Micro patterns in agile softwareMicro patterns in agile software
Micro patterns in agile software
 
V5 i3201613
V5 i3201613V5 i3201613
V5 i3201613
 
Lopez
LopezLopez
Lopez
 
Personal dashboards for individual learning and project awareness in social s...
Personal dashboards for individual learning and project awareness in social s...Personal dashboards for individual learning and project awareness in social s...
Personal dashboards for individual learning and project awareness in social s...
 
ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...
ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...
ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...
 
Non-software OSS projects
Non-software OSS projectsNon-software OSS projects
Non-software OSS projects
 

More from Tom Mens

How to be(come) a successful PhD student
How to be(come) a successful PhD studentHow to be(come) a successful PhD student
How to be(come) a successful PhD studentTom Mens
 
Recognising bot activity in collaborative software development
Recognising bot activity in collaborative software developmentRecognising bot activity in collaborative software development
Recognising bot activity in collaborative software developmentTom Mens
 
A Dataset of Bot and Human Activities in GitHub
A Dataset of Bot and Human Activities in GitHubA Dataset of Bot and Human Activities in GitHub
A Dataset of Bot and Human Activities in GitHubTom Mens
 
The (r)evolution of CI/CD on GitHub
 The (r)evolution of CI/CD on GitHub The (r)evolution of CI/CD on GitHub
The (r)evolution of CI/CD on GitHubTom Mens
 
Nurturing the Software Ecosystems of the Future
Nurturing the Software Ecosystems of the FutureNurturing the Software Ecosystems of the Future
Nurturing the Software Ecosystems of the FutureTom Mens
 
Comment programmer un robot en 30 minutes?
Comment programmer un robot en 30 minutes?Comment programmer un robot en 30 minutes?
Comment programmer un robot en 30 minutes?Tom Mens
 
On the rise and fall of CI services in GitHub
On the rise and fall of CI services in GitHubOn the rise and fall of CI services in GitHub
On the rise and fall of CI services in GitHubTom Mens
 
On backporting practices in package dependency networks
On backporting practices in package dependency networksOn backporting practices in package dependency networks
On backporting practices in package dependency networksTom Mens
 
Comparing semantic versioning practices in Cargo, npm, Packagist and Rubygems
Comparing semantic versioning practices in Cargo, npm, Packagist and RubygemsComparing semantic versioning practices in Cargo, npm, Packagist and Rubygems
Comparing semantic versioning practices in Cargo, npm, Packagist and RubygemsTom Mens
 
Lost in Zero Space
Lost in Zero SpaceLost in Zero Space
Lost in Zero SpaceTom Mens
 
Evaluating a bot detection model on git commit messages
ICSME 2016 keynote: An ecosystemic and socio-technical view on software maintenance and evolution

  • 1. An Ecosystemic and Socio-Technical View on Software Maintenance & Evolution Tom Mens @tom_mens COMPLEXYS Research Institute University of Mons, Belgium
  • 5. Research topics over time: OO design & refactoring (1994-2004); MDSE, model transformation (1998-2004); empirical research of software ecosystems (2010-now). The slide also shows the years 2004 and 2008.
  • 7. Research Context: 2012-2017 ongoing research project “Ecological Studies of Open Source Software Ecosystems”. Interdisciplinary research that uses ideas from biological ecology to understand and improve the evolution of software ecosystems. Definition: “A software ecosystem is a collection of software projects that are developed and evolve together in the same environment.” (Mircea Lungu, PhD, 2008)
  • 9. When things go wrong…
  • 10. CRAN Credits: http://www.designandanalytics.com/cran-gephi/ Package dependency graph > 9K active packages > 21K dependencies in April 2016
  • 11. CRAN • Increasing number of R packages hosted on GitHub “non-transparent nature of the CRAN submission / rejection process” “CRAN […] is revealing some limitations of the current design. One such problem is the general lack of dependency versioning in the infrastructure.” • Problems with breaking dependencies “It is more and more of a pain if the package I’m depending on breaks” “One recent example was the forced roll-back of the ggplot2 update to version 0.9.0, because the introduced changes caused several other packages to break.” Decan et al. “When GitHub Meets CRAN: An Analysis of Inter-Repository Package Dependency Problems.” SANER 2016
  • 12. JavaScript > 317K packages > 728K dependencies in June 2016
  • 13. JavaScript
      • Deliberate desire to distribute micropackages
      • Lots of dependencies on micropackages
      Example: isarray (150 direct, 77K transitive in-deps in Aug 2016)
          var toString = {}.toString;
          // Fall back to a toString check when Array.isArray is unavailable
          module.exports = Array.isArray || function (arr) {
            return toString.call(arr) == '[object Array]';
          };
      David Haney’s code blog, March 2016 http://www.haneycodes.net/npm-left-pad-have-we-forgotten-how-to-program/
  • 15. Example: leftpad
      • Package leftpad
          function leftpad (str, len, ch) {
            str = String(str);
            var i = -1;
            // Default padding character is a space
            if (!ch && ch !== 0) ch = ' ';
            len = len - str.length;
            while (++i < len) {
              str = ch + str;
            }
            return str;
          }
      • What happened?
        – Its developer unpublished all his modules from npm
          “This impacted many thousands of projects. [...] We began observing hundreds of failures per minute, as dependent projects – and their dependents, and their dependents... – all failed when requesting the now-unpublished package.”
          http://blog.npmjs.org/post/141577284765/kik-left-pad-and-npm
  • 16. Departure of a central contributor • All bug handling became concentrated in 1 contributor • Contributor suddenly left project, being dissatisfied • Lasting negative impact on bug handling performance Zanetti et al. “The rise and fall of a central contributor: Dynamics of social organization and performance in the Gentoo community.” CHASE 2013
  • 17. Strict policy and tools for ensuring backward compatibility • “Prime Directive: When evolving the Component API from release to release, do not break existing clients” Bogart et al. “How to break an API: Cost negotiation and community values in three software ecosystems.” FSE 2016
  • 18. May lead to stagnation and drive away developers – Coordination around synchronized yearly releases slows down development “If you have hip things, then you get people who create new APIs on top of that […] These things don’t happen on the Eclipse platform anymore.” “you have to be very patient and know who to talk with […] in order to get your patches accepted, and I think it’s very intimidating for some new people to come on.” Bogart et al. “How to break an API: Cost negotiation and community values in three software ecosystems.” FSE 2016
  • 19. Socio-Technical View • Software ecosystems suffer from problems because of technical factors, social reasons, or both. • A socio-technical view is therefore essential for software ecosystem evolution research.
  • 20. Socio-Technical View • Socio-technical analyses can benefit from mixed method research – Combine quantitative and qualitative methods into a single study • Empirical analysis of objective data • user surveys and interviews – Exploiting their complementarity increases confidence of the findings Johnson et al. Mixed methods research: A research paradigm whose time has come. Educational Researcher 33(7): 14–26, 2004
  • 21. Software Ecosystem (SECO) Research Challenges Understanding SECOs • How are SECOs structured? • What are their tools, habits, values, boundaries? • How do they emerge and evolve over time? • What are the mechanisms driving their dynamics? • How do different SECOs compare? • How to face technical challenges? Serebrenik et al. “Challenges in Software Ecosystems Research.” IWSECO-WEA 2015
  • 22. Software Ecosystem Research Challenges Supporting SECO communities • How can they be made more sustainable and resilient? • How can we predict their evolution? • How can we improve the SECO? – In terms of productivity, quality, diversity, maintainability, survival, popularity, … Serebrenik et al. “Challenges in Software Ecosystems Research.” IWSECO-WEA 2015
  • 23. Supporting SECOs Increasing resilience & sustainability Can the SECO • resist major disturbances? • return to a stable equilibrium after a major disturbance? Possible approach: • Estimate, predict and reduce risk of bus factor
  • 24. Bus factor Social view A specific activity is concentrated in a few persons. Examples: – A single person responsible for bug handling in Gentoo – Only one developer knows some part of the code
  • 25. Bus factor Technical view Too many software components depend on a single software component. – Makes components more brittle to future changes – npm leftpad example
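The technical bus factor on this slide can be made concrete by counting how many packages would be affected if a single component disappeared. The following is a minimal sketch, assuming a hypothetical toy dependency map; the package names and the `transitive_dependents` helper are illustrative and are not taken from npm or from the talk.

```python
from collections import defaultdict, deque


def transitive_dependents(dependencies, target):
    """Return every package that directly or transitively depends on target.

    `dependencies` maps a package name to the packages it depends on.
    """
    # Invert the graph: for each package, record who depends on it.
    reverse = defaultdict(set)
    for package, deps in dependencies.items():
        for dep in deps:
            reverse[dep].add(package)

    # Breadth-first search along the reversed edges.
    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for dependent in reverse[current]:
            if dependent not in seen and dependent != target:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Hypothetical toy ecosystem: package -> packages it depends on.
toy = {
    "app": ["framework"],
    "framework": ["string-utils"],
    "cli": ["string-utils"],
    "string-utils": ["pad"],
    "pad": [],
    "standalone": [],
}
impacted = transitive_dependents(toy, "pad")
print(sorted(impacted))
```

A package with many transitive dependents relative to the size of the ecosystem is a candidate single point of failure in the sense of this slide.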
  • 26. Bus factor Active area of research At least 4 GitHub projects compute (social) bus factor. Cosentino et al. “Assessing the bus factor of Git repositories.” SANER 2015 Avelino et al. “A novel approach for estimating truck factors.” ICPC 2016
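A social bus (or truck) factor can be approximated greedily: keep removing the contributor who covers the most files until more than half of the files have no remaining author. The sketch below is a deliberately simplified heuristic under stated assumptions; it is not the algorithm of either cited paper, and the `file_authors` input (file to set of authors) and the 50% threshold are illustrative choices.

```python
def truck_factor(file_authors, threshold=0.5):
    """Greedy estimate: contributors to remove until more than
    `threshold` of the files have no author left."""
    remaining = {f: set(authors) for f, authors in file_authors.items()}
    total = len(remaining)
    removed = []

    def orphaned():
        return sum(1 for authors in remaining.values() if not authors)

    while total and orphaned() / total <= threshold:
        # Count how many files each remaining contributor still covers.
        counts = {}
        for authors in remaining.values():
            for author in authors:
                counts[author] = counts.get(author, 0) + 1
        if not counts:
            break
        # Ties are broken alphabetically to keep the result deterministic.
        top = max(sorted(counts), key=counts.get)
        removed.append(top)
        for authors in remaining.values():
            authors.discard(top)
    return removed


# Hypothetical project: file -> contributors considered its authors.
project = {
    "core.py": {"alice"},
    "api.py": {"alice"},
    "db.py": {"alice", "bob"},
    "ui.py": {"carol"},
    "docs.md": {"bob", "carol"},
}
key_people = truck_factor(project)
print(len(key_people), key_people)
```

The length of the returned list is the estimate; a small value signals that knowledge is concentrated in few people.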
  • 27. Bus factor Experimental support on GitHub https://libraries.io/bus-factor
  • 29. Supporting SECOs Improving quality By increasing technical wealth through reducing technical debt “a concept in programming that reflects the extra development work that arises when code that is easy to implement in the short run is used instead of applying the best overall solution” (Ward Cunningham, 1992) http://legacycoderocks.libsyn.com/technical-wealth-with-declan-wheelan
  • 30. Implementation of SQALE model in SonarQube
  • 31. Supporting SECOs Improving quality Social view: Reducing social debt “Unforeseen project cost connected to sub-optimal organizational-social structures”
  • 32. Supporting SECOs Improving quality Reducing social debt by removing community smells – Organisational silo • High decoupling and lack of communication between tasks – Black cloud • lack of people able to bridge the knowledge and experience gap between distinct communities – Prima-donnas • Seemingly condescending and egotistical behaviour, irreceptiveness to collaboration – Sharing villainy • Lack of knowledge exchange incentives – Organisational skirmish • Misalignment of organisational cultures between distinct communities – …
  • 33. Interdisciplinary research “Many challenges we face are not solvable by people remaining in their single discipline silos”… www.newscientist.com/article/mg20928002-100-open-your-mind-to-interdisciplinary-research/
  • 34. Interdisciplinary research “bringing […] disciplines together in the long term is what provides the big, big breakthroughs”
  • 36. Social Network Analysis Social network centrality measures Degree Number of in- or outgoing dependencies of a node. Betweenness Quantifies number of times a node acts as a bridge along the shortest path between two other nodes. Closeness The more central a node, the lower its total distance from all other nodes. Eigenvector centrality and PageRank Measures the influence of a node in a network.
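Two of the measures above, degree and closeness, are simple enough to compute by hand. This is a minimal sketch on an undirected, connected toy graph using only the standard library; the graph and function names are assumptions for illustration, and closeness is taken here as (reachable nodes) / (sum of shortest-path distances). Betweenness, eigenvector centrality and PageRank are omitted for brevity; graph libraries such as NetworkX provide them.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path distances (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def degree_centrality(graph):
    """Number of neighbours of each node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def closeness_centrality(graph):
    """Reachable nodes divided by total distance to them (higher = more central)."""
    result = {}
    for node in graph:
        dist = bfs_distances(graph, node)
        total = sum(dist.values())
        result[node] = (len(dist) - 1) / total if total else 0.0
    return result


# Hypothetical collaboration graph: "hub" talks to a, b and c; c also talks to d.
graph = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub", "d"},
    "d": {"c"},
}
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
print(degree)
print(closeness)
```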
  • 37. Social Network Analysis Social network centrality measures
  • 38. Social Network Analysis Can be used to – detect social debt – identify social bus factor – predict software failures – … and many more …
  • 39. Social Network Analysis Social bus factor in Gentoo Linux – All bug handling became concentrated in one contributor – Measured by significant increase of centralization and performance. Zanetti et al. “The rise and fall of a central contributor: Dynamics of social organization and performance in the Gentoo community.” CHASE 2013
  • 40. Social Network Analysis Social bus factor in Gentoo Linux – Contributor suddenly left the project, being dissatisfied – Sentiment analysis showed correlation with negative emotions – Lasting negative impact on the bug handling performance of the community. Zanetti et al. “The rise and fall of a central contributor: Dynamics of social organization and performance in the Gentoo community.” CHASE 2013
  • 41. Social Network Analysis Use of SNA to better predict software failures – By combining program dependency information with social network information Bird et al. “Putting it All Together: Using Socio-Technical Networks to Predict Failures.” ISSRE 2009 Pinzger et al. “Can developer-module networks predict failures?” FSE 2008
  • 42. Mirroring hypothesis Conway’s law Software structure tends to mirror the organisational/social structure A.k.a. socio-technical congruence alignment between technical dependencies and social coordination in a project
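Socio-technical congruence can be quantified as the share of coordination needs that are actually met. The sketch below is one simplified ratio-style measure, an assumption for illustration rather than the definition used in the talk: two developers "need" to coordinate if they work on modules linked by a dependency, and congruence is the fraction of those pairs that actually communicate. All names and data are hypothetical.

```python
from itertools import product


def coordination_requirements(module_deps, module_devs):
    """Developer pairs that should coordinate because their modules depend on each other."""
    required = set()
    for module, deps in module_deps.items():
        for dep in deps:
            pairs = product(module_devs.get(module, ()), module_devs.get(dep, ()))
            for a, b in pairs:
                if a != b:
                    required.add(frozenset((a, b)))
    return required


def congruence(required, actual):
    """Fraction of required coordination pairs that actually communicate."""
    if not required:
        return 1.0
    return len(required & actual) / len(required)


# Hypothetical technical dependencies and module assignments.
module_deps = {"ui": ["api"], "api": ["db"], "db": []}
module_devs = {"ui": ["ann"], "api": ["ben"], "db": ["ben", "cy"]}

# Hypothetical observed communication (e.g. mailing-list replies).
actual = {frozenset(("ann", "ben")), frozenset(("ann", "cy"))}

required = coordination_requirements(module_deps, module_devs)
score = congruence(required, actual)
print(score)
```

A score near 1 would support the mirroring hypothesis for that project; a low score indicates technical dependencies that are not matched by social coordination.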
  • 43. Mirroring hypothesis Conway’s law • Evidence in favor: commercial “in-house” development • Evidence against: “community-based” development More modular software => emergent “complex network” structure? MacCormack et al. “Exploring the duality between product and organizational architectures: A test of the mirroring hypothesis.” Research Policy, 2012. Colfer et al. “The mirroring hypothesis: Theory, evidence and exceptions.” Harvard Business School, 2010.
  • 46. Interdisciplinary research Complex Systems “A new approach to science that investigates how relationships between parts give rise to the collective behaviors of a system and how the system interacts and forms relationships with its environment.” Emergence: process whereby larger entities, patterns, and regularities arise through interactions among smaller or simpler entities that themselves do not exhibit such properties.
  • 47. Complexity Theory Network Theory Citation from Mitchell’s book: “network thinking is providing novel ways to think about difficult problems such as how to do efficient search on the Web, […] how to manage large organisations, how to preserve ecosystems, […] and, more generally, what kind of resilience and vulnerabilities are intrinsic to natural, social, and technological networks, and how to exploit and protect such systems.”
  • 48. Complexity Theory Network Theory Some characteristics of complex networks: Small-world property • Low average path length between any two nodes • Highly-clustered components linked through hubs Skewed distributions (power law behaviour) • Few nodes with very high in-degree (resp. out-degree), many nodes with very small in-degree (resp. out)
  • 49. Complexity Theory Network Theory Some characteristics of complex networks: Scale-freeness • Observed degree distribution is very similar regardless of the scale of the observation Scale-free networks are resilient • Robust to deletion of random (non-hub) nodes • vulnerable to the deletion of hubs
  • 50. Complexity Theory Network Theory Examples of complex networks exhibiting these characteristics – World-Wide Web – (Technical) software dependency graphs – Social networks (e.g. Facebook) – (Socio-technical) software ecosystems
  • 51. Complexity Theory Network Theory Examples of software system dependency networks
  • 52. Network Theory Possible applications for SECOs • Provide prediction/forecasting models – of how SECOs emerge – of how SECOs grow/evolve • Estimate the resilience and sustainability of SECOs after major disturbances • Assess risk of deleting hub nodes → bus factor!
  • 53. Network Theory Possible applications for SECOs How do SECOs emerge and grow? A popular model is preferential attachment Over time, nodes with higher degree receive more links than nodes with lower degree. Extensions of this model have been proposed to simulate the growth of complex software systems By mimicking the principle of coupling & cohesion Barabasi et al. Emergence of Scaling in Random Networks. Science 286, 1999 Li et al. Multi-Level Formation of Complex Software Systems. Entropy 18(178), 2016
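Preferential attachment is straightforward to simulate: each new node links to existing nodes with probability proportional to their current degree. This is a minimal sketch of the basic model only, not of the multi-level extension cited on the slide; the function names, parameters and seed are illustrative assumptions.

```python
import random


def preferential_attachment(n_nodes, links_per_node, seed=0):
    """Grow a graph in which new nodes prefer to attach to high-degree nodes."""
    rng = random.Random(seed)
    # Start from a small fully connected core.
    core = links_per_node + 1
    edges = [(i, j) for i in range(core) for j in range(i)]
    # Each node appears in `endpoints` once per incident edge, so a uniform
    # draw from this list selects nodes proportionally to their degree.
    endpoints = [node for edge in edges for node in edge]
    for new in range(core, n_nodes):
        targets = set()
        while len(targets) < links_per_node:
            targets.add(rng.choice(endpoints))
        for target in targets:
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges


def degrees(edges):
    """Degree of every node that appears in the edge list."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg


edges = preferential_attachment(n_nodes=200, links_per_node=2)
deg = degrees(edges)
# Print the five highest degrees; early nodes tend to become hubs.
print(sorted(deg.values(), reverse=True)[:5])
```

Plotting the degree distribution of a larger run on log-log axes is the usual way to check for the skewed, power-law-like shape described in the earlier slides.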
  • 56. Ecology and natural ecosystems Biodiversity of species E.g. hosts – parasites / plants – pollinators Mutual dependency and functional redundancy Disappearing species may be compensated by others if there is sufficient diversity in both layers.
  • 57. Ecology and natural ecosystems Diversity metrics • species richness = number of different species in the ecosystem • species evenness (entropy) = relative abundance of the population of each species in the ecosystem • Shannon diversity index (relative entropy) = specialisation of a given species in relation to the species in the other level • Simpson index = degree of concentration when individuals are classified into species
  • 58. Software Ecosystems Diversity in software ecosystems Mutual dependency and functional redundancy Disappearance of projects or contributors may be compensated by others.
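These metrics carry over to software when "species" are, for example, developers and "individuals" are their commits to a module. The sketch below uses common textbook formulations (Shannon entropy with natural logarithm, Pielou's evenness, Simpson's concentration index), which may differ from the exact variants intended on the slide; the commit counts are hypothetical.

```python
import math


def richness(counts):
    """Number of species (here: contributors) with at least one individual."""
    return sum(1 for c in counts.values() if c > 0)


def shannon_index(counts):
    """Shannon entropy H = -sum(p_i * ln p_i) over the relative abundances."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values() if c > 0)


def evenness(counts):
    """Pielou's evenness H / ln(S); defined as 0.0 when there is a single species."""
    species = richness(counts)
    return shannon_index(counts) / math.log(species) if species > 1 else 0.0


def simpson_index(counts):
    """Simpson's concentration index sum(p_i^2); 1.0 means total concentration."""
    total = sum(counts.values())
    return sum((c / total) ** 2 for c in counts.values())


# Hypothetical commits per developer on two modules.
balanced = {"ann": 5, "ben": 5}
concentrated = {"ann": 10}

for name, counts in (("balanced", balanced), ("concentrated", concentrated)):
    print(name, richness(counts), shannon_index(counts),
          evenness(counts), simpson_index(counts))
```

A module whose activity is dominated by one developer has a Simpson index close to 1 and low evenness, which connects these measures to the bus factor discussed earlier.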
  • 59. Software Ecosystems Diversity Are software project teams diverse? – In terms of code ownership, types of activity, gender balance, seniority, … How does this diversity affect … – defect-proneness? – productivity? – …
  • 60. Software Ecosystems Diversity Success story of diversity measures: Assess defect-proneness in software projects • More focused developers introduce fewer defects. • Modules receiving narrowly focused activity are more likely to contain defects. Posnett et al. Dual Ecological Measures of Focus in Software development. ICSE 2013
  • 61. Software Ecosystems Gender Diversity Effect of gender diversity on productivity? Women underrepresented in programming – industry: 16-18% female developers – open source: ~10% – social coding platforms: • GitHub: ~9% • StackOverflow: ~7% Vasilescu et al. Gender and tenure diversity in GitHub teams. CHI 2015 A Data Set for Social Diversity Studies of GitHub Teams (MSR’15)
  • 62. Software Ecosystems Gender Diversity Success story of diversity measures: – Gender and tenure diversity are positive and significant predictors of productivity – Teams that are more balanced in terms of gender and seniority have higher productivity rates Vasilescu et al. Gender and tenure diversity in GitHub teams. CHI 2015
  • 63. Interdisciplinary research Survival Analysis Statistical technique used in many disciplines to analyze the time until the occurrence of an event of interest • Medicine – Effect of treatment or medicine to cure disease – Effect of disease on patient mortality • Sociology – Factors influencing marriage or divorce
  • 65. Interdisciplinary research Survival Analysis Success story: OSS project survival Factors positively influencing survival: #contributors Project age Basis for prediction models Samoladas et al. Survival analysis on the duration of open source projects. IST 2010
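For project survival, the event of interest could be a project becoming inactive, and projects still active at the end of the observation window are censored. The following is a minimal sketch of the Kaplan-Meier estimator, a standard survival analysis technique; whether the cited study used exactly this estimator is not stated here, and the durations below are made-up illustrative data.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] for each time with at least one event.

    durations: observation time per project (e.g. months).
    observed:  True if the project was seen to become inactive at that time,
               False if it was still active when observation stopped (censored).
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        time = data[i][0]
        events = 0
        leaving = 0
        # Group all projects that share the same observation time.
        while i < len(data) and data[i][0] == time:
            if data[i][1]:
                events += 1
            leaving += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((time, survival))
        # Both events and censored projects leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical data: five projects, durations in months.
durations = [6, 12, 12, 24, 30]
observed = [True, True, False, True, False]
curve = kaplan_meier(durations, observed)
for time, probability in curve:
    print(time, round(probability, 3))
```

Comparing such curves between groups (for instance, projects with few versus many contributors) is the usual first step before fitting a prediction model.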
  • 66. SECO Research Challenges continued… Understanding SECOs • How do different SECOs compare? • How to face technical challenges? – Big data – Privacy versus reproducibility Serebrenik et al. “Challenges in Software Ecosystems Research.” IWSECO-WEA 2015
  • 67. Research Challenge Comparing SECOs • Each software ecosystem – has specific habits, expectations, change policies – uses specific tools • Taking into account these differences is important – to support SECO maintenance and evolution – to generalise research findings across SECOs Bogart et al. “How to break an API: Cost negotiation and community values in three software ecosystems.” FSE 2016 Decan et al. “On the topology of package dependency networks – A comparison of three programming language ecosystems.” WEA 2016
  • 68. Research Challenge Big Data Volume Velocity Variety Veracity 4V
  • 70. Research Challenge Privacy vs reproducibility How to preserve privacy of individuals? – EU 2016/679 regulation on the protection of natural persons with regard to the processing of personal data and on the free movement of such data “The principles of data protection should apply to any information concerning an identified or identifiable natural person.” – Appropriate anonymisation and privacy-preserving data mining techniques needed Fung et al. Privacy-preserving data publishing: A survey of recent developments. ACM Computing Surveys 2010 Malik et al. Privacy preserving data mining techniques: Current scenario and future prospects. IC3T 2012
  • 71. Research Challenge Privacy vs reproducibility • Increase/ensure reproducible research results – Awareness is increasing – Solutions are being put into place – Big data problems remain an issue • How to reconcile privacy with reproducibility? Gonzalez-Barahona et al. On the reproducibility of empirical software engineering studies based on data retrieved from development repositories. Emp. Softw. Eng. 2012
  • 72. Wrap-up Research on SECO evolution requires – A socio-technical view – Mixed method research – Interdisciplinary research Many technical challenges need to be faced Are you willing to take up the challenge?

Editor's Notes

  1. Put a picture of Belgium (comparing its size with the rest of Europe or the rest of the world), maybe with some nice picture of the important characteristics of Belgium (beer, frieten, mosselen, wafels, chocolade; kuifje, Magritte, …). Locate Gent (place where I live), Aalst (where I was born), Brussels (where I studied), Mons and Charleroi (where I work) on this map and indicate the period (Brussels: 1989-1993 studies; 1993-1999 PhD; 2000-2003 postdoc; 2003-2016 prof at UMONS).
  2. Put a picture of Belgium (comparing its size with the rest of Europe or the rest of the world), maybe with some nice picture of the important characteristics of Belgium (beer, frieten, mosselen, wafels, chocolade; kuifje, Magritte, …). Locate Brussels and Mons on this map and indicate the period (Brussels: 1989-1993 studies; 1993-1999 PhD; 2000-2003 postdoc; 2003-2016 prof at UMONS).
  3. Put a timeline of my life indicating the main milestones and compare them with important milestones in CS and SE: 1970 birth (mention twin brother); 1988 studies at VUB; 1993 PhD studies at VUB; 1999 PhD obtained, postdoc started; 2003 position at UMONS; 2016 now.
  4. Talk about main research achievements/topics studied during my career: 1994 – 2004 foundations of OO programming, OO design patterns, refactoring 1998 – now : model-driven software engineering: software modeling (UML), graph transformation, model transformation, model refactoring, model-inconsistency management 2010- 2016 software ecosystems, empirical software engineering
  5. Add a slide with all my research collaborators over time. People whose PhD I have (co-)directed: Ragnhild Van Der Straeten (2005), Werner Van Belle (2003), Tom Tourwé (2002), Mathieu Goeminne (2013), Jorge Pinna Puissant (2012), Romuald Deshayes (2015), Maelick Claes (2016). People I have collaborated with: Alexander Serebrenik (2014), Bogdan Vasilescu (2014), Serge Demeyer (2002, 2005, 2014), Ekaterina Pek (2014), Hans Vandierendonck (2011-2012), Anthony Cleve (2010, 2014), Xavier Blanc (2009), Gabriele Taentzer (2005, 2007), Amnon Eden (2005, 2006), Pieter Van Gorp (2003, 2006), Dirk Janssens (2002, 2003).
  6. All of these ecosystems are quite large, containing (tens of) thousands of different software components, with many interdependencies, an evolution history of many years, and a large and active community of contributors. Studying such software ecosystems can be quite challenging. Developing and maintaining components within these ecosystems can also be quite challenging.
  7. CRAN only supports sequential version numbering, causing some developers to fork their own packages (e.g., ‘reshape’ to ‘reshape2’).
  8. For JavaScript, we chose its NPM package manager (see www.npmjs.com).
  9. isarray is downloaded >6M times a week, >25M times a month! “In a lot of JavaScript environments, space is at a premium. [...] Several larger libraries like Underscore (and Lodash) have actually intentionally split themselves into sub-modules because people usually only ever load them to use a single merge function.”
  10. “The package leftpad essentially contains a few lines of source code but has thousands of dependent projects, including Node and Babel. When its developer decided to unpublish all his modules for npm, this had important consequences, ‘almost breaking the internet’.”
  11. What happened? Everything started with a disagreement over a module named “kik”. Its developer unpublished *all* of his 272 modules from npm, including leftpad. This caused thousands of dependent projects to break, including Node and Babel. The community stepped in within minutes to fix the problem. This required the npm managers to go against their own policy by un-unpublishing the module.
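To make the scale of such a breakage concrete, the following minimal sketch computes every package that depends, directly or indirectly, on a given package. The function name `transitive_dependents` and the toy dependency graph are hypothetical and only illustrate the idea; they are not taken from npm data.

```python
from collections import defaultdict, deque

def transitive_dependents(dependencies, package):
    """Return every package that depends on `package`, directly or indirectly.

    `dependencies` maps a package name to the packages it requires.
    """
    # Invert the graph: for each package, record who requires it.
    required_by = defaultdict(set)
    for pkg, deps in dependencies.items():
        for dep in deps:
            required_by[dep].add(pkg)

    # Breadth-first walk over the inverted edges.
    seen = set()
    queue = deque([package])
    while queue:
        current = queue.popleft()
        for dependent in required_by[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen

# Hypothetical toy graph; names and edges are illustrative only.
deps = {
    "app": ["build-tool", "web-lib"],
    "build-tool": ["string-utils"],
    "string-utils": ["pad"],
    "web-lib": [],
    "pad": [],
}
print(sorted(transitive_dependents(deps, "pad")))
```

Removing `pad` from this toy registry would affect every package in the returned set, even though only one package requires it directly.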
  12. Gentoo, one of the open source Linux distributions, is another example of a popular ecosystem that has witnessed important problems during its evolution history. Zanetti et al. studied the social organisation structure, by analyzing the collaboration structure and dynamics of Gentoo’s bug tracking system over a period of ten years [16]. An increasing centralisation towards a single central contributor, followed by an unexpected departure of this contributor, caused a major disruption in the community’s bug handling performance. This case study reveals that, next to analyzing the technical aspects of an ecosystem (such as its package dependencies), it is equally important to address the social aspects. M. S. Zanetti, I. Scholtes, C. J. Tessone, and F. Schweitzer, “The rise and fall of a central contributor: Dynamics of social organization and performance in the Gentoo community,” in Int’l Workshop on Cooperative and Human Aspects of Software Engineering, May 2013, pp. 49–56. D. Garcia et al. The Role of Emotions in Contributors Activity: A Case Study of the Gentoo Community (SocialCom 2013)
  13. C. Bogart, C. Kästner, J. Herbsleb, and F. Thung, “How to break an API: Cost negotiation and community values in three software ecosystems,” in Int’l Symp. Foundations of Software Engineering, 2016.
  14. C. Bogart, C. Kästner, J. Herbsleb, and F. Thung, “How to break an API: Cost negotiation and community values in three software ecosystems,” in Int’l Symp. Foundations of Software Engineering, 2016.
  15. ADD INFO ABOUT PYPI SIZE
  16. Software ecosystems are rarely studied as such. The social aspects of these ecosystems are studied very little. To understand the behaviour of an ecosystem, one must study the social behaviours that drive its evolution. The picture only shows the relation between contributors and projects, but obviously there are also communication relations directly between the contributors, and dependency relations between the projects.
  17. Mixed methods research is defined as “the class of research where the researcher combines quantitative and qualitative research methods or techniques into a single study”
  18. Mechanisms driving the dynamics: Which mechanisms are favorable for their quality/evolution/popularity/survival? How do SECOs compare? How can one generalise findings of one SECO to other SECOs? Which aspects of a SECO are (domain-)specific and which are generic? - Technical challenges: will be explained later on, if enough time available.
  19. How can we better predict software failures? How can we reduce the number of bugs? Need for tool support… (prediction models, dashboards, …)
  20. The bus factor is the number of key contributors who would need to be incapacitated (“get run over by a bus”) to make a project unable to proceed.
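The definition above can be approximated from version control data. The sketch below uses one common heuristic, assumed here for illustration: the smallest number of top contributors who together account for more than half of all commits. The function name `bus_factor`, the 50% threshold and the commit data are assumptions; the tools listed in these notes may compute it differently.

```python
from collections import Counter

def bus_factor(commit_authors, threshold=0.5):
    """Smallest number of top contributors whose commits exceed
    `threshold` of all commits (a heuristic, assumed for illustration)."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return 0
    covered = 0
    # Walk contributors from most to least active.
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered > threshold * total:
            return rank
    return len(counts)

# Hypothetical commit log: one author name per commit.
commits = ["alice"] * 6 + ["bob"] * 3 + ["carol"] * 1
print(bus_factor(commits))
```

A low value signals that project knowledge is concentrated in very few people.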
  21. https://github.com/zats/github_bus_factor https://github.com/yamikuronue/BusFactor https://github.com/g-k/bus-factor https://github.com/zgrossbart/busfactor
  22. Experimental support on GitHub https://libraries.io/bus-factor
  23. While technical debt has been studied for software systems, it of course makes sense to extend it to software ECOSYSTEMS as well. Talk about “bad smells” as possible indicators of technical debt. Quote by Ward Cunningham in 1992: “Shipping first-time code is like going into debt. A little debt speeds development so long as it is paid back promptly with a rewrite. Objects make the cost of this transaction tolerable. The danger occurs when the debt is not repaid. Every minute spent on not-quite-right code counts as interest on that debt. Entire engineering organizations can be brought to a stand-still under the debt load of an unconsolidated implementation, object-oriented or otherwise.” The concept does not mean that debt should never be incurred. Just as leverage can help a company when used correctly, a quick solution can mean a faster time to market in software development. In addition, technical debt is not just poor code. Bad code is bad code, and technical debt can result from the work of good programmers under unrealistic project constraints.
  24. Use the “community smell” of “organisational silo” as a transition to the next slide, to explain that members of the “research community” should not stay within their own silo either (their own specific research discipline), but should communicate and collaborate with (and learn from) researchers from other disciplines.
  25. Challenge: More Interdisciplinary research Talk about borrowing ideas from other disciplines Examples: (analogy with research inspired from social network science that has managed to provide interesting new results in …) draw inspiration from biology => diversity metrics draw inspiration from medicine => survival analysis studies
  26. Challenge: More Interdisciplinary research Talk about borrowing ideas from other disciplines Examples: (analogy with research inspired from social network science that has managed to provide interesting new results in …) draw inspiration from biology => diversity metrics draw inspiration from medicine => survival analysis studies
  27. Talk about borrowing ideas from other disciplines. Example: social network analysis. Study by Pinzger et al., “Can developer-module networks predict failures?”: a study on Windows Vista using network centrality measures. Study by Bird et al. (Int. Symp. Software Reliability Engineering; 117 citations in Google Scholar): the method performs as well or better and is able to more accurately identify those software components that have more post-release failures, with precision and recall rates as high as 85%. It uses so-called “network centrality measures” such as betweenness centrality, closeness centrality, eigenvector centrality and degree centrality. Preliminary study on Windows Vista and Eclipse.
  28. Talk about borrowing ideas from other disciplines. Example: social network analysis. Study by Pinzger et al., “Can developer-module networks predict failures?”: a study on Windows Vista using network centrality measures. Study by Bird et al. (Int. Symp. Software Reliability Engineering; 117 citations in Google Scholar): the method performs as well or better and is able to more accurately identify those software components that have more post-release failures, with precision and recall rates as high as 85%. It uses so-called “network centrality measures” such as betweenness centrality, closeness centrality, eigenvector centrality and degree centrality. Preliminary study on Windows Vista and Eclipse.
  29. Talk about borrowing ideas from other disciplines. Example: social network analysis. Study by Pinzger et al., “Can developer-module networks predict failures?”: a study on Windows Vista using network centrality measures. Study by Bird et al. (Int. Symp. Software Reliability Engineering; 117 citations in Google Scholar): the method performs as well or better and is able to more accurately identify those software components that have more post-release failures, with precision and recall rates as high as 85%. It uses so-called “network centrality measures” such as betweenness centrality, closeness centrality, eigenvector centrality and degree centrality. Preliminary study on Windows Vista and Eclipse.
  30. Tamburri claims that many of his “community smells”, that are indicators of social debt, could be detectable using social network analysis, i.e. by detecting specific patterns in the social network graph. Social bus factor is probably related to a combination of high betweenness and low degree centrality.
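The combination of high betweenness and low degree centrality can be illustrated on a small collaboration graph. The sketch below implements degree centrality and unnormalised betweenness centrality (using Brandes' algorithm for unweighted graphs) in plain Python. The graph, with one contributor bridging two otherwise separate teams, is a hypothetical example and not data from the studies cited in these notes.

```python
from collections import deque

def degree_centrality(graph):
    """Fraction of the other nodes each node is directly connected to
    (assumes the graph has at least two nodes)."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}

def betweenness_centrality(graph):
    """Unnormalised betweenness for an undirected, unweighted graph."""
    score = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search counting shortest paths from `source`.
        order = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}   # number of shortest paths
        dist = {v: -1 for v in graph}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies in reverse BFS order.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}

# Hypothetical network: two tightly knit teams joined by one contributor.
graph = {
    "a1": {"a2", "a3", "bridge"},
    "a2": {"a1", "a3"},
    "a3": {"a1", "a2"},
    "bridge": {"a1", "b1"},
    "b1": {"b2", "b3", "bridge"},
    "b2": {"b1", "b3"},
    "b3": {"b1", "b2"},
}
betweenness = betweenness_centrality(graph)
degree = degree_centrality(graph)
for node in sorted(graph):
    print(node, round(degree[node], 2), betweenness[node])
```

In this toy graph the bridging contributor has few direct connections but lies on every shortest path between the two teams, which is the pattern the note associates with a social bus factor.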
  31. Sentiment analysis was done based on messages sent to the gentoo-dev mailing list
  32. Sentiment analysis was done based on messages sent to the gentoo-dev mailing list
  33. Study by Pinzger et al. on Windows Vista, using network centrality measures. Study by Bird et al. (Int. Symp. Software Reliability Engineering): the method performs as well or better and is able to more accurately identify those software components that have more post-release failures, with precision and recall rates as high as 85%. It uses so-called “network centrality measures” such as betweenness centrality, closeness centrality, eigenvector centrality and degree centrality. Preliminary study on Windows Vista and Eclipse.
  34. See http://blog.graphcommons.com/analyzing-the-npm-dependency-network/
  35. M. Cataldo, J. D. Herbsleb, and K. M. Carley. Socio-technical congruence: A framework for assessing the impact of technical and work dependencies on software development productivity. In Int’l Symp. Empirical Software Engineering and Measurement, pages 2–11. ACM , 2008.
  36. Further counter-evidence can be found in the paper “Socio-Technical Congruence in the Ruby Ecosystem” by Syeed et al. in OpenSym 2014 (based on an analysis of the Ruby software ecosystem).
  37. The behavior of a complex system is bigger than the sum of its parts: the behaviour of the system as a whole cannot be understood by looking at the interaction between the individual entities that compose it.
  38. The concept of a small world was originally observed in the late 1960’s by the social psychologist Stanley Milgram. - S. Milgram, “The Small World Problem,” Psychology Today, 2, 1967 pp. 60–67. - J. Travers and S. Milgram, “An Experimental Study of the Small World Problem,” Sociometry, 32(4), 1969 pp. 425–443.
  39. The concept of a small world was originally observed in the late 1960’s by the social psychologist Stanley Milgram. - S. Milgram, “The Small World Problem,” Psychology Today, 2, 1967 pp. 60–67. - J. Travers and S. Milgram, “An Experimental Study of the Small World Problem,” Sociometry, 32(4), 1969 pp. 425–443.
  40. Robustness to deletion, in the sense that the deletion does not change the structural/topological properties of the network, which remains scale-free and small-world and keeps its skewed distribution after the deletion. The vulnerability to deletion of hub nodes could easily be linked to the aforementioned notions of technical and social bus factors. Hub nodes have a considerably higher bus factor, since the ecosystem/network is much more vulnerable to their deletion. This implies that managers of the (eco)system should take care to “protect” these hub nodes from getting deleted.
  41. Small-World Properties of Facebook Group Networks. By Jason Wohlgemuth and Mihaela Teodora Matache. In Complex Systems, 23 © 2014 Complex Systems Publications, Inc. See http://www.complex-systems.com/pdf/23-3-1.pdf
  42. Several models have been proposed that lead to scale-free networks. A popular model is “preferential attachment”. The idea of preferential attachment was proposed in 1999 by Barabási and Albert: A.-L. Barabási and R. Albert, “Emergence of scaling in random networks,” Science 286, 1999, pp. 509-512. Li et al. [8] proposed an extended model of preferential attachment adapted to software systems, and used it to simulate growth models that mimic the well-known design principle of low coupling and high cohesion. If software developers strive towards this principle, they will naturally obtain systems containing highly cohesive components that are loosely coupled to each other, reminiscent of the hubs and clusters structure presented in Section 3. While this growth mechanism seems plausible, other mechanisms have been proposed. It remains an open question which mechanism actually causes the scale-free networks we can observe. Preferential attachment has been used in software evolution research by several authors. Valverde et al. [20] suggest that the emergence of scaling arises from a logical optimisation process. Myers et al. [15] proposed the process of refactoring to improve the structure of existing code as a possible explanation for the emergence of scale-free networks in software. Inspired by Darwin’s ideas of evolutionary adaptation, Venkatasubramanian et al. proposed a generic model based on network parameters such as efficiency, robustness, cost, and environmental selection pressure [21]. Using a genetic algorithm, their model was able to generate different types of network structures, depending on the chosen parameters.
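As an illustration of the basic mechanism, the sketch below grows a network by plain preferential attachment in the style of Barabási and Albert: each new node links to existing nodes with probability proportional to their current degree. This is not the extended model of Li et al.; the function name, parameters and network size are assumptions chosen for illustration.

```python
import random
from collections import Counter

def preferential_attachment(n_nodes, edges_per_node=2, seed=0):
    """Grow an undirected graph in which each new node links to existing
    nodes with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Start from a small fully connected core.
    core = edges_per_node + 1
    edges = [(i, j) for i in range(core) for j in range(i)]
    # Each node appears in `endpoints` once per incident edge, so a uniform
    # draw from this list is a degree-proportional draw.
    endpoints = [v for edge in edges for v in edge]
    for new in range(core, n_nodes):
        targets = set()
        while len(targets) < edges_per_node:
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges

edges = preferential_attachment(2000)
degree = Counter(v for edge in edges for v in edge)
print("median degree:", sorted(degree.values())[len(degree) // 2])
print("max degree:", max(degree.values()))
```

Most nodes end up with a degree close to `edges_per_node`, while a few early nodes accumulate many links and become hubs, which gives the skewed degree distribution typical of scale-free networks.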
  43. Obtained topological network structures for varying values of the “coupling ratio” Λ, representing the likelihood that a new edge connects two nodes in different modules when new nodes are added to the existing network. A larger value of Λ means a larger proportion of edges between nodes in different modules, i.e. nodes are more likely to connect to nodes in other modules. Conversely, a smaller value of Λ means a smaller proportion of edges between nodes in different modules, i.e. nodes are more likely to connect to nodes in the same module. Lower values of the coupling ratio (e.g. 0.1 for (a)) lead to a more modular network structure.
  44. Talk about borrowing ideas from other disciplines. Example: social network analysis. Study by Pinzger et al., “Can developer-module networks predict failures?”: a study on Windows Vista using network centrality measures. Study by Bird et al. (Int. Symp. Software Reliability Engineering; 117 citations in Google Scholar): the method performs as well or better and is able to more accurately identify those software components that have more post-release failures, with precision and recall rates as high as 85%. It uses so-called “network centrality measures” such as betweenness centrality, closeness centrality, eigenvector centrality and degree centrality. Preliminary study on Windows Vista and Eclipse.
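The coupling ratio in the note is a parameter of a growth model, but its observable counterpart is easy to measure on an existing network: the share of edges that cross module boundaries. The sketch below computes that share; the function name `inter_module_edge_ratio` and the example network are hypothetical.

```python
def inter_module_edge_ratio(edges, module_of):
    """Share of edges whose two endpoints lie in different modules."""
    if not edges:
        return 0.0
    crossing = sum(1 for u, v in edges if module_of[u] != module_of[v])
    return crossing / len(edges)

# Hypothetical network: two modules of three nodes, joined by a single edge.
module_of = {"a": "A", "b": "A", "c": "A", "d": "B", "e": "B", "f": "B"}
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("c", "d")]
print(inter_module_edge_ratio(edges, module_of))
```

A value close to 0 corresponds to the modular structures described for low coupling ratios; a value close to 1 means most edges connect different modules.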
  45. Ecological systems include a large number of bi-partite relationships, such as host-parasitoid or plant-pollinator [15]. These relationships are also modeled as bi-partite graphs in which nodes represent species and edges represent a specific kind of relationship. For example, figure 3.3 represents bee and flower species, as well as the pollinating relationships. Ecological bi-partite networks are very robust to perturbations because of the large diversity [58] of species and the functional redundancy of species in the network [41]. In many cases, if a flower species disappears, most bee species that relied on it for pollen can find pollen in other species. The diversity of flower species increases the chances that the extinction of a particular species will not lead to the extinction of the others, while the functional redundancy increases the chances that bees can find similar pollens in other flowers. Mutual dependency and functional redundancy: the disappearance of one species may be compensated by other species if there is sufficient diversity in both layers. This can be used to study the resistance and resilience of natural ecosystems by studying the diversity of species.
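The effect of functional redundancy can be shown with a tiny bi-partite network. In the sketch below, a bee species survives the extinction of some flower species as long as at least one of the flowers it visits remains. The species names and the function `surviving_bees` are hypothetical and serve only to illustrate the argument.

```python
def surviving_bees(pollinates, extinct_flowers):
    """Bee species that still have at least one flower species left."""
    extinct = set(extinct_flowers)
    return {bee for bee, flowers in pollinates.items()
            if set(flowers) - extinct}

# Hypothetical plant-pollinator network: bee species -> flower species visited.
pollinates = {
    "bee_a": {"clover", "thistle", "poppy"},
    "bee_b": {"clover", "poppy"},
    "bee_c": {"thistle"},
}
print(sorted(surviving_bees(pollinates, ["clover"])))
print(sorted(surviving_bees(pollinates, ["thistle"])))
```

Losing clover harms no bee species in this toy network because every bee that visits it has an alternative, whereas losing thistle removes the only specialist. The same reasoning applies to contributors and projects in the ecosystem analogy.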
  46. Based on species analogy Contributors are species that thrive in their environment of projects Projects are species that thrive in their environment of contributors (human resources)
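One way to quantify diversity under this analogy is the Shannon index from ecology, applied here to the distribution of commits over contributors in a project. The notes do not name a specific diversity metric, so the choice of the Shannon index, the function name and the data are assumptions for illustration.

```python
import math
from collections import Counter

def shannon_diversity(observations):
    """Shannon index H = -sum(p_i * ln p_i) over the categories observed."""
    counts = Counter(observations)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())

# Hypothetical commit logs: one contributor name per commit.
balanced = ["ann", "ben", "cas", "dee"] * 25
dominated = ["ann"] * 97 + ["ben", "cas", "dee"]
print(round(shannon_diversity(balanced), 3))
print(round(shannon_diversity(dominated), 3))
```

The index is highest when activity is spread evenly over contributors and approaches zero when a single contributor dominates.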
  47. Talk about borrowing ideas from other disciplines Examples: draw inspiration from biology => diversity metrics draw inspiration from medicine => survival analysis studies
  48. Talk about borrowing ideas from other disciplines Examples: draw inspiration from biology => diversity metrics draw inspiration from medicine => survival analysis studies
  49. Projects with more contributors tend to survive longer. Projects that are older (i.e. more mature) are more likely to survive than younger projects. In the beginning, the survival curve goes down rapidly, then stabilises. The application domain (project type) may play a role as well, but there is no statistically significant evidence for it.
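A survival curve of this kind is commonly estimated with the Kaplan-Meier estimator, which handles projects that are still alive at the end of the observation period (censored observations). The sketch below is a minimal plain-Python version; the project durations are made up, and it is not claimed to be the procedure used by Samoladas et al.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival estimate.

    durations: time each project was followed (e.g. in months).
    observed:  True if the project was abandoned at that time,
               False if it was still alive when observation stopped.
    Returns a list of (time, survival_probability), one per abandonment time.
    """
    events = sorted(zip(durations, observed))
    at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        deaths = 0
        leaving = 0
        # Group all projects that leave the study at time t.
        while i < len(events) and events[i][0] == t:
            deaths += events[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical data for six projects.
durations = [2, 3, 3, 5, 8, 8]
observed = [True, True, False, True, False, False]
for t, s in kaplan_meier(durations, observed):
    print(t, round(s, 3))
```

Each step of the curve multiplies the running survival probability by the fraction of at-risk projects that were not abandoned at that time; censored projects reduce the at-risk count without lowering the curve.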
  50. Which mechanisms are favorable for their quality/evolution/popularity/survival?
  51. Volume: need to store, analyse and manipulate huge quantities of data when studying software ecosystems (containing tens of thousands of components and dependencies, a huge number of commits, thousands of contributors, millions of lines of code, …). Variety: need to deal with very heterogeneous data: structured (e.g. programs), semi-structured (e.g. e-mails) and unstructured (e.g. unformatted texts), coming from a wide variety of data sources including version control, issue trackers, mailing lists, Q&A, Twitter communication, surveys and interviews, and even screen capture software, video/audio recordings, photographs and field notes of software developers collaborating in situ [Socha et al., “Wide-field ethnography: Studying software engineering in 2025 and beyond,” ICSE, 2016, pp. 797–802]. Veracity: dealing with uncertain, inconsistent or missing data. Velocity: new commits are made to GitHub several times every second. This may be less of an issue for empirical studies, in which the data is typically analyzed off-line. For automated tools that support the activities of a software ecosystem community (e.g. web-based dashboards), however, it may be important to rely on the most recent data in order to make informed decisions.
  52. Therefore, appropriate techniques need to be developed and put into place to guarantee anonymity. Fung et al. presented a survey of research results and future directions in privacy-preserving data publishing [70]. Malik et al. provided an overview of privacy-preserving data mining tools and techniques, and proposed future research directions [71]. [70] B. C. M. Fung, K. Wang, R. Chen, and P. S. Yu, “Privacy-preserving data publishing: A survey of recent developments,” ACM Comput. Surv., vol. 42, no. 4, pp. 14:1–14:53, Jun. 2010. [71] M. B. Malik, M. A. Ghazi, and R. Ali, “Privacy preserving data mining techniques: Current scenario and future prospects,” in Int’l Conf. Computer and Communication Technology, Nov. 2012, pp. 26–32.