In this talk we discuss the results of a survey of software ecosystems researchers conducted in October-December 2014. Researchers were asked to identify current trends in ecosystems research as well as the challenges the research community has to address in the coming years. We complement the discussion of the trends identified by the community with a review of some recent results on software ecosystems.
Joint work with Tom Mens.
Challenges in Software Ecosystems Research
1. Challenges in Software Ecosystems Research
Alexander Serebrenik, Eindhoven University of Technology, The Netherlands, @aserebrenik
Tom Mens, UMons, Belgium, @tom_mens
4. Definition of an ecosystem
Example of an ecosystem
Trends and challenges
5. 164 authors of an article or a book chapter on SECO, or of a paper in IWSECO, WEA or Big Systems 2014
141 authors with a valid email address
26* answered the survey
* response rate 18.4%, comparable with other surveys
6. Definition of an ecosystem
Respondent: “Defining everything as an ecosystem. <…> The word is trend-ish and it causes misunderstandings in the field.”
7. “The complex system of plant, animal, fungal, and microorganism communities and their associated non-living environment interacting as an ecological unit. Ecosystems have no fixed boundaries”
8. Definition of an ecosystem
<biological>: communities; environment; interaction
[Lungu 2008]: software projects; environment; developed and evolve together
[Jansen et al. 2009]: actors; shared market for software and services, shared platform; exchange of information, resources & artefacts
[Manikas, Hansen 2013]: actors; common technological platform; symbiotic relationships
15. Definition of an ecosystem
Example of an ecosystem
Respondent: “Defining everything as an ecosystem. <…> The word is trend-ish and it causes misunderstandings in the field.”
social, economical, technical
Different perspectives on the same artefacts or different artefacts altogether?
17. “One challenge is to be able to characterize the wealth of the community wrt the wealth of the software components. What is the impact of different collaboration and development practices on the quality of the ecosystem?”
Trends and challenges
ecosystem quality, socio-technical
SECOs may consist of many systems. Analysing all these systems as a whole may raise some technical problems, due to the quantity of data to take into account.
data analytics, amount (volume)
large databases with comparable information about the details of a large collection of ecosystems, so that any research could be conducted in a repeatable and comparable way.
database of comparable info, reproducible research
25.
      Non-sensitive               Sensitive
      Zip    Age  Nationality     Condition
  1   13053  28   Russian         Heart Disease
  2   13068  29   American        Heart Disease
  3   13068  21   Japanese        Viral Infection
  4   13053  23   American        Viral Infection
  5   14853  50   Indian          Cancer
  6   14853  55   Russian         Heart Disease
  7   14850  47   American        Viral Infection
  8   14850  49   American        Viral Infection
  9   13053  31   American        Cancer
 10   13053  37   Indian          Cancer
 11   13068  36   Japanese        Cancer
 12   13068  35   American        Cancer
29. Second survey
• Group A: respondents of the previous survey who provided their email addresses
• 26 answers, 20 with an email address and invited, 14 responses (70%)
• Group B: extended list of ecosystem experts (outside Group A):
• 148 invited, 142 valid addresses, 38* responses (~27%)
• Better response rate: 32.1% vs 18.4% (first survey)
* One of the respondents who provided an email address was not invited
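The rates on this slide are plain ratios, and the sketch below recomputes them. Note that the pooled 32.1% is consistent with dividing all 52 responses by the 20 + 142 people counted above; that reading is inferred from the numbers, the slide does not spell it out. The function name `response_rate` is illustrative.

```python
def response_rate(responses: int, invited: int) -> float:
    """Return the response rate as a percentage."""
    return 100.0 * responses / invited


# Counts copied from the slides.
group_a = response_rate(14, 20)             # second survey, Group A
group_b = response_rate(38, 142)            # second survey, Group B
pooled = response_rate(14 + 38, 20 + 142)   # assumed pooling of A and B
first_survey = response_rate(26, 141)       # first survey

for label, rate in [
    ("Group A", group_a),
    ("Group B", group_b),
    ("Second survey (pooled)", pooled),
    ("First survey", first_survey),
]:
    print(f"{label}: {rate:.1f}%")
```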
31. No difference between Group A and Group B
Adonis, Unknown, restored by Duquesnoy (1597–1643), Louvre
• Analysis of Similarities (ANOSIM)
• R: -0.07564
• the closer R is to 1, the more dissimilar the groups
• Permutational Multivariate Analysis of Variance Using Distance Matrices (ADONIS)
• p-value: 0.192
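To make the ANOSIM R statistic concrete, here is a minimal sketch that computes R from a distance matrix and group labels: rank all pairwise distances, then compare the mean rank of between-group pairs with that of within-group pairs. The survey answers are not reproduced here, so the four-point distance matrix is a hypothetical toy example. The sketch covers only R; it does not perform the permutation test, and it is not ADONIS.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def anosim_r(dist, groups):
    """R = (mean between-group rank - mean within-group rank) / (M / 2),
    where M is the number of pairs. Assumes at least one pair of each kind."""
    pairs = list(combinations(range(len(groups)), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2)


# Hypothetical toy data: four points on a line at 0, 1, 10 and 11.
points = [0, 1, 10, 11]
dist = [[abs(a - b) for b in points] for a in points]

separated = anosim_r(dist, ["A", "A", "B", "B"])  # groups far apart
mixed = anosim_r(dist, ["A", "B", "A", "B"])      # groups interleaved
print(separated, mixed)
```

With well separated groups R reaches 1; with interleaved groups it drops to zero or below, which is the regime of the value reported on the slide.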
32. Ordering challenges
1. Consider both groups as one set of answers
2. Per question, count the answers: #very important, #moderately important, #slightly important
3. Lexicographic order on the triples (#very important, #moderately important, #slightly important)
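The ordering steps above can be sketched in a few lines, since Python compares tuples lexicographically. The challenge names and counts below are hypothetical, and sorting in descending order (most "very important" votes first) is an assumption based on the slide that follows presenting a top three.

```python
def rank_challenges(counts):
    """Order challenges by (#very, #moderately, #slightly) triples.

    Tuples compare element by element, so ties on #very important are
    broken by #moderately important, then by #slightly important.
    """
    return sorted(counts, key=lambda name: counts[name], reverse=True)


# Hypothetical answer counts, pooled over both groups.
counts = {
    "challenge X": (30, 12, 8),
    "challenge Y": (25, 20, 5),
    "challenge Z": (30, 15, 5),
}

ranking = rank_challenges(counts)
print(ranking)
```

Here Z precedes X because both have 30 "very important" answers and Z has more "moderately important" ones.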
33. Top Three
1. Reproducible and Comparable Research [Providing databases with information about the details of a large collection of ecosystems]
2. Reproducible and Comparable Research [Making research results about ecosystems available in a reproducible way]
3. Offer more advanced ecosystems analysis (e.g., case studies, qualitative and quantitative analysis) [Use more advanced statistical techniques (e.g., survival analysis, econometric aggregation, contrasts)]
35. Reproducible and Comparable Research [Providing databases with information about the details of a large collection of ecosystems]
Enough?
Too big to share?
Up-to-date?
Still relevant?
1TB
37. Advanced statistics
3. Offer more advanced ecosystems analysis (e.g., case studies, qualitative and quantitative analysis) [Use more advanced statistical techniques (e.g., survival analysis, econometric aggregation, contrasts)]
38. Advanced statistics
PAGE 27, 11/08/15
Two distributions:
• t-test
• Mann-Whitney
Multiple distributions:
1. ANOVA / Kruskal-Wallis (KW)
2. pairwise t-test / Mann-Whitney (MW)
Tests can be inconsistent with each other
We need a one-phase test!
39. Advanced statistics
Idea:
Pair  Low    High
B-A   -0.56  -0.44
C-A   -0.50  -0.31
D-A   -0.32  -0.03
C-B   -0.01   0.24
D-B    0.24   0.47
D-C    0.09   0.40
A→B, A→C, A→D, D→B, D→C
Konietschke, F., Hothorn, L.A., and Brunner, E. Rank-based multiple test procedures and simultaneous confidence intervals. Electron. J. Stat. 6 (2012), 738–759.
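The arrows on this slide can be read off the interval table: a pair yields an arrow only when its simultaneous confidence interval excludes zero, and the sign decides the direction. The sketch below encodes that reading, which is inferred from the slide's numbers rather than taken from the cited procedure itself; the function name `significant_edges` is illustrative.

```python
# Intervals copied from the slide; key (X, Y) stands for the pair "X-Y".
intervals = {
    ("B", "A"): (-0.56, -0.44),
    ("C", "A"): (-0.50, -0.31),
    ("D", "A"): (-0.32, -0.03),
    ("C", "B"): (-0.01, 0.24),
    ("D", "B"): (0.24, 0.47),
    ("D", "C"): (0.09, 0.40),
}


def significant_edges(intervals):
    """Return (larger, smaller) edges for intervals that exclude zero."""
    edges = []
    for (x, y), (low, high) in intervals.items():
        if high < 0:      # X-Y entirely negative: arrow Y -> X
            edges.append((y, x))
        elif low > 0:     # X-Y entirely positive: arrow X -> Y
            edges.append((x, y))
        # otherwise the interval contains zero: no arrow
    return edges


edges = significant_edges(intervals)
print(", ".join(f"{a}→{b}" for a, b in edges))
```

The pair C-B produces no arrow because its interval spans zero, so a single set of intervals gives one consistent picture without a separate overall test followed by pairwise tests.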
40. T̃ and Software Ecosystems
• Stack Overflow and GitHub - Vasilescu et al. SocialCom 2013
• Simulink models - Dajsuren et al. QoSA 2013
• GNOME - Vasilescu et al. ESE 2014
• Stack Exchange sites - Wang et al. ICSME 2014
• jEdit, ArgoUML, KOffice - Sun et al. Inf & Software Technology 2015
41. Advanced statistics
Mean, median, sum
Gini, Theil, Kolm…
The choice of an aggregation technique provides different insights but can also affect the validity of the results!
C. Gini, “Measurement of inequality of incomes,” The Economic Journal, 1921.
H. Theil, Economics and Information Theory. North-Holland, 1967.
A.B. Atkinson, “On the measurement of inequality,” Journal of Economic Theory, 1970.
…
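As an illustration of why the aggregation technique matters, the sketch below compares the mean with two of the inequality indices named on this slide, using the standard textbook definitions of the Gini and Theil indices. The two samples are hypothetical metric values (say, a size metric per file) chosen to share the same mean.

```python
import math


def gini(values):
    """Gini index: mean absolute difference over twice the mean."""
    n = len(values)
    mean = sum(values) / n
    total_abs_diff = sum(abs(x - y) for x in values for y in values)
    return total_abs_diff / (2 * n * n * mean)


def theil(values):
    """Theil index: (1/n) * sum((x/mean) * ln(x/mean)).

    Zero values contribute nothing, following the limit x*ln(x) -> 0.
    """
    n = len(values)
    mean = sum(values) / n
    return sum((x / mean) * math.log(x / mean) for x in values if x > 0) / n


# Hypothetical samples with identical mean but different distributions.
even = [10, 10, 10, 10]
skewed = [1, 1, 1, 37]

for name, sample in [("even", even), ("skewed", skewed)]:
    mean = sum(sample) / len(sample)
    print(f"{name}: mean={mean}, gini={gini(sample):.3f}, theil={theil(sample):.3f}")
```

The mean cannot tell the two samples apart, while both inequality indices are zero for the even sample and clearly positive for the skewed one.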
43. Advanced statistics
% of entities still used after time t?
Kaplan, E. L.; Meier, P. (1958). "Nonparametric estimation from incomplete observations". J. Amer. Statist. Assn. 53 (282): 457–481
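The question on this slide is what the Kaplan-Meier estimator answers. A minimal sketch of the estimator follows, assuming each entity has a duration and a flag saying whether its end was actually observed (otherwise it is censored, i.e. still in use when observation stopped). The data are hypothetical and the function name `kaplan_meier` is illustrative.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each time t where an event was observed.

    S(t) is the estimated fraction of entities still in use after time t.
    Censored entities leave the risk set without lowering the curve.
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Process everything that ends or is censored at time t.
        while i < len(data) and data[i][0] == t:
            events += int(data[i][1])
            leaving += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical lifetimes (e.g. in months); False marks a censored entity.
durations = [1, 2, 2, 3, 4]
observed = [True, True, False, True, False]

curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(f"after t={t}: {100 * s:.0f}% still used")
```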
44. Survival & Software Ecos
• FLOSSMetrics DB - Samoladas et al. Information & Software Technology 2010
• Debian packages - Claes et al. MSR 2015
• Databases in Java projects - Goeminne, Mens ICSME 2015
45.
4. Understanding and improving the design, architecture, quality and health of software ecosystems [Socio-technical perspective, e.g., comparing the health of the community against the health of the ecosystem components]
5. Ecosystem Governance [Design perspective, e.g., actively supporting the stakeholders' decisions]
6. Understanding and improving an ecosystem's dynamics and evolution [Generalisation perspective, e.g., transferring insights from evolution of individual software systems to evolution of ecosystems]
7. Understanding and improving the design, architecture, quality and health of software ecosystems [Social perspective, e.g., creating an active community around the ecosystem]
8. Interdisciplinary research [Applying ecosystem research techniques to non-classical software ecosystems, e.g., spreadsheets or Matlab Simulink models]
9. Understanding and improving an ecosystem's dynamics and evolution [Design perspective, e.g., providing upgrade strategies when one of the ecosystem elements changes]
10. Ecosystem Governance [Generalisation perspective, e.g., going beyond anecdotal evidence]
46. Threats to validity
• Representativeness of the respondents wrt the research community