The operationalisation of collaboration: in search of a definition and its consequences on
analysis
Collaboration has been defined in numerous ways. Researchers interested in collaboration at the
individual or organizational level need to pay special attention to the adoption of a specific definition, as
this is likely to have major implications for the research design and outcomes. With respect to
collaboration within open source software projects, this presentation has two objectives. Firstly, this
presentation will investigate a wide variety of definitions of collaboration from the existing literature.
Secondly, the presentation will look at the theoretically informed selection of a definition. Throughout the
presentation, specific emphasis will be put on the implications of adopting different definitions of
collaboration for the application of Social Network Analysis to the study of open source software,
particularly considering data collection and analysis. Open source software is developed in the open
where anyone can view the source code and anyone with the knowledge to do so can contribute to the
project. Because people from around the world work on these projects together using online tools, it is
a relevant setting for studying collaboration. An interesting aspect of open source collaboration is that
private resources from individuals and organizations are used to develop software that is released as a
public good. Social Network Analysis can be used to understand the network relationships between the
individuals who develop this software. Because researchers from many backgrounds and disciplines are
interested in collaboration, this research is likely to stimulate further thinking about definitions of
collaboration in other domains and research settings.
Operationalisation of Collaboration Sunbelt 2015
1. The Operationalisation of Collaboration:
in Search of a Definition and Its Consequences On Analysis
Dawn M. Foster, Guido Conaldi, Riccardo De Vita
Sunbelt XXXV June 2015
2. The Context
Pilot Study - define collaboration
Part of Larger Research Project - PhD Dissertation
Research Question for Overall Research Project:
• How do software developers, who are paid by
organizations for their work, collaborate within an
open source software community?
3. The Challenge
Open source software is a collaborative effort
But, collaboration takes many forms
And is defined in various ways
Which definitions are most important?
4. Literature on problem solving in open source
Unlikely organizations: survival depends on willingness
to engage in decentralized problem solving.
(e.g., Crowston & Scozzi, 2008; Mockus et al., 2002; Conaldi et al., 2012)
Collaboration in problem solving investigated using
digital traces: email, code, bug reports, mostly
separately, or multidimensionally.
(e.g., Von Krogh, Spaeth, & Lakhani, 2003)
Contributions close in time as proxies for collaboration.
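As a rough illustration of the proxy above, the sketch below pairs contributors whose contributions to the same artifact fall within a time window. The event data, the function name proximate_pairs and the seven-day window are all hypothetical and chosen only for the example; the literature cited here does not prescribe these values.

```python
from datetime import datetime, timedelta
from itertools import combinations

# Hypothetical events: (contributor, artifact, timestamp).
events = [
    ("alice", "mm/slab.c", datetime(2015, 1, 5, 10, 0)),
    ("bob", "mm/slab.c", datetime(2015, 1, 6, 9, 0)),
    ("carol", "mm/slab.c", datetime(2015, 3, 1, 12, 0)),
    ("bob", "sched/core.c", datetime(2015, 1, 5, 11, 0)),
]

def proximate_pairs(events, window):
    """Return pairs of distinct contributors who touched the same
    artifact within `window` of each other (illustrative proxy only)."""
    pairs = set()
    for (a, target_a, time_a), (b, target_b, time_b) in combinations(events, 2):
        same_artifact = target_a == target_b
        close_in_time = abs(time_a - time_b) <= window
        if a != b and same_artifact and close_in_time:
            pairs.add(frozenset((a, b)))
    return pairs

# Assumed window of seven days; the right value is an empirical question.
pairs = proximate_pairs(events, timedelta(days=7))
```

The choice of window directly changes which ties appear in the network, which is one way a definition of collaboration feeds into measurement.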
5. The Approach
Small Pilot Study
• Interviewed 4 participants
• Explored possible definitions of collaboration
• Analysis of responses
Network Analysis
• Ego-centric relational event histories for each pilot participant
• Collaboration as defined in pilot study
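One way to picture an ego-centric relational event history is as the time-ordered list of all events on the artifacts that a given participant (the ego) has touched. The sketch below assumes that reading; the data and the function name ego_event_history are hypothetical, and the study itself may have constructed its histories differently.

```python
from datetime import datetime

# Hypothetical events: (contributor, artifact, timestamp), in arbitrary order.
events = [
    ("bob", "mm/slab.c", datetime(2015, 1, 6, 9, 0)),
    ("alice", "mm/slab.c", datetime(2015, 1, 5, 10, 0)),
    ("carol", "fs/ext4/inode.c", datetime(2015, 1, 7, 8, 0)),
    ("alice", "sched/core.c", datetime(2015, 1, 8, 14, 0)),
]

def ego_event_history(events, ego):
    """Keep every event on an artifact the ego touched, sorted by time."""
    ego_artifacts = {artifact for actor, artifact, _ in events if actor == ego}
    history = [event for event in events if event[1] in ego_artifacts]
    return sorted(history, key=lambda event: event[2])

history = ego_event_history(events, "alice")
```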
6. Research Setting
Linux kernel community:
• Open source software
• Over 85% of contributors are paid
• Neutral: competing companies contribute
• 19M lines of code, 11K developers, 1200 organisations
Pilot Research Question:
• How do definitions of collaboration impact measurement and
analysis within a decentralised organisational context?
7. Data
Mailing list collaboration (discussion, patches, bugs)
• 4 mailing lists used by pilot participants
• Ego-net focus
• History of events reconstructed
• Basic descriptive stats
Code file collaboration
• Code files modified by pilot participants
• History of events reconstructed
• Basic descriptive stats
8. Methods: Activity
In our (very) preliminary analysis, we used the following
actor-level measures of activity:
Mailing lists:
• Weighted degree centrality of contributors to capture their
involvement in the discussion of development topics
Code files:
• Weighted degree centrality of contributors to capture their
activity in code production
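The activity measure above can be sketched for a two-mode network of contributors and code files (or mailing list threads). This is a minimal illustration, assuming that a tie's weight is the number of events linking a contributor to a file; the data and the function name weighted_degree are hypothetical.

```python
from collections import defaultdict

# Hypothetical events: one (contributor, code file) tuple per modification.
events = [
    ("alice", "sched/core.c"),
    ("alice", "sched/core.c"),
    ("alice", "mm/slab.c"),
    ("bob", "sched/core.c"),
    ("carol", "mm/slab.c"),
]

def weighted_degree(events):
    """Sum of tie weights per contributor, where a tie's weight is the
    number of events linking the contributor to a file (or thread)."""
    tie_weights = defaultdict(int)
    for contributor, target in events:
        tie_weights[(contributor, target)] += 1
    degree = defaultdict(int)
    for (contributor, _target), weight in tie_weights.items():
        degree[contributor] += weight
    return dict(degree)

activity = weighted_degree(events)
```

Under this assumption, a contributor who modifies one file many times scores as high as one who modifies many files once, so the measure reflects volume of activity rather than breadth.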
9. Methods: Collaboration
In our preliminary analysis, we used the following
actor-level measures of collaboration:
Mailing lists:
• Number of 2-paths: to capture the amount of participation
by others in development topics discussed by contributors
Code files:
• Number of 2-paths: to capture the amount of contribution
by others to files being worked on by contributors
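One possible reading of the 2-path count in a two-mode network is the number of paths contributor -> file -> other contributor that start at a given contributor. The sketch below assumes binary ties (repeated modifications of the same file count once) and uses hypothetical data and a hypothetical function name, two_path_counts; the same logic would apply to mailing list threads.

```python
from collections import defaultdict

# Hypothetical events: one (contributor, code file) tuple per modification.
events = [
    ("alice", "sched/core.c"),
    ("alice", "sched/core.c"),
    ("alice", "mm/slab.c"),
    ("bob", "sched/core.c"),
    ("carol", "mm/slab.c"),
]

def two_path_counts(events):
    """For each contributor, count paths contributor -> file -> other
    contributor, treating ties as binary (an assumption of this sketch)."""
    contributors_by_file = defaultdict(set)
    files_by_contributor = defaultdict(set)
    for contributor, target in events:
        contributors_by_file[target].add(contributor)
        files_by_contributor[contributor].add(target)
    return {
        contributor: sum(len(contributors_by_file[f]) - 1 for f in files)
        for contributor, files in files_by_contributor.items()
    }

potential_collaboration = two_path_counts(events)
```

Because it counts others who touch the same files, this measure captures potential collaboration rather than confirmed interaction.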
10. Results: Collaboration in the Linux kernel
In person (events)
Feedback on code contributions aka patches (mailing list)
General mailing list discussions
Feedback on bugs (mailing list)
Working on same code file(s)
13. Implications and Relevance
Collaboration is multiplex in the eyes of the
contributors
The inspection of activity and (potential) collaboration
in mailing lists and code shows complementary
pictures
Ability to identify contributors and their actions across
multiple activities of code production is paramount if
we want to study the structuring of collaboration
14. Discussion and Future Work
Face-to-face collaboration: how to capture it?
Identities across multiple online repositories
Validation
15. Thank You and Questions
Authors:
Dawn M. Foster dawn@dawnfoster.com
Guido Conaldi G.Conaldi@greenwich.ac.uk
Riccardo De Vita R.DeVita@greenwich.ac.uk
University of Greenwich, Centre for Business Network Analysis
17. References
Data on Linux kernel contributions:
• Corbet, J., Kroah-Hartman, G. & McPherson, A., 2015. Linux Kernel Development: How
Fast is it Going, Who is Doing It, What Are They Doing and Who is Sponsoring the
Work. Available at:
http://www.linuxfoundation.org/publications/linux-foundation/who-writes-linux-2015
Literature:
• Crowston, K. & Scozzi, B., 2008. Bug fixing practices within Free/Libre Open Source
Software development teams. Journal of Database Management, 19(2), pp. 1–30.
• Mockus, A., Fielding, R.T. & Herbsleb, J.D., 2002. Two case studies of open source
software development: Apache and Mozilla. ACM Transactions on Software
Engineering and Methodology, 11(3), pp. 309–346.
• Conaldi, G., Lomi, A. & Tonellato, M., 2012. Dynamic models of affiliation and the
network structure of problem solving in an open source software project. Organizational
Research Methods, 15(3), pp. 385–412.
• Von Krogh, G., Spaeth, S. & Lakhani, K.R., 2003. Community, joining, and
specialization in open source software innovation: a case study. Research Policy, 32(7),
pp. 1217–1241.