Context: Software testing is a crucial step in most software development processes. Testing software is a key component to manage and assess the risk of shipping quality products to customers. But testing is also an expensive process and changes to the system need to be tested thoroughly which may take time. Thus, the quality of a software product depends on the quality of its underlying testing process and on the effectiveness and reliability of individual test cases.
Goal: In this paper, we investigate the impact of the organizational structure of test owners on the reliability and effectiveness of the corresponding test cases. Prior empirical research on organizational structure has focused only on developer activity. We expand the scope of empirical knowledge by assessing the impact of organizational structure on testing activities.
Method: We performed an empirical study on the Windows build verification test suites (BVT) and relate effectiveness and reliability measures of each test run to the complexity and size of the organizational sub-structure that enclose all owners of test cases executed.
Results: Our results show, that organizational structure impacts both test effectiveness and test execution reliability. We are also able to predict effectiveness and reliability with fairly high precision and recall values.
Conclusion: We suggest to review test suites with respect to their organizational composition. As indicated by the results of this study, this would increase the effectiveness and reliability, development speed and developer satisfaction.
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
ย
52 - The Impact of Test Ownership and Team Structure on the Reliability and Effectiveness of Quality Test Runs
1. The Impact of
Test Ownership and Team Structure on
Test Reliability and Effectiveness
Kim Herzig, Nachiappan Nagappan
2. Improving Development Processes
Quality / Risk
$
Speed
R
Cost
Development Environment
(should be well balanced)
Product / Service
Microsoft aims for shorter release cycles
โข Speed up development processes (e.g. code velocity)
โข Maintaining / increasing product quality
Empirical data to support & drive decisions
Joint effort by MSR & product teams
โข MSR Cambridge: Brendan Murphy, Kim Herzig
โข MSR Redmond: Tom Zimmermann, Chris Bird, Nachiappan Nagappan
โข TSE Redmond: Jacek Czerwonka, Michaela Greiler
โข Windows, Windows Phone, Office, Dynamics product teams
Technology
changes
Legacy
changes
New product
features
3. A Simple Test Scenario
Development branch
๏ Each developer runs โhisโ tests
๏ Most tests check functional correctness
๏ Mostly pre-check-in verification
๏ Failures are isolated: no impact on other engineers
4. Letโs go big: Complex Test Scenario for Windows
Version Development Control Branch Tree
Branch level 3
Branch level 2
Branch level 1
Branch level 0
branch
Multi-area
branch
Multi-component
branch
main trunk
branch
Quality gate
(component testing)
Quality gate
(component testing)
Quality gate
(component testing)
โข Developer have to pass quality gates (system tests)
โข Many tests check system constraints (e.g. compatibility or performance)
โข Failures not isolated (involve human inspections, causes development freeze for corresponding branch, etc.)
5. Test Effectiveness & Reliability
Effectiveness
Test suite effectivenesslow
โข Number of bugs found and fixes due to test failures (NumBugs)
โข We trace test failures to bug report to bug fixes. Using CODEMINE [1].
Reliability
high
High cost,
unknown value
โข Bug detection ratio per test suite run (BugsPerExec)
$$$$
High cost,
low value
$$$$
โข Eliminates correlation to number of executions.
โข However, absolute number for more important as each defect might be cause a disaster.
Test suite Low reliability
cost,
good value
โข Number of false test failures (FpPerExec)
$$
Low cost,
low value
$
high low
โข Test failures due to any other reason than code defects, e.g. test and infrastructure issues.
Test execution
fails
True positive
failure
False positive
failure
Mapped to
bug report?
yes
Bug report
fixed?
yes
Resolved via
code change?
yes
no
[1] Czerwonka, J., Nagappan, N., Schulte, W., and Murphy, B. CODEMINE: Building a Software Development Data Analytics Platform at Microsoft. Software, IEEE, 30, 4 (2013), 64--71.
6. Distributed Test Ownership
Test sui te
Test case 1
Test case 2
Test case 3
Test case n
Quality gates combine tests from different sources
โข Side effects between test cases
โข Limited synchronization over new tests and their interactions.
Tests have impact on other teams
โข Engineers might not know what the test is doing
โข Low quality tests slow down product development
Test evolve over time
โข Is the test still up-to-date? Can we remove it?
โข Who wrote the test and is that person still in our organization?
โข Who is maintaining the test right now?
Ownership versus Responsibility: Engineers may own the test step but not the quality gate
7. Mapping Ownership to Organization Tree
Test sui te
Metric name Short description
Number of contributors
NumOwner The number of distinct engineers that own at least one test in the test suite.
NumOwnerLeftOrg
The number of distinct engineers that own at least one test in the test suite and that are no longer in
the organizational structure of Microsoft (left the company).
PCOwnerLeftOg
The relative number of distinct developers that own at least one test case in the test suite but are no
longer part of the organizational structure: ํํขํํํํคํํํฟํํํกํถํํํํํํฆ ํํขํํํคํํํ.
Number of managers
NumL2
The number of level two (L2) managers whose corresponding sub-trees contain all engineers that
own at least one test case of the test suite.
NumL3
The number of level three (L3) managers whose corresponding sub-trees contain all engineers that
own at least one test case of the test suite.
NumL4
The number of level four (L4) managers whose corresponding sub-trees contain all engineers that
own at least one test case of the test suite.
NumL5
The number of level five (L5) managers whose corresponding sub-trees contain all engineers that
own at least one test case of the test suite.
HighestCommonManager
The highest manager level (L3 > L2) whose sub-organization contains all engineers that own at least
one test case of the test suite.
Organizational communication paths
LongestDevDistance
The longest distance between two developers in the organizational sub-tree spanned by test
contributors. The distance between two developers is measured by the length of the shortest path
between the two corresponding organizational sub-tree vertices.
MedianDevDistance
The median of distances (length of shortest paths) between all pairs of developers contributing test
content to a test suite.
MeanDevDistance
The mean of distances (length of shortest paths) between all pairs of developers contributing test
content to a test suite.
MeanPathLength The mean length of all paths between test owners.
SizeOfLargestClique
Number of vertices of organizational sub-graph such that any two test owners are connected via a
single edge.
Number of aliases
NumAliasGroups
The number of distinct mailing lists associated with test cases of the test suite. Does not include the
number of system aliases (NumSystemAlias).
MedianAliasGroupSize The median number of mailing list members referenced by mailing lists registered as test owners.
NumSystemAlias
The number of distinct aliases that point to a system aliasโnot to a mail distribution list. These
aliases are real system accounts.
8. Mapping Ownership to Organization Tree
Metric name Short description
Number of contributors
NumOwner The number of distinct engineers that own at least one test in the test suite.
NumOwnerLeftOrg
The number of distinct engineers that own at least one test in the test suite and that are no longer in
the organizational structure of Microsoft (left the company).
PCOwnerLeftOg
The relative number of distinct developers that own at least one test case in the test suite but are no
longer part of the organizational structure: ํํขํํํํคํํํฟํํํกํถํํํํํํฆ ํํขํํํคํํํ.
Number of managers
NumL2
The number of level two (L2) managers whose corresponding sub-trees contain all engineers that
own at least one test case of the test suite.
NumL3
The number of level three (L3) managers whose corresponding sub-trees contain all engineers that
own at least one test case of the test suite.
NumL4
The number of level four (L4) managers whose corresponding sub-trees contain all engineers that
own at least one test case of the test suite.
NumL5
The number of level five (L5) managers whose corresponding sub-trees contain all engineers that
own at least one test case of the test suite.
HighestCommonManager
The highest manager level (L3 > L2) whose sub-organization contains all engineers that own at least
one test case of the test suite.
Organizational communication paths
LongestDevDistance
The longest distance between two developers in the organizational sub-tree spanned by test
contributors. The distance between two developers is measured by the length of the shortest path
between the two corresponding organizational sub-tree vertices.
MedianDevDistance
The median of distances (length of shortest paths) between all pairs of developers contributing test
content to a test suite.
MeanDevDistance
The mean of distances (length of shortest paths) between all pairs of developers contributing test
content to a test suite.
MeanPathLength The mean length of all paths between test owners.
SizeOfLargestClique
Number of vertices of organizational sub-graph such that any two test owners are connected via a
single edge.
Number of aliases
NumAliasGroups
The number of distinct mailing lists associated with test cases of the test suite. Does not include the
number of system aliases (NumSystemAlias).
MedianAliasGroupSize The median number of mailing list members referenced by mailing lists registered as test owners.
NumSystemAlias
The number of distinct aliases that point to a system aliasโnot to a mail distribution list. These
aliases are real system accounts.
9. Test Ownership
Does organizational structure of test owners impact
the effectiveness or reliability of tests?
Similar: Distributed code ownership impacts code quality!
Results for 8.1
12. Correlations
Test sui te
Few group owners
Test sui te
Multiple group owners
More effective Less effective
13. Classification Models
Classifying quality gates showing above-median effectiveness.
Classifying quality gates showing above-median reliability.
ํํํํํํ ํํํ =
ํํํํํํ =
100-cross-fold stratified random sampling.
ํํ
ํํ + ํนํ
ํํ
ํํ + ํนํ
Using multiple machine learners to rule out model over fitting.
14. Can we predict test effectiveness?
Predicting the effectiveness of test suites with
precision and recall values around 0.7.
15. Can we predict test reliability?
Organizational structure metrics are
excellent predictors for test suite reliability!
16. ๏ Organizational metrics on test owners
good predictors for test effectiveness!
๏Organizational metrics on test owners
excellent predictors for test reliability!