Duplicate Bug Reports Considered Harmful… Really?
Nicolas Bettenburg • Rahul Premraj • Tom Zimmermann • Sunghun Kim
ICSME 2018 (Madrid) • September 28th, 2018
Automated Severity Assessment of Software Defect Reports
Tim Menzies
Lane Department of Computer Science,
West Virginia University
PO Box 6109, Morgantown, WV, 26506
304 293 0405
tim@menzies.us
Andrian Marcus
Department of Computer Science
Wayne State University
Detroit, MI 48202
313 577 5408
amarcus@wayne.edu
Abstract
In mission critical systems, such as those developed
by NASA, it is very important that the test engineers
properly recognize the severity of each issue they
identify during testing. Proper severity assessment is
essential for appropriate resource allocation and
planning for fixing activities and additional testing.
Severity assessment is strongly influenced by the
experience of the test engineers and by the time they
spend on each issue.
The paper presents a new and automated method
named SEVERIS (SEVERity ISsue assessment), which
assists the test engineer in assigning severity levels to
defect reports. SEVERIS is based on standard text
mining and machine learning techniques applied to
existing sets of defect reports. A case study on using
SEVERIS with data from NASA’s Project and Issue
Tracking System (PITS) is presented in the paper. The
case study results indicate that SEVERIS is a good
predictor for issue severity levels, while it is easy to
use and efficient.
1. Introduction
NASA’s software Independent Verification and
Validation (IV&V) Program captures all of its findings
in a database called the Project and Issue Tracking
System (PITS). The data in PITS has been collected
for more than 10 years and includes issues on robotic
satellite missions and human-rated systems.
Nowadays, similar defect tracking systems, such as Bugzilla
(http://www.bugzilla.org/), have become very popular, largely due to the
spread of open source software development. These systems help to track
bugs and changes in the code, to submit and review patches, to manage
quality assurance, to support communication between developers, etc.
Compared to newer systems, the problem with PITS is a lack of consistency
in how each of the projects collected issue data. In most instances, the
specific configuration of the information captured about an issue was
tailored by the IV&V project to meet its needs. This has created
consistency problems when metrics data is pulled across projects. While
there was a set of required data fields, the majority of those fields do
not provide information regarding the quality of the issue and are not
very suitable for comparing projects.
A common issue among defect tracking systems is
that they are useful for storing day-to-day information
and generating small-scale tactical reports (e.g., “list
the bugs we found last Tuesday”), but difficult to use
for high-end business strategic analysis (e.g., “in the
past, what methods have proved most cost effective in
finding bugs?”). Another issue common to these
systems is that most of the data is unstructured (i.e.,
free text). Specific to PITS is that the database fields
in PITS keep changing, yet the nature of the
unstructured text remains constant. In consequence,
one logical choice in the analysis of defect reports is a
combination of text mining and machine learning.
In this paper we present a new approach for
extracting general conclusions from PITS data based
on text mining and machine learning methods, which
are low cost, automatic, and rapid. We designed and
built a tool named SEVERIS (SEVERity ISsue
assessment) to automatically review issue reports and
alert when a proposed severity is anomalous. The way
SEVERIS is built provides the probabilities that the
assessment is correct. These probabilities can be used
to guide decision making in this process. Assigning
the correct severity levels to issue reports is extremely
important in the process employed at NASA, as it
directly impacts resource allocation and planning of
subsequent defect fixing activities.
NASA uses a five-point scale to score issue
severity. The scale ranges from one (most severe) to five
(least severe). A different scale is used for robotic and
human-rated missions (see Table 1).
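SEVERIS, as described above, is standard text mining plus machine learning over existing defect reports, surfacing a probability alongside each predicted severity. Below is a minimal sketch of that style of pipeline in Python, with hypothetical training data and a stand-in classifier (the original tool reportedly used a rule learner over top TF-IDF terms, so this illustrates the approach, not the tool):

# Illustrative sketch only: hypothetical data, stand-in classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical issue descriptions with severity labels
# (1 = most severe ... 5 = least severe).
reports = [
    "attitude control fails on sensor timeout",
    "typo in operator console help text",
]
severities = [1, 5]

model = make_pipeline(TfidfVectorizer(stop_words="english"), MultinomialNB())
model.fit(reports, severities)

# The per-class probabilities are what a SEVERIS-style tool can
# surface to flag anomalous severity assignments for review.
print(model.predict_proba(["console text renders incorrectly"]))

A decade of follow-up studies applied the same repository-mining mindset to many other questions: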
Predicting which bugs get fixed. Guo et al.
Predicting the severity of a reported bug. Lamkanfi et al.
Characterizing re-opened bugs. Zimmermann et al.
What makes a good bug report? Bettenburg et al.
Do clones matter? Juergens et al.
Frequency and risks of changes to clones. Göde et al.
Do developers care about code smells? Yamashita et al.
Inconsistent changes to clones at release level. Bettenburg et al.
Meet the Rebels!
Challenge conventional wisdom.
There are many, varied stories behind the observed SE artifacts.
Ignoring available data could lead to missing fundamentally important insights.
CHALLENGE THE ASSUMPTIONS
When the same bug is reported several times
in Bugzilla, developers are slowed down
https://fedoraproject.org/wiki/How_to_file_a_bug_report#Avoiding_Duplicate_Bug_Reports
Duplicate bug reports, […] consume time of bug triagers
and software developers that might better be spent
working on reports that describe unique requests.
Lyndon Hiew, MSc. Thesis, 2006, UBC
Several duplicate bug reports just cause an
administration headache for developers
http://wicket.apache.org/help/reportabug.html
A duplicate bug is a burden in the testing cycle.
https://www.softwaretestinghelp.com/how-to-write-good-bug-report/
DON’T BE
THAT GUY
who submitted a
DUPLICATE
It doesn't even mean that the resolved bug report can now be ignored,
since we have seen instances of late identification of duplicates
(e.g., BR-C in Figure 2) in which accumulated knowledge and dialogue
may still be relevant to the resolution of the other bug reports in the BRN.
Robert J. Sandusky, Les Gasser, and Gabriel Ripoche. Bug report networks: Varieties, strategies, and impacts in an OSS
development community. In Proc. of ICSE Workshop on Mining Software Repositories, 2004.
“Duplicates are not really problems. They often add useful information.
That this information was filed under a new report is not ideal though.”
N. Bettenburg, S. Just, A. Schröter, C. Weiss, R. Premraj, and T. Zimmermann. What makes a good bug report? In
Proceedings of the 16th International Symposium on Foundations of Software Engineering, November 2008.
item                 h    hm   P(hm|h)
steps to reproduce   47   42   0.8936
stack traces         45   35   0.7778
screenshots          42   17   0.4048
test cases           39   11   0.2821
observed behavior    44   12   0.2727
code examples        38    9   0.2368
error reports        33    3   0.0909
build information    34    3   0.0882
summary              36    3   0.0833
expected behavior    41    3   0.0732
version              38    1   0.0263
component            34    0   0.0000
hardware             13    0   0.0000
operating system     34    0   0.0000
product              30    0   0.0000
severity             26    0   0.0000
Table 1. All items from the first survey part, with the count of how
often they helped (h), how often they helped the most (hm), and the
probability that an item helped most under the condition that it helped.
item                           a    am   P(am|a)
errors in steps to reproduce   34   29   0.8235
incomplete information         44   35   0.7727
wrong observed behavior        15   11   0.6667
wrong version number           21    8   0.2857
errors in test cases           14    4   0.2857
unstructured text              19    7   0.2632
wrong operating system          8    3   0.2500
wrong expected behavior        18    7   0.2222
non-technical language         14    3   0.2143
too long text                  11    2   0.1818
errors in code examples        11    2   0.1818
bad grammar                    29    5   0.1724
wrong component name           22    2   0.0909
prose text                     12    2   0.0833
duplicates                     31    2   0.0645
no spellcheck                   8    0   0.0000
wrong hardware                  5    0   0.0000
spam                            1    0   0.0000
wrong product name             11    0   0.0000
errors in stack traces          2    0   0.0000
Table 2. All items from the second survey part, with the count of how
often they harmed (a), how often they harmed the most (am), and the
probability that an item harmed most under the condition that it harmed.
What Helps the Most? What Harms the Most?
N. Bettenburg, S. Just, A. Schröter, C. Weiss, R. Premraj, and T. Zimmermann. What makes a good bug report? In Proceedings of the 16th International
Symposium on Foundations of Software Engineering, November 2008.
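The P(hm|h) column in Table 1 is a plain conditional estimate: among respondents who said an item helped, the share who said it helped the most, i.e. hm/h. A quick check against the rows above:

# Recompute P(hm | h) = hm / h for rows of Table 1 above.
table1 = {
    "steps to reproduce": (47, 42),  # (helped, helped most)
    "stack traces": (45, 35),
    "screenshots": (42, 17),
    "version": (38, 1),  # 1/38 rounds to 0.0263
}
for item, (h, hm) in table1.items():
    print(f"{item}: P(hm|h) = {hm / h:.4f}")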
PART 1
Is there extra information in duplicate reports and, if so, can we quantify how much?
PART 2
Is that extra information helpful for carrying out software engineering tasks?
[Chart: bug reports submitted per month, Oct'01 through Oct'07, across release milestones 1.0 to 3.3; series: reports submitted (total) and duplicates submitted]
~3,000 reports submitted per month
~13% duplicate bug reports
First, we need DATA … lots of DATA!
Mozilla: 269,222 bug reports without duplicates; 116,727 duplicate reports; 36,697 master reports.
Eclipse: 167,494 bug reports without duplicates; 27,838 duplicate reports; 16,511 master reports.
Figure 4.1: Graphical representation of the collected bug report data.
Inverse Duplicate Problem
27% (Mozilla)
31% (Eclipse)
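One simple way to make PART 1 operational is to measure how much of a duplicate's content is absent from its master report, for example at the word-token level. The sketch below uses hypothetical report texts; the study's actual measures are richer (structural elements such as steps to reproduce, stack traces, and code), so this only illustrates the idea:

# Sketch: fraction of a duplicate's word tokens that are not
# already present in its master report. Texts are hypothetical.
import re

def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))

master = "editor locks up when createFromString throws an exception"
duplicate = ("editor freezes on invalid BigDecimal input; stack trace "
             "shows NumberFormatException in createFromString")

extra = tokens(duplicate) - tokens(master)
print(f"{len(extra) / len(tokens(duplicate)):.0%} new tokens:", sorted(extra))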
Bug 137808
Summary: Exceptions from createFromString lock-up the editor
Product: [Modeling] EMF Reporter: Patrick Sodre <psodre@gmail.com>
Component: Core Assignee: Marcelo Paternostro <marcelop@ca.ibm.com>
Status: VERIFIED FIXED QA Contact:
Severity: normal
Priority: P3 CC: merks@ca.ibm.com
Version: 2.2
Target Milestone: ---
Hardware: PC
OS: Windows XP
Whiteboard:
Description:
Opened: 2006-04-20 14:25 -0400
As discussed on the newsgroup under the Thread with the same name I am opening
this bug entry. Here is a history of the thread.
-- From Ed Merks
Patrick,
The value is checked before it's applied and can't be applied until it's valid.
But this BigDecimal case behaves oddly because the exception thrown by
new BigDecimal("badvalue")
has a null message and the property editor relies on returning a non-null
message string to indicate there is an error.
Please open a bugzilla which I'll fix like this:
### Eclipse Workspace Patch 1.0
#P org.eclipse.emf.edit.ui
Index: src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java
===================================================================
RCS file:
/cvsroot/tools/org.eclipse.emf/plugins/org.eclipse.emf.edit.ui/src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java,v
retrieving revision 1.10
diff -u -r1.10 PropertyDescriptor.java
--- src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java 21 Mar 2006
16:42:30 -0000 1.10
+++ src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java 20 Apr 2006
11:59:10 -0000
@@ -162,7 +162,8 @@
}
catch (Exception exception)
{
- return exception.getMessage();
+ String message = exception.getMessage();
+ return message == null ? exception.toString() : message;
}
}
Diagnostic diagnostic =
Diagnostician.INSTANCE.validate(EDataTypeCellEditor.this.eDataType, value);
Patrick Sodre wrote:
Hi,
It seems that if the user inputs an invalid parameter that gets created from
"createFromString" the Editor locks-up until the user explicitly calls "restore
Default Value".
Is this the expected behavior or could something better be done? For
instance if an exception is thrown restore the value back to what it was before
after displaying a pop-up error message.
I understand that for DataTypes defined by the user he/she should take care
of catching the exceptions but for the default ones like BigInteger/BigDecimal
I think the EMF runtime could do some of the grunt work...
If you think this is something worth pursuing I could post an entry in
Bugzilla.
Regards,
Patrick Sodre
Below is the stack trace that I got from the Editor...
java.lang.NumberFormatException
at java.math.BigDecimal.<init>(BigDecimal.java:368)
at java.math.BigDecimal.<init>(BigDecimal.java:647)
at
org.eclipse.emf.ecore.impl.EcoreFactoryImpl.createEBigDecimalFromString(EcoreFactoryImpl.java:559)
at
org.eclipse.emf.ecore.impl.EcoreFactoryImpl.createFromString(EcoreFactoryImpl.java:116)
at
org.eclipse.emf.edit.ui.provider.PropertyDescriptor$EDataTypeCellEditor.doGetValue(PropertyDescriptor.java:183)
at org.eclipse.jface.viewers.CellEditor.getValue(CellEditor.java:449)
at
org.eclipse.ui.views.properties.PropertySheetEntry.applyEditorValue(PropertySheetEntry.java:135)
at
org.eclipse.ui.views.properties.PropertySheetViewer.applyEditorValue(PropertySheetViewer.java:249)
at
------- Comment #1 From Ed Merks 2006-04-20 15:09:23 -0400 -------
The fix has been committed to CVS. Thanks for reporting this problem.
------- Comment #2 From Marcelo Paternostro 2006-04-27 10:44:24 -0400 -------
Fixed in the I200604270000 build
------- Comment #3 From Nick Boldt 2008-01-28 16:46:51 -0400 -------
Move to verified as per bug 206558.
Extracting Structural Information from Bug Reports (MSR 2008)
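The MSR 2008 paper cited in this caption describes splitting a raw report like the one above into structural elements: patches, stack traces, source code, and enumerations. A much-reduced sketch of the idea, with illustrative regular expressions (not the tool's actual patterns):

# Reduced sketch: detect Java stack-trace frames and unified-diff
# markers in free-form bug report text. Patterns are illustrative.
import re

FRAME = re.compile(r"^\s*at\s+[\w.$<>]+\([\w.]+:\d+\)", re.MULTILINE)
DIFF_MARK = re.compile(r"^(\+\+\+ |--- |@@ )", re.MULTILINE)

def split_report(text):
    return {
        "stack_frames": FRAME.findall(text),
        "has_patch": bool(DIFF_MARK.search(text)),
    }

report = """Below is the stack trace that I got from the Editor...
java.lang.NumberFormatException
    at java.math.BigDecimal.<init>(BigDecimal.java:368)
    at org.eclipse.jface.viewers.CellEditor.getValue(CellEditor.java:449)
"""
print(split_report(report))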
at
org.eclipse.emf.ecore.impl.EcoreFactoryImpl.createFromString(EcoreFactoryImpl.java:116)
at
org.eclipse.emf.edit.ui.provider.PropertyDescriptor$EDataTypeCellEditor.doGetValue(PropertyDescriptor.java:183)
at org.eclipse.jface.viewers.CellEditor.getValue(CellEditor.java:449)
at
org.eclipse.ui.views.properties.PropertySheetEntry.applyEditorValue(PropertySheetEntry.java:135)
at
org.eclipse.ui.views.properties.PropertySheetViewer.applyEditorValue(PropertySheetViewer.java:249)
at
------- Comment #1 From Ed Merks 2006-04-20 15:09:23 -0400 -------
The fix has been committed to CVS. Thanks for reporting this problem.
------- Comment #2 From Marcelo Paternostro 2006-04-27 10:44:24 -0400 -------
Fixed in the I200604270000 built
------- Comment #3 From Nick Boldt 2008-01-28 16:46:51 -0400 -------
Move to verified as per bug 206558.
SOURCE CODE
Extracting Structural Information from Bug Reports (MSR 2008)
METADATA
Bug 137808
Summary: Exceptions from createFromString lock-up the editor
Product: [Modeling] EMF Reporter: Patrick Sodre <psodre@gmail.com>
Component: Core Assignee: Marcelo Paternostro <marcelop@ca.ibm.com>
Status: VERIFIED FIXED QA Contact:
Severity: normal
Priority: P3 CC: merks@ca.ibm.com
Version: 2.2
Target Milestone: ---
Hardware: PC
OS: Windows XP
Whiteboard:
Description:
Opened: 2006-04-20 14:25 -
0400
As discussed on the newsgroup under the Thread with the same name I am opening
this bug entry. Here is a history of the thread.
-- From Ed Merks
Patrick,
The value is checked before it's applied and can't be applied until it's valid.
But this BigDecimal cases behaves oddly because the exception thrown by
new BigDecimal("badvalue")
has a null message and the property editor relies on returning a non-null
message string to indicate there is an error.
Please open a bugzilla which I'll fix like this:
### Eclipse Workspace Patch 1.0
#P org.eclipse.emf.edit.ui
Index: src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java
===================================================================
RCS file:
/cvsroot/tools/org.eclipse.emf/plugins/org.eclipse.emf.edit.ui/src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java,v
retrieving revision 1.10
diff -u -r1.10 PropertyDescriptor.java
--- src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java 21 Mar 2006
16:42:30 -0000 1.10
+++ src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java 20 Apr 2006
11:59:10 -0000
@@ -162,7 +162,8 @@
}
catch (Exception exception)
{
- return exception.getMessage();
+ String message = exception.getMessage();
+ return message == null ? exception.toString() : message;
}
}
Diagnostic diagnostic =
Diagnostician.INSTANCE.validate(EDataTypeCellEditor.this.eDataType, value);
Patrick Sodre wrote:
Hi,
It seems that if the user inputs an invalid parameter that gets created from
"createFromString" the Editor locks-up until the user explicitly calls "restore
Default Value".
Is this the expected behavior or could something better be done? For
instance if an exception is thrown restore the value back to what it was before
after displaying a pop-up error message.
I understand that for DataTypes defined by the user he/she should take care
of catching the exceptions but for the default ones like BigInteger/BigDecimal
I think the EMF runtime could do some of the grunt work...
If you think this is something worth pursuing I could post an entry in
Bugzilla.
Regards,
Patrick Sodre
Below is the stack trace that I got from the Editor...
java.lang.NumberFormatException
at java.math.BigDecimal.<init>(BigDecimal.java:368)
at java.math.BigDecimal.<init>(BigDecimal.java:647)
at
org.eclipse.emf.ecore.impl.EcoreFactoryImpl.createEBigDecimalFromString(EcoreFactoryImpl.java:559)
at
org.eclipse.emf.ecore.impl.EcoreFactoryImpl.createFromString(EcoreFactoryImpl.java:116)
at
org.eclipse.emf.edit.ui.provider.PropertyDescriptor$EDataTypeCellEditor.doGetValue(PropertyDescriptor.java:183)
at org.eclipse.jface.viewers.CellEditor.getValue(CellEditor.java:449)
at
org.eclipse.ui.views.properties.PropertySheetEntry.applyEditorValue(PropertySheetEntry.java:135)
at
org.eclipse.ui.views.properties.PropertySheetViewer.applyEditorValue(PropertySheetViewer.java:249)
at
------- Comment #1 From Ed Merks 2006-04-20 15:09:23 -0400 -------
The fix has been committed to CVS. Thanks for reporting this problem.
------- Comment #2 From Marcelo Paternostro 2006-04-27 10:44:24 -0400 -------
Fixed in the I200604270000 built
------- Comment #3 From Nick Boldt 2008-01-28 16:46:51 -0400 -------
Move to verified as per bug 206558.
SOURCE CODE
PATCHES
Extracting Structural Information from Bug Reports (MSR 2008)
METADATA
Bug 137808
Summary: Exceptions from createFromString lock-up the editor
Product: [Modeling] EMF Reporter: Patrick Sodre <psodre@gmail.com>
Component: Core Assignee: Marcelo Paternostro <marcelop@ca.ibm.com>
Status: VERIFIED FIXED QA Contact:
Severity: normal
Priority: P3 CC: merks@ca.ibm.com
Version: 2.2
Target Milestone: ---
Hardware: PC
OS: Windows XP
Whiteboard:
Description:
Opened: 2006-04-20 14:25 -
0400
As discussed on the newsgroup under the Thread with the same name I am opening
this bug entry. Here is a history of the thread.
-- From Ed Merks
Patrick,
The value is checked before it's applied and can't be applied until it's valid.
But this BigDecimal cases behaves oddly because the exception thrown by
new BigDecimal("badvalue")
has a null message and the property editor relies on returning a non-null
message string to indicate there is an error.
Please open a bugzilla which I'll fix like this:
### Eclipse Workspace Patch 1.0
#P org.eclipse.emf.edit.ui
Index: src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java
===================================================================
RCS file:
/cvsroot/tools/org.eclipse.emf/plugins/org.eclipse.emf.edit.ui/src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java,v
retrieving revision 1.10
diff -u -r1.10 PropertyDescriptor.java
--- src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java 21 Mar 2006
16:42:30 -0000 1.10
+++ src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java 20 Apr 2006
11:59:10 -0000
@@ -162,7 +162,8 @@
}
catch (Exception exception)
{
- return exception.getMessage();
+ String message = exception.getMessage();
+ return message == null ? exception.toString() : message;
}
}
Diagnostic diagnostic =
Diagnostician.INSTANCE.validate(EDataTypeCellEditor.this.eDataType, value);
Patrick Sodre wrote:
Hi,
It seems that if the user inputs an invalid parameter that gets created from
"createFromString" the Editor locks-up until the user explicitly calls "restore
Default Value".
Is this the expected behavior or could something better be done? For
instance if an exception is thrown restore the value back to what it was before
after displaying a pop-up error message.
I understand that for DataTypes defined by the user he/she should take care
of catching the exceptions but for the default ones like BigInteger/BigDecimal
I think the EMF runtime could do some of the grunt work...
If you think this is something worth pursuing I could post an entry in
Bugzilla.
Regards,
Patrick Sodre
Below is the stack trace that I got from the Editor...
java.lang.NumberFormatException
at java.math.BigDecimal.<init>(BigDecimal.java:368)
at java.math.BigDecimal.<init>(BigDecimal.java:647)
at
org.eclipse.emf.ecore.impl.EcoreFactoryImpl.createEBigDecimalFromString(EcoreFactoryImpl.java:559)
at
org.eclipse.emf.ecore.impl.EcoreFactoryImpl.createFromString(EcoreFactoryImpl.java:116)
at
org.eclipse.emf.edit.ui.provider.PropertyDescriptor$EDataTypeCellEditor.doGetValue(PropertyDescriptor.java:183)
at org.eclipse.jface.viewers.CellEditor.getValue(CellEditor.java:449)
at
org.eclipse.ui.views.properties.PropertySheetEntry.applyEditorValue(PropertySheetEntry.java:135)
at
org.eclipse.ui.views.properties.PropertySheetViewer.applyEditorValue(PropertySheetViewer.java:249)
at
------- Comment #1 From Ed Merks 2006-04-20 15:09:23 -0400 -------
The fix has been committed to CVS. Thanks for reporting this problem.
------- Comment #2 From Marcelo Paternostro 2006-04-27 10:44:24 -0400 -------
Fixed in the I200604270000 built
------- Comment #3 From Nick Boldt 2008-01-28 16:46:51 -0400 -------
Move to verified as per bug 206558.
SCREENSHOTS
SOURCE CODE
PATCHES
Extracting Structural Information from Bug Reports (MSR 2008)
METADATA
Bug 137808
Summary: Exceptions from createFromString lock-up the editor
Product: [Modeling] EMF Reporter: Patrick Sodre <psodre@gmail.com>
Component: Core Assignee: Marcelo Paternostro <marcelop@ca.ibm.com>
Status: VERIFIED FIXED QA Contact:
Severity: normal
Priority: P3 CC: merks@ca.ibm.com
Version: 2.2
Target Milestone: ---
Hardware: PC
OS: Windows XP
Whiteboard:
Description:
Opened: 2006-04-20 14:25 -
0400
As discussed on the newsgroup under the Thread with the same name I am opening
this bug entry. Here is a history of the thread.
-- From Ed Merks
Patrick,
The value is checked before it's applied and can't be applied until it's valid.
But this BigDecimal cases behaves oddly because the exception thrown by
new BigDecimal("badvalue")
has a null message and the property editor relies on returning a non-null
message string to indicate there is an error.
Please open a bugzilla which I'll fix like this:
### Eclipse Workspace Patch 1.0
#P org.eclipse.emf.edit.ui
Index: src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java
===================================================================
RCS file:
/cvsroot/tools/org.eclipse.emf/plugins/org.eclipse.emf.edit.ui/src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java,v
retrieving revision 1.10
diff -u -r1.10 PropertyDescriptor.java
--- src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java 21 Mar 2006
16:42:30 -0000 1.10
+++ src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java 20 Apr 2006
11:59:10 -0000
@@ -162,7 +162,8 @@
}
catch (Exception exception)
{
- return exception.getMessage();
+ String message = exception.getMessage();
+ return message == null ? exception.toString() : message;
}
}
Diagnostic diagnostic =
Diagnostician.INSTANCE.validate(EDataTypeCellEditor.this.eDataType, value);
Patrick Sodre wrote:
Hi,
It seems that if the user inputs an invalid parameter that gets created from
"createFromString" the Editor locks-up until the user explicitly calls "restore
Default Value".
Is this the expected behavior or could something better be done? For
instance if an exception is thrown restore the value back to what it was before
after displaying a pop-up error message.
I understand that for DataTypes defined by the user he/she should take care
of catching the exceptions but for the default ones like BigInteger/BigDecimal
I think the EMF runtime could do some of the grunt work...
If you think this is something worth pursuing I could post an entry in
Bugzilla.
Regards,
Patrick Sodre
Below is the stack trace that I got from the Editor...
java.lang.NumberFormatException
at java.math.BigDecimal.<init>(BigDecimal.java:368)
at java.math.BigDecimal.<init>(BigDecimal.java:647)
at
org.eclipse.emf.ecore.impl.EcoreFactoryImpl.createEBigDecimalFromString(EcoreFactoryImpl.java:559)
at
org.eclipse.emf.ecore.impl.EcoreFactoryImpl.createFromString(EcoreFactoryImpl.java:116)
at
org.eclipse.emf.edit.ui.provider.PropertyDescriptor$EDataTypeCellEditor.doGetValue(PropertyDescriptor.java:183)
at org.eclipse.jface.viewers.CellEditor.getValue(CellEditor.java:449)
at
org.eclipse.ui.views.properties.PropertySheetEntry.applyEditorValue(PropertySheetEntry.java:135)
at
org.eclipse.ui.views.properties.PropertySheetViewer.applyEditorValue(PropertySheetViewer.java:249)
at
------- Comment #1 From Ed Merks 2006-04-20 15:09:23 -0400 -------
The fix has been committed to CVS. Thanks for reporting this problem.
------- Comment #2 From Marcelo Paternostro 2006-04-27 10:44:24 -0400 -------
Fixed in the I200604270000 built
------- Comment #3 From Nick Boldt 2008-01-28 16:46:51 -0400 -------
Move to verified as per bug 206558.
SCREENSHOTS
SOURCE CODE
PATCHES
STACK TRACES
Extracting Structural Information from Bug Reports (MSR 2008)
METADATA
3.6 Order of Extraction
PATCHES → STACK TRACES → SOURCE CODE → ENUMERATIONS
(Figure: a raw report as INPUT; the detected PATCH, TRACE, and CODE fragments labeled in the OUTPUT.)
Figure 3.10: We extract structural elements in a fixed sequence.
The order in which the detection and extraction of elements is executed is of great importance, because several structural elements interfere:
• Patches vs. Enumerations
Enumerations, and itemizations in particular, interfere with the hunk lines in patches: both use the symbols “+” and “-”. Patches are therefore extracted first, as the sketch below illustrates.
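A minimal sketch of this fixed-order idea in Java (deliberately simplified, hypothetical patterns and class names, not the actual INFOZILLA code): patches are detected and cut out of the report text first, so that their hunk lines can no longer be mistaken for enumeration items.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FixedOrderSketch {

    // Deliberately simplified, hypothetical patterns: a patch block starts with
    // an "Index:" line and continues over lines beginning with '=', '@', '+',
    // '-' or a space.
    private static final Pattern PATCH =
        Pattern.compile("(?m)^Index: .*$(?:\\n(?:[-=@+ ].*)?$)*");

    // An enumeration item is a line starting with '-', '+' or '*'.
    private static final Pattern ENUM_ITEM =
        Pattern.compile("(?m)^[-+*] .*$");

    public static void main(String[] args) {
        String report =
              "We have the following problem:\n"
            + "- first you have to do\n"
            + "- then you must do\n"
            + "We propose the following patch file to be used:\n"
            + "Index: someFile.java\n"
            + "===================================================================\n"
            + "@@ -7,13 +7,14 @@\n"
            + " This is a sample context line\n"
            + "- This line will be removed\n"
            + "+ this line will be added instead\n";

        // Step 1: detect and cut out patches first.
        String remaining = PATCH.matcher(report).replaceAll("[PATCH extracted]");

        // Step 2: only now look for enumerations; the patch hunk lines are gone
        // and can no longer be mistaken for list items.
        Matcher items = ENUM_ITEM.matcher(remaining);
        while (items.find()) {
            System.out.println("enumeration item: " + items.group());
        }
    }
}

Running the two detectors in the opposite order would wrongly claim the two hunk lines of the patch as enumeration items.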
The evaluation is split into two parts: first, we focus on correctly identifying the presence of enumerations, patches, stack traces, and source code in bug reports. Knowing the reliability of our approach, we can then assess how well the detected elements are extracted by our methods.
Evaluation Setup
We parsed 161,500 bug reports from the ECLIPSE project which were submitted between October 2001 and December 2007. For each report, INFOZILLA verified the presence of each of the four structural element types. For each element, it classified the report into one of two bins: B1 (report has Element) and B2 (report does not have Element).
Figure 3.11: For each element we classified the report into two bins.
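A minimal sketch of this binning step, with crude, hypothetical stand-in detectors (the real classification uses the full extractors described above):

import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class ElementBinning {

    enum Element { PATCH, STACK_TRACE, SOURCE_CODE, ENUMERATION }

    // Hypothetical, deliberately crude detectors standing in for the extractors.
    private static final Map<Element, Predicate<String>> DETECTORS = Map.of(
        Element.PATCH,       r -> r.contains("Index: "),
        Element.STACK_TRACE, r -> r.contains("at ") && r.contains("Exception"),
        Element.SOURCE_CODE, r -> r.contains("{") && r.contains("}"),
        Element.ENUMERATION, r -> r.lines().anyMatch(l -> l.startsWith("- ")));

    public static void main(String[] args) {
        List<String> reports = List.of(
            "Index: someFile.java\n@@ -7,13 +7,14 @@",
            "java.lang.NumberFormatException\nat java.math.BigDecimal.<init>",
            "- first you have to do\n- then you must do");

        // For each element type, count the reports falling into B1 and B2.
        for (Element e : Element.values()) {
            long b1 = reports.stream().filter(DETECTORS.get(e)).count();
            long b2 = reports.size() - b1;
            System.out.printf("%s: B1 (has element) = %d, B2 (does not) = %d%n", e, b1, b2);
        }
    }
}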
Figure: the structural elements extracted from the master report alone are compared against those extracted from the extended report, i.e., the master report together with all of its duplicates.
Average per master report (ECLIPSE)

Information item                    Master   Extended   Change*
Predefined fields
– product                            1.000      1.127    +0.127
– component                          1.000      1.287    +0.287
– operating system                   1.000      1.631    +0.631
– reported platform                  1.000      1.241    +0.241
– version                            0.927      1.413    +0.486
– reporter                           1.000      2.412    +1.412
– priority                           1.000      1.291    +0.291
– target milestone                   0.654      0.794    +0.140
Patches
– total                              1.828      1.942    +0.113
– unique: patched files              1.061      1.124    +0.062
Screenshots
– total                              0.139      0.285    +0.145
– unique: filename, filesize         0.138      0.281    +0.143
Stacktraces
– total                              0.504      1.422    +0.918
– unique: exception                  0.195      0.314    +0.118
– unique: exception, top frame       0.223      0.431    +0.207
– unique: exception, top 2 frames    0.229      0.458    +0.229
– unique: exception, top 3 frames    0.234      0.483    +0.248
– unique: exception, top 4 frames    0.239      0.504    +0.265
– unique: exception, top 5 frames    0.244      0.525    +0.281
* For all information items the increase is significant at p < .001.
Table 5.1: Average amount of information added by duplicates (ECLIPSE).
Average per master report (MOZILLA)

Information item                    Master   Extended   Change*
Predefined fields
– product                            1.000      1.400    +0.400
– component                          1.000      1.953    +0.953
– operating system                   1.000      2.102    +1.102
– reported platform                  1.000      1.544    +0.544
– version                            0.814      0.979    +0.165
– reporter                           1.000      3.705    +2.705
– priority                           0.377      0.499    +0.122
– target milestone                   0.433      0.558    +0.125
Patches
– total                              5.038      5.184    +0.146
– unique: patched files              2.003      2.067    +0.064
Screenshots
– total                              0.200      0.391    +0.191
– unique: filename, filesize         0.197      0.385    +0.187
Stacktraces
– total                              0.100      0.185    +0.085
– unique: exception                  0.033      0.047    +0.014
– unique: exception, top frame       0.069      0.130    +0.061
– unique: exception, top 2 frames    0.072      0.136    +0.064
– unique: exception, top 3 frames    0.073      0.139    +0.066
– unique: exception, top 4 frames    0.074      0.141    +0.067
– unique: exception, top 5 frames    0.075      0.143    +0.068
* For all information items the increase is significant at p < .001.
Table 5.2: Average amount of information added by duplicates (MOZILLA).
We compared stack traces considering the exception that was thrown and the top one to five frames of the call stack.
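One way to implement this uniqueness measure, as a minimal sketch with hypothetical types (the "unique: exception, top n frames" rows in Tables 5.1 and 5.2 correspond to n = 1 ... 5):

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TraceUniqueness {

    // A parsed stack trace: the exception name plus its frames, top frame first.
    record Trace(String exception, List<String> frames) {
        // Two traces count as the same item when their keys for a given n match.
        String key(int n) {
            return exception + "|" +
                   String.join("|", frames.subList(0, Math.min(n, frames.size())));
        }
    }

    static int countUnique(List<Trace> traces, int n) {
        Set<String> keys = new HashSet<>();
        for (Trace t : traces) keys.add(t.key(n));
        return keys.size();
    }

    public static void main(String[] args) {
        // Hypothetical example: two traces sharing exception and top frame only.
        Trace fromMaster = new Trace("java.lang.NumberFormatException",
            List.of("java.math.BigDecimal.<init>",
                    "EcoreFactoryImpl.createEBigDecimalFromString"));
        Trace fromDuplicate = new Trace("java.lang.NumberFormatException",
            List.of("java.math.BigDecimal.<init>",
                    "SomeOtherCaller.parse"));  // hypothetical frame

        List<Trace> all = List.of(fromMaster, fromDuplicate);
        System.out.println(countUnique(all, 1)); // 1 -- same exception and top frame
        System.out.println(countUnique(all, 2)); // 2 -- they diverge at the second frame
    }
}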
ECLIPSE MOZILLA
ADDITIONAL INFORMATION
Duplicate bug reports can provide useful additional information. For example, we can find up to three times as many stack traces, which are helpful in fixing bugs.
There is significant evidence of additional information in duplicate bug reports that is uniquely different from the information already reported.
PART 1
Is there extra information in duplicate reports, and if so, can we quantify how much?
PART 2
Is that extra information helpful for carrying out software engineering tasks?
The Triage Problem
A reporter submits a BUG report, a triager routes it to a developer, and the developer fixes the BUG (✓). As more and more BUG reports arrive, the triager has to decide for each one which developer should receive it.
Figure: every report, master or duplicate, is represented by its attributes A1, A2, ..., An and carries a class label (Class 1, Class 2, Class 3, ...), the developer the report was assigned to; each master is grouped with its duplicates.
“Whoever was assigned to the Master should have been assigned to any of the Duplicates.”
“Only the person who was originally assigned to a report can fix it.”
“Any person assigned to any of the reports in the duplicate group can provide a fix.”
Figure: master reports, sorted chronologically, are split into 11 folds; Run 1 trains on Fold 1 and tests on Fold 2, Run 2 trains on Folds 1-2 and tests on Fold 3, and so on, up to Run 10.
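A minimal sketch of this chronological split, under the assumption (matching the figure) that Run i trains on Folds 1 ... i and tests on Fold i+1:

import java.util.List;
import java.util.stream.IntStream;

public class ChronologicalFolds {

    public static void main(String[] args) {
        // Stand-ins for master reports, already sorted by submission date.
        List<Integer> masters = IntStream.rangeClosed(1, 110).boxed().toList();

        int folds = 11;
        int foldSize = masters.size() / folds;

        // Run i: train on folds 1..i, test on fold i+1 (10 runs over 11 folds).
        // Chronological order is preserved: no report from the future leaks
        // into the training data.
        for (int run = 1; run < folds; run++) {
            List<Integer> training = masters.subList(0, run * foldSize);
            List<Integer> testing  = masters.subList(run * foldSize, (run + 1) * foldSize);
            System.out.printf("Run %2d: train on %3d reports, test on %2d reports%n",
                              run, training.size(), testing.size());
        }
    }
}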
Table 6.1: Percentages of reports correctly triaged to ECLIPSE developers.

                                                   Run
Model  Result  Training      1      2      3      4      5      6      7      8      9     10    All
SVM    Top 1   Master    15.45  19.28  19.03  19.80  25.80  26.44  22.09  27.08  27.71  29.12  23.18
               Extended  18.39* 20.95  22.22* 21.46  27.84  28.48  23.37  30.52* 30.78* 30.52  25.45*
       Top 3   Master    32.44  37.42  40.87  39.72  46.10  46.36  38.95  44.70  48.53  47.25  42.23
               Extended  38.70* 42.78* 43.30  39.34  50.83* 49.55* 42.40* 50.32* 50.32  55.04* 46.25*
       Top 5   Master    41.89  46.87  47.38  47.64  54.66  56.96  47.51  52.36  56.58  56.45  50.83
               Extended  47.38* 52.11* 53.00* 51.85* 60.54* 59.90* 51.09* 58.11* 60.28* 65.26* 55.95*
Bayes  Top 1   Master    14.81  16.60  17.75  17.75  22.73  21.20  20.56  23.50  27.71  28.22  21.08
               Extended  15.45  17.11  20.56* 18.01  19.80* 19.80  22.99  27.08* 26.82  30.40* 21.80
       Top 3   Master    29.12  32.31  35.12  34.99  40.36  38.06  35.76  43.55  45.59  46.87  38.17
               Extended  36.53* 33.08  38.83* 35.50  39.08  39.08  39.97* 46.23  45.85  50.45* 40.46*
       Top 5   Master    38.44  42.40  45.72  45.21  50.70  47.64  44.06  51.85  54.92  55.17  47.61
               Extended  45.72* 44.70  48.02  43.55  48.91  50.45* 49.43* 55.30* 54.28  58.49* 49.88*
* Increase in accuracy is significant at p = .05.
Table 6.2: Percentages of reports correctly triaged to MOZILLA developers.

                                                   Run
Model  Result  Training      1      2      3      4      5      6      7      8      9     10    All
SVM    Top 1   Master    14.57  14.30  14.16  18.29  18.83  19.17  21.00  19.65  19.99  22.15  18.21
               Extended  15.31  14.43  17.95  19.44  19.78  19.51  21.82  23.10  18.29  19.31  18.90
       Top 3   Master    28.59  28.46  31.84  37.53  36.52  39.30  41.26  44.58  42.82  43.09  37.40
               Extended  32.38  30.15  36.86  39.70  37.26  40.72  43.29  47.83  42.48  39.36  39.00
       Top 5   Master    37.13  36.04  41.67  46.41  44.99  48.92  50.75  56.03  53.52  51.22  46.67
               Extended  42.48  39.77  46.07  49.80  49.05  54.27  53.32  60.57  54.74  49.66  49.98
Bayes  Top 1   Master    15.11  12.60  16.94  17.62  17.01  19.44  18.22  25.81  25.47  27.98  19.62
               Extended  15.24  13.75  18.50  20.39  19.78  23.51  23.31  26.22  24.46  25.88  21.10
       Top 3   Master    27.71  29.67  34.42  37.94  35.70  40.18  40.04  44.58  45.33  43.90  37.94
               Extended  32.11  29.40  36.72  39.50  39.70  44.24  44.24  48.85  45.87  44.17  40.48
       Top 5   Master    35.77  37.74  43.09  47.09  44.99  51.90  49.46  54.13  55.15  51.83  47.11
               Extended  40.72  39.63  45.05  49.66  48.58  54.47  54.74  59.49  55.76  53.52  50.16
Importantly, all but the Top 1 results using Naïve Bayes in the last column were significant, too. Thus, the results demonstrate that bug reports can be triaged more accurately when the training set is enlarged to include duplicate reports.
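A minimal sketch of how a Top 1/3/5 percentage like those in Tables 6.1 and 6.2 can be computed (hypothetical names): a report counts as correctly triaged when the actual developer appears among the classifier's k highest-ranked candidates.

import java.util.List;

public class TopKAccuracy {

    // ranked.get(i) is the classifier's candidate list for report i, best first;
    // actual.get(i) is the developer who really handled report i.
    static double topK(List<List<String>> ranked, List<String> actual, int k) {
        int hits = 0;
        for (int i = 0; i < actual.size(); i++) {
            List<String> candidates = ranked.get(i);
            if (candidates.subList(0, Math.min(k, candidates.size()))
                          .contains(actual.get(i))) {
                hits++;
            }
        }
        return 100.0 * hits / actual.size();
    }

    public static void main(String[] args) {
        // Hypothetical developer names and rankings for two reports.
        List<List<String>> ranked = List.of(
            List.of("dev-a", "dev-b", "dev-c"),
            List.of("dev-c", "dev-a", "dev-b"));
        List<String> actual = List.of("dev-b", "dev-c");

        System.out.println(topK(ranked, actual, 1)); // 50.0  -- one top-1 guess is right
        System.out.println(topK(ranked, actual, 3)); // 100.0 -- both within the top 3
    }
}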
ECLIPSE MOZILLA
The information contained in duplicate reports improves the accuracy of machine-learning algorithms when solving the bug triage problem.

More Related Content

What's hot

Optimization of different objective function in risk assessment system
Optimization of different objective function in risk assessment  systemOptimization of different objective function in risk assessment  system
Optimization of different objective function in risk assessment systemAlexander Decker
 
Computer Worms Based on Monitoring Replication and Damage: Experiment and Eva...
Computer Worms Based on Monitoring Replication and Damage: Experiment and Eva...Computer Worms Based on Monitoring Replication and Damage: Experiment and Eva...
Computer Worms Based on Monitoring Replication and Damage: Experiment and Eva...IOSRjournaljce
 
Improvement of Software Maintenance and Reliability using Data Mining Techniques
Improvement of Software Maintenance and Reliability using Data Mining TechniquesImprovement of Software Maintenance and Reliability using Data Mining Techniques
Improvement of Software Maintenance and Reliability using Data Mining Techniquesijdmtaiir
 
Machine learning in health data analytics and pharmacovigilance
Machine learning in health data analytics and pharmacovigilanceMachine learning in health data analytics and pharmacovigilance
Machine learning in health data analytics and pharmacovigilanceRevathi Boyina
 
Ids 014 anomaly detection
Ids 014 anomaly detectionIds 014 anomaly detection
Ids 014 anomaly detectionjyoti_lakhani
 
Metadata Analyser: measuring metadata quality
Metadata Analyser: measuring metadata qualityMetadata Analyser: measuring metadata quality
Metadata Analyser: measuring metadata qualityFrancisco Couto
 
User-centered Design of a PHR: Traditional Web Forms vs. Wizard Forms [5 Cr2 ...
User-centered Design of a PHR: Traditional Web Forms vs. Wizard Forms [5 Cr2 ...User-centered Design of a PHR: Traditional Web Forms vs. Wizard Forms [5 Cr2 ...
User-centered Design of a PHR: Traditional Web Forms vs. Wizard Forms [5 Cr2 ...Gunther Eysenbach
 
Machine Learning in Medicine A Primer
Machine Learning in Medicine A PrimerMachine Learning in Medicine A Primer
Machine Learning in Medicine A Primerijtsrd
 
IRJET - Neural Network based Leaf Disease Detection and Remedy Recommenda...
IRJET -  	  Neural Network based Leaf Disease Detection and Remedy Recommenda...IRJET -  	  Neural Network based Leaf Disease Detection and Remedy Recommenda...
IRJET - Neural Network based Leaf Disease Detection and Remedy Recommenda...IRJET Journal
 
AUTOMATED BUG TRIAGE USING ADVANCED DATA REDUCTION TECHNIQUES
AUTOMATED BUG TRIAGE USING ADVANCED DATA REDUCTION TECHNIQUESAUTOMATED BUG TRIAGE USING ADVANCED DATA REDUCTION TECHNIQUES
AUTOMATED BUG TRIAGE USING ADVANCED DATA REDUCTION TECHNIQUESJournal For Research
 

What's hot (13)

Optimization of different objective function in risk assessment system
Optimization of different objective function in risk assessment  systemOptimization of different objective function in risk assessment  system
Optimization of different objective function in risk assessment system
 
Computer Worms Based on Monitoring Replication and Damage: Experiment and Eva...
Computer Worms Based on Monitoring Replication and Damage: Experiment and Eva...Computer Worms Based on Monitoring Replication and Damage: Experiment and Eva...
Computer Worms Based on Monitoring Replication and Damage: Experiment and Eva...
 
Improvement of Software Maintenance and Reliability using Data Mining Techniques
Improvement of Software Maintenance and Reliability using Data Mining TechniquesImprovement of Software Maintenance and Reliability using Data Mining Techniques
Improvement of Software Maintenance and Reliability using Data Mining Techniques
 
Igene - PhD SICSA Poster Presentation
Igene - PhD SICSA Poster PresentationIgene - PhD SICSA Poster Presentation
Igene - PhD SICSA Poster Presentation
 
IJET-V2I6P28
IJET-V2I6P28IJET-V2I6P28
IJET-V2I6P28
 
Machine learning in health data analytics and pharmacovigilance
Machine learning in health data analytics and pharmacovigilanceMachine learning in health data analytics and pharmacovigilance
Machine learning in health data analytics and pharmacovigilance
 
Ids 014 anomaly detection
Ids 014 anomaly detectionIds 014 anomaly detection
Ids 014 anomaly detection
 
Metadata Analyser: measuring metadata quality
Metadata Analyser: measuring metadata qualityMetadata Analyser: measuring metadata quality
Metadata Analyser: measuring metadata quality
 
User-centered Design of a PHR: Traditional Web Forms vs. Wizard Forms [5 Cr2 ...
User-centered Design of a PHR: Traditional Web Forms vs. Wizard Forms [5 Cr2 ...User-centered Design of a PHR: Traditional Web Forms vs. Wizard Forms [5 Cr2 ...
User-centered Design of a PHR: Traditional Web Forms vs. Wizard Forms [5 Cr2 ...
 
50320130403010
5032013040301050320130403010
50320130403010
 
Machine Learning in Medicine A Primer
Machine Learning in Medicine A PrimerMachine Learning in Medicine A Primer
Machine Learning in Medicine A Primer
 
IRJET - Neural Network based Leaf Disease Detection and Remedy Recommenda...
IRJET -  	  Neural Network based Leaf Disease Detection and Remedy Recommenda...IRJET -  	  Neural Network based Leaf Disease Detection and Remedy Recommenda...
IRJET - Neural Network based Leaf Disease Detection and Remedy Recommenda...
 
AUTOMATED BUG TRIAGE USING ADVANCED DATA REDUCTION TECHNIQUES
AUTOMATED BUG TRIAGE USING ADVANCED DATA REDUCTION TECHNIQUESAUTOMATED BUG TRIAGE USING ADVANCED DATA REDUCTION TECHNIQUES
AUTOMATED BUG TRIAGE USING ADVANCED DATA REDUCTION TECHNIQUES
 

Similar to 10 Year Impact Award Presentation - Duplicate Bug Reports Considered Harmful ... Really?

Abstract.doc
Abstract.docAbstract.doc
Abstract.docbutest
 
Comparative performance analysis
Comparative performance analysisComparative performance analysis
Comparative performance analysiscsandit
 
Benchmarking machine learning techniques
Benchmarking machine learning techniquesBenchmarking machine learning techniques
Benchmarking machine learning techniquesijseajournal
 
A simplified predictive framework for cost evaluation to fault assessment usi...
A simplified predictive framework for cost evaluation to fault assessment usi...A simplified predictive framework for cost evaluation to fault assessment usi...
A simplified predictive framework for cost evaluation to fault assessment usi...IJECEIAES
 
Development of software defect prediction system using artificial neural network
Development of software defect prediction system using artificial neural networkDevelopment of software defect prediction system using artificial neural network
Development of software defect prediction system using artificial neural networkIJAAS Team
 
A Software Measurement Using Artificial Neural Network and Support Vector Mac...
A Software Measurement Using Artificial Neural Network and Support Vector Mac...A Software Measurement Using Artificial Neural Network and Support Vector Mac...
A Software Measurement Using Artificial Neural Network and Support Vector Mac...ijseajournal
 
Analytic Hierarchy Process-based Fuzzy Measurement to Quantify Vulnerabilitie...
Analytic Hierarchy Process-based Fuzzy Measurement to Quantify Vulnerabilitie...Analytic Hierarchy Process-based Fuzzy Measurement to Quantify Vulnerabilitie...
Analytic Hierarchy Process-based Fuzzy Measurement to Quantify Vulnerabilitie...IJCNCJournal
 
Analytic Hierarchy Process-based Fuzzy Measurement to Quantify Vulnerabilitie...
Analytic Hierarchy Process-based Fuzzy Measurement to Quantify Vulnerabilitie...Analytic Hierarchy Process-based Fuzzy Measurement to Quantify Vulnerabilitie...
Analytic Hierarchy Process-based Fuzzy Measurement to Quantify Vulnerabilitie...IJCNCJournal
 
ANALYTIC HIERARCHY PROCESS-BASED FUZZY MEASUREMENT TO QUANTIFY VULNERABILITIE...
ANALYTIC HIERARCHY PROCESS-BASED FUZZY MEASUREMENT TO QUANTIFY VULNERABILITIE...ANALYTIC HIERARCHY PROCESS-BASED FUZZY MEASUREMENT TO QUANTIFY VULNERABILITIE...
ANALYTIC HIERARCHY PROCESS-BASED FUZZY MEASUREMENT TO QUANTIFY VULNERABILITIE...IJCNCJournal
 
Predicting Fault-Prone Files using Machine Learning
Predicting Fault-Prone Files using Machine LearningPredicting Fault-Prone Files using Machine Learning
Predicting Fault-Prone Files using Machine LearningGuido A. Ciollaro
 
Using Fuzzy Clustering and Software Metrics to Predict Faults in large Indust...
Using Fuzzy Clustering and Software Metrics to Predict Faults in large Indust...Using Fuzzy Clustering and Software Metrics to Predict Faults in large Indust...
Using Fuzzy Clustering and Software Metrics to Predict Faults in large Indust...IOSR Journals
 
Bug Triage: An Automated Process
Bug Triage: An Automated ProcessBug Triage: An Automated Process
Bug Triage: An Automated ProcessIRJET Journal
 
Software testing defect prediction model a practical approach
Software testing defect prediction model   a practical approachSoftware testing defect prediction model   a practical approach
Software testing defect prediction model a practical approacheSAT Journals
 
Advancing Foundation and Practice of Software Analytics
Advancing Foundation and Practice of Software AnalyticsAdvancing Foundation and Practice of Software Analytics
Advancing Foundation and Practice of Software AnalyticsTao Xie
 
Practical Guidelines to Improve Defect Prediction Model – A Review
Practical Guidelines to Improve Defect Prediction Model – A ReviewPractical Guidelines to Improve Defect Prediction Model – A Review
Practical Guidelines to Improve Defect Prediction Model – A Reviewinventionjournals
 
A Systems Approach To Qualitative Data Management And Analysis
A Systems Approach To Qualitative Data Management And AnalysisA Systems Approach To Qualitative Data Management And Analysis
A Systems Approach To Qualitative Data Management And AnalysisMichele Thomas
 
Software Cost Estimation Using Clustering and Ranking Scheme
Software Cost Estimation Using Clustering and Ranking SchemeSoftware Cost Estimation Using Clustering and Ranking Scheme
Software Cost Estimation Using Clustering and Ranking SchemeEditor IJMTER
 
A Survey on Bug Tracking System for Effective Bug Clearance
A Survey on Bug Tracking System for Effective Bug ClearanceA Survey on Bug Tracking System for Effective Bug Clearance
A Survey on Bug Tracking System for Effective Bug ClearanceIRJET Journal
 

Similar to 10 Year Impact Award Presentation - Duplicate Bug Reports Considered Harmful ... Really? (20)

Abstract.doc
Abstract.docAbstract.doc
Abstract.doc
 
Comparative performance analysis
Comparative performance analysisComparative performance analysis
Comparative performance analysis
 
Benchmarking machine learning techniques
Benchmarking machine learning techniquesBenchmarking machine learning techniques
Benchmarking machine learning techniques
 
A simplified predictive framework for cost evaluation to fault assessment usi...
A simplified predictive framework for cost evaluation to fault assessment usi...A simplified predictive framework for cost evaluation to fault assessment usi...
A simplified predictive framework for cost evaluation to fault assessment usi...
 
Development of software defect prediction system using artificial neural network
Development of software defect prediction system using artificial neural networkDevelopment of software defect prediction system using artificial neural network
Development of software defect prediction system using artificial neural network
 
A Software Measurement Using Artificial Neural Network and Support Vector Mac...
A Software Measurement Using Artificial Neural Network and Support Vector Mac...A Software Measurement Using Artificial Neural Network and Support Vector Mac...
A Software Measurement Using Artificial Neural Network and Support Vector Mac...
 
ONE HIDDEN LAYER ANFIS MODEL FOR OOS DEVELOPMENT EFFORT ESTIMATION
ONE HIDDEN LAYER ANFIS MODEL FOR OOS DEVELOPMENT EFFORT ESTIMATIONONE HIDDEN LAYER ANFIS MODEL FOR OOS DEVELOPMENT EFFORT ESTIMATION
ONE HIDDEN LAYER ANFIS MODEL FOR OOS DEVELOPMENT EFFORT ESTIMATION
 
Analytic Hierarchy Process-based Fuzzy Measurement to Quantify Vulnerabilitie...
Analytic Hierarchy Process-based Fuzzy Measurement to Quantify Vulnerabilitie...Analytic Hierarchy Process-based Fuzzy Measurement to Quantify Vulnerabilitie...
Analytic Hierarchy Process-based Fuzzy Measurement to Quantify Vulnerabilitie...
 
Analytic Hierarchy Process-based Fuzzy Measurement to Quantify Vulnerabilitie...
Analytic Hierarchy Process-based Fuzzy Measurement to Quantify Vulnerabilitie...Analytic Hierarchy Process-based Fuzzy Measurement to Quantify Vulnerabilitie...
Analytic Hierarchy Process-based Fuzzy Measurement to Quantify Vulnerabilitie...
 
ANALYTIC HIERARCHY PROCESS-BASED FUZZY MEASUREMENT TO QUANTIFY VULNERABILITIE...
ANALYTIC HIERARCHY PROCESS-BASED FUZZY MEASUREMENT TO QUANTIFY VULNERABILITIE...ANALYTIC HIERARCHY PROCESS-BASED FUZZY MEASUREMENT TO QUANTIFY VULNERABILITIE...
ANALYTIC HIERARCHY PROCESS-BASED FUZZY MEASUREMENT TO QUANTIFY VULNERABILITIE...
 
Predicting Fault-Prone Files using Machine Learning
Predicting Fault-Prone Files using Machine LearningPredicting Fault-Prone Files using Machine Learning
Predicting Fault-Prone Files using Machine Learning
 
Using Fuzzy Clustering and Software Metrics to Predict Faults in large Indust...
Using Fuzzy Clustering and Software Metrics to Predict Faults in large Indust...Using Fuzzy Clustering and Software Metrics to Predict Faults in large Indust...
Using Fuzzy Clustering and Software Metrics to Predict Faults in large Indust...
 
Bug Triage: An Automated Process
Bug Triage: An Automated ProcessBug Triage: An Automated Process
Bug Triage: An Automated Process
 
Software testing defect prediction model a practical approach
Software testing defect prediction model   a practical approachSoftware testing defect prediction model   a practical approach
Software testing defect prediction model a practical approach
 
Advancing Foundation and Practice of Software Analytics
Advancing Foundation and Practice of Software AnalyticsAdvancing Foundation and Practice of Software Analytics
Advancing Foundation and Practice of Software Analytics
 
J034057065
J034057065J034057065
J034057065
 
Practical Guidelines to Improve Defect Prediction Model – A Review
Practical Guidelines to Improve Defect Prediction Model – A ReviewPractical Guidelines to Improve Defect Prediction Model – A Review
Practical Guidelines to Improve Defect Prediction Model – A Review
 
A Systems Approach To Qualitative Data Management And Analysis
A Systems Approach To Qualitative Data Management And AnalysisA Systems Approach To Qualitative Data Management And Analysis
A Systems Approach To Qualitative Data Management And Analysis
 
Software Cost Estimation Using Clustering and Ranking Scheme
Software Cost Estimation Using Clustering and Ranking SchemeSoftware Cost Estimation Using Clustering and Ranking Scheme
Software Cost Estimation Using Clustering and Ranking Scheme
 
A Survey on Bug Tracking System for Effective Bug Clearance
A Survey on Bug Tracking System for Effective Bug ClearanceA Survey on Bug Tracking System for Effective Bug Clearance
A Survey on Bug Tracking System for Effective Bug Clearance
 

More from Nicolas Bettenburg

Ph.D. Dissertation - Studying the Impact of Developer Communication on the Qu...
Ph.D. Dissertation - Studying the Impact of Developer Communication on the Qu...Ph.D. Dissertation - Studying the Impact of Developer Communication on the Qu...
Ph.D. Dissertation - Studying the Impact of Developer Communication on the Qu...Nicolas Bettenburg
 
Think Locally, Act Gobally - Improving Defect and Effort Prediction Models
Think Locally, Act Gobally - Improving Defect and Effort Prediction ModelsThink Locally, Act Gobally - Improving Defect and Effort Prediction Models
Think Locally, Act Gobally - Improving Defect and Effort Prediction ModelsNicolas Bettenburg
 
Mining Development Repositories to Study the Impact of Collaboration on Softw...
Mining Development Repositories to Study the Impact of Collaboration on Softw...Mining Development Repositories to Study the Impact of Collaboration on Softw...
Mining Development Repositories to Study the Impact of Collaboration on Softw...Nicolas Bettenburg
 
Using Fuzzy Code Search to Link Code Fragments in Discussions to Source Code
Using Fuzzy Code Search to Link Code Fragments in Discussions to Source CodeUsing Fuzzy Code Search to Link Code Fragments in Discussions to Source Code
Using Fuzzy Code Search to Link Code Fragments in Discussions to Source CodeNicolas Bettenburg
 
A Lightweight Approach to Uncover Technical Information in Unstructured Data
A Lightweight Approach to Uncover Technical Information in Unstructured DataA Lightweight Approach to Uncover Technical Information in Unstructured Data
A Lightweight Approach to Uncover Technical Information in Unstructured DataNicolas Bettenburg
 
Managing Community Contributions: Lessons Learned from a Case Study on Andro...
Managing Community Contributions:  Lessons Learned from a Case Study on Andro...Managing Community Contributions:  Lessons Learned from a Case Study on Andro...
Managing Community Contributions: Lessons Learned from a Case Study on Andro...Nicolas Bettenburg
 
Studying the impact of Social Structures on Software Quality
Studying the impact of Social Structures on Software QualityStudying the impact of Social Structures on Software Quality
Studying the impact of Social Structures on Software QualityNicolas Bettenburg
 
An Empirical Study on Inconsistent Changes to Code Clones at Release Level
An Empirical Study on Inconsistent Changes to Code Clones at Release LevelAn Empirical Study on Inconsistent Changes to Code Clones at Release Level
An Empirical Study on Inconsistent Changes to Code Clones at Release LevelNicolas Bettenburg
 
An Empirical Study on the Risks of Using Off-the-Shelf Techniques for Process...
An Empirical Study on the Risks of Using Off-the-Shelf Techniques for Process...An Empirical Study on the Risks of Using Off-the-Shelf Techniques for Process...
An Empirical Study on the Risks of Using Off-the-Shelf Techniques for Process...Nicolas Bettenburg
 
Finding Paths in Large Spaces - A* and Hierarchical A*
Finding Paths in Large Spaces - A* and Hierarchical A*Finding Paths in Large Spaces - A* and Hierarchical A*
Finding Paths in Large Spaces - A* and Hierarchical A*Nicolas Bettenburg
 
Automatic Identification of Bug Introducing Changes
Automatic Identification of Bug Introducing ChangesAutomatic Identification of Bug Introducing Changes
Automatic Identification of Bug Introducing ChangesNicolas Bettenburg
 
Cloning Considered Harmful Considered Harmful
Cloning Considered Harmful Considered HarmfulCloning Considered Harmful Considered Harmful
Cloning Considered Harmful Considered HarmfulNicolas Bettenburg
 
Predictors of Customer Perceived Quality
Predictors of Customer Perceived QualityPredictors of Customer Perceived Quality
Predictors of Customer Perceived QualityNicolas Bettenburg
 
Extracting Structural Information from Bug Reports.
Extracting Structural Information from Bug Reports.Extracting Structural Information from Bug Reports.
Extracting Structural Information from Bug Reports.Nicolas Bettenburg
 
Computing Accuracy Precision And Recall
Computing Accuracy Precision And RecallComputing Accuracy Precision And Recall
Computing Accuracy Precision And RecallNicolas Bettenburg
 
Duplicate Bug Reports Considered Harmful ... Really?
Duplicate Bug Reports Considered Harmful ... Really?Duplicate Bug Reports Considered Harmful ... Really?
Duplicate Bug Reports Considered Harmful ... Really?Nicolas Bettenburg
 
The Quality of Bug Reports in Eclipse ETX'07
The Quality of Bug Reports in Eclipse ETX'07The Quality of Bug Reports in Eclipse ETX'07
The Quality of Bug Reports in Eclipse ETX'07Nicolas Bettenburg
 

More from Nicolas Bettenburg (20)

Ph.D. Dissertation - Studying the Impact of Developer Communication on the Qu...
Ph.D. Dissertation - Studying the Impact of Developer Communication on the Qu...Ph.D. Dissertation - Studying the Impact of Developer Communication on the Qu...
Ph.D. Dissertation - Studying the Impact of Developer Communication on the Qu...
 
Think Locally, Act Gobally - Improving Defect and Effort Prediction Models
Think Locally, Act Gobally - Improving Defect and Effort Prediction ModelsThink Locally, Act Gobally - Improving Defect and Effort Prediction Models
Think Locally, Act Gobally - Improving Defect and Effort Prediction Models
 
Mining Development Repositories to Study the Impact of Collaboration on Softw...
Mining Development Repositories to Study the Impact of Collaboration on Softw...Mining Development Repositories to Study the Impact of Collaboration on Softw...
Mining Development Repositories to Study the Impact of Collaboration on Softw...
 
Using Fuzzy Code Search to Link Code Fragments in Discussions to Source Code
Using Fuzzy Code Search to Link Code Fragments in Discussions to Source CodeUsing Fuzzy Code Search to Link Code Fragments in Discussions to Source Code
Using Fuzzy Code Search to Link Code Fragments in Discussions to Source Code
 
A Lightweight Approach to Uncover Technical Information in Unstructured Data
A Lightweight Approach to Uncover Technical Information in Unstructured DataA Lightweight Approach to Uncover Technical Information in Unstructured Data
A Lightweight Approach to Uncover Technical Information in Unstructured Data
 
Managing Community Contributions: Lessons Learned from a Case Study on Andro...
Managing Community Contributions:  Lessons Learned from a Case Study on Andro...Managing Community Contributions:  Lessons Learned from a Case Study on Andro...
Managing Community Contributions: Lessons Learned from a Case Study on Andro...
 
Mud flash
Mud flashMud flash
Mud flash
 
Studying the impact of Social Structures on Software Quality
Studying the impact of Social Structures on Software QualityStudying the impact of Social Structures on Software Quality
Studying the impact of Social Structures on Software Quality
 
An Empirical Study on Inconsistent Changes to Code Clones at Release Level
An Empirical Study on Inconsistent Changes to Code Clones at Release LevelAn Empirical Study on Inconsistent Changes to Code Clones at Release Level
An Empirical Study on Inconsistent Changes to Code Clones at Release Level
 
An Empirical Study on the Risks of Using Off-the-Shelf Techniques for Process...
An Empirical Study on the Risks of Using Off-the-Shelf Techniques for Process...An Empirical Study on the Risks of Using Off-the-Shelf Techniques for Process...
An Empirical Study on the Risks of Using Off-the-Shelf Techniques for Process...
 
Fuzzy Logic in Smart Homes
Fuzzy Logic in Smart HomesFuzzy Logic in Smart Homes
Fuzzy Logic in Smart Homes
 
Finding Paths in Large Spaces - A* and Hierarchical A*
Finding Paths in Large Spaces - A* and Hierarchical A*Finding Paths in Large Spaces - A* and Hierarchical A*
Finding Paths in Large Spaces - A* and Hierarchical A*
 
Automatic Identification of Bug Introducing Changes
Automatic Identification of Bug Introducing ChangesAutomatic Identification of Bug Introducing Changes
Automatic Identification of Bug Introducing Changes
 
Cloning Considered Harmful Considered Harmful
Cloning Considered Harmful Considered HarmfulCloning Considered Harmful Considered Harmful
Cloning Considered Harmful Considered Harmful
 
Approximation Algorithms
Approximation AlgorithmsApproximation Algorithms
Approximation Algorithms
 
Predictors of Customer Perceived Quality
Predictors of Customer Perceived QualityPredictors of Customer Perceived Quality
Predictors of Customer Perceived Quality
 
Extracting Structural Information from Bug Reports.
Extracting Structural Information from Bug Reports.Extracting Structural Information from Bug Reports.
Extracting Structural Information from Bug Reports.
 
Computing Accuracy Precision And Recall
Computing Accuracy Precision And RecallComputing Accuracy Precision And Recall
Computing Accuracy Precision And Recall
 
Duplicate Bug Reports Considered Harmful ... Really?
Duplicate Bug Reports Considered Harmful ... Really?Duplicate Bug Reports Considered Harmful ... Really?
Duplicate Bug Reports Considered Harmful ... Really?
 
The Quality of Bug Reports in Eclipse ETX'07
The Quality of Bug Reports in Eclipse ETX'07The Quality of Bug Reports in Eclipse ETX'07
The Quality of Bug Reports in Eclipse ETX'07
 

10 Year Impact Award Presentation - Duplicate Bug Reports Considered Harmful ... Really?

  • 1. Duplicate Bug Reports Considered Harmful… Really? Nicolas Bettenburg • Rahul Premraj • Tom Zimmerman • Sunghun Kim
 ICSME’2018 (Madrid) • September 28th, 2018
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 10.
  • 11. Automated Severity Assessment of Software Defect Reports Tim Menzies Lane Department of Computer Science, West Virginia University PO Box 6109, Morgantown, WV, 26506 304 293 0405 tim@menzies.us Andrian Marcus Department of Computer Science Wayne State University Detroit, MI 48202 313 577 5408 amarcus@wayne.edu Abstract In mission critical systems, such as those developed by NASA, it is very important that the test engineers properly recognize the severity of each issue they identify during testing. Proper severity assessment is essential for appropriate resource allocation and planning for fixing activities and additional testing. Severity assessment is strongly influenced by the experience of the test engineers and by the time they spend on each issue. The paper presents a new and automated method named SEVERIS (SEVERity ISsue assessment), which assists the test engineer in assigning severity levels to defect reports. SEVERIS is based on standard text mining and machine learning techniques applied to existing sets of defect reports. A case study on using SEVERIS with data from NASA’s Project and Issue Tracking System (PITS) is presented in the paper. The case study results indicate that SEVERIS is a good predictor for issue severity levels, while it is easy to use and efficient. 1. Introduction NASA’s software Independent Verification and Validation (IV&V) Program captures all of its findings in a database called the Project and Issue Tracking System (PITS). The data in PITS has been collected for more than 10 years and includes issues on robotic satellite missions and human-rated systems. Nowadays, similar defect tracking systems, such as Bugzilla1 , have become very popular, largely due to the spread of open source software development. These systems help to track bugs and changes in the code, to submit and review patches, to manage quality assurance, to support communication between developers, etc. As compared to newer systems, the problem with PITS is that there is a lack of consistency in how each 1 http://www.bugzilla.org/ of the projects collected issue data. In most instances, the specific configuration of the information captured about an issue was tailored by the IV&V project to meet its needs. This has created consistency problems when metrics data is pulled across projects. While there was a set of required data fields, the majorities of those fields do not provide information in regards to the quality of the issue and are not very suitable for comparing projects. A common issue among defect tracking systems is that they are useful for storing day-to-day information and generating small-scale tactical reports (e.g., “list the bugs we found last Tuesday”), but difficult to use for high-end business strategic analysis (e.g., “in the past, what methods have proved most cost effective in finding bugs?”). Another issue common to these systems is that most of the data is unstructured (i.e., free text). Specific to PITS is that the database fields in PITS keep changing, yet the nature of the unstructured text remains constant. In consequence, one logical choice in the analysis of defect reports is a combination of text mining and machine learning. In this paper we present a new approach for extracting general conclusions from PITS data based on text mining and machine learning methods, which are low cost, automatic, and rapid. 
We designed and built a tool named SEVERIS (SEVERity ISsue assessment) to automatically review issue reports and alert when a proposed severity is anomalous. The way SEVERIS is built provides the probabilities that the assessment is correct. These probabilities can be used to guide decision making in this process. Assigning the correct severity levels to issue reports is extremely important in the process employed at NASA, as it directly impacts resource allocation and planning of subsequent defect fixing activities. NASA uses a five-point scale to score issue severity. The scale ranges one to five, worst to dullest, respectively. A different scale is used for robotic and human-rated missions (see Table 1).
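For readers who want the gist in code: a minimal sketch of severity prediction from report text, in the spirit of SEVERIS. The paper describes SEVERIS as standard text mining plus machine learning; the toy multinomial Naive Bayes below is our stand-in, not the paper's implementation, and the class name, tokenizer, and training snippets are all invented for illustration.

    import java.util.*;

    public class SeveritySketch {
        private final Map<Integer, Integer> docsPerSeverity = new HashMap<>();
        private final Map<Integer, Map<String, Integer>> wordCounts = new HashMap<>();
        private final Map<Integer, Integer> wordTotals = new HashMap<>();
        private final Set<String> vocabulary = new HashSet<>();
        private int totalDocs = 0;

        private static List<String> tokenize(String text) {
            return Arrays.asList(text.toLowerCase().split("[^a-z0-9]+"));
        }

        // Count word occurrences per severity class.
        public void train(String reportText, int severity) {
            totalDocs++;
            docsPerSeverity.merge(severity, 1, Integer::sum);
            Map<String, Integer> counts = wordCounts.computeIfAbsent(severity, k -> new HashMap<>());
            for (String w : tokenize(reportText)) {
                if (w.isEmpty()) continue;
                counts.merge(w, 1, Integer::sum);
                wordTotals.merge(severity, 1, Integer::sum);
                vocabulary.add(w);
            }
        }

        // Pick the severity with the highest Laplace-smoothed log-posterior.
        public int predict(String reportText) {
            int best = -1;
            double bestScore = Double.NEGATIVE_INFINITY;
            for (int sev : docsPerSeverity.keySet()) {
                double score = Math.log(docsPerSeverity.get(sev) / (double) totalDocs);
                Map<String, Integer> counts = wordCounts.get(sev);
                int total = wordTotals.getOrDefault(sev, 0);
                for (String w : tokenize(reportText)) {
                    if (w.isEmpty()) continue;
                    score += Math.log((counts.getOrDefault(w, 0) + 1.0) / (total + vocabulary.size()));
                }
                if (score > bestScore) { bestScore = score; best = sev; }
            }
            return best;
        }

        public static void main(String[] args) {
            SeveritySketch model = new SeveritySketch();
            // Invented training snippets; the real tool learned from PITS issue reports.
            model.train("system crash loss of vehicle data corruption", 1);
            model.train("crash on startup memory corruption", 1);
            model.train("typo in log message cosmetic issue", 5);
            model.train("misspelled label in report output", 5);
            System.out.println(model.predict("data corruption after crash")); // expect 1 (worst)
        }
    }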
  • 12. [Slides 12–20 re-show the Menzies–Marcus paper from slide 11 as a backdrop, building up one related study per slide.] Predicting which bugs get fixed. Guo et al.
  • 13. Predicting severity of a reported bug. Lamkanfi et al.
  • 14. Characterizing re-opened bugs. Zimmermann et al.
  • 15. What makes a good bug report. Bettenburg et al.
  • 16. [Same call-outs as slide 15.]
  • 17. Do clones matter? Juergens et al.
  • 18. Frequency and risks of changes to clones. Göde et al.
  • 19. Do developers care about code smells? Yamashita et al.
  • 20. Inconsistent changes to clones at release level. Bettenburg et al.
  • 21. [Timeline graphic: January, March, June, October, November 2007.]
  • 22. [Same timeline as slide 21.]
  • 24.
  • 26.
  • 27. There are many, varied stories behind the observed SE artifacts.
  • 28.
  • 29. Ignoring available data could lead to missing fundamentally important insights
  • 31.
  • 32. When the same bug is reported several times in Bugzilla, developers are slowed down https://fedoraproject.org/wiki/How_to_file_a_bug_report#Avoiding_Duplicate_Bug_Reports
  • 33. [Adds:] A duplicate bug is a burden in the testing cycle. https://www.softwaretestinghelp.com/how-to-write-good-bug-report/
  • 34. [Adds:] Several duplicate bug reports just cause an administration headache for developers http://wicket.apache.org/help/reportabug.html
  • 35. [Adds:] Duplicate bug reports, […] consume time of bug triagers and software developers that might better be spent working on reports that describe unique requests. Lyndon Hiew, M.Sc. thesis, 2006, UBC
  • 36. DON’T BE THAT GUY who submitted a DUPLICATE
  • 37. It doesn't even mean that the resolved bug report can now be ignored, since we have seen instances of late-identification of duplicates (e.g., BR-C in Figure 2) in which accumulated knowledge and dialogue may still be relevant to the resolution of the other bug reports in the BRN. Robert J. Sandusky, Les Gasser, and Gabriel Ripoche. Bug report networks: Varieties, strategies, and impacts in an OSS development community. In Proc. of ICSE Workshop on Mining Software Repositories, 2004.
  • 38. “Duplicates are not really problems. They often add useful information. That this information were filed under a new report is not ideal though.” N. Bettenburg, S. Just, A. Schröter, C. Weiss, R. Premraj, and T. Zimmermann. What makes a good bug report? In Proceedings of the 16th International Symposium on Foundations of Software Engineering, November 2008.
  • 39. What Helps the Most? What Harms the Most?
  Table 1. Items from the first survey part, with counts of how often each item helped (h), how often it helped the most (hm), and the probability P(hm | h) that an item helped most given that it helped:
  steps to reproduce      47  42  0.8936
  stack traces            45  35  0.7778
  screenshots             42  17  0.4048
  test cases              39  11  0.2821
  observed behavior       44  12  0.2727
  code examples           38   9  0.2368
  error reports           33   3  0.0909
  build information       34   3  0.0882
  summary                 36   3  0.0833
  expected behavior       41   3  0.0732
  version                 38   1  0.0236
  component               34   0  0.0000
  hardware                13   0  0.0000
  operating system        34   0  0.0000
  product                 30   0  0.0000
  severity                26   0  0.0000
  Table 2. Items from the second survey part, with counts of how often each item harmed (a), how often it harmed the most (am), and the probability P(am | a) that an item harmed most given that it harmed:
  errors in steps to reproduce  34  29  0.8235
  incomplete information        44  35  0.7727
  wrong observed behavior       15  11  0.6667
  wrong version number          21   8  0.2857
  errors in test cases          14   4  0.2857
  unstructured text             19   7  0.2632
  wrong operating system         8   3  0.2500
  wrong expected behavior       18   7  0.2222
  non-technical language        14   3  0.2143
  too long text                 11   2  0.1818
  errors in code examples       11   2  0.1818
  bad grammar                   29   5  0.1724
  wrong component name          22   2  0.0909
  prose text                    12   2  0.0833
  duplicates                    31   2  0.0645
  no spellcheck                  8   0  0.0000
  wrong hardware                 5   0  0.0000
  spam                           1   0  0.0000
  wrong product name            11   0  0.0000
  errors in stack traces         2   0  0.0000
  The survey was filled out by 48 out of 365 developers. N. Bettenburg, S. Just, A. Schröter, C. Weiss, R. Premraj, and T. Zimmermann. What makes a good bug report? In Proceedings of the 16th International Symposium on Foundations of Software Engineering, November 2008.
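The probability columns in both tables are plain conditional frequencies: P(hm | h) = hm / h and P(am | a) = am / a. A quick check against the first rows of Table 1 (counts taken from the slide; the class and variable names are ours):

    public class SurveyProbabilities {
        public static void main(String[] args) {
            // Raw counts copied from the top of Table 1 (slide 39).
            String[] items      = {"steps to reproduce", "stack traces", "screenshots"};
            int[]    helped     = {47, 45, 42};  // h: how often the item helped
            int[]    helpedMost = {42, 35, 17};  // hm: how often it helped the most
            for (int i = 0; i < items.length; i++) {
                // P(hm | h) = hm / h, e.g. 42 / 47 = 0.8936
                System.out.printf("%-20s P(hm|h) = %.4f%n",
                        items[i], (double) helpedMost[i] / helped[i]);
            }
        }
    }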
  • 40.
  • 41. PART 1 Is there extra information in duplicate reports and if so, can we quantify how much? PART 2 Is that extra information helpful for carrying out software engineering tasks?
  • 42.
  • 43. First, we need DATA... lots of DATA! [Plot: bug reports submitted per month (total reports vs. duplicates), October 2001 through October 2007, y-axis 0 to 5,000 reports/month, annotated with the release milestones 1.0 through 3.3.] On average, ~3,000 reports are submitted per month, of which ~13% are duplicate bug reports.
  • 44. [Figure 4.1: graphical representation of the collected bug report data.]
    Mozilla: 269,222 bug reports without duplicates; 116,727 duplicate reports; 36,697 master reports.
    Eclipse: 167,494 bug reports without duplicates; 27,838 duplicate reports; 16,511 master reports.
The MOZILLA database was mined using a tool that reads the XML representation. Inverse Duplicate Problem: 27% (Mozilla), 31% (Eclipse).
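As a quick sanity check on these counts, the sketch below (my own code, assuming the three categories in Figure 4.1 partition each database) recomputes the share of duplicate reports; the Eclipse value of roughly 13% matches the per-month figure on slide 43.

# Duplicate share implied by the Figure 4.1 counts, assuming the three
# categories (non-duplicates, duplicates, masters) are disjoint.
datasets = {
    "Mozilla": {"non_dup": 269_222, "dup": 116_727, "master": 36_697},
    "Eclipse": {"non_dup": 167_494, "dup": 27_838, "master": 16_511},
}

for name, d in datasets.items():
    total = d["non_dup"] + d["dup"] + d["master"]
    share = d["dup"] / total
    print(f"{name}: {total:,} reports, {share:.1%} duplicates")
    # Mozilla: 422,646 reports, 27.6% duplicates
    # Eclipse: 211,843 reports, 13.1% duplicates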
  • 45. Bug 137808. Extracting Structural Information from Bug Reports (MSR 2008)
Summary: Exceptions from createFromString lock-up the editor
Product: [Modeling] EMF; Component: Core; Version: 2.2; Severity: normal; Priority: P3; Hardware: PC; OS: Windows XP; Status: VERIFIED FIXED; Target Milestone: ---
Reporter: Patrick Sodre <psodre@gmail.com>; Assignee: Marcelo Paternostro <marcelop@ca.ibm.com>; CC: merks@ca.ibm.com
Description (opened 2006-04-20 14:25 -0400): As discussed on the newsgroup under the thread with the same name, I am opening this bug entry. Here is a history of the thread.
From Ed Merks: Patrick, the value is checked before it's applied and can't be applied until it's valid. But this BigDecimal case behaves oddly because the exception thrown by new BigDecimal("badvalue") has a null message, and the property editor relies on returning a non-null message string to indicate there is an error. Please open a bugzilla, which I'll fix like this:
### Eclipse Workspace Patch 1.0
#P org.eclipse.emf.edit.ui
Index: src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java
===================================================================
RCS file: /cvsroot/tools/org.eclipse.emf/plugins/org.eclipse.emf.edit.ui/src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java,v
retrieving revision 1.10
diff -u -r1.10 PropertyDescriptor.java
--- src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java 21 Mar 2006 16:42:30 -0000 1.10
+++ src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java 20 Apr 2006 11:59:10 -0000
@@ -162,7 +162,8 @@
   }
   catch (Exception exception)
   {
-    return exception.getMessage();
+    String message = exception.getMessage();
+    return message == null ? exception.toString() : message;
   }
 }
 Diagnostic diagnostic = Diagnostician.INSTANCE.validate(EDataTypeCellEditor.this.eDataType, value);
Patrick Sodre wrote: Hi, it seems that if the user inputs an invalid parameter that gets created from "createFromString", the editor locks up until the user explicitly calls "Restore Default Value". Is this the expected behavior, or could something better be done? For instance, if an exception is thrown, restore the value back to what it was before, after displaying a pop-up error message. I understand that for DataTypes defined by the user he/she should take care of catching the exceptions, but for the default ones like BigInteger/BigDecimal I think the EMF runtime could do some of the grunt work... If you think this is something worth pursuing I could post an entry in Bugzilla. Regards, Patrick Sodre
Below is the stack trace that I got from the editor:
java.lang.NumberFormatException
  at java.math.BigDecimal.<init>(BigDecimal.java:368)
  at java.math.BigDecimal.<init>(BigDecimal.java:647)
  at org.eclipse.emf.ecore.impl.EcoreFactoryImpl.createEBigDecimalFromString(EcoreFactoryImpl.java:559)
  at org.eclipse.emf.ecore.impl.EcoreFactoryImpl.createFromString(EcoreFactoryImpl.java:116)
  at org.eclipse.emf.edit.ui.provider.PropertyDescriptor$EDataTypeCellEditor.doGetValue(PropertyDescriptor.java:183)
  at org.eclipse.jface.viewers.CellEditor.getValue(CellEditor.java:449)
  at org.eclipse.ui.views.properties.PropertySheetEntry.applyEditorValue(PropertySheetEntry.java:135)
  at org.eclipse.ui.views.properties.PropertySheetViewer.applyEditorValue(PropertySheetViewer.java:249)
  at ...
Comment #1, Ed Merks, 2006-04-20 15:09:23 -0400: The fix has been committed to CVS. Thanks for reporting this problem.
Comment #2, Marcelo Paternostro, 2006-04-27 10:44:24 -0400: Fixed in the I200604270000 built
Comment #3, Nick Boldt, 2008-01-28 16:46:51 -0400: Move to verified as per bug 206558.
  • 46. [The same bug report (Bug 137808) as on slide 45, with the metadata fields highlighted.] METADATA. Extracting Structural Information from Bug Reports (MSR 2008)
  • 47. [The same bug report, additionally highlighting the embedded source code.] METADATA, SOURCE CODE. Extracting Structural Information from Bug Reports (MSR 2008)
  • 48. [The same bug report, additionally highlighting the embedded patch.] METADATA, SOURCE CODE, PATCHES. Extracting Structural Information from Bug Reports (MSR 2008)
  • 49. [The same bug report, additionally labeling where screenshots appear.] METADATA, SOURCE CODE, PATCHES, SCREENSHOTS. Extracting Structural Information from Bug Reports (MSR 2008)
  • 50. [The same bug report, additionally highlighting the stack trace.] METADATA, SOURCE CODE, PATCHES, SCREENSHOTS, STACK TRACES. Extracting Structural Information from Bug Reports (MSR 2008)
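The MSR 2008 tool (infozilla) detects such structural elements in raw report text. As an illustration only, the regular expressions below are simplifications of mine, not the tool's actual rules; they show how presence detection for two element types could look.

import re

# A Java stack-trace frame such as "at java.math.BigDecimal.<init>(BigDecimal.java:368)"
FRAME = re.compile(r"^\s*at\s+[\w.$<>]+\([\w.]+(?::\d+)?\)", re.MULTILINE)
# A unified-diff hunk header such as "@@ -162,7 +162,8 @@"
HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+\d+(?:,\d+)? @@", re.MULTILINE)

def classify(report_text: str) -> dict:
    # Bin the report per element type: True ~ B1 (has element), False ~ B2.
    return {
        "stack trace": FRAME.search(report_text) is not None,
        "patch": HUNK.search(report_text) is not None,
    }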
  • 51. 3.6 Order of Extraction. [Figure 3.10: structural elements are extracted in a fixed sequence. The raw report (INPUT) passes through filters for PATCHES, STACK TRACES, SOURCE CODE, and ENUMERATIONS, yielding the separated PATCH, TRACE, and CODE elements plus the remaining prose (OUTPUT).] The order in which the detection and extraction of elements is executed is of great importance, since several structural elements interfere. Patches vs. enumerations: enumerations, especially itemizations, interfere with the hunk lines in patches, because both use the symbols "+" and "-".
  • 52. [Figure 3.10 repeated from slide 51: structural elements are extracted in a fixed sequence.] The evaluation is split into two parts: first, we focus on the correct identification of the presence of enumerations, patches, stack traces, and source code in bug reports. Knowing the reliability of our approach, we can then assess how well the detected elements are extracted by our methods. Evaluation setup: we parsed 161,500 bug reports from the ECLIPSE project which were submitted between October 2001 and December 2007. For each report, INFOZILLA verified the presence of each of the four structural element types and classified the report into one of two bins: B1 (report has the element) and B2 (report does not have the element). [Figure 3.11: for each element, the report is classified into one of the two bins.]
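The fixed order can be sketched as a pipeline in which each filter claims its lines before the next filter runs; the patterns and control flow below are my own simplifications, not infozilla's implementation. The key idea is that patch hunk lines beginning with "+" or "-" are claimed before the enumeration filter looks for "-" list items.

import re

# Lines that belong to a CVS-style patch block, once one has started.
PATCH_LINE = re.compile(r"^(Index: |RCS file:|retrieving revision|diff |=+$|--- |\+\+\+ |@@ |[ +\-].*)")
# A "-" itemization line, considered only for text no filter has claimed.
ENUM_LINE = re.compile(r"^\s*-\s+\S")

def extract(text: str):
    patches, enums, prose = [], [], []
    in_patch = False
    for line in text.splitlines():
        if line.startswith("Index: "):
            in_patch = True                   # a patch block begins here
        if in_patch and PATCH_LINE.match(line):
            patches.append(line)              # claimed by the patch filter
            continue
        in_patch = False                      # the patch block (if any) ended
        if ENUM_LINE.match(line):
            enums.append(line)                # unclaimed "-" lines are list items
        else:
            prose.append(line)
    return patches, enums, prose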
  • 54. 5.2 Results: ADDITIONAL INFORMATION
Table 5.1: Average amount of information added by duplicates (ECLIPSE). Averages per master report.
    Information item                     Master   Extended   Change*
    Predefined fields
      product                            1.000    1.127      +0.127
      component                          1.000    1.287      +0.287
      operating system                   1.000    1.631      +0.631
      reported platform                  1.000    1.241      +0.241
      version                            0.927    1.413      +0.486
      reporter                           1.000    2.412      +1.412
      priority                           1.000    1.291      +0.291
      target milestone                   0.654    0.794      +0.140
    Patches
      total                              1.828    1.942      +0.113
      unique: patched files              1.061    1.124      +0.062
    Screenshots
      total                              0.139    0.285      +0.145
      unique: filename, filesize         0.138    0.281      +0.143
    Stack traces
      total                              0.504    1.422      +0.918
      unique: exception                  0.195    0.314      +0.118
      unique: exception, top frame       0.223    0.431      +0.207
      unique: exception, top 2 frames    0.229    0.458      +0.229
      unique: exception, top 3 frames    0.234    0.483      +0.248
      unique: exception, top 4 frames    0.239    0.504      +0.265
      unique: exception, top 5 frames    0.244    0.525      +0.281
Table 5.2: Average amount of information added by duplicates (MOZILLA). Averages per master report.
    Information item                     Master   Extended   Change*
    Predefined fields
      product                            1.000    1.400      +0.400
      component                          1.000    1.953      +0.953
      operating system                   1.000    2.102      +1.102
      reported platform                  1.000    1.544      +0.544
      version                            0.814    0.979      +0.165
      reporter                           1.000    3.705      +2.705
      priority                           0.377    0.499      +0.122
      target milestone                   0.433    0.558      +0.125
    Patches
      total                              5.038    5.184      +0.146
      unique: patched files              2.003    2.067      +0.064
    Screenshots
      total                              0.200    0.391      +0.191
      unique: filename, filesize         0.197    0.385      +0.187
    Stack traces
      total                              0.100    0.185      +0.085
      unique: exception                  0.033    0.047      +0.014
      unique: exception, top frame       0.069    0.130      +0.061
      unique: exception, top 2 frames    0.072    0.136      +0.064
      unique: exception, top 3 frames    0.073    0.139      +0.066
      unique: exception, top 4 frames    0.074    0.141      +0.067
      unique: exception, top 5 frames    0.075    0.143      +0.068
* For all information items the increase is significant at p < .001. We compared stack traces considering the exception that was thrown and the top stack frames.
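The "unique" stack-trace rows treat two traces as identical when they share the exception type and the top k frames. Here is a minimal sketch of that keying; the code and data layout are mine, not the thesis implementation.

# Count stack traces that are unique with respect to (exception, top k frames).
def trace_key(exception, frames, k):
    return (exception, tuple(frames[:k]))

def count_unique(traces, k):
    # traces: list of (exception_name, [frame, frame, ...]) pairs
    return len({trace_key(exc, frames, k) for exc, frames in traces})

traces = [
    ("java.lang.NumberFormatException",
     ["java.math.BigDecimal.<init>", "java.math.BigDecimal.<init>"]),
    ("java.lang.NumberFormatException",
     ["java.math.BigDecimal.<init>", "EcoreFactoryImpl.createFromString"]),
]
print(count_unique(traces, k=1))  # 1: same exception and top frame
print(count_unique(traces, k=2))  # 2: the traces diverge at the second frame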
  • 55. Duplicate bug reports can provide useful additional information. For example, we can find up to three times as many stack traces, which are helpful in fixing bugs.
  • 56. There is significant evidence of additional information in duplicate bug reports that is distinct from the information already reported in the master report.
  • 57. PART 1 Is there extra information in duplicate reports and if so, can we quantify how much? PART 2 Is that extra information helpful for carrying out software engineering tasks?
  • 58.
  • 65. [Diagram: each bug report is a feature vector (A1, A2, ..., An) labeled with a developer class; master reports and their duplicates are combined into training sets under three different assignment assumptions.] "Whoever was assigned to the Master should have been assigned to any of the Duplicates." "Only the person who was originally assigned to a report can fix it." "Any person assigned to any of the reports in the duplicate group can provide a fix."
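A minimal sketch of how the "Master" and "Extended" training sets compared later could be assembled; the function and data layout are my own framing, not the authors' code, and the variant shown encodes the second assumption (duplicates inherit the master's assignee).

def build_training(groups, use_duplicates):
    # groups: iterable of (master_features, assignee, [duplicate_features, ...])
    data = []
    for master_features, assignee, duplicate_features in groups:
        data.append((master_features, assignee))
        if use_duplicates:
            for dup_features in duplicate_features:
                # "Whoever was assigned to the master should have been
                # assigned to any of the duplicates": inherit the label.
                data.append((dup_features, assignee))
    return data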
  • 66. [Figure: evaluation setup.] Master reports are sorted chronologically and split into 11 folds. Run 1 trains on fold 1 and tests on fold 2; run 2 trains on folds 1-2 and tests on fold 3; and so on, up to run 10, which trains on folds 1-10 and tests on fold 11.
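A sketch of this chronological setup, under the assumption (inferred from the figure) that run i trains on folds 1..i and tests on fold i+1; the code is illustrative only.

def chronological_runs(reports, n_folds=11):
    # Sort oldest first, then cut into equal folds (remainder ignored here).
    reports = sorted(reports, key=lambda r: r["created"])
    size = len(reports) // n_folds
    folds = [reports[i * size:(i + 1) * size] for i in range(n_folds)]
    for i in range(1, n_folds):                 # runs 1..10
        training = [r for fold in folds[:i] for r in fold]
        yield training, folds[i]                # train on folds 1..i, test on fold i+1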
  • 67. Table 6.1: Percentages of reports correctly triaged to ECLIPSE developers.
    Model  Result  Training   Run 1   Run 2   Run 3   Run 4   Run 5   Run 6   Run 7   Run 8   Run 9   Run 10  All
    SVM    Top 1   Master     15.45   19.28   19.03   19.80   25.80   26.44   22.09   27.08   27.71   29.12   23.18
                   Extended   18.39*  20.95   22.22*  21.46   27.84   28.48   23.37   30.52*  30.78*  30.52   25.45*
           Top 3   Master     32.44   37.42   40.87   39.72   46.10   46.36   38.95   44.70   48.53   47.25   42.23
                   Extended   38.70*  42.78*  43.30   39.34   50.83*  49.55*  42.40*  50.32*  50.32   55.04*  46.25*
           Top 5   Master     41.89   46.87   47.38   47.64   54.66   56.96   47.51   52.36   56.58   56.45   50.83
                   Extended   47.38*  52.11*  53.00*  51.85*  60.54*  59.90*  51.09*  58.11*  60.28*  65.26*  55.95*
    Bayes  Top 1   Master     14.81   16.60   17.75   17.75   22.73   21.20   20.56   23.50   27.71   28.22   21.08
                   Extended   15.45   17.11   20.56*  18.01   19.80*  19.80   22.99   27.08*  26.82   30.40*  21.80
           Top 3   Master     29.12   32.31   35.12   34.99   40.36   38.06   35.76   43.55   45.59   46.87   38.17
                   Extended   36.53*  33.08   38.83*  35.50   39.08   39.08   39.97*  46.23   45.85   50.45*  40.46*
           Top 5   Master     38.44   42.40   45.72   45.21   50.70   47.64   44.06   51.85   54.92   55.17   47.61
                   Extended   45.72*  44.70   48.02   43.55   48.91   50.45*  49.43*  55.30*  54.28   58.49*  49.88*
    * Increase in accuracy is significant at p = .05
Table 6.2: Percentages of reports correctly triaged to MOZILLA developers.
    Model  Result  Training   Run 1   Run 2   Run 3   Run 4   Run 5   Run 6   Run 7   Run 8   Run 9   Run 10  All
    SVM    Top 1   Master     14.57   14.30   14.16   18.29   18.83   19.17   21.00   19.65   19.99   22.15   18.21
                   Extended   15.31   14.43   17.95   19.44   19.78   19.51   21.82   23.10   18.29   19.31   18.90
           Top 3   Master     28.59   28.46   31.84   37.53   36.52   39.30   41.26   44.58   42.82   43.09   37.40
                   Extended   32.38   30.15   36.86   39.70   37.26   40.72   43.29   47.83   42.48   39.36   39.00
           Top 5   Master     37.13   36.04   41.67   46.41   44.99   48.92   50.75   56.03   53.52   51.22   46.67
                   Extended   42.48   39.77   46.07   49.80   49.05   54.27   53.32   60.57   54.74   49.66   49.98
    Bayes  Top 1   Master     15.11   12.60   16.94   17.62   17.01   19.44   18.22   25.81   25.47   27.98   19.62
                   Extended   15.24   13.75   18.50   20.39   19.78   23.51   23.31   26.22   24.46   25.88   21.10
           Top 3   Master     27.71   29.67   34.42   37.94   35.70   40.18   40.04   44.58   45.33   43.90   37.94
                   Extended   32.11   29.40   36.72   39.50   39.70   44.24   44.24   48.85   45.87   44.17   40.48
           Top 5   Master     35.77   37.74   43.09   47.09   44.99   51.90   49.46   54.13   55.15   51.83   47.11
                   Extended   40.72   39.63   45.05   49.66   48.58   54.47   54.74   59.49   55.76   53.52   50.16
Importantly, in the "All" column all but the Top 1 results using Naïve Bayes were significant, too. Thus, the results demonstrate that bug reports can be triaged more accurately when the set of existing bug reports used for training is enlarged with duplicate reports.
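The Top 1/3/5 figures are Top-k accuracies: a triage recommendation counts as correct if the actual assignee appears among the classifier's k highest-ranked developers. An illustrative computation (my own code, not the authors' implementation):

# Top-k accuracy over a set of triage recommendations, as a percentage.
def top_k_accuracy(ranked_predictions, actual_assignees, k):
    hits = sum(actual in ranking[:k]
               for ranking, actual in zip(ranked_predictions, actual_assignees))
    return 100.0 * hits / len(actual_assignees)

rankings = [["alice", "bob", "carol"], ["bob", "carol", "alice"]]
actual = ["bob", "alice"]
print(top_k_accuracy(rankings, actual, k=1))  # 0.0: neither top pick is right
print(top_k_accuracy(rankings, actual, k=3))  # 100.0: both fixers are in the top 3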
  • 68. The information contained in duplicate reports improves the accuracy of machine learning algorithms when solving the bug triage problem.