The document describes a study on understanding log lines using development knowledge from source code. The researchers examined real-life inquiries about log lines from user mailing lists and logs of three large software systems. They found that experts are crucial in resolving log inquiries, with 8 out of 11 resolved inquiries addressed by experts. The researchers propose attaching development knowledge like source code, code comments, and issue reports to logs to help practitioners understand log messages without relying on expert assistance. An example demonstrates how different types of development knowledge can help explain the meaning, cause, impact and solution for the log message "fetch failure".
ICSME2014
1. Understanding Log Lines Using Development Knowledge
Ahmed E. Hassan, Meiyappan Nagappan, Weiyi Shang, Zhen Ming Jiang
2. Practitioners have challenges in understanding log lines
Log line: “Fetch failure”
“What exactly does this message mean?”
“What could be the cause?”
“Is it affecting my data?”
4. We performed an exploratory study on 3 large software systems
Zookeeper
Logging statements in the three systems: 5,641, 1,080, and 1,163
5. We manually examined real-life inquiries about log lines from 3 sources
User mailing lists
Randomly sampled logs
6. 5 types of information are inquired about logs
Meaning: “What exactly does this message mean?”
Context: “When does this occur?”
Cause: “What could be the cause?”
Solution: “How can I avoid this message/problem?”
Impact: “Is it affecting my data?”
7. Experts are crucial in resolving log inquiries
[Bar chart: number of inquiries (0 to 6) for Hadoop, Cassandra, and Zookeeper. Resolved: by expert, by non-expert. Un-resolved: replied by expert, only replied by non-expert, not answered.]
8. Experts are crucial in resolving log inquiries
8 out of 11 resolved inquiries are resolved by experts.
[Bar chart: number of inquiries (0 to 6) for Hadoop, Cassandra, and Zookeeper. Resolved: by expert, by non-expert. Un-resolved: replied by expert, only replied by non-expert, not answered.]
9. Experts are crucial in resolving log inquiries
Inquiries are always resolved if experts reply.
[Bar chart: number of inquiries (0 to 6) for Hadoop, Cassandra, and Zookeeper. Resolved: by expert, by non-expert. Un-resolved: replied by expert, only replied by non-expert, not answered.]
10. Looking for an expert is not the optimal approach to resolve log inquiries
Over 20% of the inquiries have no reply.
Wrong answers may be posted in reply to inquiries.
Identifying the expert of a log line is challenging.
First reply can take up to 210 hours.
12. Nothing in common between inquired logs
Real-life inquiries:
Different log verbosity levels
0 to 2 degrees of fan-in
0 to 200 prior code changes
An on-demand approach is needed to assist in understanding logs.
13. We propose to attach development knowledge to logs
Development knowledge: code commit, issue reports, source code, call graph, code comments (/* … */)
14. An example of using development knowledge to resolve inquiries of log “fetch failure”
Types of development knowledge: code commit, issue reports, source code, code comments (/* … */), call graph
Source code: from method checkAndInformJobTracker of file ShuffleScheduler.java
15. An example of using development knowledge to resolve inquiries of log “fetch failure”
Code comments: Notify the JobTracker after every read error, if 'reportReadErrorImmediately' is true or after every 'maxFetchFailuresBeforeReporting' failures
16. An example of using development knowledge to resolve inquiries of log “fetch failure”
Call graph: called by method copyFailed in class ShuffleScheduler
17. An example of using development knowledge to resolve inquiries of log “fetch failure”
Code commit: Allow shuffle retries and read-error reporting to be configurable. Contributed by Amareshwari Sriramadasu.
18. An example of using development knowledge to resolve inquiries of log “fetch failure”
Issue reports: MAPREDUCE-1171. … This is caused by a behavioral change in hadoop 0.20.1. … One solution I could see is “Provide a config option...” …
19. An example of using development knowledge to resolve inquiries of log “fetch failure”
Resolve the inquiry by development knowledge:
Meaning: There is a data reading error.
Cause: One of the possible reasons is a configuration.
Context: The event happens during the shuffle period, while copying data.
Impact: The event impacts the JobTracker.
Solution: Changing a configuration option would solve the issue.
Go to the expert for help:
Amareshwari Sriramadasu is the expert to go to.
20. Overview of our approach
Step 1: Generating templates for logs
Step 2: Matching logs with log templates
Step 3: Attaching development knowledge to logs
Artifacts shown: version control system, source code, log templates, development knowledge
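The matching step named in the overview can be illustrated with a small sketch. It assumes each log template is a regular expression tied to the place in the code that emits the log line. The dictionary `LOG_TEMPLATES`, the function `match_log_line`, and the `bar()` entry are hypothetical names made up for illustration; they are not taken from the talk.

```python
import re
from typing import Optional

# Hypothetical templates, keyed by an illustrative source location.
# The first regex follows the template shape shown for Step 1;
# the second one is invented purely for this example.
LOG_TEMPLATES = {
    "foo()": re.compile(r"time=\d+, Trying to launch, TaskID=\S+"),
    "bar()": re.compile(r"Task \S+ finished in \d+ ms"),
}


def match_log_line(line: str) -> Optional[str]:
    """Return the source location whose template matches the whole line."""
    candidate = line.strip()
    for location, template in LOG_TEMPLATES.items():
        if template.fullmatch(candidate):
            return location
    return None  # no template matched this line


if __name__ == "__main__":
    print(match_log_line("time=42, Trying to launch, TaskID=task_7"))
    print(match_log_line("some unrelated message"))
```

A linear scan over all templates is the simplest choice; a real tool handling many templates would probably need an index, which this sketch does not attempt.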
21. Step 1: Generating templates for logs
Version control system
foo() {
  …
  Log_statement("time=%d, Trying to launch, TaskID=%s", time, taskid);
  …
}
Log template: time=\d+, Trying to launch, TaskID=\S+
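As a rough illustration of this step, the sketch below turns a printf-style format string into a regular-expression template. It assumes only the two specifiers visible on the slide, mapping `%d` to `\d+` and `%s` to `\S+`; the function name `format_to_template` is hypothetical, and the approach described in the talk may extract templates from source code differently.

```python
import re

# Assumed mapping from format specifiers to regex fragments.
SPECIFIER_PATTERNS = {"%d": r"\d+", "%s": r"\S+"}


def format_to_template(fmt: str) -> str:
    """Convert a printf-style format string into a regex template."""
    # Splitting with a capturing group keeps the specifiers in the result.
    parts = re.split(r"(%[ds])", fmt)
    pieces = []
    for part in parts:
        if part in SPECIFIER_PATTERNS:
            pieces.append(SPECIFIER_PATTERNS[part])
        else:
            # Escape literal text so characters like '.' match themselves.
            pieces.append(re.escape(part))
    return "".join(pieces)


if __name__ == "__main__":
    template = format_to_template("time=%d, Trying to launch, TaskID=%s")
    line = "time=42, Trying to launch, TaskID=task_7"
    print(bool(re.fullmatch(template, line)))
```

Because `re.escape` also escapes spaces, the generated string is not character-for-character identical to the template on the slide, but it should match the same log lines.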
23. Step 3: Attaching development knowledge to logs
Development knowledge: code commit, issue reports, source code, call graph, code comments (/* … */)
Sources: version control system, issue tracking system
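One way to picture the result of this step is a record per log template that bundles the five kinds of development knowledge. The sketch below is an assumption about how such a record might look, not the authors' implementation: the `DevelopmentKnowledge` class, the `KNOWLEDGE_BASE` list, the `attach_knowledge` function, and the simple `fetch failure` regex are all hypothetical. The field values are the ones quoted in the “fetch failure” example slides.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DevelopmentKnowledge:
    """Hypothetical container for knowledge attached to one log template."""
    source_location: str
    code_comment: str = ""
    callers: List[str] = field(default_factory=list)
    commit_messages: List[str] = field(default_factory=list)
    issue_reports: List[str] = field(default_factory=list)


# Illustrative knowledge base; the regex is a stand-in for a real template.
KNOWLEDGE_BASE = [
    (
        re.compile(r"fetch failure", re.IGNORECASE),
        DevelopmentKnowledge(
            source_location="checkAndInformJobTracker in ShuffleScheduler.java",
            code_comment=(
                "Notify the JobTracker after every read error, if "
                "'reportReadErrorImmediately' is true or after every "
                "'maxFetchFailuresBeforeReporting' failures"
            ),
            callers=["copyFailed in class ShuffleScheduler"],
            commit_messages=[
                "Allow shuffle retries and read-error reporting to be "
                "configurable. Contributed by Amareshwari Sriramadasu."
            ],
            issue_reports=["MAPREDUCE-1171"],
        ),
    ),
]


def attach_knowledge(log_line: str) -> Optional[DevelopmentKnowledge]:
    """Return the knowledge record for the first template found in the line."""
    for template, knowledge in KNOWLEDGE_BASE:
        if template.search(log_line):
            return knowledge
    return None


if __name__ == "__main__":
    record = attach_knowledge("WARN shuffle: fetch failure reported")
    print(record.issue_reports if record else "no knowledge attached")
```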
24. Complementing logging statements: Can development knowledge complement logging statements?
Resolving real-life log inquiries: Can development knowledge help resolve real-life inquiries?
25. We compare our approach against Google and mailing list for resolving real-life log inquiries
Real-life inquiries
26. Our approach outperforms Google and is comparable to mailing lists to resolve log inquiries
[Bar chart: percentage of resolved log inquiries, axis from 0% to 80%]
27. Our approach provides 62% of inquired log information
[Bar chart: percentage of each type of inquired information provided by our approach (Meaning, Cause, Context, Solution, Impact), axis from 0% to 100%]
28. Resolving log inquiries: Can development knowledge help resolve real-life inquiries? YES!
Complementing logging statements: Can development knowledge complement logging statements?
30. We complement a random sample of logging statements using our approach
Zookeeper
300 randomly sampled logging statements
31. Development knowledge can complement logging statements
[Bar chart: percentage of logging statements complemented by our approach (0 to 100) for meaning, cause, context, solution, and impact, across Hadoop, Cassandra, and Zookeeper]
Issue reports are the best development knowledge to complement logging statements.
32. Resolving log inquiries: Can development knowledge help resolve real-life inquiries? YES!
Complementing logging statements: Can development knowledge complement logging statements? YES!
33. Practitioners have challenges in understanding log lines
Log line: “Fetch failure”
“What exactly does this message mean?”
“… could this be the cause?”
“Is it affecting my data?”
35. 5 types of information are inquired about logs
Meaning: “What exactly does this message mean?”
Context: “When does this occur?”
Cause: “… could this be the cause?”
Solution: “It will be great if some one can point to the direction how to solve this?”
Impact: “Is it affecting my data?”
39. Resolving log inquiries: Can development knowledge help resolve real-life inquiries? YES!
Complementing logging statements: Can development knowledge complement logging statements? YES!