SlideShare a Scribd company logo
1 of 98
Download to read offline
TRACING SUMMIT 
EUROPE 
From 
DTrace 
To 
Linux: 
Oct, 2014 
What 
can 
Linux 
learn 
from 
DTrace? 
Brendan 
Gregg 
Senior 
Performance 
Architect 
bgregg@ne7lix.com 
@brendangregg
Brendan 
Gregg 
• DTrace 
contribu?ons 
include: 
– Primary 
author 
of 
the 
DTrace 
book 
– DTraceToolkit 
– dtrace-­‐cloud-­‐tools 
– DTrace 
network 
providers 
• I 
now 
work 
on 
Linux 
at 
Ne7lix 
– using: 
Jrace, 
perf_events, 
SystemTap, 
ktap, 
eBPF, 
… 
– created: 
perf-­‐tools, 
msr-­‐cloud-­‐tools 
• Opinions 
in 
this 
talk 
are 
my 
own
Agenda 
1. DTrace 
– What 
is 
DTrace, 
really? 
– Who 
is 
DTrace 
for, 
really? 
– Why 
doesn’t 
Linux 
have 
DTrace? 
– What 
worked 
well? 
– What 
didn’t 
work 
well? 
2. Linux 
Tracers 
– Jrace, 
perf_events, 
eBPF, 
… 
Topics 
include 
adop?on, 
marke?ng, 
technical 
challenges, 
and 
our 
usage 
at 
Ne7lix.
What 
is 
DTrace, 
really?
Technology 
+ 
Marke?ng 
(Like 
many 
other 
company 
products)
Prior 
Technology 
Kerninst: 
kernel 
dynamic 
tracing, 
Solaris 
2.5.1, 
1999
Prior 
Technology 
Early 
dynamic 
tracers 
weren’t 
safe
Prior 
Technology 
• Also: 
– Sun’s 
TNF 
– DProbes: 
user 
+ 
kernel 
dynamic 
tracing 
– Linux 
Trace 
Toolkit 
(LTT) 
– Others, 
including 
offline 
binary 
instrumenta?on 
• DProbes 
and 
LTT 
were 
combined 
in 
Nov 
2000, 
but 
not 
integrated 
into 
the 
Linux 
kernel1 
• Sun 
set 
forth 
to 
produce 
a 
produc?on-­‐safe 
tool 
1 
h^p://lkml.iu.edu/hypermail/linux/kernel/0011.3/0183.html
From DTrace to Linux
Technology 
• DTrace: 
– Safe 
for 
produc?on 
use 
• You 
might 
step 
on 
your 
foot 
(overhead), 
but 
you 
won’t 
shoot 
it 
off 
– Dynamic 
tracing, 
sta?c 
tracing, 
and 
profiling 
– User-­‐ 
and 
kernel-­‐level, 
unified 
– Programma?c: 
filters 
and 
summaries 
– Solved 
countless 
issues 
in 
dev 
and 
prod 
• That’s 
what 
DTrace 
is 
for 
me 
– An 
awesome 
technology, 
oJen 
needed 
to 
root 
cause 
kernel 
& 
app 
issues 
• But 
for 
most 
people….
A 
Typical 
Conversa?on… 
“Does 
Linux 
have 
DTrace 
yet?” 
“That’s 
a 
pity” 
“DTrace 
is 
awesome!” 
“I’m 
not 
sure” 
“No.” 
“No.” 
“Why?” 
“Why, 
specifically?” 
“Have 
you 
used 
it?”
Marke?ng
Early 
Marke?ng
Early 
Marke?ng 
• DTrace 
had 
awesome 
marke?ng 
– People 
s?ll 
want 
it 
but 
don’t 
really 
know 
why 
• Early 
marke?ng: 
tradi?onal, 
$$$ 
– Great 
marke?ng 
product 
managers 
• 10 
Moves 
Ahead 
campaign: 
airports, 
sta?ons, 
etc. 
– Sun 
sales 
staff 
pitched 
DTrace 
directly 
– Sun 
technology 
evangelists 
• Benefits 
– Not 
another 
Sun 
tech 
no 
one 
knew 
about 
– Compelled 
people 
to 
learn 
more, 
try 
it 
out
Marke?ng 
Evolved 
• Sun 
marke?ng 
become 
innova?ve 
– Engineering 
blogs, 
BigAdmin 
– Marke?ng 
staff 
who 
used 
and 
understood 
DTrace 
• Who 
could 
be^er 
ar?culate 
its 
value 
– Marke?ng 
more 
directly 
from 
the 
engineers
Later 
Marke?ng
Later 
Marke?ng 
• Many 
ini?a?ves 
by 
Deirdré 
Straughan: 
– Social 
media, 
blogs, 
events, 
the 
ponycorn 
mascot, 
... 
– Video 
and 
share 
everything: 
all 
meetups, 
talks 
• Blogs: 
– including 
h^p://dtrace.org; 
my 
own 
> 
1M 
views 
• Books: 
– my 
own 
> 
30k 
sold 
• Videos: 
– me 
shou?ng 
while 
DTracing 
disks, 
~1M 
views 
• Language 
support 
exposed 
new 
communi?es 
to 
DTrace
From DTrace to Linux
From DTrace to Linux
???
Who 
is 
DTrace 
for, 
really?
DTrace 
end-­‐users: 
Current 
Es?mated 
DTrace 
guide 
users: 
~100 
Script 
end-­‐users: 
~5,000 
Note: 
91.247% 
of 
sta?s?cs 
are 
made 
up
DTrace 
end-­‐users: 
Current 
• DTrace 
guide 
users: 
~100 
– Understand 
the 
~400 
page 
Dynamic 
Tracing 
Guide 
– Develop 
their 
own 
scripts 
from 
scratch 
– Understand 
overhead 
intui?vely 
• Script 
end-­‐users: 
~5000 
– DTraceToolkit, 
Google 
– Run 
scripts. 
Some 
tweaks/customiza?ons.
DTrace 
end-­‐users: 
Future 
eg, 
Oracle 
ZFS 
Storage 
Appliance 
Analy?cs
DTrace 
end-­‐users: 
Future 
Possible 
Future 
DTrace 
guide 
users: 
~100 
Script 
end-­‐users: 
~5,000 
GUI 
end-­‐users: 
>50,000
Company 
Usage
Company 
Usage 
• Prac?cal 
usage 
for 
most 
companies: 
– A) 
A 
performance 
team 
(or 
person) 
• Acquires 
useful 
scripts 
• Develops 
custom 
scripts 
– B) 
The 
rest 
of 
the 
company 
asks 
(A) 
for 
script/help 
• They 
need 
to 
know 
what’s 
possible, 
to 
know 
to 
ask 
– Or, 
you 
buy/develop 
a 
GUI 
that 
everyone 
can 
use 
• There 
are 
some 
excep?ons
Why 
doesn’t 
Linux 
have 
DTrace?
Why 
doesn’t 
Linux 
have 
a 
DTrace 
equivalent? 
4 
Answers…
1. 
It 
does 
(sort 
of) 
Jrace 
perf_events
1. 
It 
does 
(sort 
of) 
• Linux 
has 
changed 
– In 
2005, 
numerous 
Linux 
issues 
were 
difficult 
or 
impossible 
to 
solve. 
Linux 
needed 
a 
DTrace 
equivalent. 
– By 
2014, 
many 
of 
these 
are 
now 
solvable, 
especially 
using 
Jrace, 
perf_events, 
kprobes, 
uprobes: 
all 
part 
of 
the 
Linux 
kernel
2. 
Technical 
semantic error: missing x86_64 
kernel/module debuginfo
2. 
Technical 
• Linux 
is 
a 
more 
difficult 
environment 
– Solaris 
always 
has 
symbols, 
via 
CTF, 
which 
DTrace 
uses 
for 
dynamic 
tracing 
– Linux 
doesn’t 
always 
have 
symbols/debuginfo
3. 
Linux 
isn’t 
a 
Company 
“All 
the 
wood 
behind 
one 
arrow” 
– 
Sco^ 
McNealy, 
CEO, 
Sun 
Microsystems
3. 
Linux 
isn’t 
a 
Company 
• Linus 
can 
refuse 
patches, 
but 
can’t 
stop 
projects 
– The 
tracing 
wood 
is 
split 
between 
many 
arrows 
• Jrace, 
perf_events, 
eBPF, 
SystemTap, 
ktap, 
LTTng, 
… 
– And 
we 
are 
a 
small 
community: 
there’s 
not 
much 
wood 
to 
go 
around!
4. 
No 
Trace 
Race
4. 
No 
Trace 
Race 
• Post 
2001, 
Solaris 
was 
losing 
ground 
to 
Linux. 
Sun 
desperately 
needed 
differen?ators 
to 
survive 
– Three 
top 
Sun 
engineers 
spent 
years 
on 
DTrace 
– Sun 
marke?ng 
gave 
it 
their 
best 
shot… 
• This 
circumstance 
will 
never 
exist 
again 
– For 
Linux 
today, 
it 
would 
be 
like 
having 
Linus, 
Ingo, 
and 
Steven 
do 
tracing 
full-­‐?me 
for 
three 
years, 
followed 
by 
a 
major 
marke?ng 
campaign 
• There 
may 
never 
be 
another 
trace 
race. 
Unless…
Why 
doesn’t 
Linux 
have 
DTrace 
itself? 
2 
Answers…
1. 
The 
CDDL 
From: Claire Giordano <claire.giordano@sun.com> 
To: license-discuss@opensource.org 
[Open 
Source 
Ini?a?ve] 
Subject: For Approval: Common Development and Distribution License (CDDL) 
Date: Wed, 01 Dec 2004 19:47:39 -0800 
[…] 
Like the MPL, the CDDL is not expected to be compatible with the GPL, 
since it contains requirements that are not in the GPL (for example, 
the "patent peace" provision in section 6). Thus, it is likely that 
files released under the CDDL will not be able to be combined with 
files released under the GPL to create a larger program. 
[…] 
CDDL Team, Sun Microsystems 
Source: 
h^p://lwn.net/Ar?cles/114840/
1. 
The 
CDDL 
• Linux 
tradi?onally 
includes 
the 
tracer/profiler 
in 
the 
(GPL) 
kernel, 
but 
the 
DTrace 
license 
is 
CDDL 
– Out-­‐of-­‐tree 
projects 
have 
maintenance 
difficul?es 
– Oracle 
(who 
own 
the 
DTrace 
copyrights) 
could 
relicense 
it 
as 
GPL, 
but 
haven’t, 
and 
may 
never 
do 
this 
• Note 
that 
ZFS 
on 
Linux 
is 
doing 
well, 
despite 
being 
CDDL, 
and 
out 
of 
tree
2. 
DTrace 
ports
2. 
DTrace 
ports 
• There 
are 
two 
ports, 
but 
both 
currently 
incomplete 
• A) 
h^ps://github.com/dtrace4linux/linux: 
– Mostly 
one 
UK 
developer, 
Paul 
Fox, 
as 
a 
hobby 
since 
2008 
(when 
he 
isn’t 
developing 
on 
the 
Rasberry 
Pi) 
• B) 
Oracle 
Linux 
DTrace: 
– Open 
source 
kernel, 
closed 
source 
user-­‐level 
($) 
• We 
pay 
for 
monitoring 
tools; 
why 
not 
this 
too? 
– Experienced 
engineers, 
test 
suite 
focused 
– Had 
been 
good 
progress, 
but 
no 
updates 
for 
months
What 
with 
DTrace 
worked 
well? 
5 
Key 
items…
1. 
Produc?on 
Safety
1. 
Produc?on 
Safety 
• DTrace 
architecture 
– Restricted 
probe 
context: 
no 
kernel 
facility 
calls, 
restricted 
instruc?ons, 
no 
backwards 
branches, 
restricted 
loads/stores 
– Heartbeat: 
aborted 
due 
to 
systemic 
unresponsiveness 
• DTrace 
Test 
Suite 
– Hundreds 
of 
tests 
• Linux 
is 
learning 
this: 
– Oracle 
Linux 
DTrace 
is 
taking 
the 
test 
suite 
seriously 
– Jracetest
2. 
All 
the 
wood 
behind 
one 
arrow 
DTrace
2. 
All 
the 
wood 
behind 
one 
arrow 
• Can 
Linux 
learn 
this? 
– Can 
we 
vote 
some 
off 
the 
Linux 
tracing 
island? 
• At 
least, 
no 
new 
tracers 
in 
2015, 
please!
3. 
In-­‐Kernel 
Aggrega?ons 
value ------------- Distribution ------------- count! 
4096 | 0! 
8192 |@@@@@@@@@@@@@@@@@@@@@@@@@@ 1085! 
16384 |@@@@@@@@@@@ 443! 
32768 |@@ 98! 
65536 | 5! 
131072 | 1! 
262144 | 1! 
524288 | 0! 
1048576 | 11! 
2097152 | 0!
3. 
In-­‐Kernel 
Aggrega?ons 
• Changed 
how 
inves?ga?ons 
are 
conducted 
– rapid, 
live 
analysis 
• Low 
overhead: 
– per-­‐CPU 
storage, 
asynchronous 
kernel-­‐>user 
transfers 
• Key 
uses: 
– summary 
sta?s?cs: 
count, 
avg, 
min, 
max 
– histograms: 
quan?ze, 
lquan?ze 
– by-­‐key: 
execname, 
kernel 
stacks, 
user 
stacks 
• Linux 
can 
learn: 
– Need 
aggrega?ons 
(eBPF 
maps, 
SystemTap, 
ktap?)
4. 
Many 
Example 
Scripts
4. 
Many 
Example 
Scripts
4. 
Many 
Example 
Scripts 
• Scripts 
serve 
many 
needs: 
– tools: 
ready 
to 
run 
– examples: 
learn 
by-­‐example 
– marke?ng: 
each 
is 
a 
use 
case 
• DTrace 
book 
scripts 
– 150+ 
short 
examples 
• DTraceToolkit 
– 230 
more 
scripts 
– all 
DTraceToolkit 
scripts 
have 
man 
pages, 
example 
files, 
and 
are 
tested 
– An 
essen?al 
factor 
in 
DTrace’s 
adop?on
4. 
Many 
Example 
Scripts 
• Linux 
can 
learn: 
– Many 
users 
will 
just 
run 
scripts, 
not 
write 
them 
– People 
want 
good 
short 
examples 
– If 
they 
aren’t 
tested, 
they 
don’t 
work 
• It’s 
easy 
to 
generate 
metrics 
that 
kind-­‐of 
work; 
it’s 
hard 
to 
make 
them 
reliable 
for 
different 
workloads. 
– Maintenance 
of 
dynamic 
tracing 
scripts 
is 
painful 
• The 
instrumented 
code 
can 
change 
• Need 
more 
sta?c 
tracepoints
5. 
Marke?ng
5. 
Marke?ng 
• DTrace 
was 
effec?vely 
marketed 
in 
many 
ways 
– Tradi?onal, 
social, 
blogs, 
scripts, 
ponycorn, 
… 
• Linux 
has 
virtually 
no 
marke?ng 
for 
its 
tracers 
– Jrace 
is 
great, 
if 
you 
ever 
discover 
it; 
etc. 
– Marke?ng 
spend 
is 
on 
commercial 
products 
instead 
• Linux 
can 
learn 
to 
market 
what 
it 
has 
– Tracers 
may 
also 
benefit 
from 
“a 
great 
name 
and 
a 
cute 
logo”1 
– “eBPF” 
is 
not 
catchy, 
and 
doesn’t 
convey 
meaning 
1 
h^p://thenewstack.io/why-­‐did-­‐docker-­‐catch-­‐on-­‐quickly-­‐and-­‐why-­‐is-­‐it-­‐so-­‐interes?ng/
Cute 
Tracing 
Logos 
dtrace4linux 
perf_events 
ktap 
LTTng 
SystemTap 
Jrace 
Ponies 
by 
Deirdré 
Straughan, 
using: 
h^p://generalzoi.deviantart.com 
pony 
creator
Other 
Things 
• Programmable/scriptable 
• Built-­‐in 
stability 
seman?cs
What 
with 
DTrace 
didn’t 
work 
well?
What 
with 
DTrace 
didn’t 
work 
well? 
5 
Key 
Issues…
1. 
Adop?on
1. 
Adop?on 
• Few 
customers 
ever 
wrote 
DTrace 
scripts 
– DTrace 
should 
have 
been 
used 
more 
than 
it 
was 
– Sun’s 
“killer” 
tool 
just 
wasn’t 
– Be^er 
pickup 
rate 
with 
developers, 
not 
sysadmins 
• Many 
customers 
just 
ran 
my 
scripts 
– Not 
ideal, 
but 
be^er 
than 
nothing 
– This 
wasn’t 
what 
many 
at 
Sun 
dreamed 
• Internal 
adop?on 
was 
slow, 
limited 
– Sun 
could 
have 
done 
much 
more, 
but 
didn’t 
• The 
problem 
was 
knowing 
what 
to 
do 
with 
it 
– The 
syntax 
was 
the 
easy 
part
1. 
Adop?on 
• Linux 
can 
learn: 
– Adop?on 
is 
about 
more 
than 
just 
the 
technology 
• Documenta?on, 
marke?ng, 
training, 
community 
– Teaching 
what 
it 
does 
is 
more 
important 
than 
how 
• Everyone 
needs 
to 
know 
when 
to 
ask 
for 
it, 
not 
necessarily 
how 
to 
use 
it 
– Needs 
an 
adop?on 
curve 
(not 
a 
step 
func?on) 
• Tools, 
one-­‐liners, 
short 
scripts, 
…
2. 
Training 
This 
is 
to 
cer?fy 
that 
Brendan 
Gregg 
Has 
Completed 
the 
Sun 
Educa?onal 
course 
DTrace 
is 
a 
Solaris 
differen3ator
2. 
Training 
• Early 
training 
was 
not 
very 
effec?ve 
– Sun 
began 
including 
the 
DTraceToolkit 
in 
courses, 
with 
be^er 
success 
• It 
gradually 
improved 
– The 
last 
courses 
I 
developed 
and 
taught 
(aJer 
Sun) 
used 
simulated 
problems 
for 
the 
students 
to 
solve 
on 
their 
own 
with 
DTrace 
• Linux 
can 
learn: 
– Lab-­‐based 
training 
is 
most 
effec?ve. 
Online 
tutorials?
3. 
GUIs
3. 
GUIs 
• Dozens 
of 
performance 
monitoring 
products, 
but 
almost 
no 
meaningful 
DTrace 
support 
• A 
couple 
of 
excep?ons: 
– Oracle 
ZFS 
Storage 
Appliance 
Analy?cs 
• Formally 
the 
Sun 
Storage 
7000 
Analy?cs 
• Should 
be 
generalized. 
Oracle 
Solaris 
11.3? 
– Joyent 
Cloud 
Analy?cs
3. 
GUIs 
• Linux 
can 
learn: 
– Real 
adop?on 
possible 
through 
scripts 
& 
GUIs 
– Use 
the 
GUI 
to 
add 
value 
to 
the 
data 
• Heat 
maps: 
latency, 
u?liza?on, 
offset 
• Flame 
graphs 
• Time 
series 
thread 
visualiza?ons 
(Trace 
Compass) 
• ie, 
not 
just 
line 
graphs! 
– Commercial 
GUI 
products 
have 
marke?ng 
budget 
• Applica?on 
perf 
monitoring 
was 
$2.4B 
in 
20131 
1 
h^ps://www.gartner.com/doc/2752217/market-­‐share-­‐analysis-­‐applica?on-­‐performance
3. 
GUIs 
• Heat 
maps 
are 
an 
example 
must-­‐have 
use 
case 
for 
trace 
data
4. 
Overheads 
While 
the 
DTrace 
technology 
is 
awesome, 
it 
does 
have 
some 
minor 
technical 
challenges 
as 
well
4. 
Overheads 
• While 
op?mized, 
for 
many 
targets 
the 
DTrace 
CPU 
overheads 
can 
s?ll 
be 
too 
high 
– Scheduler 
tracing, 
memory 
alloca?on 
tracing 
– User-­‐level 
dynamic 
tracing 
(fast 
trap) 
– VM 
probes 
(eg, 
Java 
disables 
some 
probes 
by 
default) 
– 10 
GbE 
Network 
I/O, 
etc… 
• In 
some 
cases 
it 
doesn’t 
ma^er 
– Despera?on: 
system 
already 
mel?ng 
down 
– Troubleshoo?ng 
in 
dev: 
speed 
not 
a 
concern 
• Linux 
can 
learn: 
– Speed 
can 
ma^er, 
faster 
makes 
more 
possible
5. 
Syscall 
Provider
5. 
Syscall 
Provider 
• Solaris 
DTrace 
instrumented 
the 
trap 
table, 
and 
called 
it 
the 
syscall 
provider 
– Which 
is 
actually 
an 
unstable 
interface 
– Breaks 
between 
Solaris 
versions 
• And 
really 
broke 
in 
Oracle 
Solaris 
11 
– Other 
weird 
caveats 
• Linux 
can 
learn: 
– syscalls 
are 
the 
#1 
target 
for 
users 
learning 
system 
tracers. 
The 
API 
should 
be 
easy 
and 
stable.
Other 
Issues 
• The 
lack 
of: 
– Bounded 
loops 
(like 
SystemTap) 
– Kernel 
instruc?on 
tracing 
(like 
perf_events) 
– Easy 
PMC 
interface 
(like 
perf 
stat) 
– Aggrega?on 
key/value 
access 
(stap, 
ktap, 
eBPF) 
– Kernel 
source 
(issue 
for 
Oracle 
Solaris 
only) 
• 4+ 
second 
startup 
?mes 
– Several 
Linux 
tracers 
start 
instantly
From 
DTrace 
to 
Linux 
Tracers 
(2014)
• Massive 
AWS 
EC2 
Linux 
cloud, 
with 
FreeBSD 
appliances 
for 
content 
delivery 
• Performance 
is 
cri?cal: 
>50M 
subscribers 
• Just 
launched 
in 
Europe!
System 
Tracing 
at 
Ne7lix 
• Present: 
– Jrace 
can 
serve 
many 
needs 
– perf_events 
some 
more, 
esp. 
with 
debuginfo 
– SystemTap 
as 
needed, 
esp. 
for 
Java 
– ad 
hoc 
other 
tools 
• Future: 
– Jrace/perf_events/ktap 
with 
eBPF, 
for 
a 
fully 
featured 
and 
mainline 
tracer? 
– One 
of 
the 
other 
tracers 
going 
mainline? 
• Summarizing 
4 
tracers…
1. 
Jrace
1. 
Jrace 
• Tracing 
and 
profiling: 
/sys/kernel/debug/tracing 
– added 
by 
Steven 
Rostedt 
and 
others 
since 
2.6.27, 
and 
already 
enabled 
on 
our 
servers 
(3.2+) 
• Experiences: 
– very 
useful 
capabili?es: 
tracing, 
coun?ng 
– surprising 
features: 
graphing 
(latencies), 
filters 
• Front-­‐end 
tools 
to 
ease 
use 
– h^ps://github.com/brendangregg/perf-­‐tools 
– WARNING: 
these 
are 
unsupported 
hacks 
– There’s 
also 
the 
trace-­‐cmd 
front-­‐end 
by 
Steven 
• 4 
examples…
perf-­‐tools: 
iosnoop 
• Block 
I/O 
(disk) 
events 
with 
latency: 
# ./iosnoop –ts! 
Tracing block I/O. Ctrl-C to end.! 
STARTs ENDs COMM PID TYPE DEV BLOCK BYTES LATms! 
5982800.302061 5982800.302679 supervise 1809 W 202,1 17039600 4096 0.62! 
5982800.302423 5982800.302842 supervise 1809 W 202,1 17039608 4096 0.42! 
5982800.304962 5982800.305446 supervise 1801 W 202,1 17039616 4096 0.48! 
5982800.305250 5982800.305676 supervise 1801 W 202,1 17039624 4096 0.43! 
[…]! 
# ./iosnoop –h! 
USAGE: iosnoop [-hQst] [-d device] [-i iotype] [-p PID] [-n name] [duration]! 
-d device # device string (eg, "202,1)! 
-i iotype # match type (eg, '*R*' for all reads)! 
-n name # process name to match on I/O issue! 
-p PID # PID to match on I/O issue! 
-Q # include queueing time in LATms! 
-s # include start time of I/O (s)! 
-t # include completion time of I/O (s)! 
-h # this usage message! 
duration # duration seconds, and use buffers! 
[…]!
perf-­‐tools: 
iolatency 
• Block 
I/O 
(disk) 
latency 
distribu?ons: 
# ./iolatency ! 
Tracing block I/O. Output every 1 seconds. Ctrl-C to end.! 
! 
>=(ms) .. <(ms) : I/O |Distribution |! 
0 -> 1 : 1144 |######################################|! 
1 -> 2 : 267 |######### |! 
2 -> 4 : 10 |# |! 
4 -> 8 : 5 |# |! 
8 -> 16 : 248 |######### |! 
16 -> 32 : 601 |#################### |! 
32 -> 64 : 117 |#### |! 
• User-­‐level 
processing 
some?mes 
can’t 
keep 
up 
– Over 
50k 
IOPS. 
Could 
buffer 
more 
workaround, 
but 
would 
prefer 
in-­‐kernel 
aggrega?ons 
[…]!
perf-­‐tools: 
opensnoop 
• Trace 
open() 
syscalls 
showing 
filenames: 
# ./opensnoop -t! 
Tracing open()s. Ctrl-C to end.! 
TIMEs COMM PID FD FILE! 
4345768.332626 postgres 23886 0x8 /proc/self/oom_adj! 
4345768.333923 postgres 23886 0x5 global/pg_filenode.map! 
4345768.333971 postgres 23886 0x5 global/pg_internal.init! 
4345768.334813 postgres 23886 0x5 base/16384/PG_VERSION! 
4345768.334877 postgres 23886 0x5 base/16384/pg_filenode.map! 
4345768.334891 postgres 23886 0x5 base/16384/pg_internal.init! 
4345768.335821 postgres 23886 0x5 base/16384/11725! 
4345768.347911 svstat 24649 0x4 supervise/ok! 
4345768.347921 svstat 24649 0x4 supervise/status! 
4345768.350340 stat 24651 0x3 /etc/ld.so.cache! 
4345768.350372 stat 24651 0x3 /lib/x86_64-linux-gnu/libselinux…! 
4345768.350460 stat 24651 0x3 /lib/x86_64-linux-gnu/libc.so.6! 
4345768.350526 stat 24651 0x3 /lib/x86_64-linux-gnu/libdl.so.2! 
4345768.350981 stat 24651 0x3 /proc/filesystems! 
4345768.351182 stat 24651 0x3 /etc/nsswitch.conf! 
[…]!
perf-­‐tools: 
kprobe 
• Just 
wrapping 
capabili?es 
eases 
use. 
Eg, 
kprobes: 
# ./kprobe 'p:open do_sys_open filename=+0(%si):string' 'filename ~ "*stat"'! 
Tracing kprobe myopen. Ctrl-C to end.! 
+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"! 
+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"! 
+0x0/0x220) filename="pg_stat_tmp/pgstat.stat”! 
^C! 
Ending tracing...! 
• By 
postgres-1172 [000] d... 6594028.787166: open: (do_sys_open 
postgres-1172 [001] d... 6594028.797410: open: (do_sys_open 
postgres-1172 [001] d... 6594028.797467: open: (do_sys_open 
some 
defini?on 
of 
“ease”. 
Would 
like 
easier 
symbol 
usage, 
instead 
of 
+0(%si).
1. 
Jrace 
• Sugges?ons: 
– I’m 
blogging 
and 
so 
can 
you! 
– Func?on 
profiler: 
• Can 
these 
in-­‐kernel 
counts 
be 
used 
for 
other 
vars? 
Eg, 
associa?ve 
array 
or 
histogram 
of 
%dx 
– Func?on 
grapher: 
• Can 
the 
?ming 
be 
exposed 
by 
some 
vars? 
Picture 
histogram 
of 
latency 
– Mul?-­‐user 
access 
possible?
2. 
perf_events
2. 
perf_events 
• In-­‐kernel, 
tools/perf, 
mul?-­‐tool, 
“perf” 
command 
• Experiences: 
– Stable, 
powerful, 
reliable 
– The 
sub 
op?ons 
can 
feel 
inconsistent 
(perf 
bench?) 
– Amazing 
with 
kernel 
debuginfo, 
when 
we 
have 
it 
– We 
use 
it 
for 
CPU 
stack 
profiles 
all 
the 
?me 
• And 
turn 
them 
into 
flame 
graphs, 
which 
have 
solved 
numerous 
issues 
so 
far…
perf 
CPU 
Flame 
Graph 
Broken 
Java 
stacks 
(missing 
frame 
pointer) 
Kernel 
TCP/IP 
GC 
Idle 
Time 
thread 
Locks 
epoll
2. 
perf_events 
• Sugges?ons: 
– Support 
for 
func?on 
argument 
symbols 
without 
a 
full 
debuginfo 
– Rework 
scrip?ng 
framework 
(eg, 
try 
por?ng 
iosnoop) 
• eg, 
“perf 
record” 
may 
need 
a 
tunable 
?meout 
to 
trigger 
data 
writes, 
for 
efficient 
interac?ve 
scripts 
– Break 
up 
the 
mul?-­‐tool 
a 
bit 
(separate 
perf 
bench) 
– eBPF 
integra?on 
for 
custom 
aggrega?ons?
3. 
SystemTap
3. 
SystemTap 
• The 
most 
powerful 
of 
the 
tracers 
• Used 
for 
the 
deepest 
custom 
tracing 
– Especially 
Java 
hotspot 
probes 
• Experiences: 
– Undergoing 
a 
reset. 
Switching 
to 
the 
latest 
SystemTap 
version, 
and 
a 
newer 
kernel. 
So 
far, 
so 
good. 
– Trying 
out 
nd_syscall 
for 
debuginfo-­‐less 
tracing 
• Sugges?ons: 
– More 
non-­‐debuginfo 
tapset 
func?onality
4. 
eBPF
4. 
eBPF 
• Extended 
BPF: 
programs 
on 
tracepoints 
– High 
performance 
filtering: 
JIT 
– In-­‐kernel 
summaries: 
maps 
• eg, 
in-­‐kernel 
latency 
heat 
map 
(showing 
bimodal): 
Low 
latency 
cache 
hits 
High 
latency 
device 
I/O 
Time
4. 
eBPF 
• Experiences: 
– Can 
have 
lower 
CPU 
overhead 
than 
DTrace 
– Very 
powerful: 
really 
custom 
maps 
– Assembly 
version 
very 
hard 
to 
use; 
C 
is 
be^er, 
but 
s?ll 
not 
easy 
• Sugges?ons: 
– Integrate: 
custom 
in-­‐kernel 
aggrega?ons 
is 
the 
missing 
piece
Other 
Tracers 
• Experiences 
and 
sugges?ons: 
– ktap 
– LTTng 
– Oracle 
Linux 
DTrace 
– dtrace4linux 
– sysdig
The 
Tracing 
Landscape, 
Oct 
2014 
Scope 
& 
Capability 
(less 
brutal) 
Ease 
of 
use 
sysdig 
perf 
Jrace 
eBPF 
ktap 
stap 
Stage 
of 
Development 
(my 
opinion) 
dtrace4L. 
(brutal) 
(alpha) 
(mature)
Summary 
• DTrace 
is 
an 
awesome 
technology 
– Which 
has 
also 
had 
awesome 
marke?ng 
• Tradi?onal, 
social, 
sales, 
blogs, 
… 
– Most 
people 
won’t 
use 
it 
directly, 
and 
that’s 
ok 
• Drive 
usage 
via 
GUIs 
and 
scripts 
• Linux 
Tracers 
are 
catching 
up, 
and 
may 
surpass 
– It’s 
not 
2005 
anymore 
• Now 
we 
have 
Jrace, 
perf_events, 
kprobes, 
uprobes, 
… 
– Speed 
and 
aggrega?ons 
ma^er 
• If 
DTrace 
is 
Ki^y 
Hawk, 
eBPF 
is 
a 
jet 
engine
Acks 
• dtrace.conf 
X-­‐ray 
pony 
art 
by 
substack 
• h^p://www.raspberrypi.org/ 
rasberry 
PI 
image 
• h^p://en.wikipedia.org/wiki/Crash_test_dummy 
photo 
by 
Brady 
Holt 
• h^ps://findery.com/johnfox/notes/all-­‐the-­‐wood-­‐behind-­‐one-­‐arrow 
• h^p://en.wikipedia.org/wiki/Early_flying_machines 
hang 
glider 
image 
• h^p://www.beginningwithi.com/2010/09/12/how-­‐the-­‐dtrace-­‐book-­‐got-­‐ 
done/ 
• h^p://www.cafepress.com/joyentsmartos.724465338 
• h^p://generalzoi.deviantart.com/art/Pony-­‐Creator-­‐v3-­‐397808116 
• Tux 
by 
Larry 
Ewing; 
Linux® 
is 
the 
registered 
trademark 
of 
Linus 
Torvalds 
in 
the 
U.S. 
and 
other 
countries. 
• Thanks 
Dominic 
Kay 
and 
Deirdré 
Straughan 
for 
feedback
Links 
• h^ps://www.usenix.org/legacy/event/usenix04/tech/general/full_papers/ 
cantrill/cantrill.pdf 
• Jrace 
& 
perf-­‐tools 
• h^ps://github.com/brendangregg/perf-­‐tools 
• h^p://lwn.net/Ar?cles/608497/ 
• eBPF: 
h^p://lwn.net/Ar?cles/603983/ 
• ktap: 
h^p://www.ktap.org/ 
• SystemTap: 
h^ps://sourceware.org/systemtap/ 
• sysdig: 
h^p://www.sysdig.org/ 
• h^p://lwn.net/Ar?cles/114840/ 
CDDL 
• h^p://dtrace.org/blogs/ahl/2011/10/05/dtrace-­‐for-­‐linux-­‐2/ 
• Jp://Jp.cs.wisc.edu/paradyn/papers/Tamches99Using.pdf 
• h^p://www.brendangregg.com/heatmaps.html 
• h^p://lkml.iu.edu/hypermail/linux/kernel/0011.3/0183.html 
LTT 
+ 
DProbes
Thanks 
• Ques?ons? 
• h^p://slideshare.net/brendangregg 
• h^p://www.brendangregg.com 
• bgregg@ne7lix.com 
• @brendangregg

More Related Content

What's hot

Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016Brendan Gregg
 
Linux Performance Analysis and Tools
Linux Performance Analysis and ToolsLinux Performance Analysis and Tools
Linux Performance Analysis and ToolsBrendan Gregg
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance AnalysisBrendan Gregg
 
Kdump and the kernel crash dump analysis
Kdump and the kernel crash dump analysisKdump and the kernel crash dump analysis
Kdump and the kernel crash dump analysisBuland Singh
 
eBPF - Rethinking the Linux Kernel
eBPF - Rethinking the Linux KerneleBPF - Rethinking the Linux Kernel
eBPF - Rethinking the Linux KernelThomas Graf
 
DPDK: Multi Architecture High Performance Packet Processing
DPDK: Multi Architecture High Performance Packet ProcessingDPDK: Multi Architecture High Performance Packet Processing
DPDK: Multi Architecture High Performance Packet ProcessingMichelle Holley
 
LSFMM 2019 BPF Observability
LSFMM 2019 BPF ObservabilityLSFMM 2019 BPF Observability
LSFMM 2019 BPF ObservabilityBrendan Gregg
 
Profiling your Applications using the Linux Perf Tools
Profiling your Applications using the Linux Perf ToolsProfiling your Applications using the Linux Perf Tools
Profiling your Applications using the Linux Perf ToolsemBO_Conference
 
05.2 virtio introduction
05.2 virtio introduction05.2 virtio introduction
05.2 virtio introductionzenixls2
 
Linux Linux Traffic Control
Linux Linux Traffic ControlLinux Linux Traffic Control
Linux Linux Traffic ControlSUSE Labs Taipei
 
U Boot or Universal Bootloader
U Boot or Universal BootloaderU Boot or Universal Bootloader
U Boot or Universal BootloaderSatpal Parmar
 
Linux BPF Superpowers
Linux BPF SuperpowersLinux BPF Superpowers
Linux BPF SuperpowersBrendan Gregg
 
Linux kernel tracing
Linux kernel tracingLinux kernel tracing
Linux kernel tracingViller Hsiao
 
Faster packet processing in Linux: XDP
Faster packet processing in Linux: XDPFaster packet processing in Linux: XDP
Faster packet processing in Linux: XDPDaniel T. Lee
 
Linux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsBrendan Gregg
 
UM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareBrendan Gregg
 
Embedded Linux BSP Training (Intro)
Embedded Linux BSP Training (Intro)Embedded Linux BSP Training (Intro)
Embedded Linux BSP Training (Intro)RuggedBoardGroup
 

What's hot (20)

Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016
 
Linux Performance Analysis and Tools
Linux Performance Analysis and ToolsLinux Performance Analysis and Tools
Linux Performance Analysis and Tools
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance Analysis
 
Understanding DPDK
Understanding DPDKUnderstanding DPDK
Understanding DPDK
 
Kdump and the kernel crash dump analysis
Kdump and the kernel crash dump analysisKdump and the kernel crash dump analysis
Kdump and the kernel crash dump analysis
 
eBPF - Rethinking the Linux Kernel
eBPF - Rethinking the Linux KerneleBPF - Rethinking the Linux Kernel
eBPF - Rethinking the Linux Kernel
 
DPDK: Multi Architecture High Performance Packet Processing
DPDK: Multi Architecture High Performance Packet ProcessingDPDK: Multi Architecture High Performance Packet Processing
DPDK: Multi Architecture High Performance Packet Processing
 
LSFMM 2019 BPF Observability
LSFMM 2019 BPF ObservabilityLSFMM 2019 BPF Observability
LSFMM 2019 BPF Observability
 
Profiling your Applications using the Linux Perf Tools
Profiling your Applications using the Linux Perf ToolsProfiling your Applications using the Linux Perf Tools
Profiling your Applications using the Linux Perf Tools
 
05.2 virtio introduction
05.2 virtio introduction05.2 virtio introduction
05.2 virtio introduction
 
Linux Linux Traffic Control
Linux Linux Traffic ControlLinux Linux Traffic Control
Linux Linux Traffic Control
 
U Boot or Universal Bootloader
U Boot or Universal BootloaderU Boot or Universal Bootloader
U Boot or Universal Bootloader
 
Linux BPF Superpowers
Linux BPF SuperpowersLinux BPF Superpowers
Linux BPF Superpowers
 
Linux kernel tracing
Linux kernel tracingLinux kernel tracing
Linux kernel tracing
 
Faster packet processing in Linux: XDP
Faster packet processing in Linux: XDPFaster packet processing in Linux: XDP
Faster packet processing in Linux: XDP
 
Linux-Internals-and-Networking
Linux-Internals-and-NetworkingLinux-Internals-and-Networking
Linux-Internals-and-Networking
 
Linux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old Secrets
 
UM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of Software
 
ゼロからはじめるKVM超入門
ゼロからはじめるKVM超入門ゼロからはじめるKVM超入門
ゼロからはじめるKVM超入門
 
Embedded Linux BSP Training (Intro)
Embedded Linux BSP Training (Intro)Embedded Linux BSP Training (Intro)
Embedded Linux BSP Training (Intro)
 

Viewers also liked

Linux Profiling at Netflix
Linux Profiling at NetflixLinux Profiling at Netflix
Linux Profiling at NetflixBrendan Gregg
 
Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016Brendan Gregg
 
Blazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBrendan Gregg
 
Open Source Systems Performance
Open Source Systems PerformanceOpen Source Systems Performance
Open Source Systems PerformanceBrendan Gregg
 
Designing Tracing Tools
Designing Tracing ToolsDesigning Tracing Tools
Designing Tracing ToolsBrendan Gregg
 
SREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREsSREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREsBrendan Gregg
 
Linux Performance Tools 2014
Linux Performance Tools 2014Linux Performance Tools 2014
Linux Performance Tools 2014Brendan Gregg
 
FreeBSD 2014 Flame Graphs
FreeBSD 2014 Flame GraphsFreeBSD 2014 Flame Graphs
FreeBSD 2014 Flame GraphsBrendan Gregg
 
The New Systems Performance
The New Systems PerformanceThe New Systems Performance
The New Systems PerformanceBrendan Gregg
 
DTrace Topics: Introduction
DTrace Topics: IntroductionDTrace Topics: Introduction
DTrace Topics: IntroductionBrendan Gregg
 
Linux Performance Tools
Linux Performance ToolsLinux Performance Tools
Linux Performance ToolsBrendan Gregg
 
Systems Performance: Enterprise and the Cloud
Systems Performance: Enterprise and the CloudSystems Performance: Enterprise and the Cloud
Systems Performance: Enterprise and the CloudBrendan Gregg
 
Stop the Guessing: Performance Methodologies for Production Systems
Stop the Guessing: Performance Methodologies for Production SystemsStop the Guessing: Performance Methodologies for Production Systems
Stop the Guessing: Performance Methodologies for Production SystemsBrendan Gregg
 
Performance Analysis: The USE Method
Performance Analysis: The USE MethodPerformance Analysis: The USE Method
Performance Analysis: The USE MethodBrendan Gregg
 
ACM Applicative System Methodology 2016
ACM Applicative System Methodology 2016ACM Applicative System Methodology 2016
ACM Applicative System Methodology 2016Brendan Gregg
 
What Linux can learn from Solaris performance and vice-versa
What Linux can learn from Solaris performance and vice-versaWhat Linux can learn from Solaris performance and vice-versa
What Linux can learn from Solaris performance and vice-versaBrendan Gregg
 
Netflix: From Clouds to Roots
Netflix: From Clouds to RootsNetflix: From Clouds to Roots
Netflix: From Clouds to RootsBrendan Gregg
 
Monitorama 2015 Netflix Instance Analysis
Monitorama 2015 Netflix Instance AnalysisMonitorama 2015 Netflix Instance Analysis
Monitorama 2015 Netflix Instance AnalysisBrendan Gregg
 
Massively Distributed Backups at Facebook Scale - Shlomo Priymak, Facebook - ...
Massively Distributed Backups at Facebook Scale - Shlomo Priymak, Facebook - ...Massively Distributed Backups at Facebook Scale - Shlomo Priymak, Facebook - ...
Massively Distributed Backups at Facebook Scale - Shlomo Priymak, Facebook - ...DevOpsDays Tel Aviv
 
You can be a kernel hacker
You can be a kernel hackerYou can be a kernel hacker
You can be a kernel hackerjvans
 

Viewers also liked (20)

Linux Profiling at Netflix
Linux Profiling at NetflixLinux Profiling at Netflix
Linux Profiling at Netflix
 
Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016
 
Blazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame Graphs
 
Open Source Systems Performance
Open Source Systems PerformanceOpen Source Systems Performance
Open Source Systems Performance
 
Designing Tracing Tools
Designing Tracing ToolsDesigning Tracing Tools
Designing Tracing Tools
 
SREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREsSREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREs
 
Linux Performance Tools 2014
Linux Performance Tools 2014Linux Performance Tools 2014
Linux Performance Tools 2014
 
FreeBSD 2014 Flame Graphs
FreeBSD 2014 Flame GraphsFreeBSD 2014 Flame Graphs
FreeBSD 2014 Flame Graphs
 
The New Systems Performance
The New Systems PerformanceThe New Systems Performance
The New Systems Performance
 
DTrace Topics: Introduction
DTrace Topics: IntroductionDTrace Topics: Introduction
DTrace Topics: Introduction
 
Linux Performance Tools
Linux Performance ToolsLinux Performance Tools
Linux Performance Tools
 
Systems Performance: Enterprise and the Cloud
Systems Performance: Enterprise and the CloudSystems Performance: Enterprise and the Cloud
Systems Performance: Enterprise and the Cloud
 
Stop the Guessing: Performance Methodologies for Production Systems
Stop the Guessing: Performance Methodologies for Production SystemsStop the Guessing: Performance Methodologies for Production Systems
Stop the Guessing: Performance Methodologies for Production Systems
 
Performance Analysis: The USE Method
Performance Analysis: The USE MethodPerformance Analysis: The USE Method
Performance Analysis: The USE Method
 
ACM Applicative System Methodology 2016
ACM Applicative System Methodology 2016ACM Applicative System Methodology 2016
ACM Applicative System Methodology 2016
 
What Linux can learn from Solaris performance and vice-versa
What Linux can learn from Solaris performance and vice-versaWhat Linux can learn from Solaris performance and vice-versa
What Linux can learn from Solaris performance and vice-versa
 
Netflix: From Clouds to Roots
Netflix: From Clouds to RootsNetflix: From Clouds to Roots
Netflix: From Clouds to Roots
 
Monitorama 2015 Netflix Instance Analysis
Monitorama 2015 Netflix Instance AnalysisMonitorama 2015 Netflix Instance Analysis
Monitorama 2015 Netflix Instance Analysis
 
Massively Distributed Backups at Facebook Scale - Shlomo Priymak, Facebook - ...
Massively Distributed Backups at Facebook Scale - Shlomo Priymak, Facebook - ...Massively Distributed Backups at Facebook Scale - Shlomo Priymak, Facebook - ...
Massively Distributed Backups at Facebook Scale - Shlomo Priymak, Facebook - ...
 
You can be a kernel hacker
You can be a kernel hackerYou can be a kernel hacker
You can be a kernel hacker
 

Similar to From DTrace to Linux

The Dirty Little Secrets They Didn’t Teach You In Pentesting Class
The Dirty Little Secrets They Didn’t Teach You In Pentesting Class The Dirty Little Secrets They Didn’t Teach You In Pentesting Class
The Dirty Little Secrets They Didn’t Teach You In Pentesting Class Chris Gates
 
Linux Perf Tools
Linux Perf ToolsLinux Perf Tools
Linux Perf ToolsRaj Pandey
 
Fast and Scalable Python
Fast and Scalable PythonFast and Scalable Python
Fast and Scalable PythonTravis Oliphant
 
Deep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best PracticesDeep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best PracticesJen Aman
 
Deep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best PracticesDeep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best PracticesJen Aman
 
Deep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best PracticesDeep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best PracticesDatabricks
 
The DITA Iceberg, DITA Europe 2016
The DITA Iceberg, DITA Europe 2016The DITA Iceberg, DITA Europe 2016
The DITA Iceberg, DITA Europe 2016IXIASOFT
 
Solaris DTrace, An Introduction
Solaris DTrace, An IntroductionSolaris DTrace, An Introduction
Solaris DTrace, An Introductionsatyajit_t
 
Monitoring and metrics in the cloud
Monitoring and metrics in the cloudMonitoring and metrics in the cloud
Monitoring and metrics in the cloudDavid Lutz
 
Silicon Valley Code Camp 2016 - MongoDB in production
Silicon Valley Code Camp 2016 - MongoDB in productionSilicon Valley Code Camp 2016 - MongoDB in production
Silicon Valley Code Camp 2016 - MongoDB in productionDaniel Coupal
 
Running TFLite on Your Mobile Devices, 2020
Running TFLite on Your Mobile Devices, 2020Running TFLite on Your Mobile Devices, 2020
Running TFLite on Your Mobile Devices, 2020Koan-Sin Tan
 
Solving Large Scale Optimization Problems using CPLEX Optimization Studio
Solving Large Scale Optimization Problems using CPLEX Optimization StudioSolving Large Scale Optimization Problems using CPLEX Optimization Studio
Solving Large Scale Optimization Problems using CPLEX Optimization Studiooptimizatiodirectdirect
 
The State of Rootless Containers
The State of Rootless ContainersThe State of Rootless Containers
The State of Rootless ContainersAkihiro Suda
 
Notes from 2016 bay area deep learning school
Notes from 2016 bay area deep learning school Notes from 2016 bay area deep learning school
Notes from 2016 bay area deep learning school Niketan Pansare
 
dtrace_topics_intro.pdf
dtrace_topics_intro.pdfdtrace_topics_intro.pdf
dtrace_topics_intro.pdfssuser785ce21
 
Software Engineering Advice from Google's Jeff Dean for Big, Distributed Systems
Software Engineering Advice from Google's Jeff Dean for Big, Distributed SystemsSoftware Engineering Advice from Google's Jeff Dean for Big, Distributed Systems
Software Engineering Advice from Google's Jeff Dean for Big, Distributed Systemsadrianionel
 
Choosing the right parallel compute architecture
Choosing the right parallel compute architecture Choosing the right parallel compute architecture
Choosing the right parallel compute architecture corehard_by
 
Introducing TensorFlow: The game changer in building "intelligent" applications
Introducing TensorFlow: The game changer in building "intelligent" applicationsIntroducing TensorFlow: The game changer in building "intelligent" applications
Introducing TensorFlow: The game changer in building "intelligent" applicationsRokesh Jankie
 
Engage 2020 - Best Practices for analyzing Domino Applications
Engage 2020 - Best Practices for analyzing Domino ApplicationsEngage 2020 - Best Practices for analyzing Domino Applications
Engage 2020 - Best Practices for analyzing Domino Applicationspanagenda
 

Similar to From DTrace to Linux (20)

The Dirty Little Secrets They Didn’t Teach You In Pentesting Class
The Dirty Little Secrets They Didn’t Teach You In Pentesting Class The Dirty Little Secrets They Didn’t Teach You In Pentesting Class
The Dirty Little Secrets They Didn’t Teach You In Pentesting Class
 
Linux Perf Tools
Linux Perf ToolsLinux Perf Tools
Linux Perf Tools
 
Fast and Scalable Python
Fast and Scalable PythonFast and Scalable Python
Fast and Scalable Python
 
Deep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best PracticesDeep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best Practices
 
Deep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best PracticesDeep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best Practices
 
Deep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best PracticesDeep Learning on Apache® Spark™: Workflows and Best Practices
Deep Learning on Apache® Spark™: Workflows and Best Practices
 
PyData Boston 2013
PyData Boston 2013PyData Boston 2013
PyData Boston 2013
 
The DITA Iceberg, DITA Europe 2016
The DITA Iceberg, DITA Europe 2016The DITA Iceberg, DITA Europe 2016
The DITA Iceberg, DITA Europe 2016
 
Solaris DTrace, An Introduction
Solaris DTrace, An IntroductionSolaris DTrace, An Introduction
Solaris DTrace, An Introduction
 
Monitoring and metrics in the cloud
Monitoring and metrics in the cloudMonitoring and metrics in the cloud
Monitoring and metrics in the cloud
 
Silicon Valley Code Camp 2016 - MongoDB in production
Silicon Valley Code Camp 2016 - MongoDB in productionSilicon Valley Code Camp 2016 - MongoDB in production
Silicon Valley Code Camp 2016 - MongoDB in production
 
Running TFLite on Your Mobile Devices, 2020
Running TFLite on Your Mobile Devices, 2020Running TFLite on Your Mobile Devices, 2020
Running TFLite on Your Mobile Devices, 2020
 
Solving Large Scale Optimization Problems using CPLEX Optimization Studio
Solving Large Scale Optimization Problems using CPLEX Optimization StudioSolving Large Scale Optimization Problems using CPLEX Optimization Studio
Solving Large Scale Optimization Problems using CPLEX Optimization Studio
 
The State of Rootless Containers
The State of Rootless ContainersThe State of Rootless Containers
The State of Rootless Containers
 
Notes from 2016 bay area deep learning school
Notes from 2016 bay area deep learning school Notes from 2016 bay area deep learning school
Notes from 2016 bay area deep learning school
 
dtrace_topics_intro.pdf
dtrace_topics_intro.pdfdtrace_topics_intro.pdf
dtrace_topics_intro.pdf
 
Software Engineering Advice from Google's Jeff Dean for Big, Distributed Systems
Software Engineering Advice from Google's Jeff Dean for Big, Distributed SystemsSoftware Engineering Advice from Google's Jeff Dean for Big, Distributed Systems
Software Engineering Advice from Google's Jeff Dean for Big, Distributed Systems
 
Choosing the right parallel compute architecture
Choosing the right parallel compute architecture Choosing the right parallel compute architecture
Choosing the right parallel compute architecture
 
Introducing TensorFlow: The game changer in building "intelligent" applications
Introducing TensorFlow: The game changer in building "intelligent" applicationsIntroducing TensorFlow: The game changer in building "intelligent" applications
Introducing TensorFlow: The game changer in building "intelligent" applications
 
Engage 2020 - Best Practices for analyzing Domino Applications
Engage 2020 - Best Practices for analyzing Domino ApplicationsEngage 2020 - Best Practices for analyzing Domino Applications
Engage 2020 - Best Practices for analyzing Domino Applications
 

More from Brendan Gregg

YOW2021 Computing Performance
YOW2021 Computing PerformanceYOW2021 Computing Performance
YOW2021 Computing PerformanceBrendan Gregg
 
IntelON 2021 Processor Benchmarking
IntelON 2021 Processor BenchmarkingIntelON 2021 Processor Benchmarking
IntelON 2021 Processor BenchmarkingBrendan Gregg
 
Systems@Scale 2021 BPF Performance Getting Started
Systems@Scale 2021 BPF Performance Getting StartedSystems@Scale 2021 BPF Performance Getting Started
Systems@Scale 2021 BPF Performance Getting StartedBrendan Gregg
 
Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)Brendan Gregg
 
YOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceYOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceBrendan Gregg
 
re:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at Netflixre:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at NetflixBrendan Gregg
 
LISA2019 Linux Systems Performance
LISA2019 Linux Systems PerformanceLISA2019 Linux Systems Performance
LISA2019 Linux Systems PerformanceBrendan Gregg
 
LPC2019 BPF Tracing Tools
LPC2019 BPF Tracing ToolsLPC2019 BPF Tracing Tools
LPC2019 BPF Tracing ToolsBrendan Gregg
 
YOW2018 CTO Summit: Working at netflix
YOW2018 CTO Summit: Working at netflixYOW2018 CTO Summit: Working at netflix
YOW2018 CTO Summit: Working at netflixBrendan Gregg
 
eBPF Perf Tools 2019
eBPF Perf Tools 2019eBPF Perf Tools 2019
eBPF Perf Tools 2019Brendan Gregg
 
YOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixYOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixBrendan Gregg
 
NetConf 2018 BPF Observability
NetConf 2018 BPF ObservabilityNetConf 2018 BPF Observability
NetConf 2018 BPF ObservabilityBrendan Gregg
 
ATO Linux Performance 2018
ATO Linux Performance 2018ATO Linux Performance 2018
ATO Linux Performance 2018Brendan Gregg
 
Linux Performance 2018 (PerconaLive keynote)
Linux Performance 2018 (PerconaLive keynote)Linux Performance 2018 (PerconaLive keynote)
Linux Performance 2018 (PerconaLive keynote)Brendan Gregg
 
How Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for PerformanceHow Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for PerformanceBrendan Gregg
 
LISA17 Container Performance Analysis
LISA17 Container Performance AnalysisLISA17 Container Performance Analysis
LISA17 Container Performance AnalysisBrendan Gregg
 
Kernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at NetflixKernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at NetflixBrendan Gregg
 
Kernel Recipes 2017: Performance Analysis with BPF
Kernel Recipes 2017: Performance Analysis with BPFKernel Recipes 2017: Performance Analysis with BPF
Kernel Recipes 2017: Performance Analysis with BPFBrendan Gregg
 

More from Brendan Gregg (20)

YOW2021 Computing Performance
YOW2021 Computing PerformanceYOW2021 Computing Performance
YOW2021 Computing Performance
 
IntelON 2021 Processor Benchmarking
IntelON 2021 Processor BenchmarkingIntelON 2021 Processor Benchmarking
IntelON 2021 Processor Benchmarking
 
Systems@Scale 2021 BPF Performance Getting Started
Systems@Scale 2021 BPF Performance Getting StartedSystems@Scale 2021 BPF Performance Getting Started
Systems@Scale 2021 BPF Performance Getting Started
 
Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)
 
YOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceYOW2020 Linux Systems Performance
YOW2020 Linux Systems Performance
 
re:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at Netflixre:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at Netflix
 
LISA2019 Linux Systems Performance
LISA2019 Linux Systems PerformanceLISA2019 Linux Systems Performance
LISA2019 Linux Systems Performance
 
LPC2019 BPF Tracing Tools
LPC2019 BPF Tracing ToolsLPC2019 BPF Tracing Tools
LPC2019 BPF Tracing Tools
 
YOW2018 CTO Summit: Working at netflix
YOW2018 CTO Summit: Working at netflixYOW2018 CTO Summit: Working at netflix
YOW2018 CTO Summit: Working at netflix
 
eBPF Perf Tools 2019
eBPF Perf Tools 2019eBPF Perf Tools 2019
eBPF Perf Tools 2019
 
YOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixYOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at Netflix
 
BPF Tools 2017
BPF Tools 2017BPF Tools 2017
BPF Tools 2017
 
NetConf 2018 BPF Observability
NetConf 2018 BPF ObservabilityNetConf 2018 BPF Observability
NetConf 2018 BPF Observability
 
FlameScope 2018
FlameScope 2018FlameScope 2018
FlameScope 2018
 
ATO Linux Performance 2018
ATO Linux Performance 2018ATO Linux Performance 2018
ATO Linux Performance 2018
 
Linux Performance 2018 (PerconaLive keynote)
Linux Performance 2018 (PerconaLive keynote)Linux Performance 2018 (PerconaLive keynote)
Linux Performance 2018 (PerconaLive keynote)
 
How Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for PerformanceHow Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for Performance
 
LISA17 Container Performance Analysis
LISA17 Container Performance AnalysisLISA17 Container Performance Analysis
LISA17 Container Performance Analysis
 
Kernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at NetflixKernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at Netflix
 
Kernel Recipes 2017: Performance Analysis with BPF
Kernel Recipes 2017: Performance Analysis with BPFKernel Recipes 2017: Performance Analysis with BPF
Kernel Recipes 2017: Performance Analysis with BPF
 

Recently uploaded

AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfJamie (Taka) Wang
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxGDSC PJATK
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7DianaGray10
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPathCommunity
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfAijun Zhang
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXTarek Kalaji
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfDaniel Santiago Silva Capera
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsSafe Software
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Brian Pichman
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDELiveplex
 

Recently uploaded (20)

AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptx
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBX
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
 

From DTrace to Linux

  • 1. TRACING SUMMIT EUROPE From DTrace To Linux: Oct, 2014 What can Linux learn from DTrace? Brendan Gregg Senior Performance Architect bgregg@ne7lix.com @brendangregg
  • 2. Brendan Gregg • DTrace contribu?ons include: – Primary author of the DTrace book – DTraceToolkit – dtrace-­‐cloud-­‐tools – DTrace network providers • I now work on Linux at Ne7lix – using: Jrace, perf_events, SystemTap, ktap, eBPF, … – created: perf-­‐tools, msr-­‐cloud-­‐tools • Opinions in this talk are my own
  • 3. Agenda 1. DTrace – What is DTrace, really? – Who is DTrace for, really? – Why doesn’t Linux have DTrace? – What worked well? – What didn’t work well? 2. Linux Tracers – Jrace, perf_events, eBPF, … Topics include adop?on, marke?ng, technical challenges, and our usage at Ne7lix.
  • 4. What is DTrace, really?
  • 5. Technology + Marke?ng (Like many other company products)
  • 6. Prior Technology Kerninst: kernel dynamic tracing, Solaris 2.5.1, 1999
  • 7. Prior Technology Early dynamic tracers weren’t safe
  • 8. Prior Technology • Also: – Sun’s TNF – DProbes: user + kernel dynamic tracing – Linux Trace Toolkit (LTT) – Others, including offline binary instrumenta?on • DProbes and LTT were combined in Nov 2000, but not integrated into the Linux kernel1 • Sun set forth to produce a produc?on-­‐safe tool 1 h^p://lkml.iu.edu/hypermail/linux/kernel/0011.3/0183.html
  • 10. Technology • DTrace: – Safe for produc?on use • You might step on your foot (overhead), but you won’t shoot it off – Dynamic tracing, sta?c tracing, and profiling – User-­‐ and kernel-­‐level, unified – Programma?c: filters and summaries – Solved countless issues in dev and prod • That’s what DTrace is for me – An awesome technology, oJen needed to root cause kernel & app issues • But for most people….
  • 11. A Typical Conversa?on… “Does Linux have DTrace yet?” “That’s a pity” “DTrace is awesome!” “I’m not sure” “No.” “No.” “Why?” “Why, specifically?” “Have you used it?”
  • 14. Early Marke?ng • DTrace had awesome marke?ng – People s?ll want it but don’t really know why • Early marke?ng: tradi?onal, $$$ – Great marke?ng product managers • 10 Moves Ahead campaign: airports, sta?ons, etc. – Sun sales staff pitched DTrace directly – Sun technology evangelists • Benefits – Not another Sun tech no one knew about – Compelled people to learn more, try it out
  • 15. Marke?ng Evolved • Sun marke?ng become innova?ve – Engineering blogs, BigAdmin – Marke?ng staff who used and understood DTrace • Who could be^er ar?culate its value – Marke?ng more directly from the engineers
  • 17. Later Marke?ng • Many ini?a?ves by Deirdré Straughan: – Social media, blogs, events, the ponycorn mascot, ... – Video and share everything: all meetups, talks • Blogs: – including h^p://dtrace.org; my own > 1M views • Books: – my own > 30k sold • Videos: – me shou?ng while DTracing disks, ~1M views • Language support exposed new communi?es to DTrace
  • 20. ???
  • 21. Who is DTrace for, really?
  • 22. DTrace end-­‐users: Current Es?mated DTrace guide users: ~100 Script end-­‐users: ~5,000 Note: 91.247% of sta?s?cs are made up
  • 23. DTrace end-­‐users: Current • DTrace guide users: ~100 – Understand the ~400 page Dynamic Tracing Guide – Develop their own scripts from scratch – Understand overhead intui?vely • Script end-­‐users: ~5000 – DTraceToolkit, Google – Run scripts. Some tweaks/customiza?ons.
  • 24. DTrace end-­‐users: Future eg, Oracle ZFS Storage Appliance Analy?cs
  • 25. DTrace end-­‐users: Future Possible Future DTrace guide users: ~100 Script end-­‐users: ~5,000 GUI end-­‐users: >50,000
  • 27. Company Usage • Prac?cal usage for most companies: – A) A performance team (or person) • Acquires useful scripts • Develops custom scripts – B) The rest of the company asks (A) for script/help • They need to know what’s possible, to know to ask – Or, you buy/develop a GUI that everyone can use • There are some excep?ons
  • 28. Why doesn’t Linux have DTrace?
  • 29. Why doesn’t Linux have a DTrace equivalent? 4 Answers…
  • 30. 1. It does (sort of) Jrace perf_events
  • 31. 1. It does (sort of) • Linux has changed – In 2005, numerous Linux issues were difficult or impossible to solve. Linux needed a DTrace equivalent. – By 2014, many of these are now solvable, especially using Jrace, perf_events, kprobes, uprobes: all part of the Linux kernel
  • 32. 2. Technical semantic error: missing x86_64 kernel/module debuginfo
  • 33. 2. Technical • Linux is a more difficult environment – Solaris always has symbols, via CTF, which DTrace uses for dynamic tracing – Linux doesn’t always have symbols/debuginfo
  • 34. 3. Linux isn’t a Company “All the wood behind one arrow” – Sco^ McNealy, CEO, Sun Microsystems
  • 35. 3. Linux isn’t a Company • Linus can refuse patches, but can’t stop projects – The tracing wood is split between many arrows • Jrace, perf_events, eBPF, SystemTap, ktap, LTTng, … – And we are a small community: there’s not much wood to go around!
  • 36. 4. No Trace Race
  • 37. 4. No Trace Race • Post 2001, Solaris was losing ground to Linux. Sun desperately needed differen?ators to survive – Three top Sun engineers spent years on DTrace – Sun marke?ng gave it their best shot… • This circumstance will never exist again – For Linux today, it would be like having Linus, Ingo, and Steven do tracing full-­‐?me for three years, followed by a major marke?ng campaign • There may never be another trace race. Unless…
  • 38. Why doesn’t Linux have DTrace itself? 2 Answers…
  • 39. 1. The CDDL From: Claire Giordano <claire.giordano@sun.com> To: license-discuss@opensource.org [Open Source Ini?a?ve] Subject: For Approval: Common Development and Distribution License (CDDL) Date: Wed, 01 Dec 2004 19:47:39 -0800 […] Like the MPL, the CDDL is not expected to be compatible with the GPL, since it contains requirements that are not in the GPL (for example, the "patent peace" provision in section 6). Thus, it is likely that files released under the CDDL will not be able to be combined with files released under the GPL to create a larger program. […] CDDL Team, Sun Microsystems Source: h^p://lwn.net/Ar?cles/114840/
  • 40. 1. The CDDL • Linux tradi?onally includes the tracer/profiler in the (GPL) kernel, but the DTrace license is CDDL – Out-­‐of-­‐tree projects have maintenance difficul?es – Oracle (who own the DTrace copyrights) could relicense it as GPL, but haven’t, and may never do this • Note that ZFS on Linux is doing well, despite being CDDL, and out of tree
  • 42. 2. DTrace ports • There are two ports, but both currently incomplete • A) h^ps://github.com/dtrace4linux/linux: – Mostly one UK developer, Paul Fox, as a hobby since 2008 (when he isn’t developing on the Rasberry Pi) • B) Oracle Linux DTrace: – Open source kernel, closed source user-­‐level ($) • We pay for monitoring tools; why not this too? – Experienced engineers, test suite focused – Had been good progress, but no updates for months
  • 43. What with DTrace worked well? 5 Key items…
  • 45. 1. Produc?on Safety • DTrace architecture – Restricted probe context: no kernel facility calls, restricted instruc?ons, no backwards branches, restricted loads/stores – Heartbeat: aborted due to systemic unresponsiveness • DTrace Test Suite – Hundreds of tests • Linux is learning this: – Oracle Linux DTrace is taking the test suite seriously – Jracetest
  • 46. 2. All the wood behind one arrow DTrace
  • 47. 2. All the wood behind one arrow • Can Linux learn this? – Can we vote some off the Linux tracing island? • At least, no new tracers in 2015, please!
  • 48. 3. In-­‐Kernel Aggrega?ons value ------------- Distribution ------------- count! 4096 | 0! 8192 |@@@@@@@@@@@@@@@@@@@@@@@@@@ 1085! 16384 |@@@@@@@@@@@ 443! 32768 |@@ 98! 65536 | 5! 131072 | 1! 262144 | 1! 524288 | 0! 1048576 | 11! 2097152 | 0!
  • 49. 3. In-­‐Kernel Aggrega?ons • Changed how inves?ga?ons are conducted – rapid, live analysis • Low overhead: – per-­‐CPU storage, asynchronous kernel-­‐>user transfers • Key uses: – summary sta?s?cs: count, avg, min, max – histograms: quan?ze, lquan?ze – by-­‐key: execname, kernel stacks, user stacks • Linux can learn: – Need aggrega?ons (eBPF maps, SystemTap, ktap?)
  • 50. 4. Many Example Scripts
  • 51. 4. Many Example Scripts
  • 52. 4. Many Example Scripts • Scripts serve many needs: – tools: ready to run – examples: learn by-­‐example – marke?ng: each is a use case • DTrace book scripts – 150+ short examples • DTraceToolkit – 230 more scripts – all DTraceToolkit scripts have man pages, example files, and are tested – An essen?al factor in DTrace’s adop?on
  • 53. 4. Many Example Scripts • Linux can learn: – Many users will just run scripts, not write them – People want good short examples – If they aren’t tested, they don’t work • It’s easy to generate metrics that kind-­‐of work; it’s hard to make them reliable for different workloads. – Maintenance of dynamic tracing scripts is painful • The instrumented code can change • Need more sta?c tracepoints
  • 55. 5. Marke?ng • DTrace was effec?vely marketed in many ways – Tradi?onal, social, blogs, scripts, ponycorn, … • Linux has virtually no marke?ng for its tracers – Jrace is great, if you ever discover it; etc. – Marke?ng spend is on commercial products instead • Linux can learn to market what it has – Tracers may also benefit from “a great name and a cute logo”1 – “eBPF” is not catchy, and doesn’t convey meaning 1 h^p://thenewstack.io/why-­‐did-­‐docker-­‐catch-­‐on-­‐quickly-­‐and-­‐why-­‐is-­‐it-­‐so-­‐interes?ng/
  • 56. Cute Tracing Logos dtrace4linux perf_events ktap LTTng SystemTap Jrace Ponies by Deirdré Straughan, using: h^p://generalzoi.deviantart.com pony creator
  • 57. Other Things • Programmable/scriptable • Built-­‐in stability seman?cs
  • 58. What with DTrace didn’t work well?
  • 59. What with DTrace didn’t work well? 5 Key Issues…
  • 61. 1. Adop?on • Few customers ever wrote DTrace scripts – DTrace should have been used more than it was – Sun’s “killer” tool just wasn’t – Be^er pickup rate with developers, not sysadmins • Many customers just ran my scripts – Not ideal, but be^er than nothing – This wasn’t what many at Sun dreamed • Internal adop?on was slow, limited – Sun could have done much more, but didn’t • The problem was knowing what to do with it – The syntax was the easy part
  • 62. 1. Adop?on • Linux can learn: – Adop?on is about more than just the technology • Documenta?on, marke?ng, training, community – Teaching what it does is more important than how • Everyone needs to know when to ask for it, not necessarily how to use it – Needs an adop?on curve (not a step func?on) • Tools, one-­‐liners, short scripts, …
  • 63. 2. Training This is to cer?fy that Brendan Gregg Has Completed the Sun Educa?onal course DTrace is a Solaris differen3ator
  • 64. 2. Training • Early training was not very effec?ve – Sun began including the DTraceToolkit in courses, with be^er success • It gradually improved – The last courses I developed and taught (aJer Sun) used simulated problems for the students to solve on their own with DTrace • Linux can learn: – Lab-­‐based training is most effec?ve. Online tutorials?
  • 66. 3. GUIs • Dozens of performance monitoring products, but almost no meaningful DTrace support • A couple of excep?ons: – Oracle ZFS Storage Appliance Analy?cs • Formally the Sun Storage 7000 Analy?cs • Should be generalized. Oracle Solaris 11.3? – Joyent Cloud Analy?cs
  • 67. 3. GUIs • Linux can learn: – Real adop?on possible through scripts & GUIs – Use the GUI to add value to the data • Heat maps: latency, u?liza?on, offset • Flame graphs • Time series thread visualiza?ons (Trace Compass) • ie, not just line graphs! – Commercial GUI products have marke?ng budget • Applica?on perf monitoring was $2.4B in 20131 1 h^ps://www.gartner.com/doc/2752217/market-­‐share-­‐analysis-­‐applica?on-­‐performance
  • 68. 3. GUIs • Heat maps are an example must-­‐have use case for trace data
  • 69. 4. Overheads While the DTrace technology is awesome, it does have some minor technical challenges as well
  • 70. 4. Overheads • While op?mized, for many targets the DTrace CPU overheads can s?ll be too high – Scheduler tracing, memory alloca?on tracing – User-­‐level dynamic tracing (fast trap) – VM probes (eg, Java disables some probes by default) – 10 GbE Network I/O, etc… • In some cases it doesn’t ma^er – Despera?on: system already mel?ng down – Troubleshoo?ng in dev: speed not a concern • Linux can learn: – Speed can ma^er, faster makes more possible
  • 72. 5. Syscall Provider • Solaris DTrace instrumented the trap table, and called it the syscall provider – Which is actually an unstable interface – Breaks between Solaris versions • And really broke in Oracle Solaris 11 – Other weird caveats • Linux can learn: – syscalls are the #1 target for users learning system tracers. The API should be easy and stable.
  • 73. Other Issues • The lack of: – Bounded loops (like SystemTap) – Kernel instruc?on tracing (like perf_events) – Easy PMC interface (like perf stat) – Aggrega?on key/value access (stap, ktap, eBPF) – Kernel source (issue for Oracle Solaris only) • 4+ second startup ?mes – Several Linux tracers start instantly
  • 74. From DTrace to Linux Tracers (2014)
  • 75. • Massive AWS EC2 Linux cloud, with FreeBSD appliances for content delivery • Performance is cri?cal: >50M subscribers • Just launched in Europe!
  • 76. System Tracing at Ne7lix • Present: – Jrace can serve many needs – perf_events some more, esp. with debuginfo – SystemTap as needed, esp. for Java – ad hoc other tools • Future: – Jrace/perf_events/ktap with eBPF, for a fully featured and mainline tracer? – One of the other tracers going mainline? • Summarizing 4 tracers…
  • 78. 1. Jrace • Tracing and profiling: /sys/kernel/debug/tracing – added by Steven Rostedt and others since 2.6.27, and already enabled on our servers (3.2+) • Experiences: – very useful capabili?es: tracing, coun?ng – surprising features: graphing (latencies), filters • Front-­‐end tools to ease use – h^ps://github.com/brendangregg/perf-­‐tools – WARNING: these are unsupported hacks – There’s also the trace-­‐cmd front-­‐end by Steven • 4 examples…
  • 79. perf-­‐tools: iosnoop • Block I/O (disk) events with latency: # ./iosnoop –ts! Tracing block I/O. Ctrl-C to end.! STARTs ENDs COMM PID TYPE DEV BLOCK BYTES LATms! 5982800.302061 5982800.302679 supervise 1809 W 202,1 17039600 4096 0.62! 5982800.302423 5982800.302842 supervise 1809 W 202,1 17039608 4096 0.42! 5982800.304962 5982800.305446 supervise 1801 W 202,1 17039616 4096 0.48! 5982800.305250 5982800.305676 supervise 1801 W 202,1 17039624 4096 0.43! […]! # ./iosnoop –h! USAGE: iosnoop [-hQst] [-d device] [-i iotype] [-p PID] [-n name] [duration]! -d device # device string (eg, "202,1)! -i iotype # match type (eg, '*R*' for all reads)! -n name # process name to match on I/O issue! -p PID # PID to match on I/O issue! -Q # include queueing time in LATms! -s # include start time of I/O (s)! -t # include completion time of I/O (s)! -h # this usage message! duration # duration seconds, and use buffers! […]!
  • 80. perf-­‐tools: iolatency • Block I/O (disk) latency distribu?ons: # ./iolatency ! Tracing block I/O. Output every 1 seconds. Ctrl-C to end.! ! >=(ms) .. <(ms) : I/O |Distribution |! 0 -> 1 : 1144 |######################################|! 1 -> 2 : 267 |######### |! 2 -> 4 : 10 |# |! 4 -> 8 : 5 |# |! 8 -> 16 : 248 |######### |! 16 -> 32 : 601 |#################### |! 32 -> 64 : 117 |#### |! • User-­‐level processing some?mes can’t keep up – Over 50k IOPS. Could buffer more workaround, but would prefer in-­‐kernel aggrega?ons […]!
  • 81. perf-­‐tools: opensnoop • Trace open() syscalls showing filenames: # ./opensnoop -t! Tracing open()s. Ctrl-C to end.! TIMEs COMM PID FD FILE! 4345768.332626 postgres 23886 0x8 /proc/self/oom_adj! 4345768.333923 postgres 23886 0x5 global/pg_filenode.map! 4345768.333971 postgres 23886 0x5 global/pg_internal.init! 4345768.334813 postgres 23886 0x5 base/16384/PG_VERSION! 4345768.334877 postgres 23886 0x5 base/16384/pg_filenode.map! 4345768.334891 postgres 23886 0x5 base/16384/pg_internal.init! 4345768.335821 postgres 23886 0x5 base/16384/11725! 4345768.347911 svstat 24649 0x4 supervise/ok! 4345768.347921 svstat 24649 0x4 supervise/status! 4345768.350340 stat 24651 0x3 /etc/ld.so.cache! 4345768.350372 stat 24651 0x3 /lib/x86_64-linux-gnu/libselinux…! 4345768.350460 stat 24651 0x3 /lib/x86_64-linux-gnu/libc.so.6! 4345768.350526 stat 24651 0x3 /lib/x86_64-linux-gnu/libdl.so.2! 4345768.350981 stat 24651 0x3 /proc/filesystems! 4345768.351182 stat 24651 0x3 /etc/nsswitch.conf! […]!
  • 82. perf-­‐tools: kprobe • Just wrapping capabili?es eases use. Eg, kprobes: # ./kprobe 'p:open do_sys_open filename=+0(%si):string' 'filename ~ "*stat"'! Tracing kprobe myopen. Ctrl-C to end.! +0x0/0x220) filename="pg_stat_tmp/pgstat.stat"! +0x0/0x220) filename="pg_stat_tmp/pgstat.stat"! +0x0/0x220) filename="pg_stat_tmp/pgstat.stat”! ^C! Ending tracing...! • By postgres-1172 [000] d... 6594028.787166: open: (do_sys_open postgres-1172 [001] d... 6594028.797410: open: (do_sys_open postgres-1172 [001] d... 6594028.797467: open: (do_sys_open some defini?on of “ease”. Would like easier symbol usage, instead of +0(%si).
  • 83. 1. Jrace • Sugges?ons: – I’m blogging and so can you! – Func?on profiler: • Can these in-­‐kernel counts be used for other vars? Eg, associa?ve array or histogram of %dx – Func?on grapher: • Can the ?ming be exposed by some vars? Picture histogram of latency – Mul?-­‐user access possible?
  • 85. 2. perf_events • In-­‐kernel, tools/perf, mul?-­‐tool, “perf” command • Experiences: – Stable, powerful, reliable – The sub op?ons can feel inconsistent (perf bench?) – Amazing with kernel debuginfo, when we have it – We use it for CPU stack profiles all the ?me • And turn them into flame graphs, which have solved numerous issues so far…
  • 86. perf CPU Flame Graph Broken Java stacks (missing frame pointer) Kernel TCP/IP GC Idle Time thread Locks epoll
  • 87. 2. perf_events • Sugges?ons: – Support for func?on argument symbols without a full debuginfo – Rework scrip?ng framework (eg, try por?ng iosnoop) • eg, “perf record” may need a tunable ?meout to trigger data writes, for efficient interac?ve scripts – Break up the mul?-­‐tool a bit (separate perf bench) – eBPF integra?on for custom aggrega?ons?
  • 89. 3. SystemTap • The most powerful of the tracers • Used for the deepest custom tracing – Especially Java hotspot probes • Experiences: – Undergoing a reset. Switching to the latest SystemTap version, and a newer kernel. So far, so good. – Trying out nd_syscall for debuginfo-­‐less tracing • Sugges?ons: – More non-­‐debuginfo tapset func?onality
  • 91. 4. eBPF • Extended BPF: programs on tracepoints – High performance filtering: JIT – In-­‐kernel summaries: maps • eg, in-­‐kernel latency heat map (showing bimodal): Low latency cache hits High latency device I/O Time
  • 92. 4. eBPF • Experiences: – Can have lower CPU overhead than DTrace – Very powerful: really custom maps – Assembly version very hard to use; C is be^er, but s?ll not easy • Sugges?ons: – Integrate: custom in-­‐kernel aggrega?ons is the missing piece
  • 93. Other Tracers • Experiences and sugges?ons: – ktap – LTTng – Oracle Linux DTrace – dtrace4linux – sysdig
  • 94. The Tracing Landscape, Oct 2014 Scope & Capability (less brutal) Ease of use sysdig perf Jrace eBPF ktap stap Stage of Development (my opinion) dtrace4L. (brutal) (alpha) (mature)
  • 95. Summary • DTrace is an awesome technology – Which has also had awesome marke?ng • Tradi?onal, social, sales, blogs, … – Most people won’t use it directly, and that’s ok • Drive usage via GUIs and scripts • Linux Tracers are catching up, and may surpass – It’s not 2005 anymore • Now we have Jrace, perf_events, kprobes, uprobes, … – Speed and aggrega?ons ma^er • If DTrace is Ki^y Hawk, eBPF is a jet engine
  • 96. Acks • dtrace.conf X-­‐ray pony art by substack • h^p://www.raspberrypi.org/ rasberry PI image • h^p://en.wikipedia.org/wiki/Crash_test_dummy photo by Brady Holt • h^ps://findery.com/johnfox/notes/all-­‐the-­‐wood-­‐behind-­‐one-­‐arrow • h^p://en.wikipedia.org/wiki/Early_flying_machines hang glider image • h^p://www.beginningwithi.com/2010/09/12/how-­‐the-­‐dtrace-­‐book-­‐got-­‐ done/ • h^p://www.cafepress.com/joyentsmartos.724465338 • h^p://generalzoi.deviantart.com/art/Pony-­‐Creator-­‐v3-­‐397808116 • Tux by Larry Ewing; Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries. • Thanks Dominic Kay and Deirdré Straughan for feedback
  • 97. Links • h^ps://www.usenix.org/legacy/event/usenix04/tech/general/full_papers/ cantrill/cantrill.pdf • Jrace & perf-­‐tools • h^ps://github.com/brendangregg/perf-­‐tools • h^p://lwn.net/Ar?cles/608497/ • eBPF: h^p://lwn.net/Ar?cles/603983/ • ktap: h^p://www.ktap.org/ • SystemTap: h^ps://sourceware.org/systemtap/ • sysdig: h^p://www.sysdig.org/ • h^p://lwn.net/Ar?cles/114840/ CDDL • h^p://dtrace.org/blogs/ahl/2011/10/05/dtrace-­‐for-­‐linux-­‐2/ • Jp://Jp.cs.wisc.edu/paradyn/papers/Tamches99Using.pdf • h^p://www.brendangregg.com/heatmaps.html • h^p://lkml.iu.edu/hypermail/linux/kernel/0011.3/0183.html LTT + DProbes
  • 98. Thanks • Ques?ons? • h^p://slideshare.net/brendangregg • h^p://www.brendangregg.com • bgregg@ne7lix.com • @brendangregg