SlideShare a Scribd company logo
1 of 51
Download to read offline
Metrics are money
Aurélien “beorn” Rougemont
Well. before that.
whoami(1)
40 years old nerd
Been pushing buttons on a C64 since i was 9
opensource software user since 1996 ( slackware 3.1 )
Hacked kernel code for the first time in 1999 (ISDN modem)
Wrote a few patches for linux/opensolaris/FreeBSD kernels over the past 19
years
Contributed a few patches for various observability projects
On-call for the last 19 years
Woken up for stupid things for 19 years…
Been happily working for synthesio.com for 2.5 years
job(1)
talk(1)
<Friend> "wow congratulations on making it to
the KR conferences"
<Me> "Thanks !"
<Friend> "i was looking at the KR speakers list.
I saw the usual legends. And you. Good luck with
that. Sincerely"
This is not meant to be a public shaming session
Names and bugs were voluntarily removed
Explaining these bugs/patches to most of you would be…
incongruous
You probably wrote or validated the bug… and the fix
motd(5)
Operations
alarm(2)
500 HTTP error
non zero shell return code
Segfault
Kernel panic
OOM
CRC errors
Network problems
No data
No graph
[...]
sleep(1)
HTTP 200 error
Failed shell script returning 0
Segfault hidden by a process supervisor
Silent data corruption
Unknown states
Pattern change
No timeout on probe
[...]
Peter Drucker famously said :
“what gets measured gets managed.”
stat(1)
dash(1)
keepalived(8)
Just a sense of scale
prometheus(1) con 2018
Fastly
114 prometheus servers
28.4M timeseries
2.2M samples/s
Cloudflare
267 prometheus servers
Uber
400-600M datapoints/s pre-aggregation
20M stored datapoints per second
6.6B unique metric IDs
9k grafana dash
30B datapoints
free(1)
So ops guys brains working memory are saturated, among other things, by metrics
What if… Even the most basic metrics weren’t what you thought they’d be ?
What if… The same metric did not mean the same from a server to another ?
What if… We were all wrong most part of the time ?
Now real life stories
Server usage...
top(1)
I have played a game with other mid to senior ops guys : 2 out of 10 were
almost correct.
“Load averages are an industry-critical metric – my company spends millions
auto-scaling cloud instances based on them and other metrics – but on Linux
there's some mystery around them.” Brendan Gregg (2017)
After all it only took around 12 screens to Brendan Gregg to explain linux
load average history.
Oh and good news, linux computes load differently than other kernel/OS
Network packets...
irssi(1) /query foo
<foo> is hired, replaces a dying home-made linux-based switch with a very
common one
<foo> adds metrics to this brand new switch and figures out something is wrong
Switch and server are absolutely not giving the same results : at least 50%
drop on all network tx/rx metrics during the usual benchmarks
<foo> examines the dashboard configuration : there’s also a max() function but
that was just an aggravating factor not the root cause
<foo> beorn you know collectd-fu right ?
vim(1) ~/collectd core/plugin code
Collectd code was pretty straight forward
Collectd reads data from /proc/net/dev
No voodoo magic here
history(3)
The server and the linux-based-old-switch were running the exact same old
linux kernel version
And could not be simply upgraded because of proprietary drivers of specific
components
proc(5) /proc/net/dev
When this story happens there was almost no documentation for /proc/net/dev
Gladly there was this old email that gave some useful hints.
mail(1)
> How can I find out the /proc/net info
>
> eg: softnet_stat is for what purpose
Much of this is only well-documented in the code. Here's an attempt
at interpreting softnet_stat [no guarantee that it is correct; read the code!]:
% softnet_stat.sh
cpu total dropped squeezed collision
0 1794619684 0 346 0
1 36399632 0 74 2
% softnet_stat.sh -h
usage: softnet_stat.sh [ -h ]
[...]
mutt(1)
Output column definitions:
cpu # of the cpu
total # of packets (not including netpoll) received by the interrupt handler
There might be some double counting going on:
net/core/dev.c:1643: __get_cpu_var(netdev_rx_stat).total++;
net/core/dev.c:1836: __get_cpu_var(netdev_rx_stat).total++;
I think the intention was that these were originally on separate
receive paths ...
dropped # of packets that were dropped because netdev_max_backlog was exceeded
squeezed # of times ksoftirq ran out of netdev_budget or time slice with work
remaining
collision # of times that two cpus collided trying to get the device queue lock.
man(1) kernel/drivers
git-log(1)
# git log --pretty=oneline --abbrev-commit |grep igb| grep stats
55c05dd0295d igbvf: Use net_device_stats from struct net_device
e66c083aab32 igb: fix stats for i210 rx_fifo_errors
3dbdf96928dc igb: Fix stats output on i210/i211 parts.
0a915b95d67f igb: Add stats output for OS2BMC feature on i350 devices
12dcd86b75d5 igb: fix stats handling
43915c7c9a99 igb: only read phy specific stats if in internal phy mode
128e45eb61b9 igb: Rework how netdev->stats is handled
645a3abd73c2 igb: Remove invalid stats counters
3f9c01648146 igb: only process global stats in igb_update_stats
04a5fcaaf0e1 igb: move alloc_failed and csum_err stats into per rx-ring stat
231835e4163c igb: Fix erroneous display of stats by ethtool -S
8d24e93309d6 igb: Use the instance of net_device_stats from net_device.
cc9073bbc901 igb: remove unused temp variable from stats clearing path
3ea73afafb8c igb: Record host memory receive overflow in net_stats
04fe63583d46 igb: update stats before doing reset in igb_down
e21ed3538f19 igb: update ethtool stats to support multiqueue
git-commit(1)
/*
* ENOBUFS = no kernel mem, SOCK_NOSPACE = no sndbuf space. Reporting
* ENOBUFS might not be good (it's not tunable per se), but otherwise
* we don't have a good statistic (IpOutDiscards but it can be too many
* things). We could add another new stat but at least for now that
* seems like overkill.
*/
ethtool(8)
Over the years the Linux networking stack had hoarded:
● Tens of (cool) features ( GRO, GSO, RPS, RFS, …)
● Tens of drivers
● Tons of code-paths
● Multiple thousand sysctl entries
● A lot of bugs
Manufacturer’s tech document was 1200 pages long, and is probably bigger today
Documentation was not what it is today
sha512sum(1)
To sum it up
● notice the problem
● Fix the graph configuration (bad aggr)
● start reading the userland stats collecting (collectd here)
● realize and make sure the bug was not there
● read your kernel/driver code
● realize the bug is really in the code path that /proc/net/dev hits
● read the 1200 pages tech specs from the manufacturer
● find a few related patches
● rebuild the kernel/driver only 4 times (we were lucky)
● And just reboot production servers for weeks
All that just to read valid tx.rx packets counters !
bc(1)
<foo> Last year based on these metrics they doubled network
capacity for more than 2.8M euros
Resellers and
Manufacturers...
We had many servers of a validated type with 10Gbps ixgbe nics
According to capacity planning we order a 240 servers batch
Linux TCP/IP network statistics are bad : tcp retransmits, latencies, …
The new switch metrics were green
Linux did not have signal related statistics for fiber NICs
Another brand/model of SFP+ worked just fine
netstat(1)
mutt(1) reseller
<reseller> everything is fine
<me> if only we’ve had those SFP+ DOM registers in kernel/ethtool…
<CTO> you have 10 full days to prove them wrong
<me> Erm it’s the network stack we’re talking about and i’m no real kernel
dev
<CTO> That’s why you have 10 full days to prove them wrong
Read everything i could find about optical signal, SFP+ and DOM statistics
Found a microrouter project named bifrost doing just this with a 2.6 kernel
(2012)
Their Patches were never pushed upstream
We needed it to run on 3.4 kernels for features and hardware compatibility
Emailed the guys about a 3.4 patch: no luck
Let’s port this to 3.4
links(1)
The network API had major changes between 2.6 and 3.4 on this particular part.
Ended up rewriting the patchset (kernel + ethtool) entirely in 5 days
Patch worked in production for a 3-4 years without a glitch
vim(1) patchset.diff
# ethtool -D eth0
Int-Calbr: Avr RX-Power: Alarm & Warn: TX_DISABLE: TX_FAULT: RX_LOS:
RATE_SELECT MON: RATE_SELECT: Wavelength: 850 nm
Alarms, warnings in beginning of line, Ie. AH = Alarm High, WL == Warn
Low etc
Temp: 45.7 C Thresh: Lo: -45.0/-40.0 Hi: 115.0/125.0 C
Vcc: 3.32 V Thresh: Lo: 2.7/2.9 Hi: 3.7/3.9 V
Tx-Bias: 6.6 mA Thresh: Lo: 1.0/2.0 Hi: 12.0/15.0 mA
TX-pwr: -2.9 dBm ( 0.52 mW) Thresh: Lo: -10.0/-8.3 Hi: 0.8/2.0 dBm
RX-pwr: -1.9 dBm ( 0.64 mW) Thresh: Lo: -16.0/-14.2 Hi: 1.8/3.0 dBm
Proud and happy i wrote an email to someone “doing things in the kernel”
<kernelguy> “$#!$#!$$@%$#%#!$%#%@$”
<me> “So what should i fix ?”
<End Of Discussion>
git-format-patch(1)
After adding the ethtool output and the patchset into the reseller’s case he
agreed to change the incompatible SFP+ after only 5 days of hard work
480 brand new SFP+ arrived. We changed the faulty SFP+ for weeks.
And that was it
Roughly 200K euros were saved with these metrics
hledger(1)
Disks...
iozone(1)
In a hosting company we built ZFS based SAN/NAS
<coworker> last batch of servers have serious storage
performances issues under load
<SRE> alright let’s dig
sha256sum(1)
# iostat -E c0t5000C5004124B687d0
sd31 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE Product: ST2000NM0001 Revision: PS04 Serial No: Z1P1HECD
Size: 2000.40GB <2000398934016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
# iostat -E c11t50014EE3000E9080d0
sd22 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: WD Product: WD2000FYYG Revision: D1B3 Serial No: WMAWP0192044
Size: 2000.40GB <2000398934016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 4 Predictive Failure Analysis: 0
Disk had the same labels, same tech specs, but not exactly the
same physical look
alpine(1)
<me> Sir it is not the same disk brand/model
<reseller> we do not guarantee anything else that tech specs
<me> [...] Please do something !
orion(1)
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sde 0.00 0.00 1.00 246.00 0.00 123.00 1019.89 124.77 127.80 4.05 100.10
sdc 0.00 0.00 1.00 104.00 0.00 52.00 1014.32 120.05 896.32 9.52 100.00
After extensive profiling we are able to reproduce the
problematic workload
Which happens to be a very important workload for ZFS
sup(1)
<manufacturer> Our test suite shows that the disks you have sent are fine
<me> except they are not. see the iostat output
[nothing for 1 week]
smartctl(1)
/dev/sde: SEAGATE ST2000NM0001 PS04
Direct access device specific parameters: WP=0 DPOFUA=1
Read write error recovery [rw] mode page:
AWRE 1 [cha: y, def: 1, sav: 1] Automatic write reallocation enabled
ARRE 1 [cha: y, def: 1, sav: 1] Automatic read reallocation enabled
TB 0 [cha: y, def: 0, sav: 0] Transfer block
RC 0 [cha: y, def: 0, sav: 0] Read continuous
EER 0 [cha: y, def: 0, sav: 0] Enable early recovery
PER 0 [cha: y, def: 0, sav: 0] Post error
DTE 0 [cha: y, def: 0, sav: 0] Data terminate on error
DCR 0 [cha: y, def: 0, sav: 0] Disable correction
RRC 20 [cha: y, def: 20, sav: 20] Read retry count
COR_S 255 [cha: n, def:255, sav:255] Correction span (obsolete)
HOC 0 [cha: n, def: 0, sav: 0] Head offset count (obsolete)
DSOC 0 [cha: n, def: 0, sav: 0] Data strobe offset count (obsolete)
TPERE 0 [cha: n, def: 0, sav: 0] Thin provisioning error reporting enabled
WRC 5 [cha: y, def: 5, sav: 5] Write retry count
RTL 8000 [cha: y, def:8000, sav:8000] Recovery time limit (ms)
Disconnect-reconnect (SPC + transports) [dr] mode page:
BFR 0 [cha: n, def: 0, sav: 0] Buffer full ratio
BER 0 [cha: n, def: 0, sav: 0] Buffer empty ratio
BIL 0 [cha: n, def: 0, sav: 0] Bus inactivity limit
DTL 0 [cha: n, def: 0, sav: 0] Disconnect time limit
CTL 0 [cha: n, def: 0, sav: 0] Connect time limit
MBS 314 [cha: y, def:314, sav:314] Maximum burst size (512 bytes)
EMDP 0 [cha: n, def: 0, sav: 0] Enable modify data pointers
FA 0 [cha: n, def: 0, sav: 0] Fair arbitration
DIMM 0 [cha: n, def: 0, sav: 0] Disconnect immediate
DTDC 0 [cha: n, def: 0, sav: 0] Data transfer disconnect control
FBS 0 [cha: n, def: 0, sav: 0] First burst size (512 bytes)
Format (SBC) [fo] mode page:
TPZ 48080 [cha: n, def:48080, sav:48080] Tracks per zone
ASPZ 0 [cha: n, def: 0, sav: 0] Alternate sectors per zone
ATPZ 0 [cha: n, def: 0, sav: 0] Alternate tracks per zone
ATPLU 896 [cha: n, def:896, sav:896] Alternate tracks per logical unit
SPT 1220 [cha: n, def:1220, sav:1220] Sectors per track
DBPPS 512 [cha: n, def:512, sav:512] Data bytes per physical sector
INTLV 1 [cha: n, def: 1, sav: 1] Interleave
TSF 156 [cha: n, def:156, sav:156] Track skew factor
CSF 38 [cha: n, def: 38, sav: 38] Cylinder skew factor
SSEC 0 [cha: n, def: 0, sav: 0] Soft sector
HSEC 1 [cha: n, def: 1, sav: 1] Hard sector
RMB 0 [cha: n, def: 0, sav: 0] Removable
SURF 0 [cha: n, def: 0, sav: 0] Surface
Rigid disk (SBC) [rd] mode page:
NOC 249000 [cha: n, def:249000, sav:249000] Number of cylinders
NOH 8 [cha: n, def: 8, sav: 8] Number of heads
SCWP 0 [cha: n, def: 0, sav: 0] Starting cylinder for write
precompensation
SCRWC 0 [cha: n, def: 0, sav: 0] Starting cylinder for reduced write
current
DSR 0 [cha: n, def: 0, sav: 0] Device step rate
LZC 0 [cha: n, def: 0, sav: 0] Landing zone cylinder
RPL 0 [cha: n, def: 0, sav: 0] Rotational position locking
ROTO 0 [cha: n, def: 0, sav: 0] Rotational offset
MRR 7200 [cha: n, def:7200, sav:7200] Medium rotation rate (rpm)
Verify error recovery (SBC) [ve] mode page:
V_EER 0 [cha: y, def: 0, sav: 0] Enable early recovery
V_PER 0 [cha: y, def: 0, sav: 0] Post error
V_DTE 0 [cha: y, def: 0, sav: 0] Data terminate on error
V_DCR 0 [cha: y, def: 0, sav: 0] Disable correction
V_RC 20 [cha: y, def: 20, sav: 20] Verify retry count TSF 156 [cha:
n, def:156, sav:156] Track skew factor
V_COR_S 255 [cha: n, def:255, sav:255] Verify correction span (obsolete)
V_RTL 8000 [cha: y, def:8000, sav:8000] Verify recovery time limit (ms)
Caching (SBC) [ca] mode page:
IC 0 [cha: y, def: 0, sav: 0] Initiator control
ABPF 0 [cha: n, def: 0, sav: 0] Abort pre-fetch
CAP 0 [cha: y, def: 0, sav: 0] Caching analysis permitted
DISC 1 [cha: n, def: 1, sav: 1] Discontinuity
SIZE 0 [cha: n, def: 0, sav: 0] Size enable
WCE 0 [cha: y, def: 0, sav: 0] Write cache enable
MF 0 [cha: n, def: 0, sav: 0] Multiplication factor
RCD 0 [cha: y, def: 0, sav: 0] Read cache disable
DRRP 0 [cha: n, def: 0, sav: 0] Demand read retention priority
WRP 0 [cha: n, def: 0, sav: 0] Write retention priority
DPTL -1 [cha: n, def: -1, sav: -1] Disable pre-fetch transfer length
MIPF 0 [cha: y, def: 0, sav: 0] Minimum pre-fetch
MAPF -1 [cha: y, def: -1, sav: -1] Maximum pre-fetch
MAPFC -1 [cha: n, def: -1, sav: -1] Maximum pre-fetch ceiling
FSW 1 [cha: n, def: 1, sav: 1] Force sequential write
LBCSS 0 [cha: n, def: 0, sav: 0] Logical block cache segment size
DRA 0 [cha: y, def: 0, sav: 0] Disable read ahead
NV_DIS 0 [cha: n, def: 0, sav: 0] Non-volatile cache disable
NCS 32 [cha: n, def: 32, sav: 32] Number of cache segments
CSS 0 [cha: n, def: 0, sav: 0] Cache segment size
Control [co] mode page:
TST 0 [cha: n, def: 0, sav: 0] Task set type
TMF_ONLY 0 [cha: n, def: 0, sav: 0] Task management functions only
D_SENSE 0 [cha: y, def: 0, sav: 0] Descriptor format sense data
GLTSD 0 [cha: y, def: 0, sav: 0] Global logging target save disable
RLEC 0 [cha: y, def: 0, sav: 0] Report log exception condition
QAM 0 [cha: y, def: 0, sav: 0] Queue algorithm modifier
QERR 0 [cha: y, def: 0, sav: 0] Queue error management
RAC 0 [cha: n, def: 0, sav: 0] Report a check
UA_INTLCK 0 [cha: n, def: 0, sav: 0] Unit attention interlocks control
SWP 0 [cha: n, def: 0, sav: 0] Software write protect
ATO 0 [cha: n, def: 0, sav: 0] Application tag owner
TAS 0 [cha: n, def: 0, sav: 0] Task aborted status
AUTOLOAD 0 [cha: n, def: 0, sav: 0] Autoload mode
BTP 0 [cha: n, def: 0, sav: 0] Busy timeout period (100us)
ESTCT 18500 [cha: n, def:18500, sav:18500] Extended self test completion time
(sec)
Protocol specific logical unit [pl] mode page:
LUPID 6 [cha: n, def: 6, sav: 6] Logical unit's (transport) protocol
identifier
Protocol specific port [pp] mode page:
PPID 6 [cha: n, def: 6, sav: 6] Port's (transport) protocol identifier
Power condition [po] mode page:
STANDBY_Y 0 [cha: n, def: 0, sav: 0] Standby_y timer enabled
IDLE_C 0 [cha: n, def: 0, sav: 0] Idle_c timer enabled
IDLE_B 0 [cha: n, def: 0, sav: 0] Idle_b timer active
IDLE 1 [cha: y, def: 1, sav: 1] Idle timer enabled
STANDBY 0 [cha: y, def: 0, sav: 0] Standby timer active
ICT 5 [cha: y, def: 5, sav: 5] Idle condition timer (100 ms)
SCT 36000 [cha: n, def:36000, sav:36000] Standby condition timer (100 ms)
Informational exceptions control [ie] mode page:
PERF 0 [cha: y, def: 0, sav: 0] Performance (impact of ie operations)
EBF 0 [cha: n, def: 0, sav: 0] Enable background function
EWASC 0 [cha: y, def: 0, sav: 0] Enable warning
DEXCPT 0 [cha: y, def: 0, sav: 0] Disable exceptions
TEST 0 [cha: y, def: 0, sav: 0] Test (simulate device failure)
EBACKERR 0 [cha: n, def: 0, sav: 0] Enable background (scan + self test)
error reporting
LOGERR 1 [cha: y, def: 1, sav: 1] Log informational exception errors
MRIE 6 [cha: y, def: 6, sav: 6] Method of reporting informational
exceptions
INTT 6000 [cha: y, def:6000, sav:6000] Interval timer (100 ms)
REPC 0 [cha: n, def: 0, sav: 0] Report count (or Test flag number
[SSC-3])
Background control (SBC) [bc] mode page:
S_L_FULL 0 [cha: n, def: 0, sav: 0] Suspend on log full
LOWIR 0 [cha: n, def: 0, sav: 0] Log only when intervention required
EN_BMS 1 [cha: y, def: 1, sav: 1] Enable background medium scan
EN_PS 0 [cha: n, def: 0, sav: 0] Enable pre-scan
BMS_I 336 [cha: y, def:336, sav:336] Background medium scan interval time
(hour)
BPS_TL 24 [cha: y, def: 24, sav: 24] Background pre-scan time limit (hour)
MIN_IDLE 250 [cha: y, def:250, sav:250] Minumum idle time before background scan
(ms)
MAX_SUSP 0 [cha: y, def: 0, sav: 0] Maximum time to suspend background scan
(ms)
:(){ :|:& };:
At the same time they provided us a “fix” firmware
Performances were even better than with the good disk.
Binary diff showed a 1 bit change.
Yes a boolean.
Their firmware was silently enabling write cache (without
battery)
Which is to say the least dangerous
dc(1)
--- st2000nm00001_sdparm.txt 2013-02-28 21:12:15.287433646 +0100
+++ wd2000fyyg_sdparm.txt 2013-02-28 21:12:28.823131392 +0100
@@ -1,102 +1,97 @@
- /dev/sde: SEAGATE ST2000NM0001 PS04
+ /dev/sdc: WD WD2000FYYG D1BB
Direct access device specific parameters: WP=0 DPOFUA=1
Read write error recovery [rw] mode page:
- RC 0 [cha: y, def: 0, sav: 0] Read continuous
+ RC 0 [cha: n, def: 0, sav: 0] Read continuous
- RRC 20 [cha: y, def: 20, sav: 20] Read retry count
- COR_S 255 [cha: n, def:255, sav:255] Correction span (obsolete)
+ RRC 255 [cha: y, def:255, sav:255] Read retry count
+ COR_S 0 [cha: n, def: 0, sav: 0] Correction span (obsolete)
- WRC 5 [cha: y, def: 5, sav: 5] Write retry count
+ WRC 255 [cha: y, def:255, sav:255] Write retry count
Verify error recovery (SBC) [ve] mode page:
- V_RC 20 [cha: y, def: 20, sav: 20] Verify retry count
- V_COR_S 255 [cha: n, def:255, sav:255] Verify correction span (obsolete)
+ V_RC 255 [cha: y, def:255, sav:255] Verify retry count
+ V_COR_S 0 [cha: n, def: 0, sav: 0] Verify correction span (obsolete)
Caching (SBC) [ca] mode page:
- FSW 1 [cha: n, def: 1, sav: 1] Force sequential write
+ FSW 0 [cha: y, def: 0, sav: 0] Force sequential write
Informational exceptions control [ie] mode page:
- PERF 0 [cha: y, def: 0, sav: 0] Performance (impact of ie operations)
- EBF 0 [cha: n, def: 0, sav: 0] Enable background function
+ PERF 0 [cha: n, def: 0, sav: 0] Performance (impact of ie operations)
+ EBF 1 [cha: y, def: 1, sav: 1] Enable background function
- EBACKERR 0 [cha: n, def: 0, sav: 0] Enable background (scan + self test)
error reporting
- LOGERR 1 [cha: y, def: 1, sav: 1] Log informational exception errors
+ EBACKERR 0 [cha: y, def: 0, sav: 0] Enable background (scan + self test)
error reporting
+ LOGERR 1 [cha: n, def: 1, sav: 1] Log informational exception errors
- REPC 0 [cha: n, def: 0, sav: 0] Report count (or Test flag number
[SSC-3])
+ REPC 0 [cha: y, def: 0, sav: 0] Report count (or Test flag number
[SSC-3])
Background control (SBC) [bc] mode page:
- S_L_FULL 0 [cha: n, def: 0, sav: 0] Suspend on log full
- LOWIR 0 [cha: n, def: 0, sav: 0] Log only when intervention required
+ S_L_FULL 0 [cha: y, def: 0, sav: 0] Suspend on log full
+ LOWIR 0 [cha: y, def: 0, sav: 0] Log only when intervention required
- EN_PS 0 [cha: n, def: 0, sav: 0] Enable pre-scan
+ EN_PS 0 [cha: y, def: 0, sav: 0] Enable pre-scan
- BPS_TL 24 [cha: y, def: 24, sav: 24] Background pre-scan time limit (hour)
+ BPS_TL 0 [cha: y, def: 0, sav: 0] Background pre-scan time limit (hour)
We proved the disks did not
have the same behavior/specs
using this diff
The reseller changed all the
250 disks 150K euros
We spent the next months
changing and resilvering
arrays already in production
TLDR;
sha256sum(1)
Metrics are everywhere in operations at an unprecedented scale and still
growing fast
The vast majority of I.T. professionals do not understand fully what they are
currently graphing
Graphs are meant to trigger a deeper questioning when the behavior changes
To make a costly decision based on metrics without taking the time to ensure
what is exactly this metric is pure folly
Acquiring this knowledge is necessary and time consuming and requires humility
task(1) add project:young_me [...]
Kernel code ain’t no saint writing
Macros make things easy if you are not a C guru
Read git history per sub-system it helps a lot
Ask upstream if they are interested in what you plan to write
Propose a (probably stupid) way of doing the change before doing code
Then code and get things upstreamed
Kernel Recipes 2019 - Metrics are money

More Related Content

What's hot

Kernel Recipes 2019 - ftrace: Where modifying a running kernel all started
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all startedKernel Recipes 2019 - ftrace: Where modifying a running kernel all started
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all startedAnne Nicolas
 
Troubleshooting Tips from a Docker Support Engineer
Troubleshooting Tips from a Docker Support EngineerTroubleshooting Tips from a Docker Support Engineer
Troubleshooting Tips from a Docker Support EngineerJeff Anderson
 
Enable DPDK and SR-IOV for containerized virtual network functions with zun
Enable DPDK and SR-IOV for containerized virtual network functions with zunEnable DPDK and SR-IOV for containerized virtual network functions with zun
Enable DPDK and SR-IOV for containerized virtual network functions with zunheut2008
 
Kernel Recipes 2014 - NDIV: a low overhead network traffic diverter
Kernel Recipes 2014 - NDIV: a low overhead network traffic diverterKernel Recipes 2014 - NDIV: a low overhead network traffic diverter
Kernel Recipes 2014 - NDIV: a low overhead network traffic diverterAnne Nicolas
 
DPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabDPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabMichelle Holley
 
VLANs in the Linux Kernel
VLANs in the Linux KernelVLANs in the Linux Kernel
VLANs in the Linux KernelKernel TLV
 
Building a Virtualized Continuum with Intel(r) Clear Containers
Building a Virtualized Continuum with Intel(r) Clear ContainersBuilding a Virtualized Continuum with Intel(r) Clear Containers
Building a Virtualized Continuum with Intel(r) Clear ContainersMichelle Holley
 
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce RichardsonThe 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardsonharryvanhaaren
 
Kernel maintainance in Linux distributions: Debian
Kernel maintainance in Linux distributions: DebianKernel maintainance in Linux distributions: Debian
Kernel maintainance in Linux distributions: DebianAnne Nicolas
 
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernelKernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernelAnne Nicolas
 
Intel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsIntel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsHisaki Ohara
 
How to Speak Intel DPDK KNI for Web Services.
How to Speak Intel DPDK KNI for Web Services.How to Speak Intel DPDK KNI for Web Services.
How to Speak Intel DPDK KNI for Web Services.Naoto MATSUMOTO
 
netfilter and iptables
netfilter and iptablesnetfilter and iptables
netfilter and iptablesKernel TLV
 
Kernel Recipes 2019 - Suricata and XDP
Kernel Recipes 2019 - Suricata and XDPKernel Recipes 2019 - Suricata and XDP
Kernel Recipes 2019 - Suricata and XDPAnne Nicolas
 
debugging openstack neutron /w openvswitch
debugging openstack neutron /w openvswitchdebugging openstack neutron /w openvswitch
debugging openstack neutron /w openvswitch어형 이
 
Kernel Recipes 2016 - Landlock LSM: Unprivileged sandboxing
Kernel Recipes 2016 - Landlock LSM: Unprivileged sandboxingKernel Recipes 2016 - Landlock LSM: Unprivileged sandboxing
Kernel Recipes 2016 - Landlock LSM: Unprivileged sandboxingAnne Nicolas
 
Linux kernel-rootkit-dev - Wonokaerun
Linux kernel-rootkit-dev - WonokaerunLinux kernel-rootkit-dev - Wonokaerun
Linux kernel-rootkit-dev - Wonokaerunidsecconf
 

What's hot (20)

Kernel Recipes 2019 - ftrace: Where modifying a running kernel all started
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all startedKernel Recipes 2019 - ftrace: Where modifying a running kernel all started
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all started
 
Troubleshooting Tips from a Docker Support Engineer
Troubleshooting Tips from a Docker Support EngineerTroubleshooting Tips from a Docker Support Engineer
Troubleshooting Tips from a Docker Support Engineer
 
Enable DPDK and SR-IOV for containerized virtual network functions with zun
Enable DPDK and SR-IOV for containerized virtual network functions with zunEnable DPDK and SR-IOV for containerized virtual network functions with zun
Enable DPDK and SR-IOV for containerized virtual network functions with zun
 
Kernel Recipes 2014 - NDIV: a low overhead network traffic diverter
Kernel Recipes 2014 - NDIV: a low overhead network traffic diverterKernel Recipes 2014 - NDIV: a low overhead network traffic diverter
Kernel Recipes 2014 - NDIV: a low overhead network traffic diverter
 
Linux Network Stack
Linux Network StackLinux Network Stack
Linux Network Stack
 
DPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabDPDK in Containers Hands-on Lab
DPDK in Containers Hands-on Lab
 
VLANs in the Linux Kernel
VLANs in the Linux KernelVLANs in the Linux Kernel
VLANs in the Linux Kernel
 
DPDK In Depth
DPDK In DepthDPDK In Depth
DPDK In Depth
 
Building a Virtualized Continuum with Intel(r) Clear Containers
Building a Virtualized Continuum with Intel(r) Clear ContainersBuilding a Virtualized Continuum with Intel(r) Clear Containers
Building a Virtualized Continuum with Intel(r) Clear Containers
 
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce RichardsonThe 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
 
Kernel maintainance in Linux distributions: Debian
Kernel maintainance in Linux distributions: DebianKernel maintainance in Linux distributions: Debian
Kernel maintainance in Linux distributions: Debian
 
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernelKernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
 
Intel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsIntel DPDK Step by Step instructions
Intel DPDK Step by Step instructions
 
How to Speak Intel DPDK KNI for Web Services.
How to Speak Intel DPDK KNI for Web Services.How to Speak Intel DPDK KNI for Web Services.
How to Speak Intel DPDK KNI for Web Services.
 
netfilter and iptables
netfilter and iptablesnetfilter and iptables
netfilter and iptables
 
Kernel Recipes 2019 - Suricata and XDP
Kernel Recipes 2019 - Suricata and XDPKernel Recipes 2019 - Suricata and XDP
Kernel Recipes 2019 - Suricata and XDP
 
debugging openstack neutron /w openvswitch
debugging openstack neutron /w openvswitchdebugging openstack neutron /w openvswitch
debugging openstack neutron /w openvswitch
 
Staging driver sins
Staging driver sinsStaging driver sins
Staging driver sins
 
Kernel Recipes 2016 - Landlock LSM: Unprivileged sandboxing
Kernel Recipes 2016 - Landlock LSM: Unprivileged sandboxingKernel Recipes 2016 - Landlock LSM: Unprivileged sandboxing
Kernel Recipes 2016 - Landlock LSM: Unprivileged sandboxing
 
Linux kernel-rootkit-dev - Wonokaerun
Linux kernel-rootkit-dev - WonokaerunLinux kernel-rootkit-dev - Wonokaerun
Linux kernel-rootkit-dev - Wonokaerun
 

Similar to Kernel Recipes 2019 - Metrics are money

Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Andriy Berestovskyy
 
All of Your Network Monitoring is (probably) Wrong
All of Your Network Monitoring is (probably) WrongAll of Your Network Monitoring is (probably) Wrong
All of Your Network Monitoring is (probably) Wrongice799
 
BKK16-211 Internet of Tiny Linux (io tl)- Status and Progress
BKK16-211 Internet of Tiny Linux (io tl)- Status and ProgressBKK16-211 Internet of Tiny Linux (io tl)- Status and Progress
BKK16-211 Internet of Tiny Linux (io tl)- Status and ProgressLinaro
 
Bugs from Outer Space | while42 SF #6
Bugs from Outer Space | while42 SF #6Bugs from Outer Space | while42 SF #6
Bugs from Outer Space | while42 SF #6While42
 
Significance
SignificanceSignificance
SignificanceJulie May
 
Linux or unix interview questions
Linux or unix interview questionsLinux or unix interview questions
Linux or unix interview questionsTeja Bheemanapally
 
Interview questions
Interview questionsInterview questions
Interview questionsxavier john
 
Infrastructure as code might be literally impossible part 2
Infrastructure as code might be literally impossible part 2Infrastructure as code might be literally impossible part 2
Infrastructure as code might be literally impossible part 2ice799
 
Analyzing OS X Systems Performance with the USE Method
Analyzing OS X Systems Performance with the USE MethodAnalyzing OS X Systems Performance with the USE Method
Analyzing OS X Systems Performance with the USE MethodBrendan Gregg
 
The internet of $h1t
The internet of $h1tThe internet of $h1t
The internet of $h1tAmit Serper
 
sshuttle VPN (2011-04)
sshuttle VPN (2011-04)sshuttle VPN (2011-04)
sshuttle VPN (2011-04)apenwarr
 
Time Sensitive Networking in the Linux Kernel
Time Sensitive Networking in the Linux KernelTime Sensitive Networking in the Linux Kernel
Time Sensitive Networking in the Linux Kernelhenrikau
 
Auditing the Opensource Kernels
Auditing the Opensource KernelsAuditing the Opensource Kernels
Auditing the Opensource KernelsSilvio Cesare
 
Multiple Cores, Multiple Pipes, Multiple Threads – Do we have more Parallelis...
Multiple Cores, Multiple Pipes, Multiple Threads – Do we have more Parallelis...Multiple Cores, Multiple Pipes, Multiple Threads – Do we have more Parallelis...
Multiple Cores, Multiple Pipes, Multiple Threads – Do we have more Parallelis...Slide_N
 
HKG15-902: Upstreaming 201
HKG15-902: Upstreaming 201HKG15-902: Upstreaming 201
HKG15-902: Upstreaming 201Linaro
 
Learning Erlang (from a Prolog dropout's perspective)
Learning Erlang (from a Prolog dropout's perspective)Learning Erlang (from a Prolog dropout's perspective)
Learning Erlang (from a Prolog dropout's perspective)elliando dias
 
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)Johan Andersson
 

Similar to Kernel Recipes 2019 - Metrics are money (20)

Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)
 
New204
New204New204
New204
 
All of Your Network Monitoring is (probably) Wrong
All of Your Network Monitoring is (probably) WrongAll of Your Network Monitoring is (probably) Wrong
All of Your Network Monitoring is (probably) Wrong
 
BKK16-211 Internet of Tiny Linux (io tl)- Status and Progress
BKK16-211 Internet of Tiny Linux (io tl)- Status and ProgressBKK16-211 Internet of Tiny Linux (io tl)- Status and Progress
BKK16-211 Internet of Tiny Linux (io tl)- Status and Progress
 
Bugs from Outer Space | while42 SF #6
Bugs from Outer Space | while42 SF #6Bugs from Outer Space | while42 SF #6
Bugs from Outer Space | while42 SF #6
 
Significance
SignificanceSignificance
Significance
 
Linux or unix interview questions
Linux or unix interview questionsLinux or unix interview questions
Linux or unix interview questions
 
Linux router
Linux routerLinux router
Linux router
 
Interview questions
Interview questionsInterview questions
Interview questions
 
Infrastructure as code might be literally impossible part 2
Infrastructure as code might be literally impossible part 2Infrastructure as code might be literally impossible part 2
Infrastructure as code might be literally impossible part 2
 
Analyzing OS X Systems Performance with the USE Method
Analyzing OS X Systems Performance with the USE MethodAnalyzing OS X Systems Performance with the USE Method
Analyzing OS X Systems Performance with the USE Method
 
The internet of $h1t
The internet of $h1tThe internet of $h1t
The internet of $h1t
 
sshuttle VPN (2011-04)
sshuttle VPN (2011-04)sshuttle VPN (2011-04)
sshuttle VPN (2011-04)
 
Operating System Assignment Help
Operating System Assignment HelpOperating System Assignment Help
Operating System Assignment Help
 
Time Sensitive Networking in the Linux Kernel
Time Sensitive Networking in the Linux KernelTime Sensitive Networking in the Linux Kernel
Time Sensitive Networking in the Linux Kernel
 
Auditing the Opensource Kernels
Auditing the Opensource KernelsAuditing the Opensource Kernels
Auditing the Opensource Kernels
 
Multiple Cores, Multiple Pipes, Multiple Threads – Do we have more Parallelis...
Multiple Cores, Multiple Pipes, Multiple Threads – Do we have more Parallelis...Multiple Cores, Multiple Pipes, Multiple Threads – Do we have more Parallelis...
Multiple Cores, Multiple Pipes, Multiple Threads – Do we have more Parallelis...
 
HKG15-902: Upstreaming 201
HKG15-902: Upstreaming 201HKG15-902: Upstreaming 201
HKG15-902: Upstreaming 201
 
Learning Erlang (from a Prolog dropout's perspective)
Learning Erlang (from a Prolog dropout's perspective)Learning Erlang (from a Prolog dropout's perspective)
Learning Erlang (from a Prolog dropout's perspective)
 
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
 

More from Anne Nicolas

Kernel Recipes 2019 - Driving the industry toward upstream first
Kernel Recipes 2019 - Driving the industry toward upstream firstKernel Recipes 2019 - Driving the industry toward upstream first
Kernel Recipes 2019 - Driving the industry toward upstream firstAnne Nicolas
 
Kernel Recipes 2019 - No NMI? No Problem! – Implementing Arm64 Pseudo-NMI
Kernel Recipes 2019 - No NMI? No Problem! – Implementing Arm64 Pseudo-NMIKernel Recipes 2019 - No NMI? No Problem! – Implementing Arm64 Pseudo-NMI
Kernel Recipes 2019 - No NMI? No Problem! – Implementing Arm64 Pseudo-NMIAnne Nicolas
 
Embedded Recipes 2019 - Knowing your ARM from your ARSE: wading through the t...
Embedded Recipes 2019 - Knowing your ARM from your ARSE: wading through the t...Embedded Recipes 2019 - Knowing your ARM from your ARSE: wading through the t...
Embedded Recipes 2019 - Knowing your ARM from your ARSE: wading through the t...Anne Nicolas
 
Kernel Recipes 2019 - GNU poke, an extensible editor for structured binary data
Kernel Recipes 2019 - GNU poke, an extensible editor for structured binary dataKernel Recipes 2019 - GNU poke, an extensible editor for structured binary data
Kernel Recipes 2019 - GNU poke, an extensible editor for structured binary dataAnne Nicolas
 
Embedded Recipes 2019 - Remote update adventures with RAUC, Yocto and Barebox
Embedded Recipes 2019 - Remote update adventures with RAUC, Yocto and BareboxEmbedded Recipes 2019 - Remote update adventures with RAUC, Yocto and Barebox
Embedded Recipes 2019 - Remote update adventures with RAUC, Yocto and BareboxAnne Nicolas
 
Embedded Recipes 2019 - Making embedded graphics less special
Embedded Recipes 2019 - Making embedded graphics less specialEmbedded Recipes 2019 - Making embedded graphics less special
Embedded Recipes 2019 - Making embedded graphics less specialAnne Nicolas
 
Embedded Recipes 2019 - Linux on Open Source Hardware and Libre Silicon
Embedded Recipes 2019 - Linux on Open Source Hardware and Libre SiliconEmbedded Recipes 2019 - Linux on Open Source Hardware and Libre Silicon
Embedded Recipes 2019 - Linux on Open Source Hardware and Libre SiliconAnne Nicolas
 
Embedded Recipes 2019 - From maintaining I2C to the big (embedded) picture
Embedded Recipes 2019 - From maintaining I2C to the big (embedded) pictureEmbedded Recipes 2019 - From maintaining I2C to the big (embedded) picture
Embedded Recipes 2019 - From maintaining I2C to the big (embedded) pictureAnne Nicolas
 
Embedded Recipes 2019 - Herd your socs become a matchmaker
Embedded Recipes 2019 - Herd your socs become a matchmakerEmbedded Recipes 2019 - Herd your socs become a matchmaker
Embedded Recipes 2019 - Herd your socs become a matchmakerAnne Nicolas
 
Embedded Recipes 2019 - LLVM / Clang integration
Embedded Recipes 2019 - LLVM / Clang integrationEmbedded Recipes 2019 - LLVM / Clang integration
Embedded Recipes 2019 - LLVM / Clang integrationAnne Nicolas
 
Embedded Recipes 2019 - Introduction to JTAG debugging
Embedded Recipes 2019 - Introduction to JTAG debuggingEmbedded Recipes 2019 - Introduction to JTAG debugging
Embedded Recipes 2019 - Introduction to JTAG debuggingAnne Nicolas
 
Embedded Recipes 2019 - Pipewire a new foundation for embedded multimedia
Embedded Recipes 2019 - Pipewire a new foundation for embedded multimediaEmbedded Recipes 2019 - Pipewire a new foundation for embedded multimedia
Embedded Recipes 2019 - Pipewire a new foundation for embedded multimediaAnne Nicolas
 
Kernel Recipes 2019 - Kernel hacking behind closed doors
Kernel Recipes 2019 - Kernel hacking behind closed doorsKernel Recipes 2019 - Kernel hacking behind closed doors
Kernel Recipes 2019 - Kernel hacking behind closed doorsAnne Nicolas
 
Kernel Recipes 2019 - Faster IO through io_uring
Kernel Recipes 2019 - Faster IO through io_uringKernel Recipes 2019 - Faster IO through io_uring
Kernel Recipes 2019 - Faster IO through io_uringAnne Nicolas
 
Embedded Recipes 2019 - RT is about to make it to mainline. Now what?
Embedded Recipes 2019 - RT is about to make it to mainline. Now what?Embedded Recipes 2019 - RT is about to make it to mainline. Now what?
Embedded Recipes 2019 - RT is about to make it to mainline. Now what?Anne Nicolas
 
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernelKernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernelAnne Nicolas
 
Kernel Recipes 2019 - pidfds: Process file descriptors on Linux
Kernel Recipes 2019 - pidfds: Process file descriptors on LinuxKernel Recipes 2019 - pidfds: Process file descriptors on Linux
Kernel Recipes 2019 - pidfds: Process file descriptors on LinuxAnne Nicolas
 
Kernel Recipes 2019 - BPF at Facebook
Kernel Recipes 2019 - BPF at FacebookKernel Recipes 2019 - BPF at Facebook
Kernel Recipes 2019 - BPF at FacebookAnne Nicolas
 
Kernel Recipes 2019 - The ubiquity but also the necessity of eBPF as a techno...
Kernel Recipes 2019 - The ubiquity but also the necessity of eBPF as a techno...Kernel Recipes 2019 - The ubiquity but also the necessity of eBPF as a techno...
Kernel Recipes 2019 - The ubiquity but also the necessity of eBPF as a techno...Anne Nicolas
 
Kernel Recipes 2019 - What To Do When Your Device Depends on Another One
Kernel Recipes 2019 - What To Do When Your Device Depends on Another OneKernel Recipes 2019 - What To Do When Your Device Depends on Another One
Kernel Recipes 2019 - What To Do When Your Device Depends on Another OneAnne Nicolas
 

More from Anne Nicolas (20)

Kernel Recipes 2019 - Driving the industry toward upstream first
Kernel Recipes 2019 - Driving the industry toward upstream firstKernel Recipes 2019 - Driving the industry toward upstream first
Kernel Recipes 2019 - Driving the industry toward upstream first
 
Kernel Recipes 2019 - No NMI? No Problem! – Implementing Arm64 Pseudo-NMI
Kernel Recipes 2019 - No NMI? No Problem! – Implementing Arm64 Pseudo-NMIKernel Recipes 2019 - No NMI? No Problem! – Implementing Arm64 Pseudo-NMI
Kernel Recipes 2019 - No NMI? No Problem! – Implementing Arm64 Pseudo-NMI
 
Embedded Recipes 2019 - Knowing your ARM from your ARSE: wading through the t...
Embedded Recipes 2019 - Knowing your ARM from your ARSE: wading through the t...Embedded Recipes 2019 - Knowing your ARM from your ARSE: wading through the t...
Embedded Recipes 2019 - Knowing your ARM from your ARSE: wading through the t...
 
Kernel Recipes 2019 - GNU poke, an extensible editor for structured binary data
Kernel Recipes 2019 - GNU poke, an extensible editor for structured binary dataKernel Recipes 2019 - GNU poke, an extensible editor for structured binary data
Kernel Recipes 2019 - GNU poke, an extensible editor for structured binary data
 
Embedded Recipes 2019 - Remote update adventures with RAUC, Yocto and Barebox
Embedded Recipes 2019 - Remote update adventures with RAUC, Yocto and BareboxEmbedded Recipes 2019 - Remote update adventures with RAUC, Yocto and Barebox
Embedded Recipes 2019 - Remote update adventures with RAUC, Yocto and Barebox
 
Embedded Recipes 2019 - Making embedded graphics less special
Embedded Recipes 2019 - Making embedded graphics less specialEmbedded Recipes 2019 - Making embedded graphics less special
Embedded Recipes 2019 - Making embedded graphics less special
 
Embedded Recipes 2019 - Linux on Open Source Hardware and Libre Silicon
Embedded Recipes 2019 - Linux on Open Source Hardware and Libre SiliconEmbedded Recipes 2019 - Linux on Open Source Hardware and Libre Silicon
Embedded Recipes 2019 - Linux on Open Source Hardware and Libre Silicon
 
Embedded Recipes 2019 - From maintaining I2C to the big (embedded) picture
Embedded Recipes 2019 - From maintaining I2C to the big (embedded) pictureEmbedded Recipes 2019 - From maintaining I2C to the big (embedded) picture
Embedded Recipes 2019 - From maintaining I2C to the big (embedded) picture
 
Embedded Recipes 2019 - Herd your socs become a matchmaker
Embedded Recipes 2019 - Herd your socs become a matchmakerEmbedded Recipes 2019 - Herd your socs become a matchmaker
Embedded Recipes 2019 - Herd your socs become a matchmaker
 
Embedded Recipes 2019 - LLVM / Clang integration
Embedded Recipes 2019 - LLVM / Clang integrationEmbedded Recipes 2019 - LLVM / Clang integration
Embedded Recipes 2019 - LLVM / Clang integration
 
Embedded Recipes 2019 - Introduction to JTAG debugging
Embedded Recipes 2019 - Introduction to JTAG debuggingEmbedded Recipes 2019 - Introduction to JTAG debugging
Embedded Recipes 2019 - Introduction to JTAG debugging
 
Embedded Recipes 2019 - Pipewire a new foundation for embedded multimedia
Embedded Recipes 2019 - Pipewire a new foundation for embedded multimediaEmbedded Recipes 2019 - Pipewire a new foundation for embedded multimedia
Embedded Recipes 2019 - Pipewire a new foundation for embedded multimedia
 
Kernel Recipes 2019 - Kernel hacking behind closed doors
Kernel Recipes 2019 - Kernel hacking behind closed doorsKernel Recipes 2019 - Kernel hacking behind closed doors
Kernel Recipes 2019 - Kernel hacking behind closed doors
 
Kernel Recipes 2019 - Faster IO through io_uring
Kernel Recipes 2019 - Faster IO through io_uringKernel Recipes 2019 - Faster IO through io_uring
Kernel Recipes 2019 - Faster IO through io_uring
 
Embedded Recipes 2019 - RT is about to make it to mainline. Now what?
Embedded Recipes 2019 - RT is about to make it to mainline. Now what?Embedded Recipes 2019 - RT is about to make it to mainline. Now what?
Embedded Recipes 2019 - RT is about to make it to mainline. Now what?
 
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernelKernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
 
Kernel Recipes 2019 - pidfds: Process file descriptors on Linux
Kernel Recipes 2019 - pidfds: Process file descriptors on LinuxKernel Recipes 2019 - pidfds: Process file descriptors on Linux
Kernel Recipes 2019 - pidfds: Process file descriptors on Linux
 
Kernel Recipes 2019 - BPF at Facebook
Kernel Recipes 2019 - BPF at FacebookKernel Recipes 2019 - BPF at Facebook
Kernel Recipes 2019 - BPF at Facebook
 
Kernel Recipes 2019 - The ubiquity but also the necessity of eBPF as a techno...
Kernel Recipes 2019 - The ubiquity but also the necessity of eBPF as a techno...Kernel Recipes 2019 - The ubiquity but also the necessity of eBPF as a techno...
Kernel Recipes 2019 - The ubiquity but also the necessity of eBPF as a techno...
 
Kernel Recipes 2019 - What To Do When Your Device Depends on Another One
Kernel Recipes 2019 - What To Do When Your Device Depends on Another OneKernel Recipes 2019 - What To Do When Your Device Depends on Another One
Kernel Recipes 2019 - What To Do When Your Device Depends on Another One
 

Recently uploaded

Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 

Recently uploaded (20)

Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 

Kernel Recipes 2019 - Metrics are money

  • 1. Metrics are money Aurélien “beorn” Rougemont
  • 3. whoami(1) 40 years old nerd Been pushing buttons on a C64 since i was 9 opensource software user since 1996 ( slackware 3.1 ) Hacked kernel code for the first time in 1999 (ISDN modem) Wrote a few patches for linux/opensolaris/FreeBSD kernels over the past 19 years Contributed a few patches for various observability projects On-call for the last 19 years Woken up for stupid things for 19 years… Been happily working for synthesio.com for 2.5 years
  • 5. talk(1) <Friend> "wow congratulations on making it to the KR conferences" <Me> "Thanks !" <Friend> "i was looking at the KR speakers list. I saw the usual legends. And you. Good luck with that. Sincerely"
  • 6. This is not meant to be a public shaming session Names and bugs were voluntarily removed Explaining these bugs/patches to most of you would be… incongruous You probably wrote or validated the bug… and the fix motd(5)
  • 8. alarm(2) 500 HTTP error non zero shell return code Segfault Kernel panic OOM CRC errors Network problems No data No graph [...]
  • 9. sleep(1) HTTP 200 error Failed shell script returning 0 Segfault hidden by a process supervisor Silent data corruption Unknown states Pattern change No timeout on probe [...]
  • 10. Peter Drucker famously said : “what gets measured gets managed.” stat(1)
  • 13. Just a sense of scale
  • 14. prometheus(1) con 2018 Fastly 114 prometheus servers 28.4M timeseries 2.2M samples/s Cloudflare 267 prometheus servers Uber 400-600M datapoints/s pre-aggregation 20M stored datapoints per second 6.6B unique metric IDs 9k grafana dash 30B datapoints
  • 15. free(1) So ops guys brains working memory are saturated, among other things, by metrics What if… Even the most basic metrics weren’t what you thought they’d be ? What if… The same metric did not mean the same from a server to another ? What if… We were all wrong most part of the time ?
  • 16. Now real life stories
  • 18. top(1) I have played a game with other mid to senior ops guys : 2 out of 10 were almost correct. “Load averages are an industry-critical metric – my company spends millions auto-scaling cloud instances based on them and other metrics – but on Linux there's some mystery around them.” Brendan Gregg (2017) After all it only took around 12 screens to Brendan Gregg to explain linux load average history. Oh and good news, linux computes load differently than other kernel/OS
  • 20. irssi(1) /query foo <foo> is hired, replaces a dying home-made linux-based switch with a very common one <foo> adds metrics to this brand new switch and figures out something is wrong Switch and server are absolutely not giving the same results : at least 50% drop on all network tx/rx metrics during the usual benchmarks <foo> examines the dashboard configuration : there’s also a max() function but that was just an aggravating factor not the root cause <foo> beorn you know collectd-fu right ?
  • 21. vim(1) ~/collectd core/plugin code Collectd code was pretty straight forward Collectd reads data from /proc/net/dev No voodoo magic here
  • 22. history(3) The server and the linux-based-old-switch were running the exact same old linux kernel version And could not be simply upgraded because of proprietary drivers of specific components
  • 23. proc(5) /proc/net/dev When this story happens there was almost no documentation for /proc/net/dev Gladly there was this old email that gave some useful hints.
  • 24. mail(1) > How can I find out the /proc/net info > > eg: softnet_stat is for what purpose Much of this is only well-documented in the code. Here's an attempt at interpreting softnet_stat [no guarantee that it is correct; read the code!]: % softnet_stat.sh cpu total dropped squeezed collision 0 1794619684 0 346 0 1 36399632 0 74 2 % softnet_stat.sh -h usage: softnet_stat.sh [ -h ] [...]
  • 25. mutt(1) Output column definitions: cpu # of the cpu total # of packets (not including netpoll) received by the interrupt handler There might be some double counting going on: net/core/dev.c:1643: __get_cpu_var(netdev_rx_stat).total++; net/core/dev.c:1836: __get_cpu_var(netdev_rx_stat).total++; I think the intention was that these were originally on separate receive paths ... dropped # of packets that were dropped because netdev_max_backlog was exceeded squeezed # of times ksoftirq ran out of netdev_budget or time slice with work remaining collision # of times that two cpus collided trying to get the device queue lock.
  • 27. git-log(1) # git log --pretty=oneline --abbrev-commit |grep igb| grep stats 55c05dd0295d igbvf: Use net_device_stats from struct net_device e66c083aab32 igb: fix stats for i210 rx_fifo_errors 3dbdf96928dc igb: Fix stats output on i210/i211 parts. 0a915b95d67f igb: Add stats output for OS2BMC feature on i350 devices 12dcd86b75d5 igb: fix stats handling 43915c7c9a99 igb: only read phy specific stats if in internal phy mode 128e45eb61b9 igb: Rework how netdev->stats is handled 645a3abd73c2 igb: Remove invalid stats counters 3f9c01648146 igb: only process global stats in igb_update_stats 04a5fcaaf0e1 igb: move alloc_failed and csum_err stats into per rx-ring stat 231835e4163c igb: Fix erroneous display of stats by ethtool -S 8d24e93309d6 igb: Use the instance of net_device_stats from net_device. cc9073bbc901 igb: remove unused temp variable from stats clearing path 3ea73afafb8c igb: Record host memory receive overflow in net_stats 04fe63583d46 igb: update stats before doing reset in igb_down e21ed3538f19 igb: update ethtool stats to support multiqueue
  • 28. git-commit(1) /* * ENOBUFS = no kernel mem, SOCK_NOSPACE = no sndbuf space. Reporting * ENOBUFS might not be good (it's not tunable per se), but otherwise * we don't have a good statistic (IpOutDiscards but it can be too many * things). We could add another new stat but at least for now that * seems like overkill. */
  • 29. ethtool(8) Over the years the Linux networking stack had hoarded: ● Tens of (cool) features ( GRO, GSO, RPS, RFS, …) ● Tens of drivers ● Tons of code-paths ● Multiple thousand sysctl entries ● A lot of bugs Manufacturer’s tech document was 1200 pages long, and is probably bigger today Documentation was not what it is today
  • 30. sha512sum(1) To sum it up ● notice the problem ● Fix the graph configuration (bad aggr) ● start reading the userland stats collecting (collectd here) ● realize and make sure the bug was not there ● read your kernel/driver code ● realize the bug is really in the code path that /proc/net/dev hits ● read the 1200 pages tech specs from the manufacturer ● find a few related patches ● rebuild the kernel/driver only 4 times (we were lucky) ● And just reboot production servers for weeks All that just to read valid tx.rx packets counters !
  • 31. bc(1) <foo> Last year based on these metrics they doubled network capacity for more than 2.8M euros
  • 33. We had many servers of a validated type with 10Gbps ixgbe nics According to capacity planning we order a 240 servers batch Linux TCP/IP network statistics are bad : tcp retransmits, latencies, … The new switch metrics were green Linux did not have signal related statistics for fiber NICs Another brand/model of SFP+ worked just fine netstat(1)
  • 34. mutt(1) reseller <reseller> everything is fine <me> if only we’ve had those SFP+ DOM registers in kernel/ethtool… <CTO> you have 10 full days to prove them wrong <me> Erm it’s the network stack we’re talking about and i’m no real kernel dev <CTO> That’s why you have 10 full days to prove them wrong
  • 35. Read everything i could find about optical signal, SFP+ and DOM statistics Found a microrouter project named bifrost doing just this with a 2.6 kernel (2012) Their Patches were never pushed upstream We needed it to run on 3.4 kernels for features and hardware compatibility Emailed the guys about a 3.4 patch: no luck Let’s port this to 3.4 links(1)
  • 36. The network API had major changes between 2.6 and 3.4 on this particular part. Ended up rewriting the patchset (kernel + ethtool) entirely in 5 days Patch worked in production for a 3-4 years without a glitch vim(1) patchset.diff # ethtool -D eth0 Int-Calbr: Avr RX-Power: Alarm & Warn: TX_DISABLE: TX_FAULT: RX_LOS: RATE_SELECT MON: RATE_SELECT: Wavelength: 850 nm Alarms, warnings in beginning of line, Ie. AH = Alarm High, WL == Warn Low etc Temp: 45.7 C Thresh: Lo: -45.0/-40.0 Hi: 115.0/125.0 C Vcc: 3.32 V Thresh: Lo: 2.7/2.9 Hi: 3.7/3.9 V Tx-Bias: 6.6 mA Thresh: Lo: 1.0/2.0 Hi: 12.0/15.0 mA TX-pwr: -2.9 dBm ( 0.52 mW) Thresh: Lo: -10.0/-8.3 Hi: 0.8/2.0 dBm RX-pwr: -1.9 dBm ( 0.64 mW) Thresh: Lo: -16.0/-14.2 Hi: 1.8/3.0 dBm
  • 37. Proud and happy i wrote an email to someone “doing things in the kernel” <kernelguy> “$#!$#!$$@%$#%#!$%#%@$” <me> “So what should i fix ?” <End Of Discussion> git-format-patch(1)
  • 38. After adding the ethtool output and the patchset into the reseller’s case he agreed to change the incompatible SFP+ after only 5 days of hard work 480 brand new SFP+ arrived. We changed the faulty SFP+ for weeks. And that was it Roughly 200K euros were saved with these metrics hledger(1)
  • 40. iozone(1) In a hosting company we built ZFS based SAN/NAS <coworker> last batch of servers have serious storage performances issues under load <SRE> alright let’s dig
  • 41. sha256sum(1) # iostat -E c0t5000C5004124B687d0 sd31 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 Vendor: SEAGATE Product: ST2000NM0001 Revision: PS04 Serial No: Z1P1HECD Size: 2000.40GB <2000398934016 bytes> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 Illegal Request: 0 Predictive Failure Analysis: 0 # iostat -E c11t50014EE3000E9080d0 sd22 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 Vendor: WD Product: WD2000FYYG Revision: D1B3 Serial No: WMAWP0192044 Size: 2000.40GB <2000398934016 bytes> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 Illegal Request: 4 Predictive Failure Analysis: 0 Disk had the same labels, same tech specs, but not exactly the same physical look
  • 42. alpine(1) <me> Sir it is not the same disk brand/model <reseller> we do not guarantee anything else that tech specs <me> [...] Please do something !
  • 43. orion(1) Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util sde 0.00 0.00 1.00 246.00 0.00 123.00 1019.89 124.77 127.80 4.05 100.10 sdc 0.00 0.00 1.00 104.00 0.00 52.00 1014.32 120.05 896.32 9.52 100.00 After extensive profiling we are able to reproduce the problematic workload Which happens to be a very important workload for ZFS
  • 44. sup(1) <manufacturer> Our test suite shows that the disks you have sent are fine <me> except they are not. see the iostat output [nothing for 1 week]
  • 45. smartctl(1) /dev/sde: SEAGATE ST2000NM0001 PS04 Direct access device specific parameters: WP=0 DPOFUA=1 Read write error recovery [rw] mode page: AWRE 1 [cha: y, def: 1, sav: 1] Automatic write reallocation enabled ARRE 1 [cha: y, def: 1, sav: 1] Automatic read reallocation enabled TB 0 [cha: y, def: 0, sav: 0] Transfer block RC 0 [cha: y, def: 0, sav: 0] Read continuous EER 0 [cha: y, def: 0, sav: 0] Enable early recovery PER 0 [cha: y, def: 0, sav: 0] Post error DTE 0 [cha: y, def: 0, sav: 0] Data terminate on error DCR 0 [cha: y, def: 0, sav: 0] Disable correction RRC 20 [cha: y, def: 20, sav: 20] Read retry count COR_S 255 [cha: n, def:255, sav:255] Correction span (obsolete) HOC 0 [cha: n, def: 0, sav: 0] Head offset count (obsolete) DSOC 0 [cha: n, def: 0, sav: 0] Data strobe offset count (obsolete) TPERE 0 [cha: n, def: 0, sav: 0] Thin provisioning error reporting enabled WRC 5 [cha: y, def: 5, sav: 5] Write retry count RTL 8000 [cha: y, def:8000, sav:8000] Recovery time limit (ms) Disconnect-reconnect (SPC + transports) [dr] mode page: BFR 0 [cha: n, def: 0, sav: 0] Buffer full ratio BER 0 [cha: n, def: 0, sav: 0] Buffer empty ratio BIL 0 [cha: n, def: 0, sav: 0] Bus inactivity limit DTL 0 [cha: n, def: 0, sav: 0] Disconnect time limit CTL 0 [cha: n, def: 0, sav: 0] Connect time limit MBS 314 [cha: y, def:314, sav:314] Maximum burst size (512 bytes) EMDP 0 [cha: n, def: 0, sav: 0] Enable modify data pointers FA 0 [cha: n, def: 0, sav: 0] Fair arbitration DIMM 0 [cha: n, def: 0, sav: 0] Disconnect immediate DTDC 0 [cha: n, def: 0, sav: 0] Data transfer disconnect control FBS 0 [cha: n, def: 0, sav: 0] First burst size (512 bytes) Format (SBC) [fo] mode page: TPZ 48080 [cha: n, def:48080, sav:48080] Tracks per zone ASPZ 0 [cha: n, def: 0, sav: 0] Alternate sectors per zone ATPZ 0 [cha: n, def: 0, sav: 0] Alternate tracks per zone ATPLU 896 [cha: n, def:896, sav:896] Alternate tracks per logical unit SPT 1220 [cha: n, def:1220, sav:1220] Sectors per track DBPPS 512 [cha: n, def:512, sav:512] Data bytes per physical sector INTLV 1 [cha: n, def: 1, sav: 1] Interleave TSF 156 [cha: n, def:156, sav:156] Track skew factor CSF 38 [cha: n, def: 38, sav: 38] Cylinder skew factor SSEC 0 [cha: n, def: 0, sav: 0] Soft sector HSEC 1 [cha: n, def: 1, sav: 1] Hard sector RMB 0 [cha: n, def: 0, sav: 0] Removable SURF 0 [cha: n, def: 0, sav: 0] Surface Rigid disk (SBC) [rd] mode page: NOC 249000 [cha: n, def:249000, sav:249000] Number of cylinders NOH 8 [cha: n, def: 8, sav: 8] Number of heads SCWP 0 [cha: n, def: 0, sav: 0] Starting cylinder for write precompensation SCRWC 0 [cha: n, def: 0, sav: 0] Starting cylinder for reduced write current DSR 0 [cha: n, def: 0, sav: 0] Device step rate LZC 0 [cha: n, def: 0, sav: 0] Landing zone cylinder RPL 0 [cha: n, def: 0, sav: 0] Rotational position locking ROTO 0 [cha: n, def: 0, sav: 0] Rotational offset MRR 7200 [cha: n, def:7200, sav:7200] Medium rotation rate (rpm) Verify error recovery (SBC) [ve] mode page: V_EER 0 [cha: y, def: 0, sav: 0] Enable early recovery V_PER 0 [cha: y, def: 0, sav: 0] Post error V_DTE 0 [cha: y, def: 0, sav: 0] Data terminate on error V_DCR 0 [cha: y, def: 0, sav: 0] Disable correction V_RC 20 [cha: y, def: 20, sav: 20] Verify retry count TSF 156 [cha: n, def:156, sav:156] Track skew factor V_COR_S 255 [cha: n, def:255, sav:255] Verify correction span (obsolete) V_RTL 8000 [cha: y, def:8000, sav:8000] Verify recovery time limit (ms) Caching (SBC) [ca] mode page: IC 0 [cha: y, def: 0, sav: 0] Initiator control ABPF 0 [cha: n, def: 0, sav: 0] Abort pre-fetch CAP 0 [cha: y, def: 0, sav: 0] Caching analysis permitted DISC 1 [cha: n, def: 1, sav: 1] Discontinuity SIZE 0 [cha: n, def: 0, sav: 0] Size enable WCE 0 [cha: y, def: 0, sav: 0] Write cache enable MF 0 [cha: n, def: 0, sav: 0] Multiplication factor RCD 0 [cha: y, def: 0, sav: 0] Read cache disable DRRP 0 [cha: n, def: 0, sav: 0] Demand read retention priority WRP 0 [cha: n, def: 0, sav: 0] Write retention priority DPTL -1 [cha: n, def: -1, sav: -1] Disable pre-fetch transfer length MIPF 0 [cha: y, def: 0, sav: 0] Minimum pre-fetch MAPF -1 [cha: y, def: -1, sav: -1] Maximum pre-fetch MAPFC -1 [cha: n, def: -1, sav: -1] Maximum pre-fetch ceiling FSW 1 [cha: n, def: 1, sav: 1] Force sequential write LBCSS 0 [cha: n, def: 0, sav: 0] Logical block cache segment size DRA 0 [cha: y, def: 0, sav: 0] Disable read ahead NV_DIS 0 [cha: n, def: 0, sav: 0] Non-volatile cache disable NCS 32 [cha: n, def: 32, sav: 32] Number of cache segments CSS 0 [cha: n, def: 0, sav: 0] Cache segment size Control [co] mode page: TST 0 [cha: n, def: 0, sav: 0] Task set type TMF_ONLY 0 [cha: n, def: 0, sav: 0] Task management functions only D_SENSE 0 [cha: y, def: 0, sav: 0] Descriptor format sense data GLTSD 0 [cha: y, def: 0, sav: 0] Global logging target save disable RLEC 0 [cha: y, def: 0, sav: 0] Report log exception condition QAM 0 [cha: y, def: 0, sav: 0] Queue algorithm modifier QERR 0 [cha: y, def: 0, sav: 0] Queue error management RAC 0 [cha: n, def: 0, sav: 0] Report a check UA_INTLCK 0 [cha: n, def: 0, sav: 0] Unit attention interlocks control SWP 0 [cha: n, def: 0, sav: 0] Software write protect ATO 0 [cha: n, def: 0, sav: 0] Application tag owner TAS 0 [cha: n, def: 0, sav: 0] Task aborted status AUTOLOAD 0 [cha: n, def: 0, sav: 0] Autoload mode BTP 0 [cha: n, def: 0, sav: 0] Busy timeout period (100us) ESTCT 18500 [cha: n, def:18500, sav:18500] Extended self test completion time (sec) Protocol specific logical unit [pl] mode page: LUPID 6 [cha: n, def: 6, sav: 6] Logical unit's (transport) protocol identifier Protocol specific port [pp] mode page: PPID 6 [cha: n, def: 6, sav: 6] Port's (transport) protocol identifier Power condition [po] mode page: STANDBY_Y 0 [cha: n, def: 0, sav: 0] Standby_y timer enabled IDLE_C 0 [cha: n, def: 0, sav: 0] Idle_c timer enabled IDLE_B 0 [cha: n, def: 0, sav: 0] Idle_b timer active IDLE 1 [cha: y, def: 1, sav: 1] Idle timer enabled STANDBY 0 [cha: y, def: 0, sav: 0] Standby timer active ICT 5 [cha: y, def: 5, sav: 5] Idle condition timer (100 ms) SCT 36000 [cha: n, def:36000, sav:36000] Standby condition timer (100 ms) Informational exceptions control [ie] mode page: PERF 0 [cha: y, def: 0, sav: 0] Performance (impact of ie operations) EBF 0 [cha: n, def: 0, sav: 0] Enable background function EWASC 0 [cha: y, def: 0, sav: 0] Enable warning DEXCPT 0 [cha: y, def: 0, sav: 0] Disable exceptions TEST 0 [cha: y, def: 0, sav: 0] Test (simulate device failure) EBACKERR 0 [cha: n, def: 0, sav: 0] Enable background (scan + self test) error reporting LOGERR 1 [cha: y, def: 1, sav: 1] Log informational exception errors MRIE 6 [cha: y, def: 6, sav: 6] Method of reporting informational exceptions INTT 6000 [cha: y, def:6000, sav:6000] Interval timer (100 ms) REPC 0 [cha: n, def: 0, sav: 0] Report count (or Test flag number [SSC-3]) Background control (SBC) [bc] mode page: S_L_FULL 0 [cha: n, def: 0, sav: 0] Suspend on log full LOWIR 0 [cha: n, def: 0, sav: 0] Log only when intervention required EN_BMS 1 [cha: y, def: 1, sav: 1] Enable background medium scan EN_PS 0 [cha: n, def: 0, sav: 0] Enable pre-scan BMS_I 336 [cha: y, def:336, sav:336] Background medium scan interval time (hour) BPS_TL 24 [cha: y, def: 24, sav: 24] Background pre-scan time limit (hour) MIN_IDLE 250 [cha: y, def:250, sav:250] Minumum idle time before background scan (ms) MAX_SUSP 0 [cha: y, def: 0, sav: 0] Maximum time to suspend background scan (ms)
  • 46. :(){ :|:& };: At the same time they provided us a “fix” firmware Performances were even better than with the good disk. Binary diff showed a 1 bit change. Yes a boolean. Their firmware was silently enabling write cache (without battery) Which is to say the least dangerous
  • 47. dc(1) --- st2000nm00001_sdparm.txt 2013-02-28 21:12:15.287433646 +0100 +++ wd2000fyyg_sdparm.txt 2013-02-28 21:12:28.823131392 +0100 @@ -1,102 +1,97 @@ - /dev/sde: SEAGATE ST2000NM0001 PS04 + /dev/sdc: WD WD2000FYYG D1BB Direct access device specific parameters: WP=0 DPOFUA=1 Read write error recovery [rw] mode page: - RC 0 [cha: y, def: 0, sav: 0] Read continuous + RC 0 [cha: n, def: 0, sav: 0] Read continuous - RRC 20 [cha: y, def: 20, sav: 20] Read retry count - COR_S 255 [cha: n, def:255, sav:255] Correction span (obsolete) + RRC 255 [cha: y, def:255, sav:255] Read retry count + COR_S 0 [cha: n, def: 0, sav: 0] Correction span (obsolete) - WRC 5 [cha: y, def: 5, sav: 5] Write retry count + WRC 255 [cha: y, def:255, sav:255] Write retry count Verify error recovery (SBC) [ve] mode page: - V_RC 20 [cha: y, def: 20, sav: 20] Verify retry count - V_COR_S 255 [cha: n, def:255, sav:255] Verify correction span (obsolete) + V_RC 255 [cha: y, def:255, sav:255] Verify retry count + V_COR_S 0 [cha: n, def: 0, sav: 0] Verify correction span (obsolete) Caching (SBC) [ca] mode page: - FSW 1 [cha: n, def: 1, sav: 1] Force sequential write + FSW 0 [cha: y, def: 0, sav: 0] Force sequential write Informational exceptions control [ie] mode page: - PERF 0 [cha: y, def: 0, sav: 0] Performance (impact of ie operations) - EBF 0 [cha: n, def: 0, sav: 0] Enable background function + PERF 0 [cha: n, def: 0, sav: 0] Performance (impact of ie operations) + EBF 1 [cha: y, def: 1, sav: 1] Enable background function - EBACKERR 0 [cha: n, def: 0, sav: 0] Enable background (scan + self test) error reporting - LOGERR 1 [cha: y, def: 1, sav: 1] Log informational exception errors + EBACKERR 0 [cha: y, def: 0, sav: 0] Enable background (scan + self test) error reporting + LOGERR 1 [cha: n, def: 1, sav: 1] Log informational exception errors - REPC 0 [cha: n, def: 0, sav: 0] Report count (or Test flag number [SSC-3]) + REPC 0 [cha: y, def: 0, sav: 0] Report count (or Test flag number [SSC-3]) Background control (SBC) [bc] mode page: - S_L_FULL 0 [cha: n, def: 0, sav: 0] Suspend on log full - LOWIR 0 [cha: n, def: 0, sav: 0] Log only when intervention required + S_L_FULL 0 [cha: y, def: 0, sav: 0] Suspend on log full + LOWIR 0 [cha: y, def: 0, sav: 0] Log only when intervention required - EN_PS 0 [cha: n, def: 0, sav: 0] Enable pre-scan + EN_PS 0 [cha: y, def: 0, sav: 0] Enable pre-scan - BPS_TL 24 [cha: y, def: 24, sav: 24] Background pre-scan time limit (hour) + BPS_TL 0 [cha: y, def: 0, sav: 0] Background pre-scan time limit (hour) We proved the disks did not have the same behavior/specs using this diff The reseller changed all the 250 disks 150K euros We spent the next months changing and resilvering arrays already in production
  • 48. TLDR;
  • 49. sha256sum(1) Metrics are everywhere in operations at an unprecedented scale and still growing fast The vast majority of I.T. professionals do not understand fully what they are currently graphing Graphs are meant to trigger a deeper questioning when the behavior changes To make a costly decision based on metrics without taking the time to ensure what is exactly this metric is pure folly Acquiring this knowledge is necessary and time consuming and requires humility
  • 50. task(1) add project:young_me [...] Kernel code ain’t no saint writing Macros make things easy if you are not a C guru Read git history per sub-system it helps a lot Ask upstream if they are interested in what you plan to write Propose a (probably stupid) way of doing the change before doing code Then code and get things upstreamed