Velocity 2012 - Learning WebOps the Hard Way

Learning WebOps the Hard Way

What could possibly go wrong?

Cosimo Streppone
WebOps Lead – Opera Software

teams organization

mail

webops sysadmin

#
# Cyrus IMAPD annotation definitions file
#
/vendor/messagingengine.com/
preview,message,string,backend,value.shared,

misplaced comma +
fix didn't make it to master +
unintended general rollout +
parser choked on comma +
fork with no rate limiting +
fatal() dumped core +
kernel.core_uses_pid = 1 +
small SSD metadata partition +
index corruption =
massive outage (no data loss)

DO
Rate limit fork of children

Test disk full conditions

Master your infrastructure
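
One way to "rate limit fork of children" is a token bucket in front of the spawn call. The sketch below is only an illustration under assumptions: `SpawnLimiter`, `respawn`, and the `rate`/`burst` parameters are hypothetical names, not anything taken from Cyrus or from the outage report. The `spawn` callable stands in for whatever actually forks a child.

```python
import time


class SpawnLimiter:
    """Token bucket: at most `rate` spawns per second, bursts up to `burst`.

    Hypothetical helper for illustration only.
    """

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)
        self.burst = float(burst)
        self.clock = clock            # injectable so the limiter can be tested
        self.tokens = float(burst)    # start with a full bucket
        self.last = clock()

    def try_acquire(self):
        """Return True if one more child may be started right now."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def respawn(limiter, spawn, wanted):
    """Try to start `wanted` children; return how many were actually started."""
    started = 0
    for _ in range(wanted):
        if not limiter.try_acquire():
            break  # over budget: the caller should back off and retry later
        spawn()
        started += 1
    return started
```

With a limiter like this, a child that dies immediately on startup can only be restarted a bounded number of times per second, which also bounds how fast side effects such as core files can pile up.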

DO NOT
Underestimate Mighty Comma

Roll out everywhere at once

Leave your CI builds messy

read more
“A cascade of errors”
http://blog.fastmail.fm/2011/05/15/outage-report-a-cascade-of-errors/

random failures in our infrastructure

physical bladecenters?
LVS? network?
kernel?
solar storms?
defective cpus?
DDoS?
Mayas?
bnx2? traffic?
recent deploys?
WTF?!?

what we experienced

random performance degradation
general instability
steady increase of WTFs/min!

real problem
● 2.6.32 = Debian squeeze kernel
● sched – find_busiest_group()
● TSC register wraparound

Proof
$$\frac{2^{64}}{2^{10} \cdot 86400 \cdot 10^{9}} \approx 208.49 \text{ days}$$
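
The arithmetic in the proof can be checked with a few lines of Python. This is a sketch under assumptions: the shift of 10 bits matches the 2^10 term in the proof (the kernel's `CYC2NS_SCALE_FACTOR`), the 64-bit mask models C unsigned wraparound, and the function names and the 1 GHz clock used in the usage note are made up for illustration.

```python
MASK64 = (1 << 64) - 1
CYC2NS_SCALE_FACTOR = 10        # assumed value; matches the 2^10 in the proof
NSEC_PER_MSEC = 1_000_000
NSEC_PER_DAY = 86_400 * 10**9


def overflow_days():
    """Days of uptime after which cyc * scale no longer fits in 64 bits."""
    return 2**64 / (2**CYC2NS_SCALE_FACTOR * NSEC_PER_DAY)


def naive_cycles_2_ns(cyc, cpu_khz):
    """Multiply-then-shift conversion with 64-bit wraparound (illustrative name)."""
    scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR) // cpu_khz
    # The mask throws away everything above bit 63, as a C u64 multiply would.
    return ((cyc * scale) & MASK64) >> CYC2NS_SCALE_FACTOR
```

For a hypothetical 1 GHz TSC (`cpu_khz = 1_000_000`), one cycle is one nanosecond: the conversion is still exact for 208 days worth of cycles, and wraps to a much smaller value for 209 days.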

Subject: [PATCH] sched: avoid unnecessary overflow in sched_clock
From: Salman Qazi <sqazi@google.com>
Date: 2011-11-16 20:55:31

In hundreds of days, the __cycles_2_ns calculation in sched_clock
has an overflow. cyc * per_cpu(cyc2ns, cpu) exceeds 64 bits, causing
the final value to become zero. We can solve this without losing
any precision.

We can decompose TSC into quotient and remainder of division by the
scale factor, and then use this to convert TSC into nanoseconds.

Reviewed-by: Paul Turner <pjt@google.com>
Acked-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Salman Qazi <sqazi@google.com>
---
arch/x86/include/asm/timer.h | 23 ++++++++++++++++++++++-
1 files changed, 22 insertions(+), 1 deletions(-)

Patch #1, Nov 16th 2011
diff --git a/arch/x86/include/asm/timer.h b/arch/x86/include/asm/timer.h
index fa7b917..431793e 100644
--- a/arch/x86/include/asm/timer.h
+++ b/arch/x86/include/asm/timer.h
@@ -32,6 +32,22 @@ extern int no_timer_check;
* (mathieu.desnoyers@polymtl.ca)
*
* -johnstul@us.ibm.com "math is hard, lets go shopping!"

Patch #2, Mar 8th 2012
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -608,6 +608,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, ...)
{
unsigned long long tsc_now, ns_now, *offset;
unsigned long flags, *scale;
+ unsigned long long quot;
+ unsigned long long rem;
local_irq_save(flags);
sched_clock_idle_sleep_event();
@@ -620,7 +622,15 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, ...)

if (cpu_khz) {
*scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
- *offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
+
+ /*
+ * Avoid premature overflow by splitting into quotient
+ * and remainder. See the comment above __cycles_2_ns
+ */
+ quot = (tsc_now >> CYC2NS_SCALE_FACTOR);
+ rem = tsc_now & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
+ *offset = ns_now - (quot * *scale +
+ ((rem * *scale) >> CYC2NS_SCALE_FACTOR));
}
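
The quotient/remainder split used in the patch can be tried out in Python. The sketch below mirrors the `quot`/`rem` expressions from the diff; the 64-bit masks, the assumed shift of 10 bits, and the function name `split_cycles_2_ns` are illustrative additions, not kernel code.

```python
MASK64 = (1 << 64) - 1
CYC2NS_SCALE_FACTOR = 10        # assumed value of the kernel constant


def split_cycles_2_ns(cyc, scale):
    """Convert cycles to ns without forming the full cyc * scale product."""
    quot = cyc >> CYC2NS_SCALE_FACTOR
    rem = cyc & ((1 << CYC2NS_SCALE_FACTOR) - 1)
    # Each partial product is masked to 64 bits to model C arithmetic.
    high = (quot * scale) & MASK64
    low = ((rem * scale) & MASK64) >> CYC2NS_SCALE_FACTOR
    return (high + low) & MASK64
```

Because `cyc = quot * 2^10 + rem`, the sum `quot * scale + ((rem * scale) >> 10)` equals `(cyc * scale) >> 10` exactly, but the largest intermediate value is about 2^10 times smaller, so it keeps fitting in 64 bits long after the 208-day mark.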

$$\frac{2^{32}}{86400 \cdot 10^{3}} \approx 49.7 \text{ days}$$

DO
Be persistent and creative :)

Learn more about your kernel

Improve tools to collect data

DO NOT
Run servers continuously
for more than 208 days?

T - 4y 2m
From: Roman Zippel <zippel@linux-m68k.org>
Date: Thu, 1 May 2008 04:34:41 -0700
Subject: [PATCH] ntp: handle leap second via timer

Remove the leap second handling from second_overflow(), which doesn't have to
check for it every second anymore. With CONFIG_NO_HZ this also makes sure the
leap second is handled close to the full second. Additionally this makes it
possible to abort a leap second properly by resetting the STA_INS/STA_DEL status bits.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
include/linux/clocksource.h | 2 +
include/linux/timex.h | 1 +
kernel/time/ntp.c | 133 +++++++++++++++++++++++++++++--------------
kernel/time/timekeeping.c | 4 +-

lie(t) = (1 – cos(πt / w)) / 2

[plot: lie (s) against t]
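
The smear formula on this slide is easy to evaluate directly. In the sketch below, `w` is assumed to be the length of the smear window and `lie(t)` the part of the extra second (in seconds) already applied at time `t` inside that window; that reading is inferred from the plot's axis labels, and the range check is an addition for illustration.

```python
import math


def lie(t, w):
    """Amount of the leap second (in seconds) smeared in at time t, 0 <= t <= w."""
    if not 0 <= t <= w:
        raise ValueError("t must lie inside the smear window [0, w]")
    return (1.0 - math.cos(math.pi * t / w)) / 2.0
```

The curve starts at 0, reaches half a second at the middle of the window and one full second at the end, with a slope of zero at both ends, so clocks following it never see a sudden jump.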

T - 6m

http://bit.ly/NmA47E

http://my.opera.com/marcomarongiu/blog/index.dml/tag/ntp

T – 1 month
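package {
  "ntpdate":  ensure => installed;
  "adjtimex": ensure => installed;
}

# Ship the leap second adjustment script.
file { "/usr/local/bin/leap-adjust.pl":
  ensure => present,
  source => "puppet:///modules/ntp/leap-adjust.pl",
}

# Install the crontab entry. Package["ntp"] is not declared in this
# snippet; it is presumably declared elsewhere in the ntp module.
file { "/etc/cron.d/ntp-leap-second":
  ensure  => present,
  source  => "puppet:///modules/ntp/leap-crontab",
  require => [ Package["ntp"], Package["adjtimex"] ],
}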

T – 1 day
June 30th 2012

chaos begins

T - 8h

http://bit.ly/PSBMRP

http://serverfault.com/questions/403732/leapocalypse

the workaround

# date -s now

T + {1,2}m
{August,September} 1st, 2012

fake leap seconds

read more
A story of leaping seconds
http://blog.fastmail.fm/2012/07/03/a-story-of-leaping-seconds/

Tips and tricks to deal with leap seconds
http://my.opera.com/marcomarongiu/blog/index.dml/tag/ntp

Serverfault question on random debian crashes
http://serverfault.com/questions/403732/leapocalypse

Wired article about leap second problems
http://www.wired.com/wiredenterprise/2012/07/leap-second-bug-wreaks-havoc-with-java-linux/

DO
Keep your kernel updated

Use valuable external resources
(serverfault etc...)

DO NOT
Underestimate the
importance of time

failure lessons learned

expect failure
assume failure
prepare for failure
simulate failure
measure failure
embrace failure

ops lessons learned
Don't repeat yourself (DRY)
Always keep it simple (KISS)
A separate ops team doesn't work well
Practice continuous deployment. Now.
Communication makes the difference
Learn your tools
Master your infrastructure
RTFM
...

Thanks!

@cstrep
cosimo@opera.com
