Make Sure Your Applications Crash




           Moshe Zadka
True story
Python doesn't crash




Memory managed, no direct pointer arithmetic
...except it does




C bugs, untrapped exceptions, infinite loops,
blocking calls, thread deadlocks, inconsistent
resident state
Recovery is important




"[S]ystem failure can usually be considered to
  be the result of two program errors[...] the
      second, in the recovery routine[...]"
Crashes and inconsistent data




A crash results in data from an arbitrary program state.
Avoid storage




Caches are better than master copies.
Databases




Transactions maintain consistency
    Databases can crash too!
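A minimal sketch of the point above, using the standard-library sqlite3 module. The accounts table, the transfer function and the crash_midway flag are all hypothetical names chosen for illustration; the flag stands in for a failure halfway through an update.

```python
import sqlite3

def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    with conn:
        conn.execute(
            "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
        conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                         [("a", 200), ("b", 0)])
    return conn

def transfer(conn, src, dst, amount, crash_midway=False):
    """Move amount from src to dst inside a single transaction."""
    # "with conn" commits on success and rolls back on any exception,
    # so a failure between the two UPDATEs leaves no partial change.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src))
        if crash_midway:
            # Stand-in for a failure in the middle of the update.
            raise RuntimeError("simulated failure")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst))

def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

After a failed transfer the balances are the same as before the call. This only covers failures of the application; the database process itself can still crash, which is why the slide's second line matters.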
Atomic operations




    File rename
Example: Counting
import os

def update_counter():
    with open("counter.txt") as fp:
        counter = int(fp.read().strip())
    counter += 1
    # If there is a crash before this point,
    # no changes have been made.
    with open("counter.txt.tmp", "w") as fp:
        print(counter, file=fp)
    # If there is a crash before this point,
    # only a temp file has been modified
    # The following is an atomic operation
    os.rename("counter.txt.tmp", "counter.txt")
Efficient caches, reliable masters




     Mark inconsistency of cache
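One possible reading of "mark inconsistency", sketched under assumptions: the master copy is a text file of integers, the cache stores their sum, and a ".dirty" marker file flags the cache as untrustworthy while it is being rewritten. The file layout and both function names are hypothetical.

```python
import json
import os

def rebuild_cache(master_path, cache_path):
    """Recompute the cache (here: a sum) from the master copy."""
    dirty = cache_path + ".dirty"
    # Flag the cache as untrustworthy before touching it.
    open(dirty, "w").close()
    with open(master_path) as fp:
        total = sum(int(line) for line in fp if line.strip())
    with open(cache_path, "w") as fp:
        json.dump({"total": total}, fp)
    # Only now is the cache known to match the master.
    os.remove(dirty)

def read_total(master_path, cache_path):
    """Use the cache when it is marked consistent, else rebuild it."""
    stale = (os.path.exists(cache_path + ".dirty")
             or not os.path.exists(cache_path))
    if stale:
        rebuild_cache(master_path, cache_path)
    with open(cache_path) as fp:
        return json.load(fp)["total"]
```

A crash during rebuild_cache leaves the marker behind, so the next reader discards the half-written cache and goes back to the master.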
No shutdown




Crash in testing
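A rough sketch of what crashing in a test could look like, reusing the counter file from the counting example. The child program, run_update and read_counter are hypothetical helpers; os._exit is used because it ends the process without running any cleanup code.

```python
import os
import subprocess
import sys

# Child program: writes the new value to a temp file, then either dies
# without cleanup or finishes the update with an atomic rename.
CHILD = """
import os, sys
with open("counter.txt.tmp", "w") as fp:
    fp.write("2\\n")
if sys.argv[1] == "crash":
    os._exit(1)
os.rename("counter.txt.tmp", "counter.txt")
"""

def run_update(workdir, mode):
    """Run one update in a child process and return its exit status."""
    return subprocess.call([sys.executable, "-c", CHILD, mode],
                           cwd=workdir)

def read_counter(workdir):
    """Read the counter the way a restarted process would."""
    with open(os.path.join(workdir, "counter.txt")) as fp:
        return int(fp.read().strip())
```

A test would call run_update(workdir, "crash") and then check that read_counter(workdir) still returns the old value.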
Availability




If data is consistent, just restart!
Improving availability




        Limit impact
       Fast detection
        Fast start-up
Vertical splitting




Different execution paths, different processes
Horizontal splitting




Different code bases, different processes
Watchdog




Monitor -> Flag -> Remediate
Watchdog principles




Keep it simple, keep it safe!
Watchdog: Heartbeats
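## In a Twisted process
import os
from twisted.internet import task
def beat():
    # Append-mode open creates the file; utime bumps its mtime
    open('beats/my-name', 'a').close()
    os.utime('beats/my-name', None)
task.LoopingCall(beat).start(30)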
Watchdog: Get time-outs
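import glob
import time

def getTimeout():
    # Each file in hearts/ holds a number of seconds; the result
    # maps its path to the matching cut-off time.
    timeout = dict()
    now = time.time()
    for heart in glob.glob('hearts/*'):
        with open(heart) as fp:
            beat = int(fp.read().strip())
        timeout[heart] = now-beat
    return timeout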
Watchdog: Mark problems
import os

def markProblems():
    timeout = getTimeout()
    for heart in glob.glob('beats/*'):
        name = os.path.basename(heart)
        mtime = os.path.getmtime(heart)
        problem = os.path.join('problems', name)
        # getTimeout() keys are paths under hearts/
        cutoff = timeout[os.path.join('hearts', name)]
        if (mtime<cutoff and
            not os.path.isfile(problem)):
            with open(problem, 'w') as fp:
                fp.write('watchdog')
Watchdog: check solutions
def checkSolutions():
    now = time.time()
    problemTimeout = now-30
    for problem in glob.glob('problems/*'):
        mtime = os.path.getmtime(problem)
        if mtime<problemTimeout:
            subprocess.call(['restart-system'])
Watchdog: Loop
## Watchdog
while True:
    markProblems()
    checkSolutions()
    time.sleep(1)
Watchdog: accuracy of




Custom checkers can manufacture problems
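A small sketch of a custom checker that follows the same problems/ file convention as the watchdog code. The run_checker helper and its arguments are hypothetical; the check callable is whatever application-level test makes sense.

```python
import os

def run_checker(name, check, problems_dir="problems"):
    """Run check(); flag a problem for name if it fails or raises."""
    try:
        healthy = bool(check())
    except Exception:
        # A checker that blows up is itself evidence of trouble.
        healthy = False
    problem = os.path.join(problems_dir, name)
    if not healthy and not os.path.isfile(problem):
        with open(problem, "w") as fp:
            fp.write("checker")
    return healthy
```

Because the checker only writes a problem file, the watchdog loop needs no changes: an unresolved problem is escalated in the same way as a missed heartbeat.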
Watchdog: reliability of




   Use cron for main loop
Watchdog: reliability of




Use software/hardware watchdogs
Conclusions




Everything crashes -- plan for it
Questions?
Welcome to the back-up slides
         Extra! Extra!
Example: Counting on Windows
def update_counter():
    with open("counter.txt") as fp:
        counter = int(fp.read().strip())
    counter += 1
    # If there is a crash before this point,
    # no changes have been made.
    with open("counter.txt.tmp", "w") as fp:
        print(counter, file=fp)
    # If there is a crash before this point,
    # only a temp file has been modified
    os.remove("counter.txt")
    # At this point, the state is inconsistent*
    # The following is an atomic operation
    os.rename("counter.txt.tmp", "counter.txt")
Example: Counting on Windows
             (Recovery)
def recover():
    if not os.path.exists("counter.txt"):
        # The permanent file has been removed
        # Therefore, the temp file is valid
        os.rename("counter.txt.tmp",
                  "counter.txt")
Example: Counting with versions
def update_counter():
    files = [int(name.split('.')[-1])
             for name in os.listdir('.')
             if name.startswith('counter.')]
    last = max(files)
    with open('counter.%s' % last) as fp:
        counter = int(fp.read().strip())
    counter += 1
    # If there is a crash before this point,
    # no changes have been made.
    with open("tmp.counter", 'w') as fp:
        print(counter, file=fp)
    # If there is a crash before this point,
    # only a temp file has been modified
    os.rename('tmp.counter',
              'counter.%s' % (last+1))
    os.remove('counter.%s' % last)
Example: Counting with versions
             (cleanup)
# This is not a recovery routine, but a cleanup
# routine.
# Even in its absence, the state is consistent
def cleanup():
    files = [int(name.split('.')[-1])
                for name in os.listdir('.')
                  if name.startswith('counter.')]
    files.sort()
    files.pop()
    for n in files:
        os.remove('counter.%d' % n)
    if os.path.exists('tmp.counter'):
        os.remove('tmp.counter')
Correct ordering
def activate_due():
    # Assumes rs was created with decode_responses=True,
    # so that set members come back as str, not bytes.
    scheduled = rs.smembers('scheduled')
    now = time.time()
    for el in scheduled:
        due = int(rs.get(el+':due'))
        if now<due:
            continue
        rs.sadd('activated', el)
        rs.delete(el+':due')
        rs.srem('scheduled', el)
Correct ordering (recovery)
def recover():
    inconsistent = rs.sinter('activated',
                             'scheduled')
    for el in inconsistent:
        rs.delete(el+':due') #*
        rs.srem('scheduled', el)
Example: Key/value stores
0.log:
  ['add', 'key-0', 'value-0']
  ['add', 'key-1', 'value-1']
  ['add', 'key-0', 'value-2']
  ['remove', 'key-1']
  .
  .
  .

1.log:
  .
  .
  .

2.log:
.
.
.
Example: Key/value stores (utility
             functions)
## Get the level of a file
def getLevel(s):
    return int(s.split('.')[0])

## Get all files of a given type
def getType(files, tp):
    return [(getLevel(s), s)
            for s in files if s.endswith(tp)]
Example: Key/value stores
             (classifying files)
## Get all relevant files
def relevant(d):
    files = os.listdir(d)
    mlevel, master = max(getType(files, '.master'))
    logs = getType(files, '.log')
    logs.sort()
    return [master]+[log for llevel, log in logs
                         if llevel>mlevel]
Example: Key/value stores (reading)
## Read in a single file
def update(result, fp):
    for line in fp:
        val = json.loads(line)
        if val[0] == 'add':
            result[val[1]] = val[2]
        else:
            del result[val[1]]

## Read in several files
def read(files):
    result = dict()
    for fname in files:
        try:
            with open(fname) as fp:
                update(result, fp)
        except ValueError:
            # e.g. a partial last line left by a crash
            pass
    return result
Example: Key/value stores (writer
               class)
class Writer(object):
    def __init__(self, level):
        self.level = level
        self.fp = None
        self._next()
    def _next(self):
        self.level += 1
        if self.fp:
            self.fp.close()
        name = '%03d.log' % self.level
        self.fp = open(name, 'w')
        self.rows = 0
    def write(self, value):
        print(json.dumps(value), file=self.fp)
        self.fp.flush()
        self.rows += 1
        if self.rows>200:
            self._next()
Example: Key/value stores (storage
               class)
## The actual data store abstraction.
class Store(object):
    def __init__(self, d):
        files = relevant(d)
        self.result = read(files)
        level = getLevel(files[-1])
        self.writer = Writer(level)
    def get(self, key):
        return self.result[key]
    def add(self, key, value):
        self.writer.write(['add', key, value])
        # Keep the in-memory view in step with the log
        self.result[key] = value
    def remove(self, key):
        # Raises KeyError before anything is logged
        del self.result[key]
        self.writer.write(['remove', key])
Example: Key/value stores
            (compression code)
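## This should be run periodically
# from a different thread
def compress(d):
    files = relevant(d)[:-1]
    if len(files)<2:
        return
    result = read(files)
    # The new master replaces every file read above, so it
    # takes the level of the last log folded in; newer logs
    # (including the one being written) stay relevant.
    master = getLevel(files[-1])
    tmp = '%03d.master.tmp' % master
    with open(tmp, 'w') as fp:
        for key, value in result.items():
            towrite = ['add', key, value]
            print(json.dumps(towrite), file=fp)
    # Publish the finished master with an atomic rename
    os.rename(tmp, '%03d.master' % master)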
Vertical splitting: Example
def forking_server():
    s = socket.socket()
    s.bind(('', 8080))
    s.listen(5)
    while True:
        client, _addr = s.accept()
        newpid = os.fork()
        if newpid == 0:
            # Child: serve one client, then exit
            f = client.makefile('w')
            f.write("Sunday, May 22, 1983 "
                    "18:45:59-PST")
            f.close()
            os._exit(0)
        # Parent: the child owns the connection now
        client.close()
Horizontal splitting: front-end
## Process one
from twisted.internet import reactor
from twisted.python import filepath
from twisted.web import resource, server

class SchedulerResource(resource.Resource):
    isLeaf = True
    def __init__(self, filepath):
        resource.Resource.__init__(self)
        self.filepath = filepath
    def render_PUT(self, request):
        uuid, = request.postpath
        content = request.content.read()
        child = self.filepath.child(uuid)
        child.setContent(content)
        return b''
fp = filepath.FilePath("things")
r = SchedulerResource(fp)
s = server.Site(r)
reactor.listenTCP(8080, s)
reactor.run()
Horizontal splitting: scheduler
## Process two
rs = redis.Redis(host='localhost',
                 port=6379, db=9)
while True:
    for uuid in os.listdir("things"):
        fname = os.path.join("things", uuid)
        with open(fname) as fp:
            when = int(fp.read().strip())
        rs.set(uuid+':due', when)
        rs.sadd('scheduled', uuid)
        os.remove(fname)
    time.sleep(1)
Horizontal splitting: runner
## Process three
rs = redis.Redis(host='localhost',
                 port=6379, db=9,
                 decode_responses=True)
recover()
while True:
    activate_due()
    time.sleep(1)
Horizontal splitting: message
           queues
     No direct dependencies
Horizontal splitting: message
            queues: sender
## Process four
rs = redis.Redis(host='localhost',
                 port=6379, db=9)
params = pika.ConnectionParameters('localhost')
conn = pika.BlockingConnection(params)
channel = conn.channel()
channel.queue_declare(queue='active')
while True:
    activated = rs.smembers('activated')
    finished = set(rs.smembers('finished'))
    for el in activated:
        if el in finished:
            continue
        channel.basic_publish(
            exchange='', routing_key='active',
            body=el)
        rs.sadd('finished', el)
    time.sleep(1)
Horizontal splitting: message
            queues: receiver
## Process five
# It is possible to get "dups" of bodies.
# Application logic should deal with that
params = pika.ConnectionParameters('localhost')
conn = pika.BlockingConnection(params)
channel = conn.channel()
channel.queue_declare(queue='active')
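def callback(ch, method, properties, el):
    syslog.syslog('Activated %s' % el.decode())
# pika >= 1.0 signature; older releases took
# basic_consume(callback, queue=..., no_ack=True)
channel.basic_consume(queue='active',
                      on_message_callback=callback,
                      auto_ack=True)
channel.start_consuming()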
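The receiver's comment says duplicates are possible and that application logic should cope. One way to do that, sketched here with a hypothetical DedupHandler wrapper that takes the same four arguments as the callback above:

```python
class DedupHandler:
    """Wrap a handler so repeated deliveries of a body run it once."""

    def __init__(self, handler):
        self.handler = handler
        self.seen = set()

    def __call__(self, ch, method, properties, body):
        if body in self.seen:
            return
        # Handle first, then record: a crash in between causes a
        # repeat after restart, never a lost message.
        self.handler(body)
        self.seen.add(body)
```

The seen set lives in memory, so it is lost on a crash; this sketch assumes the wrapped handler tolerates the occasional repeat, and a durable record would be needed otherwise.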
Horizontal splitting: point-to-point
      Use HTTP (preferably, REST)

Understanding the Laravel MVC Architecture
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Make Sure Your Applications Crash

  • 1. Make Sure Your Applications Crash, by Moshe Zadka
  • 3. Python doesn't crash: memory managed, no direct pointer arithmetic
  • 4. ...except it does: C bugs, untrapped exceptions, infinite loops, blocking calls, thread deadlock, inconsistent resident state
  • 5. Recovery is important: "[S]ystem failure can usually be considered to be the result of two program errors[...] the second, in the recovery routine[...]"
  • 6. Crashes and inconsistent data: a crash results in data from an arbitrary program state.
  • 7. Avoid storage: caches are better than master copies.
  • 9. Atomic operations: file rename
  • 10. Example: Counting
        def update_counter():
            fp = file("counter.txt")
            s = fp.read()
            counter = int(s.strip())
            counter += 1
            # If there is a crash before this point,
            # no changes have been done.
            fp = file("counter.txt.tmp", 'w')
            print >>fp, counter
            fp.close()
            # If there is a crash before this point,
            # only a temp file has been modified
            # The following is an atomic operation
            os.rename("counter.txt.tmp", "counter.txt")
  • 11. Efficient caches, reliable masters: mark inconsistency of cache
  • 13. Availability: if data is consistent, just restart!
  • 14. Improving availability: limit impact, fast detection, fast start-up
  • 15. Vertical splitting: different execution paths, different processes
  • 16. Horizontal splitting: different code bases, different processes
  • 17. Watchdog: Monitor -> Flag -> Remediate
  • 18. Watchdog principles: keep it simple, keep it safe!
  • 19. Watchdog: Heartbeats
        ## In a Twisted process
        def beat():
            file('beats/my-name', 'a').close()
        task.LoopingCall(beat).start(30)
  • 20. Watchdog: Get time-outs
        def getTimeout():
            timeout = dict()
            now = time.time()
            for heart in glob.glob('hearts/*'):
                beat = int(file(heart).read().strip())
                # Key by base name so beats/<name> can look it up
                timeout[os.path.basename(heart)] = now-beat
            return timeout
  • 21. Watchdog: Mark problems
        def markProblems():
            timeout = getTimeout()
            for heart in glob.glob('beats/*'):
                # hearts/<name> and beats/<name> share a base name
                name = os.path.basename(heart)
                mtime = os.path.getmtime(heart)
                problem = 'problems/'+name
                if (mtime<timeout[name] and
                    not os.path.isfile(problem)):
                    fp = file(problem, 'w')
                    fp.write('watchdog')
                    fp.close()
  • 22. Watchdog: check solutions
        def checkSolutions():
            now = time.time()
            problemTimeout = now-30
            for problem in glob.glob('problems/*'):
                mtime = os.path.getmtime(problem)
                if mtime<problemTimeout:
                    subprocess.call(['restart-system'])
  • 23. Watchdog: Loop
        ## Watchdog
        while True:
            markProblems()
            checkSolutions()
            time.sleep(1)
  • 24. Watchdog: accuracy of Custom checkers can manufacture problems
  • 25. Watchdog: reliability of Use cron for main loop
  • 26. Watchdog: reliability of Use software/hardware watchdogs
  • 29. Welcome to the back-up slides: Extra! Extra!
  • 30. Example: Counting on Windows
        def update_counter():
            fp = file("counter.txt")
            s = fp.read()
            counter = int(s.strip())
            counter += 1
            # If there is a crash before this point,
            # no changes have been done.
            fp = file("counter.txt.tmp", 'w')
            print >>fp, counter
            fp.close()
            # If there is a crash before this point,
            # only a temp file has been modified
            os.remove("counter.txt")
            # At this point, the state is inconsistent*
            # The following is an atomic operation
  • 32. Example: Counting on Windows (Recovery)
        def recover():
            if not os.path.exists("counter.txt"):
                # The permanent file has been removed
                # Therefore, the temp file is valid
                os.rename("counter.txt.tmp", "counter.txt")
  • 33. Example: Counting with versions
        def update_counter():
            files = [int(name.split('.')[-1])
                     for name in os.listdir('.')
                     if name.startswith('counter.')]
            last = max(files)
            counter = int(file('counter.%s' % last).read().strip())
            counter += 1
            # If there is a crash before this point,
            # no changes have been done.
            fp = file("tmp.counter", 'w')
            print >>fp, counter
            fp.close()
            # If there is a crash before this point,
            # only a temp file has been modified
  • 34. (continued)
            os.rename('tmp.counter', 'counter.%s' % (last+1))
            os.remove('counter.%s' % last)
  • 35. Example: Counting with versions (cleanup)
        # This is not a recovery routine, but a cleanup
        # routine.
        # Even in its absence, the state is consistent
        def cleanup():
            files = [int(name.split('.')[-1])
                     for name in os.listdir('.')
                     if name.startswith('counter.')]
            files.sort()
            files.pop()
            for n in files:
                os.remove('counter.%d' % n)
            if os.path.exists('tmp.counter'):
                os.remove('tmp.counter')
  • 36. Correct ordering
        def activate_due():
            scheduled = rs.smembers('scheduled')
            now = time.time()
            for el in scheduled:
                due = int(rs.get(el+':due'))
                if now<due:
                    continue
                rs.sadd('activated', el)
                rs.delete(el+':due')
                # redis-py names the set-removal call srem
                rs.srem('scheduled', el)
  • 37. Correct ordering (recovery)
        def recover():
            inconsistent = rs.sinter('activated', 'scheduled')
            for el in inconsistent:
                rs.delete(el+':due') #*
                rs.srem('scheduled', el)
  • 38. Example: Key/value stores
        0.log:
            ['add', 'key-0', 'value-0']
            ['add', 'key-1', 'value-1']
            ['add', 'key-0', 'value-2']
            ['remove', 'key-1']
            . . .
        1.log:
            . . .
        2.log:
  • 39. (continued)
            . . .
  • 40. Example: Key/value stores (utility functions)
        ## Get the level of a file
        def getLevel(s):
            return int(s.split('.')[0])
        ## Get all files of a given type
        def getType(files, tp):
            return [(getLevel(s), s) for s in files
                    if s.endswith(tp)]
  • 41. Example: Key/value stores (classifying files)
        ## Get all relevant files
        def relevant(d):
            files = os.listdir(d)
            mlevel, master = max(getType(files, '.master'))
            logs = getType(files, '.log')
            logs.sort()
            # The newest master, then every log above its level
            return [master]+[log for llevel, log in logs
                             if llevel>mlevel]
  • 42. Example: Key/value stores (reading)
        ## Read in a single file
        def update(result, fp):
            for line in fp:
                val = json.loads(line)
                if val[0] == 'add':
                    result[val[1]] = val[2]
                else:
                    del result[val[1]]
        ## Read in several files
        def read(files):
            result = dict()
            for fname in files:
                try:
                    update(result, file(fname))
  • 43. (continued)
                except ValueError:
                    pass
            return result
  • 44. Example: Key/value stores (writer class)
        class Writer(object):
            def __init__(self, level):
                self.level = level
                self.fp = None
                self._next()
            def _next(self):
                self.level += 1
                if self.fp:
                    self.fp.close()
                # The attribute set in __init__ is self.level
                name = '%3d.log' % self.level
                self.fp = file(name, 'w')
                self.rows = 0
            def write(self, value):
  • 46. Example: Key/value stores (storage class)
        ## The actual data store abstraction.
        class Store(object):
            def __init__(self):
                files = relevant(d)
                self.result = read(files)
                level = getLevel(files[-1])
                self.writer = Writer(level)
            def get(self, key):
                return self.result[key]
            def add(self, key, value):
                self.writer.write(['add', key, value])
            def remove(self, key):
                self.writer.write(['remove', key])
  • 47. Example: Key/value stores (compression code)
        ## This should be run periodically
        # from a different thread
        def compress(d):
            files = relevant(d)[:-1]
            if len(files)<2:
                return
            result = read(files)
            master = getLevel(files[-1])+1
            fp = file('%3d.master.tmp' % master, 'w')
            for key, value in result.iteritems():
                towrite = ['add', key, value]
                print >>fp, json.dumps(towrite)
            fp.close()
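  • 48. Vertical splitting: Example
        def forking_server():
            s = socket.socket()
            s.bind(('', 8080))
            s.listen(5)
            while True:
                # accept() returns (connection, address)
                client, addr = s.accept()
                newpid = os.fork()
                if newpid == 0:
                    # Child: serve one client, then exit
                    f = client.makefile('w')
                    f.write("Sunday, May 22, 1983 "
                            "18:45:59-PST")
                    f.close()
                    os._exit(0)
                # Parent: drop its copy of the connection
                client.close()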
  • 49. Horizontal splitting: front-end
        ## Process one
        class SchedulerResource(resource.Resource):
            isLeaf = True
            def __init__(self, filepath):
                resource.Resource.__init__(self)
                self.filepath = filepath
            def render_PUT(self, request):
                uuid, = request.postpath
                content = request.content.read()
                child = self.filepath.child(uuid)
                child.setContent(content)
        fp = filepath.FilePath("things")
        r = SchedulerResource(fp)
        s = server.Site(r)
        reactor.listenTCP(8080, s)
  • 50. Horizontal splitting: scheduler
        ## Process two
        rs = redis.Redis(host='localhost', port=6379, db=9)
        while True:
            # The front-end names each file after its uuid
            for uuid in os.listdir("things"):
                fname = os.path.join("things", uuid)
                when = int(file(fname).read().strip())
                rs.set(uuid+':due', when)
                rs.sadd('scheduled', uuid)
                os.remove(fname)
            time.sleep(1)
  • 51. Horizontal splitting: runner
        ## Process three
        rs = redis.Redis(host='localhost', port=6379, db=9)
        recover()
        while True:
            activate_due()
            time.sleep(1)
  • 52. Horizontal splitting: message queues: no direct dependencies
  • 53. Horizontal splitting: message queues: sender
        ## Process four
        rs = redis.Redis(host='localhost', port=6379, db=9)
        params = pika.ConnectionParameters('localhost')
        conn = pika.BlockingConnection(params)
        channel = conn.channel()
        channel.queue_declare(queue='active')
        while True:
            activated = rs.smembers('activated')
            finished = set(rs.smembers('finished'))
            for el in activated:
                if el in finished:
                    continue
  • 54. (continued)
                channel.basic_publish(
                    exchange='',
                    routing_key='active',
                    body=el)
                # 'finished' is a Redis set, so record with sadd
                rs.sadd('finished', el)
  • 55. Horizontal splitting: message queues: receiver
        ## Process five
        # It is possible to get "dups" of bodies.
        # Application logic should deal with that
        params = pika.ConnectionParameters('localhost')
        conn = pika.BlockingConnection(params)
        channel = conn.channel()
        channel.queue_declare(queue='active')
        def callback(ch, method, properties, el):
            syslog.syslog('Activated %s' % el)
        # Consume from the queue declared above and used by the sender
        channel.basic_consume(callback, queue='active', no_ack=True)
        channel.start_consuming()
  • 56. Horizontal splitting: point-to-point: use HTTP (preferably REST)
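
The counting example on slides 10 and 30 to 32 can be sketched in Python 3 roughly as follows. This is an illustration, not the talk's code: the `path` parameter, the use of `tempfile.mkstemp`, and the `fsync` call are assumptions added here. `os.replace` overwrites an existing destination on both POSIX and Windows, which may remove the need for the separate remove-then-rename step shown for Windows; whether the replacement is fully atomic still depends on the platform and filesystem.

```python
import os
import tempfile


def update_counter(path="counter.txt"):
    """Increment the integer stored in `path` via a temp file and a rename."""
    with open(path) as fp:
        counter = int(fp.read().strip())
    counter += 1
    # If there is a crash before this point, no changes have been made.
    # The temp file lives in the same directory, so the final rename
    # never has to cross a filesystem boundary.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fp:
            fp.write("%d\n" % counter)
            fp.flush()
            os.fsync(fp.fileno())  # push the bytes to disk before renaming
        # If there is a crash before this point, only a temp file changed.
        os.replace(tmp_path, path)
    except BaseException:
        # An ordinary error (not a crash): tidy up the temp file, re-raise.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return counter
```

A hard crash can still leave a stray `*.tmp` file behind; as with the versioned variant on slide 35, removing it is cleanup rather than recovery, because `counter.txt` is always either the old or the new value.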
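
The watchdog on slides 19 to 23 (monitor, flag, remediate) might look like this in Python 3. It is a sketch under assumptions: the directory parameters, the injectable `now`, and the choice to treat a process that has never beaten as stale are not from the slides. Remediation (the `restart-system` call) is left to the caller so the checks stay side-effect free and easy to test.

```python
import os
import time


def beat(name, beats_dir="beats"):
    """Heartbeat: create the file if needed and bump its modification time."""
    path = os.path.join(beats_dir, name)
    with open(path, "a"):
        pass
    os.utime(path, None)  # set mtime to the current time


def stale_hearts(beats_dir="beats", hearts_dir="hearts", now=None):
    """Return names whose last beat is older than their allowed interval."""
    now = time.time() if now is None else now
    stale = []
    for name in sorted(os.listdir(hearts_dir)):
        with open(os.path.join(hearts_dir, name)) as fp:
            allowed = int(fp.read().strip())  # seconds allowed between beats
        beat_path = os.path.join(beats_dir, name)
        # Assumption: a process that never beat at all counts as stale.
        if (not os.path.exists(beat_path)
                or os.path.getmtime(beat_path) < now - allowed):
            stale.append(name)
    return stale


def mark_problems(stale, problems_dir="problems"):
    """Flag each stale process once; an existing flag keeps its mtime."""
    for name in stale:
        problem = os.path.join(problems_dir, name)
        if not os.path.isfile(problem):
            with open(problem, "w") as fp:
                fp.write("watchdog")


def overdue_problems(problems_dir="problems", grace=30, now=None):
    """Return flags that nobody cleared within `grace` seconds."""
    now = time.time() if now is None else now
    return [name for name in sorted(os.listdir(problems_dir))
            if os.path.getmtime(os.path.join(problems_dir, name)) < now - grace]
```

A main loop would call `mark_problems(stale_hearts())` and restart the system whenever `overdue_problems()` is non-empty. Per slide 25, that loop could itself be driven from cron rather than a long-running process.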
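
The ordering argument on slides 36 and 37 can be checked without Redis. The sketch below uses plain Python sets and a dict in place of the `scheduled` and `activated` sets and the `:due` keys; the `Crash` exception and the `crash_after` parameter are hypothetical devices for simulating a process dying between steps.

```python
class Crash(Exception):
    """Stands in for the process dying mid-update."""


def activate(scheduled, activated, due, el, crash_after=None):
    """Move `el` from scheduled to activated in a crash-tolerant order."""
    steps = [
        lambda: activated.add(el),      # 1. publish the new state first
        lambda: due.pop(el, None),      # 2. drop the now-redundant due time
        lambda: scheduled.discard(el),  # 3. retire the old state last
    ]
    for done, step in enumerate(steps):
        if crash_after == done:
            raise Crash(el)
        step()


def recover(scheduled, activated, due):
    """Finish any activation that was interrupted between steps 1 and 3."""
    for el in scheduled & activated:  # the intersection is a new set
        due.pop(el, None)
        scheduled.discard(el)
```

Because the element is added to `activated` before it leaves `scheduled`, a crash can only leave it in both sets, never in neither. That overlap is exactly what `recover` looks for.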
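
The log-replay reader on slides 38 to 43 can be sketched in Python 3 over in-memory lines. The function names and the list-of-lines input are illustrative choices, and treating removal of an absent key as a no-op is an assumption (the slide code uses `del`, which would raise `KeyError`).

```python
import json


def apply_log(result, lines):
    """Replay one log: each line is ["add", key, value] or ["remove", key]."""
    for line in lines:
        record = json.loads(line)
        if record[0] == "add":
            result[record[1]] = record[2]
        else:
            # Assumption: removing an absent key is a no-op.
            result.pop(record[1], None)
    return result


def replay(logs):
    """Replay logs in order; a torn line ends that log, as on the slides."""
    result = {}
    for lines in logs:
        try:
            apply_log(result, lines)
        except ValueError:  # json.JSONDecodeError subclasses ValueError
            pass
    return result
```

A line cut short by a crash fails to parse, so replay keeps every complete record written before it and discards only the partial one.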