Make Sure Your Applications Crash

Presentation for PyCon 2012 about application reliability.

Published in: Technology, Automotive

  1. Make Sure Your Applications Crash
     Moshe Zadka
  2. True story
  3. Python doesn't crash
     Memory managed, no direct pointer arithmetic
  4. ...except it does
     C bugs, untrapped exceptions, infinite loops, blocking calls, thread deadlock, inconsistent resident state
  5. Recovery is important
     "[S]ystem failure can usually be considered to be the result of two program errors [...] the second, in the recovery routine [...]"
  6. Crashes and inconsistent data
     A crash results in data from an arbitrary program state.
  7. Avoid storage
     Caches are better than master copies.
  8. Databases
     Transactions maintain consistency
     Databases can crash too!
  9. Atomic operations
     File rename
  10. Example: Counting

          import os

          def update_counter():
              fp = open("counter.txt")
              s = fp.read()
              fp.close()
              counter = int(s.strip())
              counter += 1
              # If there is a crash before this point,
              # no changes have been done.
              fp = open("counter.txt.tmp", "w")
              fp.write("%d\n" % counter)
              fp.close()
              # If there is a crash before this point,
              # only a temp file has been modified
              # The following is an atomic operation
              os.rename("counter.txt.tmp", "counter.txt")
  11. Efficient caches, reliable masters
      Mark inconsistency of cache
  12. No shutdown
      Crash in testing
  13. Availability
      If data is consistent, just restart!
  14. Improving availability
      Limit impact
      Fast detection
      Fast start-up
  15. Vertical splitting
      Different execution paths, different processes
  16. Horizontal splitting
      Different code bases, different processes
  17. Watchdog
      Monitor -> Flag -> Remediate
  18. Watchdog principles
      Keep it simple, keep it safe!
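  19. Watchdog: Heartbeats

          ## In a Twisted process
          import os
          from twisted.internet import task
          def beat():
              open('beats/my-name', 'a').close()
              os.utime('beats/my-name', None)  # refresh mtime on every beat
          task.LoopingCall(beat).start(30)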
  20. Watchdog: Get time-outs

          import glob, os, time

          def getTimeout():
              timeout = dict()
              now = time.time()
              for heart in glob.glob('hearts/*'):
                  beat = int(open(heart).read().strip())
                  # Keyed by base name, assuming hearts/NAME
                  # and beats/NAME describe the same process
                  timeout[os.path.basename(heart)] = now - beat
              return timeout
  21. Watchdog: Mark problems

          def markProblems():
              timeout = getTimeout()
              for heart in glob.glob('beats/*'):
                  name = os.path.basename(heart)
                  mtime = os.path.getmtime(heart)
                  problem = 'problems/' + name
                  if (mtime < timeout[name] and
                          not os.path.isfile(problem)):
                      fp = open(problem, 'w')
                      fp.write('watchdog')
                      fp.close()
  22. Watchdog: check solutions

          import subprocess

          def checkSolutions():
              now = time.time()
              problemTimeout = now - 30
              for problem in glob.glob('problems/*'):
                  mtime = os.path.getmtime(problem)
                  if mtime < problemTimeout:
                      subprocess.call(['restart-system'])
  23. Watchdog: Loop

          ## Watchdog
          while True:
              markProblems()
              checkSolutions()
              time.sleep(1)
  24. Watchdog: accuracy of
      Custom checkers can manufacture problems
  25. Watchdog: reliability of
      Use cron for main loop
  26. Watchdog: reliability of
      Use software/hardware watchdogs
  27. Conclusions
      Everything crashes -- plan for it
  28. Questions?
  29. Welcome to the back-up slides
      Extra! Extra!
  30. Example: Counting on Windows

          import os

          def update_counter():
              fp = open("counter.txt")
              s = fp.read()
              fp.close()
              counter = int(s.strip())
              counter += 1
              # If there is a crash before this point,
              # no changes have been done.
              fp = open("counter.txt.tmp", "w")
              fp.write("%d\n" % counter)
              fp.close()
              # If there is a crash before this point,
              # only a temp file has been modified
              os.remove("counter.txt")
              # At this point, the state is inconsistent*
              # The following is an atomic operation
  31. (Example: Counting on Windows, continued)

              os.rename("counter.txt.tmp", "counter.txt")
  32. Example: Counting on Windows (Recovery)

          def recover():
              if not os.path.exists("counter.txt"):
                  # The permanent file has been removed
                  # Therefore, the temp file is valid
                  os.rename("counter.txt.tmp", "counter.txt")
  33. Example: Counting with versions

          import os

          def update_counter():
              files = [int(name.split('.')[-1])
                       for name in os.listdir('.')
                       if name.startswith('counter.')]
              last = max(files)
              counter = int(open('counter.%s' % last).read().strip())
              counter += 1
              # If there is a crash before this point,
              # no changes have been done.
              fp = open("tmp.counter", "w")
              fp.write("%d\n" % counter)
              fp.close()
              # If there is a crash before this point,
              # only a temp file has been modified
  34. (Example: Counting with versions, continued)

              os.rename('tmp.counter', 'counter.%s' % (last + 1))
              os.remove('counter.%s' % last)
  35. Example: Counting with versions (cleanup)

          # This is not a recovery routine, but a cleanup
          # routine.
          # Even in its absence, the state is consistent
          def cleanup():
              files = [int(name.split('.')[-1])
                       for name in os.listdir('.')
                       if name.startswith('counter.')]
              files.sort()
              files.pop()
              for n in files:
                  os.remove('counter.%d' % n)
              if os.path.exists('tmp.counter'):
                  os.remove('tmp.counter')
  36. Correct ordering

          import time

          # rs is the redis.Redis connection opened by the calling process
          def activate_due():
              scheduled = rs.smembers('scheduled')
              now = time.time()
              for el in scheduled:
                  due = int(rs.get(el + ':due'))
                  if now < due:
                      continue
                  rs.sadd('activated', el)
                  rs.delete(el + ':due')
                  rs.srem('scheduled', el)  # redis-py names SREM "srem"
  37. Correct ordering (recovery)

          def recover():
              inconsistent = rs.sinter('activated', 'scheduled')
              for el in inconsistent:
                  rs.delete(el + ':due')  #*
                  rs.srem('scheduled', el)
  38. Example: Key/value stores

          0.log:
              ["add", "key-0", "value-0"]
              ["add", "key-1", "value-1"]
              ["add", "key-0", "value-2"]
              ["remove", "key-1"]
              . . .
          1.log:
              . . .
          2.log:
  39. ...
  40. Example: Key/value stores (utility functions)

          import json, os

          ## Get the level of a file
          def getLevel(s):
              return int(s.split('.')[0])

          ## Get all files of a given type
          ## (the file list is passed in explicitly)
          def getType(files, tp):
              return [(getLevel(s), s) for s in files
                      if s.endswith(tp)]
  41. Example: Key/value stores (classifying files)

          ## Get all relevant files
          def relevant(d):
              files = os.listdir(d)
              mlevel, master = max(getType(files, '.master'))
              logs = getType(files, '.log')
              logs.sort()
              return [master] + [log for llevel, log in logs
                                 if llevel > mlevel]
  42. Example: Key/value stores (reading)

          ## Read in a single file
          def update(result, fp):
              for line in fp:
                  val = json.loads(line)
                  if val[0] == 'add':
                      result[val[1]] = val[2]
                  else:
                      del result[val[1]]

          ## Read in several files
          def read(files):
              result = dict()
              for fname in files:
                  try:
                      update(result, open(fname))
  43. (Example: Key/value stores (reading), continued)

                  except ValueError:
                      pass
              return result
  44. Example: Key/value stores (writer class)

          class Writer(object):
              def __init__(self, level):
                  self.level = level
                  self.fp = None
                  self._next()
              def _next(self):
                  self.level += 1
                  if self.fp:
                      self.fp.close()
                  name = '%3d.log' % self.level
                  self.fp = open(name, 'w')
                  self.rows = 0
              def write(self, value):
  45. (Example: Key/value stores (writer class), continued)

                  self.fp.write(json.dumps(value) + '\n')
                  self.fp.flush()
                  self.rows += 1
                  if self.rows > 200:
                      self._next()
  46. Example: Key/value stores (storage class)

          ## The actual data store abstraction.
          ## d is the data directory, assumed to be a module-level name
          class Store(object):
              def __init__(self):
                  files = relevant(d)
                  self.result = read(files)
                  level = getLevel(files[-1])
                  self.writer = Writer(level)
              def get(self, key):
                  return self.result[key]
              def add(self, key, value):
                  self.writer.write(['add', key, value])
                  self.result[key] = value  # keep the in-memory view in step
              def remove(self, key):
                  self.writer.write(['remove', key])
                  del self.result[key]  # keep the in-memory view in step
  47. Example: Key/value stores (compression code)

          ## This should be run periodically
          # from a different thread
          def compress(d):
              files = relevant(d)[:-1]
              if len(files) < 2:
                  return
              result = read(files)
              master = getLevel(files[-1]) + 1
              fp = open('%3d.master.tmp' % master, 'w')
              for key, value in result.items():
                  towrite = ['add', key, value]
                  fp.write(json.dumps(towrite) + '\n')
              fp.close()
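  48. Vertical splitting: Example

          import os
          import socket

          def forking_server():
              s = socket.socket()
              s.bind(('', 8080))
              s.listen(5)
              while True:
                  client, addr = s.accept()
                  newpid = os.fork()
                  if newpid == 0:
                      # Child: answer this one request, then exit
                      f = client.makefile('w')
                      f.write("Sunday, May 22, 1983 "
                              "18:45:59-PST")
                      f.close()
                      client.close()
                      os._exit(0)
                  client.close()  # parent keeps only the listening socket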
  49. Horizontal splitting: front-end

          ## Process one
          from twisted.internet import reactor
          from twisted.python import filepath
          from twisted.web import resource, server

          class SchedulerResource(resource.Resource):
              isLeaf = True
              def __init__(self, filepath):
                  resource.Resource.__init__(self)
                  self.filepath = filepath
              def render_PUT(self, request):
                  uuid, = request.postpath
                  content = request.content.read()
                  child = self.filepath.child(uuid)
                  child.setContent(content)
                  return b""  # render methods must return a body

          fp = filepath.FilePath("things")
          r = SchedulerResource(fp)
          s = server.Site(r)
          reactor.listenTCP(8080, s)
          reactor.run()
  50. Horizontal splitting: scheduler

          ## Process two
          import os, time
          import redis

          rs = redis.Redis(host='localhost', port=6379, db=9)
          while True:
              # Each file in "things" is named after its uuid
              for uuid in os.listdir("things"):
                  fname = os.path.join("things", uuid)
                  when = int(open(fname).read().strip())
                  rs.set(uuid + ':due', when)
                  rs.sadd('scheduled', uuid)
                  os.remove(fname)
              time.sleep(1)
  51. Horizontal splitting: runner

          ## Process three
          import time
          import redis

          rs = redis.Redis(host='localhost', port=6379, db=9)
          recover()
          while True:
              activate_due()
              time.sleep(1)
  52. Horizontal splitting: message queues
      No direct dependencies
  53. Horizontal splitting: message queues: sender

          ## Process four
          import pika
          import redis

          rs = redis.Redis(host='localhost', port=6379, db=9)
          params = pika.ConnectionParameters('localhost')
          conn = pika.BlockingConnection(params)
          channel = conn.channel()
          channel.queue_declare(queue='active')
          while True:
              activated = rs.smembers('activated')
              finished = set(rs.smembers('finished'))
              for el in activated:
                  if el in finished:
                      continue
  54. (Horizontal splitting: message queues: sender, continued)

                  channel.basic_publish(
                      exchange='',
                      routing_key='active',
                      body=el)
                  rs.sadd('finished', el)  # 'finished' is a set, so SADD
  55. Horizontal splitting: message queues: receiver

          ## Process five
          # It is possible to get "dups" of bodies.
          # Application logic should deal with that
          import syslog
          import pika

          params = pika.ConnectionParameters('localhost')
          conn = pika.BlockingConnection(params)
          channel = conn.channel()
          channel.queue_declare(queue='active')
          def callback(ch, method, properties, el):
              syslog.syslog('Activated %s' % el)
          # Pre-1.0 pika call style, as in the talk. pika 1.0 and later
          # spell this basic_consume(queue='active',
          # on_message_callback=callback, auto_ack=True).
          channel.basic_consume(callback, queue='active', no_ack=True)
          channel.start_consuming()
  56. Horizontal splitting: point-to-point
      Use HTTP (preferably, REST)
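
Slide 56 recommends plain HTTP for point-to-point links between processes. As a rough sketch of what a client of the slide 49 front-end might look like, the snippet below sends a due time with an HTTP PUT using only the Python 3 standard library. The function names, the base URL, and the choice to send the due time as a decimal integer in the request body are assumptions made for illustration; the talk does not show a client.

```python
import urllib.request


def build_schedule_request(base_url, uuid, due):
    """Build (but do not send) a PUT request for <base_url>/<uuid>.

    The body is the due time as a decimal integer, which is the format
    the scheduler process on slide 50 parses with int(...).
    """
    body = str(int(due)).encode("ascii")
    url = "%s/%s" % (base_url.rstrip("/"), uuid)
    return urllib.request.Request(url, data=body, method="PUT")


def schedule(base_url, uuid, due, timeout=5):
    """Send the request and return the HTTP status code.

    Raises urllib.error.URLError if the front-end is unreachable, so the
    caller can decide whether to retry or crash.
    """
    request = build_schedule_request(base_url, uuid, due)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A caller would use something like `schedule("http://localhost:8080", "some-id", time.time() + 60)`. Because the sender only depends on a URL, either side can be restarted without the other noticing more than a failed request.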
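
Slides 9, 10 and 30 to 32 rely on file rename as the atomic step, with a separate remove-then-rename dance on Windows. A possible modern variant, assuming Python 3.3 or later, is `os.replace`, which overwrites an existing destination on both POSIX and Windows. The sketch below is not the code from the talk: the `path` parameter, the treatment of a missing file as zero, and the `fsync` call are assumptions added here, and how atomic the replace is on a given Windows filesystem is worth checking before depending on it.

```python
import os
import tempfile


def update_counter(path="counter.txt"):
    """Increment the integer stored in `path` and return the new value."""
    try:
        with open(path) as fp:
            counter = int(fp.read().strip())
    except FileNotFoundError:
        counter = 0  # assumption: a missing file means counting starts at zero
    counter += 1

    # Write the new value to a temp file in the same directory, so the
    # final rename never has to cross a filesystem boundary.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fp:
            fp.write("%d\n" % counter)
            fp.flush()
            os.fsync(fp.fileno())  # push the bytes to disk before renaming
        # A crash before this line leaves only a stray temp file behind.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return counter
```

With this shape there is no window in which `counter.txt` is missing, so a recovery routine like the one on slide 32 is not needed; leftover `.tmp` files are harmless and can be cleaned up lazily.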
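
Slides 7 and 11 suggest keeping caches cheap and disposable, and marking a cache as inconsistent so that a crash mid-update is detectable. One way this could be done, sketched here in Python 3, is a marker file that exists only while the cache is being written. The function names, the `.dirty` suffix, the JSON format and the `rebuild` callback (standing in for a read from the reliable master) are all hypothetical.

```python
import json
import os


def save_cache(cache_path, data):
    """Write the cache, flagging it as untrusted while the write is in flight."""
    dirty = cache_path + ".dirty"
    open(dirty, "w").close()  # mark first: from here on the cache is suspect
    with open(cache_path, "w") as fp:
        json.dump(data, fp)
        fp.flush()
        os.fsync(fp.fileno())
    os.remove(dirty)  # unmark last: the cache is trusted again


def load_cache(cache_path, rebuild):
    """Return the cached data, rebuilding from the master if it is suspect.

    `rebuild` is a zero-argument callable that recomputes the data from
    the master copy.
    """
    dirty = cache_path + ".dirty"
    if os.path.exists(dirty) or not os.path.exists(cache_path):
        data = rebuild()
        save_cache(cache_path, data)
        return data
    with open(cache_path) as fp:
        return json.load(fp)
```

If the process crashes anywhere inside `save_cache`, the marker is still present at the next start, so `load_cache` discards the cache and goes back to the master instead of trusting a half-written file.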
