PyCon 2011 Scaling Disqus

Disqus talks about how they scale their Python web application to over 500 million visitors a month.

Video is available here: http://pycon.blip.tv/file/4880330/

Published in: Technology

  1. Python at 500 million visitors. DISQUS. Jason Yan (@jasonyan), David Cramer (@zeeg). Got feedback? Use hashtag #sckrw
  2. Agenda
       • What is DISQUS?
       • An Overview of the Infrastructure
       • Iterative Development and Deployment
       • Why We Love Python
  3. What is DISQUS? We are a comment system with an emphasis on connecting communities. http://disqus.com/about/ (dis·cuss • dĭ-skŭs')
  4. Embeddable Comments
  5. A Brief History
  6. Startup-ish
       • Founded just about 4 years ago
       • 16 employees, 8 engineers
       • Traffic increasing 15-20% a month
       • Flat organizational structure, every engineer is a product manager
       • Fast turnaround, new feature launches every week (sometimes daily)
  7. Traffic March 2008 through March 2011
  8. DjangoCon 2010
       • 17,000 requests/second peak
       • 450,000 websites
       • 15 million profiles
       • 75 million comments
       • 250 million visitors
  9. Six Months Later
       • 25,000 requests/second peak (up from 17,000)
       • 700,000 websites (up from 450,000)
       • 30 million profiles (up from 15 million)
       • 170 million comments (up from 75 million)
       • 500 million visitors (up from 250 million)
  10. Six Months Later
       • September 2010: 250 million uniques
       • March 2011: 500 million uniques
       • Handling over 2x the traffic
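As a rough arithmetic check, the 15-20% monthly growth quoted earlier in the talk can be compounded over six months and compared with the "over 2x" figure. This is only a sketch: it assumes growth was steady month to month, which the slides do not state, and `compound_growth` is a name made up for illustration.

```python
def compound_growth(monthly_rate, months):
    """Return the total growth multiple after compounding a monthly rate."""
    return (1 + monthly_rate) ** months


# Assumption: steady growth at the low and high end of the quoted range.
low = compound_growth(0.15, 6)
high = compound_growth(0.20, 6)
print(f"Six months at 15-20% a month: {low:.2f}x to {high:.2f}x")
```

Under that assumption, six months of growth works out to roughly 2.3x to 3.0x, which is in line with the doubling in unique visitors reported on the slide.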
  11. Six Months Later
       • September 2010: ~100 servers
       • March 2011: ~100 servers
       • Scale diagonally
  12. Scaling Diagonally
       • We still rent hardware, so there is no “commodity hardware”
           • Cheaper to upgrade
       • Everything is redundant
       • Partition data where you need to, scale partitions vertically
       • Upgrade hardware (more RAM, more drives, more cores)
           • Python apps tend to be CPU bound
  13. Infrastructure
       • 35% Web Servers (Apache + mod_wsgi)
       • 15% Utility Servers (Python scripts, background workers)
       • 20% Databases (PostgreSQL, Redis, Membase)
       • 20% Load Balancing / High Availability (HAProxy + Heartbeat)
       • 10% Caching servers (Memcached, Varnish)
       • Half of our servers run Python
  14. Python Web Servers
  15. Background Workers
       • Lots of tasks that don’t need to be done in the web application process:
           • Crawling URLs
           • Updating avatars
           • Email notifications
           • Analytics
           • Counters
  16. Background Workers (cont’d)
       • Most jobs are I/O bound
           • Slow external calls
               • Twitter is slow
               • Facebook is slow
       • Could parallelize with multiple processes, but...
  17. Background Workers (cont’d)
       • Multiple processes are a waste of memory
       • Use non-blocking I/O instead
           • Celery 2.2 adds support for gevent/eventlet
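The idea behind non-blocking I/O for I/O-bound jobs can be sketched as follows. The slides name gevent/eventlet through Celery 2.2; this example swaps in the standard library's `asyncio` (which postdates the talk) purely to keep the sketch self-contained. `fetch_avatar` is a hypothetical stand-in for a slow external call, not anything from the Disqus codebase.

```python
import asyncio
import time


async def fetch_avatar(user_id, delay=0.1):
    """Hypothetical slow external call, simulated with a non-blocking sleep."""
    await asyncio.sleep(delay)
    return f"avatar-for-{user_id}"


async def run_jobs(user_ids):
    """Run all jobs concurrently in one process; waits overlap instead of stacking."""
    return await asyncio.gather(*(fetch_avatar(uid) for uid in user_ids))


def timed_run(user_ids):
    """Return the job results and the wall-clock time taken."""
    start = time.perf_counter()
    results = asyncio.run(run_jobs(user_ids))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_run(range(20))
    print(f"{len(results)} jobs finished in {elapsed:.2f}s")
```

Twenty simulated calls of 0.1 seconds each would take about two seconds if run one after another. Here they share a single process and finish in roughly the time of one call, without the memory cost of twenty worker processes.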
  18. Monitoring
       • Application side: Graphite
           • Real-time(ish) graphing
           • Django front-end, Python backend
       • Etsy’s StatsD proxy to Graphite
           • UDP (fire and forget)
           • Batches updates
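A minimal sketch of the "fire and forget" idea: a counter is sent to a StatsD daemon as a single UDP datagram in StatsD's plain-text `name:value|c` format, and the application never waits for a reply. The helper names, the metric name, and the host are assumptions for illustration; 8125 is StatsD's conventional default port. This is not the client Disqus used.

```python
import socket


def format_counter(name, value=1):
    """Build a StatsD counter packet, e.g. b'comments.created:1|c'."""
    return f"{name}:{value}|c".encode("ascii")


def send_counter(name, value=1, host="127.0.0.1", port=8125):
    """Send one counter increment over UDP and ignore any failure."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_counter(name, value), (host, port))
    except OSError:
        # Fire and forget: metrics must never break the request path.
        pass
    finally:
        sock.close()


if __name__ == "__main__":
    # Hypothetical metric name; nothing needs to be listening for this to return.
    send_counter("comments.created")
```

Because UDP has no handshake or acknowledgement, a slow or missing metrics daemon costs the web process almost nothing, at the price of occasionally losing a data point.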
  19. Monitoring
       • Track application metrics
           • Errors, exceptions
           • New comments, users, sites, etc.
           • Anything
  20. Monitoring
       • Check out Etsy’s posts:
           • Measure Anything, Measure Everything: http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/
           • Tracking Every Release: http://codeascraft.etsy.com/2010/12/08/track-every-release/
  21. What about the code?
  22. Powered By Django
  23. Which means...
       • Largest Django-powered web application
       • We fork, and sometimes even monkey patch, to make it scale to our needs
           • Fortunately, we don’t have to do too much (Yay, Django!)
           • Unfortunately, we can’t use the whole of the Django internal components (and if we do, we do it in atypical ways)
  24. Iterative Development: Release Early, Release Often
  25. Iterating Quickly
       • Abstracting our application environment
           • Fewer dependencies locally
           • Rely on CI for dependency coverage
       • Heavy use of open source packages
           • No NIH syndrome
       • Deploy frequently, 3-7 times a day
       • Lots of branches, but master is “stable”
       • Realtime reporting on exceptions, metrics
       • Our test suite is the main blocker (slow)
  26. Dealing with Deploys
  27. Gargoyle: Being users of our product, we actively use early versions of features before public release. Deploy features to portions of a user base at a time to ensure smooth, measurable releases.
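Releasing a feature to a portion of the user base can be sketched as a deterministic percentage check. This is not Gargoyle's API; `in_rollout`, the feature name, and the hashing scheme are assumptions made up to illustrate the general technique of feature switches.

```python
import hashlib


def in_rollout(feature, user_id, percent):
    """Return True if this user falls inside the first `percent` of buckets.

    Hashing the feature name together with the user id keeps each user's
    answer stable across requests and independent between features.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # bucket in the range 0..99
    return bucket < percent


if __name__ == "__main__":
    # Hypothetical feature name, enabled for about 10% of users.
    enabled = sum(in_rollout("new_editor", uid, 10) for uid in range(10000))
    print(f"{enabled} of 10000 users see the feature")
```

Raising `percent` step by step widens the audience without moving any user who already has the feature back out of it, which keeps a staged release measurable.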
  28. The Deployment Problem
       • Make some changes locally
       • Run a subset of the test suite
       • Push your commits
       • CI server begins running tests
       • ....
  29. Waiting on the test suite...
  30. Rinse and Repeat
       • 30 minutes later, tests fail; start over
       • Finally, deploy to a subset of servers
           • Open Sentry (our exception logger)
           • Monitor Graphite
       • Deploy to 35 servers (~8 minutes)
           • Full rollback in < 30 seconds
  31. Wait, Sentry?
  32. Testing
  33. Testing Code
       • Test suite usually takes around 25 minutes
       • “Stuck” with Hudson (or Jenkins)
           • Most tightly integrated plugins are geared towards Java developers
       • Which framework do we use?
           • unittest(2), nose, doctests, LETTUCE?
           • We use unittest and nose
       • Need to report code coverage, speed of tests, pylint (or pyflakes)
  34. We Love Python
  35. Love-ish
       • Many of us started with PHP or Rails
       • Clean syntax, clear standards
           • All languages need PEP8.py and PyFlakes
       • Interpreted, fast... enough
       • Very easy to learn
           • We all started by learning Django first, then Python
  36. Haters Gonna Hate: If you could choose one thing in Python to hate on...
  37. Better package management
  38. What can we do?
       • Too many forks, too many frameworks
           • We need fewer clones and more combined effort
       • Improving existing Python solutions
       • More Python solutions for existing products
  39. Python Rocks!
  40. Questions? DISQUS. Psst, we’re hiring: [email_address]
  41. References
       • Sentry (our exception tracking tool): http://github.com/dcramer/django-sentry
       • Gargoyle (feature switches): https://github.com/disqus/gargoyle
       • Django DB Utils (collection of db helpers for Django): https://github.com/disqus/django-db-utils
       • Jenkins CI: http://jenkins-ci.org/
       • code.disqus.com
