PHP At 5000 Requests Per Second: Hootsuite’s Scaling Story

Bill Monkman, Lead Engineer at Hootsuite, presenting on how Hootsuite went from zero to hundreds of millions of requests per day with its PHP codebase, and how dealing with that growth has shaped its future direction. Tips, optimizations, and horror stories from a rapidly-scaling PHP startup.

Video: https://www.youtube.com/watch?v=TZGeBAIMPII

  1. PHP at 5000 Requests / Sec: Hootsuite’s Scaling Story. Bill Monkman, Lead Technical Engineer - Platform, @bmonkman
  2. Overview - Selected Current Architecture: Users; Nginx load balancers (lb1, lb2, lb3, ...); Nginx web servers (web1, web2, web3, ...) with PHP-FPM; Memcached cluster (mem1, ...); MySQL cluster (master, slave); MongoDB cluster (two master/slave pairs: shard1, shard2); Gearman cluster (geard1, geard2, worker1, ...); Services
  3. Technologies - at first • Apache • PHP • MySQL
  4. Then...
  5. Problem: It’s hard to scale MySQL horizontally
  6. Solution - Caching: Memcached. ● Distributed cache, cluster of boxes with lots of RAM, trivial to scale ● Cache as much as possible, invalidate only when necessary ● Use cache instead of DB ● No joins - decouple entities (collection caching) ● Twemproxy!
  7. “There are only two hard things in Computer Science: cache invalidation and naming things.” • Phil Karlton
  8. Solution - Caching: MvcModelBaseCaching, MvcModelBase, MvcModelMysql, SocialNetwork
  9. Solution - Caching: SELECT * FROM member WHERE org_id=888; set individual cache records: member_1 {data}, member_5 {data}, member_9 {data}; set collection cache: member_org_888 [1,5,9]; automatic invalidation of collection cache
  10. Solution - Caching. Problem: It’s hard to scale MySQL horizontally. Now: ● No need to scale MySQL ● Able to serve the whole site on 1 MySQL server ● 500 MySQL SELECTs per second. 50,000 Memcached GETs. ● 99+% hit rate
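The collection caching described on slide 9 can be sketched roughly as follows. This is a minimal illustration, not Hootsuite's code: `ArrayCache` is an in-memory stand-in for a Memcached client, and `MemberRepository`, its `$rows` array (standing in for the MySQL table) and the `dbQueries` counter are all hypothetical names introduced for the example. Only the key names `member_<id>` and `member_org_<orgId>` come from the slide.

```php
<?php
declare(strict_types=1);

// In-memory stand-in for a Memcached client (assumption: get/set/delete only).
final class ArrayCache
{
    private array $items = [];

    public function get(string $key) { return $this->items[$key] ?? null; }
    public function set(string $key, $value): void { $this->items[$key] = $value; }
    public function delete(string $key): void { unset($this->items[$key]); }
}

// Hypothetical repository; $rows stands in for the MySQL `member` table.
final class MemberRepository
{
    public int $dbQueries = 0;
    private ArrayCache $cache;
    private array $rows;

    public function __construct(ArrayCache $cache, array $rows)
    {
        $this->cache = $cache;
        $this->rows = $rows;
    }

    /** Returns all members of an organization, preferring the cache. */
    public function findByOrg(int $orgId): array
    {
        $ids = $this->cache->get("member_org_{$orgId}");
        if ($ids === null) {
            // Cache miss: one "query", then fill the record and collection caches.
            $this->dbQueries++;
            $ids = [];
            foreach ($this->rows as $row) {
                if ($row['org_id'] === $orgId) {
                    $this->cache->set("member_{$row['id']}", $row);
                    $ids[] = $row['id'];
                }
            }
            $this->cache->set("member_org_{$orgId}", $ids);
        }
        // Resolve ids to records; a real implementation would also refetch
        // any individual record that had been evicted.
        $members = [];
        foreach ($ids as $id) {
            $members[] = $this->cache->get("member_{$id}");
        }
        return $members;
    }

    /** Adds a member and invalidates the collection cache it belongs to. */
    public function add(array $row): void
    {
        $this->rows[] = $row;
        $this->cache->set("member_{$row['id']}", $row);
        $this->cache->delete("member_org_{$row['org_id']}");
    }
}

$repo = new MemberRepository(new ArrayCache(), [
    ['id' => 1, 'org_id' => 888, 'name' => 'a'],
    ['id' => 5, 'org_id' => 888, 'name' => 'b'],
    ['id' => 9, 'org_id' => 888, 'name' => 'c'],
]);
$members = $repo->findByOrg(888); // first call hits the "database"
$members = $repo->findByOrg(888); // second call is served from cache
```

Storing only ids in the collection entry means a change to one member touches one record key, while membership changes invalidate one collection key; that separation is what makes the "no joins" approach workable.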
  11. Then...
  12. Problem: Need a way to perform asynchronous, distributed tasks using a single-threaded language.
  13. Solution - Gearman: Gearman. ● Distribute work to other servers to handle (workers also using PHP, same codebase) ● Precursor to SOA where everything is truly distributed ● Many other solutions, queueing systems.
  14. Solution - Gearman: geard1, geard2; gearworker1, gearworker2, gearworker6
  15. Solution - Gearman. Problem: Need a way to perform asynchronous, distributed tasks using a single-threaded language. Now: ● Moved key tasks to Gearman ● Another cluster, scalable separately from web ● Discrete tasks, callable sync or async
  16. Then...
  17. Problem: Need to store data with the potential to grow too big to handle effectively with MySQL.
  18. Solution - MongoDB: MongoDB. ● Certain data did not need to be highly relational ● NoSQL DB, many other solutions these days ● Mongo can be a pain, lots of moving parts ● Had to make our own sequencer where auto-incremented ids were necessary
  19. Solution - MongoDB. Problem: Need to store data with the potential to grow too big to handle effectively with MySQL. Now: ● Multiple clusters containing amounts of data that likely would have crushed MySQL ● Billions of rows per collection, many TB of data on disk
  20. Technologies • Apache • PHP • MySQL • Memcached • Gearman • MongoDB
  21. Then...
  22. Problem: With a codebase and an engineering team increasing in size, how do we keep up the pace of development and maintain control of the system? (SVN, big branches, merge hell)
  23. Solution - Dark Launching: Dark Launching. ● Wrap code in a block with a specific name ● That name will appear in a management page ● Can control whether or not that block is executed by modifying its value ● Boolean, random percentage, session-based, member list, organization list, etc.
  24. Solution - Dark Launching:
      if (In_Feature::isEnabled('TWITTER_ADS')) {
          // execute new code
      } else {
          // execute old code
      }
  25. Dark Launching - Reasons • Control your code • Limit risk -> raise confidence -> speed up pace of releases • “Branching in Production” • Learning happens in Production
  26. Solution - Dark Launching. Problem: With a codebase and an engineering team increasing in size, how do we keep up the pace of development and maintain control of the system? Now: ● Work fast with more confidence ● Huge amount of control over production systems ● Typically 10+ code releases to production per day ● Push-based distribution with Consul
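A rough sketch of how the rule types from slide 23 might be evaluated is shown below. The deck does not show the internals of `In_Feature`, so the `FeatureFlags` class, its rule array format and the flag name `NEW_STREAMS` are assumptions made for illustration. One deliberate substitution: where the slide says "random percentage", this sketch uses a deterministic hash bucket per member so that a given member gets a stable answer as the percentage is ramped up.

```php
<?php
declare(strict_types=1);

// Hypothetical flag evaluator; not the In_Feature class from the deck.
final class FeatureFlags
{
    /** @var array<string, array> rule definitions keyed by flag name */
    private array $rules;

    public function __construct(array $rules)
    {
        $this->rules = $rules;
    }

    public function isEnabled(string $name, int $memberId): bool
    {
        // Flags with no rule are treated as switched off.
        $rule = $this->rules[$name] ?? ['type' => 'boolean', 'value' => false];

        switch ($rule['type']) {
            case 'boolean':
                return (bool) $rule['value'];
            case 'percentage':
                // Deterministic bucket in 0..99 derived from flag and member.
                $bucket = abs(crc32($name . ':' . $memberId)) % 100;
                return $bucket < $rule['value'];
            case 'member_list':
                return in_array($memberId, $rule['value'], true);
            default:
                return false; // unknown rule types fail closed
        }
    }
}

$flags = new FeatureFlags([
    'TWITTER_ADS' => ['type' => 'member_list', 'value' => [42]],
    'NEW_STREAMS' => ['type' => 'percentage', 'value' => 0],
]);

if ($flags->isEnabled('TWITTER_ADS', 42)) {
    echo "new code path\n";
} else {
    echo "old code path\n";
}
```

In a setup like the one the slides describe, the rule array would be loaded from wherever the management page stores its values rather than being hard-coded.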
  27. Then...
  28. Problem: With a rapidly increasing codebase and number of users / traffic, how do we keep visibility into the performance of the code?
  29. Solution - Monitoring: Statsd / Graphite. Logstash / Elasticsearch / Kibana. Sensu. ● Statsd for metrics ● Logstash for log events ● Sensu for monitoring / alerting
  30. Solution - Monitoring: Statsd::timing('apiCall.facebookGraph', microtime(true) - $startTime);
  31. Solution - Monitoring: Logger::event('user liked from in-stream', In_Log::CATEGORY_UX, $logData);
  32. Solution - Monitoring • Visibility into the performance and behaviour of your application • Iterate upon your code, measure results • Pairs well with dark launching • Also systems like New Relic
  33. Solution - Monitoring. Problem: With a rapidly increasing codebase and number of users / traffic, how do we keep visibility into the performance of the code? Now: ● Able to watch performance / behaviour in real time. ● Able to view important events either in aggregate or at a very granular level ● Able to control the system and watch the effect of changes
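For readers unfamiliar with what a call like the `Statsd::timing(...)` line on slide 30 sends, here is a minimal sketch of the StatsD wire format for timers. The deck's `Statsd` wrapper is not shown, so the two functions below are hypothetical helpers, and the host and port (`127.0.0.1`, `8125`, the conventional StatsD default) are assumptions. StatsD timers are expressed in milliseconds, so the sketch converts from the seconds that `microtime(true)` differences produce.

```php
<?php
declare(strict_types=1);

/** Formats a StatsD timer line of the form "<metric>:<milliseconds>|ms". */
function statsdTimingLine(string $metric, float $seconds): string
{
    return sprintf('%s:%d|ms', $metric, (int) round($seconds * 1000));
}

/** Sends one metric line over UDP; failures are ignored (fire and forget). */
function statsdSend(string $line, string $host = '127.0.0.1', int $port = 8125): void
{
    $socket = @fsockopen('udp://' . $host, $port, $errno, $errstr, 1.0);
    if ($socket === false) {
        return; // metrics must never break the request
    }
    @fwrite($socket, $line);
    fclose($socket);
}

$startTime = microtime(true);
usleep(1000); // stand-in for the API call being measured
statsdSend(statsdTimingLine('apiCall.facebookGraph', microtime(true) - $startTime));
```

Because the transport is UDP, the sender does not wait for the metrics daemon, which is part of why this kind of instrumentation is cheap enough to sprinkle through request-handling code.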
  34. Optimizations
  35. Optimizations • Things expand beyond their initial scope • Case in point: Translations
  36. Optimizations - Push work to users • Within reason, push work up to users • Make your users into a distributed processing grid • e.g. Stream rendering
  37. Optimizations - Performance / Risks • Performance is more important than clean code and business requirements (in the instances where they may be mutually exclusive) • Fine line between future proofing and premature optimization • Don’t add burdensome processes, but make it easy for your team to do things the right way • Know your weak spots, protect against abuse
  38. Technologies: Linux, Nginx, ElasticSearch, Varnish, PHP-FPM, MySQL, Jenkins, Scala, MongoDB, Consul, Gearman, Redis, Akka, Python, Memcached, HAProxy, jQuery, ZeroMQ, Backbone, RabbitMQ, EC2, Zend, Docker, Cloudfront CDN, Logstash, Zookeeper, Kibana, Statsd/Graphite, Packer, Vagrant, Nagios, VirtualBox, Spark/Shark, Sensu, Symfony, Riak, Composer, Websockets, Comet, Hadoop, Ansible, Git, Webpack, Redshift
  39. Problem: With a huge and growing monolithic codebase and over 80 engineers, how to keep scaling in a manageable way?
  40. Solution - SOA: SOA. ● Split up the system into independent services which communicate only via APIs ● Teams can work on their own services with encapsulated business logic and have their own deployment schedules. ● We chose to use Scala/Akka for services, communicating via ZeroMQ ● SOA transition made easier by the “no joins” philosophy ● Tons of work
  41. Solution - SOA: SOM. ● “Service Oriented Monolith” ● When splitting up a monolithic codebase, dependencies are what kill you ● Fulfill dependencies by writing interim services using existing PHP code ● Maintain the contract and future Scala services will be drop-in replacements
  42. Solution - SOA. Problem: With a huge and growing monolithic codebase and over 130 engineers, how to keep scaling in a manageable way? Today: ● Transitioning to Scala SOA ● PHP will still be used as the Façade, a thin layer built on top of the business logic of the services it interacts with.
  43. Conclusion
  44. Thank You! Bill Monkman, @bmonkman. More Info: code.hootsuite.com
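The "maintain the contract" idea from slide 41 can be illustrated with a small sketch. Everything named here is hypothetical (`MemberService`, `LegacyPhpMemberService`, `RemoteMemberService`, `greeting`, and the JSON request shape); the deck says services communicate via ZeroMQ, but the transport is reduced to a plain callable so the example stays self-contained.

```php
<?php
declare(strict_types=1);

// Hypothetical contract that callers in the facade depend on.
interface MemberService
{
    public function getMember(int $id): ?array;
}

// Interim service: fulfils the contract by reusing existing monolith code.
final class LegacyPhpMemberService implements MemberService
{
    private array $legacyRows;

    public function __construct(array $legacyRows)
    {
        $this->legacyRows = $legacyRows;
    }

    public function getMember(int $id): ?array
    {
        return $this->legacyRows[$id] ?? null;
    }
}

// Later replacement: same contract, but delegates to a remote service.
final class RemoteMemberService implements MemberService
{
    /** @var callable takes a JSON request string, returns a JSON response string */
    private $transport;

    public function __construct(callable $transport)
    {
        $this->transport = $transport;
    }

    public function getMember(int $id): ?array
    {
        $request = json_encode(['getMember' => $id]);
        $response = json_decode(($this->transport)($request), true);
        return is_array($response) ? $response : null;
    }
}

// Caller code depends only on the interface, so swapping is invisible to it.
function greeting(MemberService $members, int $id): string
{
    $member = $members->getMember($id);
    return $member === null ? 'unknown member' : 'Hello, ' . $member['name'];
}

$legacy = new LegacyPhpMemberService([7 => ['id' => 7, 'name' => 'Ada']]);
$remote = new RemoteMemberService(
    fn (string $request): string => json_encode(['id' => 7, 'name' => 'Ada'])
);
echo greeting($legacy, 7), "\n";
echo greeting($remote, 7), "\n";
```

The point of the sketch is that `greeting` never changes when the implementation behind the interface moves out of the PHP codebase.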
