SlideShare a Scribd company logo
1 of 56
Download to read offline
Big Bird.
(scaling twitter)
Rails Scales.
(but not out of the box)
First, Some Facts
• 600 requests per second. Growing fast.
• 180 Rails Instances (Mongrel). Growing fast.
• 1 Database Server (MySQL) + 1 Slave.
• 30-odd Processes for Misc. Jobs
• 8 Sun X4100s
• Many users, many updates.
Joy          Pain




Oct   Nov   Dec    Jan   Feb     March   Apr
IM IN UR RAILZ




     MAKIN EM GO FAST
It’s Easy, Really.
1. Realize Your Site is Slow
2. Optimize the Database
3. Cache the Hell out of Everything
4. Scale Messaging
5. Deal With Abuse
It’s Easy, Really.
1. Realize Your Site is Slow
2. Optimize the Database
3. Cache the Hell out of Everything
4. Scale Messaging
5. Deal With Abuse
6. Profit
the
     more
      you
        know

{ Part the First }
We Failed at This.
Don’t Be Like Us

• Munin
• Nagios
• AWStats & Google Analytics
• Exception Notifier / Exception Logger
• Immediately add reporting to track problems.
Test Everything

•   Start Before You Start

•   No Need To Be Fancy

•   Tests Will Save Your Life

•   Agile Becomes
    Important When Your
    Site Is Down
<!-- served to you through a copper wire by sampaati at 22 Apr
    15:02 in 343 ms (d 102 / r 217). thank you, come again. -->
 <!-- served to you through a copper wire by kolea.twitter.com at
22 Apr 15:02 in 235 ms (d 87 / r 130). thank you, come again. -->
 <!-- served to you through a copper wire by raven.twitter.com at
22 Apr 15:01 in 450 ms (d 96 / r 337). thank you, come again. -->



                  Benchmarks?
                       let your users do it.
 <!-- served to you through a copper wire by kolea.twitter.com at
22 Apr 15:00 in 409 ms (d 88 / r 307). thank you, come again. -->
  <!-- served to you through a copper wire by firebird at 22 Apr
   15:03 in 2094 ms (d 643 / r 1445). thank you, come again. -->
   <!-- served to you through a copper wire by quetzal at 22 Apr
     15:01 in 384 ms (d 70 / r 297). thank you, come again. -->
The Database
  { Part the Second }
“The Next Application I Build is Going
to Be Easily Partitionable” - S. Butterfield
“The Next Application I Build is Going
to Be Easily Partitionable” - S. Butterfield
“The Next Application I Build is Going
to Be Easily Partitionable” - S. Butterfield
Too Late.
Index Everything
class AddIndex < ActiveRecord::Migration
     def self.up
       add_index :users, :email
     end

     def self.down
       remove_index :users, :email
     end
   end


Repeat for any column that appears in a WHERE clause

             Rails won’t do this for you.
Denormalize A Lot
class DenormalizeFriendsIds < ActiveRecord::Migration
  def self.up
    add_column "users", "friends_ids", :text
  end

  def self.down
    remove_column "users", "friends_ids"
  end
end
class Friendship < ActiveRecord::Base
  belongs_to :user
  belongs_to :friend

 after_create :add_to_denormalized_friends
 after_destroy :remove_from_denormalized_friends

  def add_to_denormalized_friends
    user.friends_ids << friend.id
    user.friends_ids.uniq!
    user.save_without_validation
  end

  def remove_from_denormalized_friends
    user.friends_ids.delete(friend.id)
    user.save_without_validation
  end
end
Don’t be Stupid
bob.friends.map(&:email)
     Status.count()
“email like ‘%#{search}%’”
That’s where we are.
                  Seriously.
  If your Rails application is doing anything more
complex than that, you’re doing something wrong*.



        * or you observed the First Rule of Butterfield.
Partitioning Comes Later.
   (we’ll let you know how it goes)
The Cache
 { Part the Third }
MemCache
MemCache
MemCache
!
class Status < ActiveRecord::Base
  class << self
    def count_with_memcache(*args)
      return count_without_memcache unless args.empty?
      count = CACHE.get(“status_count”)
      if count.nil?
        count = count_without_memcache
        CACHE.set(“status_count”, count)
      end
      count
    end
    alias_method_chain :count, :memcache
  end
  after_create :increment_memcache_count
  after_destroy :decrement_memcache_count
  ...
end
class User < ActiveRecord::Base
  def friends_statuses
    ids = CACHE.get(“friends_statuses:#{id}”)
    Status.find(:all, :conditions => [“id IN (?)”, ids])
  end
end

class Status < ActiveRecord::Base
  after_create :update_caches
  def update_caches
    user.friends_ids.each do |friend_id|
      ids = CACHE.get(“friends_statuses:#{friend_id}”)
      ids.pop
      ids.unshift(id)
      CACHE.set(“friends_statuses:#{friend_id}”, ids)
    end
  end
end
The Future


            ve d
          ti r
         co
         Ac
           e
         R
90% API Requests
     Cache Them!
“There are only two hard things in CS:
 cache invalidation and naming things.”

             – Phil Karlton, via Tim Bray
Messaging
{ Part the Fourth }
You Already Knew All
That Other Stuff, Right?
Producer             Consumer
           Message
Producer             Consumer
           Queue
Producer             Consumer
DRb
• The Good:
 • Stupid Easy
 • Reasonably Fast
• The Bad:
 • Kinda Flaky
 • Zero Redundancy
 • Tightly Coupled
ejabberd


            Jabber Client
                (drb)




           Incoming         Outgoing
Presence
           Messages         Messages


              MySQL
Server
     DRb.start_service ‘druby://localhost:10000’, myobject




                         Client
myobject = DRbObject.new_with_uri(‘druby://localhost:10000’)
Rinda

• Shared Queue (TupleSpace)
• Built with DRb
• RingyDingy makes it stupid easy
• See Eric Hodel’s documentation
• O(N) for take(). Sigh.
Timestamp: 12/22/06 01:53:14 (4 months ago)
      Author: lattice
      Message: Fugly. Seriously. Fugly.




        SELECT * FROM messages WHERE
substring(truncate(id,0),-2,1) = #{@fugly_dist_idx}
It Scales.
(except it stopped on Tuesday)
Options

• ActiveMQ (Java)
• RabbitMQ (erlang)
• MySQL + Lightweight Locking
• Something Else?
erlang?


What are you doing?
 Stabbing my eyes out with a fork.
Starling

• Ruby, will be ported to something faster
• 4000 transactional msgs/s
• First pass written in 4 hours
• Speaks MemCache (set, get)
Use Messages to
Invalidate Cache
   (it’s really not that hard)
Abuse
{ Part the Fifth }
The Italians
9000 friends in 24 hours
        (doesn’t scale)
http://flickr.com/photos/heather/464504545/
http://flickr.com/photos/curiouskiwi/165229284/
http://flickr.com/photo_zoom.gne?id=42914103&size=l
http://flickr.com/photos/madstillz/354596905/
http://flickr.com/photos/laughingsquid/382242677/
http://flickr.com/photos/bng/46678227/

More Related Content

What's hot

Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka StreamsGuozhang Wang
 
PostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_CheatsheetPostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_CheatsheetLucian Oprea
 
Top 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark ApplicationsTop 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark ApplicationsSpark Summit
 
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike SteenbergenMeet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergendistributed matters
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...DataWorks Summit/Hadoop Summit
 
あなたのScalaを爆速にする7つの方法(日本語版)
あなたのScalaを爆速にする7つの方法(日本語版)あなたのScalaを爆速にする7つの方法(日本語版)
あなたのScalaを爆速にする7つの方法(日本語版)x1 ichi
 
Secrets of Performance Tuning Java on Kubernetes
Secrets of Performance Tuning Java on KubernetesSecrets of Performance Tuning Java on Kubernetes
Secrets of Performance Tuning Java on KubernetesBruno Borges
 
Velocity 2015 linux perf tools
Velocity 2015 linux perf toolsVelocity 2015 linux perf tools
Velocity 2015 linux perf toolsBrendan Gregg
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcachedJurriaan Persyn
 
twMVC#44 讓我們用 k6 來進行壓測吧
twMVC#44 讓我們用 k6 來進行壓測吧twMVC#44 讓我們用 k6 來進行壓測吧
twMVC#44 讓我們用 k6 來進行壓測吧twMVC
 
Capture the Streams of Database Changes
Capture the Streams of Database ChangesCapture the Streams of Database Changes
Capture the Streams of Database Changesconfluent
 
Prometheus at Preferred Networks
Prometheus at Preferred NetworksPrometheus at Preferred Networks
Prometheus at Preferred NetworksPreferred Networks
 
HTTP2 最速実装 〜入門編〜
HTTP2 最速実装 〜入門編〜HTTP2 最速実装 〜入門編〜
HTTP2 最速実装 〜入門編〜Kaoru Maeda
 
Grafana introduction
Grafana introductionGrafana introduction
Grafana introductionRico Chen
 
Consumer offset management in Kafka
Consumer offset management in KafkaConsumer offset management in Kafka
Consumer offset management in KafkaJoel Koshy
 
Learn O11y from Grafana ecosystem.
Learn O11y from Grafana ecosystem.Learn O11y from Grafana ecosystem.
Learn O11y from Grafana ecosystem.HungWei Chiu
 
ksqlDB: Building Consciousness on Real Time Events
ksqlDB: Building Consciousness on Real Time EventsksqlDB: Building Consciousness on Real Time Events
ksqlDB: Building Consciousness on Real Time Eventsconfluent
 

What's hot (20)

Kafka internals
Kafka internalsKafka internals
Kafka internals
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 
Envoy and Kafka
Envoy and KafkaEnvoy and Kafka
Envoy and Kafka
 
Docker infiniband
Docker infinibandDocker infiniband
Docker infiniband
 
PostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_CheatsheetPostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_Cheatsheet
 
Top 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark ApplicationsTop 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark Applications
 
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike SteenbergenMeet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
 
あなたのScalaを爆速にする7つの方法(日本語版)
あなたのScalaを爆速にする7つの方法(日本語版)あなたのScalaを爆速にする7つの方法(日本語版)
あなたのScalaを爆速にする7つの方法(日本語版)
 
Secrets of Performance Tuning Java on Kubernetes
Secrets of Performance Tuning Java on KubernetesSecrets of Performance Tuning Java on Kubernetes
Secrets of Performance Tuning Java on Kubernetes
 
Velocity 2015 linux perf tools
Velocity 2015 linux perf toolsVelocity 2015 linux perf tools
Velocity 2015 linux perf tools
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcached
 
twMVC#44 讓我們用 k6 來進行壓測吧
twMVC#44 讓我們用 k6 來進行壓測吧twMVC#44 讓我們用 k6 來進行壓測吧
twMVC#44 讓我們用 k6 來進行壓測吧
 
Capture the Streams of Database Changes
Capture the Streams of Database ChangesCapture the Streams of Database Changes
Capture the Streams of Database Changes
 
Prometheus at Preferred Networks
Prometheus at Preferred NetworksPrometheus at Preferred Networks
Prometheus at Preferred Networks
 
HTTP2 最速実装 〜入門編〜
HTTP2 最速実装 〜入門編〜HTTP2 最速実装 〜入門編〜
HTTP2 最速実装 〜入門編〜
 
Grafana introduction
Grafana introductionGrafana introduction
Grafana introduction
 
Consumer offset management in Kafka
Consumer offset management in KafkaConsumer offset management in Kafka
Consumer offset management in Kafka
 
Learn O11y from Grafana ecosystem.
Learn O11y from Grafana ecosystem.Learn O11y from Grafana ecosystem.
Learn O11y from Grafana ecosystem.
 
ksqlDB: Building Consciousness on Real Time Events
ksqlDB: Building Consciousness on Real Time EventsksqlDB: Building Consciousness on Real Time Events
ksqlDB: Building Consciousness on Real Time Events
 

Similar to Scaling Twitter

Hiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret SauceHiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret SauceJesse Vincent
 
Beijing Perl Workshop 2008 Hiveminder Secret Sauce
Beijing Perl Workshop 2008 Hiveminder Secret SauceBeijing Perl Workshop 2008 Hiveminder Secret Sauce
Beijing Perl Workshop 2008 Hiveminder Secret SauceJesse Vincent
 
Microblogging via XMPP
Microblogging via XMPPMicroblogging via XMPP
Microblogging via XMPPStoyan Zhekov
 
Aprendendo solid com exemplos
Aprendendo solid com exemplosAprendendo solid com exemplos
Aprendendo solid com exemplosvinibaggio
 
Socket applications
Socket applicationsSocket applications
Socket applicationsJoão Moura
 
Dynomite at Erlang Factory
Dynomite at Erlang FactoryDynomite at Erlang Factory
Dynomite at Erlang Factorymoonpolysoft
 
Performance Optimization of Rails Applications
Performance Optimization of Rails ApplicationsPerformance Optimization of Rails Applications
Performance Optimization of Rails ApplicationsSerge Smetana
 
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...MongoDB
 
WebPerformance: Why and How? – Stefan Wintermeyer
WebPerformance: Why and How? – Stefan WintermeyerWebPerformance: Why and How? – Stefan Wintermeyer
WebPerformance: Why and How? – Stefan WintermeyerElixir Club
 
NPW2009 - my.opera.com scalability v2.0
NPW2009 - my.opera.com scalability v2.0NPW2009 - my.opera.com scalability v2.0
NPW2009 - my.opera.com scalability v2.0Cosimo Streppone
 
Fisl - Deployment
Fisl - DeploymentFisl - Deployment
Fisl - DeploymentFabio Akita
 
SD, a P2P bug tracking system
SD, a P2P bug tracking systemSD, a P2P bug tracking system
SD, a P2P bug tracking systemJesse Vincent
 
RubyEnRails2007 - Dr Nic Williams - Keynote
RubyEnRails2007 - Dr Nic Williams - KeynoteRubyEnRails2007 - Dr Nic Williams - Keynote
RubyEnRails2007 - Dr Nic Williams - KeynoteDr Nic Williams
 
MongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsServer Density
 
JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...
JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...
JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...PROIDEA
 
Web 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web AppsWeb 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web Appsadunne
 
How to avoid hanging yourself with Rails
How to avoid hanging yourself with RailsHow to avoid hanging yourself with Rails
How to avoid hanging yourself with RailsRowan Hick
 
Monkeybars in the Manor
Monkeybars in the ManorMonkeybars in the Manor
Monkeybars in the Manormartinbtt
 

Similar to Scaling Twitter (20)

Hiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret SauceHiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret Sauce
 
Beijing Perl Workshop 2008 Hiveminder Secret Sauce
Beijing Perl Workshop 2008 Hiveminder Secret SauceBeijing Perl Workshop 2008 Hiveminder Secret Sauce
Beijing Perl Workshop 2008 Hiveminder Secret Sauce
 
Microblogging via XMPP
Microblogging via XMPPMicroblogging via XMPP
Microblogging via XMPP
 
Aprendendo solid com exemplos
Aprendendo solid com exemplosAprendendo solid com exemplos
Aprendendo solid com exemplos
 
Socket applications
Socket applicationsSocket applications
Socket applications
 
From crash to testcase
From crash to testcaseFrom crash to testcase
From crash to testcase
 
Dynomite at Erlang Factory
Dynomite at Erlang FactoryDynomite at Erlang Factory
Dynomite at Erlang Factory
 
Performance Optimization of Rails Applications
Performance Optimization of Rails ApplicationsPerformance Optimization of Rails Applications
Performance Optimization of Rails Applications
 
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
 
WebPerformance: Why and How? – Stefan Wintermeyer
WebPerformance: Why and How? – Stefan WintermeyerWebPerformance: Why and How? – Stefan Wintermeyer
WebPerformance: Why and How? – Stefan Wintermeyer
 
NPW2009 - my.opera.com scalability v2.0
NPW2009 - my.opera.com scalability v2.0NPW2009 - my.opera.com scalability v2.0
NPW2009 - my.opera.com scalability v2.0
 
Fisl - Deployment
Fisl - DeploymentFisl - Deployment
Fisl - Deployment
 
SD, a P2P bug tracking system
SD, a P2P bug tracking systemSD, a P2P bug tracking system
SD, a P2P bug tracking system
 
RubyEnRails2007 - Dr Nic Williams - Keynote
RubyEnRails2007 - Dr Nic Williams - KeynoteRubyEnRails2007 - Dr Nic Williams - Keynote
RubyEnRails2007 - Dr Nic Williams - Keynote
 
Sinatra for REST services
Sinatra for REST servicesSinatra for REST services
Sinatra for REST services
 
MongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & Analytics
 
JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...
JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...
JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...
 
Web 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web AppsWeb 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web Apps
 
How to avoid hanging yourself with Rails
How to avoid hanging yourself with RailsHow to avoid hanging yourself with Rails
How to avoid hanging yourself with Rails
 
Monkeybars in the Manor
Monkeybars in the ManorMonkeybars in the Manor
Monkeybars in the Manor
 

More from Blaine

Social Privacy for HTTP over Webfinger
Social Privacy for HTTP over WebfingerSocial Privacy for HTTP over Webfinger
Social Privacy for HTTP over WebfingerBlaine
 
Social Software for Robots
Social Software for RobotsSocial Software for Robots
Social Software for RobotsBlaine
 
Building the Real Time Web
Building the Real Time WebBuilding the Real Time Web
Building the Real Time WebBlaine
 
You & Me & Everyone We Know
You & Me & Everyone We KnowYou & Me & Everyone We Know
You & Me & Everyone We KnowBlaine
 
Social Software for Robots
Social Software for RobotsSocial Software for Robots
Social Software for RobotsBlaine
 

More from Blaine (6)

Social Privacy for HTTP over Webfinger
Social Privacy for HTTP over WebfingerSocial Privacy for HTTP over Webfinger
Social Privacy for HTTP over Webfinger
 
Social Software for Robots
Social Software for RobotsSocial Software for Robots
Social Software for Robots
 
OAuth
OAuthOAuth
OAuth
 
Building the Real Time Web
Building the Real Time WebBuilding the Real Time Web
Building the Real Time Web
 
You & Me & Everyone We Know
You & Me & Everyone We KnowYou & Me & Everyone We Know
You & Me & Everyone We Know
 
Social Software for Robots
Social Software for RobotsSocial Software for Robots
Social Software for Robots
 

Recently uploaded

Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...apidays
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusZilliz
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbuapidays
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 

Recently uploaded (20)

Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 

Scaling Twitter

  • 2. Rails Scales. (but not out of the box)
  • 3. First, Some Facts • 600 requests per second. Growing fast. • 180 Rails Instances (Mongrel). Growing fast. • 1 Database Server (MySQL) + 1 Slave. • 30-odd Processes for Misc. Jobs • 8 Sun X4100s • Many users, many updates.
  • 4.
  • 5.
  • 6.
  • 7. Joy Pain Oct Nov Dec Jan Feb March Apr
  • 8. IM IN UR RAILZ MAKIN EM GO FAST
  • 9. It’s Easy, Really. 1. Realize Your Site is Slow 2. Optimize the Database 3. Cache the Hell out of Everything 4. Scale Messaging 5. Deal With Abuse
  • 10. It’s Easy, Really. 1. Realize Your Site is Slow 2. Optimize the Database 3. Cache the Hell out of Everything 4. Scale Messaging 5. Deal With Abuse 6. Profit
  • 11. the more you know { Part the First }
  • 12. We Failed at This.
  • 13. Don’t Be Like Us • Munin • Nagios • AWStats & Google Analytics • Exception Notifier / Exception Logger • Immediately add reporting to track problems.
  • 14. Test Everything • Start Before You Start • No Need To Be Fancy • Tests Will Save Your Life • Agile Becomes Important When Your Site Is Down
  • 15. <!-- served to you through a copper wire by sampaati at 22 Apr 15:02 in 343 ms (d 102 / r 217). thank you, come again. --> <!-- served to you through a copper wire by kolea.twitter.com at 22 Apr 15:02 in 235 ms (d 87 / r 130). thank you, come again. --> <!-- served to you through a copper wire by raven.twitter.com at 22 Apr 15:01 in 450 ms (d 96 / r 337). thank you, come again. --> Benchmarks? let your users do it. <!-- served to you through a copper wire by kolea.twitter.com at 22 Apr 15:00 in 409 ms (d 88 / r 307). thank you, come again. --> <!-- served to you through a copper wire by firebird at 22 Apr 15:03 in 2094 ms (d 643 / r 1445). thank you, come again. --> <!-- served to you through a copper wire by quetzal at 22 Apr 15:01 in 384 ms (d 70 / r 297). thank you, come again. -->
  • 16. The Database { Part the Second }
  • 17. “The Next Application I Build is Going to Be Easily Partitionable” - S. Butterfield
  • 18. “The Next Application I Build is Going to Be Easily Partitionable” - S. Butterfield
  • 19. “The Next Application I Build is Going to Be Easily Partitionable” - S. Butterfield
  • 22. class AddIndex < ActiveRecord::Migration def self.up add_index :users, :email end def self.down remove_index :users, :email end end Repeat for any column that appears in a WHERE clause Rails won’t do this for you.
  • 24. class DenormalizeFriendsIds < ActiveRecord::Migration def self.up add_column "users", "friends_ids", :text end def self.down remove_column "users", "friends_ids" end end
  • 25. class Friendship < ActiveRecord::Base belongs_to :user belongs_to :friend after_create :add_to_denormalized_friends after_destroy :remove_from_denormalized_friends def add_to_denormalized_friends user.friends_ids << friend.id user.friends_ids.uniq! user.save_without_validation end def remove_from_denormalized_friends user.friends_ids.delete(friend.id) user.save_without_validation end end
  • 27. bob.friends.map(&:email) Status.count() “email like ‘%#{search}%’”
  • 28. That’s where we are. Seriously. If your Rails application is doing anything more complex than that, you’re doing something wrong*. * or you observed the First Rule of Butterfield.
  • 29. Partitioning Comes Later. (we’ll let you know how it goes)
  • 30. The Cache { Part the Third }
  • 34. !
  • 35. class Status < ActiveRecord::Base class << self def count_with_memcache(*args) return count_without_memcache unless args.empty? count = CACHE.get(“status_count”) if count.nil? count = count_without_memcache CACHE.set(“status_count”, count) end count end alias_method_chain :count, :memcache end after_create :increment_memcache_count after_destroy :decrement_memcache_count ... end
  • 36. class User < ActiveRecord::Base def friends_statuses ids = CACHE.get(“friends_statuses:#{id}”) Status.find(:all, :conditions => [“id IN (?)”, ids]) end end class Status < ActiveRecord::Base after_create :update_caches def update_caches user.friends_ids.each do |friend_id| ids = CACHE.get(“friends_statuses:#{friend_id}”) ids.pop ids.unshift(id) CACHE.set(“friends_statuses:#{friend_id}”, ids) end end end
  • 37. The Future ve d ti r co Ac e R
  • 38. 90% API Requests Cache Them!
  • 39. “There are only two hard things in CS: cache invalidation and naming things.” – Phil Karlton, via Tim Bray
  • 41. You Already Knew All That Other Stuff, Right?
  • 42. Producer Consumer Message Producer Consumer Queue Producer Consumer
  • 43. DRb • The Good: • Stupid Easy • Reasonably Fast • The Bad: • Kinda Flaky • Zero Redundancy • Tightly Coupled
  • 44. ejabberd Jabber Client (drb) Incoming Outgoing Presence Messages Messages MySQL
  • 45. Server DRb.start_service ‘druby://localhost:10000’, myobject Client myobject = DRbObject.new_with_uri(‘druby://localhost:10000’)
  • 46. Rinda • Shared Queue (TupleSpace) • Built with DRb • RingyDingy makes it stupid easy • See Eric Hodel’s documentation • O(N) for take(). Sigh.
  • 47. Timestamp: 12/22/06 01:53:14 (4 months ago) Author: lattice Message: Fugly. Seriously. Fugly. SELECT * FROM messages WHERE substring(truncate(id,0),-2,1) = #{@fugly_dist_idx}
  • 48. It Scales. (except it stopped on Tuesday)
  • 49. Options • ActiveMQ (Java) • RabbitMQ (erlang) • MySQL + Lightweight Locking • Something Else?
  • 50. erlang? What are you doing? Stabbing my eyes out with a fork.
  • 51. Starling • Ruby, will be ported to something faster • 4000 transactional msgs/s • First pass written in 4 hours • Speaks MemCache (set, get)
  • 52. Use Messages to Invalidate Cache (it’s really not that hard)
  • 55. 9000 friends in 24 hours (doesn’t scale)