The Etsy Shard Architecture
    Starts With S and Ends With Hard


        jgoulah@etsy.com / @johngoulah
1.5B page views / mo.
525MM sales in 2011
40MM unique visitors/mo.
800K shops / 150 countries
25K+ queries/sec avg
3TB InnoDB buffer pool
15TB+ data stored
99.99% queries under 1ms
50+ MySQL servers

      Server Spec
      HP DL 380 G7
       96GB RAM
16 spindles / 1TB RAID 10
        24 Core
Ross Snyder
Scaling Etsy - What Went Wrong, What Went Right
           http://bit.ly/rpcxtP


             Matt Graham
 Migrating From PG to MySQL Without Downtime
          http://bit.ly/rQpqZG
Architecture
Redundancy
Master - Master

  R/W      R/W

 Side A   Side B
Scalability
shard 1        shard 2                shard N

                               ...
Migrate     Migrate           Migrate


                shard N + 1
Bird’s-Eye View
tickets             index
 Unique IDs          Shard Lookup

shard 1             shard 2           shard N
          Store/Retrieve Data
Basics
users_groups

user_id   group_id
  1          A
  1          B
  2          A
  2          C
  3          A
  3          B
  3          C
users_groups

shard 1                    shard 2
user_id   group_id         user_id   group_id
  1          A               3          A
  1          B               3          B
  2          A               3          C
  2          C
Index Servers
Shards NOT Determined by
          key hashing
        range partitions
    partitioning by function
Look-Up Data
index     select shard_id from user_index
                  where user_id = X

                    returns 1

shard 1   select join_date from users
                  where user_id = X

                    returns 2012-02-05

index
shard 1   shard 2   shard N
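The two-step lookup on these slides (ask the index for the shard, then query that shard) can be sketched as follows. This is a minimal illustration, not Etsy's ORM: in-memory SQLite databases stand in for the MySQL index and shard servers, and the helper names, the second user and its join date are made up for the example.

```python
import sqlite3

def make_db(schema, insert_sql, rows):
    """Create an in-memory SQLite database standing in for one MySQL server."""
    db = sqlite3.connect(":memory:")
    db.execute(schema)
    db.executemany(insert_sql, rows)
    return db

USERS_SCHEMA = "CREATE TABLE users (user_id INTEGER PRIMARY KEY, join_date TEXT)"

# Hypothetical index server: maps each user to the shard that holds its rows.
index_db = make_db(
    "CREATE TABLE user_index (user_id INTEGER PRIMARY KEY, shard_id INTEGER)",
    "INSERT INTO user_index VALUES (?, ?)",
    [(500, 1), (501, 2)],
)

# Hypothetical shards, keyed by shard_id.
shards = {
    1: make_db(USERS_SCHEMA, "INSERT INTO users VALUES (?, ?)", [(500, "2012-02-05")]),
    2: make_db(USERS_SCHEMA, "INSERT INTO users VALUES (?, ?)", [(501, "2011-11-20")]),
}

def lookup_shard(user_id):
    """Step 1: ask the index which shard owns this user."""
    row = index_db.execute(
        "SELECT shard_id FROM user_index WHERE user_id = ?", (user_id,)
    ).fetchone()
    if row is None:
        raise KeyError(f"user {user_id} is not in the index")
    return row[0]

def get_join_date(user_id):
    """Step 2: run the real query on the shard the index pointed to."""
    shard_id = lookup_shard(user_id)
    row = shards[shard_id].execute(
        "SELECT join_date FROM users WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None

print(lookup_shard(500), get_join_date(500))  # 1 2012-02-05
```

Because the mapping lives in a table rather than in a hash function, moving a user only requires updating one index row.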
Ticket Servers
Globally Unique ID
CREATE TABLE `tickets` (
 `id` bigint(20) unsigned NOT NULL auto_increment,
 `stub` char(1) NOT NULL default '',
 PRIMARY KEY (`id`),
 UNIQUE KEY `stub` (`stub`)
) ENGINE=MyISAM
Ticket Generation
REPLACE INTO tickets (stub) VALUES ('a');
SELECT LAST_INSERT_ID();

SELECT * FROM tickets;
      id            stub

    4589294          a
tickets A
            auto-increment-increment = 2
              auto-increment-offset = 1

tickets B
            auto-increment-increment = 2
              auto-increment-offset = 2

  NOT master-master
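A rough sketch of why two independent ticket servers never hand out the same ID: with an increment of 2, one server only produces odd values and the other only even values. The class below is an in-memory stand-in for the `tickets` table, and the try-the-next-server fallback in `next_ticket` is an assumption for illustration; the slides do not say how clients choose between the two servers.

```python
class TicketServer:
    """In-memory stand-in for one MySQL ticket server (hypothetical)."""

    def __init__(self, offset, increment):
        self.offset = offset        # auto-increment-offset
        self.increment = increment  # auto-increment-increment
        self.last_id = 0            # plays the role of the single row in `tickets`
        self.up = True

    def next_id(self):
        """Equivalent of REPLACE INTO tickets ... ; SELECT LAST_INSERT_ID()."""
        if not self.up:
            raise ConnectionError("ticket server is down")
        if self.last_id == 0:
            self.last_id = self.offset
        else:
            self.last_id += self.increment
        return self.last_id

def next_ticket(servers, start=0):
    """Try servers in order beginning at `start`; return the first ID handed out."""
    for i in range(len(servers)):
        server = servers[(start + i) % len(servers)]
        try:
            return server.next_id()
        except ConnectionError:
            continue
    raise RuntimeError("no ticket server available")

tickets_a = TicketServer(offset=1, increment=2)  # 1, 3, 5, ...
tickets_b = TicketServer(offset=2, increment=2)  # 2, 4, 6, ...
servers = [tickets_a, tickets_b]

ids = [next_ticket(servers, start=i) for i in range(6)]
print(ids)  # [1, 2, 3, 4, 5, 6]
```

If one server goes down, the other keeps issuing IDs from its own sequence, so uniqueness holds without any replication between the two.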
Shards
Object Hashing
A                                     B

user_id : 500 % (# active replicants)

'etsy_index_A' => 'mysql:host=dbindex01.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw',
'etsy_index_B' => 'mysql:host=dbindex02.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw',
'etsy_shard_001_A' => 'mysql:host=dbshard01.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_001_B' => 'mysql:host=dbshard02.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_002_A' => 'mysql:host=dbshard03.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_002_B' => 'mysql:host=dbshard04.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_003_A' => 'mysql:host=dbshard05.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_003_B' => 'mysql:host=dbshard06.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
500          A          B     501
select ...                    select ...
insert ...                    insert ...
update ...                    update ...



user_id : 500 % (2) == 0
       user_id : 501 % (2) == 1
Failure
A              B




user_id : 500 % (1) == 0
       user_id : 501 % (1) == 0
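The side selection shown on these slides can be sketched in a few lines: the user ID modulo the number of active replicants picks the master, so dropping a failed side from the list sends every user to the survivor. The function name and the list-of-sides representation are assumptions for illustration.

```python
def pick_side(user_id, active_sides):
    """Pick a master for this user: user_id modulo the number of active sides."""
    if not active_sides:
        raise RuntimeError("no active replicants for this shard")
    return active_sides[user_id % len(active_sides)]

# Normal operation: both sides of the master-master pair are active.
both = ["A", "B"]
print(pick_side(500, both), pick_side(501, both))      # A B

# Failure: side B has been taken out, so the modulus becomes 1.
only_a = ["A"]
print(pick_side(500, only_a), pick_side(501, only_a))  # A A
```

Pinning each user to one side while both are healthy keeps all reads and writes for that user on the same master.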
ORM
connection handling
    shard lookup
 replicant selection
CRUD
cache handling
 data validation
data abstraction
Shard Selection
Non-Writable Shards
$config["non_writable_shards"] = array(1, 2, 3, 4);


  public static function getKnownWritableShards() {
    return array_values(
      array_diff(
        self::getKnownShards(),
        self::getNonwritableShards()
      )
    );
  }
Initial Selection
$shards = EtsyORM::getKnownWritableShards();

$user_shard = $shards[rand(0, count($shards) - 1)];




              user_id      shard_id

                500           2
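As a sketch of the same idea outside PHP: a new user is assigned to a random writable shard once, and the choice is recorded in the index so that later lookups return the same shard. The shard list and the dictionary standing in for `user_index` are hypothetical.

```python
import random

KNOWN_SHARDS = [1, 2, 3, 4, 5, 6]     # hypothetical list of all shards
NON_WRITABLE_SHARDS = [1, 2, 3, 4]    # mirrors $config["non_writable_shards"]

def get_known_writable_shards():
    """All known shards minus the ones closed to new objects."""
    return [s for s in KNOWN_SHARDS if s not in NON_WRITABLE_SHARDS]

user_index = {}  # stand-in for the user_index table: user_id -> shard_id

def assign_shard(user_id, rng=random):
    """Pick a random writable shard for a new user and remember the choice."""
    if user_id in user_index:
        return user_index[user_id]
    shard_id = rng.choice(get_known_writable_shards())
    user_index[user_id] = shard_id
    return shard_id

first = assign_shard(500)
print(first, assign_shard(500) == first)
```

Marking a shard non-writable only stops new objects from landing there; objects already indexed to it keep resolving to it.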
Later...
  index     select shard_id from user_index
                    where user_id = X

  shard 1   shard 2   shard N
Variants
shard 1                    shard 2

user_id   group_id         user_id   group_id
  1          A               3          A
  1          B               3          B
  2          A               4          A
  2          C               5          C

SELECT user_id FROM users_groups WHERE group_id = 'A'
                  Broken!   JOIN?
users_groups           groups_users
user_id   group_id     group_id   user_id
  1          A            A          1
  1          B            A          3
  2          A            A          2
  2          C            B          3
  3          A            B          1
  3          B            C          2
  3          C            C          3
index
users_groups_index        groups_users_index
user_id   shard_id        group_id   shard_id
   1         1               A          1
   2         1               B          2
   3         2               C          2
   4         3               D          3

separate indexes for different slices of data

shard 3
user_id   group_id
   4         A
   4         B
   4         C
   4         D
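One way to picture the variant tables: every membership is written twice, once sharded by user and once sharded by group, so either question can be answered from a single shard. The sketch below uses the index values from the slide; the dictionaries and function names are assumptions, not the ORM's API.

```python
from collections import defaultdict

# Separate indexes for the two slices of the data (values from the slide).
users_groups_index = {1: 1, 2: 1, 3: 2, 4: 3}
groups_users_index = {"A": 1, "B": 2, "C": 2, "D": 3}

# shard_id -> table name -> set of rows; stands in for the MySQL shards.
shards = defaultdict(lambda: defaultdict(set))

def add_membership(user_id, group_id):
    """Write the relation to both variants, each on the shard its index names."""
    shards[users_groups_index[user_id]]["users_groups"].add((user_id, group_id))
    shards[groups_users_index[group_id]]["groups_users"].add((group_id, user_id))

def groups_of(user_id):
    """Answered from one shard via users_groups."""
    rows = shards[users_groups_index[user_id]]["users_groups"]
    return sorted(g for u, g in rows if u == user_id)

def users_in(group_id):
    """Answered from one shard via groups_users, no cross-shard query needed."""
    rows = shards[groups_users_index[group_id]]["groups_users"]
    return sorted(u for g, u in rows if g == group_id)

for user_id, group_id in [(1, "A"), (1, "B"), (2, "A"), (2, "C"),
                          (3, "A"), (3, "B"), (3, "C")]:
    add_membership(user_id, group_id)

print(groups_of(3))   # ['A', 'B', 'C']
print(users_in("A"))  # [1, 2, 3]
```

The cost is a second write per relation, and the two writes may land on different shards, so a real implementation has to decide how to handle one of them failing.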
Schema Changes
shard 1   shard 2   shard N
Schemanator
shard 1             shard 2             shard N




SET SQL_LOG_BIN = 0; ALTER TABLE user ....
Shard Migration
Why?
Prevent disk from filling
High traffic objects (shops, users)
Shard rebalancing
When?
Balance
Added Shards
Per Object Migration
<object type> <object id> <shard>

# migrate_object User 5307827 2

Percentage Migration
<object type> <percent> <old shard> <new shard>

# migrate_pct User 25 3 6
index
           user_id   shard_id   migration_lock   old_shard_id
Before        1         1             0               0
Locked        1         1             1               0
Unlocked      1         2             0               1

           •Lock
           •Migrate
           •Checksum
           •Unlock
           •Delete (from old shard)

 shard 1         shard 2         shard N
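The five migration steps can be sketched end to end. This is only an illustration under stated assumptions: dictionaries stand in for the index and the shards, the user row is made-up data, and SHA-256 over a JSON dump is a placeholder for whatever checksum the real tool computes.

```python
import hashlib
import json

# Stand-ins for the index table and two shards (hypothetical data).
index = {1: {"shard_id": 1, "migration_lock": 0, "old_shard_id": 0}}
shards = {1: {1: {"name": "alice", "join_date": "2012-02-05"}}, 2: {}}

def checksum(row):
    """Placeholder checksum: hash of the row serialized with sorted keys."""
    return hashlib.sha256(json.dumps(row, sort_keys=True).encode()).hexdigest()

def migrate_object(user_id, new_shard):
    entry = index[user_id]
    old_shard = entry["shard_id"]
    if old_shard == new_shard:
        return
    if entry["migration_lock"]:
        raise RuntimeError("migration already in progress")

    entry["migration_lock"] = 1                          # Lock
    try:
        row = shards[old_shard][user_id]
        shards[new_shard][user_id] = dict(row)           # Migrate (copy)
        if checksum(shards[new_shard][user_id]) != checksum(row):  # Checksum
            del shards[new_shard][user_id]
            raise RuntimeError("checksum mismatch, index left unchanged")
        entry["shard_id"] = new_shard                    # repoint the index
        entry["old_shard_id"] = old_shard
    finally:
        entry["migration_lock"] = 0                      # Unlock
    del shards[old_shard][user_id]                       # Delete (from old shard)

migrate_object(1, 2)
print(index[1])
```

The old copy is deleted only after the index points at the new shard, so a failure at any earlier step leaves the original data in place.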
Usage Patterns
Arbitrary Key Hash
tag1      tag2      co_occurrence_count
“red”     “cloth”          666

tag1         tag2        shard_id
“red”        “cloth”        1
“vintage”    “doll”         3
“antique”    “radio”        5
“gift”       “vinyl”        2
“toy”        “car”          1
“wool”       “felt”         2
“floral”     “wreath”       5
“wood”       “table”        8
“box”        “wood”         4
“doll”       “happy”        5
“smile”      “clown”        3
“radio”      “vintage”     10
“blue”       “luggage”      8
“shoes”      “green”       12
   ...         ...         ...

OR

hash_bucket   shard_id
     1           2
     2           3
     3           1
     4           2
     5           3
1. provide some key
2. compute corresponding hash bucket
3. lookup hash bucket on index to find shard
1,000,000 'buckets' each with a row in
   arbitrary_key_index which points to a shard
             hash_bucket     shard_id
                 1              2
                 2              3
                 3              1
                 4              2
                 5              3

hash_bucket == hash('red', 'cloth') % BUCKETS
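A small sketch of the bucket scheme: hash the key into one of a fixed number of buckets, then look the bucket up in an index that names the shard. The slides do not say which hash function is used; CRC32 here is an assumption, as are the three-shard layout and the list standing in for `arbitrary_key_index`. Note that the modulus yields buckets numbered from 0, whereas the slide's table starts at 1.

```python
import zlib

BUCKETS = 1_000_000
SHARDS = [1, 2, 3]  # hypothetical shard IDs

def hash_bucket(*key_parts):
    """Stable hash of the key parts, reduced to a bucket number in [0, BUCKETS)."""
    key = "\x00".join(key_parts).encode("utf-8")
    return zlib.crc32(key) % BUCKETS

# Stand-in for arbitrary_key_index: position = hash_bucket, value = shard_id.
arbitrary_key_index = [SHARDS[b % len(SHARDS)] for b in range(BUCKETS)]

def shard_for(*key_parts):
    """Steps 1-3 from the slides: key -> bucket -> shard."""
    return arbitrary_key_index[hash_bucket(*key_parts)]

bucket = hash_bucket("red", "cloth")
print(bucket, shard_for("red", "cloth"))

# Rebalancing moves a whole bucket by updating a single index row.
arbitrary_key_index[bucket] = 3
print(shard_for("red", "cloth"))  # 3
```

Indexing buckets instead of individual keys keeps the index at a fixed size no matter how many tag pairs exist.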
Partitions
PARTITION BY RANGE (reference_timestamp)(
 PARTITION P5 VALUES LESS THAN (1317441600),
 PARTITION P6 VALUES LESS THAN (1320120000),
 PARTITION P7 VALUES LESS THAN (1322715600),
 PARTITION P8 VALUES LESS THAN (1325394000));
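To make the `VALUES LESS THAN` semantics concrete, the sketch below finds the partition a given timestamp would fall into and regenerates the clause from a list of boundaries. The boundaries are the ones on the slide; the helper names are hypothetical.

```python
from bisect import bisect_right

# (name, exclusive upper bound) pairs taken from the slide.
PARTITIONS = [
    ("P5", 1317441600),
    ("P6", 1320120000),
    ("P7", 1322715600),
    ("P8", 1325394000),
]

def partition_clause(column, partitions):
    """Render a PARTITION BY RANGE clause from the boundary list."""
    parts = ",\n".join(
        f" PARTITION {name} VALUES LESS THAN ({bound})" for name, bound in partitions
    )
    return f"PARTITION BY RANGE ({column})(\n{parts});"

def partition_for(ts, partitions):
    """Return the first partition whose upper bound is strictly greater than ts."""
    bounds = [bound for _, bound in partitions]
    i = bisect_right(bounds, ts)
    if i == len(partitions):
        raise ValueError(f"no partition accepts value {ts}")
    return partitions[i][0]

print(partition_clause("reference_timestamp", PARTITIONS))
print(partition_for(1317441599, PARTITIONS))  # P5
print(partition_for(1317441600, PARTITIONS))  # P6
```

A value equal to a boundary belongs to the next partition, and a value at or beyond the last boundary has nowhere to go until a new partition is added.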
Deleting a large partition:
      few hours, tons of disk IO
Dropping a 2G partition with 2M rows:
                < 1s
# file="shop_stats_syndication_hourly#P#P1345867200.ibd"
# ln "$file" "$file.remove"

# stat "shop_stats_syndication_hourly#P#P1345867200.ibd"
 File: `shop_stats_syndication_hourly#P#P1345867200.ibd'
 Size: 65536 Blocks: 136 IO Block: 4096 regular file
Device: 6804h/26628d Inode: 41321163 Links: 2
Access: (0660/-rw-rw----) Uid: ( 104/ mysql) Gid: ( 106/ mysql)
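The `Links: 2` in the `stat` output is the point of the `ln`: a file's data is freed only when its last name is removed. Presumably this lets the partition drop return quickly, since it removes just one of the two names, with the `.remove` file cleaned up separately afterwards; the slides do not spell that out. The sketch below demonstrates only the filesystem behaviour, on a throwaway file in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Throwaway file standing in for a partition's .ibd file.
    path = os.path.join(tmp, "table#P#P1.ibd")
    with open(path, "wb") as f:
        f.write(b"x" * 65536)

    # Equivalent of: ln "$file" "$file.remove"
    link = path + ".remove"
    os.link(path, link)
    links_before = os.stat(path).st_nlink

    # Removing the original name only drops one reference to the inode.
    os.remove(path)
    survives = os.path.exists(link)
    size_after = os.stat(link).st_size
    links_after = os.stat(link).st_nlink

print(links_before, survives, size_after, links_after)  # 2 True 65536 1
```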
tickets             index




shard 1             shard 2           shard N
Thank you
etsy.com/jobs

Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dashnarutouzumaki53779
 

Recently uploaded (20)

Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dash
 

The Etsy Shard Architecture: Starts With S and Ends With Hard

  • 1. The Etsy Shard Architecture: Starts With S and Ends With Hard. jgoulah@etsy.com / @johngoulah
  • 3. 1.5B page views/mo.; 525MM sales in 2011; 40MM unique visitors/mo.; 800K shops / 150 countries
  • 6. 25K+ queries/sec avg; 3TB InnoDB buffer pool; 15TB+ data stored; 99.99% of queries under 1ms
  • 7. 50+ MySQL servers. Server spec: HP DL 380 G7; 96GB RAM; 16 spindles / 1TB RAID 10; 24 cores
  • 9. Ross Snyder, "Scaling Etsy - What Went Wrong, What Went Right": http://bit.ly/rpcxtP; Matt Graham, "Migrating From PG to MySQL Without Downtime": http://bit.ly/rQpqZG
  • 13. Master - Master R/W R/W
  • 14. Master - Master R/W R/W Side A Side B
  • 16. shard 1 shard 2 shard N ...
  • 17. shard 1 shard 2 shard N ... shard N + 1
  • 18. shard 1 shard 2 shard N ... Migrate Migrate Migrate shard N + 1
  • 20. tickets index shard 1 shard 2 shard N
  • 21. tickets index Unique IDs shard 1 shard 2 shard N
  • 22. tickets index Shard Lookup shard 1 shard 2 shard N
  • 23. tickets index shard 1 shard 2 shard N Store/Retrieve Data
  • 25. users_groups
        user_id  group_id
        1        A
        1        B
        2        A
        2        C
        3        A
        3        B
        3        C
  • 27. users_groups
        user_id  group_id        user_id  group_id
        1        A               3        A
        1        B               3        B
        2        A               3        C
        2        C
        3        A
        3        B
        3        C
  • 28. users_groups
        shard 1                  shard 2
        user_id  group_id        user_id  group_id
        1        A               3        A
        1        B               3        B
        2        A               3        C
        2        C
  • 30. Shards NOT determined by: key hashing; range partitions; partitioning by function
  • 32. index shard 1 shard 2 shard N
  • 33. index; shard 1 shard 2 shard N
        select shard_id from user_index where user_id = X
  • 34. index; shard 1 shard 2 shard N
        select shard_id from user_index where user_id = X
        returns 1
  • 35. index; shard 1 shard 2 shard N
        select join_date from users where user_id = X
  • 36. index; shard 1 shard 2 shard N
        select join_date from users where user_id = X
        returns 2012-02-05
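The two-step lookup on these slides (ask the index for the shard, then query that shard) can be sketched as follows. This is a minimal in-memory illustration, not the production code: the dictionaries `USER_INDEX` and `SHARDS`, the function names, and the sample rows are all hypothetical stand-ins for the index server and the shard databases.

```python
# Hypothetical in-memory stand-ins for the index server and the shards.
USER_INDEX = {500: 1, 501: 2}  # user_id -> shard_id (the user_index table)
SHARDS = {
    1: {"users": {500: {"join_date": "2012-02-05"}}},
    2: {"users": {501: {"join_date": "2011-11-30"}}},
}


def lookup_shard(user_id):
    """Step 1: select shard_id from user_index where user_id = X."""
    return USER_INDEX[user_id]


def get_join_date(user_id):
    """Step 2: select join_date from users where user_id = X, on the owning shard only."""
    shard = SHARDS[lookup_shard(user_id)]
    return shard["users"][user_id]["join_date"]
```

Only the index is consulted to find the shard, so no query has to fan out across every shard for a single user.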
  • 39. CREATE TABLE `tickets` (
          `id` bigint(20) unsigned NOT NULL auto_increment,
          `stub` char(1) NOT NULL default '',
          PRIMARY KEY (`id`),
          UNIQUE KEY `stub` (`stub`)
        ) ENGINE=MyISAM
  • 40. Ticket Generation
        REPLACE INTO tickets (stub) VALUES ('a');
        SELECT LAST_INSERT_ID();
  • 41. Ticket Generation
        REPLACE INTO tickets (stub) VALUES ('a');
        SELECT LAST_INSERT_ID();
        SELECT * FROM tickets;
        id       stub
        4589294  a
  • 42. tickets A: auto-increment-increment = 2, auto-increment-offset = 1
        tickets B: auto-increment-increment = 2, auto-increment-offset = 2
  • 43. tickets A: auto-increment-increment = 2, auto-increment-offset = 1
        tickets B: auto-increment-increment = 2, auto-increment-offset = 2
        NOT master-master
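To see why two independent ticket servers with these settings never hand out the same ID, here is a small simulation. It only models the arithmetic of `auto-increment-increment` and `auto-increment-offset`; the `TicketServer` class is a hypothetical stand-in and does not talk to MySQL.

```python
class TicketServer:
    """Simulates the AUTO_INCREMENT sequence of one ticket server."""

    def __init__(self, offset, increment):
        self.offset = offset        # auto-increment-offset
        self.increment = increment  # auto-increment-increment
        self.issued = 0             # how many tickets were handed out so far

    def next_id(self):
        """Return the next unique ID, like REPLACE INTO + LAST_INSERT_ID()."""
        ticket = self.offset + self.issued * self.increment
        self.issued += 1
        return ticket


# tickets A produces odd IDs, tickets B produces even IDs.
tickets_a = TicketServer(offset=1, increment=2)
tickets_b = TicketServer(offset=2, increment=2)
```

Because the two sequences are disjoint by construction, the servers need no replication between them, which fits the "NOT master-master" note on the slide.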
  • 46. A B user_id : 500
  • 47. A B user_id : 500 % (# active replicants)
  • 48. A B
        'etsy_index_A' => 'mysql:host=dbindex01.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw',
        'etsy_index_B' => 'mysql:host=dbindex02.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw',
        'etsy_shard_001_A' => 'mysql:host=dbshard01.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
        'etsy_shard_001_B' => 'mysql:host=dbshard02.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
        'etsy_shard_002_A' => 'mysql:host=dbshard03.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
        'etsy_shard_002_B' => 'mysql:host=dbshard04.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
        'etsy_shard_003_A' => 'mysql:host=dbshard05.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
        'etsy_shard_003_B' => 'mysql:host=dbshard06.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
        user_id : 500 % (# active replicants)
  • 50. A B user_id : 500 % (2)
  • 51. A B user_id : 500 % (2) == 0
  • 52. A B
        user_id : 500 % (2) == 0
        A: select ... insert ... update ...
  • 53. A B
        user_id : 500 % (2) == 0
        user_id : 501 % (2) == 1
  • 54. A B
        user_id : 500 % (2) == 0
        user_id : 501 % (2) == 1
        A (500): select ... insert ... update ...
        B (501): select ... insert ... update ...
  • 56. A B user_id : 500 % (2) == 0 user_id : 501 % (2) == 1
  • 59. A B
        'etsy_index_A' => 'mysql:host=dbindex01.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw',
        'etsy_index_B' => 'mysql:host=dbindex02.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw',
        'etsy_shard_001_A' => 'mysql:host=dbshard01.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
        'etsy_shard_001_B' => 'mysql:host=dbshard02.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
        'etsy_shard_002_A' => 'mysql:host=dbshard03.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
        'etsy_shard_002_B' => 'mysql:host=dbshard04.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
        'etsy_shard_003_A' => 'mysql:host=dbshard05.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
        'etsy_shard_003_B' => 'mysql:host=dbshard06.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
        user_id : 500 % (2) == 0
        user_id : 501 % (2) == 1
  • 61. A B user_id : 500 % (1) == 0 user_id : 501 % (1) == 0
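The replicant selection on these slides (`user_id % (# active replicants)`) can be sketched in a few lines. This is an illustrative guess at the logic, not the ORM's code; `pick_side` and the list of side names are hypothetical.

```python
def pick_side(user_id, active_sides):
    """Pick the side of a master-master pair that serves this user.

    active_sides is the ordered list of sides currently in the config,
    for example ["A", "B"], or ["A"] when B has been pulled out.
    """
    return active_sides[user_id % len(active_sides)]
```

With both sides active, `pick_side(500, ["A", "B"])` gives `"A"` and `pick_side(501, ["A", "B"])` gives `"B"`, so each user consistently reads and writes on one side. When a side is removed from the config the modulus becomes 1 and every user lands on the remaining side, which matches the `% (1) == 0` slide.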
  • 62. ORM
  • 63. connection handling; shard lookup; replicant selection
  • 64. CRUD; cache handling; data validation; data abstraction
  • 66. Non-Writable Shards
        $config["non_writable_shards"] = array(1, 2, 3, 4);

        public static function getKnownWritableShards(){
            return array_values(
                array_diff(
                    self::getKnownShards(),
                    self::getNonwritableShards()
                ));
        }
  • 67. Initial Selection
        $shards = EtsyORM::getKnownWritableShards();
        $user_shard = $shards[rand(0, count($shards) - 1)];
        user_id  shard_id
        500
  • 68. Initial Selection
        $shards = EtsyORM::getKnownWritableShards();
        $user_shard = $shards[rand(0, count($shards) - 1)];
        user_id  shard_id
        500      2
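The initial shard selection above can be restated as a short Python sketch: filter out the non-writable shards, pick one of the rest at random, and record the choice in the index. The list of known shards, the function names, and the dictionary used as the index are assumptions made for illustration; only the non-writable list `1, 2, 3, 4` comes from the slide.

```python
import random

KNOWN_SHARDS = [1, 2, 3, 4, 5, 6]   # hypothetical full shard list
NON_WRITABLE_SHARDS = [1, 2, 3, 4]  # from the slide's config


def get_known_writable_shards(known, non_writable):
    """Known shards minus the ones closed to new objects, order preserved."""
    blocked = set(non_writable)
    return [shard for shard in known if shard not in blocked]


def assign_shard(user_id, user_index, writable, rng=random):
    """Pick a random writable shard for a new user and store it in the index."""
    shard_id = rng.choice(writable)
    user_index[user_id] = shard_id
    return shard_id
```

The random pick happens once, at creation time. Every later request reads the stored `shard_id` from the index instead of recomputing it, which is what lets objects be moved between shards afterwards.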
  • 69. Later....
        index; shard 1 shard 2 shard N
        select shard_id from user_index where user_id = X
  • 71. shard 1                  shard 2
        user_id  group_id        user_id  group_id
        1        A               3        A
        1        B               3        B
        2        A               4        A
        2        C               5        C
        SELECT user_id FROM users_groups WHERE group_id = 'A'
  • 72. shard 1                  shard 2
        user_id  group_id        user_id  group_id
        1        A               3        A
        1        B               3        B
        2        A               4        A
        2        C               5        C
        SELECT user_id FROM users_groups WHERE group_id = 'A'
        Broken!
  • 73. shard 1                  shard 2
        user_id  group_id        user_id  group_id
        1        A               3        A
        1        B               3        B
        2        A               4        A
        2        C               5        C
        JOIN?
        SELECT user_id FROM users_groups WHERE group_id = 'A'
        Broken!
  • 75. users_groups             groups_users
        user_id  group_id        group_id  user_id
        1        A               A         1
        1        B               A         3
        2        A               A         2
        2        C               B         3
        3        A               B         1
        3        B               C         2
        3        C               C         3
  • 76. index: separate indexes for different slices of data
        users_groups_index       groups_users_index
        user_id  shard_id        group_id  shard_id
        1        1               A         1
        2        1               B         2
        3        2               C         2
        4        3               D         3
  • 77. index
        users_groups_index       groups_users_index
        user_id  shard_id        group_id  shard_id
        1        1               A         1
        2        1               B         2
        3        2               C         2
        4        3               D         3
        shard 3
        user_id  group_id
        4        A
        4        B
        4        C
        4        D
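One way to read these slides is that each membership is stored twice, once sliced by user and once sliced by group, with a separate index per slice. The sketch below follows that reading; the two index dictionaries reuse the values from the slide, while the function names and the in-memory `shards` structure are hypothetical.

```python
from collections import defaultdict

# Index contents taken from the slide: key -> shard_id.
users_groups_index = {1: 1, 2: 1, 3: 2, 4: 3}
groups_users_index = {"A": 1, "B": 2, "C": 2, "D": 3}

# Hypothetical in-memory shards, each holding both tables.
shards = defaultdict(lambda: {"users_groups": set(), "groups_users": set()})


def add_membership(user_id, group_id):
    """Write the row once per slice, each on the shard its own index names."""
    shards[users_groups_index[user_id]]["users_groups"].add((user_id, group_id))
    shards[groups_users_index[group_id]]["groups_users"].add((group_id, user_id))


def users_in_group(group_id):
    """Answer 'who is in group X' from a single shard, with no cross-shard query."""
    rows = shards[groups_users_index[group_id]]["groups_users"]
    return sorted(user for group, user in rows if group == group_id)
```

The cost is a second write per membership. In exchange, the query that was marked "Broken!" becomes a single-shard read against `groups_users`.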
  • 79. shard 1 shard 2 shard N
  • 84. shard 1 shard 2 shard N
  • 85. shard 1 shard 2 shard N SET SQL_LOG_BIN = 0; ALTER TABLE user ....
  • 87. Why?
  • 88. Prevent disk from filling
  • 89. Prevent disk from filling; high traffic objects (shops, users)
  • 90. Prevent disk from filling; high traffic objects (shops, users); shard rebalancing
  • 91. When?
  • 95. per object migration: <object type> <object id> <shard>
        # migrate_object User 5307827 2
  • 96. percentage migration: <object type> <percent> <old shard> <new shard>
        # migrate_pct User 25 3 6
  • 97. index; shard 1 shard 2 shard N
        user_id  shard_id  migration_lock  old_shard_id
        1        1         0               0
  • 98. index; shard 1 shard 2 shard N
        user_id  shard_id  migration_lock  old_shard_id
        1        1         1               0
        • Lock
  • 99. index; shard 1 shard 2 shard N
        user_id  shard_id  migration_lock  old_shard_id
        1        1         1               0
        • Lock • Migrate
  • 100. index; shard 1 shard 2 shard N
        user_id  shard_id  migration_lock  old_shard_id
        1        1         1               0
        • Lock • Migrate • Checksum
  • 102. index; shard 1 shard 2 shard N
        user_id  shard_id  migration_lock  old_shard_id
        1        2         0               1
        • Lock • Migrate • Checksum • Unlock
  • 103. index; shard 1 shard 2 shard N
        user_id  shard_id  migration_lock  old_shard_id
        1        2         0               1
        • Lock • Migrate • Checksum • Unlock • Delete (from old shard)
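The five migration steps above (lock, migrate, checksum, unlock, delete) can be sketched as one function. Everything here is a simplified in-memory model: shards are plain lists of `(user_id, group_id)` rows, the checksum uses SHA-256 over the sorted rows, and the rollback on a checksum mismatch is an assumption, since the slides do not say what happens on failure.

```python
import hashlib


def checksum(rows):
    """Order-independent digest of a set of rows (hypothetical checksum scheme)."""
    digest = hashlib.sha256()
    for row in sorted(rows):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def migrate_object(index, shards, user_id, new_shard):
    """Move one user's rows to new_shard, following the steps on the slides."""
    entry = index[user_id]
    old_shard = entry["shard_id"]

    entry["migration_lock"] = 1                                   # Lock
    rows = [r for r in shards[old_shard] if r[0] == user_id]
    shards[new_shard].extend(rows)                                # Migrate
    copied = [r for r in shards[new_shard] if r[0] == user_id]

    if checksum(rows) != checksum(copied):                        # Checksum
        # Assumed behaviour: undo the copy and release the lock.
        shards[new_shard] = [r for r in shards[new_shard] if r[0] != user_id]
        entry["migration_lock"] = 0
        raise RuntimeError("checksum mismatch; migration rolled back")

    # Unlock: repoint the index and remember where the object came from.
    entry.update(shard_id=new_shard, old_shard_id=old_shard, migration_lock=0)
    # Delete (from old shard)
    shards[old_shard] = [r for r in shards[old_shard] if r[0] != user_id]
```

After a successful run the index row for user 1 reads `shard_id = 2, migration_lock = 0, old_shard_id = 1`, the same state shown on the final slides of the sequence.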
  • 106. tag1   tag2     co_occurrence_count
         “red”  “cloth”  666
  • 107. tag1       tag2       shard_id
         “red”      “cloth”    1
         “vintage”  “doll”     3
         “antique”  “radio”    5
         “gift”     “vinyl”    2
         “toy”      “car”      1
         “wool”     “felt”     2
         “floral”   “wreath”   5
         “wood”     “table”    8
         “box”      “wood”     4
         “doll”     “happy”    5
         “smile”    “clown”    3
         “radio”    “vintage”  10
         “blue”     “luggage”  8
         “shoes”    “green”    12
         ...        ...        ...
         OR
         hash_bucket  shard_id
         1            2
         2            3
         3            1
         4            2
         5            3
  • 109. 1. provide some key; 2. compute corresponding hash bucket
  • 110. 1. provide some key; 2. compute corresponding hash bucket; 3. look up hash bucket on index to find shard
  • 111. 1,000,000 'buckets', each with a row in arbitrary_key_index which points to a shard
         hash_bucket  shard_id
         1            2
         2            3
         3            1
         4            2
         5            3
         hash_bucket == hash('red', 'cloth') % BUCKETS
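The bucket scheme above can be sketched as follows. The slides do not name the hash function, so MD5 over the joined key parts is an assumption chosen only because it is stable across processes; `hash_bucket`, `shard_for`, and the dictionary standing in for `arbitrary_key_index` are hypothetical. Buckets here run from 0 to `BUCKETS - 1`, whereas the slide's table starts at 1.

```python
import hashlib

BUCKETS = 1_000_000  # fixed number of buckets, as on the slide


def hash_bucket(*key_parts):
    """Map an arbitrary composite key, e.g. ('red', 'cloth'), to a bucket number."""
    # A separator keeps ('red', 'cloth') distinct from ('redc', 'loth').
    joined = "\x1f".join(key_parts).encode("utf-8")
    return int(hashlib.md5(joined).hexdigest(), 16) % BUCKETS


def shard_for(key_parts, bucket_index):
    """Look the bucket up in the index (bucket -> shard_id) to find the shard."""
    return bucket_index[hash_bucket(*key_parts)]
```

The index then needs at most one row per bucket rather than one row per tag pair, and a bucket can still be moved to another shard by updating its single index row.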
  • 116. PARTITION BY RANGE (reference_timestamp)(
             PARTITION P5 VALUES LESS THAN (1317441600),
             PARTITION P6 VALUES LESS THAN (1320120000),
             PARTITION P7 VALUES LESS THAN (1322715600),
             PARTITION P8 VALUES LESS THAN (1325394000));
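To make the `VALUES LESS THAN` semantics concrete, the sketch below maps a timestamp to the partition that would hold it, using only the four bounds shown on the slide. It treats the fragment as if it were the whole partition list, which is an assumption (earlier partitions may exist in the real table); `partition_for` is a hypothetical helper, not a MySQL API.

```python
from bisect import bisect_right

# (partition name, exclusive upper bound) pairs copied from the slide.
PARTITIONS = [
    ("P5", 1317441600),
    ("P6", 1320120000),
    ("P7", 1322715600),
    ("P8", 1325394000),
]


def partition_for(reference_timestamp):
    """Return the first partition whose bound is strictly greater than the value."""
    bounds = [bound for _, bound in PARTITIONS]
    position = bisect_right(bounds, reference_timestamp)
    if position == len(PARTITIONS):
        # No partition accepts the value; MySQL would also reject such a row.
        raise ValueError("no partition for value %d" % reference_timestamp)
    return PARTITIONS[position][0]
```

A value equal to a bound falls into the next partition because the comparison is strict, so `partition_for(1317441600)` is `"P6"`, not `"P5"`.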
  • 117. Deleting a large partition: few hours, tons of disk IO
  • 118. Deleting a large partition: few hours, tons of disk IO. Dropping a 2G partition with 2M rows:
  • 119. Deleting a large partition: few hours, tons of disk IO. Dropping a 2G partition with 2M rows: < 1s
  • 121. # file="shop_stats_syndication_hourly#P#P1345867200.ibd"
         # ln "$file" "$file.remove"
         # stat "shop_stats_syndication_hourly#P#P1345867200.ibd"
           File: `shop_stats_syndication_hourly#P#P1345867200.ibd'
           Size: 65536  Blocks: 136  IO Block: 4096  regular file
         Device: 6804h/26628d  Inode: 41321163  Links: 2
         Access: (0660/-rw-rw----)  Uid: ( 104/ mysql)  Gid: ( 106/ mysql)
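The `ln` and `stat` output above show a second hard link being added to a partition's `.ibd` file (`Links: 2`). The slide does not spell out the rest, so the following is an assumption about intent: with a spare link in place, removing the original name does not free the data blocks, and the spare can be deleted later at a gentler pace. The sketch demonstrates only the filesystem behaviour on a throwaway temporary file; the function names are hypothetical and no MySQL is involved.

```python
import os
import tempfile


def hard_link_then_unlink(path):
    """Add a spare hard link, then remove the original name.

    Returns the spare path and the link counts before and after the unlink.
    The unlink stands in for the server removing the partition file.
    """
    spare = path + ".remove"
    os.link(path, spare)                  # same inode, second name
    links_before = os.stat(path).st_nlink
    os.unlink(path)                       # only a directory entry goes away
    links_after = os.stat(spare).st_nlink
    return spare, links_before, links_after


def demo():
    """Run the trick on a 64 KiB temporary file and report what survived."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "partition.ibd")
        with open(path, "wb") as handle:
            handle.write(b"x" * 65536)
        spare, before, after = hard_link_then_unlink(path)
        return before, after, os.path.getsize(spare), os.path.exists(path)
```

In the demo the data is still fully present under the spare name after the original name is gone, because the filesystem releases the blocks only when the last link is removed.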
  • 122. tickets index shard 1 shard 2 shard N