SlideShare a Scribd company logo
1 of 123
Download to read offline
The Etsy Shard Architecture
    Starts With S and Ends With Hard


        jgoulah@etsy.com / @johngoulah
1.5B page views / mo.
525MM sales in 2011
40MM unique visitors/mo.
800K shops / 150 countries
25K+ queries/sec avg
3TB InnoDB buffer pool
15TB+ data stored
99.99% queries under 1ms
50+ MySQL servers

      Server Spec
      HP DL 380 G7
       96GB RAM
16 spindles / 1TB RAID 10
        24 Core
Ross Snyder
Scaling Etsy - What Went Wrong, What Went Right
           http://bit.ly/rpcxtP


             Matt Graham
 Migrating From PG to MySQL Without Downtime
          http://bit.ly/rQpqZG
Architecture
Redundancy
Master - Master
Master - Master

  R/W      R/W
Master - Master

  R/W      R/W

 Side A   Side B
Scalability
shard 1   shard 2         shard N

                    ...
shard 1    shard 2            shard N

                        ...



          shard N + 1
shard 1        shard 2                shard N

                               ...
Migrate     Migrate           Migrate


                shard N + 1
Bird’s-Eye View
tickets             index




shard 1             shard 2           shard N
tickets             index
 Unique IDs

shard 1             shard 2           shard N
tickets                 index
                              Shard Lookup

shard 1             shard 2               shard N
tickets             index




shard 1             shard 2           shard N
          Store/Retrieve Data
Basics
users_groups


user_id   group_id
  1          A
  1          B
  2          A
  2          C

  3          A

  3          B

  3          C
users_groups


user_id   group_id
  1          A
  1          B
  2          A
  2          C

  3          A

  3          B

  3          C
users_groups


user_id   group_id
  1          A
  1          B
  2          A                      user_id   group_id
  2          C                        3          A
  3          A                        3          B
  3          B                        3          C

  3          C
users_groups
          shard 1
user_id         group_id
  1                 A
  1                 B
                                                    shard 2
  2                 A                     user_id         group_id
  2                 C                       3                 A

                                            3                 B

                                            3                 C
Index Servers
Shards NOT Determined by
          key hashing
        range partitions
    partitioning by function
Look-Up Data
index




shard 1   shard 2   shard N
index    select shard_id from user_index
                  where user_id = X




shard 1   shard 2               shard N
index    select shard_id from user_index
                  where user_id = X

                    returns 1

shard 1   shard 2               shard N
index       select join_date from users
                  where user_id = X




shard 1   shard 2                shard N
index       select join_date from users
                  where user_id = X


                returns 2012-02-05
shard 1   shard 2                shard N
Ticket Servers
Globally Unique ID
CREATE TABLE `tickets` (
 `id` bigint(20) unsigned NOT NULL auto_increment,
 `stub` char(1) NOT NULL default '',
 PRIMARY KEY (`id`),
 UNIQUE KEY `stub` (`stub`)
) ENGINE=MyISAM
Ticket Generation
REPLACE INTO tickets (stub) VALUES ('a');
SELECT LAST_INSERT_ID();
Ticket Generation
REPLACE INTO tickets (stub) VALUES ('a');
SELECT LAST_INSERT_ID();

SELECT * FROM tickets;
      id            stub

    4589294          a
tickets A
            auto-increment-increment = 2
              auto-increment-offset = 1

tickets B
            auto-increment-increment = 2
              auto-increment-offset = 2
tickets A
            auto-increment-increment = 2
              auto-increment-offset = 1

tickets B
            auto-increment-increment = 2
              auto-increment-offset = 2

  NOT master-master
Shards
Object Hashing
A      B




user_id : 500
A               B




user_id : 500 % (# active replicants)
A                                     B
'etsy_index_A' => 'mysql:host=dbindex01.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw',
'etsy_index_B' => 'mysql:host=dbindex02.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw',
'etsy_shard_001_A' => 'mysql:host=dbshard01.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_001_B' => 'mysql:host=dbshard02.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_002_A' => 'mysql:host=dbshard03.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_002_B' => 'mysql:host=dbshard04.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_003_A' => 'mysql:host=dbshard05.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_003_B' => 'mysql:host=dbshard06.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',




   user_id : 500 % (# active replicants)
A                                     B
'etsy_index_A' => 'mysql:host=dbindex01.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw',
'etsy_index_B' => 'mysql:host=dbindex02.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw',
'etsy_shard_001_A' => 'mysql:host=dbshard01.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_001_B' => 'mysql:host=dbshard02.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_002_A' => 'mysql:host=dbshard03.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_002_B' => 'mysql:host=dbshard04.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_003_A' => 'mysql:host=dbshard05.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_003_B' => 'mysql:host=dbshard06.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',




   user_id : 500 % (# active replicants)
A            B




user_id : 500 % (2)
A                 B




user_id : 500 % (2) == 0
A                 B




                           select ...
user_id : 500 % (2) == 0   insert ...
                           update ...
A              B




user_id : 500 % (2) == 0
       user_id : 501 % (2) == 1
500          A          B     501
select ...                    select ...
insert ...                    insert ...
update ...                    update ...



user_id : 500 % (2) == 0
       user_id : 501 % (2) == 1
Failure
A              B




user_id : 500 % (2) == 0
       user_id : 501 % (2) == 1
A              B




user_id : 500 % (2) == 0
       user_id : 501 % (2) == 1
A              B




user_id : 500 % (2) == 0
       user_id : 501 % (2) == 1
A                                     B
'etsy_index_A' => 'mysql:host=dbindex01.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw',
'etsy_index_B' => 'mysql:host=dbindex02.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw',
'etsy_shard_001_A' => 'mysql:host=dbshard01.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_001_B' => 'mysql:host=dbshard02.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_002_A' => 'mysql:host=dbshard03.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_002_B' => 'mysql:host=dbshard04.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_003_A' => 'mysql:host=dbshard05.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_003_B' => 'mysql:host=dbshard06.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',




   user_id : 500 % (2) == 0
          user_id : 501 % (2) == 1
A                                     B
'etsy_index_A' => 'mysql:host=dbindex01.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw',
'etsy_index_B' => 'mysql:host=dbindex02.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw',
'etsy_shard_001_A' => 'mysql:host=dbshard01.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_001_B' => 'mysql:host=dbshard02.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_002_A' => 'mysql:host=dbshard03.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_002_B' => 'mysql:host=dbshard04.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_003_A' => 'mysql:host=dbshard05.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',
'etsy_shard_003_B' => 'mysql:host=dbshard06.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw',




   user_id : 500 % (2) == 0
          user_id : 501 % (2) == 1
A              B




user_id : 500 % (1) == 0
       user_id : 501 % (1) == 0
ORM
connection handling
    shard lookup
 replicant selection
CRUD
cache handling
 data validation
data abstraction
Shard Selection
Non-Writable Shards
$config["non_writable_shards"] = array(1, 2, 3, 4);


  public static function getKnownWritableShards(){
    return array_values(
      array_diff(
        self::getKnownShards(),
        self::getNonwritableShards()
    ));
  }
Initial Selection
$shards = EtsyORM::getKnownWritableShards();

$user_shard = $shards[rand(0, count($shards) - 1)];




              user_id      shard_id

                500
Initial Selection
$shards = EtsyORM::getKnownWritableShards();

$user_shard = $shards[rand(0, count($shards) - 1)];




              user_id      shard_id

                500           2
Later....
            select shard_id from user_index
  index             where user_id = X




  shard 1   shard 2               shard N
Variants
shard 1                  shard 2



      user_id    group_id      user_id    group_id

        1             A          3             A

        1             B          3             B

        2             A          4             A

        2             C          5             C




SELECT user_id FROM users_groups WHERE group_id = ‘A’
shard 1                     shard 2



      user_id    group_id       user_id      group_id

        1             A             3             A

        1             B             3             B

        2             A             4             A

        2             C             5             C




SELECT user_id FROM users_groups WHERE group_id = ‘A’
                          Broken!
shard 1                       shard 2



      user_id    group_id           user_id    group_id

        1
        1
                      A
                      B
                            JOIN?     3
                                      3
                                                    A
                                                    B

        2             A               4             A

        2             C               5             C




SELECT user_id FROM users_groups WHERE group_id = ‘A’
                          Broken!
shard 1                       shard 2



      user_id    group_id           user_id    group_id

        1
        1
                      A
                      B
                            JOIN?     3
                                      3
                                                    A
                                                    B

        2             A               4             A

        2             C               5             C




SELECT user_id FROM users_groups WHERE group_id = ‘A’
                          Broken!
users_groups         groups_users
user_id   group_id   group_id   user_id

  1          A          A         1

  1          B          A         3

  2          A          A         2

  2          C          B         3

  3          A          B         1

  3          B          C         2

  3          C          C         3
users_groups_index    groups_users_index
             user_id   shard_id   group_id   shard_id
index          1          1          A          1
               2          1          B          2
               3          2          C          2
               4          3          D          3




         separate indexes for
        different slices of data
users_groups_index        groups_users_index
           user_id   shard_id         group_id   shard_id
index         1         1                 A         1
              2         1                 B         2
              3         2                 C         2
              4         3                 D         3




                         user_id   group_id
        shard 3             4         A
                            4         B
                            4         C
                            4         D
Schema Changes
shard 1   shard 2   shard N
shard 1   shard 2   shard N
Schemanator
shard 1   shard 2   shard N
shard 1             shard 2             shard N




SET SQL_LOG_BIN = 0; ALTER TABLE user ....
shard migration
Why?
Prevent disk from filling
Prevent disk from filling
High traffic objects (shops, users)
Prevent disk from filling
High traffic objects (shops, users)
Shard rebalancing
When?
Balance
Added Shards
per object migration
         <object type> <object id> <shard>

# migrate_object User 5307827 2
percentage migration
<object type> <percent> <old shard> <new shard>


 # migrate_pct User 25 3 6
index
           user_id         shard_id   migration_lock   old_shard_id

             1                1             0               0




 shard 1         shard 2                          shard N
index
           user_id           shard_id   migration_lock   old_shard_id

             1                  1             1               0

           •Lock



 shard 1           shard 2                          shard N
index
           user_id          shard_id   migration_lock   old_shard_id

              1                1             1               0

           •Lock
           •Migrate



 shard 1          shard 2                          shard N
index
           user_id         shard_id   migration_lock   old_shard_id

             1                1             1               0

           •Lock
           •Migrate
           •Checksum


 shard 1         shard 2                          shard N
index
           user_id         shard_id   migration_lock   old_shard_id

             1                1             1               0

           •Lock
           •Migrate
           •Checksum


 shard 1         shard 2                          shard N
index
           user_id         shard_id   migration_lock   old_shard_id

             1                2             0               1

           •Lock
           •Migrate
           •Checksum
           •Unlock

 shard 1         shard 2                          shard N
index
           user_id          shard_id   migration_lock   old_shard_id

              1                2             0               1

           •Lock
           •Migrate
           •Checksum
           •Unlock
           •Delete (from old shard)
 shard 1          shard 2                          shard N
Usage Patterns
Arbitrary Key Hash
tag1     tag2     co_occurrence _count




“red”   “cloth”           666
tag1        tag2      shard_id
 “red”       “cloth”       1
“vintage”    “doll”        3
“antique”   “radio”        5
  “gift”     “vinyl”       2            hash_bucket   shard_id
 “toy”       “car”         1                1            2
 “wool”      “felt”        2
 “floral”
“wood”
            “wreath”
             “table”
                           5
                           8
                                   OR       2
                                            3
                                                         3
                                                         1

 “box”      “wood”         4                4            2
 “doll”     “happy”        5                5            3
 “smile”    “clown”        3
 “radio”    “vintage”     10
 “blue”     “luggage”      8
“shoes”     “green”       12
    ...        ...         ...
1. provide some key
1. provide some key
2. compute corresponding hash bucket
1. provide some key
2. compute corresponding hash bucket
3. lookup hash bucket on index to find shard
1,000,000 'buckets' each with a row in
   arbitrary_key_index which points to a shard
             hash_bucket     shard_id
                 1              2
                 2              3
                 3              1
                 4              2
                 5              3




hash_bucket == hash(‘red’, ‘cloth’) % BUCKETS
1,000,000 'buckets' each with a row in
   arbitrary_key_index which points to a shard
             hash_bucket     shard_id
                 1              2
                 2              3
                 3              1
                 4              2
                 5              3




hash_bucket == hash(‘red’, ‘cloth’) % BUCKETS
1,000,000 'buckets' each with a row in
   arbitrary_key_index which points to a shard
             hash_bucket     shard_id
                 1              2
                 2              3
                 3              1
                 4              2
                 5              3




hash_bucket == hash(‘red’, ‘cloth’) % BUCKETS
1,000,000 'buckets' each with a row in
   arbitrary_key_index which points to a shard
             hash_bucket     shard_id
                 1              2
                 2              3
                 3              1
                 4              2
                 5              3




hash_bucket == hash(‘red’, ‘cloth’) % BUCKETS
Partitions
PARTITION BY RANGE (reference_timestamp)(
 PARTITION P5 VALUES LESS THAN (1317441600),
 PARTITION P6 VALUES LESS THAN (1320120000),
 PARTITION P7 VALUES LESS THAN (1322715600),
 PARTITION P8 VALUES LESS THAN (1325394000));
Deleting a large partition:
few hours, tons of disk IO
Deleting a large partition:
      few hours, tons of disk IO
Dropping a 2G partition with 2M rows :
Deleting a large partition:
      few hours, tons of disk IO
Dropping a 2G partition with 2M rows :
                < 1s
# file= "shop_stats_syndication_hourly#P#P1345867200.ibd"
# ln $file $file.remove"
# file= "shop_stats_syndication_hourly#P#P1345867200.ibd"
# ln $file $file.remove"


# stat "shop_stats_syndication_hourly#P#P1345867200.ibd"
 File: `shop_stats_syndication_hourly#P#P1345867200.ibd'
 Size: 65536 Blocks: 136 IO Block: 4096 regular file
Device: 6804h/26628d Inode: 41321163 Links: 2
Access: (0660/-rw-rw----) Uid: ( 104/ mysql) Gid: ( 106/ mysql)
tickets             index




shard 1             shard 2           shard N
Thank you
etsy.com/jobs

More Related Content

What's hot

Angular Dependency Injection
Angular Dependency InjectionAngular Dependency Injection
Angular Dependency InjectionNir Kaufman
 
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11Kenny Gryp
 
AOT and Native with Spring Boot 3.0
AOT and Native with Spring Boot 3.0AOT and Native with Spring Boot 3.0
AOT and Native with Spring Boot 3.0MoritzHalbritter
 
InnoDB MVCC Architecture (by 권건우)
InnoDB MVCC Architecture (by 권건우)InnoDB MVCC Architecture (by 권건우)
InnoDB MVCC Architecture (by 권건우)I Goo Lee.
 
Always on 可用性グループ 構築時のポイント
Always on 可用性グループ 構築時のポイントAlways on 可用性グループ 構築時のポイント
Always on 可用性グループ 構築時のポイントMasayuki Ozawa
 
Sql server logshipping
Sql server logshippingSql server logshipping
Sql server logshippingZeba Ansari
 
Microsoft Azure Storage 概要
Microsoft Azure Storage 概要Microsoft Azure Storage 概要
Microsoft Azure Storage 概要Takeshi Fukuhara
 
Consuming Libraries with CMake
Consuming Libraries with CMakeConsuming Libraries with CMake
Consuming Libraries with CMakeRichard Thomson
 
MySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZE
MySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZEMySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZE
MySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZENorvald Ryeng
 
[Pgday.Seoul 2020] SQL Tuning
[Pgday.Seoul 2020] SQL Tuning[Pgday.Seoul 2020] SQL Tuning
[Pgday.Seoul 2020] SQL TuningPgDay.Seoul
 
働き方改革を後押しする Office 365 + リモートワークソリューション ~Azure Active Directoryとの組み合わせで実現する~リ...
働き方改革を後押しする Office 365 + リモートワークソリューション ~Azure Active Directoryとの組み合わせで実現する~リ...働き方改革を後押しする Office 365 + リモートワークソリューション ~Azure Active Directoryとの組み合わせで実現する~リ...
働き方改革を後押しする Office 365 + リモートワークソリューション ~Azure Active Directoryとの組み合わせで実現する~リ...NHN テコラス株式会社
 
仕組みがわかるActive Directory
仕組みがわかるActive Directory仕組みがわかるActive Directory
仕組みがわかるActive DirectorySuguru Kunii
 
PostgreSQL, performance for queries with grouping
PostgreSQL, performance for queries with groupingPostgreSQL, performance for queries with grouping
PostgreSQL, performance for queries with groupingAlexey Bashtanov
 
Understand oracle real application cluster
Understand oracle real application clusterUnderstand oracle real application cluster
Understand oracle real application clusterSatishbabu Gunukula
 
Complete MVC on NodeJS
Complete MVC on NodeJSComplete MVC on NodeJS
Complete MVC on NodeJSHüseyin BABAL
 
Best Practices in Handling Performance Issues
Best Practices in Handling Performance IssuesBest Practices in Handling Performance Issues
Best Practices in Handling Performance IssuesOdoo
 
Single Sign-On for APEX applications based on Kerberos (Important: latest ver...
Single Sign-On for APEX applications based on Kerberos (Important: latest ver...Single Sign-On for APEX applications based on Kerberos (Important: latest ver...
Single Sign-On for APEX applications based on Kerberos (Important: latest ver...Niels de Bruijn
 
Composer 套件管理
Composer 套件管理Composer 套件管理
Composer 套件管理Shengyou Fan
 

What's hot (20)

Angular Dependency Injection
Angular Dependency InjectionAngular Dependency Injection
Angular Dependency Injection
 
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
 
AOT and Native with Spring Boot 3.0
AOT and Native with Spring Boot 3.0AOT and Native with Spring Boot 3.0
AOT and Native with Spring Boot 3.0
 
InnoDB MVCC Architecture (by 권건우)
InnoDB MVCC Architecture (by 권건우)InnoDB MVCC Architecture (by 권건우)
InnoDB MVCC Architecture (by 권건우)
 
Always on 可用性グループ 構築時のポイント
Always on 可用性グループ 構築時のポイントAlways on 可用性グループ 構築時のポイント
Always on 可用性グループ 構築時のポイント
 
Sql server logshipping
Sql server logshippingSql server logshipping
Sql server logshipping
 
Microsoft Azure Storage 概要
Microsoft Azure Storage 概要Microsoft Azure Storage 概要
Microsoft Azure Storage 概要
 
Consuming Libraries with CMake
Consuming Libraries with CMakeConsuming Libraries with CMake
Consuming Libraries with CMake
 
MySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZE
MySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZEMySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZE
MySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZE
 
[Pgday.Seoul 2020] SQL Tuning
[Pgday.Seoul 2020] SQL Tuning[Pgday.Seoul 2020] SQL Tuning
[Pgday.Seoul 2020] SQL Tuning
 
働き方改革を後押しする Office 365 + リモートワークソリューション ~Azure Active Directoryとの組み合わせで実現する~リ...
働き方改革を後押しする Office 365 + リモートワークソリューション ~Azure Active Directoryとの組み合わせで実現する~リ...働き方改革を後押しする Office 365 + リモートワークソリューション ~Azure Active Directoryとの組み合わせで実現する~リ...
働き方改革を後押しする Office 365 + リモートワークソリューション ~Azure Active Directoryとの組み合わせで実現する~リ...
 
仕組みがわかるActive Directory
仕組みがわかるActive Directory仕組みがわかるActive Directory
仕組みがわかるActive Directory
 
PostgreSQL, performance for queries with grouping
PostgreSQL, performance for queries with groupingPostgreSQL, performance for queries with grouping
PostgreSQL, performance for queries with grouping
 
Understand oracle real application cluster
Understand oracle real application clusterUnderstand oracle real application cluster
Understand oracle real application cluster
 
Complete MVC on NodeJS
Complete MVC on NodeJSComplete MVC on NodeJS
Complete MVC on NodeJS
 
Rac 12c optimization
Rac 12c optimizationRac 12c optimization
Rac 12c optimization
 
Best Practices in Handling Performance Issues
Best Practices in Handling Performance IssuesBest Practices in Handling Performance Issues
Best Practices in Handling Performance Issues
 
Spring jdbc
Spring jdbcSpring jdbc
Spring jdbc
 
Single Sign-On for APEX applications based on Kerberos (Important: latest ver...
Single Sign-On for APEX applications based on Kerberos (Important: latest ver...Single Sign-On for APEX applications based on Kerberos (Important: latest ver...
Single Sign-On for APEX applications based on Kerberos (Important: latest ver...
 
Composer 套件管理
Composer 套件管理Composer 套件管理
Composer 套件管理
 

Viewers also liked

Java Concurrency Idioms
Java Concurrency IdiomsJava Concurrency Idioms
Java Concurrency IdiomsAlex Miller
 
Polymer & the web components revolution 6:25:14
Polymer & the web components revolution 6:25:14Polymer & the web components revolution 6:25:14
Polymer & the web components revolution 6:25:14mattsmcnulty
 
Downtown & Infill Tax Increment Districts: Strategies for Success
Downtown & Infill Tax Increment Districts: Strategies for SuccessDowntown & Infill Tax Increment Districts: Strategies for Success
Downtown & Infill Tax Increment Districts: Strategies for SuccessVierbicher
 
Appraisal and Performance Management in Schools - A practical approach
Appraisal and Performance Management in Schools - A practical approachAppraisal and Performance Management in Schools - A practical approach
Appraisal and Performance Management in Schools - A practical approachMark S. Steed
 
The Economics of Green Building
The Economics of Green BuildingThe Economics of Green Building
The Economics of Green Buildingnilskok
 
Increment letter format
Increment letter formatIncrement letter format
Increment letter formatDeepti Joshi
 
Downtown & Infill Tax Increment Districts
Downtown & Infill Tax Increment DistrictsDowntown & Infill Tax Increment Districts
Downtown & Infill Tax Increment DistrictsVierbicher
 
Increment Strategy ppt 2012-13 : Play this in slide show mode
Increment Strategy ppt 2012-13 : Play this in slide show modeIncrement Strategy ppt 2012-13 : Play this in slide show mode
Increment Strategy ppt 2012-13 : Play this in slide show modeVipul Saxena
 
Lecture 8 increment_and_decrement_operators
Lecture 8 increment_and_decrement_operatorsLecture 8 increment_and_decrement_operators
Lecture 8 increment_and_decrement_operatorseShikshak
 
Scrum - Agile Methodology
Scrum - Agile MethodologyScrum - Agile Methodology
Scrum - Agile MethodologyNiel Deckx
 
Iocl compensation
Iocl compensationIocl compensation
Iocl compensationmukti91
 
Normal forest – growing stock and increment
Normal forest – growing stock and incrementNormal forest – growing stock and increment
Normal forest – growing stock and incrementiqbalforestry
 
An overview of techniques for detecting software variability concepts in sour...
An overview of techniques for detecting software variability concepts in sour...An overview of techniques for detecting software variability concepts in sour...
An overview of techniques for detecting software variability concepts in sour...Angela Lozano
 
C Prog. - Operators and Expressions
C Prog. - Operators and ExpressionsC Prog. - Operators and Expressions
C Prog. - Operators and Expressionsvinay arora
 

Viewers also liked (20)

Java Concurrency Idioms
Java Concurrency IdiomsJava Concurrency Idioms
Java Concurrency Idioms
 
Polymer & the web components revolution 6:25:14
Polymer & the web components revolution 6:25:14Polymer & the web components revolution 6:25:14
Polymer & the web components revolution 6:25:14
 
Conflict Resolution In Kai
Conflict Resolution In KaiConflict Resolution In Kai
Conflict Resolution In Kai
 
Agile Development
Agile DevelopmentAgile Development
Agile Development
 
Downtown & Infill Tax Increment Districts: Strategies for Success
Downtown & Infill Tax Increment Districts: Strategies for SuccessDowntown & Infill Tax Increment Districts: Strategies for Success
Downtown & Infill Tax Increment Districts: Strategies for Success
 
Appraisal and Performance Management in Schools - A practical approach
Appraisal and Performance Management in Schools - A practical approachAppraisal and Performance Management in Schools - A practical approach
Appraisal and Performance Management in Schools - A practical approach
 
The Economics of Green Building
The Economics of Green BuildingThe Economics of Green Building
The Economics of Green Building
 
Increment letter format
Increment letter formatIncrement letter format
Increment letter format
 
Downtown & Infill Tax Increment Districts
Downtown & Infill Tax Increment DistrictsDowntown & Infill Tax Increment Districts
Downtown & Infill Tax Increment Districts
 
Increment Strategy ppt 2012-13 : Play this in slide show mode
Increment Strategy ppt 2012-13 : Play this in slide show modeIncrement Strategy ppt 2012-13 : Play this in slide show mode
Increment Strategy ppt 2012-13 : Play this in slide show mode
 
Lecture 8 increment_and_decrement_operators
Lecture 8 increment_and_decrement_operatorsLecture 8 increment_and_decrement_operators
Lecture 8 increment_and_decrement_operators
 
String
StringString
String
 
Scrum - Agile Methodology
Scrum - Agile MethodologyScrum - Agile Methodology
Scrum - Agile Methodology
 
Iocl compensation
Iocl compensationIocl compensation
Iocl compensation
 
Incremental
IncrementalIncremental
Incremental
 
Intro To Scrum.V3
Intro To Scrum.V3Intro To Scrum.V3
Intro To Scrum.V3
 
Normal forest – growing stock and increment
Normal forest – growing stock and incrementNormal forest – growing stock and increment
Normal forest – growing stock and increment
 
Introduction to Redux
Introduction to ReduxIntroduction to Redux
Introduction to Redux
 
An overview of techniques for detecting software variability concepts in sour...
An overview of techniques for detecting software variability concepts in sour...An overview of techniques for detecting software variability concepts in sour...
An overview of techniques for detecting software variability concepts in sour...
 
C Prog. - Operators and Expressions
C Prog. - Operators and ExpressionsC Prog. - Operators and Expressions
C Prog. - Operators and Expressions
 

Similar to The Etsy Shard Architecture: Starts With S and Ends With Hard

From mysql to MongoDB(MongoDB2011北京交流会)
From mysql to MongoDB(MongoDB2011北京交流会)From mysql to MongoDB(MongoDB2011北京交流会)
From mysql to MongoDB(MongoDB2011北京交流会)Night Sailer
 
MongoDB Days Silicon Valley: MongoDB and the Hadoop Connector
MongoDB Days Silicon Valley: MongoDB and the Hadoop ConnectorMongoDB Days Silicon Valley: MongoDB and the Hadoop Connector
MongoDB Days Silicon Valley: MongoDB and the Hadoop ConnectorMongoDB
 
Outrageous Performance: RageDB's Experience with the Seastar Framework
Outrageous Performance: RageDB's Experience with the Seastar FrameworkOutrageous Performance: RageDB's Experience with the Seastar Framework
Outrageous Performance: RageDB's Experience with the Seastar FrameworkScyllaDB
 
Mysqlnd Async Ipc2008
Mysqlnd Async Ipc2008Mysqlnd Async Ipc2008
Mysqlnd Async Ipc2008Ulf Wendel
 
My sql查询优化实践
My sql查询优化实践My sql查询优化实践
My sql查询优化实践ghostsun
 
Introduction to Active Record at MySQL Conference 2007
Introduction to Active Record at MySQL Conference 2007Introduction to Active Record at MySQL Conference 2007
Introduction to Active Record at MySQL Conference 2007Rabble .
 
Kicking ass with redis
Kicking ass with redisKicking ass with redis
Kicking ass with redisDvir Volk
 
ROS2勉強会@別府 第7章Pythonクライアントライブラリrclpy
ROS2勉強会@別府 第7章PythonクライアントライブラリrclpyROS2勉強会@別府 第7章Pythonクライアントライブラリrclpy
ROS2勉強会@別府 第7章PythonクライアントライブラリrclpyAtsuki Yokota
 
Extending Moose
Extending MooseExtending Moose
Extending Moosesartak
 
Tame Accidental Complexity with Ruby and MongoMapper
Tame Accidental Complexity with Ruby and MongoMapperTame Accidental Complexity with Ruby and MongoMapper
Tame Accidental Complexity with Ruby and MongoMapperGiordano Scalzo
 
Fraud Detection and Neo4j
Fraud Detection and Neo4j Fraud Detection and Neo4j
Fraud Detection and Neo4j Max De Marzi
 
Mongodb index 讀書心得
Mongodb index 讀書心得Mongodb index 讀書心得
Mongodb index 讀書心得cc liu
 
はじめてのMongoDB
はじめてのMongoDBはじめてのMongoDB
はじめてのMongoDBTakahiro Inoue
 
What's new in Redis v3.2
What's new in Redis v3.2What's new in Redis v3.2
What's new in Redis v3.2Itamar Haber
 
gumiStudy#2 実践 memcached
gumiStudy#2 実践 memcachedgumiStudy#2 実践 memcached
gumiStudy#2 実践 memcachedgumilab
 

Similar to The Etsy Shard Architecture: Starts With S and Ends With Hard (20)

MySQL under the siege
MySQL under the siegeMySQL under the siege
MySQL under the siege
 
From mysql to MongoDB(MongoDB2011北京交流会)
From mysql to MongoDB(MongoDB2011北京交流会)From mysql to MongoDB(MongoDB2011北京交流会)
From mysql to MongoDB(MongoDB2011北京交流会)
 
Mac authentication amigopod radius
Mac authentication amigopod radiusMac authentication amigopod radius
Mac authentication amigopod radius
 
MongoDB Days Silicon Valley: MongoDB and the Hadoop Connector
MongoDB Days Silicon Valley: MongoDB and the Hadoop ConnectorMongoDB Days Silicon Valley: MongoDB and the Hadoop Connector
MongoDB Days Silicon Valley: MongoDB and the Hadoop Connector
 
Outrageous Performance: RageDB's Experience with the Seastar Framework
Outrageous Performance: RageDB's Experience with the Seastar FrameworkOutrageous Performance: RageDB's Experience with the Seastar Framework
Outrageous Performance: RageDB's Experience with the Seastar Framework
 
Mysqlnd Async Ipc2008
Mysqlnd Async Ipc2008Mysqlnd Async Ipc2008
Mysqlnd Async Ipc2008
 
My sql查询优化实践
My sql查询优化实践My sql查询优化实践
My sql查询优化实践
 
Introduction to Active Record at MySQL Conference 2007
Introduction to Active Record at MySQL Conference 2007Introduction to Active Record at MySQL Conference 2007
Introduction to Active Record at MySQL Conference 2007
 
Undrop for InnoDB
Undrop for InnoDBUndrop for InnoDB
Undrop for InnoDB
 
Kicking ass with redis
Kicking ass with redisKicking ass with redis
Kicking ass with redis
 
ROS2勉強会@別府 第7章Pythonクライアントライブラリrclpy
ROS2勉強会@別府 第7章PythonクライアントライブラリrclpyROS2勉強会@別府 第7章Pythonクライアントライブラリrclpy
ROS2勉強会@別府 第7章Pythonクライアントライブラリrclpy
 
Extending Moose
Extending MooseExtending Moose
Extending Moose
 
Tame Accidental Complexity with Ruby and MongoMapper
Tame Accidental Complexity with Ruby and MongoMapperTame Accidental Complexity with Ruby and MongoMapper
Tame Accidental Complexity with Ruby and MongoMapper
 
Web security
Web securityWeb security
Web security
 
Fraud Detection and Neo4j
Fraud Detection and Neo4j Fraud Detection and Neo4j
Fraud Detection and Neo4j
 
Mongodb workshop
Mongodb workshopMongodb workshop
Mongodb workshop
 
Mongodb index 讀書心得
Mongodb index 讀書心得Mongodb index 讀書心得
Mongodb index 讀書心得
 
はじめてのMongoDB
はじめてのMongoDBはじめてのMongoDB
はじめてのMongoDB
 
What's new in Redis v3.2
What's new in Redis v3.2What's new in Redis v3.2
What's new in Redis v3.2
 
gumiStudy#2 実践 memcached
gumiStudy#2 実践 memcachedgumiStudy#2 実践 memcached
gumiStudy#2 実践 memcached
 

Recently uploaded

Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 

Recently uploaded (20)

Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 

The Etsy Shard Architecture: Starts With S and Ends With Hard

  • 1. The Etsy Shard Architecture Starts With S and Ends With Hard jgoulah@etsy.com / @johngoulah
  • 2.
  • 3. 1.5B page views / mo. 525MM sales in 2011 40MM unique visitors/mo. 800K shops / 150 countries
  • 4.
  • 5.
  • 6. 25K+ queries/sec avg 3TB InnoDB buffer pool 15TB+ data stored 99.99% queries under 1ms
  • 7. 50+ MySQL servers Server Spec HP DL 380 G7 96GB RAM 16 spindles / 1TB RAID 10 24 Core
  • 8.
  • 9. Ross Snyder Scaling Etsy - What Went Wrong, What Went Right http://bit.ly/rpcxtP Matt Graham Migrating From PG to MySQL Without Downtime http://bit.ly/rQpqZG
  • 13. Master - Master R/W R/W
  • 14. Master - Master R/W R/W Side A Side B
  • 16. shard 1 shard 2 shard N ...
  • 17. shard 1 shard 2 shard N ... shard N + 1
  • 18. shard 1 shard 2 shard N ... Migrate Migrate Migrate shard N + 1
  • 20. tickets index shard 1 shard 2 shard N
  • 21. tickets index Unique IDs shard 1 shard 2 shard N
  • 22. tickets index Shard Lookup shard 1 shard 2 shard N
  • 23. tickets index shard 1 shard 2 shard N Store/Retrieve Data
  • 25. users_groups user_id group_id 1 A 1 B 2 A 2 C 3 A 3 B 3 C
  • 26. users_groups user_id group_id 1 A 1 B 2 A 2 C 3 A 3 B 3 C
  • 27. users_groups user_id group_id 1 A 1 B 2 A user_id group_id 2 C 3 A 3 A 3 B 3 B 3 C 3 C
  • 28. users_groups shard 1 user_id group_id 1 A 1 B shard 2 2 A user_id group_id 2 C 3 A 3 B 3 C
  • 30. Shards NOT Determined by key hashing range partitions partitioning by function
  • 32. index shard 1 shard 2 shard N
  • 33. index select shard_id from user_index where user_id = X shard 1 shard 2 shard N
  • 34. index select shard_id from user_index where user_id = X returns 1 shard 1 shard 2 shard N
  • 35. index select join_date from users where user_id = X shard 1 shard 2 shard N
  • 36. index select join_date from users where user_id = X returns 2012-02-05 shard 1 shard 2 shard N
  • 39. CREATE TABLE `tickets` ( `id` bigint(20) unsigned NOT NULL auto_increment, `stub` char(1) NOT NULL default '', PRIMARY KEY (`id`), UNIQUE KEY `stub` (`stub`) ) ENGINE=MyISAM
  • 40. Ticket Generation REPLACE INTO tickets (stub) VALUES ('a'); SELECT LAST_INSERT_ID();
  • 41. Ticket Generation REPLACE INTO tickets (stub) VALUES ('a'); SELECT LAST_INSERT_ID(); SELECT * FROM tickets; id stub 4589294 a
  • 42. tickets A auto-increment-increment = 2 auto-increment-offset = 1 tickets B auto-increment-increment = 2 auto-increment-offset = 2
  • 43. tickets A auto-increment-increment = 2 auto-increment-offset = 1 tickets B auto-increment-increment = 2 auto-increment-offset = 2 NOT master-master
  • 46. A B user_id : 500
  • 47. A B user_id : 500 % (# active replicants)
  • 48. A B 'etsy_index_A' => 'mysql:host=dbindex01.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw', 'etsy_index_B' => 'mysql:host=dbindex02.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw', 'etsy_shard_001_A' => 'mysql:host=dbshard01.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_001_B' => 'mysql:host=dbshard02.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_002_A' => 'mysql:host=dbshard03.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_002_B' => 'mysql:host=dbshard04.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_003_A' => 'mysql:host=dbshard05.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_003_B' => 'mysql:host=dbshard06.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', user_id : 500 % (# active replicants)
  • 49. A B 'etsy_index_A' => 'mysql:host=dbindex01.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw', 'etsy_index_B' => 'mysql:host=dbindex02.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw', 'etsy_shard_001_A' => 'mysql:host=dbshard01.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_001_B' => 'mysql:host=dbshard02.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_002_A' => 'mysql:host=dbshard03.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_002_B' => 'mysql:host=dbshard04.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_003_A' => 'mysql:host=dbshard05.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_003_B' => 'mysql:host=dbshard06.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', user_id : 500 % (# active replicants)
  • 50. A B user_id : 500 % (2)
  • 51. A B user_id : 500 % (2) == 0
  • 52. A B select ... user_id : 500 % (2) == 0 insert ... update ...
  • 53. A B user_id : 500 % (2) == 0 user_id : 501 % (2) == 1
  • 54. 500 A B 501 select ... select ... insert ... insert ... update ... update ... user_id : 500 % (2) == 0 user_id : 501 % (2) == 1
  • 56. A B user_id : 500 % (2) == 0 user_id : 501 % (2) == 1
  • 57. A B user_id : 500 % (2) == 0 user_id : 501 % (2) == 1
  • 58. A B user_id : 500 % (2) == 0 user_id : 501 % (2) == 1
  • 59. A B 'etsy_index_A' => 'mysql:host=dbindex01.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw', 'etsy_index_B' => 'mysql:host=dbindex02.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw', 'etsy_shard_001_A' => 'mysql:host=dbshard01.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_001_B' => 'mysql:host=dbshard02.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_002_A' => 'mysql:host=dbshard03.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_002_B' => 'mysql:host=dbshard04.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_003_A' => 'mysql:host=dbshard05.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_003_B' => 'mysql:host=dbshard06.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', user_id : 500 % (2) == 0 user_id : 501 % (2) == 1
  • 60. A B 'etsy_index_A' => 'mysql:host=dbindex01.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw', 'etsy_index_B' => 'mysql:host=dbindex02.ny4.etsy.com;port=3306;dbname=etsy_index;user=etsy_rw', 'etsy_shard_001_A' => 'mysql:host=dbshard01.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_001_B' => 'mysql:host=dbshard02.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_002_A' => 'mysql:host=dbshard03.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_002_B' => 'mysql:host=dbshard04.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_003_A' => 'mysql:host=dbshard05.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', 'etsy_shard_003_B' => 'mysql:host=dbshard06.ny4.etsy.com;port=3306;dbname=etsy_shard;user=etsy_rw', user_id : 500 % (2) == 0 user_id : 501 % (2) == 1
  • 61. A B user_id : 500 % (1) == 0 user_id : 501 % (1) == 0
  • 62. ORM
  • 63. connection handling shard lookup replicant selection
  • 64. CRUD cache handling data validation data abstraction
  • 66. Non-Writable Shards $config["non_writable_shards"] = array(1, 2, 3, 4); public static function getKnownWritableShards(){ return array_values( array_diff( self::getKnownShards(), self::getNonwritableShards() )); }
  • 67. Initial Selection $shards = EtsyORM::getKnownWritableShards(); $user_shard = $shards[rand(0, count($shards) - 1)]; user_id shard_id 500
  • 68. Initial Selection $shards = EtsyORM::getKnownWritableShards(); $user_shard = $shards[rand(0, count($shards) - 1)]; user_id shard_id 500 2
  • 69. Later.... select shard_id from user_index index where user_id = X shard 1 shard 2 shard N
  • 71. shard 1 shard 2 user_id group_id user_id group_id 1 A 3 A 1 B 3 B 2 A 4 A 2 C 5 C SELECT user_id FROM users_groups WHERE group_id = ‘A’
  • 72. shard 1 shard 2 user_id group_id user_id group_id 1 A 3 A 1 B 3 B 2 A 4 A 2 C 5 C SELECT user_id FROM users_groups WHERE group_id = ‘A’ Broken!
  • 73. shard 1 shard 2 user_id group_id user_id group_id 1 1 A B JOIN? 3 3 A B 2 A 4 A 2 C 5 C SELECT user_id FROM users_groups WHERE group_id = ‘A’ Broken!
  • 74. shard 1 shard 2 user_id group_id user_id group_id 1 1 A B JOIN? 3 3 A B 2 A 4 A 2 C 5 C SELECT user_id FROM users_groups WHERE group_id = ‘A’ Broken!
  • 75. users_groups groups_users user_id group_id group_id user_id 1 A A 1 1 B A 3 2 A A 2 2 C B 3 3 A B 1 3 B C 2 3 C C 3
  • 76. users_groups_index groups_users_index user_id shard_id group_id shard_id index 1 1 A 1 2 1 B 2 3 2 C 2 4 3 D 3 separate indexes for different slices of data
  • 77. users_groups_index groups_users_index user_id shard_id group_id shard_id index 1 1 A 1 2 1 B 2 3 2 C 2 4 3 D 3 user_id group_id shard 3 4 A 4 B 4 C 4 D
  • 79. shard 1 shard 2 shard N
  • 80. shard 1 shard 2 shard N
  • 82.
  • 83.
  • 84. shard 1 shard 2 shard N
  • 85. shard 1 shard 2 shard N SET SQL_LOG_BIN = 0; ALTER TABLE user ....
  • 87. Why?
  • 88. Prevent disk from filling
  • 89. Prevent disk from filling High traffic objects (shops, users)
  • 90. Prevent disk from filling High traffic objects (shops, users) Shard rebalancing
  • 91. When?
  • 92.
  • 95. per object migration <object type> <object id> <shard> # migrate_object User 5307827 2
  • 96. percentage migration <object type> <percent> <old shard> <new shard> # migrate_pct User 25 3 6
  • 97. index user_id shard_id migration_lock old_shard_id 1 1 0 0 shard 1 shard 2 shard N
  • 98. index user_id shard_id migration_lock old_shard_id 1 1 1 0 •Lock shard 1 shard 2 shard N
  • 99. index user_id shard_id migration_lock old_shard_id 1 1 1 0 •Lock •Migrate shard 1 shard 2 shard N
  • 100. index user_id shard_id migration_lock old_shard_id 1 1 1 0 •Lock •Migrate •Checksum shard 1 shard 2 shard N
  • 101. index user_id shard_id migration_lock old_shard_id 1 1 1 0 •Lock •Migrate •Checksum shard 1 shard 2 shard N
  • 102. index user_id shard_id migration_lock old_shard_id 1 2 0 1 •Lock •Migrate •Checksum •Unlock shard 1 shard 2 shard N
  • 103. index user_id shard_id migration_lock old_shard_id 1 2 0 1 •Lock •Migrate •Checksum •Unlock •Delete (from old shard) shard 1 shard 2 shard N
  • 106. tag1 tag2 co_occurrence _count “red” “cloth” 666
  • 107. tag1 tag2 shard_id “red” “cloth” 1 “vintage” “doll” 3 “antique” “radio” 5 “gift” “vinyl” 2 hash_bucket shard_id “toy” “car” 1 1 2 “wool” “felt” 2 “floral” “wood” “wreath” “table” 5 8 OR 2 3 3 1 “box” “wood” 4 4 2 “doll” “happy” 5 5 3 “smile” “clown” 3 “radio” “vintage” 10 “blue” “luggage” 8 “shoes” “green” 12 ... ... ...
  • 109. 1. provide some key 2. compute corresponding hash bucket
  • 110. 1. provide some key 2. compute corresponding hash bucket 3. lookup hash bucket on index to find shard
  • 111. 1,000,000 'buckets' each with a row in arbitrary_key_index which points to a shard hash_bucket shard_id 1 2 2 3 3 1 4 2 5 3 hash_bucket == hash(‘red’, ‘cloth’) % BUCKETS
  • 112. 1,000,000 'buckets' each with a row in arbitrary_key_index which points to a shard hash_bucket shard_id 1 2 2 3 3 1 4 2 5 3 hash_bucket == hash(‘red’, ‘cloth’) % BUCKETS
  • 113. 1,000,000 'buckets' each with a row in arbitrary_key_index which points to a shard hash_bucket shard_id 1 2 2 3 3 1 4 2 5 3 hash_bucket == hash(‘red’, ‘cloth’) % BUCKETS
  • 114. 1,000,000 'buckets' each with a row in arbitrary_key_index which points to a shard hash_bucket shard_id 1 2 2 3 3 1 4 2 5 3 hash_bucket == hash(‘red’, ‘cloth’) % BUCKETS
  • 116. PARTITION BY RANGE (reference_timestamp)( PARTITION P5 VALUES LESS THAN (1317441600), PARTITION P6 VALUES LESS THAN (1320120000), PARTITION P7 VALUES LESS THAN (1322715600), PARTITION P8 VALUES LESS THAN (1325394000));
  • 117. Deleting a large partition: few hours, tons of disk IO
  • 118. Deleting a large partition: few hours, tons of disk IO Dropping a 2G partition with 2M rows :
  • 119. Deleting a large partition: few hours, tons of disk IO Dropping a 2G partition with 2M rows : < 1s
  • 121. # file= "shop_stats_syndication_hourly#P#P1345867200.ibd" # ln $file $file.remove" # stat "shop_stats_syndication_hourly#P#P1345867200.ibd" File: `shop_stats_syndication_hourly#P#P1345867200.ibd' Size: 65536 Blocks: 136 IO Block: 4096 regular file Device: 6804h/26628d Inode: 41321163 Links: 2 Access: (0660/-rw-rw----) Uid: ( 104/ mysql) Gid: ( 106/ mysql)
  • 122. tickets index shard 1 shard 2 shard N