Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

How to migrate_to_sharding_with_spider

302 views

Published on

Describing about how to migrate to sharding environment using Spider.

Published in: Data & Analytics
  • Login to see the comments

  • Be the first to like this

How to migrate_to_sharding_with_spider

  1. 1. How to migrate to sharding with Spider Kentoku SHIBA
  2. 2. 1. What is SPIDER? 2. Why SPIDER? what SPIDER can do for you? 3. How to migrate to sharding using Replication 4. How to migrate to sharding using Trigger 5. How to migrate to sharding using Spider function 6. How to migrate to sharding using Vertical Partitioning Storage Engine Agenda
  3. 3. What is Spider
  4. 4. What is the Spider Storage Engine? Spider is a sharding solution and proxying solution. Spider Storage Engine is a plugin of MariaDB/MySQL. Spider tables can be used to federate from other servers MariaDB/MySQL/OracleDB tables as if they stand on local server. And Spider can create database sharding by using table partitioning feature.
  5. 5. What is the Spider Storage Engine? 1.request 2. Execute SQL 4.response AP All databases can be used as ONE database through Spider. APAP AP AP SPIDER (MariaDB/MySQL) MariaDB tbl_a MySQL tbl_b SPIDER (MariaDB/MySQL) SPIDER (MariaDB/MySQL) OracleDB tbl_c 3. Distributed SQL3. Distributed SQL 3. Distributed SQL
  6. 6. What is the Spider Storage Engine? Spider is bundled in MariaDB from 10.0 and all patches for MariaDB is applied in 10.3
  7. 7. Why SPIDER? What SPIDER can do for you?
  8. 8. Why Spider? What Spider can do for you? For federation You can attach tables from other servers or from local server by using Spider. For sharding You can divide huge tables and huge traffics to multiple servers by using Spider.
  9. 9. Why Spider? What Spider can do for you? Cross shard join You can join all tables by using Spider, even if tables are on different servers.
  10. 10. simple sharding solution Join operation with simple sharding solution (without Spider) DB1 tbl_a1 1.Request 2. Execute SQL with JOIN 3.Response DB2 AP Join operation requires that all joined tables are on same server. APAP AP AP tbl_a2tbl_b1 tbl_b2
  11. 11. Join operation with Spider 1.request 2. Execute SQL with JOIN 3.response AP You can JOIN all tables, even if tables are on different servers. APAP AP AP SPIDER (MariaDB/MySQL) DB1 tbl_a1 DB2 tbl_a2tbl_b1 tbl_b2
  12. 12. Why Spider? What Spider can do for you? Join push down If it is possible, Spider executes JOIN operation at data node directly.
  13. 13. JOIN push down 1.request 2. Execute SQL with JOIN 3.response AP If all tables are on same data node, Spider executes JOIN operation on data node directly. APAP AP AP SPIDER (MariaDB/MySQL) DB1 tbl_a DB2 tbl_ctbl_b tbl_d
  14. 14. JOIN push down Simple join operation are two times faster on simple JOIN pushdown test. Also, in this pushdown of JOIN, when aggregate functions are included in the query, since the aggregation processing is also executed at the data node, the amount of data transfer is greatly reduced and it becomes super high speed.
  15. 15. How to migrate to sharding with Spider using Replication
  16. 16. Initial Structure There is 1 MariaDB server without Spider. DB1 tbl_a Create table tbl_a ( col_a int, col_b int, primary key(col_a) ) engine = InnoDB;
  17. 17. Step 1 (for migrating) Create table on DB3 and DB4. Then create Spider table on DB2. DB1 tbl_a DB3 tbl_a col_a%2=1col_a%2=0 DB2 DB4 tbl_a Create table tbl_a ( col_a int, col_b int, primary key(col_a) ) engine = Spider Connection ‘ table “tbl_a”, user “user”, password “pass” ‘ partition by list( mod(col_a, 2)) ( partition pt1 values in(0) comment ‘host “DB3”’, partition pt2 values in(1) comment ‘host “DB4”’ ); tbl_a
  18. 18. Step 2 DB1 tbl_a DB3 tbl_a col_a%2=1col_a%2=0 DB2 DB4 tbl_a Copy table data from DB1 to DB2. (Use mysqldump with “--master-data = 1 or 2” option) tbl_a
  19. 19. Step 3 Start replication from DB1 to DB2. Wait for resolving replication delay. DB1 tbl_a DB3 tbl_a col_a%2=1col_a%2=0 DB2 DB4 tbl_a tbl_a replication
  20. 20. Step 4 Stop client access for DB1. Wait for resolving replication delay. Switch client access from DB1 to DB2. DB1 tbl_a DB3 tbl_a col_a%2=1col_a%2=0 DB2 DB4 tbl_a tbl_a replication
  21. 21. Finish Stop replication on DB2. Remove DB1. DB3 tbl_a col_a%2=1col_a%2=0 DB2 DB4 tbl_a tbl_a
  22. 22. Pros and Cons of Replication way Pros 1. No need to manage lock size for coping. 2. Support non primary key table. Cons 1. Need to stop writing.
  23. 23. How to migrate to sharding with Spider using Trigger
  24. 24. Initial Structure There is 1 MariaDB server without Spider. DB1 tbl_aCreate table tbl_a ( col_a int, col_b int, primary key(col_a) ) engine = InnoDB;
  25. 25. Step 1 (for migrating) Create table on DB2 and DB3. Then create Spider table on DB1. DB1 tbl_a DB2 tbl_a col_a%2=1col_a%2=0 DB3 tbl_a tbl_a2 Create table tbl_a2 ( col_a int, col_b int, primary key(col_a) ) engine = Spider Connection ‘ table “tbl_a”, user “user”, password “pass” ‘ partition by list( mod(col_a, 2)) ( partition pt1 values in(0) comment ‘host “DB2”’, partition pt2 values in(1) comment ‘host “DB3”’ );
  26. 26. Step 2 Create triggers on DB1. (For copying insert, update and delete. If you use “truncate” for tbl_a, you should better to use other way) DB1 tbl_a DB2 tbl_a col_a%2=1col_a%2=0 DB3 tbl_a tbl_a2 delimiter | create trigger tbl_a_i after insert on tbl_a for each row insert into tbl_a2 (a,b) values (new.a, new.b); | create trigger tbl_a_u after update on tbl_a for each row update tbl_a2 set a = new.a, b = new.b where a = old.a; | create trigger tbl_a_d after delete on tbl_a for each row delete from tbl_a2 where a = old.a; | delimiter ;
  27. 27. Step 3 DB1 tbl_a DB2 tbl_a col_a%2=1col_a%2=0 DB3 tbl_a tbl_a2 Insert select from tbl_a to tbl_a2. (Please take care of locking time for tbl_a and tbl_a2.)
  28. 28. Step 4 DB2 tbl_a col_a%2=1col_a%2=0 DB3 tbl_a Rename table from tbl_a2 to tbl_a. Rename table tbl_a to tbl_a3, tbl_a2 to tbl_a; DB1 tbl_a3 tbl_a
  29. 29. Finish DB1 DB2 tbl_a col_a%2=1col_a%2=0 DB3 tbl_atbl_a Drop table tbl_a3.
  30. 30. Pros and Cons of Trigger way Pros 1. No need to stop services. 2. Easy to copy.(Simple command) Cons 1. Impossible to support truncate. 2. Need to manage lock size at coping. 3. Impossible to support non primary key.
  31. 31. How to migrate to sharding with Spider using Spider function
  32. 32. Initial Structure There is 1 MariaDB server without Spider. DB1 tbl_aCreate table tbl_a ( col_a int, col_b int, primary key(col_a) ) engine = InnoDB;
  33. 33. Step 1 (for migrating) Create table on DB2 and DB3. Then create Spider table on DB1. DB1 tbl_a DB2 tbl_a col_a%2=1col_a%2=0 DB3 tbl_a tbl_a2 Create table tbl_a2 ( col_a int, col_b int, primary key(col_a) ) engine = Spider Connection ‘ table “tbl_a”, user “user”, password “pass” ‘ partition by list( mod(col_a, 2)) ( partition pt1 values in(0) comment ‘host “DB2”’, partition pt2 values in(1) comment ‘host “DB3”’ );
  34. 34. Step 2 Create tables on DB1. DB1 tbl_a DB2 tbl_a col_a%2=1col_a%2=0 DB3 tbl_a tbl_a2Create table tbl_a4 ( col_a int, col_b int, primary key(col_a) ) engine = Spider Connection ‘ host “localhost” table “tbl_a3 tbl_a2”, lst “0 2”, user “user”, password “pass” ‘; tbl_a4 tbl_a3 Create table tbl_a3 ( col_a int, col_b int, primary key(col_a) ) engine = InnoDB;
  35. 35. Step 3 Rename table on DB1. DB1 tbl_a3 DB2 tbl_a col_a%2=1col_a%2=0 DB3 tbl_a tbl_a2 Rename table tbl_a3 to tbl_a5, tbl_a to tbl_a3, tbl_a4 to tbl_a; tbl_a tbl_a5
  36. 36. Step 4 Copy data on DB1. DB1 tbl_a3 DB2 tbl_a col_a%2=1col_a%2=0 DB3 tbl_a tbl_a2 Select spider_copy_table(‘tbl_a’, ‘’, ‘’); tbl_a tbl_a5
  37. 37. Step 5 Rename table on DB1. DB1 tbl_a3 DB2 tbl_a col_a%2=1col_a%2=0 DB3 tbl_a tbl_aRename table tbl_a2 to tbl_a6, tbl_a5 to tbl_a2, tbl_a to tbl_a7, tbl_a6 to tbl_a; tbl_a7 tbl_a2
  38. 38. Finish DB1 DB2 tbl_a col_a%2=1col_a%2=0 DB3 tbl_atbl_a Drop table tbl_a2, tbl_a3 and tbl_a7.
  39. 39. Pros and Cons of Spider function way Pros 1. No need to stop services. 2. Easy to copy.(Simple command. Lock size is managed by Spider) Cons 1. Impossible to support non primary key.
  40. 40. How to migrate to sharding with Spider using Vertical Partitioning Storage Engine
  41. 41. Initial Structure There is 1 MariaDB server without Spider. DB1 tbl_a Create table tbl_a ( col_a int, col_b int, primary key(col_a), key idx2(col_b) ) engine = InnoDB;
  42. 42. Step 1 (for migrating) Create table on DB2 and DB3. Then create tables on DB1. DB1 tbl_a DB2 col_a%2=1col_a%2=0 DB3 Create table tbl_pk ( col_a int, primary key(col_a) ) engine = Spider Connection ‘ table “tbl_pk”, user “user”, password “pass” ‘ partition by list( mod(col_a, 2)) ( partition pt1 values in(0) comment ‘host “DB2”’, partition pt2 values in(1) comment ‘host “DB3”’ ); tbl_pk tbl_pk tbl_pk
  43. 43. Step 2 Create table on DB4 and DB5. Then create tables on DB1. DB1 tbl_a DB2 col_a%2=1col_a%2=0 DB3 Create table tbl_a3 ( col_a int, col_b int, key idx1(col_a), key idx2(col_b) ) engine = Spider Connection ‘ table “tbl_a2”, user “user”, password “pass” ‘ partition by list( mod(col_b, 2)) ( partition pt1 values in(0) comment ‘host “DB4”’, partition pt2 values in(1) comment ‘host “DB5”’ ); tbl_pk tbl_a2 tbl_pk tbl_pk DB4 col_b%2=1col_b%2=0 DB5 tbl_a2 tbl_a2
  44. 44. Step 3 Create tables on DB1. DB1 tbl_a DB2 col_a%2=1col_a%2=0 DB3 Create table tbl_a3 ( col_a int, col_b int, primary key(col_a), key idx2(col_b) ) engine = VP Comment ‘ ctm “1”, ist “1”, pcm “1”, tnl “tbl_a4 tbl_pk tbl_a2” ‘; tbl_a3 tbl_pk tbl_a2 tbl_pk tbl_pk DB4 col_b%2=1col_b%2=0 DB5 tbl_a2 tbl_a2 Create table tbl_a4 ( col_a int, col_b int, primary key(col_a) ) engine = InnoDB; tbl_a4
  45. 45. Step 4 Rename tables on DB1. DB1 DB2 col_a%2=1col_a%2=0 DB3 tbl_a tbl_pk tbl_a2 tbl_pk tbl_pk DB4 col_b%2=1col_b%2=0 DB5 tbl_a2 tbl_a2 Rename table tbl_a4 to tbl_a5, tbl_a to tbl_a4, tbl_a3 to tbl_a; tbl_a5 tbl_a4
  46. 46. Step 5 Copy data from tbl_a4 to tbl_pk and tbl_a2 on DB1. DB1 DB2 col_a%2=1col_a%2=0 DB3 tbl_a tbl_pk tbl_a2 tbl_pk tbl_pk DB4 col_b%2=1col_b%2=0 DB5 tbl_a2 tbl_a2 Select vp_copy_tables(‘table_a’, ‘tbl_a4’, ‘tbl_pk tbl_a2’); tbl_a5 tbl_a4
  47. 47. Step 6 Alter table tbl_a on DB1. DB1 DB2 col_a%2=1col_a%2=0 DB3 tbl_a tbl_pk tbl_a2 tbl_pk tbl_pk DB4 col_b%2=1col_b%2=0 DB5 tbl_a2 tbl_a2Alter table tbl_a comment ‘ pcm “1”, tnl “tbl_pk tbl_a2” ‘; tbl_a5 tbl_a4
  48. 48. Finish Drop table tbl_a on DB1. DB1 DB2 col_a%2=1col_a%2=0 DB3 tbl_a tbl_pk tbl_a2 tbl_pk tbl_pk DB4 col_b%2=1col_b%2=0 DB5 tbl_a2 tbl_a2 Drop table tbl_a4, tbl_a5;
  49. 49. Pros and Cons of VP way Pros 1. No need to stop services. 2. Support spiltting by non unique columns. 3. Easy to copy.(Simple command. Lock size is managed by VP) Cons 1. VP storage engine is required. 2. Impossible to support non primary key.
  50. 50. Thank you for taking your time!!

×