SHARE Interface in Flash Storage
for Relational and NoSQL Databases
Farzad Nozarian
Information Systems Group - Saarland University

Based on a paper with the same name

Published in: Software

  1. SHARE Interface in Flash Storage for Relational and NoSQL Databases. Farzad Nozarian, Information Systems Group, Saarland University, infosys.cs.uni-saarland.de
  2. Motivation. Illustration: amazon.com, an item at 300 $, and "ADD TO CART" buttons.
  3. Motivation: page write atomicity
  4. Guaranteeing write atomicity in databases: journaling and copy-on-write, both at the cost of write amplification.
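  5. Redundant Writes in MySQL/InnoDB. Figure: double write in MySQL/InnoDB (journaling for atomic page writes, followed by an in-place update). The DRAM buffer holds pages D’, C’, B, A’ (clean and dirty); the dirty pages A’ and C’ are written to the Double Write Buffer (DWB) and then to the database, which afterwards holds A’, C’, D, B.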
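  6. Copy-on-Write in Couchbase. Figure: tree nodes A, B, C and documents D1, D2, D3, each marked valid or stale. Updating D2 writes a new copy D2’ together with new copies C’ and A’ of tree nodes; the old document D2 and the old nodes C and A become stale. Consequences: costly compaction operation and wandering-tree write amplification.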
  7. The problem of atomic writes in flash storage
  8. Seems similar… Copy-on-write in flash; out-of-place write in database. SHARE: explicitly change the address mapping inside flash storage.
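  9. SHARE Architecture. Figure: the SHARE interface on top of the flash storage, whose page mapping table (L2P) maps each LPN to a physical address in flash memory (PPN). The example shows tree nodes A, B, C and documents D1, D2, D3, D2’ marked valid or stale, with the write amplification caused by repeated copies of D2.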
  10. The SHARE interface in pseudocode: void SHARE(double LPN1, double LPN2) { add_as_SATA_command(); if (is_available_SATA == false) use_ioctl(); bool is_batch_supported = true; }
  11. Extending MySQL/InnoDB. Steps when evicting dirty pages from the buffer: writing another copy in the DWB; then writing each page to its original location, a step that is replaced by a call to the SHARE command with the LPN pair(s).
  12. Couchbase Compaction With SHARE. Figure: File 1 holds tree nodes A, B, C, documents D1, D2, D3 and the appended D1’, B’, A’, marked valid or stale. Compaction produces File 2 with tree nodes E, F, G and documents D1’, D2, D3. The call SHARE(File2’s LPNs, File1’s LPNs) updates the page mapping table (L2P) in flash storage so that the LPNs point to the existing physical addresses in flash memory (PPN).
  13. Easy to incorporate into the existing storage interface frameworks
  14. Using SHARE with marginal code changes: 500 lines of new code; 200 lines of new code.
  15. Complicating the garbage collection process: multiple reverse mappings for each physical page
  16. No SSD memory space for extra data structures. 1GB NAND : ~1MB DRAM, used for forward mapping, cache and I/O buffers.
  17. Workloads. LinkBench: configurable benchmark for social graphs; 3 databases with page sizes of 4KB, 8KB, 16KB. YCSB: benchmark framework for cloud systems; Workload-A: 50% read, 50% update; Workload-F: 100% read-modify-write.
  18. The Effect of SHARE on Throughput. LinkBench throughput on MySQL/InnoDB in transactions per second (TPS). Figure (a), by buffer size (50MB, 100MB, 150MB): DWB-ON 241, 277, 316; SHARE 578, 617, 799. Figure (b), by page size (4KB, 8KB, 16KB): DWB-ON 241, 118, 60; SHARE 578, 271, 131.
  19. The Effect of SHARE on IO Activities. IO activities inside OpenSSD (50MB buffer cache, 4KB page), DWB-ON vs. SHARE at buffer sizes 50MB, 100MB, 150MB. Figure (a): write count (pages). Figure (b): garbage collection (GC) count. Figure (c): copy-back count (pages). Annotated values: 45%, 55%, 75%.
  20. Effect of SHARE on Workload-F. YCSB on Couchbase, DWB-ON vs. SHARE by batch size (1, 4, 16, 64, 256). Panels, Figure (a) and Figure (b): written bytes (MB) and YCSB throughput in operations per second (OPS). Throughput (OPS): DWB-ON 38.78, 218.83, 396.08, 555.6, 641.68; SHARE 133.89, 443.42, 889.34, 1192.35, 1260.93.
  21. Effect of SHARE on Workload-A and Compaction. Effect of SHARE on compaction: elapsed time 277.52 sec (Original) vs. 88.38 sec (SHARE), 3.1X; written bytes 1126.4 MB (Original) vs. 150.6 MB (SHARE), 7.5X. YCSB throughput on Couchbase in operations per second (OPS) by batch size (1, 4, 16, 64, 256): DWB-ON 118.58, 455.86, 801.15, 1033.12, 1108.71; SHARE 264.67, 856.35, 1564.41, 1767.3, 1787.21.
  22. RUMA (Rewired User-space Memory Access): rewiring the mappings from virtual to physical memory at runtime
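  23. Rewiring Memory Layers + Swapping Pages. Figure: three layers (virtual, physical, file). Address ranges [b ; b+p-1] and [b+p ; b+2p-1] correspond to file ranges [0 ; p-1] and [p ; 2p-1] through mmap(). Virtual pages vpage0 and vpage1 are backed by physical pages ppage42 and ppage7, and further mmap() calls swap the two pages.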
  24. Conclusion and future work. Write atomicity with journaling and copy-on-write reduces SSD performance and lifespan. SHARE allows applications to change the address mapping in the FTL. MySQL and Couchbase were extended to exploit SHARE: write atomicity almost at no cost! Future work: extending PostgreSQL and SQLite, Ext4, …
  25. Performing costly operations with minimal writes
  26. Just MySQL and Couchbase? …
  27. Related Work: JFTL (2009), X-FTL (2013), RUMA (2016). Keywords on the slide: FTL for journal mode; closest approach to SHARE; tailored to journaling FS; copy-on-write mechanism; transactional atomicity; update-in-place.
  28. Complicating the garbage collection process: multiple reverse mappings for each physical page. Figure: logical-to-physical and physical-to-logical mappings for physical page (PPN) #12.
  29. The Effect Of SHARE On Tail Tolerance. Distribution of LinkBench transaction latency (in milliseconds):

      I/O type  Transaction      DWB-On Mean   DWB-On P50   DWB-On Max   SHARE Mean    SHARE P50    SHARE Max
      Read      Get_Node                51.4           12       1363.3         23.9           10        901.1
      Read      Count_Link              32.8            5       1244.4         14.4            5        747.4
      Read      Multiget_Link           40.7            5       1573.5         15.2            6        313.3
      Read      Get_Link_List           39.4            5      17467.4         17.2            5       6140.7
      Write     Add_Node                 6.3          0.4       1521.0          1.5          0.3        554.6
      Write     Update_Node             64.0           15       2071.4         28.3           14        823.8
      Write     Delete_Node             62.6           13       1104.6         26.3           12        596.7
      Write     Add_Link               119.3           40       2248.2         49.7           27        730.4
      Write     Delete_Link             70.5           16       1417.6         30.3           14       1132.8
      Write     Update_Link            114.9           38       2270.5         49.4           26       1102.5
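The double write on slide 5 can be sketched as a simple page-write count. This is a toy model and not InnoDB code: `flush_with_dwb` and the dict-based database and DWB are hypothetical names introduced only for illustration.

```python
def flush_with_dwb(dirty_pages, database, dwb):
    """Flush dirty pages with a double write: first to the DWB, then in place.

    Returns the number of page writes issued to storage.
    """
    writes = 0
    # Step 1: journal every dirty page into the double write buffer.
    for page_id, content in dirty_pages.items():
        dwb[page_id] = content
        writes += 1
    # Step 2: overwrite each page at its original location.
    for page_id, content in dirty_pages.items():
        database[page_id] = content
        writes += 1
    return writes


database = {"A": "A", "B": "B", "C": "C", "D": "D"}
dwb = {}
page_writes = flush_with_dwb({"A": "A'", "C": "C'"}, database, dwb)
print(page_writes)  # 4: two dirty pages cost four page writes
```

Every dirty page is written twice, which is the redundancy the slide title refers to.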
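The copy-on-write update on slide 6 can be sketched in the same spirit. The list-of-entries file layout and the `cow_update` helper are assumptions made for this sketch; only the node and document names come from the slide.

```python
def cow_update(file_entries, touched_names):
    """Append new copies of the touched entries and mark the old ones stale.

    Returns the number of entries appended for this single update.
    """
    appended = []
    for entry in file_entries:
        if entry["valid"] and entry["name"] in touched_names:
            entry["valid"] = False  # the old copy stays in the file as garbage
            appended.append({"name": entry["name"] + "'", "valid": True})
    file_entries.extend(appended)
    return len(appended)


names = ["D1", "D2", "D3", "B", "C", "A"]
file_entries = [{"name": n, "valid": True} for n in names]

# Updating document D2 also rewrites tree nodes C and A (wandering tree).
appended = cow_update(file_entries, {"D2", "C", "A"})
stale = [e["name"] for e in file_entries if not e["valid"]]
print(appended, stale)  # 3 ['D2', 'C', 'A']
```

One logical document update appends three entries and leaves three stale ones behind, which is what later makes compaction necessary.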
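A minimal sketch of the idea behind slides 9 to 11, assuming that SHARE(LPN1, LPN2) remaps LPN1 onto the physical page currently mapped by LPN2. `FlashModel`, `flush_with_share` and the DWB base address 1000 are hypothetical; the model does not attempt to show how the command is issued over SATA or ioctl.

```python
class FlashModel:
    """Toy page-mapping FTL for illustration; not the OpenSSD firmware."""

    def __init__(self):
        self.l2p = {}         # logical page number (LPN) -> physical page number (PPN)
        self.flash = []       # flash[PPN] holds the page content
        self.page_writes = 0  # physical pages programmed so far

    def write(self, lpn, data):
        # Flash writes out of place: program a fresh page, then update L2P.
        self.flash.append(data)
        self.l2p[lpn] = len(self.flash) - 1
        self.page_writes += 1

    def read(self, lpn):
        return self.flash[self.l2p[lpn]]

    def share(self, pairs):
        # Assumed semantics: for each (lpn1, lpn2), lpn1 is remapped to the
        # physical page lpn2 points to. No physical page is programmed.
        for lpn1, lpn2 in pairs:
            self.l2p[lpn1] = self.l2p[lpn2]


def flush_with_share(ssd, dirty_pages, dwb_base):
    """Write each dirty page once into the DWB area, then SHARE it home."""
    pairs = []
    for i, (home_lpn, data) in enumerate(dirty_pages.items()):
        ssd.write(dwb_base + i, data)
        pairs.append((home_lpn, dwb_base + i))
    ssd.share(pairs)  # one batched call carrying all LPN pairs


ssd = FlashModel()
ssd.write(0, "A")
ssd.write(2, "C")
before = ssd.page_writes
flush_with_share(ssd, {0: "A'", 2: "C'"}, dwb_base=1000)
print(ssd.read(0), ssd.read(2), ssd.page_writes - before)  # A' C' 2
```

In this model two dirty pages cost two page writes instead of four, because the second write to the original location becomes a mapping change.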
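The compaction on slide 12 can be contrasted with a copy-based baseline. This is a sketch under stated assumptions: the two helper functions, the LPN numbering and the base address 100 for File 2 are invented for illustration and do not reflect Couchbase internals.

```python
def compact_by_copy(l2p, flash, valid_lpns, new_file_base):
    """Baseline: read every valid page and write it again into the new file."""
    pages_written = 0
    for i, old_lpn in enumerate(valid_lpns):
        flash.append(flash[l2p[old_lpn]])        # physical copy of the page
        l2p[new_file_base + i] = len(flash) - 1
        pages_written += 1
    return pages_written


def compact_by_share(l2p, valid_lpns, new_file_base):
    """SHARE-style: point the new file's LPNs at the old file's physical pages."""
    for i, old_lpn in enumerate(valid_lpns):
        l2p[new_file_base + i] = l2p[old_lpn]    # mapping change only
    return 0


flash = ["D1", "D2", "D3", "D1'"]   # File 1; D1 is stale after the update
l2p = {0: 0, 1: 1, 2: 2, 3: 3}
valid = [3, 1, 2]                    # LPNs of D1', D2, D3

copied = compact_by_copy(dict(l2p), list(flash), valid, new_file_base=100)
l2p_shared = dict(l2p)
shared = compact_by_share(l2p_shared, valid, new_file_base=100)
file2 = [flash[l2p_shared[100 + i]] for i in range(len(valid))]
print(copied, shared)  # 3 0
```

Both variants leave File 2 readable as D1’, D2, D3; only the copy-based one programs new pages for the documents.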
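Slides 15 and 28 note that sharing complicates garbage collection, because one physical page can now have several logical owners. A hedged sketch of that bookkeeping follows; `build_reverse_map`, `relocate` and the concrete LPN and PPN values (apart from PPN #12, which appears on slide 28) are illustrative assumptions.

```python
from collections import defaultdict


def build_reverse_map(l2p):
    """Derive the physical-to-logical map; a PPN may have several LPNs."""
    p2l = defaultdict(set)
    for lpn, ppn in l2p.items():
        p2l[ppn].add(lpn)
    return p2l


def relocate(l2p, p2l, old_ppn, new_ppn):
    """GC moves a valid page: every LPN sharing old_ppn must follow it."""
    owners = p2l.pop(old_ppn, set())
    for lpn in owners:
        l2p[lpn] = new_ppn
    p2l[new_ppn] |= owners
    return len(owners)


# LPN 7 and LPN 1000 share physical page 12 after a SHARE call.
l2p = {7: 12, 1000: 12, 8: 13}
p2l = build_reverse_map(l2p)
updated = relocate(l2p, p2l, old_ppn=12, new_ppn=40)
print(updated, l2p)  # 2 {7: 40, 1000: 40, 8: 13}
```

Without sharing, each physical page has exactly one owner, so a single reverse pointer per page is enough; with sharing, the collector has to find and update all of them.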
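The "1GB NAND : ~1MB DRAM" ratio on slide 16 is consistent with a plain page-level forward mapping table. The arithmetic below assumes 4KB pages and 4-byte mapping entries; neither value is stated on the slide.

```python
def mapping_table_bytes(nand_bytes, page_bytes=4096, entry_bytes=4):
    """Size of a flat L2P table with one fixed-size entry per flash page."""
    return (nand_bytes // page_bytes) * entry_bytes


GIB = 1024 ** 3
MIB = 1024 ** 2
table_size = mapping_table_bytes(GIB)
print(table_size // MIB, "MiB")  # 1 MiB
```

Under these assumptions the forward mapping alone accounts for the stated DRAM budget, which leaves little room for an additional reverse-mapping structure.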
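The improvement factors on slides 18 and 21 can be recomputed from the reported numbers. The dictionary layout and variable names are ours; the values are copied from the slides.

```python
# LinkBench throughput in TPS by buffer size: (DWB-ON, SHARE), from slide 18.
linkbench_tps = {"50MB": (241, 578), "100MB": (277, 617), "150MB": (316, 799)}
speedups = {
    size: round(share / dwb_on, 2) for size, (dwb_on, share) in linkbench_tps.items()
}

# Couchbase compaction, from slide 21: Original vs. SHARE.
time_ratio = round(277.52 / 88.38, 1)    # elapsed time in seconds
bytes_ratio = round(1126.4 / 150.6, 1)   # written bytes in MB

print(speedups)                 # {'50MB': 2.4, '100MB': 2.23, '150MB': 2.53}
print(time_ratio, bytes_ratio)  # 3.1 7.5
```

The two compaction ratios match the 3.1X and 7.5X annotations on slide 21.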
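The page swap on slide 23 can be approximated with Python's `mmap` module. This is only a rough sketch: Python cannot pin virtual addresses, so the variable names `vpage0` and `vpage1` stand in for the two virtual pages, and an ordinary temporary file plays the role of the file layer. `swap_demo` is a hypothetical helper.

```python
import mmap
import tempfile

PAGE = mmap.ALLOCATIONGRANULARITY  # mmap offsets must be multiples of this


def swap_demo():
    """Map two file pages, then swap them by remapping instead of copying."""
    with tempfile.TemporaryFile() as f:
        f.write(b"A" * PAGE + b"B" * PAGE)
        f.flush()

        vpage0 = mmap.mmap(f.fileno(), PAGE, offset=0)
        vpage1 = mmap.mmap(f.fileno(), PAGE, offset=PAGE)
        before = (vpage0[:1], vpage1[:1])
        vpage0.close()
        vpage1.close()

        # Rewire: the same names now map the opposite file offsets.
        vpage0 = mmap.mmap(f.fileno(), PAGE, offset=PAGE)
        vpage1 = mmap.mmap(f.fileno(), PAGE, offset=0)
        after = (vpage0[:1], vpage1[:1])
        vpage0.close()
        vpage1.close()
    return before, after


before, after = swap_demo()
print(before, after)  # (b'A', b'B') (b'B', b'A')
```

No page content is copied during the swap; only the mappings change, which is the parallel to SHARE drawn on slide 22.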
