2. Storage is a Price Elastic Market
http://en.wikipedia.org/wiki/Alfred_Marshall
Price elasticity of demand
• Alfred Marshall (1890)
As the price of storage approaches $0
• Demand for storage will approach infinity
If the price of a Cisco router approaches $0
• Demand for routers will not approach infinity - storage is different
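The contrast between the two markets can be sketched numerically. This is only an illustration: both demand curves, their function names, and their parameters are hypothetical and are not taken from the slides or from market data.

```python
def elastic_demand(price, scale=100.0, elasticity=1.5):
    """Hypothetical constant-elasticity curve: grows without bound as price -> 0."""
    return scale * price ** (-elasticity)


def saturating_demand(price, ceiling=1000.0):
    """Hypothetical saturating curve: demand never exceeds `ceiling` units."""
    return ceiling / (1.0 + price)


# Compare the two curves as the price falls toward zero.
for price in (10.0, 1.0, 0.1, 0.01):
    print(price, round(elastic_demand(price)), round(saturating_demand(price)))
```

Under these assumptions the first curve (storage-like) keeps climbing as the price falls, while the second (router-like) levels off below its ceiling.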
"3
3. Areal Density Growth
[Figure: areal density (gigabit/in2, log scale from 0.1 to 100000) by year, 1989-2019]
• Recording technologies labeled on the curve, earliest first: Inductive Writing & Reading; Inductive Writing / MR reading; Inductive Writing / GMR reading; Perpendicular Writing & GMR; HAMR; HAMR + BPM
• Growth-rate labels on the curve: 29%, 100%, 40%
• Limit lines: Charap’s limit (broken); single particle superparamagnetic limit (estimated)
• Late 1990s – superparamagnetic limit demonstrated through modeling
• Perpendicular expected to extend to 0.5-1 Tb/in2
• Additional innovations required at that point
–heat-assisted recording
–bit patterned media recording
• Areal Density CAGR 40%
• Transfer Rate CAGR 20%
Seagate Confidential: Subject to NDA No. 77103, effective Jan. 18, 2009, and all applicable supplements
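The gap between the two growth rates can be worked through as compound-growth arithmetic. A minimal sketch, assuming drive capacity tracks areal density (same platter count and size) and using a hypothetical baseline drive of 1 TB at 100 MB/s; only the two CAGR values come from the slide.

```python
AREAL_DENSITY_CAGR = 0.40  # from the slide
TRANSFER_RATE_CAGR = 0.20  # from the slide


def compound(start, cagr, years):
    """Value after `years` of growth at a constant annual rate."""
    return start * (1.0 + cagr) ** years


def full_read_hours(capacity_tb, rate_mb_s):
    """Hours needed to read the whole drive once (decimal units)."""
    return capacity_tb * 1e6 / rate_mb_s / 3600.0


# Hypothetical baseline drive, chosen only for illustration.
base_capacity_tb, base_rate_mb_s = 1.0, 100.0

for years in (0, 5, 10):
    capacity = compound(base_capacity_tb, AREAL_DENSITY_CAGR, years)
    rate = compound(base_rate_mb_s, TRANSFER_RATE_CAGR, years)
    print(years, round(capacity, 1), round(rate), round(full_read_hours(capacity, rate), 1))
```

Because capacity compounds faster than transfer rate, the time to read a full drive grows by a factor of 1.4/1.2 each year under these assumptions.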
"4
4. Cloud Computing will increase this trend
•Jevons Paradox
• Cloud Computing increases the efficiency of computing....
http://en.wikipedia.org/wiki/Jevons_paradox
5. Cloud Computing will increase this trend
•Jevons Paradox
• Cloud Computing increases the efficiency of computing....
• Improved technology doubles the amount of information produced with a given amount of storage
• Demand for storage rises
http://en.wikipedia.org/wiki/Jevons_paradox
7. Shingled Disks
• Write head larger than read head
• Turns the disk into a sequentially written medium
• All updates to data and metadata are written sequentially to a continuous stream, called a log
• Disk API of sectors is no longer “natural”
http://www.ssrc.ucsc.edu/Papers/amer-ieeetm11.pdf
8. Log Structured Storage
How much is erased on a reposition?
• Tape - the remainder of the tape
• Shingled disk - the remainder of the track group
• Flash - the entire erase block
All persistent storage systems do or will implement a log structure
• e.g. “NoSQL Database of sectors”
Does it make sense to layer a database on top of a database?
• Could we use the log structure of the media to provide a more natural storage system, instead of mimicking an antique paradigm?
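The idea of exposing the log directly as a key/value store can be sketched as follows. This is an illustrative toy, not any vendor's design: the class name, the record layout, and the in-memory index are all assumptions, and an in-memory buffer stands in for sequentially written media.

```python
import io
import struct


class LogStore:
    """Append-only key/value log with an in-memory index (illustrative only)."""

    HEADER = struct.Struct(">II")  # key length, value length

    def __init__(self):
        self.log = io.BytesIO()  # stands in for sequentially written media
        self.index = {}          # key -> offset of the latest record for that key

    def put(self, key: bytes, value: bytes) -> None:
        # Updates never overwrite in place; they are appended to the end of the log.
        offset = self.log.seek(0, io.SEEK_END)
        self.log.write(self.HEADER.pack(len(key), len(value)) + key + value)
        self.index[key] = offset

    def get(self, key: bytes) -> bytes:
        # Follow the index to the newest record and read back its value.
        self.log.seek(self.index[key])
        key_len, value_len = self.HEADER.unpack(self.log.read(self.HEADER.size))
        self.log.seek(key_len, io.SEEK_CUR)
        return self.log.read(value_len)
```

Older versions of a key stay in the log until some form of garbage collection reclaims them, which this sketch does not attempt.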
"8
9. Single System Performance Trend
Leading to disaggregation of servers
http://web.eecs.umich.edu/~twenisch/papers/isca09-disaggregate.pdf
10. Scaling Storage
Distributed Hash Table
• Key/Value Store
• RAM: Memcached
• Flash: FAWN
• Disk: Riak
http://en.wikipedia.org/wiki/Distributed_hash_table
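One common way to spread keys across nodes in a distributed hash table is consistent hashing. The sketch below is a generic illustration of that technique, not the scheme of any system named on the slide; the class name, node names, and virtual-node count are hypothetical.

```python
import bisect
import hashlib


def _hash(text: str) -> int:
    """Map a string to a 64-bit point on the ring."""
    return int.from_bytes(hashlib.sha1(text.encode()).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring: each node owns several points (virtual nodes)."""

    def __init__(self, nodes, vnodes=64):
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # The owner is the first ring point clockwise from the key's hash.
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[i][1]


ring = HashRing(["drive-a", "drive-b", "drive-c"])
print(ring.node_for("photos/2013/cat.jpg"))
```

With this placement, adding a node only moves the keys that the new node takes over; every other key stays where it was.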
"10
11. Metadata and Metadata Servers are Evil
Required by traditional file systems (POSIX) to translate names to sectors
• Hard to scale, heavy HA requirements, expensive
Can we use a name as a key?
• Place the data into a scaled key value store?
• Eliminate costly metadata servers?
12. Cumulative operations ordered by length
[Figure: cumulative percentage of operations and of data, ordered by operation length (KB, log scale from 1 to 100000)]
• Marked at 32KB: 92% of the operations, 0.5% of the data
15. Seagate Kinetic Open Storage Platform
Disintermediates the path from applications to the drive
–Goes around file systems, volume managers, drivers
Enables an ecosystem of value-added software
–Partners (like Basho) can create their own system value
Lower TCO
–Eliminates complexity
19. [Diagram: layered stack, top to bottom]
• App: Application, Clustering, Management (proprietary to system vendor)
• LibKinetic: C++, Java, Python, Erlang, DIY (GPL)
• Interconnect: ProtoBuf, TCP/IP/GbE (standard)
• Storage (proprietary to Seagate)
20. [Diagram: layered stack with multiple Apps and multiple LibKinetic instances, top to bottom]
• App (several): Application, Clustering, Management (proprietary to system vendor)
• LibKinetic (several): GPL
• Interconnect: ProtoBuf, TCP/IP/GbE (standard)
• Storage (proprietary to Seagate)
22. System Hardware
Typical JBOD architecture
• Does not require a server, just JBODs to the ToR Switch
• 10 JBODs × 60 drives × 4TB = 2.4PB/rack
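The rack capacity figure can be checked with a few lines of arithmetic. A minimal sketch, assuming decimal units (1 PB = 1000 TB); the function name is hypothetical.

```python
def rack_capacity_pb(jbods, drives_per_jbod, tb_per_drive):
    """Raw capacity of one rack in petabytes, using decimal units."""
    return jbods * drives_per_jbod * tb_per_drive / 1000.0


# Numbers from the slide: 10 JBODs, 60 drives each, 4TB per drive.
print(rack_capacity_pb(10, 60, 4))  # 2.4
```

This is raw capacity; any space used for replication or erasure coding would reduce the usable figure.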
"18
23. Features
Provides RPC to a key/value database
• Data is pre-indexed
• Compression and other value-added features are easy and transparent
P2P (drive-to-drive) copy of key ranges
Communicates using existing data center plumbing (TCP/IP)
Multiple masters - data sharing between machines
Configurable caching per command
• Async, Sync, Flush
Local space management
24. Kinetic Systems
Clustering (performance, reliability, management)
Compatibility with large scale applications (S3, etc.)
Centralized Management
• Reliability, availability, durability
25. Lower TCO
Elimination of server layers
Fewer human requirements
Reduced mistakes
Disaggregate storage from servers
Power management
27. Goals of API
Data movement
• Get/put/delete/getnext/getprevious
• Versioned (== for success), options
Range operations
Multiple masters
• Authentication/Integrity/Authorization
Cluster-able
• Simple cluster configuration version enforcement
3rd party copy
Management
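The data-movement and range goals above can be modeled with a small in-memory toy. This is not the Kinetic API or LibKinetic: the class, method names, and error type are hypothetical, and the version check reflects one reading of "== for success", namely that a write succeeds only when the caller's expected version equals the stored one.

```python
import bisect


class VersionMismatch(Exception):
    """Raised when the caller's expected version differs from the stored one."""


class ToyDrive:
    """In-memory model of an ordered, versioned key/value store (illustrative)."""

    def __init__(self):
        self._keys = []   # keys kept in sorted order
        self._items = {}  # key -> (version, value)

    def get(self, key):
        return self._items.get(key)  # (version, value), or None if absent

    def put(self, key, value, new_version, expected_version=None):
        current = self._items.get(key)
        current_version = current[0] if current else None
        if current_version != expected_version:
            raise VersionMismatch(key)
        if current is None:
            bisect.insort(self._keys, key)
        self._items[key] = (new_version, value)

    def delete(self, key, expected_version):
        current = self._items.get(key)
        if current is None or current[0] != expected_version:
            raise VersionMismatch(key)
        del self._items[key]
        self._keys.pop(bisect.bisect_left(self._keys, key))

    def get_next(self, key):
        i = bisect.bisect_right(self._keys, key)
        return self._keys[i] if i < len(self._keys) else None

    def get_previous(self, key):
        i = bisect.bisect_left(self._keys, key)
        return self._keys[i - 1] if i > 0 else None

    def get_range(self, start, end):
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_right(self._keys, end)
        return self._keys[lo:hi]
```

Keeping keys sorted is what makes getnext, getprevious, and range operations cheap; the version check gives multiple masters a way to detect that another writer got there first.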
"22
28. Management (System Vendor)
Configures the drive
• Network
• Authorized clients
Monitors
• Health
• Statistics
• Logs
Initiates recovery
• Change cluster version
• 3rdPartyCopy
29. Data formats
Key Structure
• Variable number of octets (0-4KB)
Data Structure (Serialized to a byte stream)
• KeyOf
• Version
• E2E Data Integrity
–Algorithm name
• Data: variable length (0-xMB)
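The fields listed above can be pictured as a simple record type. This is a sketch under stated assumptions, not the actual wire format: the field names, the use of SHA-256 as the integrity algorithm, and the reading of 4KB as 4096 octets are all illustrative choices, and the data size limit is left out because the slide does not give it.

```python
import hashlib
from dataclasses import dataclass

MAX_KEY_BYTES = 4096  # assumes the slide's "0-4KB" means up to 4096 octets


@dataclass(frozen=True)
class Record:
    key: bytes
    version: bytes
    algorithm: str  # name of the end-to-end integrity algorithm
    tag: bytes      # digest of `data` computed with `algorithm`
    data: bytes


def make_record(key: bytes, version: bytes, data: bytes, algorithm: str = "sha256") -> Record:
    """Build a record, computing the integrity tag over the data."""
    if len(key) > MAX_KEY_BYTES:
        raise ValueError("key exceeds 4KB")
    tag = hashlib.new(algorithm, data).digest()
    return Record(key, version, algorithm, tag, data)


def verify(record: Record) -> bool:
    """Recompute the tag and compare it with the stored one."""
    return hashlib.new(record.algorithm, record.data).digest() == record.tag
```

Carrying the algorithm name alongside the tag lets the reader verify the data end to end without agreeing on a single fixed checksum in advance.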
"24
30. Performance Metrics
Same normal performance expectations
• Sequential Write
• Random Write
• Sequential Read
• Random Read
Iometer for key/value
33. Conclusion
Deliver more value to Seagate, Partners and Customers
• Disintermediates the path from cloud applications to the drive
• Enables innovation in the hardware and software ecosystem
• Lower TCO
Open source software
–Basho Riak, Swift, HDFS
More information
• http://seagate.com/www/kinetic
• https://developers.seagate.com
• http://github.com/Seagate