20141206 4 q14_dataconference_i_am_your_db

I’m your DB ( I need a database that scales )
FB/hyeongchae.lee
4Q14 DataConference.IO

I’m your DB! May the oracle be with you

Agenda
•About me
•DBMS vs NoSQL
•Local vs Global
•So... which databases scale?
•Amazon Aurora

ABOUT ME

•INERVIT : MobileLite
•nhn : CUBRID
•TELCOWARE : Telcobase
•ALTIBASE : Altibase
•TIBERO : Tibero

Global Open Frontier Full-time
•Project : MySQL Redis Plug-in ( +MariaDB, +MaxScale )
–https://github.com/sql2/MySQL_Redis_Plugin_Dev

MySQL Memcached Plug-in
[Diagram labels: Application; SQL; Memcached protocol; mysqld; MySQL Server; Handler API; Memcached plugin; innodb_memcache; local cache (optional); InnoDB API; InnoDB Storage Engine]

MySQL Redis Plug-in
[Diagram labels: Application; SQL; Redis protocol; mysqld; MySQL Server; Handler API; Redis plugin; innodb_redis; local cache (optional); InnoDB API; InnoDB Storage Engine]

2015 : MaxScale Redis Cluster Plug-in
URL : https://mariadb.com/blog/maxscale-proxy-mysql-replication-relay

DBMS VS NoSQL

DB-Engines Ranking, 2014.11.24

Rank  Last Month  DBMS                  Database Model     Score    Changes
1     1           Oracle                Relational DBMS    1452.13  -19.77
2     2           MySQL                 Relational DBMS    1279.08  +16.11
3     3           Microsoft SQL Server  Relational DBMS    1220.20  +0.59
4     4           PostgreSQL            Relational DBMS    257.36   -0.36
5     5           MongoDB               Document store     244.73   +4.33
6     6           DB2                   Relational DBMS    206.23   -1.44
7     7           Microsoft Access      Relational DBMS    138.84   -2.80
8     8           SQLite                Relational DBMS    95.28    +0.33
9     10          Cassandra             Wide column store  91.99    +6.29
10    9           Sybase ASE            Relational DBMS    84.62    -2.17

http://db-engines.com/en/ranking

http://db-engines.com/en/ranking_categories

Winner !!

Magic Quadrant for Operational Database Management Systems

1 Oracle's Letter to the EU Concerning MySQL
After an antitrust investigation, the European Commission approved Oracle's acquisition of Sun Microsystems, including MySQL, on 21 January 2010. WikiLeaks subsequently published cables indicating that the Obama administration applied pressure to the EU to approve the deal. Concerns about the MySQL acquisition had been addressed in Oracle's 14 December 2009 pledges to customers, which were to extend for five years, thus expiring in early 2015. Oracle's pledges included commitments to maintain certain APIs, extensions of licenses to then-current licensees, continued use of GPL licensing, and others. The expiration of these commitments may change the nature of Oracle's relationships with a number of hardware and software vendors, as well as its posture regarding product investment, support for purchasing requirements, and other aspects of MySQL's business model.

LOCAL VS GLOBAL

Korea vs Japan
50M vs 127M

Korea vs Japan
[Diagram labels: Slave, Slave, Master, Slave; Slave, Slave, Master, Slave x3]

KakaoTalk vs LINE

We Love FusionIO!!
•facebook/flashcache

Dolphinics’ Dolphin Interconnect Solutions

MEMSCALE

SO... WHICH DATABASES SCALE?

Read Caching
•Pros : Read-caching can take over a lot of read operations. If reads make up most of your workload, this will obviously help a lot. Even if you have a heavy write workload, read-caching might be enough to keep you from having to scale out to handle writes.
•Cons : Read-caching, by nature, involves a memory store. If your data-access patterns are really random, or involve a large percentage of records, you might wind up with a pretty expensive memory footprint. Figuring out the right cache invalidation for your app can also be really tricky. Many memory stores are pretty basic in terms of functionality: lack of support for transactions & joins can mean that you’ll need multiple process or network round-trips between the app & the cache.
http://spiegela.com/2014/04/28/but-i-need-a-database-that-scales-part-1
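As a rough illustration of the read-caching trade-off, here is a minimal sketch of a read-through cache with invalidate-on-write. The `ReadThroughCache` class, the dict standing in for the database, and the tiny capacity are all assumptions made for this example; they are not taken from the talk or from any particular cache product.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Tiny LRU read-through cache in front of a slower backing store (illustrative only)."""

    def __init__(self, backing_store, capacity=2):
        self.backing_store = backing_store  # hypothetical stand-in for the database
        self.capacity = capacity            # memory is the cost: only this many entries fit
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)     # mark as most recently used
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        value = self.backing_store[key]     # cache miss: go to the database
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry
        return value

    def put(self, key, value):
        self.backing_store[key] = value
        self.cache.pop(key, None)           # simplest invalidation: drop the stale entry


db = {"user:1": "alice", "user:2": "bob", "user:3": "carol"}
cache = ReadThroughCache(db, capacity=2)
cache.get("user:1")            # miss, loaded from db
cache.get("user:1")            # hit, served from memory
cache.put("user:1", "alicia")  # write invalidates the cached copy
print(cache.get("user:1"), cache.hits, cache.misses)
```

With a random access pattern over many keys, the hit counter stays low unless `capacity` grows, which is the memory-footprint concern from the Cons bullet.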

Write Coalescing
•Pros : In short: you can achieve better throughput of incoming writes. With many caching systems, you can also query the data in the cache, creating a set of real-time use cases including: event-processing, triggers & real-time analytics.
•Cons : Coalescing writes will inherently mean that your persistence layer is behind your ingestion layer. To take advantage of this technique, you’ll need to consider a lot of questions:
–Which data to query: cached, persisted, both?
–Does this data need to be made durable (survive a reboot)? How quickly?
–Are there consistency concerns? Unique indices? Atomic transactions?
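The questions above become concrete in even a toy implementation. The following is a minimal sketch of write coalescing, assuming an in-memory buffer in front of a dict that stands in for the persistence layer; `CoalescingWriter` and `max_pending` are hypothetical names chosen for this example.

```python
class CoalescingWriter:
    """Buffers writes in memory and persists them in batches (illustrative only)."""

    def __init__(self, store, max_pending=3):
        self.store = store              # hypothetical stand-in for the persistence layer
        self.max_pending = max_pending  # flush once this many distinct keys are buffered
        self.pending = {}
        self.flushes = 0

    def write(self, key, value):
        self.pending[key] = value       # repeated writes to one key collapse into one
        if len(self.pending) >= self.max_pending:
            self.flush()

    def read(self, key):
        # "Which data to query?": this sketch answers "both", buffer first.
        if key in self.pending:
            return self.pending[key]
        return self.store.get(key)

    def flush(self):
        if not self.pending:
            return
        self.store.update(self.pending)  # one batched write instead of many small ones
        self.pending.clear()
        self.flushes += 1


store = {}
writer = CoalescingWriter(store, max_pending=3)
for i in range(5):
    writer.write("counter", i)           # five writes, still a single pending entry
print(writer.read("counter"), store.get("counter"))  # buffer is ahead of the store
writer.flush()
print(store["counter"], writer.flushes)
```

Anything still in `pending` is lost on a crash, which is the durability question from the list; a real system would need a log or a bounded flush interval.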

Connection Scaling
•Pros : Connection scaling increases the number of concurrent connections (obviously, I think?). Its biggest benefit, though, is in reliability, since any cluster node can fail and clients can simply reconnect.
•Cons : Connection scaling requires shared storage. RAC, for example, typically uses OCFS, a clustered file-system, and SAN storage. The ability to handle more I/O transactions is dependent on scaling up that shared storage tier, which can be very expensive. Connection scaling also doesn’t help much with capacity or analysis scaling since the data is shared, not spread out across nodes.
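The "clients can simply reconnect" point can be sketched as a small failover loop. This is only an illustration: `NodeDown`, `connect_with_failover`, `fake_connect`, and the node names are invented for the example and do not correspond to any real driver API.

```python
class NodeDown(Exception):
    """Raised by the hypothetical connect() when a cluster node is unreachable."""


def connect_with_failover(nodes, connect):
    """Try each cluster node in order; return (node, connection) for the first that answers."""
    errors = []
    for node in nodes:
        try:
            return node, connect(node)
        except NodeDown as exc:
            errors.append((node, exc))  # remember the failure and try the next node
    raise NodeDown(f"no cluster node reachable: {errors}")


def fake_connect(node, down=frozenset({"rac1"})):
    """Stand-in for a real driver call; pretends that 'rac1' has failed."""
    if node in down:
        raise NodeDown(node)
    return f"connection to {node}"


node, conn = connect_with_failover(["rac1", "rac2", "rac3"], fake_connect)
print(node, conn)
```

Every node sees the same data because the storage is shared, so the client does not care which node it lands on; that same sharing is why this approach adds connections but not capacity.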

Master-Slave Replication
•Pros : While there’s some setup involved, it’s pretty seamless to your application. There’s still only a single node that has control over the data, so there are no new concerns around consistency. For read-constrained applications, nodes can be added quickly and the architecture remains relatively simple.
•Cons : MSR solves one problem: read transactions. If you need to scale other aspects, you’re not doing it here. If you need more write throughput, MSR offloads the read transactions from the master, but writes are still limited to a single node. Also, slaves can lag in their updates from the master. If you need absolute consistency between the two, you’ll need to investigate options for synchronous replication, which can impact performance of the master node.
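A minimal sketch of the read/write split behind master-slave replication follows. The `ReplicatedRouter` class, the `require_fresh` flag, and the node names are assumptions for illustration; real setups usually do this in a proxy or driver rather than by inspecting SQL strings.

```python
import itertools


class ReplicatedRouter:
    """Sends writes to the master and spreads reads over the slaves (illustrative only)."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)
        self._next_slave = itertools.cycle(self.slaves)  # simple round-robin

    def route(self, sql, require_fresh=False):
        is_read = sql.lstrip().lower().startswith("select")
        # Slaves can lag, so reads that must see the latest write go to the master.
        if is_read and self.slaves and not require_fresh:
            return next(self._next_slave)
        return self.master


router = ReplicatedRouter("master", ["slave1", "slave2"])
print(router.route("SELECT * FROM posts"))                      # a slave
print(router.route("SELECT * FROM posts"))                      # the other slave
print(router.route("INSERT INTO posts VALUES (1)"))             # always the master
print(router.route("SELECT * FROM posts", require_fresh=True))  # master, to avoid lag
```

Adding a slave only lengthens the round-robin list; every write still lands on the single master, which is the limit described in the Cons bullet.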

Vertical Partitioning ( aka cluster )
•Pros : Having smaller databases makes indices perform better, and allows you to improve just about any aspect of scaling.
•Cons : If your model requires relationships between most or all of your tables for the basic operations, vertical partitioning may not be a fit. Even when your model fits well into partitions today, having these divisions can impact the flexibility of performing joins across models in the future.
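To make the join limitation concrete, here is a small sketch of vertical partitioning as a table-to-database map. The table names, database names, and helper functions are hypothetical and chosen only for this example.

```python
# Hypothetical assignment of tables to separate databases.
TABLE_TO_DATABASE = {
    "users": "accounts_db",
    "profiles": "accounts_db",
    "orders": "sales_db",
    "invoices": "sales_db",
}


def database_for(table):
    """Return the database that owns a table."""
    return TABLE_TO_DATABASE[table]


def can_join_locally(*tables):
    """A join stays inside one database only if every table lives in the same partition."""
    return len({database_for(table) for table in tables}) == 1


print(can_join_locally("users", "profiles"))  # same partition
print(can_join_locally("users", "orders"))    # spans two partitions
```

A join that spans partitions has to be done in the application (fetch from one database, then look up in the other), which is the flexibility cost named in the Cons bullet.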

Horizontal Partitioning ( aka shard )
•Pros : This type of partitioning provides scaling for all of the elements of scale, allowing for very large data-sets and very good performance.
•Cons : Sharding can have a lot of drawbacks depending on the implementation. For one thing, the client must be aware of the partition key. When implementing sharding in MySQL, for example, an application will typically infer the partition key and address the desired partition. Increasing the number of nodes, or changing the key, requires an update to the app each time. Other trade-offs like database features are up for grabs too:
–Joins: if my data for two collections is distributed across multiple nodes, when I fetch the data back, I may need to join data across more than one, which is likely to be slower
–Transactions: if I have a transaction that involves two nodes of the cluster, how do I execute it atomically? Do I lock multiple nodes? All of them?
–Bulk commits: if I update records in bulk across multiple nodes, this is really two transactions executed separately.
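The client-side partition key logic described above can be sketched in a few lines. This example assumes a simple hash-modulo scheme, which is only one of several ways to shard; `shard_for` and `moved_fraction` are hypothetical helpers, and the key format is made up.

```python
import zlib


def shard_for(key, num_shards):
    """Map a partition key to a shard with a stable hash (CRC32 modulo shard count)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def moved_fraction(keys, old_shards, new_shards):
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards))
    return moved / len(keys)


keys = [f"user:{i}" for i in range(1000)]
print(shard_for("user:42", 4))       # the app computes this before every query
print(moved_fraction(keys, 4, 5))    # share of rows to relocate when adding a fifth shard
```

With plain modulo hashing, adding one node relocates most keys, which is one reason growing a sharded cluster tends to require application changes. Two keys that hash to different shards also cannot share a local transaction or join.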

So... which databases scale?
•Scale Out Reads
•Capacity
•Scale Out Analysis
•Scale Out Writes
•Bulk Commits
•Joins
•Transactions
•Durability
•Consistency

Scaling Storytime
•http://en.wikipedia.org/wiki/Brad_Fitzpatrick

One Server
•Simple:
[Diagram labels: Internet, Apache, MySQL]

Two Servers
•Two SPOFs
[Diagram labels: Internet, Apache, MySQL]

Five Servers
•Replication !
[Diagram labels: Internet; Apache x3; Master; Slave; read, write, replication]

More Servers
•Chaos !
[Diagram labels: Internet; Apache x6; Master; Slave x6]

Cluster vs Shard
Multi-Master  Cluster  Shard  Cluster + Shard

MySQL Recruit
•Big Table ( X )
Small Table ( O )
•Performance ( X )
Scale-up ( O ) Distributed ( O )
•Query Tuning
hard ...
•Clustering & Sharding
mission ...

AMAZON AURORA

http://www.theregister.co.uk/2014/11/26/inside_aurora_how_disruptive_is_amazons_mysql_clone/
