TiDB Introduction

Introducing TiDB
(For those coming from MySQL..)
Make Data Creative
Morgan Tocker (@pingcap; @morgo)
October, 2018

● History and Community
● Technical Walkthrough
● Use Case with Mobike
● Live Demo: TiDB on GKE
● MySQL Compatibility
● Q&A
Agenda

● Sr Product / Community Manager
● ~15+ years MySQL Experience
○ MySQL AB, Sun Microsystems, Percona, Oracle
● Previously Product Manager for MySQL Server
A Little About Me...

A Little About PingCAP...
● Founded in April 2015 by 3 infrastructure engineers
● TiDB platform: (Ti = Titanium)
○ TiDB (stateless SQL layer compatible with MySQL)
○ TiKV (distributed transactional key-value store)
○ TiSpark (Apache Spark plug-in on top of TiKV)
● Open source from Day 1
○ Inspired by Google Spanner / F1
○ GA 1.0: October 2017
○ GA 2.0: April 2018

● Hybrid OLTP & OLAP (Minimize ETL)
● Horizontal Scalability
● MySQL Compatible
● Distributed Transaction (ACID Compliant)
● High Availability
● Cloud-Native
TiDB Core Features

Architecture
SparkSQL
TiDB
TiDB
Worker
Spark
Driver
TiKV Cluster (Storage)
Metadata
TiKV TiKV
TiKV
Data location
Job
TiSpark
DistSQL API
TiKV
TiDB
TSO/Data location
Worker
Worker
Spark Cluster
TiDB Cluster
TiDB
DistSQL API
PD
PD Cluster
TiKV TiKV
TiDB
KV API
MySQL
MySQL
PD
PD

2018 PingCAP
Stars
● TiDB: 15,000+
● TiKV: 3700+
Contributors
● TiDB: 200+
● TiKV: 100+
Community

Early Sign-up: https://www.pingcap.com/tidb-academy/
Sneak Peek!

Platform Architecture
TiDB
TiDB
Worker
Spark
Driver
TiKV Cluster (Storage)
Metadata
TiKV TiKV
TiKV
Data location
Job
TiSpark
DistSQL API
TiKV
TiDB
TSO/Data location
Worker
Worker
Spark Cluster
TiDB Cluster
TiDB
DistSQL API
PD
PD Cluster
TiKV TiKV
TiDB
KV API
MySQL
MySQL
SparkSQL
PD
PD
SparkSQL

TiKV: The Foundation [in CNCF]
RocksDB
Raft
Transaction
Txn KV API
Coprocessor
API
RocksDB
Raft
Transaction
Txn KV API
Coprocessor
API
RocksDB
Raft
Transaction
Txn KV API
Coprocessor
API
Raft
Group
Client
gRPC
TiKV Instance TiKV Instance TiKV Instance
gRPC gRPC
PD Cluster

TiDB: OLTP + Ad Hoc OLAP
Node1 Node2 Node3 Node4
MySQL Network Protocol
SQL Parser
Cost-based Optimizer
Distributed Executor (Coprocessor)
ODBC/JDBC MySQL Client
Any ORM which
supports MySQL
TiDB
TiKV

ID Name Email
1 Edward h@pingcap.com
2 Tom tom@pingcap.com
...
user/1 Edward,h@pingcap.com
user/2 Tom,tom@pingcap.com
...
In TiKV -∞
+∞
(-∞, +∞)
Sorted map
“User” Table
TiDB: Relational -> KV
Some region...

● Hash Join (fastest; if table <= 50 million rows)
● Sort Merge Join (join on indexed column or ordered data
source)
● Index Lookup Join (join on indexed column; ideally after filter,
result < 10,000 rows)
Chosen based on Cost-based Optimizer:
Join Support
Network cost Memory cost CPU cost

TiSpark: Complex OLAP
Spark ExecSpark Exec
Spark Driver
Spark Exec
TiKV TiKV TiKV TiKV
TiSpark
TiSpark TiSpark TiSpark
TiKV
Placement
Driver (PD)
gRPC
Distributed Storage Layer
gRPC
retrieve data location
retrieve real data from TiKV

2018 PingCAP
Who’s using TiDB?
300+
Companies

2018 PingCAP
1. MySQL Scalability
2. Hybrid OLTP/OLAP Architecture
3. Unifying Data Storage/Management
Three Big Use Cases

Mobike + TiDB
● 200 million users
● 200 cities
● 9 milllion smart bikes
● ~30 TB / day

● Locking and unlocking of smart bikes generate massive data
● Smooth experience is key to user retention
● TiDB supports this system by alerting administrators when
success rate of locking/unlocking drops, within minutes
● Quickly find malfunctioning bikes
Scenario #1: Locking/Unlocking

● Synchronize TiDB with MySQL
instances using Syncer (proprietary
tool)
● TiDB + TiSpark empower real-time
analysis with horizontal scalability
● No need for Hadoop + Hive
Scenario #2: Real-Time Analysis

● An innovative loyalty program that must
be on 24 x 7 x 365
● TiDB handles:
○ High-concurrency for peak or promotional season
○ Permanent storage
○ Horizontal scalability
● No interruption as business evolves
Scenario #3: Mobike Store

● Compatible with MySQL 5.7
○ Joins, Subqueries, DML, DDL etc.
● On the roadmap:
○ Views, Window Functions, GIS
● Missing:
○ Stored Procedures, Triggers, Events
Summary
pingcap.com
/docs/sql/mysql-compatibility/

● Some features work differently
○ Auto Increment
○ Optimistic Locking
● TiDB works better with smaller
transactions
○ Recommended to batch updates, deletes,
inserts to 5000 rows
Nuanced

Thank You!
Twitter: @PingCAP; @morgo
https://github.com/pingcap
(Give us a Watch/Star!)
Morgan Tocker
(morgan@pingcap.com)
Early Sign-up:
www.pingcap.com/tidb-academy/

Index Structure
Row:
Key: tablePrefix_rowPrefix_tableID_rowID (IDs are assigned by TiDB, all int64)
Value: [col1, col2, col3, col4]
Index:
Key: tablePrefix_idxPrefix_tableID_indexID_ColumnsValue_rowID
Value: [null]
Keys are ordered by byte array in TiKV, so can support SCAN
Every key is appended a timestamp, issued by Placement Driver

● Complex calculation pushdown
● Key-range pruning
● Index support:
○ Clustered index / non-clustered index
○ Index-only query optimization
● Cost-based optimization:
○ Stats gathered from TiDB in histogram
TiSpark: Features

PD: Dynamic Split and Merge
Region A
Region A
Region B
Region A
Region A
Region B
Split
Region A
Region A
Region B
Merge
TiKV_1 TiKV_2 TiKV_2TiKV_1

PD: Hotspot Removal
*Region A*
*Region B*
Region A
Region B
Workload
*Region A*
Region B
Region A
*Region B*
Workload
Workload
Hotspot Schedule
(Raft leader transfer)
TiKV_1 TiKV_2
TiKV_2TiKV_1

Geo-Replication + Data Location
*Region A*
Region B
Region A
Region B
Seattle_1 Seattle_2
Region A
*Region B*
New York_1
*Region A*
Region B
Region A
*Region B*
Seattle_2Seattle_1
Region A
Region B
New York_1

● Timestamp Oracle service (from Google’s Percolator paper)
● 2-Phase commit protocol (2PC)
● Problem: Single point of failure
● Solution: Placement Driver HA cluster
○ Replicated using Raft
Transaction Model

● Formal proof using TLA+
○ a formal specification and verification language to reason about and prove
aspects of complex systems
● Raft
● TSO/Percolator
● 2PC
● See details: https://github.com/pingcap/tla-plus
Guaranteeing Correctness

TiDB Introduction

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to TiDB Introduction

Similar to TiDB Introduction (20)

More from Morgan Tocker

More from Morgan Tocker (20)

Recently uploaded

Recently uploaded (20)

TiDB Introduction