Cloud Native Night February 2020, Munich: Talk by Franz Wimmer (@zalintyre, Software Engineer at QAware)
Abstract: Most IT systems rely on some sort of persistent storage. This problem was solved a long time ago, and the market niches seem to be filled. In this field, CockroachDB declares itself to be "resilient, horizontal scale across multiple clouds with always-on availability and data partitioned by location". Because databases like PostgreSQL or MySQL already offer high-availability features, we will discuss whether there is a need for a new HA database at all. We will learn about the features, upsides and downsides, distribution, and resiliency of CockroachDB. CockroachDB can be used with a PostgreSQL driver, which enables existing projects to use it out of the box. We will examine whether this really is that easy and which obstacles you might need to overcome. We will also look at whether CockroachDB is consistent, available, and partition tolerant at the same time, as claimed on its website.
2. Franz Wimmer
Software Engineer
2019
2018
2017
- No Backend, no Problem! Static Websites with Jekyll
- Offensive Security – The Metasploit Framework
@QAware
- Evaluating private APIs with Apache Ignite
@MRMCD, Darmstadt
- Leveraging the power of SolrCloud and Spark with OpenShift
@Munich Kubernetes / Cloud Native Meetup
- Ransomware vs. Antivirus
@MRMCD, Darmstadt
5. Many database systems offer some sort of distribution or replication. Choose wisely:
Distribution / Sharding:
Data is partitioned across multiple nodes.
Less availability in case of outages.
Reads are quite fast and easy to achieve.
Replication:
Data is copied to multiple nodes.
More availability in case of outages.
Writes are difficult to replicate consistently.
Examples:
MySQL Master / Slave (replicated, slave read only)
PostgreSQL High Availability / Load Balancing / Replication (replicated)
MongoDB, CouchDB (Shards & Replicas)
Solr Cloud (Shards & Replicas)
Introduction – Distributed / Replicated Databases
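The trade-off on this slide can be sketched in a few lines of Python. This is only a toy model, not how any of the listed databases place data: the node names, the hash-based placement, and the helper names (`shard_owner`, `replica_owners`, `readable`) are assumptions made for illustration.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical node names


def stable_hash(key: str) -> int:
    """Hash that is stable across runs (Python's built-in hash() is salted)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


def shard_owner(key: str, nodes=NODES) -> str:
    """Sharding only: each key lives on exactly one node."""
    return nodes[stable_hash(key) % len(nodes)]


def replica_owners(key: str, nodes=NODES, copies: int = 3) -> list:
    """Sharding plus replication: the key lives on `copies` consecutive nodes."""
    start = stable_hash(key) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]


def readable(key: str, offline: set, copies: int) -> bool:
    """A key stays readable while at least one node holding a copy is online."""
    owners = replica_owners(key, copies=copies)
    return any(node not in offline for node in owners)


if __name__ == "__main__":
    key = "user:42"
    down = {shard_owner(key)}  # take the key's primary node offline
    print("sharded only:", readable(key, down, copies=1))
    print("with 3 replicas:", readable(key, down, copies=3))
```

With a single copy, losing the owning node makes the key unreadable; with three copies the same outage is survivable, at the price of having to keep the copies consistent on every write.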
7. “In a network subject to communication failures, it is impossible for any web service to implement an atomic read/write shared memory that guarantees a response to every request” [1]
[1] Gilbert, Seth, and Nancy Lynch. "Perspectives on the CAP Theorem." Computer 45.2 (2012): 30-36.
8. The CAP Theorem
Consistency (C): All nodes see the exact same copies of data at a given time.
Availability (A): The system is still available, even if nodes or communication paths go offline.
Partition tolerance (P): The system works even if messages are lost. It can deal with the network splitting up into several partitions.
9. AP: DNS
Highly available
Arbitrary servers can go offline
Consistency takes a long time (up to 24 hours)
CP: (Online) Banking
Consistency is key, even when the network is down
Availability is secondary
CA: RDBMS with highly available servers and networks
Consistent (transactions) and highly available
Fails when outages occur
Examples
11. “CockroachDB chooses consistency”
As a CP system, is CockroachDB not available? No!
Every piece of data (“range”) is replicated to at least 3 nodes.
In this setup, a copy of the data survives even if up to 2 nodes go offline / are partitioned away.
But: Writes require a majority of replicas to be available, so with 3 replicas only 1 may be missing.
CockroachDB and the CAP Theorem
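The majority rule for writes comes down to simple arithmetic. The following is a minimal sketch of that arithmetic only; the function names are hypothetical and nothing here calls into CockroachDB.

```python
def quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas may be unreachable while a majority remains."""
    return replicas - quorum(replicas)


def can_write(replicas: int, reachable: int) -> bool:
    """A write can be acknowledged only if a majority is reachable."""
    return reachable >= quorum(replicas)


if __name__ == "__main__":
    for n in (3, 5, 7):
        print(n, "replicas: quorum", quorum(n), "tolerates", tolerated_failures(n))
```

For 3 replicas the quorum is 2, so writes continue with one replica missing and stall with two missing. Going from 3 to 5 replicas raises the number of tolerated failures from 1 to 2.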
12. In the cloud, communication failures are
common. So basically, you have the choice
between CP and AP.
The CAP Theorem (2)
[Figure: CAP triangle (Consistency, Availability, Partition tolerance), annotated with 2PC, Gossip, and Consensus]
15. CockroachDB is …
A distributed SQL database
Every SQL query operates on key-value data.
Data is persisted using the RocksDB key-value store.
Raft is the central consensus algorithm to manage the database cluster.
CockroachDB architecture defines layers:
SQL
Transaction
Distribution
Replication
Storage
CockroachDB Architecture
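To make "every SQL query operates on key-value data" more concrete, here is a toy mapping from a table row to key-value pairs. The key layout `/table/primary-key/column` is an assumption chosen for readability; it is not CockroachDB's actual byte encoding, and `row_to_kv` and `scan_prefix` are hypothetical helpers.

```python
def row_to_kv(table: str, primary_key: str, row: dict) -> dict:
    """Flatten one row into key-value pairs, one pair per non-key column."""
    pk_value = row[primary_key]
    return {
        f"/{table}/{pk_value}/{column}": value
        for column, value in row.items()
        if column != primary_key
    }


def scan_prefix(store: dict, prefix: str) -> dict:
    """Emulate an ordered key-value scan over all keys sharing a prefix."""
    return {k: v for k, v in sorted(store.items()) if k.startswith(prefix)}


if __name__ == "__main__":
    store = {}
    store.update(row_to_kv("users", "id", {"id": 7, "name": "Ada", "city": "Munich"}))
    store.update(row_to_kv("users", "id", {"id": 8, "name": "Bob", "city": "Darmstadt"}))
    # Roughly what "SELECT * FROM users WHERE id = 7" turns into in this toy model:
    print(scan_prefix(store, "/users/7/"))
```

In this model a lookup by primary key becomes a prefix scan over sorted keys, which is the kind of operation an ordered key-value store handles well.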
16. Range: A contiguous chunk of the key-value data. As a first approximation, think of “SQL table”; a large table is split into several ranges.
Replica: Each range is copied to (at least) 3 different nodes.
Leaseholder: The replica that coordinates reads and writes for a range.
Glossary
21. CockroachDB supports ACID transactions.
Atomicity
Writes are performed to the whole cluster at once.
If anything fails, the transaction is rolled back.
Consistency
Incomplete write operations are at no time served to reading clients.
Isolation
All CockroachDB transactions are upgraded to SERIALIZABLE.
Durability
Every successful transaction has been persisted to a majority of replicas (see: Raft).
Transactions
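Under SERIALIZABLE isolation, a transaction that conflicts with another one can be rejected with SQLSTATE 40001 (serialization failure), and the client is expected to run it again. Below is a minimal sketch of such a client-side retry loop. `RetryableError` is a hypothetical stand-in for a driver exception; the loop only assumes that the exception exposes the SQLSTATE as a `pgcode` attribute, as psycopg2 exceptions do.

```python
import time


class RetryableError(Exception):
    """Stand-in for a driver error carrying SQLSTATE 40001 (hypothetical)."""
    pgcode = "40001"


def run_transaction(op, max_retries: int = 3, base_delay: float = 0.0):
    """Run op(); retry with exponential backoff on serialization failures."""
    for attempt in range(max_retries + 1):
        try:
            return op()
        except Exception as exc:
            retryable = getattr(exc, "pgcode", None) == "40001"
            if not retryable or attempt == max_retries:
                raise  # other errors, or retries exhausted: give up
            time.sleep(base_delay * (2 ** attempt))


if __name__ == "__main__":
    attempts = []

    def flaky_transfer():
        # Simulates a transaction that conflicts twice before it commits.
        attempts.append(1)
        if len(attempts) < 3:
            raise RetryableError("restart transaction")
        return "committed"

    print(run_transaction(flaky_transfer), "after", len(attempts), "attempts")
```

In a real application, `op` would open a transaction, execute its statements, and commit, so that every retry starts from a clean state.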
25. CockroachDB was built around …
… the SQL network protocol (“pgwire”)
… the SQL syntax (“using the PostgreSQL syntax parser”)
… and the SQL dialect semantics
of PostgreSQL.
This means that you can …
… use a standard PostgreSQL driver
… build the same SQL queries you always did
… or use a SQL framework of your choice (Hibernate, EclipseLink, …)
… well, that’s not entirely true.
CockroachDB doesn’t support some PostgreSQL features
… and might implement its own features in the future.
PostgreSQL compatibility
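Because CockroachDB speaks the PostgreSQL wire protocol, connecting looks like connecting to PostgreSQL with a different port. The sketch below assumes a local node started in insecure mode with CockroachDB's defaults (port 26257, user `root`, database `defaultdb`); `cockroach_dsn` and `connect` are hypothetical helpers, and `connect` additionally assumes the third-party psycopg2 driver is installed.

```python
from urllib.parse import quote, urlencode


def cockroach_dsn(host: str = "localhost", port: int = 26257,
                  user: str = "root", database: str = "defaultdb",
                  sslmode: str = "disable") -> str:
    """Build a PostgreSQL-style connection URI for a CockroachDB node."""
    query = urlencode({"sslmode": sslmode})
    return f"postgresql://{quote(user)}@{host}:{port}/{quote(database)}?{query}"


def connect(dsn: str):
    """Open a connection with a standard PostgreSQL driver."""
    import psycopg2  # third-party driver; imported lazily, assumed installed
    return psycopg2.connect(dsn)


if __name__ == "__main__":
    print(cockroach_dsn())
```

`sslmode=disable` only matches an insecure test node. A secured cluster needs certificates and a stricter `sslmode`.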
26. Deploying is easy with various cloud providers
YAML files for every occasion
Different Helm charts available
Deployment
27. $ wget -qO- https://binaries.cockroachdb.com/cockroach-v19.2.4.linux-amd64.tgz | tar xvz
$ cp -i cockroach-v19.2.4.linux-amd64/cockroach /usr/local/bin/
$ cockroach start --insecure   # --insecure skips TLS certificates; local testing only
Deployment – the easiest way
33. Easy to install
Easy to use (Postgres driver!)
Don’t worry about consistency, but prepare for occasional waits
Open Source
Free (if you don’t need enterprise features)
Maybe this is the PostgreSQL database you always wanted.
Summary
34. Core Edition
Licensed under Apache 2.0 until version 19.1
From 19.2: Business Source License (BSL)
Forbids you from offering CockroachDB as a service.
Converts to Apache 2.0 license three years after release
Enterprise Edition: Geo partitioning, RBAC, Follower Reads, Encryption at Rest...
There are a lot more features not covered by this talk!
Licensing and features