This was one of the talks that I gave at the Strata San Jose conference. I shifted my topic a bit, but here is the original abstract:
Application developers and architects today are interested in making their applications as real-time as possible. To make an application respond to events as they happen, developers need a reliable way to move data as it is generated across different systems, one event at a time. In other words, these applications need messaging.
Messaging solutions have existed for a long time. However, when compared to legacy systems, newer solutions like Apache Kafka offer higher performance, more scalability, and better integration with the Hadoop ecosystem. Kafka and similar systems are based on drastically different assumptions than legacy systems and have vastly different architectures. But do these benefits outweigh any tradeoffs in functionality? Ted Dunning dives into the architectural details and tradeoffs of both legacy and new messaging solutions to find the ideal messaging system for Hadoop.
Topics include:
* Queues versus logs
* Security issues like authentication, authorization, and encryption
* Scalability and performance
* Handling applications that span multiple data centers
* Multitenancy considerations
* APIs, integration points, and more
© 2014 MapR Technologies
Contact Information
Ted Dunning
Chief Applications Architect at MapR Technologies
Committer & PMC for Apache's Drill, ZooKeeper & others
VP of Incubator at Apache Foundation
Email tdunning@apache.org tdunning@maprtech.com
Twitter @ted_dunning
Hashtags today: #stratahadoop #ojai
Don't Miss These
• Just-in-time optimizing a database
  – Me! at 4:20 PM, Room 230 C, today
• Why flow instead of state?
  – Me! at 5:10 PM, Room 210 D/H, today
• High Frequency Decisioning
  – Jack Norris! at 11:00 PM, Room 210 B/F, tomorrow
• Threat detection on streaming data
  – Carol Macdonald! at 3:45 PM, Solutions Theater, tomorrow
• Scaling Your Business … Zeta Architecture
  – Jim Scott! at 5:10 PM, Room 210 D/H, tomorrow
And Also, a Little Fun
Come jam with us
The Big Data Boys and the Real-time Stream Band
5:50 PM, MapR booth, today
Goals
• Real-time or near-time
  – Includes situations with deadlines
  – Also includes situations where delay is simply undesirable
  – Even includes situations where delay is just fine
• Micro-services
  – Streaming is a convenient idiom for design
  – Micro-services … you know we wanted it
  – Service isolation is a key requirement
Real-time or Near-time?
• The real point is flow versus state (see talk later today)
• One consequence of flow-based computing is that real-time and near-time become relatively easy
• Life may be a bitch, but it doesn't happen in batches!
A microservice is loosely coupled with bounded context
How to Couple Services and Break Micro-ness
• Shared schemas, relational stores
• Ad hoc communication between services
• Enterprise service buses
• Brittle protocols
• Poor protocol versioning
Don't do this!
How to Decouple Services
• Use self-describing data
• Private databases
• Infrastructural communication between services
• Use modern protocols
• Adopt future-proof protocol practices
• Use shared storage where necessary due to scale
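As a minimal sketch of what "self-describing data" and "future-proof protocol practices" can look like, the example below uses JSON messages whose field names travel with the data, plus a tolerant reader that ignores fields it does not know. The event fields (`card_id`, `location`, `amount_cents`) and the `version` field are hypothetical, chosen only for illustration.

```python
import json


def encode_event(card_id, location, amount_cents):
    # Field names travel with the data, so readers need no shared schema.
    return json.dumps({
        "version": 2,
        "card_id": card_id,
        "location": location,
        "amount_cents": amount_cents,  # hypothetical field added in version 2
    })


def decode_event(raw):
    # Tolerant reader: take what we need, default what is missing,
    # and ignore fields we do not understand.
    doc = json.loads(raw)
    return {
        "card_id": doc["card_id"],
        "location": doc.get("location", "unknown"),
    }


# An old producer and a new producer can coexist with the same reader.
old_message = json.dumps({"version": 1, "card_id": "c-17"})
new_message = encode_event("c-17", "store-4", 1299)
old_view = decode_event(old_message)
new_view = decode_event(new_message)
```

Because the reader only depends on the fields it actually uses, producers can add fields without a coordinated upgrade of every consumer.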
What is the Right Structure for Flow Compute?
• Traditional message queues?
  – Message queues are the classic answer
  – Key feature/bug is out-of-order acknowledgement
  – Many implementations
  – You pay a huge performance hit for persistence
• Kafka-esque logs?
  – Logs are like queues, but with ordering
  – Out-of-order consumption is possible, acknowledgement not so much
  – Canonical base implementation is Kafka
  – Performance plus persistence
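The difference between the two structures can be sketched with two toy in-memory classes. This is an illustration under simplifying assumptions, not how any real broker or Kafka is implemented; `ToyQueue` and `ToyLog` are hypothetical names. The queue has to track every unacknowledged message individually, while the log keeps messages in order and only stores one offset per consumer.

```python
from collections import OrderedDict


class ToyQueue:
    """Queue semantics: each message is acknowledged individually, in any order."""

    def __init__(self):
        self._next_id = 0
        self._unacked = OrderedDict()  # message id -> payload

    def send(self, payload):
        self._unacked[self._next_id] = payload
        self._next_id += 1

    def pending(self):
        return list(self._unacked.items())

    def ack(self, msg_id):
        # Out-of-order acknowledgement: state is kept per message.
        del self._unacked[msg_id]


class ToyLog:
    """Log semantics: messages stay in order; a consumer only records an offset."""

    def __init__(self):
        self._messages = []
        self._committed = {}  # consumer name -> next offset to read

    def append(self, payload):
        self._messages.append(payload)
        return len(self._messages) - 1

    def read(self, offset, max_count=10):
        # Any offset may be read, so re-reading old data is possible.
        return self._messages[offset:offset + max_count]

    def commit(self, consumer, offset):
        # The only acknowledgement state is one integer per consumer.
        self._committed[consumer] = offset

    def committed(self, consumer):
        return self._committed.get(consumer, 0)


queue = ToyQueue()
log = ToyLog()
for event in ["a", "b", "c"]:
    queue.send(event)
    log.append(event)

queue.ack(1)                       # acknowledge "b" before "a"
remaining = [payload for _, payload in queue.pending()]

log.commit("fraud-detector", 2)    # "a" and "b" are done; "c" is next
replay = log.read(0)               # another consumer can still read everything
```

The per-message bookkeeping in the queue is what makes persistence expensive; the log only appends and remembers offsets.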
Traditional Solution
[Diagram: POS terminals 1..n, fraud detector, "last card use" store]
What Happens Next?
[Diagram: three sets of POS terminals 1..n, each with a fraud detector, and one "last card use" store]
How to Get Service Isolation
[Diagram: POS terminals 1..n, fraud detector, "card activity", updater, "last card use" store]
New Uses of Data
[Diagram: POS terminals 1..n, fraud detector, "card activity", updater, "last card use" store, card location history, other]
Scaling Through Isolation
[Diagram: two sets of POS terminals 1..n, each with a fraud detector, an updater, and a "last card use" store, sharing "card activity"]
Lessons
• De-coupling and isolation are key
• Private data stores/tables are important,
  – but local storage of private data is a bug
• Propagate events, not table updates
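The lesson "propagate events, not table updates" can be sketched with an in-memory stand-in for the card activity stream from the fraud example. All class names here are hypothetical and the stream is a plain Python list, not a real messaging system. Each consumer builds its own private table from the shared events, so a new use of the data can be added later without touching existing services.

```python
class CardActivityStream:
    """In-memory stand-in for a shared "card activity" stream."""

    def __init__(self):
        self.events = []

    def publish(self, event):
        self.events.append(event)

    def read_from(self, offset):
        return self.events[offset:]


class LastCardUseUpdater:
    """Owns a private "last card use" table, built only from events."""

    def __init__(self, stream):
        self.stream = stream
        self.offset = 0
        self.last_use = {}  # card id -> location of the most recent use

    def poll(self):
        for event in self.stream.read_from(self.offset):
            self.last_use[event["card_id"]] = event["location"]
            self.offset += 1


class LocationHistory:
    """A consumer added later: same events, different private table."""

    def __init__(self, stream):
        self.stream = stream
        self.offset = 0
        self.history = {}  # card id -> list of locations, oldest first

    def poll(self):
        for event in self.stream.read_from(self.offset):
            self.history.setdefault(event["card_id"], []).append(event["location"])
            self.offset += 1


stream = CardActivityStream()
updater = LastCardUseUpdater(stream)
stream.publish({"card_id": "c-1", "location": "San Jose"})
stream.publish({"card_id": "c-1", "location": "Oakland"})
updater.poll()

history = LocationHistory(stream)  # added later, replays from offset 0
history.poll()
```

Neither consumer reads the other's table; the only shared thing is the stream of events.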
Scenarios
IoT Data Aggregation
Basic Situation
[Diagram: multiple locations, each with many pumps, producing pump data]
What Does a Pump Look Like?
• Inlet: temperature, pressure, flow
• Outlet: temperature, pressure, flow
• Motor: winding temperature, voltage, current
Basic Architecture Reflects Business Structure
[Diagram: four "pump data" streams]
Lessons
• Data architecture should reflect business structure
• Even very modest designs involve multiple data centers
• Schemas cannot be frozen in the real world
• Security must follow data ownership
Scenarios
Global Data Recovery
What Have We Learned?
• Need persistence and performance
  – Possibly for years and to 100s of millions t/s
• Must have convergence
  – Need files, tables AND streams
  – Need volumes, snapshots, mirrors, permissions and …
• Must have platform security
  – Cannot depend on perimeter
  – Must follow business structure
• Must have global scale and scope
  – Millions of topics for natural designs
  – Multi-master replication and update
The Importance of Common APIs
• Commonality and interoperability are critical
  – Compare the Hadoop ecosystem and the NoSQL world
• Table stakes
  – Persistence
  – Performance
  – Polymorphism
• Major trend so far is to adopt the Kafka API
  – 0.9 API and beyond remove major abstraction leaks
  – Kafka API supported by all major Hadoop vendors
Evolution of Data Storage
[Chart: axes Functionality, Compatibility, Scalability; plotted: Linux, POSIX]
Over decades of progress, Unix-based systems have set the standard for compatibility and functionality.
Evolution of Data Storage
[Chart: axes Functionality, Compatibility, Scalability; plotted: Linux, POSIX, Hadoop]
Hadoop achieves much higher scalability by trading away essentially all of this compatibility.
Evolution of Data Storage
[Chart: axes Functionality, Compatibility, Scalability; plotted: Linux, POSIX, Hadoop]
MapR enhanced Apache Hadoop by restoring the compatibility while increasing scalability and performance.
Evolution of Data Storage
[Chart: axes Functionality, Compatibility, Scalability; plotted: Linux, POSIX, Hadoop]
Adding tables and streams enhances the functionality of the base file system.
How We Do This with MapR
• MapR Streams is a C++ reimplementation of the Kafka API
  – Advantages in predictability, performance, scale
  – Common security and permissions with the entire MapR converged data platform
• Semantic extensions
  – A cluster contains volumes, files, tables … and now streams
  – Streams contain topics
  – Can have default stream or can name stream by path name
• Core MapR capabilities preserved
  – Consistent snapshots, mirrors, multi-master replication
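To make the "streams contain topics" and "name stream by path name" points concrete, here is a small sketch that splits a full topic name into a stream path and a topic. It assumes a `/stream/path:topic` naming convention, which the slide does not spell out, so treat the separator and the `split_topic` helper as assumptions for illustration only.

```python
def split_topic(name, default_stream=None):
    """Split a topic name into (stream path, topic).

    Assumed convention: a name starting with "/" has the form
    "/stream/path:topic"; any other name is a bare topic that
    lives in the default stream.
    """
    if name.startswith("/"):
        stream, sep, topic = name.rpartition(":")
        if not sep or not topic:
            raise ValueError(f"expected '/stream/path:topic', got {name!r}")
        return stream, topic
    if default_stream is None:
        raise ValueError(f"no stream path in {name!r} and no default stream")
    return default_stream, name


# Hypothetical stream path and topic names.
explicit = split_topic("/apps/pumps:pressure")
implicit = split_topic("pressure", default_stream="/apps/pumps")
```

Putting the stream in the file-system namespace is what lets one cluster hold many independent streams, each with its own topics and permissions.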
MapR Core Innovations
• Volumes
  – Distributed management
  – Data placement
• Read/write random access file system
  – Allows distributed meta-data
  – Improved scaling
  – Enables NFS access
• Application-level NIC bonding
• Transactionally correct snapshots and mirrors
MapR's Containers
Files/directories are sharded into blocks, which are placed into containers on disks.
Containers are 16-32 GB segments of disk, placed on nodes.
• Each container contains
  – Directories & files
  – Data blocks
• Replicated on servers
• No need to manage directly
MapR's Containers
• Each container has a replication chain
• Updates are transactional
• Failures are handled by rearranging replication
Container Locations and Replication
[Diagram: CLDB table of container replication chains (N1, N2; N3, N2; N1, N2; N1, N3; N3, N2) across nodes N1, N2, N3]
Container location database (CLDB) keeps track of nodes hosting each container and replication chain order.
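The role of the CLDB described above can be sketched as a map from container id to an ordered replication chain. This is a toy model, not MapR's implementation: the `ToyCLDB` class, the container ids, and the rule for picking a replacement node are all assumptions made for illustration. The slides only say that failures are handled by rearranging replication.

```python
class ToyCLDB:
    """Toy container location database: container id -> ordered node chain."""

    def __init__(self):
        self.chains = {}

    def register(self, container_id, chain):
        self.chains[container_id] = list(chain)

    def locate(self, container_id):
        # Return a copy so callers cannot change the chain by accident.
        return list(self.chains[container_id])

    def node_failed(self, node, live_nodes):
        # Drop the failed node from every chain it appears in, then
        # append a live node that is not already in that chain, if any.
        # The replacement rule here is an assumption, not MapR's policy.
        for chain in self.chains.values():
            if node not in chain:
                continue
            chain.remove(node)
            replacement = next((n for n in live_nodes if n not in chain), None)
            if replacement is not None:
                chain.append(replacement)


cldb = ToyCLDB()
cldb.register(1, ["N1", "N2"])
cldb.register(2, ["N3", "N2"])
cldb.register(3, ["N1", "N3"])

cldb.node_failed("N1", live_nodes=["N2", "N3"])
```

After the failure, every chain that contained N1 has been shortened and then extended with a surviving node, while chains that never used N1 are untouched.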
MapR Scaling
Containers represent 16-32 GB of data
• Each can hold up to 1 billion files and directories
• 100M containers = ~2 exabytes (a very large cluster)
250 bytes DRAM to cache a container
• 25 GB to cache all containers for 2 EB cluster
  – But not necessary, can page to disk
• Typical large 10 PB cluster needs 2 GB
Container reports are 100x-1000x smaller than HDFS block reports
• Serve 100x more data-nodes
• Increase container size to 64 GB to serve 4 EB cluster
• Map/reduce not affected
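Two of the figures above can be checked with simple arithmetic. The sketch below uses decimal units (1 GB = 10^9 bytes, 1 EB = 10^18 bytes), which is an assumption about how the slide rounds. It only covers the 100M-container case; the other figures on the slide depend on details not given here.

```python
GB = 10**9
EB = 10**18

containers = 100_000_000           # 100M containers, from the slide
bytes_per_cached_container = 250   # DRAM per cached container, from the slide

# DRAM needed to cache the location of every container.
cache_gb = containers * bytes_per_cached_container / GB

# Total data if every container holds between 16 GB and 32 GB.
low_eb = containers * 16 * GB / EB
high_eb = containers * 32 * GB / EB
```

With these inputs `cache_gb` comes out to 25, matching the slide, and the "~2 exabytes" figure falls inside the range between `low_eb` and `high_eb`.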
But Wait, There's More
• Directories and files are implemented in terms of B-trees
  – Key is offset, value is data blob
  – Internal transactional semantics guarantees safety and consistency
  – Layout algorithms give very high layout linearization
• Tables are implemented in terms of B-trees
  – Twisted B-tree implementation allows virtues of log-structured merge tree without the compaction delays
  – Tablet splitting without pausing, integration with file system transactions
• Common security and permissions scheme
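The "key is offset, value is data blob" idea can be illustrated with a sorted list of offsets standing in for the B-tree. This is only a sketch of the lookup pattern: `ToyFile` is a hypothetical class, it keeps everything in memory, and it says nothing about MapR's transactions or on-disk layout.

```python
import bisect


class ToyFile:
    """Sorted map from starting byte offset to data blob (B-tree stand-in)."""

    def __init__(self):
        self._offsets = []  # sorted starting offsets of each blob
        self._blobs = {}    # starting offset -> bytes
        self._size = 0

    def append(self, data):
        # Each new blob is keyed by the offset at which it starts.
        self._offsets.append(self._size)
        self._blobs[self._size] = data
        self._size += len(data)

    def read(self, offset, length):
        out = bytearray()
        # Find the blob with the largest starting offset <= requested offset.
        i = bisect.bisect_right(self._offsets, offset) - 1
        while 0 <= i < len(self._offsets) and length > 0:
            start = self._offsets[i]
            chunk = self._blobs[start][offset - start:offset - start + length]
            if not chunk:
                break  # read past the end of the file
            out += chunk
            offset += len(chunk)
            length -= len(chunk)
            i += 1
        return bytes(out)


f = ToyFile()
f.append(b"hello ")
f.append(b"world")
middle = f.read(3, 5)   # spans the boundary between the two blobs
```

A range read becomes one ordered lookup followed by a scan of neighbouring keys, which is the access pattern a B-tree keyed by offset serves well.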
And More …
• Streams are implemented in terms of B-trees as well
  – Topics and consumer offsets are kept in stream, not ZK
  – Similar splitting technology as MapR DB tables
  – Consistent permissions, security, data replication
• Standard Kafka 0.9 API
• Plans to add OJAI for high-level structuring
• Performance is very high
Example
[Diagram: a cluster with a volume mount point, containing directories, files, a table, and streams]
Lessons
• APIs matter more than implementations
• There is plenty of room to innovate ahead of the community
• POSIX, HDFS, and HBase all define useful APIs
• Kafka 0.9+ does the same
Call to action:
Support the Kafka APIs
And come by the MapR booth
to check out MapR Streams
Short Books by Ted Dunning & Ellen Friedman
• Published by O'Reilly in 2014-2016
• For sale from Amazon or O'Reilly
• Free e-books currently available courtesy of MapR
http://bit.ly/ebook-real-world-hadoop
http://bit.ly/mapr-tsdb-ebook
http://bit.ly/ebook-anomaly
http://bit.ly/recommendation-ebook
Streaming Architecture
by Ted Dunning and Ellen Friedman © 2016 (published by O'Reilly)
Free copies at book signing today
http://bit.ly/mapr-ebook-streams