2014 © Trivadis
BASEL BERN BRUGG LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN
Big Data and Fast Data - Lambda Architecture and Its Implementation
19.11.2014
DOAG 2014 | Big Data und Fast Data - Lambda Architektur und deren Umsetzung
Guido Schmutz
DOAG Konferenz 2014
19.11.2014 – 16:00, Room Oslo
Guido Schmutz
•  Working for Trivadis for more than 17 years
•  Oracle ACE Director for Fusion Middleware and SOA
•  Co-author of several books
•  Consultant, Trainer and Software Architect for Java, Oracle, SOA and Big Data / Fast Data
•  Member of Trivadis Architecture Board
•  Technology Manager @ Trivadis
•  More than 25 years of software development experience
•  Contact: guido.schmutz@trivadis.com
•  Blog: http://guidoschmutz.wordpress.com
•  Twitter: gschmutz
Our company
Trivadis is a market leader in IT consulting, system integration, solution engineering and the provision of IT services focusing on […] technologies in Switzerland, Germany and Austria.
We offer our services in the following strategic business fields: [diagram; the only recoverable label is "Operation"]
Trivadis Services takes over the interacting operation of your IT systems.
AGENDA
1.  Big Data and Fast Data, what is it?
2.  Architecting (Big) Data Systems
3.  The Lambda Architecture
4.  Use Case and the Implementation
5.  Summary and Outlook
Big Data Definition (4 Vs)
Characteristics of Big Data: its Volume, Velocity and Variety in combination
+ Time to action? Big Data + Event Processing = Fast Data
The world is changing …
The model of generating and consuming data has changed:
Old model: few companies generate data, all others consume data
New model: all of us generate data, and all of us consume data
Internet of Things – Sensors are/will be everywhere
There are more devices tapping into the internet than people on earth
How do we prepare our systems/architecture for the future?
Sources: Cisco, The Economist
The world is changing … new data stores
Problem of traditional (R)DBMS approach:
§  Complex object graph
§  Schema evolution
§  Semi-structured data
§  Scaling
Polyglot persistence
§  Using multiple data storage technologies (RDBMS + NoSQL + NewSQL + In-Memory)
[Diagram: an Order object graph next to the relational tables ORDER, ADDRESS, CUSTOMER and ORDER_LINES]
Order
  ID: 1001
  Order Date: 15.9.2012
  Customer
    First Name: Peter
    Last Name: Sample
  Billing Address
    Street: Somestreet 10
    City: Somewhere
    Postal Code: 55901
  Line Items
    Name | Quantity | Price
    Ipod Touch | 1 | 220.95
    Monster Beat | 2 | 190.00
    Apple Mouse | 1 | 69.90
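The "complex object graph" problem can be sketched in a few lines: the same order is stored as rows in four tables and has to be joined back together before the application can use it, whereas a document store could keep the nested form directly. This is a minimal illustration only; the column names, keys and the `load_order_document` helper are assumptions, not part of the original slides.

```python
# Hypothetical rows, one list per relational table named on the slide.
customer_rows = [{"customer_id": 1, "first_name": "Peter", "last_name": "Sample"}]
address_rows = [{"address_id": 1, "street": "Somestreet 10",
                 "city": "Somewhere", "postal_code": "55901"}]
order_rows = [{"order_id": 1001, "order_date": "15.9.2012",
               "customer_id": 1, "billing_address_id": 1}]
order_line_rows = [
    {"order_id": 1001, "name": "Ipod Touch", "quantity": 1, "price": 220.95},
    {"order_id": 1001, "name": "Monster Beat", "quantity": 2, "price": 190.00},
    {"order_id": 1001, "name": "Apple Mouse", "quantity": 1, "price": 69.90},
]


def load_order_document(order_id):
    """Join the four tables back into one nested document (the object graph)."""
    order = next(o for o in order_rows if o["order_id"] == order_id)
    customer = next(c for c in customer_rows
                    if c["customer_id"] == order["customer_id"])
    address = next(a for a in address_rows
                   if a["address_id"] == order["billing_address_id"])
    lines = [l for l in order_line_rows if l["order_id"] == order_id]
    return {
        "id": order["order_id"],
        "order_date": order["order_date"],
        "customer": {"first_name": customer["first_name"],
                     "last_name": customer["last_name"]},
        "billing_address": {"street": address["street"],
                            "city": address["city"],
                            "postal_code": address["postal_code"]},
        "line_items": [{"name": l["name"], "quantity": l["quantity"],
                        "price": l["price"]} for l in lines],
    }


order_doc = load_order_document(1001)
order_total = sum(l["quantity"] * l["price"] for l in order_doc["line_items"])
```

In a polyglot setup, `order_doc` is the shape a document-oriented store would persist as one unit, while the relational tables remain useful for ad-hoc joins across orders.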
The world is changing … New platforms evolving (e.g. Hadoop Ecosystem)
Data as an Asset – Store everything?
"Data is just too valuable to delete! We must store anything!"
"Nonsense! Just store the data you know you need today!"
It depends …
Big Data technologies allow you to store the raw information from new and existing data sources so that you can later use it to create new data-driven products which you haven’t thought about today!
AGENDA
1.  Big Data and Fast Data, what is it?
2.  Architecting (Big) Data Systems
3.  The Lambda Architecture
4.  Use Case and the Implementation
5.  Summary and Outlook
What is a data system?
•  A (data) system that manages the storage and querying of data with a lifetime measured in years, encompassing every version of the application to ever exist, every hardware failure and every human mistake ever made.
•  A data system answers questions based on information that was acquired in the past
How do we build (data) systems today – Today’s Architectures
Source of Truth is mutable!
•  CRUD pattern
What is the problem with this?
•  Lack of Human Fault Tolerance
•  Potential loss of information/data
[Diagram: clients (Mobile, Web, RIA, Rich Client) use an Application (Query) on top of a Mutable Database (RDBMS, NoSQL, NewSQL), which is the Source of Truth]
Lack of Human Fault Tolerance
Bugs will be deployed to production over the lifetime of a data system
Operational mistakes will be made
Humans are part of the overall system
•  Just like hard disks, CPUs, memory, software
•  Design for human error like you design for any other fault
Examples of human error
•  Deploy a bug that increments counters by two instead of by one
•  Accidentally delete data from database
•  Accidental DoS on important internal service
Worst two consequences: data loss or data corruption
As long as an error doesn’t lose or corrupt good data, you can fix what went wrong
Lack of Human Fault Tolerance – Immutability vs. Mutability
The U and D in CRUD
A mutable system updates the current state of the world
•  Mutable systems inherently lack human fault-tolerance
•  Easy to corrupt or lose data
An immutable system captures historical records of events
•  Each event happens at a particular time and is always true
Immutability restricts the range of errors causing data loss/data corruption
Vastly more human fault-tolerant
Conclusion: Your source of truth should always be immutable
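The "increments counters by two instead of by one" bug mentioned earlier makes the point concrete. The following is only a sketch with made-up event data and hypothetical function names: because the events themselves are never updated or deleted, a corrupted view can be thrown away and recomputed once the bug is fixed.

```python
# Immutable source of truth: an append-only list of events (hypothetical data).
events = [
    {"time": 1, "page": "home"},
    {"time": 2, "page": "home"},
    {"time": 3, "page": "about"},
]


def buggy_count_view(event_log):
    """Deployed by mistake: increments each counter by two instead of by one."""
    counts = {}
    for event in event_log:
        counts[event["page"]] = counts.get(event["page"], 0) + 2
    return counts


def fixed_count_view(event_log):
    """Corrected logic: increments each counter by one."""
    counts = {}
    for event in event_log:
        counts[event["page"]] = counts.get(event["page"], 0) + 1
    return counts


corrupted_view = buggy_count_view(events)   # wrong, but only a derived view
repaired_view = fixed_count_view(events)    # recomputed from untouched events
```

In a mutable system the counters would be the only copy of the data, so after the bug there would be no reliable way to tell which increments were wrong.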
A different kind of architecture with an immutable source of truth
Instead of using our traditional approach … why not build data systems like this?
[Diagram labels: clients (Mobile, Web, RIA, Rich Client); Application (Query); two "View on Data" stores; Immutable data as Source of Truth; technologies HDFS, NoSQL, NewSQL, RDBMS]
How to create the views on the Immutable data?
On the fly?
Materialized, i.e. pre-computed?
[Diagram: on the fly: Query → View → Immutable data; materialized: Query → Pre-Computed Views → Immutable data]
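The two options can be contrasted with a small sketch. The event data and function names below are assumptions for illustration: the on-the-fly query scans all immutable data on every call, while the materialized variant pays that cost once and answers from a pre-computed view.

```python
from collections import Counter

# Hypothetical immutable data: (user, product) "added to favorites" events.
immutable_data = [
    ("alice", "iPAD 64GB"),
    ("bob", "Nikon S-100"),
    ("alice", "BoseQC-15"),
    ("alice", "MacBook Pro 15"),
]


def query_on_the_fly(user):
    """Scan the complete immutable data set at query time."""
    return sum(1 for event_user, _product in immutable_data if event_user == user)


# Materialized: computed once ahead of time, then only looked up.
precomputed_view = Counter(event_user for event_user, _product in immutable_data)


def query_precomputed(user):
    """Answer from the pre-computed view without touching the raw data."""
    return precomputed_view[user]


alice_on_the_fly = query_on_the_fly("alice")
alice_precomputed = query_precomputed("alice")
```

Both return the same answer; the difference is where the work happens, which matters once the immutable data no longer fits a single scan per query.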
(Big) Data Processing
How to compute the materialized views?
How to compute queries from the views?
[Diagram: Incoming Data → Immutable data → Pre-Computed Views → Query]
Today Big Data Processing means Batch Processing …
[Diagram: Event streams (Stream 1, Stream 2) → HDFS (Hadoop Distributed File System), a data store optimized for appending large results → Hadoop cluster (Map/Reduce) → Queries]
Big Data Processing - Batch
Favorite Product List Changes (raw information => data)
1.2.13 Add iPAD 64GB
10.3.13 Add Sony RX-100
11.3.13 Add Canon GX-10
11.3.13 Remove Sony RX-100
12.3.13 Add Nikon S-100
14.4.13 Add BoseQC-15
15.4.13 Add MacBook Pro 15
20.4.13 Remove Canon GX-10
derive => Current Favorite Product List (information => derived)
iPAD 64GB
Nikon S-100
BoseQC-15
MacBook Pro 15
derive => Current Product Count (information => derived)
4
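The "derive" step on this slide can be written down directly. This is a sketch, not the author's implementation: the tuple layout and the `derive_favorites` name are assumptions, and the spelling "Canon GX-10" is assumed for both the Add and the Remove event.

```python
# Raw information: the immutable list of favorite product list changes.
changes = [
    ("1.2.13", "Add", "iPAD 64GB"),
    ("10.3.13", "Add", "Sony RX-100"),
    ("11.3.13", "Add", "Canon GX-10"),
    ("11.3.13", "Remove", "Sony RX-100"),
    ("12.3.13", "Add", "Nikon S-100"),
    ("14.4.13", "Add", "BoseQC-15"),
    ("15.4.13", "Add", "MacBook Pro 15"),
    ("20.4.13", "Remove", "Canon GX-10"),
]


def derive_favorites(change_log):
    """Replay all changes in order to derive the current favorite list."""
    favorites = []
    for _date, action, product in change_log:
        if action == "Add" and product not in favorites:
            favorites.append(product)
        elif action == "Remove" and product in favorites:
            favorites.remove(product)
    return favorites


current_favorites = derive_favorites(changes)   # derived information
current_count = len(current_favorites)          # derived from derived information
```

Because both views are pure functions of the change list, they can be deleted and rebuilt at any time.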
Big Data Processing – Batch
§  Using only batch processing always leaves you with a portion of non-processed data.
[Timeline diagram: fully processed data up to the last full batch period; the data that arrived during the time for the batch job, up to now, is non-processed]
But we are not done yet …
Big Data Processing - Adding Real-Time
How to compute real-time views?
How to compute queries from the views?
[Diagram: Incoming Data → Immutable data → Batch Views; Incoming Data → Data Stream → Realtime Views; the Query reads from both kinds of views]
Big Data Processing - Adding Real-Time
Immutable data: Favorite Product List Changes
1.2.13 Add iPAD 64GB
10.3.13 Add Sony RX-100
11.3.13 Add Canon GX-10
11.3.13 Remove Sony RX-100
12.3.13 Add Nikon S-100
14.4.13 Add BoseQC-15
15.4.13 Add MacBook Pro 15
20.4.13 Remove Canon GX-10
Now Add Canon Scanner
Data Stream (incoming): Stream of Favorite Product List Changes
Now Add Canon Scanner
compute => Views (Query)
Current Favorite Product List
iPAD 64GB
Nikon S-100
BoseQC-15
MacBook Pro 15
Now Canon Scanner
Current Product Count
5
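The real-time path does not replay the full history; it applies each incoming change to the view that already exists. A minimal sketch, assuming a hypothetical `apply_change` helper and the batch result from the previous slide as the starting point:

```python
def apply_change(favorites, action, product):
    """Incrementally apply one incoming change event to an existing view."""
    updated = list(favorites)  # copy, so the batch view itself stays untouched
    if action == "Add" and product not in updated:
        updated.append(product)
    elif action == "Remove" and product in updated:
        updated.remove(product)
    return updated


# View as computed by the last batch run.
batch_view = ["iPAD 64GB", "Nikon S-100", "BoseQC-15", "MacBook Pro 15"]

# Event arriving "now" on the data stream.
realtime_view = apply_change(batch_view, "Add", "Canon Scanner")
realtime_count = len(realtime_view)
```

The incremental update touches one event instead of the whole change list, which is what keeps the latency low.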
Big Data Processing - Batch & Real Time
[Timeline diagram: batch processing (e.g. Hadoop) worked fine for the fully processed data up to the last full batch period; real time processing works for the data that arrived during the time for the batch job, up to now; both are combined into a blended view for the end user]
Adapted from Ted Dunning (March 2012): http://www.youtube.com/watch?v=7PcmbI5aC20
AGENDA
1.  Big Data and Fast Data, what is it?
2.  Architecting (Big) Data Systems
3.  The Lambda Architecture
4.  Use Case and the Implementation
5.  Summary and Outlook
Lambda Architecture
Lambda => Query = function(all data)
[Diagram: Incoming Data flows into the Batch Layer (Immutable data → Batch View) and into the Speed Layer (Data Stream → Realtime View); the Serving Layer answers the Query from both views. The callouts A to G in the diagram mark the steps of the architecture.]
Lambda Architecture
A.  All data is sent to both the batch and speed layer
B.  Master data set is an immutable, append-only set of data
C.  Batch layer pre-computes query functions from scratch; the result is called Batch Views. The batch layer constantly re-computes the batch views.
D.  Batch views are indexed and stored in a scalable database to get particular values very quickly. New batch views are swapped in when they are available
E.  Speed layer compensates for the high latency of updates to the Batch Views
F.  Uses fast incremental algorithms and read/write databases to produce real-time views
G.  Queries are resolved by getting results from both batch and real-time views
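Steps A to G can be sketched as a tiny in-memory model. This is an illustration under simplifying assumptions (a single process, hashtag counts as the only view, a hypothetical `LambdaSketch` class), not a description of the frameworks used in the project.

```python
from collections import Counter


class LambdaSketch:
    """Toy model of the three layers, counting hashtags."""

    def __init__(self):
        self.master_dataset = []        # B: immutable, append-only
        self.batch_view = Counter()     # C/D: recomputed from scratch
        self.realtime_view = Counter()  # E/F: updated incrementally

    def ingest(self, hashtag):
        """A: every event is sent to both the batch and the speed layer."""
        self.master_dataset.append(hashtag)
        self.realtime_view[hashtag] += 1

    def run_batch(self):
        """C: recompute the batch view; D: swap it in when it is ready."""
        covered = len(self.master_dataset)
        self.batch_view = Counter(self.master_dataset[:covered])
        # The real-time view only has to cover events the batch view lacks.
        self.realtime_view = Counter(self.master_dataset[covered:])

    def query(self, hashtag):
        """G: resolve the query from both batch and real-time view."""
        return self.batch_view[hashtag] + self.realtime_view[hashtag]


system = LambdaSketch()
for tag in ["doag", "doag", "lambda"]:
    system.ingest(tag)
before_batch = system.query("doag")

system.run_batch()
system.ingest("doag")
after_batch = system.query("doag")
```

After `run_batch()` the count of 2 comes from the batch view and the newest event from the real-time view, so the merged answer stays correct at every point in time.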
Lambda Architecture
Batch Layer
•  Stores the immutable, constantly growing dataset
•  Computes arbitrary views from this dataset using Big Data technologies (can take hours)
•  Can always be recreated
Speed Layer
•  Computes the views from the constant stream of data it receives
•  Needed to compensate for the high latency of the batch layer
•  Incremental model and views are transient
Serving Layer
•  Responsible for indexing and exposing the pre-computed batch views so that they can be queried
•  Exposes the incremented real-time views
•  Merges the batch and the real-time views into a consistent result
Lambda Architecture
[Diagram: Incoming Data (social, mobile, IoT, …) → Sensor Layer → Distribution Layer → Batch Layer (All data → Batch recompute → Precomputed information) and Speed Layer (Process stream → Realtime increment → Incremented information) → Serving Layer (Precompute Views: batch views, real time views, DataService (Merge)) → Visualization]
Adapted from: Marz, N. & Warren, J. (2013) Big Data. Manning.
AGENDA
1.  Big Data and Fast Data, what is it?
2.  Architecting (Big) Data Systems
3.  The Lambda Architecture
4.  Use Case and the Implementation
5.  Summary and Outlook
Project Definition
•  Build a platform for analyzing Twitter communications retrospectively and in real time
•  Scalability and the ability to fuse the data with other information in the future are a must
•  Provide Web-based access to the analytical information
•  Invest in new, innovative and not widely proven technology
•  PoC environment, a pre-investment for future systems
Anatomy of a tweet
Dimensions highlighted on the slide: Time, Space, Content, Social, Technical
{
"created_at":"Sun Aug 18 14:29:11 +0000 2013",
"id":369103686938546176,
"id_str":"369103686938546176",
"text":"Baloncesto preparaci\u00f3n Eslovenia, Rajoy derrota a Merkel. #quelosepash",
"source":"\u003ca href=\"http://twitter.com/download/iphone\" rel=\"nofollow\"\u003eTwitter for iPhone\u003c/a\u003e",
"truncated":false,
"in_reply_to_status_id":null,
"in_reply_to_status_id_str":null,
"in_reply_to_user_id":null,
"in_reply_to_user_id_str":null,
"in_reply_to_screen_name":null,
"user":{
"id":15032594,
"id_str":"15032594",
"name":"Juan Carlos Romo\u2122",
"screen_name":"jcsromo",
"location":"Sopuerta, Vizcaya",
"url":null,
"description":"Portugalujo, saturado de todo, de baloncesto no. Twitter personal.",
"protected":false,
"followers_count":1331,
"friends_count":1326,
"listed_count":31,
"created_at":"Fri Jun 06 21:21:22 +0000 2008",
"favourites_count":255,
"utc_offset":7200,
"time_zone":"Madrid",
"geo_enabled":true,
"verified":false,
"statuses_count":22787,
"lang":"es",
"contributors_enabled":false,
"is_translator":false,
…
"profile_image_url_https":"https://si0.twimg.com/profile_images/2649762203be4973d9eb457a45077897879c47c8b7_normal.jpeg",
"profile_banner_url":"https://pbs.twimg.com/profile_banners/15032594/1371570460",
"profile_link_color":"2FC2EF",
"profile_sidebar_border_color":"FFFFFF",
"profile_sidebar_fill_color":"252429",
"profile_text_color":"666666",
"profile_use_background_image":true,
"default_profile":false,
"default_profile_image":false,
"following":null,
"follow_request_sent":null,
"notifications":null},
"geo":{
"type":"Point","coordinates":[43.28261499,-2.96464655]},
"coordinates":{"type":"Point","coordinates":[-2.96464655,43.28261499]},
"place":{"id":"cd43ea85d651af92",
"url":"https://api.twitter.com/1.1/geo/id/cd43ea85d651af92.json",
"place_type":"city",
"name":"Bilbao",
"full_name":"Bilbao, Vizcaya",
"country_code":"ES",
"country":"Espa\u00f1a",
"bounding_box":{"type":"Polygon","coordinates":[[[-2.9860102,43.2136542],
[-2.9860102,43.2901452],[-2.8803248,43.2901452],[-2.8803248,43.2136542]]]},
"attributes":{}},
"contributors": null,
"retweet_count":0,
"favorite_count":0,
"entities":{"hashtags":[{"text":"quelosepash","indices":[58,70]}],
"symbols":[],
"urls":[],
"user_mentions":[]},
"favorited":false,
"retweeted":false,
"filter_level":"medium",
"lang":"es"
}
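As a sketch of how the Time, Space, Content and Social dimensions could be pulled out of such a tweet, the following uses a trimmed copy of the JSON above. The `tweet_dimensions` function and the shape of its result are assumptions for illustration. Note that in the sample `geo` lists latitude first, while `coordinates` follows the GeoJSON order of longitude first.

```python
import json
from datetime import datetime

# Trimmed copy of the sample tweet; only the fields used below are kept.
raw_tweet = r'''
{
  "created_at": "Sun Aug 18 14:29:11 +0000 2013",
  "text": "Baloncesto preparaci\u00f3n Eslovenia, Rajoy derrota a Merkel. #quelosepash",
  "user": {"screen_name": "jcsromo", "followers_count": 1331},
  "geo": {"type": "Point", "coordinates": [43.28261499, -2.96464655]},
  "coordinates": {"type": "Point", "coordinates": [-2.96464655, 43.28261499]},
  "place": {"name": "Bilbao", "full_name": "Bilbao, Vizcaya", "country_code": "ES"},
  "entities": {"hashtags": [{"text": "quelosepash", "indices": [58, 70]}]},
  "lang": "es"
}
'''


def tweet_dimensions(tweet):
    """Extract the time, space, content and social dimension of one tweet."""
    lon, lat = tweet["coordinates"]["coordinates"]  # GeoJSON order: lon, lat
    return {
        # %a and %b expect English day/month names (default C locale).
        "time": datetime.strptime(tweet["created_at"], "%a %b %d %H:%M:%S %z %Y"),
        "space": {"lat": lat, "lon": lon, "place": tweet["place"]["full_name"]},
        "content": [tag["text"] for tag in tweet["entities"]["hashtags"]],
        "social": tweet["user"]["screen_name"],
    }


dimensions = tweet_dimensions(json.loads(raw_tweet))
```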
Views on Tweets in four dimensions
when ⇐ where+what+who (Time)
• Time series
• Timelines
where ⇐ when+what+who (Space)
• Geo maps
• Density plots
what ⇐ when+where+who (Content)
• Word clouds
• Topic trends
who ⇐ when+where+what (Social)
• Social network graphs
• Activity graphs
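Each of the four views fixes some dimensions and aggregates over another. A minimal sketch with made-up, already flattened tweets; the field names and the `view` helper are assumptions, not part of the project:

```python
from collections import Counter

# Hypothetical tweets reduced to one value per dimension.
tweet_list = [
    {"hour": "2013-08-18T14", "place": "Bilbao", "hashtag": "quelosepash", "user": "jcsromo"},
    {"hour": "2013-08-18T14", "place": "Madrid", "hashtag": "quelosepash", "user": "user_a"},
    {"hour": "2013-08-18T15", "place": "Bilbao", "hashtag": "baloncesto", "user": "jcsromo"},
]


def view(tweets, dimension, **filters):
    """Count tweets per value of one dimension, restricted by the others."""
    selected = [t for t in tweets
                if all(t[key] == value for key, value in filters.items())]
    return Counter(t[dimension] for t in selected)


time_series = view(tweet_list, "hour")                          # when
topics_in_bilbao = view(tweet_list, "hashtag", place="Bilbao")  # what, given where
active_users = view(tweet_list, "user", hashtag="quelosepash")  # who, given what
```

A time series, a word cloud or an activity graph would then be a rendering of one such counter.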
Accessing Twitter
Source | Limitations | Access
Twitter’s Search API | 3200 / user; 5000 / keyword; 180 requests / 15 minutes | free
Twitter’s Streaming API | 1%-40% of the volume | free
DataSift | none | 0.15-0.20$ / unit
Gnip | none | on request
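The Search API limit of 180 requests per 15 minutes translates into a minimum spacing between requests. The arithmetic is sketched below; the page size of 200 tweets per request is an assumption for illustration and not taken from the slide.

```python
REQUESTS_PER_WINDOW = 180      # from the table above
WINDOW_SECONDS = 15 * 60       # 15 minutes


def min_seconds_between_requests(requests_per_window=REQUESTS_PER_WINDOW,
                                 window_seconds=WINDOW_SECONDS):
    """Even spacing that never exceeds the rate limit."""
    return window_seconds / requests_per_window


def requests_needed(total_tweets, tweets_per_request):
    """Number of requests to page through total_tweets (ceiling division)."""
    return -(-total_tweets // tweets_per_request)


spacing = min_seconds_between_requests()
# Assumed page size of 200 tweets per request (hypothetical).
requests_for_one_user = requests_needed(3200, 200)
seconds_for_one_user = requests_for_one_user * spacing
```

Under these assumptions, reading the full 3200 tweets of one user takes a bit more than a minute at the maximum sustained rate.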
Lambda Architecture
Open Source Frameworks for implementing a Lambda Architecture
[Diagram: the layered Lambda Architecture overview (Sensor Layer, Distribution Layer, Batch Layer, Speed Layer, Serving Layer, Visualization), here annotated with open source frameworks]
Lambda Architecture in Action
Cloudera Distribution
•  Distribution of Apache Hadoop: HDFS, MapReduce, Hive, Flume, Pig, Impala
Cloudera Impala
•  Distributed query execution engine that runs against data stored in HDFS and HBase
Apache ZooKeeper
•  Distributed, highly available coordination service. Provides primitives such as distributed locks
Apache Storm & Trident
•  Distributed, fault-tolerant realtime computation system
Apache Cassandra
•  Distributed database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure
Twitter Hosebird Client (hbc)
•  Java HTTP client for consuming Twitter’s Streaming API
Spring Framework
•  Popular Java framework used to modularize part of the logic (sensor and serving layer)
Apache Kafka
•  Simple messaging framework based on the file system, used to distribute information to both batch and speed layer
Apache Avro
•  Serialization system for efficient cross-language RPC and persistent data storage
JSON
•  Open standard format that uses human-readable text to transmit data objects consisting of attribute–value pairs
Facts & Figures
Currently in total
•  2.7 TB Raw Data
•  1.1 TB Pre-Processed data in Impala
•  1 TB Solr indices for full text search
Cloudera 4.7.0 with Hadoop, Pig, Hive, Impala and Solr
Kafka 0.7, Storm 0.9, DataStax Enterprise Edition
14 active Twitter feeds
•  ~ 14 million tweets/day (> 5 billion tweets/year)
•  ~ 8 GB/day raw data, compressed (2 DVDs)
•  66 GB storage capacity / day (replication & views/results included)
Cluster of 10 nodes
•  ~100 processors
•  ~40 TB HD capacity in total; 46% used
•  >500 GB RAM
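The figures above can be cross-checked with a little arithmetic. The sketch assumes decimal units (1 TB = 1000 GB) and a constant daily growth, both of which are simplifications not stated on the slide.

```python
tweets_per_day = 14_000_000
raw_gb_per_day = 8             # compressed raw data
storage_gb_per_day = 66        # including replication and views/results
total_capacity_tb = 40
used_fraction = 0.46

# Yearly volume: matches the "> 5 billion tweets/year" on the slide.
tweets_per_year = tweets_per_day * 365

# Average compressed size of one raw tweet in bytes (decimal GB assumed).
bytes_per_tweet = raw_gb_per_day * 1_000_000_000 / tweets_per_day

# Remaining capacity and days until the cluster is full at constant growth.
free_gb = total_capacity_tb * 1000 * (1 - used_fraction)
days_until_full = free_gb / storage_gb_per_day
```

Under these assumptions the cluster has room for a little less than a year of additional data, and storage needs per day are roughly eight times the compressed raw volume.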
Lambda Architecture with Oracle Product Stack
Possible implementation with Oracle Product stack
[Diagram: the layered Lambda Architecture overview (Sensor Layer, Distribution Layer, Batch Layer, Speed Layer, Serving Layer, Visualization), annotated with Oracle products: Oracle NoSQL, Oracle RDBMS, Oracle Coherence, Oracle BigData Appliance, Oracle Event Processing, Oracle Event Processing for Embedded, Oracle GoldenGate, Oracle Data Integrator, Oracle Service Bus, Oracle WebLogic Server, WebLogic JMS, OBIEE, Oracle Endeca, Oracle BigData Connectors, Oracle BAM]
AGENDA
1.  Big Data and Fast Data, what is it?
2.  Architecting (Big) Data Systems
3.  The Lambda Architecture
4.  Use Case and the Implementation
5.  Summary and Outlook
Summary – The lambda architecture
•  Can discard batch views and real-time views and recreate everything from scratch
•  Mistakes corrected via re-computation
•  Scalability through platform and distribution
•  Data storage layer optimized independently from query resolution layer
•  Still at an early stage … but a very interesting idea!
•  Today a zoo of technologies is needed => infrastructure group might not like it
•  Better with so-called Hadoop distributions and Hadoop V2 (YARN)
Alternative Approaches – Motivation
Data Sharing in Map Reduce …
[Diagram: iterative jobs: Input → iter. 1 → iter. 2 → . . ., with an HDFS write and an HDFS read between the iterations; interactive queries: query 1, query 2, query 3, . . . each perform an HDFS read of the same Input to produce result 1, result 2, result 3]
Alternative Approaches – Motivation
What we would like …
[Diagram: the iterations (iter. 1, iter. 2, . . .) and the queries (query 1, query 2, query 3, . . .) share the Input through distributed memory after one-time processing]
Alternatives – Apache Spark
[Diagram: Spark stack: Spark core with Spark Streaming (real-time), Spark SQL (structured), GraphX (graph), MLlib (machine learning), …; running on YARN; storage in HDFS and Cassandra]
Alternative Technologies – Apache Spark
[Diagram: the layered Lambda Architecture overview (Sensor Layer, Distribution Layer, Batch Layer, Speed Layer, Serving Layer, Visualization), annotated with Apache Spark]
“Kappa Architecture”
[Diagram: Incoming Data (social, mobile, IoT, …) → Sensor Layer → Distribution Layer → Speed Layer (Process stream → Realtime increment → Incremented information) → Serving Layer (real time views, DataService) → Visualization; "All data" can be fed back into the stream processing via Replay; a separate Batch Layer (Batch analytical analysis → Precomputed analytics) provides an analytic view with its own DataService]
Adapted from: Marz, N. & Warren, J. (2013) Big Data. Manning.
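The "Replay" arrow is the key difference to the Lambda Architecture: there is only one processing path, and a changed computation is rolled out by replaying the retained data through a new version of the stream job. A minimal sketch with hypothetical names and data:

```python
from collections import Counter

# Retained, ordered, immutable log of incoming events (hypothetical hashtags).
event_log = ["DOAG", "doag", "Lambda", "doag"]


def run_stream_job(log, transform):
    """Process the log event by event, exactly like live stream processing."""
    view = Counter()
    for event in log:
        view[transform(event)] += 1
    return view


# Version 1 of the job counts hashtags case-sensitively.
view_v1 = run_stream_job(event_log, lambda tag: tag)

# Version 2 normalizes case; it is built by replaying the same log.
view_v2 = run_stream_job(event_log, str.lower)

# Once the replay has caught up, the serving layer switches to the new view.
serving_view = view_v2
```

No separate batch implementation of the same logic is needed; the price is that the log has to be retained long enough to replay it.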
Unified Log Processing Architecture
Stream processing allows for computing feeds off of other feeds
Derived feeds are no different than original feeds they are computed off
Single deployment of “Unified Log” but logically different feeds
[Diagram: processing steps Collector, Enrich / Transform, Aggregate by Minute, Customer Aggregate by Minute, Persist Raw Meter Readings, Persist Meter by Minute; feeds Meter Readings, Raw Meter Readings, Meter with Customer, Meter by Minute, Meter by Customer by Minute]
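The idea that derived feeds are computed off other feeds can be sketched with plain lists standing in for the feeds. The record layout, the meter-to-customer mapping and the function names are assumptions for illustration; the feed names follow the diagram.

```python
from collections import defaultdict

# Original feed (hypothetical readings in kWh).
raw_meter_readings = [
    {"meter_id": "m1", "minute": "10:00", "kwh": 0.5},
    {"meter_id": "m1", "minute": "10:00", "kwh": 0.25},
    {"meter_id": "m2", "minute": "10:00", "kwh": 1.0},
    {"meter_id": "m1", "minute": "10:01", "kwh": 0.75},
]

# Hypothetical reference data used by the Enrich / Transform step.
meter_to_customer = {"m1": "c1", "m2": "c1"}


def enrich(feed):
    """Derived feed: every reading extended with its customer."""
    for reading in feed:
        yield {**reading, "customer_id": meter_to_customer[reading["meter_id"]]}


def aggregate_by_minute(feed, key):
    """Derived feed: consumption summed per key and minute."""
    totals = defaultdict(float)
    for reading in feed:
        totals[(reading[key], reading["minute"])] += reading["kwh"]
    return [{key: k, "minute": minute, "kwh": kwh}
            for (k, minute), kwh in totals.items()]


meter_with_customer = list(enrich(raw_meter_readings))
meter_by_minute = aggregate_by_minute(raw_meter_readings, "meter_id")
meter_by_customer_by_minute = aggregate_by_minute(meter_with_customer, "customer_id")
```

Each result has the same shape as a feed and could itself be consumed by further steps, for example a persist step.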
Further information...
Questions and answers...
Guido Schmutz
Technology Manager
guido.schmutz@trivadis.com
19.11.2014
DOAG 2014 | Big Data und Fast Data - Lambda Architektur und deren Umsetzung

More Related Content

What's hot

Part 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache KuduPart 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache KuduCloudera, Inc.
 
Apache sqoop with an use case
Apache sqoop with an use caseApache sqoop with an use case
Apache sqoop with an use caseDavin Abraham
 
Building the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsBuilding the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsKhalid Salama
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)James Serra
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lakeJames Serra
 
Streaming Real-time Data to Azure Data Lake Storage Gen 2
Streaming Real-time Data to Azure Data Lake Storage Gen 2Streaming Real-time Data to Azure Data Lake Storage Gen 2
Streaming Real-time Data to Azure Data Lake Storage Gen 2Carole Gunst
 
Data Lake: A simple introduction
Data Lake: A simple introductionData Lake: A simple introduction
Data Lake: A simple introductionIBM Analytics
 
Modern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemModern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemJames Serra
 
Snowflake: The most cost-effective agile and scalable data warehouse ever!
Snowflake: The most cost-effective agile and scalable data warehouse ever!Snowflake: The most cost-effective agile and scalable data warehouse ever!
Snowflake: The most cost-effective agile and scalable data warehouse ever!Visual_BI
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureDatabricks
 
(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWSAmazon Web Services
 
Data platform architecture
Data platform architectureData platform architecture
Data platform architectureSudheer Kondla
 
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...HostedbyConfluent
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenLorenzo Alberton
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduCloudera, Inc.
 
Databricks Fundamentals
Databricks FundamentalsDatabricks Fundamentals
Databricks FundamentalsDalibor Wijas
 
Snowflake: The Good, the Bad, and the Ugly
Snowflake: The Good, the Bad, and the UglySnowflake: The Good, the Bad, and the Ugly
Snowflake: The Good, the Bad, and the UglyTyler Wishnoff
 
Building the Enterprise Data Lake - Important Considerations Before You Jump In
Building the Enterprise Data Lake - Important Considerations Before You Jump InBuilding the Enterprise Data Lake - Important Considerations Before You Jump In
Building the Enterprise Data Lake - Important Considerations Before You Jump InSnapLogic
 

What's hot (20)

Part 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache KuduPart 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache Kudu
 
Apache sqoop with an use case
Apache sqoop with an use caseApache sqoop with an use case
Apache sqoop with an use case
 
Building the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsBuilding the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake Analytics
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
 
Streaming Real-time Data to Azure Data Lake Storage Gen 2
Streaming Real-time Data to Azure Data Lake Storage Gen 2Streaming Real-time Data to Azure Data Lake Storage Gen 2
Streaming Real-time Data to Azure Data Lake Storage Gen 2
 
Data Lake: A simple introduction
Data Lake: A simple introductionData Lake: A simple introduction
Data Lake: A simple introduction
 
Modern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemModern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform System
 
Data Migration to Azure
Data Migration to AzureData Migration to Azure
Data Migration to Azure
 
Snowflake: The most cost-effective agile and scalable data warehouse ever!
Snowflake: The most cost-effective agile and scalable data warehouse ever!Snowflake: The most cost-effective agile and scalable data warehouse ever!
Snowflake: The most cost-effective agile and scalable data warehouse ever!
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data Architecture
 
(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS
 
Data platform architecture
Data platform architectureData platform architecture
Data platform architecture
 
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and when
 
Druid deep dive
Druid deep diveDruid deep dive
Druid deep dive
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache Kudu
 
Databricks Fundamentals
Databricks FundamentalsDatabricks Fundamentals
Databricks Fundamentals
 
Snowflake: The Good, the Bad, and the Ugly
Snowflake: The Good, the Bad, and the UglySnowflake: The Good, the Bad, and the Ugly
Snowflake: The Good, the Bad, and the Ugly
 
Building the Enterprise Data Lake - Important Considerations Before You Jump In
Building the Enterprise Data Lake - Important Considerations Before You Jump InBuilding the Enterprise Data Lake - Important Considerations Before You Jump In
Building the Enterprise Data Lake - Important Considerations Before You Jump In
 

The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

Big Data and Fast Data - Lambda Architecture in Action

  • 1. 2014 © Trivadis. Big Data and Fast Data - Lambda Architecture and Its Implementation. Guido Schmutz, DOAG Konferenz 2014, 19.11.2014, 16:00, Room Oslo. Trivadis locations: BASEL BERN BRUGG LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN
  • 2. Guido Schmutz
      • Working for Trivadis for more than 17 years
      • Oracle ACE Director for Fusion Middleware and SOA
      • Co-author of different books
      • Consultant, Trainer, Software Architect for Java, Oracle, SOA and Big Data / Fast Data
      • Member of Trivadis Architecture Board
      • Technology Manager @ Trivadis
      • More than 25 years of software development experience
      • Contact: guido.schmutz@trivadis.com
      • Blog: http://guidoschmutz.wordpress.com
      • Twitter: gschmutz
  • 3. Our company: Trivadis is a market leader in IT consulting, system integration, solution engineering and the provision of IT services focusing on and technologies in Switzerland, Germany and Austria. We offer our services in the following strategic business fields: OPERATION. Trivadis Services takes over the interacting operation of your IT systems.
  • 4. AGENDA 1. Big Data and Fast Data, what is it? 2. Architecting (Big) Data Systems 3. The Lambda Architecture 4. Use Case and the Implementation 5. Summary and Outlook
  • 5. Big Data Definition (4 Vs). Characteristics of Big Data: its Volume, Velocity and Variety in combination. Plus time to action? Big Data + Event Processing = Fast Data
  • 6. The world is changing … The model of generating/consuming data has changed. Old model: a few companies generate data, all others consume data. New model: all of us generate data, and all of us consume data.
  • 7. [Image slide, no text]
  • 8. Internet of Things – Sensors are/will be everywhere. There are more devices tapping into the internet than people on earth. How do we prepare our systems/architecture for the future? Sources: Cisco; The Economist
  • 9. The world is changing … new data stores
      Problems of the traditional (R)DBMS approach:
      • Complex object graph
      • Schema evolution
      • Semi-structured data
      • Scaling
      Polyglot persistence:
      • Using multiple data storage technologies (RDBMS + NoSQL + NewSQL + In-Memory)
      Example: relational tables ORDER, ADDRESS, CUSTOMER, ORDER_LINES versus a single order object:
      Order ID: 1001, Order Date: 15.9.2012
      Customer: First Name: Peter, Last Name: Sample
      Billing Address: Street: Somestreet 10, City: Somewhere, Postal Code: 55901
      Line Items:
        Name            Quantity   Price
        Ipod Touch      1          220.95
        Monster Beat    2          190.00
        Apple Mouse     1          69.90
  • 10. The world is changing … New platforms are evolving (e.g. the Hadoop ecosystem)
  • 11. Data as an Asset – Store everything?
      "Data is just too valuable to delete! We must store anything!"
      "Nonsense! Just store the data you know you need today!"
      It depends … Big Data technologies allow you to store the raw information from new and existing data sources, so that you can later use it to create new data-driven products that you haven't thought about today!
  • 12. AGENDA 1. Big Data and Fast Data, what is it? 2. Architecting (Big) Data Systems 3. The Lambda Architecture 4. Use Case and the Implementation 5. Summary and Outlook
  • 13. What is a data system?
      • A (data) system manages the storage and querying of data with a lifetime measured in years, encompassing every version of the application to ever exist, every hardware failure and every human mistake ever made.
      • A data system answers questions based on information that was acquired in the past.
  • 14. How do we build (data) systems today – Today's Architectures
      Source of truth is mutable!
      • CRUD pattern
      What is the problem with this?
      • Lack of human fault tolerance
      • Potential loss of information/data
      [Diagram: applications (Mobile, Web, RIA, Rich Client) query a mutable database (RDBMS, NoSQL, NewSQL), which is the source of truth]
  • 15. Lack of Human Fault Tolerance
      Bugs will be deployed to production over the lifetime of a data system. Operational mistakes will be made.
      Humans are part of the overall system:
      • Just like hard disks, CPUs, memory, software
      • Design for human error like you design for any other fault
      Examples of human error:
      • Deploy a bug that increments counters by two instead of by one
      • Accidentally delete data from the database
      • Accidental DoS on an important internal service
      Worst two consequences: data loss or data corruption. As long as an error doesn't lose or corrupt good data, you can fix what went wrong.
  • 16. Lack of Human Fault Tolerance – Immutability vs. Mutability
      The U and D in CRUD:
      • A mutable system updates the current state of the world
      • Mutable systems inherently lack human fault-tolerance
      • Easy to corrupt or lose data
      An immutable system captures historical records of events:
      • Each event happens at a particular time and is always true
      • Immutability restricts the range of errors causing data loss/data corruption
      • Vastly more human fault-tolerant
      Conclusion: your source of truth should always be immutable.
  • 17. A different kind of architecture with an immutable source of truth. Instead of using our traditional approach … why not build data systems like this? [Diagram: applications (Mobile, Web, RIA, Rich Client) query views on data (NoSQL, NewSQL, RDBMS); the views are derived from immutable data (HDFS), which is the source of truth]
  • 18. How to create the views on the immutable data? On the fly (query, view, immutable data)? Or materialized, i.e. pre-computed (query, pre-computed views, immutable data)?
  • 19. (Big) Data Processing. [Diagram: Incoming Data, Immutable data, Pre-Computed Views, Query] How to compute the materialized views? How to compute queries from the views?
  • 20. Today Big Data Processing means Batch Processing … [Diagram: events from Stream 1 and Stream 2; HDFS (Hadoop Distributed File System), a data store optimized for appending large results; Hadoop cluster (Map/Reduce); Queries]
  • 21. Big Data Processing - Batch
      Favorite Product List Changes (raw information => data):
        1.2.13   Add iPAD 64GB
        10.3.13  Add Sony RX-100
        11.3.13  Add Canon GX-10
        11.3.13  Remove Sony RX-100
        12.3.13  Add Nikon S-100
        14.4.13  Add BoseQC-15
        15.4.13  Add MacBook Pro 15
        20.4.13  Remove Canon GX-10
      Derived from it (information => derived):
        Current Favorite Product List: iPAD 64GB, Nikon S-100, BoseQC-15, MacBook Pro 15
        Current Product Count: 4
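A minimal Python sketch of the derivation shown on this slide: the list of favorite-list changes is the raw, immutable data, and both the current favorite list and the product count are derived from it. The tuple layout and the function name `derive_favorites` are assumptions made for illustration; the events are the ones on the slide, with the product spelled `Canon GX-10` throughout.

```python
from datetime import date

# Immutable, append-only log of favorite-list changes (taken from the slide).
EVENTS = [
    (date(2013, 2, 1), "add", "iPAD 64GB"),
    (date(2013, 3, 10), "add", "Sony RX-100"),
    (date(2013, 3, 11), "add", "Canon GX-10"),
    (date(2013, 3, 11), "remove", "Sony RX-100"),
    (date(2013, 3, 12), "add", "Nikon S-100"),
    (date(2013, 4, 14), "add", "BoseQC-15"),
    (date(2013, 4, 15), "add", "MacBook Pro 15"),
    (date(2013, 4, 20), "remove", "Canon GX-10"),
]


def derive_favorites(events):
    """Replay all events in date order and return the current favorite list."""
    favorites = []
    # sorted() is stable, so events on the same day keep their log order.
    for _when, action, product in sorted(events, key=lambda e: e[0]):
        if action == "add" and product not in favorites:
            favorites.append(product)
        elif action == "remove" and product in favorites:
            favorites.remove(product)
    return favorites


current_favorites = derive_favorites(EVENTS)
print(current_favorites)       # the derived list
print(len(current_favorites))  # the derived count
```

Because the event log is never modified, a bug in `derive_favorites` can be fixed and the view recomputed without losing any information.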
  • 22. Big Data Processing – Batch. Using only batch processing always leaves you with a portion of non-processed data. [Timeline: batch-processed (fully processed) data up to the last full batch period; non-processed data during the time for the batch job, up to now] But we are not done yet …
  • 23. Big Data Processing - Adding Real-Time. [Diagram: Incoming Data feeds both the immutable data (Batch Views) and the data stream (Realtime Views); Query] How to compute real-time views? How to compute queries from the views?
  • 24. Big Data Processing - Adding Real-Time
      Favorite Product List Changes (immutable data):
        1.2.13   Add iPAD 64GB
        10.3.13  Add Sony RX-100
        11.3.13  Add Canon GX-10
        11.3.13  Remove Sony RX-100
        12.3.13  Add Nikon S-100
        14.4.13  Add BoseQC-15
        15.4.13  Add MacBook Pro 15
        20.4.13  Remove Canon GX-10
        Now      Add Canon Scanner
      Stream of Favorite Product List Changes (data stream, incoming): Now Add Canon Scanner
      Computed views:
        Current Favorite Product List: iPAD 64GB, Nikon S-100, BoseQC-15, MacBook Pro 15, Canon Scanner
        Current Product Count: 5
  • 25. Big Data Processing - Batch & Real Time. [Timeline: batch processing worked fine for the fully processed data up to the last full batch period (e.g. Hadoop); real-time processing works for the time for the batch job, up to now; blended view for the end user] Adapted from Ted Dunning (March 2012): http://www.youtube.com/watch?v=7PcmbI5aC20
  • 26. AGENDA 1. Big Data and Fast Data, what is it? 2. Architecting (Big) Data Systems 3. The Lambda Architecture 4. Use Case and the Implementation 5. Summary and Outlook
  • 27. Lambda Architecture. Lambda => Query = function(all data). [Diagram with callouts A to G: Incoming Data; Batch Layer (Immutable data); Serving Layer (Batch View); Speed Layer (Data Stream, Realtime View); Query]
  • 28. Lambda Architecture
      A. All data is sent to both the batch and the speed layer
      B. The master data set is an immutable, append-only set of data
      C. The batch layer pre-computes query functions from scratch; the results are called batch views. The batch layer constantly re-computes the batch views.
      D. Batch views are indexed and stored in a scalable database to get particular values very quickly. New batch views are swapped in when they are available.
      E. The speed layer compensates for the high latency of updates to the batch views
      F. It uses fast incremental algorithms and read/write databases to produce real-time views
      G. Queries are resolved by getting results from both batch and real-time views
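The steps A to G can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the implementation used in the project: tweets are plain dicts with a hypothetical `hashtags` field, the batch view is recomputed from the full master dataset, and the real-time view only covers tweets that arrived after the last batch run.

```python
from collections import Counter


def compute_batch_view(master_dataset):
    """Batch layer (C): recompute hashtag counts from scratch."""
    view = Counter()
    for tweet in master_dataset:
        view.update(tweet["hashtags"])
    return view


class RealtimeView:
    """Speed layer (E, F): incrementally updated counts for recent tweets."""

    def __init__(self):
        self.counts = Counter()

    def update(self, tweet):
        self.counts.update(tweet["hashtags"])


def query(tag, batch_view, realtime_view):
    """Serving layer (G): merge the batch and the real-time result."""
    return batch_view[tag] + realtime_view.counts[tag]


# Master dataset (B) at the time of the last batch run.
master_dataset = [
    {"hashtags": ["bigdata", "hadoop"]},
    {"hashtags": ["bigdata"]},
]
batch_view = compute_batch_view(master_dataset)

# A new tweet arrives and goes to both layers (A).
realtime_view = RealtimeView()
new_tweet = {"hashtags": ["bigdata", "storm"]}
master_dataset.append(new_tweet)   # appended, never updated in place
realtime_view.update(new_tweet)    # incremental update

print(query("bigdata", batch_view, realtime_view))  # 2 from batch + 1 from real time = 3
```

Once the next batch run has absorbed `new_tweet`, the real-time view for that period can be discarded, which is why the speed layer's state is transient.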
  • 29. Lambda Architecture
      Batch Layer:
      • Stores the immutable, constantly growing dataset
      • Computes arbitrary views from this dataset using Big Data technologies (can take hours)
      • Can always be recreated
      Speed Layer:
      • Computes the views from the constant stream of data it receives
      • Needed to compensate for the high latency of the batch layer
      • Incremental model; views are transient
      Serving Layer:
      • Responsible for indexing and exposing the pre-computed batch views so that they can be queried
      • Exposes the incremented real-time views
      • Merges the batch and the real-time views into a consistent result
  • 30. Lambda Architecture. [Diagram: Sensor Layer (Incoming Data: social, mobile, IoT, …); Distribution Layer; Batch Layer (All data, Batch recompute, Precompute Views); Speed Layer (Process stream, Realtime increment); Serving Layer (Precomputed information: batch views; Incremented information: real time views; DataService (Merge)); Visualization] Adapted from: Marz, N. & Warren, J. (2013) Big Data. Manning.
  • 31. AGENDA 1. Big Data and Fast Data, what is it? 2. Architecting (Big) Data Systems 3. The Lambda Architecture 4. Use Case and the Implementation 5. Summary and Outlook
  • 32. Project Definition
      • Build a platform for analyzing Twitter communications retrospectively and in real time
      • Scalability and the ability for future data fusion with other information are a must
      • Provide Web-based access to the analytical information
      • Invest in new, innovative and not widely proven technology
      • PoC environment, a pre-investment for future systems
  • 33. Anatomy of a tweet (dimensions highlighted on the slide: Time, Space, Content, Social, Technical)
      { "created_at":"Sun Aug 18 14:29:11 +0000 2013",
        "id":369103686938546176, "id_str":"369103686938546176",
        "text":"Baloncesto preparaci\u00f3n Eslovenia, Rajoy derrota a Merkel. #quelosepash",
        "source":"\u003ca href=\"http://twitter.com/download/iphone\" rel=\"nofollow\"\u003eTwitter for iPhone\u003c/a\u003e",
        "truncated":false,
        "in_reply_to_status_id":null, "in_reply_to_status_id_str":null,
        "in_reply_to_user_id":null, "in_reply_to_user_id_str":null, "in_reply_to_screen_name":null,
        "user":{ "id":15032594, "id_str":"15032594", "name":"Juan Carlos Romo\u2122", "screen_name":"jcsromo",
          "location":"Sopuerta, Vizcaya", "url":null,
          "description":"Portugalujo, saturado de todo, de baloncesto no. Twitter personal.",
          "protected":false, "followers_count":1331, "friends_count":1326, "listed_count":31,
          "created_at":"Fri Jun 06 21:21:22 +0000 2008", "favourites_count":255,
          "utc_offset":7200, "time_zone":"Madrid", "geo_enabled":true, "verified":false,
          "statuses_count":22787, "lang":"es", "contributors_enabled":false, "is_translator":false,
          …
          "profile_image_url_https":"https://si0.twimg.com/profile_images/2649762203 be4973d9eb457a45077897879c47c8b7_normal.jpeg",
          "profile_banner_url":"https://pbs.twimg.com/profile_banners/15032594/1371570460",
          "profile_link_color":"2FC2EF", "profile_sidebar_border_color":"FFFFFF",
          "profile_sidebar_fill_color":"252429", "profile_text_color":"666666",
          "profile_use_background_image":true, "default_profile":false, "default_profile_image":false,
          "following":null, "follow_request_sent":null, "notifications":null},
        "geo":{"type":"Point","coordinates":[43.28261499,-2.96464655]},
        "coordinates":{"type":"Point","coordinates":[-2.96464655,43.28261499]},
        "place":{"id":"cd43ea85d651af92", "url":"https://api.twitter.com/1.1/geo/id/cd43ea85d651af92.json",
          "place_type":"city", "name":"Bilbao", "full_name":"Bilbao, Vizcaya",
          "country_code":"ES", "country":"Espa\u00f1a",
          "bounding_box":{"type":"Polygon","coordinates":[[[-2.9860102,43.2136542],[-2.9860102,43.2901452],[-2.8803248,43.2901452],[-2.8803248,43.2136542]]]},
          "attributes":{}},
        "contributors":null, "retweet_count":0, "favorite_count":0,
        "entities":{"hashtags":[{"text":"quelosepash","indices":[58,70]}], "symbols":[], "urls":[], "user_mentions":[]},
        "favorited":false, "retweeted":false, "filter_level":"medium", "lang":"es" }
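As a small illustration of how the dimensions of a tweet could be pulled out of such a record, the following Python sketch parses a shortened copy of the tweet above. Which field feeds which dimension is an assumption made here, as is the function name `extract_dimensions`. Note that the `coordinates` field uses GeoJSON order (longitude, latitude), while the `geo` field lists latitude first.

```python
import json
from datetime import datetime

# Shortened copy of the tweet from the slide (most fields omitted).
RAW_TWEET = """
{
  "created_at": "Sun Aug 18 14:29:11 +0000 2013",
  "id_str": "369103686938546176",
  "user": {"screen_name": "jcsromo", "followers_count": 1331},
  "coordinates": {"type": "Point", "coordinates": [-2.96464655, 43.28261499]},
  "place": {"name": "Bilbao", "country_code": "ES"},
  "entities": {"hashtags": [{"text": "quelosepash", "indices": [58, 70]}]},
  "lang": "es"
}
"""

# Timestamp layout used in the created_at field of the example.
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def extract_dimensions(tweet):
    """Map a tweet onto time, space, content and social (hypothetical mapping)."""
    lon, lat = tweet["coordinates"]["coordinates"]  # GeoJSON: longitude first
    return {
        "time": datetime.strptime(tweet["created_at"], TWITTER_TIME_FORMAT),
        "space": (lat, lon),
        "content": [tag["text"] for tag in tweet["entities"]["hashtags"]],
        "social": tweet["user"]["screen_name"],
    }


dimensions = extract_dimensions(json.loads(RAW_TWEET))
print(dimensions["time"].isoformat())
print(dimensions["space"], dimensions["content"], dimensions["social"])
```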
  • 34. Views on Tweets in four dimensions (Time, Space, Content, Social)
      • when ⇐ where + what + who: time series, timelines
      • where ⇐ when + what + who: geo maps, density plots
      • what ⇐ when + where + who: word clouds, topic trends
      • who ⇐ when + where + what: social network graphs, activity graphs
  • 35. Accessing Twitter
      Source                    Limitations                                               Access
      Twitter's Search API      3200 / user, 5000 / keyword, 180 requests / 15 minutes    free
      Twitter's Streaming API   1%-40% of the volume                                      free
      DataSift                  none                                                      0.15-0.20 $ / unit
      Gnip                      none                                                      on request
  • 36. Lambda Architecture: Open Source Frameworks for implementing a Lambda Architecture. [Diagram: Sensor Layer (Incoming Data: social, mobile, IoT, …); Distribution Layer; Batch Layer (All data, Batch recompute, Precompute Views); Speed Layer (Process stream, Realtime increment); Serving Layer (Precomputed information: batch views; Incremented information: real time views; DataService (Merge)); Visualization]
  • 37. Lambda Architecture in Action
      • Cloudera Distribution: distribution of Apache Hadoop: HDFS, MapReduce, Hive, Flume, Pig, Impala
      • Cloudera Impala: distributed query execution engine that runs against data stored in HDFS and HBase
      • Apache Zookeeper: distributed, highly available coordination service; provides primitives such as distributed locks
      • Apache Storm & Trident: distributed, fault-tolerant realtime computation system
      • Apache Cassandra: distributed database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure
      • Twitter Hosebird Client (hbc): Twitter Java API over the Streaming API
      • Spring Framework: popular Java framework used to modularize part of the logic (sensor and serving layer)
      • Apache Kafka: simple messaging framework based on the file system, used to distribute information to both batch and speed layer
      • Apache Avro: serialization system for efficient cross-language RPC and persistent data storage
      • JSON: open standard format that uses human-readable text to transmit data objects consisting of attribute-value pairs
  • 38. Facts & Figures
      Currently in total:
      • 2.7 TB raw data
      • 1.1 TB pre-processed data in Impala
      • 1 TB Solr indices for full-text search
      Software: Cloudera 4.7.0 with Hadoop, Pig, Hive, Impala and Solr; Kafka 0.7, Storm 0.9, DataStax Enterprise Edition
      14 active Twitter feeds:
      • ~14 million tweets/day (> 5 billion tweets/year)
      • ~8 GB/day raw data, compressed (2 DVDs)
      • 66 GB storage capacity/day (replication and views/results included)
      Cluster of 10 nodes:
      • ~100 processors
      • ~40 TB HD capacity in total; 46% used
      • >500 GB RAM
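The figures on this slide can be cross-checked with a little arithmetic. The Python sketch below is a rough estimate only: it assumes decimal units (1 GB = 10^9 bytes, 1 TB = 1000 GB), treats the approximate numbers as exact, and assumes a constant daily growth, none of which is stated on the slide.

```python
# Figures from the slide, treated as exact for the estimate.
TWEETS_PER_DAY = 14_000_000
RAW_GB_PER_DAY = 8            # compressed raw data
STORAGE_GB_PER_DAY = 66       # including replication and views/results
TOTAL_CAPACITY_TB = 40
USED_FRACTION = 0.46

# Yearly volume: consistent with "> 5 billion tweets/year".
tweets_per_year = TWEETS_PER_DAY * 365

# Average compressed size of one raw tweet, in bytes.
bytes_per_tweet = RAW_GB_PER_DAY * 10**9 / TWEETS_PER_DAY

# Remaining capacity and days until the cluster is full at the current rate.
free_gb = TOTAL_CAPACITY_TB * 1000 * (1 - USED_FRACTION)
days_until_full = free_gb / STORAGE_GB_PER_DAY

print(f"{tweets_per_year:,} tweets/year")
print(f"{bytes_per_tweet:.0f} bytes per compressed tweet")
print(f"{days_until_full:.0f} days of capacity left")
```

Under these assumptions the cluster would have a little under a year of headroom, which is a derived estimate and not a figure from the talk.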
  • 39. Lambda Architecture with Oracle Product Stack: possible implementation with the Oracle product stack. [Diagram: the layers of the Lambda Architecture (Sensor, Distribution, Batch, Speed, Serving, Visualization) annotated with Oracle products: Oracle NoSQL, Oracle RDBMS, Oracle Coherence, Oracle BigData Appliance, Oracle Event Processing, Oracle Event Processing for Embedded, Oracle GoldenGate, Oracle Data Integrator, Oracle Service Bus, Oracle WebLogic Server, WebLogic JMS, OBIEE, Oracle Endeca, Oracle BAM, Oracle BigData Connectors]
  • 40. AGENDA 1. Big Data and Fast Data, what is it? 2. Architecting (Big) Data Systems 3. The Lambda Architecture 4. Use Case and the Implementation 5. Summary and Outlook
  • 41. Summary – The Lambda Architecture
      • Batch views and real-time views can be discarded and everything recreated from scratch
      • Mistakes are corrected via re-computation
      • Scalability through platform and distribution
      • Data storage layer optimized independently from the query resolution layer
      • Still at an early stage … but a very interesting idea!
      • Today a zoo of technologies is needed => the infrastructure group might not like it
      • Better with so-called Hadoop distributions and Hadoop V2 (YARN)
  • 42. Alternative Approaches – Motivation: Data Sharing in MapReduce … [Diagram: iterative jobs (iter. 1, iter. 2, …) read their input from HDFS and write to HDFS between iterations; interactive queries (query 1, query 2, query 3) each read the input from HDFS to produce result 1, result 2, result 3]
  • 43. Alternative Approaches – Motivation: What we would like … [Diagram: iterations (iter. 1, iter. 2, …) share data through distributed memory; after one-time processing of the input, query 1, query 2, query 3 are answered from distributed memory]
  • 44. Alternatives – Apache Spark. [Diagram: Spark with Spark Streaming (real-time), Spark SQL (structured), GraphX (graph), MLlib (machine learning), …; YARN; storage: HDFS, Cassandra]
  • 45. Alternative Technologies – Apache Spark. [Diagram: Sensor Layer (Incoming Data: social, mobile, IoT, …); Distribution Layer; Batch Layer (All data, Batch recompute, Precompute Views); Speed Layer (Process stream, Realtime increment); Serving Layer (Precomputed information: batch views; Incremented information: real time views; DataService (Merge)); Visualization]
  • 46. "Kappa Architecture". [Diagram: Sensor Layer (Incoming Data: social, mobile, IoT, …); Distribution Layer; Speed Layer (Process stream, Realtime increment, Replay); Batch Layer (All data, Batch analytical analysis); Serving Layer (Incremented information: real time views, DataService; Precomputed analytics: analytic view, DataService); Visualization] Adapted from: Marz, N. & Warren, J. (2013) Big Data. Manning.
  • 47. Unified Log Processing Architecture
      • Stream processing allows for computing feeds off of other feeds
      • Derived feeds are no different from the original feeds they are computed from
      • Single deployment of the "Unified Log", but logically different feeds
      [Diagram: a Collector writes Meter Readings to the feed Raw Meter Readings; Enrich / Transform joins them with Customer into the feed Meter with Customer; Aggregate by Minute produces the feeds Meter by Minute and Meter by Customer by Minute; Persist Raw Meter Readings; Persist Meter by Minute]
      Slide taken from: Einheitlicher Umgang mit Ereignisströmen (uniform handling of event streams) - Unified Log Processing Architecture, August 2014
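The idea that feeds are computed off other feeds can be sketched in Python with plain lists standing in for the feeds of the unified log. The record layout, the meter-to-customer mapping and the function names are hypothetical; a real deployment would read from and write to topics of a log system such as Kafka.

```python
from collections import defaultdict

# Feed "Raw Meter Readings" (hypothetical records).
raw_meter_readings = [
    {"meter": "m1", "minute": "10:00", "kwh": 0.5},
    {"meter": "m1", "minute": "10:00", "kwh": 0.25},
    {"meter": "m2", "minute": "10:00", "kwh": 1.0},
    {"meter": "m3", "minute": "10:01", "kwh": 2.0},
]

# Reference data used by the enrich step (hypothetical).
METER_TO_CUSTOMER = {"m1": "C1", "m2": "C1", "m3": "C2"}


def enrich(feed):
    """Derive the feed 'Meter with Customer' from the raw feed."""
    return [{**reading, "customer": METER_TO_CUSTOMER[reading["meter"]]} for reading in feed]


def aggregate_by_minute(feed, key):
    """Derive a per-minute feed, summing kWh per (key, minute)."""
    totals = defaultdict(float)
    for reading in feed:
        totals[(reading[key], reading["minute"])] += reading["kwh"]
    return dict(totals)


# Derived feeds are built from other feeds, never from hidden state.
meter_with_customer = enrich(raw_meter_readings)
meter_by_minute = aggregate_by_minute(raw_meter_readings, "meter")
customer_by_minute = aggregate_by_minute(meter_with_customer, "customer")

print(meter_by_minute)
print(customer_by_minute)
```

Since every derived feed is a pure function of its input feed, any of them can be rebuilt by replaying the raw readings.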
  • 48. Further information...
  • 49. Questions and answers... Guido Schmutz, Technology Manager, guido.schmutz@trivadis.com