SlideShare a Scribd company logo
1 of 40
Download to read offline
BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENEVA
HAMBURG COPENHAGEN LAUSANNE MUNICH STUTTGART VIENNA ZURICH
Big Data Solution Architectures
29.9.2016 – DOAG 2016 Big Data Days
Guido Schmutz
Trivadis
Guido Schmutz
Working for Trivadis for more than 19 years
Oracle ACE Director for Fusion Middleware and SOA
Co-Author of different books
Consultant, Trainer, Software Architect for Java, SOA & Big Data / Fast Data
Member of Trivadis Architecture Board
Technology Manager @ Trivadis
More than 25 years of software development experience
Contact: guido.schmutz@trivadis.com
Blog: http://guidoschmutz.wordpress.com
Slideshare: http://www.slideshare.net/gschmutz
Twitter: gschmutz
29.9.2016 Big Data Solution Architectures2
Agenda
Big Data Solution Architectures3 29.9.2016
1. Introduction
2. Big Data Reference Architectures
• Traditional Big Data
• Event / Stream-Processing
• Lambda Architecture
• Kappa Architecture
• Unified Architecture
3. Big Data Ecosystem – many choices sorted!
Introduction
Big Data Solution Architectures29.9.20164
Why talking about Big Data Architectures
Choosing the right architecture is key for any (big data) project
Big Data is still quite a rather young field and therefore a “moving target”
no standard architectures available which have been used for years
In the past years, some architectures and best practices have evolved
Know your use cases before choosing your architecture / technologies
To have a reference architecture in place helps in choosing the
right/matching technologies
Big Data Solution Architectures29.9.20165
How to do Big Data? Why is a structure / architecture
important
Big Data Solution Architectures29.9.20166
Big Data Ecosystem – many choices sorted!
Big Data Solution Architectures29.9.20167
Important Properties for choosing (Big) Data Architecture
Latency
Keep raw and un-interpreted data “forever” ?
Volume, Velocity, Variety, Veracity
Ad-Hoc Query Capabilities needed ?
Robustness & Fault Tolerance
Scalability
…
Big Data Solution Architectures29.9.20169
Big Data Reference Architectures -
Traditional Big Data
Big Data Solution Architectures29.9.201610
“Traditional Architecture” for Big Data
Data
Ingestion
(Analytical)	Data	Processing
Result	StoreData
Sources
Data
Consumer
Reports
Service
Analytic
Tools
Alerting
Tools
Content
RDBMS
Social
ERP
Logfiles
Sensor
Machine
Batch
compute
Pushing	
Ingestion Result	Store
Query
Engine
Computed	
Information
Raw	Data	
(Reservoir)
=	Data	in	Motion =	Data	at	Rest
Big Data Solution Architectures
Pulling	
Ingestion
Channel
29.9.201611
“Traditional Architecture” for Big Data – Hadoop
Technology Mapping
Data
Ingestion
(Analytical)	Data	Processing
Result	StoreData
Sources
Data
Consumer
Reports
Service
Analytic
Tools
Alerting
Tools
Content
RDBMS
Social
ERP
Logfiles
Sensor
Machine
Batch
compute
Pushing	
Ingestion Result	Store
Query
Engine
Computed	
Information
Raw	Data	
(Reservoir)
=	Data	in	Motion =	Data	at	Rest
Big Data Solution Architectures
Pulling	
Ingestion
Channel
29.9.201612
“Traditional Architecture” for Big Data – Spark
Technology Mapping
Data
Ingestion
(Analytical)	Data	Processing
Result	StoreData
Sources
Data
Consumer
Reports
Service
Analytic
Tools
Alerting
Tools
Content
RDBMS
Social
ERP
Logfiles
Sensor
Machine
Batch
compute
Pushing	
Ingestion Result	Store
Query
Engine
Computed	
Information
Raw	Data	
(Reservoir)
=	Data	in	Motion =	Data	at	Rest
Big Data Solution Architectures
Pulling	
Ingestion
Channel
29.9.201613
“Traditional Architecture” for Big Data – Feeding in High-
Volume Event Streams
Data
Ingestion
(Analytical)	Data	Processing
Result	StoreData
Sources
Data
Consumer
Reports
Service
Analytic
Tools
Alerting
Tools
Content
RDBMS
Social
ERP
Logfiles
Sensor
Machine
Batch
compute
Pushing	
Ingestion Result	Store
Query
Engine
Computed	
Information
Raw	Data	
(Reservoir)
=	Data	in	Motion =	Data	at	Rest
Big Data Solution Architectures
Pulling	
Ingestion
Channel
?
?
29.9.201614
Traditional Architecture for Big Data
• Batch Processing - “Data at Rest”
• Not for low latency use cases
• Responses are delivered “after the fact”
• Maximum value of the identified situation is lost
• Decision are made on old and stale data
• Spar Core is a faster alternative to Hadoop Map
Reduce, but still Batch Processing
• Spark Ecosystems offers a lot of additional
advanced analytic capabilities (machine learning,
graph processing, …)
Big Data Solution Architectures29.9.201615
Big Data Reference Architectures –
Event/Stream Processing
Big Data Solution Architectures29.9.201616
Event / Stream Processing – “Data in Motion”
“Data in motion”
Events are analyzed and processed in real-
time as the arrive
Decisions are timely, contextual and based
on fresh data
Decision latency is eliminated
Big Data Solution Architectures29.9.201617
Event / Stream Processing Architecture
Data
Ingestion
Batch
compute
Data
Sources
Channel
Data
Consumer
Reports
Service
Analytic
Tools
Alerting
Tools
Content
Logfiles
Social
RDBMS
ERP
Sensor
Machine
(Analytical)	Real-Time	Data	Processing
Stream/Event	Processing
Result	Store
Messaging
Result	Store
Big Data Solution Architectures
=	Data	in	Motion =	Data	at	Rest
29.9.201618
Continuous Ingestion
DB	Source
Big	Data
Log
Stream	
Processing
IoT Sensor
Event	Hub
Topic
Topic
REST
Topic
IoT GW
CDC	GW
Connect
CDC
DB	Source
Log CDC
Native
IoT Sensor
IoT Sensor
19
Dataflow	GW
Topic
Topic
Queue
MQTT	GW
Topic
Dataflow	GW
Dataflow
TopicREST
19
File	Source
Log
Log
Log
Social
Native
29.9.2016 Big Data Solution Architectures19
Topic
Topic
Challenges for Ingesting Sensor Data
Big Data Solution Architectures
Multitude of sensors
Real-Time Streaming
Multiple Firmware versions
Bad Data from damaged sensors
Regulatory Constraints
Data Quality
20 29.9.2016
SQL Polling
Change Data Capture
(CDC)
File Stream (File Tailing)
File Stream (Streaming
Appender)
Enabling Continuous Data Ingestion
Sensor Stream
Big Data Solution Architectures21 29.9.2016
Event / Stream Processing Architecture – Open Source
Technology Mapping
Data
Ingestion
Batch
compute
Data
Sources
Channel
Data
Consumer
Reports
Service
Analytic
Tools
Alerting
Tools
Content
Logfiles
Social
RDBMS
ERP
Sensor
Machine
(Analytical)	Real-Time	Data	Processing
Stream/Event	Processing
Result	Store
Messaging
Result	Store
Big Data Solution Architectures
=	Data	in	Motion =	Data	at	Rest
29.9.201622
Event / Stream Processing Architecture – Oracle
Technology Mapping
Data
Ingestion
Batch
compute
Data
Sources
Channel
Data
Consumer
Reports
Service
Analytic
Tools
Alerting
Tools
Content
Logfiles
Social
RDBMS
ERP
Sensor
Machine
(Analytical)	Real-Time	Data	Processing
Stream/Event	Processing
Result	Store
Messaging
Result	Store
Big Data Solution Architectures
=	Data	in	Motion =	Data	at	Rest
29.9.201623
Event / Stream Processing Architecture
The solution for low latency use cases
Process each event separately => low latency
Process events in micro-batches => increases latency but offers better
reliability
Previously known as “Complex Event Processing”
Keep the data moving / Data in Motion instead of Data at Rest => raw events
were not stored
Big Data Solution Architectures29.9.201624
Event / Stream Processing Architecture - Keep raw
event data
Data
Ingestion
Batch
compute
Data
Sources
Channel
Data
Consumer
Reports
Service
Analytic
Tools
Alerting
Tools
Content
Logfiles
Social
RDBMS
ERP
Sensor
Machine
(Analytical)	Real-Time	Data	Processing
Stream/Event	Processing
Result	Store
Messaging
Result	Store
(Analytical)	Batch	Data	Processing
Raw	Data	
(Reservoir)
Big Data Solution Architectures
=	Data	in	Motion =	Data	at	Rest
29.9.201625
Big Data Reference Architectures -
Lambda Architecture for Big Data
Big Data Solution Architectures29.9.201626
“Lambda Architecture” for Big Data
Data
Ingestion
(Analytical)	Batch	Data	Processing
Batch
compute
Result	StoreData
Sources
Channel
Data
Consumer
Reports
Service
Analytic
Tools
Alerting
Tools
Content
RDBMS
Social
ERP
Logfiles
Sensor
Machine
(Analytical)	Real-Time	Data	Processing
Stream/Event	Processing
Batch
compute
Messaging
Result	Store
Query
Engine
Result	Store
Computed	
Information
Raw	Data	
(Reservoir)
Big Data Solution Architectures
=	Data	in	Motion =	Data	at	Rest
Pulling	
Ingestion
29.9.201627
“Lambda Architecture” for Big Data
Data
Ingestion
(Analytical)	Batch	Data	Processing
Batch
compute
Result	StoreData
Sources
Channel
Data
Consumer
Reports
Service
Analytic
Tools
Alerting
Tools
Content
RDBMS
Social
ERP
Logfiles
Sensor
Machine
(Analytical)	Real-Time	Data	Processing
Stream/Event	Processing
Batch
compute
Messaging
Result	Store
Query
Engine
Result	Store
Computed	
Information
Raw	Data	
(Reservoir)
Big Data Solution Architectures
=	Data	in	Motion =	Data	at	Rest
Pulling	
Ingestion
29.9.201628
Lambda Architecture for Big Data
Combines (Big) Data at Rest with (Fast) Data in Motion
Closes the gap from high-latency batch processing
Keeps the raw information forever
Makes it possible to rerun analytics operations on whole data set if necessary
=> because the old run had an error or
=> because we have found a better algorithm we want to apply
Have to implement functionality twice
• Once for batch
• Once for real-time streaming
Big Data Solution Architectures29.9.201629
Big Data Reference Architectures -
„Kappa“ Architecture
Big Data Solution Architectures29.9.201630
“Kappa Architecture” for Big Data
Data
Ingestion
“Raw	Data	Reservoir”
Batch
compute
Data
Sources
Messaging
Data
Consumer
Reports
Service
Analytic
Tools
Alerting
Tools
Content
RDBMS
Social
ERP
Logfiles
Sensor
Machine
(Analytical)	Real-Time	Data	Processing
Stream/Event	Processing
Result	Store
Messaging
Result	Store
Raw	Data	
(Reservoir)
Computed	
Information
Big Data Solution Architectures
=	Data	in	Motion =	Data	at	Rest
29.9.201631
Big Data Reference Architectures -
„Unified“ Architecture
Big Data Solution Architectures29.9.201632
“Unified Architecture” for Big Data
Data
Ingestion
(Analytical)	Batch	Data	Processing	(Calculate	
Models	of	incoming	data)
Batch
compute
Result	StoreData
Sources
Channel
Data
Consumer
Reports
Service
Analytic
Tools
Alerting
Tools
Content
RDBMS
Social
ERP
Logfiles
Sensor
Machine
(Analytical)	Real-Time	Data	Processing
Stream/Event	Processing
Batch
compute
Messaging
Result	Store
Query
Engine
Result	Store
Computed	
Information
Raw	Data	
(Reservoir)
=	Data	in	Motion =	Data	at	Rest
Prediction	
Models
Big Data Solution Architectures29.9.201633
Big Data Ecosystem – many
choices sorted!
Big Data Solution Architectures29.9.201634
Building Blocks for (Big) Data Processing
Data
Acquisition
Format
File System
Stream Processing
Batch SQL
Graph DBMS
Document
DBMS
Relational
DBMS
Visualization
IoT
Messaging
Analytics
OLAP DBMS
Query
Federation
Table-Style
DBMS
Key Value
DBMS
Batch Processing
In-Memory
Big Data Solution Architectures29.9.201635
Big Data Ecosystem – many choices sorted!
Big Data Solution Architectures29.9.201636
NoSQL Datastores
Big Data Solution Architectures29.9.201637
Organizing NoSQL Datastores – Different Types
Key	Value	Store
Big Data Solution Architectures38
Wide-column	store
Document	store
Graph	store
29.9.2016
Key Value
K1 V1
K2 V2
K3 V3
Document
{
k1:	v1,
k2:	v2,	
k3:	[v1,	v2,	v3]
}
Rowkey
CK1
RK1
V1
CK2
V2
CK3
V3
CK4
V4
…
…
CK1
RK2
V1
CK4
V4
CK6
V6
…
…
…
…
…
…
CK3
V3
Organizing NoSQL Datastores – and the Products
Key	Value	Store
Big Data Solution Architectures39
Wide-column	store
Document	store
Graph	store
29.9.2016
Big Data Solution Architectures29.9.201640
Guido Schmutz
Technology Manager
guido.schmutz@trivadis.com
Big Data Solution Architectures29.9.201641

More Related Content

What's hot

Streaming Visualisation
Streaming VisualisationStreaming Visualisation
Streaming VisualisationGuido Schmutz
 
Building Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaBuilding Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaGuido Schmutz
 
Building event-driven (Micro)Services with Apache Kafka
Building event-driven (Micro)Services with Apache Kafka Building event-driven (Micro)Services with Apache Kafka
Building event-driven (Micro)Services with Apache Kafka Guido Schmutz
 
Building Event-Driven (Micro)Services with Apache Kafka
Building Event-Driven (Micro)Services with Apache KafkaBuilding Event-Driven (Micro)Services with Apache Kafka
Building Event-Driven (Micro)Services with Apache KafkaGuido Schmutz
 
Architecture of Big Data Solutions
Architecture of Big Data SolutionsArchitecture of Big Data Solutions
Architecture of Big Data SolutionsGuido Schmutz
 
Using Apache Cassandra and Apache Kafka to Scale Next Gen Applications
Using Apache Cassandra and Apache Kafka to Scale Next Gen ApplicationsUsing Apache Cassandra and Apache Kafka to Scale Next Gen Applications
Using Apache Cassandra and Apache Kafka to Scale Next Gen ApplicationsData Con LA
 
Event Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data ArchitectureEvent Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data ArchitectureGuido Schmutz
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data ArchitectureGuido Schmutz
 
Event Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data ArchitectureEvent Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data ArchitectureGuido Schmutz
 
Real Time Analytics with Apache Cassandra - Cassandra Day Munich
Real Time Analytics with Apache Cassandra - Cassandra Day MunichReal Time Analytics with Apache Cassandra - Cassandra Day Munich
Real Time Analytics with Apache Cassandra - Cassandra Day MunichGuido Schmutz
 
Building event-driven Microservices with Kafka Ecosystem
Building event-driven Microservices with Kafka EcosystemBuilding event-driven Microservices with Kafka Ecosystem
Building event-driven Microservices with Kafka EcosystemGuido Schmutz
 
Internet of Things - Are traditional architectures good enough?
Internet of Things - Are traditional architectures good enough?Internet of Things - Are traditional architectures good enough?
Internet of Things - Are traditional architectures good enough?Guido Schmutz
 
Real time big data stream processing
Real time big data stream processing Real time big data stream processing
Real time big data stream processing Luay AL-Assadi
 
Oracle Stream Explorer - Simplifying Event/Stream Processing
Oracle Stream Explorer - Simplifying Event/Stream ProcessingOracle Stream Explorer - Simplifying Event/Stream Processing
Oracle Stream Explorer - Simplifying Event/Stream ProcessingGuido Schmutz
 
Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?Guido Schmutz
 
Building Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaBuilding Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaGuido Schmutz
 
What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?Guido Schmutz
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming VisualizationGuido Schmutz
 
Customer Event Hub – a modern Customer 360° view with DataStax Enterprise (DSE)
Customer Event Hub – a modern Customer 360° view with DataStax Enterprise (DSE) Customer Event Hub – a modern Customer 360° view with DataStax Enterprise (DSE)
Customer Event Hub – a modern Customer 360° view with DataStax Enterprise (DSE) Guido Schmutz
 
Rediscovering the Value of Apache Kafka® in Modern Data Architecture
Rediscovering the Value of Apache Kafka® in Modern Data ArchitectureRediscovering the Value of Apache Kafka® in Modern Data Architecture
Rediscovering the Value of Apache Kafka® in Modern Data Architectureconfluent
 

What's hot (20)

Streaming Visualisation
Streaming VisualisationStreaming Visualisation
Streaming Visualisation
 
Building Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaBuilding Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache Kafka
 
Building event-driven (Micro)Services with Apache Kafka
Building event-driven (Micro)Services with Apache Kafka Building event-driven (Micro)Services with Apache Kafka
Building event-driven (Micro)Services with Apache Kafka
 
Building Event-Driven (Micro)Services with Apache Kafka
Building Event-Driven (Micro)Services with Apache KafkaBuilding Event-Driven (Micro)Services with Apache Kafka
Building Event-Driven (Micro)Services with Apache Kafka
 
Architecture of Big Data Solutions
Architecture of Big Data SolutionsArchitecture of Big Data Solutions
Architecture of Big Data Solutions
 
Using Apache Cassandra and Apache Kafka to Scale Next Gen Applications
Using Apache Cassandra and Apache Kafka to Scale Next Gen ApplicationsUsing Apache Cassandra and Apache Kafka to Scale Next Gen Applications
Using Apache Cassandra and Apache Kafka to Scale Next Gen Applications
 
Event Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data ArchitectureEvent Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data Architecture
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data Architecture
 
Event Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data ArchitectureEvent Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data Architecture
 
Real Time Analytics with Apache Cassandra - Cassandra Day Munich
Real Time Analytics with Apache Cassandra - Cassandra Day MunichReal Time Analytics with Apache Cassandra - Cassandra Day Munich
Real Time Analytics with Apache Cassandra - Cassandra Day Munich
 
Building event-driven Microservices with Kafka Ecosystem
Building event-driven Microservices with Kafka EcosystemBuilding event-driven Microservices with Kafka Ecosystem
Building event-driven Microservices with Kafka Ecosystem
 
Internet of Things - Are traditional architectures good enough?
Internet of Things - Are traditional architectures good enough?Internet of Things - Are traditional architectures good enough?
Internet of Things - Are traditional architectures good enough?
 
Real time big data stream processing
Real time big data stream processing Real time big data stream processing
Real time big data stream processing
 
Oracle Stream Explorer - Simplifying Event/Stream Processing
Oracle Stream Explorer - Simplifying Event/Stream ProcessingOracle Stream Explorer - Simplifying Event/Stream Processing
Oracle Stream Explorer - Simplifying Event/Stream Processing
 
Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?
 
Building Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaBuilding Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache Kafka
 
What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming Visualization
 
Customer Event Hub – a modern Customer 360° view with DataStax Enterprise (DSE)
Customer Event Hub – a modern Customer 360° view with DataStax Enterprise (DSE) Customer Event Hub – a modern Customer 360° view with DataStax Enterprise (DSE)
Customer Event Hub – a modern Customer 360° view with DataStax Enterprise (DSE)
 
Rediscovering the Value of Apache Kafka® in Modern Data Architecture
Rediscovering the Value of Apache Kafka® in Modern Data ArchitectureRediscovering the Value of Apache Kafka® in Modern Data Architecture
Rediscovering the Value of Apache Kafka® in Modern Data Architecture
 

Viewers also liked

Introduction to Streaming Analytics
Introduction to Streaming AnalyticsIntroduction to Streaming Analytics
Introduction to Streaming AnalyticsGuido Schmutz
 
Storm – Streaming Data Analytics at Scale - StampedeCon 2014
Storm – Streaming Data Analytics at Scale - StampedeCon 2014Storm – Streaming Data Analytics at Scale - StampedeCon 2014
Storm – Streaming Data Analytics at Scale - StampedeCon 2014StampedeCon
 
Introduction to Streaming Analytics
Introduction to Streaming AnalyticsIntroduction to Streaming Analytics
Introduction to Streaming AnalyticsGuido Schmutz
 
Cassandra Summit 2014: META — An Efficient Distributed Data Hub with Batch an...
Cassandra Summit 2014: META — An Efficient Distributed Data Hub with Batch an...Cassandra Summit 2014: META — An Efficient Distributed Data Hub with Batch an...
Cassandra Summit 2014: META — An Efficient Distributed Data Hub with Batch an...DataStax Academy
 

Viewers also liked (6)

Introduction to Streaming Analytics
Introduction to Streaming AnalyticsIntroduction to Streaming Analytics
Introduction to Streaming Analytics
 
Storm – Streaming Data Analytics at Scale - StampedeCon 2014
Storm – Streaming Data Analytics at Scale - StampedeCon 2014Storm – Streaming Data Analytics at Scale - StampedeCon 2014
Storm – Streaming Data Analytics at Scale - StampedeCon 2014
 
Big Data Technology
Big Data TechnologyBig Data Technology
Big Data Technology
 
Importance of Big Data Analytics
Importance of Big Data AnalyticsImportance of Big Data Analytics
Importance of Big Data Analytics
 
Introduction to Streaming Analytics
Introduction to Streaming AnalyticsIntroduction to Streaming Analytics
Introduction to Streaming Analytics
 
Cassandra Summit 2014: META — An Efficient Distributed Data Hub with Batch an...
Cassandra Summit 2014: META — An Efficient Distributed Data Hub with Batch an...Cassandra Summit 2014: META — An Efficient Distributed Data Hub with Batch an...
Cassandra Summit 2014: META — An Efficient Distributed Data Hub with Batch an...
 

Similar to Big Data Architectures

Architektur von Big Data Lösungen
Architektur von Big Data LösungenArchitektur von Big Data Lösungen
Architektur von Big Data LösungenGuido Schmutz
 
Big Data Architectures @ JAX / BigDataCon 2016
Big Data Architectures @ JAX / BigDataCon 2016Big Data Architectures @ JAX / BigDataCon 2016
Big Data Architectures @ JAX / BigDataCon 2016Guido Schmutz
 
Oracle Stream Analytics - Simplifying Stream Processing
Oracle Stream Analytics - Simplifying Stream ProcessingOracle Stream Analytics - Simplifying Stream Processing
Oracle Stream Analytics - Simplifying Stream ProcessingGuido Schmutz
 
BDE-BDVA Webinar: BigDataEurope Overview & Synergies with BDVA
BDE-BDVA Webinar: BigDataEurope Overview & Synergies with BDVABDE-BDVA Webinar: BigDataEurope Overview & Synergies with BDVA
BDE-BDVA Webinar: BigDataEurope Overview & Synergies with BDVABigData_Europe
 
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...Alex Liu
 
Stream Processing – Concepts and Frameworks
Stream Processing – Concepts and FrameworksStream Processing – Concepts and Frameworks
Stream Processing – Concepts and FrameworksGuido Schmutz
 
Big Data: It’s all about the Use Cases
Big Data: It’s all about the Use CasesBig Data: It’s all about the Use Cases
Big Data: It’s all about the Use CasesJames Serra
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming VisualizationGuido Schmutz
 
SC4 Workshop 1: Simon Scerri: Existing tools and technologies
SC4 Workshop 1: Simon Scerri: Existing tools and technologiesSC4 Workshop 1: Simon Scerri: Existing tools and technologies
SC4 Workshop 1: Simon Scerri: Existing tools and technologiesBigData_Europe
 
Big Data LDN 2018: FORTUNE 100 LESSONS ON ARCHITECTING DATA LAKES FOR REAL-TI...
Big Data LDN 2018: FORTUNE 100 LESSONS ON ARCHITECTING DATA LAKES FOR REAL-TI...Big Data LDN 2018: FORTUNE 100 LESSONS ON ARCHITECTING DATA LAKES FOR REAL-TI...
Big Data LDN 2018: FORTUNE 100 LESSONS ON ARCHITECTING DATA LAKES FOR REAL-TI...Matt Stubbs
 
Introduction Big Data
Introduction Big DataIntroduction Big Data
Introduction Big DataFrank Kienle
 
Internet of Things (IoT) and Big Data
Internet of Things (IoT) and Big DataInternet of Things (IoT) and Big Data
Internet of Things (IoT) and Big DataGuido Schmutz
 
MediaMath - Big Data Warehousing Meetup - 2/16/2016
MediaMath - Big Data Warehousing Meetup - 2/16/2016MediaMath - Big Data Warehousing Meetup - 2/16/2016
MediaMath - Big Data Warehousing Meetup - 2/16/2016SoryRawyer
 
Virtualisation de données : Enjeux, Usages & Bénéfices
Virtualisation de données : Enjeux, Usages & BénéficesVirtualisation de données : Enjeux, Usages & Bénéfices
Virtualisation de données : Enjeux, Usages & BénéficesDenodo
 
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Shirshanka Das
 
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...Yael Garten
 
Big Data Session 1.pptx
Big Data Session 1.pptxBig Data Session 1.pptx
Big Data Session 1.pptxElsonPaul2
 
Big Data Driven Solutions to Combat Covid' 19
Big Data Driven Solutions to Combat Covid' 19Big Data Driven Solutions to Combat Covid' 19
Big Data Driven Solutions to Combat Covid' 19Prof.Balakrishnan S
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationDenodo
 
Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...
Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...
Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...Seeling Cheung
 

Similar to Big Data Architectures (20)

Architektur von Big Data Lösungen
Architektur von Big Data LösungenArchitektur von Big Data Lösungen
Architektur von Big Data Lösungen
 
Big Data Architectures @ JAX / BigDataCon 2016
Big Data Architectures @ JAX / BigDataCon 2016Big Data Architectures @ JAX / BigDataCon 2016
Big Data Architectures @ JAX / BigDataCon 2016
 
Oracle Stream Analytics - Simplifying Stream Processing
Oracle Stream Analytics - Simplifying Stream ProcessingOracle Stream Analytics - Simplifying Stream Processing
Oracle Stream Analytics - Simplifying Stream Processing
 
BDE-BDVA Webinar: BigDataEurope Overview & Synergies with BDVA
BDE-BDVA Webinar: BigDataEurope Overview & Synergies with BDVABDE-BDVA Webinar: BigDataEurope Overview & Synergies with BDVA
BDE-BDVA Webinar: BigDataEurope Overview & Synergies with BDVA
 
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
 
Stream Processing – Concepts and Frameworks
Stream Processing – Concepts and FrameworksStream Processing – Concepts and Frameworks
Stream Processing – Concepts and Frameworks
 
Big Data: It’s all about the Use Cases
Big Data: It’s all about the Use CasesBig Data: It’s all about the Use Cases
Big Data: It’s all about the Use Cases
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming Visualization
 
SC4 Workshop 1: Simon Scerri: Existing tools and technologies
SC4 Workshop 1: Simon Scerri: Existing tools and technologiesSC4 Workshop 1: Simon Scerri: Existing tools and technologies
SC4 Workshop 1: Simon Scerri: Existing tools and technologies
 
Big Data LDN 2018: FORTUNE 100 LESSONS ON ARCHITECTING DATA LAKES FOR REAL-TI...
Big Data LDN 2018: FORTUNE 100 LESSONS ON ARCHITECTING DATA LAKES FOR REAL-TI...Big Data LDN 2018: FORTUNE 100 LESSONS ON ARCHITECTING DATA LAKES FOR REAL-TI...
Big Data LDN 2018: FORTUNE 100 LESSONS ON ARCHITECTING DATA LAKES FOR REAL-TI...
 
Introduction Big Data
Introduction Big DataIntroduction Big Data
Introduction Big Data
 
Internet of Things (IoT) and Big Data
Internet of Things (IoT) and Big DataInternet of Things (IoT) and Big Data
Internet of Things (IoT) and Big Data
 
MediaMath - Big Data Warehousing Meetup - 2/16/2016
MediaMath - Big Data Warehousing Meetup - 2/16/2016MediaMath - Big Data Warehousing Meetup - 2/16/2016
MediaMath - Big Data Warehousing Meetup - 2/16/2016
 
Virtualisation de données : Enjeux, Usages & Bénéfices
Virtualisation de données : Enjeux, Usages & BénéficesVirtualisation de données : Enjeux, Usages & Bénéfices
Virtualisation de données : Enjeux, Usages & Bénéfices
 
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
 
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
 
Big Data Session 1.pptx
Big Data Session 1.pptxBig Data Session 1.pptx
Big Data Session 1.pptx
 
Big Data Driven Solutions to Combat Covid' 19
Big Data Driven Solutions to Combat Covid' 19Big Data Driven Solutions to Combat Covid' 19
Big Data Driven Solutions to Combat Covid' 19
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data Virtualization
 
Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...
Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...
Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...
 

More from Guido Schmutz

30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as Code30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as CodeGuido Schmutz
 
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-FormatsBig Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-FormatsGuido Schmutz
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!Guido Schmutz
 
Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?Guido Schmutz
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaGuido Schmutz
 
Location Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache KafkaLocation Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache KafkaGuido Schmutz
 
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache KafkaSolutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache KafkaGuido Schmutz
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaGuido Schmutz
 
Location Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using KafkaLocation Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using KafkaGuido Schmutz
 
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaSolutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaGuido Schmutz
 
Fundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureFundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureGuido Schmutz
 
Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka Guido Schmutz
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming VisualizationGuido Schmutz
 
Location Analytics - Real Time Geofencing using Apache Kafka
Location Analytics - Real Time Geofencing using Apache KafkaLocation Analytics - Real Time Geofencing using Apache Kafka
Location Analytics - Real Time Geofencing using Apache KafkaGuido Schmutz
 
Kafka as an Event Store - is it Good Enough?
Kafka as an Event Store - is it Good Enough?Kafka as an Event Store - is it Good Enough?
Kafka as an Event Store - is it Good Enough?Guido Schmutz
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming VisualizationGuido Schmutz
 
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaSolutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaGuido Schmutz
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream ProcessingGuido Schmutz
 

More from Guido Schmutz (18)

30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as Code30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as Code
 
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-FormatsBig Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!
 
Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
 
Location Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache KafkaLocation Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache Kafka
 
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache KafkaSolutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafka
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
 
Location Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using KafkaLocation Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using Kafka
 
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaSolutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
 
Fundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureFundamentals Big Data and AI Architecture
Fundamentals Big Data and AI Architecture
 
Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming Visualization
 
Location Analytics - Real Time Geofencing using Apache Kafka
Location Analytics - Real Time Geofencing using Apache KafkaLocation Analytics - Real Time Geofencing using Apache Kafka
Location Analytics - Real Time Geofencing using Apache Kafka
 
Kafka as an Event Store - is it Good Enough?
Kafka as an Event Store - is it Good Enough?Kafka as an Event Store - is it Good Enough?
Kafka as an Event Store - is it Good Enough?
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming Visualization
 
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaSolutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 

Recently uploaded

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 

Recently uploaded (20)

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 

Big Data Architectures

  • 1. BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENEVA HAMBURG COPENHAGEN LAUSANNE MUNICH STUTTGART VIENNA ZURICH Big Data Solution Architectures 29.9.2016 – DOAG 2016 Big Data Days Guido Schmutz Trivadis
  • 2. Guido Schmutz Working for Trivadis for more than 19 years Oracle ACE Director for Fusion Middleware and SOA Co-Author of different books Consultant, Trainer, Software Architect for Java, SOA & Big Data / Fast Data Member of Trivadis Architecture Board Technology Manager @ Trivadis More than 25 years of software development experience Contact: guido.schmutz@trivadis.com Blog: http://guidoschmutz.wordpress.com Slideshare: http://www.slideshare.net/gschmutz Twitter: gschmutz 29.9.2016 Big Data Solution Architectures2
  • 3. Agenda Big Data Solution Architectures3 29.9.2016 1. Introduction 2. Big Data Reference Architectures • Traditional Big Data • Event / Stream-Processing • Lambda Architecture • Kappa Architecture • Unified Architecture 3. Big Data Ecosystem – many choices sorted!
  • 4. Introduction Big Data Solution Architectures29.9.20164
  • 5. Why talking about Big Data Architectures Choosing the right architecture is key for any (big data) project Big Data is still quite a rather young field and therefore a “moving target” no standard architectures available which have been used for years In the past years, some architectures and best practices have evolved Know your use cases before choosing your architecture / technologies To have a reference architecture in place helps in choosing the right/matching technologies Big Data Solution Architectures29.9.20165
  • 6. How to do Big Data? Why is a structure / architecture important Big Data Solution Architectures29.9.20166
  • 7. Big Data Ecosystem – many choices sorted! Big Data Solution Architectures29.9.20167
  • 8. Important Properties for choosing (Big) Data Architecture Latency Keep raw and un-interpreted data “forever” ? Volume, Velocity, Variety, Veracity Ad-Hoc Query Capabilities needed ? Robustness & Fault Tolerance Scalability … Big Data Solution Architectures29.9.20169
  • 9. Big Data Reference Architectures - Traditional Big Data Big Data Solution Architectures29.9.201610
  • 10. “Traditional Architecture” for Big Data Data Ingestion (Analytical) Data Processing Result StoreData Sources Data Consumer Reports Service Analytic Tools Alerting Tools Content RDBMS Social ERP Logfiles Sensor Machine Batch compute Pushing Ingestion Result Store Query Engine Computed Information Raw Data (Reservoir) = Data in Motion = Data at Rest Big Data Solution Architectures Pulling Ingestion Channel 29.9.201611
  • 11. “Traditional Architecture” for Big Data – Hadoop Technology Mapping Data Ingestion (Analytical) Data Processing Result StoreData Sources Data Consumer Reports Service Analytic Tools Alerting Tools Content RDBMS Social ERP Logfiles Sensor Machine Batch compute Pushing Ingestion Result Store Query Engine Computed Information Raw Data (Reservoir) = Data in Motion = Data at Rest Big Data Solution Architectures Pulling Ingestion Channel 29.9.201612
  • 12. “Traditional Architecture” for Big Data – Spark Technology Mapping Data Ingestion (Analytical) Data Processing Result StoreData Sources Data Consumer Reports Service Analytic Tools Alerting Tools Content RDBMS Social ERP Logfiles Sensor Machine Batch compute Pushing Ingestion Result Store Query Engine Computed Information Raw Data (Reservoir) = Data in Motion = Data at Rest Big Data Solution Architectures Pulling Ingestion Channel 29.9.201613
  • 13. “Traditional Architecture” for Big Data – Feeding in High- Volume Event Streams Data Ingestion (Analytical) Data Processing Result StoreData Sources Data Consumer Reports Service Analytic Tools Alerting Tools Content RDBMS Social ERP Logfiles Sensor Machine Batch compute Pushing Ingestion Result Store Query Engine Computed Information Raw Data (Reservoir) = Data in Motion = Data at Rest Big Data Solution Architectures Pulling Ingestion Channel ? ? 29.9.201614
  • 14. Traditional Architecture for Big Data • Batch Processing - “Data at Rest” • Not for low latency use cases • Responses are delivered “after the fact” • Maximum value of the identified situation is lost • Decision are made on old and stale data • Spar Core is a faster alternative to Hadoop Map Reduce, but still Batch Processing • Spark Ecosystems offers a lot of additional advanced analytic capabilities (machine learning, graph processing, …) Big Data Solution Architectures29.9.201615
  • 15. Big Data Reference Architectures – Event/Stream Processing Big Data Solution Architectures29.9.201616
  • 16. Event / Stream Processing – “Data in Motion” “Data in motion” Events are analyzed and processed in real- time as the arrive Decisions are timely, contextual and based on fresh data Decision latency is eliminated Big Data Solution Architectures29.9.201617
  • 17. Event / Stream Processing Architecture Data Ingestion Batch compute Data Sources Channel Data Consumer Reports Service Analytic Tools Alerting Tools Content Logfiles Social RDBMS ERP Sensor Machine (Analytical) Real-Time Data Processing Stream/Event Processing Result Store Messaging Result Store Big Data Solution Architectures = Data in Motion = Data at Rest 29.9.201618
  • 18. Continuous Ingestion DB Source Big Data Log Stream Processing IoT Sensor Event Hub Topic Topic REST Topic IoT GW CDC GW Connect CDC DB Source Log CDC Native IoT Sensor IoT Sensor 19 Dataflow GW Topic Topic Queue MQTT GW Topic Dataflow GW Dataflow TopicREST 19 File Source Log Log Log Social Native 29.9.2016 Big Data Solution Architectures19 Topic Topic
  • 19. Challenges for Ingesting Sensor Data Big Data Solution Architectures Multitude of sensors Real-Time Streaming Multiple Firmware versions Bad Data from damaged sensors Regulatory Constraints Data Quality 20 29.9.2016
  • 20. SQL Polling Change Data Capture (CDC) File Stream (File Tailing) File Stream (Streaming Appender) Enabling Continuous Data Ingestion Sensor Stream Big Data Solution Architectures21 29.9.2016
  • 21. Event / Stream Processing Architecture – Open Source Technology Mapping Data Ingestion Batch compute Data Sources Channel Data Consumer Reports Service Analytic Tools Alerting Tools Content Logfiles Social RDBMS ERP Sensor Machine (Analytical) Real-Time Data Processing Stream/Event Processing Result Store Messaging Result Store Big Data Solution Architectures = Data in Motion = Data at Rest 29.9.201622
  • 22. Event / Stream Processing Architecture – Oracle Technology Mapping Data Ingestion Batch compute Data Sources Channel Data Consumer Reports Service Analytic Tools Alerting Tools Content Logfiles Social RDBMS ERP Sensor Machine (Analytical) Real-Time Data Processing Stream/Event Processing Result Store Messaging Result Store Big Data Solution Architectures = Data in Motion = Data at Rest 29.9.201623
  • 23. Event / Stream Processing Architecture The solution for low latency use cases Process each event separately => low latency Process events in micro-batches => increases latency but offers better reliability Previously known as “Complex Event Processing” Keep the data moving / Data in Motion instead of Data at Rest => raw events were not stored Big Data Solution Architectures29.9.201624
  • 24. Event / Stream Processing Architecture - Keep raw event data Data Ingestion Batch compute Data Sources Channel Data Consumer Reports Service Analytic Tools Alerting Tools Content Logfiles Social RDBMS ERP Sensor Machine (Analytical) Real-Time Data Processing Stream/Event Processing Result Store Messaging Result Store (Analytical) Batch Data Processing Raw Data (Reservoir) Big Data Solution Architectures = Data in Motion = Data at Rest 29.9.201625
  • 25. Big Data Reference Architectures - Lambda Architecture for Big Data Big Data Solution Architectures29.9.201626
  • 26. “Lambda Architecture” for Big Data Data Ingestion (Analytical) Batch Data Processing Batch compute Result StoreData Sources Channel Data Consumer Reports Service Analytic Tools Alerting Tools Content RDBMS Social ERP Logfiles Sensor Machine (Analytical) Real-Time Data Processing Stream/Event Processing Batch compute Messaging Result Store Query Engine Result Store Computed Information Raw Data (Reservoir) Big Data Solution Architectures = Data in Motion = Data at Rest Pulling Ingestion 29.9.201627
  • 27. “Lambda Architecture” for Big Data Data Ingestion (Analytical) Batch Data Processing Batch compute Result StoreData Sources Channel Data Consumer Reports Service Analytic Tools Alerting Tools Content RDBMS Social ERP Logfiles Sensor Machine (Analytical) Real-Time Data Processing Stream/Event Processing Batch compute Messaging Result Store Query Engine Result Store Computed Information Raw Data (Reservoir) Big Data Solution Architectures = Data in Motion = Data at Rest Pulling Ingestion 29.9.201628
  • 28. Lambda Architecture for Big Data Combines (Big) Data at Rest with (Fast) Data in Motion Closes the gap from high-latency batch processing Keeps the raw information forever Makes it possible to rerun analytics operations on whole data set if necessary => because the old run had an error or => because we have found a better algorithm we want to apply Have to implement functionality twice • Once for batch • Once for real-time streaming Big Data Solution Architectures29.9.201629
  • 29. Big Data Reference Architectures - „Kappa“ Architecture Big Data Solution Architectures29.9.201630
  • 30. “Kappa Architecture” for Big Data Data Ingestion “Raw Data Reservoir” Batch compute Data Sources Messaging Data Consumer Reports Service Analytic Tools Alerting Tools Content RDBMS Social ERP Logfiles Sensor Machine (Analytical) Real-Time Data Processing Stream/Event Processing Result Store Messaging Result Store Raw Data (Reservoir) Computed Information Big Data Solution Architectures = Data in Motion = Data at Rest 29.9.201631
  • 31. Big Data Reference Architectures - „Unified“ Architecture Big Data Solution Architectures29.9.201632
  • 32. “Unified Architecture” for Big Data Data Ingestion (Analytical) Batch Data Processing (Calculate Models of incoming data) Batch compute Result StoreData Sources Channel Data Consumer Reports Service Analytic Tools Alerting Tools Content RDBMS Social ERP Logfiles Sensor Machine (Analytical) Real-Time Data Processing Stream/Event Processing Batch compute Messaging Result Store Query Engine Result Store Computed Information Raw Data (Reservoir) = Data in Motion = Data at Rest Prediction Models Big Data Solution Architectures29.9.201633
  • 33. Big Data Ecosystem – many choices sorted! Big Data Solution Architectures29.9.201634
  • 34. Building Blocks for (Big) Data Processing Data Acquisition Format File System Stream Processing Batch SQL Graph DBMS Document DBMS Relational DBMS Visualization IoT Messaging Analytics OLAP DBMS Query Federation Table-Style DBMS Key Value DBMS Batch Processing In-Memory Big Data Solution Architectures29.9.201635
  • 35. Big Data Ecosystem – many choices sorted! Big Data Solution Architectures29.9.201636
  • 36. NoSQL Datastores Big Data Solution Architectures29.9.201637
  • 37. Organizing NoSQL Datastores – Different Types Key Value Store Big Data Solution Architectures38 Wide-column store Document store Graph store 29.9.2016 Key Value K1 V1 K2 V2 K3 V3 Document { k1: v1, k2: v2, k3: [v1, v2, v3] } Rowkey CK1 RK1 V1 CK2 V2 CK3 V3 CK4 V4 … … CK1 RK2 V1 CK4 V4 CK6 V6 … … … … … … CK3 V3
  • 38. Organizing NoSQL Datastores – and the Products Key Value Store Big Data Solution Architectures39 Wide-column store Document store Graph store 29.9.2016
  • 39. Big Data Solution Architectures29.9.201640
  • 40. Guido Schmutz Technology Manager guido.schmutz@trivadis.com Big Data Solution Architectures29.9.201641