End-to-end streaming analytics on
Google Cloud Platform
From event capture to dashboard to monitoring
Javier Ramirez
@supercoco9
Hard problems
and easy problems
And hard problems that look easy
Calculate the average of several numbers.
An easy problem
Calculate the average of several numbers. By the way, they might be MANY
numbers. They will probably not fit in memory. They might not even fit in one
file or on a single hard drive.
An easy big data problem
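Numbers that do not fit in memory, or even on one drive, can still be averaged, because a mean only needs a count and a sum. A minimal sketch in plain Java, assuming the values arrive as any `Iterable<Double>`; the class name `RunningMean` and its methods are made up for illustration and are not part of the talk's demo code.

```java
/** Running mean that keeps only a count and a sum, never the values themselves. */
final class RunningMean {
    private long count = 0;
    private double sum = 0.0;

    /** Fold one more value into the aggregate. */
    void add(double value) {
        count++;
        sum += value;
    }

    /** Combine with a partial result computed elsewhere (another file, disk, or machine). */
    void merge(RunningMean other) {
        count += other.count;
        sum += other.sum;
    }

    long count() {
        return count;
    }

    /** Mean of everything seen so far; NaN when nothing has been added. */
    double mean() {
        return count == 0 ? Double.NaN : sum / count;
    }

    /** Consume any source of numbers one element at a time. */
    static RunningMean of(Iterable<Double> values) {
        RunningMean m = new RunningMean();
        for (double v : values) {
            m.add(v);
        }
        return m;
    }
}
```

Each file or drive can be reduced to its own `RunningMean`, and the partial results merged at the end. A plain `double` sum loses precision on very long inputs, so a real job would likely need something more careful.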
Truth is they will not be in one file, but they will be streamed live from
different sensors…
An easy big data and streaming problem
The sensors are in different parts of the world.
A not so easy streaming data problem
Some sensors might send a few events per hour, some a few thousand per
second…
An autoscaling streaming problem
We want not just the total average of all the points, but the moving
average every 30 seconds, for every sensor. And the hourly, daily, and
monthly averages.
A hard streaming analytics problem
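One way to get the 30-second figures and the hourly, daily, and monthly ones from the same data is to keep a (count, sum) pair per sensor and per 30-second bucket, then merge the pairs for any coarser period. Averaging the 30-second averages would be wrong whenever the buckets hold different numbers of points. A minimal sketch in plain Java; `SensorAverages`, `Partial`, and the millisecond timestamps are assumptions for illustration, not the code shown in the talk.

```java
import java.util.Map;
import java.util.TreeMap;

/** Per-sensor averages in fixed 30-second buckets, rolled up to coarser periods. */
final class SensorAverages {
    /** Partial aggregate: enough to compute a mean and to merge with other partials. */
    static final class Partial {
        final long count;
        final double sum;

        Partial(long count, double sum) {
            this.count = count;
            this.sum = sum;
        }

        Partial merge(Partial other) {
            return new Partial(count + other.count, sum + other.sum);
        }

        double mean() {
            return sum / count; // NaN for an empty partial
        }
    }

    static final long BUCKET_MILLIS = 30_000L;

    // sensor id -> (bucket start in millis -> partial aggregate)
    private final Map<String, TreeMap<Long, Partial>> bySensor = new TreeMap<>();

    /** Add one reading to the 30-second bucket its timestamp falls into. */
    void add(String sensorId, long timestampMillis, double value) {
        long bucketStart = Math.floorDiv(timestampMillis, BUCKET_MILLIS) * BUCKET_MILLIS;
        bySensor.computeIfAbsent(sensorId, k -> new TreeMap<>())
                .merge(bucketStart, new Partial(1, value), Partial::merge);
    }

    /** Mean for one sensor over [fromMillis, toMillis), merging the 30-second partials. */
    double mean(String sensorId, long fromMillis, long toMillis) {
        Partial total = new Partial(0, 0.0);
        TreeMap<Long, Partial> buckets = bySensor.getOrDefault(sensorId, new TreeMap<>());
        for (Partial p : buckets.subMap(fromMillis, toMillis).values()) {
            total = total.merge(p);
        }
        return total.mean();
    }
}
```

With readings of 10 and 20 in the first bucket and 60 in the second, the hourly mean is 30 (90 / 3), not the 37.5 that averaging the two bucket means would give.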
Sometimes the sensors will have connectivity issues and will not send their
data until later, but of course I want the calculations to still be correct.
A real life analytics problem
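Late data stops being a correctness problem if every reading is filed under the time it was measured rather than the time it arrived. A minimal sketch in plain Java, assuming each reading carries its own event timestamp in milliseconds; the class `EventTimeWindows` is hypothetical. It keeps every window forever, whereas a real streaming engine has to decide how long to wait for stragglers, which this sketch does not attempt.

```java
import java.util.TreeMap;

/** Fixed windows keyed by event time, so arrival order does not change the result. */
final class EventTimeWindows {
    private final long windowMillis;

    // window start in millis -> {sum, count}
    private final TreeMap<Long, double[]> windows = new TreeMap<>();

    EventTimeWindows(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Record a reading under the time it was measured, not the time it arrived. */
    void add(long eventTimeMillis, double value) {
        long windowStart = Math.floorDiv(eventTimeMillis, windowMillis) * windowMillis;
        double[] agg = windows.computeIfAbsent(windowStart, k -> new double[2]);
        agg[0] += value;
        agg[1] += 1;
    }

    /** Mean of the window starting at windowStart; NaN if no reading fell into it. */
    double mean(long windowStart) {
        double[] agg = windows.get(windowStart);
        return agg == null ? Double.NaN : agg[0] / agg[1];
    }
}
```

A reading measured at second 20 that shows up after one measured at second 40 still lands in the first 30-second window, so the mean for that window gets corrected when the late reading arrives.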
All of the above, plus monitoring, alerts, self-healing, a way to query the data
efficiently, and a pretty dashboard on top
What your client/boss will expect
Calculate the average of several numbers. By the way, they might be MANY numbers. They will probably not fit in memory. They might
not even fit in one file or on a single hard drive.
Truth is they will not be in one file, but they will be streamed live from different sensors… in different parts of the world.
Some sensors might send a few events per hour, some a few thousand per second… We want not just the total average of all the
points, but the moving average every 30 seconds, for every sensor. And the hourly, daily, and monthly averages.
Sometimes the sensors will have connectivity issues and will not send their data until later, but of course I want the calculations to still
be correct.
… is easier said than done
Our complete system in 100 lines of Java, of which 90 are mostly boilerplate and configuration
<= Don’t try to read that. I’ll zoom in later
What we need
The components of a streaming data pipeline
Data acquisition
Data validation
Transformation / Aggregation
Visualization
Storage / Analytics
Monitoring and alerts for all the components
Data Acquisition
Sending and receiving data at scale
Google Cloud Pub/Sub
Google Cloud Pub/Sub brings the scalability, flexibility, and reliability of
enterprise message-oriented middleware to the cloud. By providing
many-to-many, asynchronous messaging that decouples senders and
receivers, it allows for secure and highly available communication between
independently written applications.
Google Cloud Pub/Sub delivers low-latency, durable messaging that helps
developers quickly integrate systems hosted on the Google Cloud Platform
and externally.
Ingest event streams from anywhere, at any scale, for simple, reliable, real-time stream analytics
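The decoupling of senders and receivers can be pictured with a toy in-memory topic. This is only a model of the many-to-many idea, assuming one queue per named subscription; `ToyTopic` and its methods are invented for illustration and are not the Cloud Pub/Sub client API.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

/** Toy in-memory topic: publishers and subscribers only know the topic, not each other. */
final class ToyTopic {
    private final Map<String, Queue<String>> subscriptions = new LinkedHashMap<>();

    /** Each named subscription gets its own queue, so each one receives every later message. */
    void subscribe(String subscriptionName) {
        subscriptions.putIfAbsent(subscriptionName, new ArrayDeque<>());
    }

    /** Publishing returns immediately; the message waits until a subscriber pulls it. */
    void publish(String message) {
        for (Queue<String> queue : subscriptions.values()) {
            queue.add(message);
        }
    }

    /** Pull the next pending message for a subscription, or null if none is waiting. */
    String pull(String subscriptionName) {
        Queue<String> queue = subscriptions.get(subscriptionName);
        return queue == null ? null : queue.poll();
    }
}
```

A dashboard and an archiver can both subscribe to the same topic and consume at their own pace, and the sensors never need to know that either of them exists.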
The Spotify proof of concept
Currently our production load peaks at around 700K events per second. To account for the future
growth and possible disaster recovery scenarios, we settled on a test load of 2M events per second.
To make it extra hard for Pub/Sub, we wanted to publish this amount of traffic from a single data
center, so that all the requests were hitting the Pub/Sub machines in the same zone. We made the
assumption that Google plans zones as independent failure domains and that each zone can handle
equal amounts of traffic.
In theory, if we’re able to push 2M messages to a single zone, we should be able to push
number_of_zones * 2M messages across all zones.
Our hope was that the system would be able to handle this traffic on both the producing and
consuming side for a long time without the service degrading.
https://labs.spotify.com/2016/03/03/spotifys-event-delivery-the-road-to-the-cloud-part-ii/
They pushed 2 million events per second (to two
topics) from 29 servers, non-stop, for five days.
“We did not observe any lost messages whatsoever
during the test period.”
The no operations advantage
https://labs.spotify.com/2016/03/10/spotifys-event-delivery-the-road-to-the-cloud-part-iii/
Event Delivery System In Cloud
We’re actively working on bringing the new system to production. The preliminary
numbers we obtained from running the new system in the experimental phase look very
promising. The worst end-to-end latency observed with the new system is four times
lower than the end-to-end latency of old system.
But boosting performance isn’t the only thing we want to get from the new system. Our
bet is that by using cloud-managed products we will have a much lower operational
overhead. That in turn means we will have much more time to make Spotify’s
products better.
A truly global network
Pub/Sub now works with Cloud IoT Core
Device Manager
The device manager allows individual devices to be configured and managed securely in a
coarse-grained way; management can be done through a console or programmatically. The device
manager establishes the identity of a device, and provides the mechanism for authenticating a
device when connecting. It also maintains a logical configuration of each device and can be used to
remotely control the device from the cloud.
Protocol Bridge
The protocol bridge provides connection endpoints for protocols with automatic load balancing for
all device connections. The protocol bridge has native support for secure connection over MQTT, an
industry-standard IoT protocol. The protocol bridge publishes all device telemetry to Cloud
Pub/Sub, which can then be consumed by downstream analytic systems.
Which is very cool if you are into Arduino, Raspberry Pi, Android, or embedded systems
Storage & Analytics
Reliable, fast, scalable, and flexible. And no-ops
BigQuery
A database where you can send as much (or as little) data as you want, either
batch or streaming, and run any SQL you want, no matter how big your data
is.
Even if you have petabytes of data.
Even if you want to join data from different projects or from public data
sources.
Even if you want to query external data on Spreadsheets or Cloud Storage.
Even if you want to create your own User Defined Functions in JavaScript.
BigQuery also...
… is serverless and zero configuration. You never have to worry about
memory, CPU, network, or disk. You send your data, you send your queries,
you get results.
Behind the scenes BigQuery will use up to 2000 CPUs in parallel for your
queries, and a huge amount of networked storage. But you don’t care.
You pay for how much data you send and how much data you query. If you
are not using the database, you are not paying anything. But it’s always
available
Hope you are not easily impressed
How long would it take to read 4 terabytes from a hard drive at 100 MB/s?
About 11 hours
And to filter 100 billion data points using a regular expression for each?
About 27 hours
And to move 278 GB across a 1 Gbps network?
About 40 minutes
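The disk and network figures can be checked on the back of an envelope. A small sketch of that arithmetic in plain Java, assuming decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits per second). The 27-hour regex figure cannot be derived from the slide alone, so the last method only works out the per-second rate it implies. `BackOfEnvelope` is a made-up name.

```java
/** Back-of-envelope timings for the three questions, in decimal units. */
final class BackOfEnvelope {
    /** Seconds needed to move the given number of bytes at the given rate. */
    static double seconds(double bytes, double bytesPerSecond) {
        return bytes / bytesPerSecond;
    }

    /** 4 TB read at 100 MB/s, in hours. */
    static double diskReadHours() {
        return seconds(4e12, 100e6) / 3600.0;
    }

    /** 278 GB sent over a 1 Gbps link (1e9 bits/s = 1.25e8 bytes/s), in minutes. */
    static double networkMinutes() {
        return seconds(278e9, 1e9 / 8.0) / 60.0;
    }

    /** Regex evaluations per second implied by 100 billion points in 27 hours. */
    static double impliedRegexPerSecond() {
        return 100e9 / (27 * 3600.0);
    }
}
```

That works out to roughly 11.1 hours for the disk read and about 37 minutes for the network transfer, in line with the rounded answers above. The 27-hour figure corresponds to a little over one million regex evaluations per second.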
We will use a simple table for our system
Data Validation
Apparently simple, but always a pain
Google Cloud Dataprep
An intelligent cloud data service to visually explore, clean, and prepare data for analysis
Cloud Dataprep: Explorer & Suggestions
Cloud Dataprep: Transforms & Scripts
Cloud Dataprep: No-ops execution & reports
Transformation & Aggregation
Batch and streaming ETL jobs, and data pipelines
Apache Beam: An advanced unified programming model
Apache Beam is an open source, unified model for
defining both batch and streaming data-parallel
processing pipelines. Using one of the open source
Beam SDKs, you build a program that defines the
pipeline. The pipeline is then executed by one of
Beam’s supported distributed processing back-ends,
which include Apache Apex, Apache Flink, Apache
Spark, and Google Cloud Dataflow.
Beam is particularly useful for Embarrassingly Parallel
data processing tasks, in which the problem can be
decomposed into many smaller bundles of data that
can be processed independently and in parallel. You
can also use Beam for Extract, Transform, and Load
(ETL) tasks and pure data integration. These tasks are
useful for moving data between different storage
media and data sources, transforming data into a
more desirable format, or loading data onto a new
system.
Apache Beam: A basic pipeline
Apache Beam: Streaming is hard
Averages with Beam: Overview
Boilerplate and configuration
Writing the output to BigQuery
This is the code that actually processes and aggregates the data
Start the pipeline
Averages with Beam: Config
Averages with Beam: Output to BigQuery
Averages with Beam: The processing itself
Transform/Filter. We are just parsing a line of text into multiple fields.
Aggregate. We are outputting the mean speed of the last minute per sensor, every 30 seconds.
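The two steps above can be sketched without any framework. This is plain Java, not the Beam SDK and not the code from the slides. It assumes a hypothetical input line of the form `sensorId,epochMillis,speed`, and models "last minute, every 30 seconds" as 60-second windows that start every 30 seconds, so each reading belongs to two windows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java sketch of "mean speed of the last minute per sensor, every 30 seconds". */
final class SlidingMeanSpeed {
    static final long SIZE_MILLIS = 60_000L;   // window length: one minute
    static final long PERIOD_MILLIS = 30_000L; // a new window starts every 30 seconds

    /** One parsed reading. The field layout is an assumption made for this sketch. */
    static final class Reading {
        final String sensorId;
        final long timestampMillis;
        final double speed;

        Reading(String sensorId, long timestampMillis, double speed) {
            this.sensorId = sensorId;
            this.timestampMillis = timestampMillis;
            this.speed = speed;
        }
    }

    /** Transform step: split a hypothetical "sensorId,epochMillis,speed" line into fields. */
    static Reading parse(String line) {
        String[] f = line.split(",");
        return new Reading(f[0].trim(), Long.parseLong(f[1].trim()), Double.parseDouble(f[2].trim()));
    }

    /** Start times of every sliding window [start, start + SIZE_MILLIS) containing the timestamp. */
    static List<Long> windowStarts(long timestampMillis) {
        List<Long> starts = new ArrayList<>();
        long lastStart = Math.floorDiv(timestampMillis, PERIOD_MILLIS) * PERIOD_MILLIS;
        for (long s = lastStart; s > timestampMillis - SIZE_MILLIS; s -= PERIOD_MILLIS) {
            starts.add(s);
        }
        return starts;
    }

    /** Aggregate step: mean speed keyed by "sensorId@windowStartMillis". */
    static Map<String, Double> meanPerSensorWindow(List<String> lines) {
        Map<String, double[]> agg = new TreeMap<>(); // value holds {sum, count}
        for (String line : lines) {
            Reading r = parse(line);
            for (long start : windowStarts(r.timestampMillis)) {
                double[] a = agg.computeIfAbsent(r.sensorId + "@" + start, k -> new double[2]);
                a[0] += r.speed;
                a[1] += 1;
            }
        }
        Map<String, Double> means = new TreeMap<>();
        for (Map.Entry<String, double[]> e : agg.entrySet()) {
            means.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return means;
    }
}
```

A reading at second 45 counts towards both the window starting at 0 and the one starting at second 30, which is what makes the output a moving average rather than a series of disjoint ones.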
Google Cloud Dataflow: Beam with no-operations
Google developed Beam internally as a closed-source product. Then they
realised it would make sense to open-source it, and they donated it to the
Apache community.
Anyone can use Beam completely for free, and choose the runner on which to
execute their pipelines.
Google Cloud Dataflow is a Beam runner that executes your pipelines with
no-operations, with logging, monitoring, auto-scaling, shuffling, and dynamic
re-balancing.
It’s like Beam, but as a managed service.
Demo time
Three instances in three continents
Our Dataflow pipeline ready to accept data
Let’s start sending some data
Our Dataflow pipeline seems to be working fine. 64 elements per second should be easy.
Let’s send much more data from all over!
Our Dataflow pipeline starts feeling the heat. Receiving 2440 elements per second now.
And now we are processing over 20,000 elements per second at the 1st step. But the lag starts to increase.
Auto scaling to the rescue. Three workers now
And lag goes back to normal
And back to just one worker
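The relationship seen in the demo (lag grows while input outpaces capacity, then drains once workers are added) can be put into a toy model. This is not Dataflow's actual autoscaling algorithm, which the talk does not describe. `ToyAutoscaler` is a made-up class, and the per-worker throughput used in the note below is an invented number chosen only to make the arithmetic concrete.

```java
/** Toy capacity model for a pipeline whose workers each process a fixed rate. */
final class ToyAutoscaler {
    /** Smallest worker count, capped at maxWorkers, whose combined rate keeps up with the input. */
    static int workersNeeded(double inputPerSecond, double perWorkerPerSecond, int maxWorkers) {
        int needed = (int) Math.ceil(inputPerSecond / perWorkerPerSecond);
        return Math.max(1, Math.min(maxWorkers, needed));
    }

    /** Elements added to the backlog each second; a negative value means the lag is draining. */
    static double backlogGrowthPerSecond(double inputPerSecond, double perWorkerPerSecond, int workers) {
        return inputPerSecond - perWorkerPerSecond * workers;
    }
}
```

If one worker could handle a hypothetical 8,000 elements per second, 20,000 elements per second would add 12,000 elements of backlog every second on a single worker, and three workers would be the smallest count that lets the lag drain.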
Monitoring
What I can’t see doesn’t exist
Google Stackdriver monitoring
Google Stackdriver alerts
Visualization
Because an image is worth a thousand logs
Google Data Studio
Google Data Studio: my dashboard
Google Data Studio: data sources
Google Data Studio: drag and drop
Almost there
Just one more slide
All together now!
[Architecture diagram. Legible labels: Cloud IoT Core, Cloud Dataprep, Stackdriver Monitoring, Logging, Error Reporting]
CHEERS!
I’m happy to answer any questions you may have at lunchtime
or the coffee breaks.
Or ping me at @supercoco9 on Twitter. You got 280 chars now
Demo source code available at:
https://github.com/GoogleCloudPlatform/training-data-analyst/tree/master/courses/streaming/process
Javier Ramirez
End-to-end streaming analytics on
Google Cloud Platform
From event capture to dashboard to monitoring
Javier Ramirez
@supercoco9
Template Design Credits
The Template provides a theme with four basic
colors:
The backgrounds were created by Free Google
Slides Templates.
The original template for this presentation was
provided by, and it’s property of, Free Google
Slides Templates -
http://freegoogleslidestemplates.com
Vectorial Shapes in this Template were created
by Free Google Slides Templates and
downloaded from pexels.com and
unsplash.com.
Icons in this Template are part of Google®
Material Icons and 1001freedownloads.com.
Shapes & Icons Backgrounds
Fonts Color Palette
The fonts used in this template are taken from
Google fonts. ( Dosis,Open Sans )
You can download the fonts from the following
url: https://www.google.com/fonts/
#93c47dff #0097a7ff
#78909cff #eeeeeeff
#f7b600ff #00ce00e3
#de445eff #000000ff

More Related Content

What's hot

Hong Kong AWS Summit 2017 - Keynote
Hong Kong AWS Summit 2017 - KeynoteHong Kong AWS Summit 2017 - Keynote
Hong Kong AWS Summit 2017 - KeynoteAmazon Web Services
 
Modern data architectures for real time analytics and engagement
Modern data architectures for real time analytics and engagementModern data architectures for real time analytics and engagement
Modern data architectures for real time analytics and engagementAmazon Web Services
 
Architetture Serverless: concentrarsi sull'idea, non sull'infrastruttura
Architetture Serverless: concentrarsi sull'idea, non sull'infrastrutturaArchitetture Serverless: concentrarsi sull'idea, non sull'infrastruttura
Architetture Serverless: concentrarsi sull'idea, non sull'infrastrutturaAmazon Web Services
 
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...Amazon Web Services
 
ENT303 Another Day, Another Billion Packets
ENT303 Another Day, Another Billion PacketsENT303 Another Day, Another Billion Packets
ENT303 Another Day, Another Billion PacketsAmazon Web Services
 
AWS re:Invent 2016: IoT Blueprints: Optimizing Supply for Smart Agriculture f...
AWS re:Invent 2016: IoT Blueprints: Optimizing Supply for Smart Agriculture f...AWS re:Invent 2016: IoT Blueprints: Optimizing Supply for Smart Agriculture f...
AWS re:Invent 2016: IoT Blueprints: Optimizing Supply for Smart Agriculture f...Amazon Web Services
 
Introduction to AWS Step Functions
Introduction to AWS Step FunctionsIntroduction to AWS Step Functions
Introduction to AWS Step FunctionsAmazon Web Services
 
Getting the most Bang for your Buck with #EC2 #Winning
Getting the most Bang for your Buck with #EC2 #WinningGetting the most Bang for your Buck with #EC2 #Winning
Getting the most Bang for your Buck with #EC2 #WinningAmazon Web Services
 
Automate Migration to AWS with Datapipe
Automate Migration to AWS with DatapipeAutomate Migration to AWS with Datapipe
Automate Migration to AWS with DatapipeAmazon Web Services
 
Getting Started with AWS Lambda and the Serverless Cloud
Getting Started with AWS Lambda and the Serverless CloudGetting Started with AWS Lambda and the Serverless Cloud
Getting Started with AWS Lambda and the Serverless CloudAmazon Web Services
 
Real-time Data Processing using AWS Lambda
Real-time Data Processing using AWS LambdaReal-time Data Processing using AWS Lambda
Real-time Data Processing using AWS LambdaAmazon Web Services
 
Big Data Architectural Patterns and Best Practices
Big Data Architectural Patterns and Best PracticesBig Data Architectural Patterns and Best Practices
Big Data Architectural Patterns and Best PracticesAmazon Web Services
 
Build a Website on AWS for Your First 10 Million Users
Build a Website on AWS for Your First 10 Million UsersBuild a Website on AWS for Your First 10 Million Users
Build a Website on AWS for Your First 10 Million UsersAmazon Web Services
 
Build an App on AWS for Your First 10 Million Users
Build an App on AWS for Your First 10 Million UsersBuild an App on AWS for Your First 10 Million Users
Build an App on AWS for Your First 10 Million UsersAmazon Web Services
 
Serverless Design Patterns for Rethinking Traditional Enterprise Application ...
Serverless Design Patterns for Rethinking Traditional Enterprise Application ...Serverless Design Patterns for Rethinking Traditional Enterprise Application ...
Serverless Design Patterns for Rethinking Traditional Enterprise Application ...Amazon Web Services
 
Getting Started with Serverless Architectures
Getting Started with Serverless ArchitecturesGetting Started with Serverless Architectures
Getting Started with Serverless ArchitecturesAmazon Web Services
 
Full Stack Analytics on AWS - AWS Summit Cape Town 2017
Full Stack Analytics on AWS - AWS Summit Cape Town 2017 Full Stack Analytics on AWS - AWS Summit Cape Town 2017
Full Stack Analytics on AWS - AWS Summit Cape Town 2017 Amazon Web Services
 
AWS re:Invent 2016: Event Handling at Scale: Designing an Auditable Ingestion...
AWS re:Invent 2016: Event Handling at Scale: Designing an Auditable Ingestion...AWS re:Invent 2016: Event Handling at Scale: Designing an Auditable Ingestion...
AWS re:Invent 2016: Event Handling at Scale: Designing an Auditable Ingestion...Amazon Web Services
 
10 Tips For Serverless Backends With NodeJS and AWS Lambda
10 Tips For Serverless Backends With NodeJS and AWS Lambda10 Tips For Serverless Backends With NodeJS and AWS Lambda
10 Tips For Serverless Backends With NodeJS and AWS LambdaJim Lynch
 
Workshop: Building a Streaming Data Platform on AWS
Workshop: Building a Streaming Data Platform on AWSWorkshop: Building a Streaming Data Platform on AWS
Workshop: Building a Streaming Data Platform on AWSAmazon Web Services
 

What's hot (20)

Hong Kong AWS Summit 2017 - Keynote
Hong Kong AWS Summit 2017 - KeynoteHong Kong AWS Summit 2017 - Keynote
Hong Kong AWS Summit 2017 - Keynote
 
Modern data architectures for real time analytics and engagement
Modern data architectures for real time analytics and engagementModern data architectures for real time analytics and engagement
Modern data architectures for real time analytics and engagement
 
Architetture Serverless: concentrarsi sull'idea, non sull'infrastruttura
Architetture Serverless: concentrarsi sull'idea, non sull'infrastrutturaArchitetture Serverless: concentrarsi sull'idea, non sull'infrastruttura
Architetture Serverless: concentrarsi sull'idea, non sull'infrastruttura
 
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...
 
ENT303 Another Day, Another Billion Packets
ENT303 Another Day, Another Billion PacketsENT303 Another Day, Another Billion Packets
ENT303 Another Day, Another Billion Packets
 
AWS re:Invent 2016: IoT Blueprints: Optimizing Supply for Smart Agriculture f...
AWS re:Invent 2016: IoT Blueprints: Optimizing Supply for Smart Agriculture f...AWS re:Invent 2016: IoT Blueprints: Optimizing Supply for Smart Agriculture f...
AWS re:Invent 2016: IoT Blueprints: Optimizing Supply for Smart Agriculture f...
 
Introduction to AWS Step Functions
Introduction to AWS Step FunctionsIntroduction to AWS Step Functions
Introduction to AWS Step Functions
 
Getting the most Bang for your Buck with #EC2 #Winning
Getting the most Bang for your Buck with #EC2 #WinningGetting the most Bang for your Buck with #EC2 #Winning
Getting the most Bang for your Buck with #EC2 #Winning
 
Automate Migration to AWS with Datapipe
Automate Migration to AWS with DatapipeAutomate Migration to AWS with Datapipe
Automate Migration to AWS with Datapipe
 
Getting Started with AWS Lambda and the Serverless Cloud
Getting Started with AWS Lambda and the Serverless CloudGetting Started with AWS Lambda and the Serverless Cloud
Getting Started with AWS Lambda and the Serverless Cloud
 
Real-time Data Processing using AWS Lambda
Real-time Data Processing using AWS LambdaReal-time Data Processing using AWS Lambda
Real-time Data Processing using AWS Lambda
 
Big Data Architectural Patterns and Best Practices
Big Data Architectural Patterns and Best PracticesBig Data Architectural Patterns and Best Practices
Big Data Architectural Patterns and Best Practices
 
Build a Website on AWS for Your First 10 Million Users
Build a Website on AWS for Your First 10 Million UsersBuild a Website on AWS for Your First 10 Million Users
Build a Website on AWS for Your First 10 Million Users
 
Build an App on AWS for Your First 10 Million Users
Build an App on AWS for Your First 10 Million UsersBuild an App on AWS for Your First 10 Million Users
Build an App on AWS for Your First 10 Million Users
 
Serverless Design Patterns for Rethinking Traditional Enterprise Application ...
Serverless Design Patterns for Rethinking Traditional Enterprise Application ...Serverless Design Patterns for Rethinking Traditional Enterprise Application ...
Serverless Design Patterns for Rethinking Traditional Enterprise Application ...
 
Getting Started with Serverless Architectures
Getting Started with Serverless ArchitecturesGetting Started with Serverless Architectures
Getting Started with Serverless Architectures
 
Full Stack Analytics on AWS - AWS Summit Cape Town 2017
Full Stack Analytics on AWS - AWS Summit Cape Town 2017 Full Stack Analytics on AWS - AWS Summit Cape Town 2017
Full Stack Analytics on AWS - AWS Summit Cape Town 2017
 
AWS re:Invent 2016: Event Handling at Scale: Designing an Auditable Ingestion...
AWS re:Invent 2016: Event Handling at Scale: Designing an Auditable Ingestion...AWS re:Invent 2016: Event Handling at Scale: Designing an Auditable Ingestion...
AWS re:Invent 2016: Event Handling at Scale: Designing an Auditable Ingestion...
 
10 Tips For Serverless Backends With NodeJS and AWS Lambda
10 Tips For Serverless Backends With NodeJS and AWS Lambda10 Tips For Serverless Backends With NodeJS and AWS Lambda
10 Tips For Serverless Backends With NodeJS and AWS Lambda
 
Workshop: Building a Streaming Data Platform on AWS
Workshop: Building a Streaming Data Platform on AWSWorkshop: Building a Streaming Data Platform on AWS
Workshop: Building a Streaming Data Platform on AWS
 

Viewers also liked

Cisco Connect Toronto 2017 - Simplifying Cloud Adoption
Cisco Connect Toronto 2017 - Simplifying Cloud AdoptionCisco Connect Toronto 2017 - Simplifying Cloud Adoption
Cisco Connect Toronto 2017 - Simplifying Cloud AdoptionCisco Canada
 
Cisco Connect Toronto 2017 - Cisco meraki let simple work for you
Cisco Connect Toronto 2017 - Cisco meraki   let simple work for youCisco Connect Toronto 2017 - Cisco meraki   let simple work for you
Cisco Connect Toronto 2017 - Cisco meraki let simple work for youCisco Canada
 
Cisco Connect Toronto 2017 - Anatomy-of-attack
Cisco Connect Toronto 2017 - Anatomy-of-attackCisco Connect Toronto 2017 - Anatomy-of-attack
Cisco Connect Toronto 2017 - Anatomy-of-attackCisco Canada
 
Cisco Connect Toronto 2017 - Optimizing your client's Wi-Fi Experience
Cisco Connect Toronto 2017 - Optimizing your client's Wi-Fi ExperienceCisco Connect Toronto 2017 - Optimizing your client's Wi-Fi Experience
Cisco Connect Toronto 2017 - Optimizing your client's Wi-Fi ExperienceCisco Canada
 
Cisco Connect Toronto 2017 - Cloud and On Premises Collaboration Security Exp...
Cisco Connect Toronto 2017 - Cloud and On Premises Collaboration Security Exp...Cisco Connect Toronto 2017 - Cloud and On Premises Collaboration Security Exp...
Cisco Connect Toronto 2017 - Cloud and On Premises Collaboration Security Exp...Cisco Canada
 
Cisco Connect Toronto 2017 - Your time is now
Cisco Connect Toronto 2017 - Your time is nowCisco Connect Toronto 2017 - Your time is now
Cisco Connect Toronto 2017 - Your time is nowCisco Canada
 
Cisco Connect Toronto 2017 - Putting Firepower into the Next Generation Firewall
Cisco Connect Toronto 2017 - Putting Firepower into the Next Generation FirewallCisco Connect Toronto 2017 - Putting Firepower into the Next Generation Firewall
Cisco Connect Toronto 2017 - Putting Firepower into the Next Generation FirewallCisco Canada
 
OpenContrail Overview
OpenContrail OverviewOpenContrail Overview
OpenContrail OverviewJames Kelly
 
Cisco Connect Toronto 2017 - Introducing the Network Intuitive
Cisco Connect Toronto 2017 - Introducing the Network IntuitiveCisco Connect Toronto 2017 - Introducing the Network Intuitive
Cisco Connect Toronto 2017 - Introducing the Network IntuitiveCisco Canada
 
Veeam Availability for Hybrid Cloud (AWS)
Veeam Availability for Hybrid Cloud (AWS) Veeam Availability for Hybrid Cloud (AWS)
Veeam Availability for Hybrid Cloud (AWS) Tanawit Chansuchai
 
Cisco Connect Toronto 2017 - Model-driven Telemetry
Cisco Connect Toronto 2017 - Model-driven TelemetryCisco Connect Toronto 2017 - Model-driven Telemetry
Cisco Connect Toronto 2017 - Model-driven TelemetryCisco Canada
 
The business case for SD WAN in the enterprise
The business case for SD WAN in the enterprise The business case for SD WAN in the enterprise
The business case for SD WAN in the enterprise Colt Technology Services
 
Cisco Umbrella как облачная платформа защиты от угроз
Cisco Umbrella как облачная платформа защиты от угрозCisco Umbrella как облачная платформа защиты от угроз
Cisco Umbrella как облачная платформа защиты от угрозCisco Russia
 
A.I. Exercise.
A.I. Exercise.A.I. Exercise.
A.I. Exercise.Mario Cho
 
Варианты построения SD-WAN архитектуры корпоративного клиента: плюсы и минусы...
Варианты построения SD-WAN архитектуры корпоративного клиента: плюсы и минусы...Варианты построения SD-WAN архитектуры корпоративного клиента: плюсы и минусы...
Варианты построения SD-WAN архитектуры корпоративного клиента: плюсы и минусы...Victoria Kalinina
 
Digital Transformation - Cisco's Journey
Digital Transformation - Cisco's JourneyDigital Transformation - Cisco's Journey
Digital Transformation - Cisco's JourneyCisco Canada
 
Cisco Connect Toronto 2017 - Understanding Cisco Next Generation SD-WAN
Cisco Connect Toronto 2017 - Understanding Cisco Next Generation SD-WANCisco Connect Toronto 2017 - Understanding Cisco Next Generation SD-WAN
Cisco Connect Toronto 2017 - Understanding Cisco Next Generation SD-WANCisco Canada
 
Cisco Connect Toronto 2017 - NFV/SDN Platform for Orchestrating Cloud and vBr...
Cisco Connect Toronto 2017 - NFV/SDN Platform for Orchestrating Cloud and vBr...Cisco Connect Toronto 2017 - NFV/SDN Platform for Orchestrating Cloud and vBr...
Cisco Connect Toronto 2017 - NFV/SDN Platform for Orchestrating Cloud and vBr...Cisco Canada
 

Viewers also liked (20)

Cisco Connect Toronto 2017 - Simplifying Cloud Adoption
Cisco Connect Toronto 2017 - Simplifying Cloud AdoptionCisco Connect Toronto 2017 - Simplifying Cloud Adoption
Cisco Connect Toronto 2017 - Simplifying Cloud Adoption
 
Cisco Connect Toronto 2017 - Cisco meraki let simple work for you
Cisco Connect Toronto 2017 - Cisco meraki   let simple work for youCisco Connect Toronto 2017 - Cisco meraki   let simple work for you
Cisco Connect Toronto 2017 - Cisco meraki let simple work for you
 
Cisco Connect Toronto 2017 - Anatomy-of-attack
Cisco Connect Toronto 2017 - Anatomy-of-attackCisco Connect Toronto 2017 - Anatomy-of-attack
Cisco Connect Toronto 2017 - Anatomy-of-attack
 
Cisco Connect Toronto 2017 - Optimizing your client's Wi-Fi Experience
Cisco Connect Toronto 2017 - Optimizing your client's Wi-Fi ExperienceCisco Connect Toronto 2017 - Optimizing your client's Wi-Fi Experience
Cisco Connect Toronto 2017 - Optimizing your client's Wi-Fi Experience
 
Cisco Connect Toronto 2017 - Cloud and On Premises Collaboration Security Exp...
Cisco Connect Toronto 2017 - Cloud and On Premises Collaboration Security Exp...Cisco Connect Toronto 2017 - Cloud and On Premises Collaboration Security Exp...
Cisco Connect Toronto 2017 - Cloud and On Premises Collaboration Security Exp...
 
Colt Network On Demand
Colt Network On DemandColt Network On Demand
Colt Network On Demand
 
Cisco Connect Toronto 2017 - Your time is now
Cisco Connect Toronto 2017 - Your time is nowCisco Connect Toronto 2017 - Your time is now
Cisco Connect Toronto 2017 - Your time is now
 
Cisco Connect Toronto 2017 - Putting Firepower into the Next Generation Firewall
Cisco Connect Toronto 2017 - Putting Firepower into the Next Generation FirewallCisco Connect Toronto 2017 - Putting Firepower into the Next Generation Firewall
Cisco Connect Toronto 2017 - Putting Firepower into the Next Generation Firewall
 
OpenContrail Overview
OpenContrail OverviewOpenContrail Overview
OpenContrail Overview
 
Colt Optical SDN Innovation
Colt Optical SDN InnovationColt Optical SDN Innovation
Colt Optical SDN Innovation
 
Cisco Connect Toronto 2017 - Introducing the Network Intuitive
Streaming analytics on Google Cloud Platform, by Javier Ramirez, teowaki

  • 1. End-to-end streaming analytics on Google Cloud Platform: from event capture to dashboard to monitoring. Javier Ramirez, @supercoco9
  • 2. Hard problems and easy problems. And hard problems that look easy.
  • 3. Calculate the average of several numbers. An easy problem
  • 4. Calculate the average of several numbers. By the way, they might be MANY numbers. They will probably not fit in memory. They might not even fit in one file or on a single hard drive. An easy big data problem
  • 5. Calculate the average of several numbers. By the way, they might be MANY numbers. They will probably not fit in memory. They might not even fit in one file or on a single hard drive. Truth is they will not be in one file, but they will be streamed live from different sensors… An easy big data and streaming problem
  • 6. Calculate the average of several numbers. By the way, they might be MANY numbers. They will probably not fit in memory. They might not even fit in one file or on a single hard drive. Truth is they will not be in one file, but they will be streamed live from different sensors… In different parts of the world A not so easy streaming data problem
  • 7. Calculate the average of several numbers. By the way, they might be MANY numbers. They will probably not fit in memory. They might not even fit in one file or on a single hard drive. Truth is they will not be in one file, but they will be streamed live from different sensors… In different parts of the world. Some sensors might send a few events per hour, some a few thousand per second… An autoscaling streaming problem
  • 8. Calculate the average of several numbers. By the way, they might be MANY numbers. They will probably not fit in memory. They might not even fit in one file or on a single hard drive. Truth is they will not be in one file, but they will be streamed live from different sensors… In different parts of the world. Some sensors might send a few events per hour, some a few thousand per second… We want not just the total average of all the points, but the moving average every 30 seconds, for every sensor. And the hourly, daily, and monthly averages. A hard streaming analytics problem
  • 9. Calculate the average of several numbers. By the way, they might be MANY numbers. They will probably not fit in memory. They might not even fit in one file or on a single hard drive. Truth is they will not be in one file, but they will be streamed live from different sensors… In different parts of the world. Some sensors might send a few events per hour, some a few thousand per second… We want not just the total average of all the points, but the moving average every 30 seconds, for every sensor. And the hourly, daily, and monthly averages. Sometimes the sensors will have connectivity issues and will not send their data until later, but of course I want the calculations to still be correct. A real life analytics problem
  • 10. Calculate the average of several numbers. By the way, they might be MANY numbers. They will probably not fit in memory. They might not even fit in one file or on a single hard drive. Truth is they will not be in one file, but they will be streamed live from different sensors… In different parts of the world. Some sensors might send a few events per hour, some a few thousand per second… We want not just the total average of all the points, but the moving average every 30 seconds, for every sensor. And the hourly, daily, and monthly averages. Sometimes the sensors will have connectivity issues and will not send their data until later, but of course I want the calculations to still be correct. All of the above, plus monitoring, alerts, self-healing, a way to query the data efficiently, and a pretty dashboard on top. What your client/boss will expect
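To make the "easy big data problem" concrete: an average can be computed without holding all the numbers in one place if every chunk (a file, a shard, a batch of events) keeps only a sum and a count, and the chunks are merged afterwards. The sketch below is plain Java and is not taken from the talk's demo code; the class name `PartialMean` and its methods are hypothetical names chosen for illustration.

```java
// Hypothetical helper for illustration only; not part of the talk's demo code.
// A partial aggregate that is cheap to ship between machines: just a sum and a count.
final class PartialMean {
    final double sum;
    final long count;

    PartialMean(double sum, long count) {
        this.sum = sum;
        this.count = count;
    }

    // Fold one chunk of values into a partial aggregate.
    static PartialMean of(double... values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return new PartialMean(sum, values.length);
    }

    // Combine two partial aggregates; the order of merging does not matter.
    PartialMean merge(PartialMean other) {
        return new PartialMean(sum + other.sum, count + other.count);
    }

    double mean() {
        if (count == 0) {
            throw new IllegalStateException("no values seen yet");
        }
        return sum / count;
    }

    public static void main(String[] args) {
        PartialMean fromFileA = PartialMean.of(1.0, 2.0, 3.0);
        PartialMean fromFileB = PartialMean.of(10.0);
        System.out.println(fromFileA.merge(fromFileB).mean());
    }
}
```

Merging sums and counts matters because averaging the per-chunk averages would weight a chunk of one value the same as a chunk of a million.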
  • 11. … is easier said than done
  • 12. Our complete system in 100 lines of Java, of which 90 are mostly boilerplate and configuration. (Don’t try to read the code on this slide. I’ll zoom in later.)
  • 14. What we need: the components of a streaming data pipeline. Data acquisition; data validation; transformation / aggregation; visualization; storage / analytics; monitoring and alerts for all the components
  • 15. Data Acquisition: sending and receiving data at scale
  • 16. Google Cloud Pub/Sub Google Cloud Pub/Sub brings the scalability, flexibility, and reliability of enterprise message-oriented middleware to the cloud. By providing many-to-many, asynchronous messaging that decouples senders and receivers, it allows for secure and highly available communication between independently written applications. Google Cloud Pub/Sub delivers low-latency, durable messaging that helps developers quickly integrate systems hosted on the Google Cloud Platform and externally. Ingest event streams from anywhere, at any scale, for simple, reliable, real-time stream analytics
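The "many-to-many, asynchronous messaging that decouples senders and receivers" idea can be sketched in a few lines. What follows is a toy in-memory model in plain Java, not the Cloud Pub/Sub client library: the class `InMemoryTopic`, its methods, and the subscription names are all hypothetical. It only shows the fan-out behaviour, where every subscription gets its own copy of each message and consumes it at its own pace.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

// Toy model for illustration only; this is NOT the Cloud Pub/Sub API.
final class InMemoryTopic {
    // One independent queue of pending messages per subscription.
    private final Map<String, Queue<String>> subscriptions = new LinkedHashMap<>();

    // Register a subscription; messages published before this call are not delivered to it.
    void subscribe(String name) {
        subscriptions.putIfAbsent(name, new ArrayDeque<>());
    }

    // The publisher does not know who is listening: the message is copied to every subscription.
    void publish(String message) {
        for (Queue<String> pending : subscriptions.values()) {
            pending.add(message);
        }
    }

    // Each subscriber pulls at its own pace; returns null when nothing is pending.
    String pull(String name) {
        Queue<String> pending = subscriptions.get(name);
        if (pending == null) {
            throw new IllegalArgumentException("unknown subscription: " + name);
        }
        return pending.poll();
    }

    public static void main(String[] args) {
        InMemoryTopic topic = new InMemoryTopic();
        topic.subscribe("dashboard");
        topic.subscribe("archive");
        topic.publish("sensor-1,42.0");
        System.out.println(topic.pull("dashboard"));
        System.out.println(topic.pull("archive"));
    }
}
```

The real service adds what this toy leaves out: durability, acknowledgements, redelivery, and scaling across machines and zones.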
  • 17. The Spotify proof of concept: “Currently our production load peaks at around 700K events per second. To account for the future growth and possible disaster recovery scenarios, we settled on a test load of 2M events per second. To make it extra hard for Pub/Sub, we wanted to publish this amount of traffic from a single data center, so that all the requests were hitting the Pub/Sub machines in the same zone. We made the assumption that Google plans zones as independent failure domains and that each zone can handle equal amounts of traffic. In theory, if we’re able to push 2M messages to a single zone, we should be able to push number_of_zones * 2M messages across all zones. Our hope was that the system would be able to handle this traffic on both the producing and consuming side for a long time without the service degrading.” https://labs.spotify.com/2016/03/03/spotifys-event-delivery-the-road-to-the-cloud-part-ii/
  • 18. The Spotify proof of concept: they pushed 2 million events per second (to two topics) from 29 servers, non-stop, for five days. “We did not observe any lost messages whatsoever during the test period.” https://labs.spotify.com/2016/03/03/spotifys-event-delivery-the-road-to-the-cloud-part-ii/
  • 19. The no-operations advantage. Event Delivery System In Cloud: “We’re actively working on bringing the new system to production. The preliminary numbers we obtained from running the new system in the experimental phase look very promising. The worst end-to-end latency observed with the new system is four times lower than the end-to-end latency of old system. But boosting performance isn’t the only thing we want to get from the new system. Our bet is that by using cloud-managed products we will have a much lower operational overhead. That in turn means we will have much more time to make Spotify’s products better.” https://labs.spotify.com/2016/03/10/spotifys-event-delivery-the-road-to-the-cloud-part-iii/
  • 20. A truly global network
  • 21. Pub/Sub now works with Cloud IoT Core. Device Manager: The device manager allows individual devices to be configured and managed securely in a coarse-grained way; management can be done through a console or programmatically. The device manager establishes the identity of a device, and provides the mechanism for authenticating a device when connecting. It also maintains a logical configuration of each device and can be used to remotely control the device from the cloud. Protocol Bridge: The protocol bridge provides connection endpoints for protocols with automatic load balancing for all device connections. The protocol bridge has native support for secure connection over MQTT, an industry-standard IoT protocol. The protocol bridge publishes all device telemetry to Cloud Pub/Sub, which can then be consumed by downstream analytic systems. Which is very cool if you are into Arduino, Raspberry Pi, Android, or embedded systems
  • 22. Storage & Analytics: reliable, fast, scalable, and flexible. And no-ops
  • 23. BigQuery A database where you can send as much (or as little) data as you want, either batch or streaming, and run any SQL you want, no matter how big your data is. Even if you have petabytes of data. Even if you want to join data from different projects or from public data sources. Even if you want to query external data on Spreadsheets or Cloud Storage. Even if you want to create your own User Defined Functions in JavaScript.
  • 24. BigQuery also... … is serverless and zero configuration. You never have to worry about memory, CPU, network, or disk. You send your data, you send your queries, you get results. Behind the scenes BigQuery will use up to 2000 CPUs in parallel for your queries, and a huge amount of networked storage. But you don’t care. You pay for how much data you send and how much data you query. If you are not using the database, you are not paying anything. But it’s always available
  • 25. Hope you are not easily impressed. How long would it take to read 4 terabytes from a hard drive at 100 MB/s? And to filter 100 billion data points using a regular expression for each? And to move 278 GB across a 1 Gbps network?
  • 26. Hope you are not easily impressed. How long would it take to read 4 terabytes from a hard drive at 100 MB/s? About 11 hours. And to filter 100 billion data points using a regular expression for each? About 27 hours. And to move 278 GB across a 1 Gbps network? About 40 minutes
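The estimates on this slide are simple divisions, and they can be checked with a short sketch. Two things here are assumptions on my part, not figures from the slide: decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes) and a cost of about one microsecond per regular-expression match. With those assumptions the results come out at roughly 11.1 hours, 27.8 hours, and 37 minutes, in line with the rounded numbers above. The class name `BackOfEnvelope` is hypothetical.

```java
// Back-of-the-envelope check; the units and the regex rate are assumptions, see the comments.
final class BackOfEnvelope {
    // Time in seconds to process `amount` units at `ratePerSecond` units per second.
    static double seconds(double amount, double ratePerSecond) {
        return amount / ratePerSecond;
    }

    public static void main(String[] args) {
        // 4 TB read at 100 MB/s, assuming decimal units (1 TB = 1e12 bytes).
        double readHours = seconds(4e12, 100e6) / 3600.0;

        // 100 billion regex matches, ASSUMING about 1 microsecond each (1e6 matches per second).
        double regexHours = seconds(100e9, 1e6) / 3600.0;

        // 278 GB is 278e9 * 8 bits, moved over a 1 Gbps (1e9 bits per second) link.
        double networkMinutes = seconds(278e9 * 8, 1e9) / 60.0;

        System.out.printf("read: %.1f h, regex: %.1f h, network: %.1f min%n",
                readHours, regexHours, networkMinutes);
    }
}
```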
  • 30. We will use a simple table for our system
  • 31. Data Validation: apparently simple, but always a pain
  • 33. Google Cloud Dataprep An intelligent cloud data service to visually explore, clean, and prepare data for analysis
  • 34. Cloud Dataprep: Explorer & Suggestions
  • 36. Cloud Dataprep: No-ops execution & reports
  • 37. Transformation & Aggregation: batch and streaming ETL jobs, and data pipelines
  • 38. Apache BEAM: An advanced unified programming model Apache Beam is an open source, unified model for defining both batch and streaming data-parallel processing pipelines. Using one of the open source Beam SDKs, you build a program that defines the pipeline. The pipeline is then executed by one of Beam’s supported distributed processing back-ends, which include Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow. Beam is particularly useful for Embarrassingly Parallel data processing tasks, in which the problem can be decomposed into many smaller bundles of data that can be processed independently and in parallel. You can also use Beam for Extract, Transform, and Load (ETL) tasks and pure data integration. These tasks are useful for moving data between different storage media and data sources, transforming data into a more desirable format, or loading data onto a new system.
  • 39. Apache BEAM: A basic pipeline
  • 42. Averages with BEAM: Overview. Boilerplate and configuration. Writing the output to BigQuery. This is the code that actually processes and aggregates the data. Start the pipeline
  • 44. Averages with BEAM: Output to BigQuery
  • 45. Averages with BEAM: The processing itself. Transform/Filter: we are just parsing a line of text into multiple fields. Aggregate: we are outputting the mean speed of the last minute per sensor, every 30 seconds
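The aggregation described on this slide (mean speed over the last minute, per sensor, emitted every 30 seconds) can be illustrated without any framework. The sketch below is plain Java, not the Beam API and not the talk's code. It assumes windows are aligned to multiples of 30 seconds since the epoch, and the class `SlidingMeans` and its methods are hypothetical names. Because events are assigned to windows by their own timestamp rather than by arrival time, an event that shows up late still lands in the right windows and corrects their averages.

```java
import java.util.HashMap;
import java.util.Map;

// Framework-free sketch for illustration; NOT the Beam API and not the talk's code.
final class SlidingMeans {
    static final long SIZE_MS = 60_000;    // each window covers one minute
    static final long PERIOD_MS = 30_000;  // a new window starts every 30 seconds

    // Key is "sensor@windowStart"; value holds {sum, count} for that window.
    private final Map<String, double[]> windows = new HashMap<>();

    // Add a reading to every window [start, start + SIZE_MS) that contains its event time.
    // Assumes window starts are aligned to multiples of PERIOD_MS.
    void add(String sensor, long eventTimeMs, double speed) {
        long lastStart = eventTimeMs - Math.floorMod(eventTimeMs, PERIOD_MS);
        for (long start = lastStart; start > eventTimeMs - SIZE_MS; start -= PERIOD_MS) {
            double[] acc = windows.computeIfAbsent(sensor + "@" + start, k -> new double[2]);
            acc[0] += speed;
            acc[1] += 1;
        }
    }

    // Mean speed for one sensor in the window that starts at windowStartMs.
    double mean(String sensor, long windowStartMs) {
        double[] acc = windows.get(sensor + "@" + windowStartMs);
        if (acc == null) {
            throw new IllegalArgumentException("no data for that window");
        }
        return acc[0] / acc[1];
    }

    public static void main(String[] args) {
        SlidingMeans means = new SlidingMeans();
        means.add("s1", 10_000L, 50.0);
        means.add("s1", 40_000L, 70.0);
        means.add("s1", 20_000L, 90.0); // arrives late, still counted in the window starting at 0
        System.out.println(means.mean("s1", 0L));
    }
}
```

This toy keeps every window in memory forever. A real streaming engine also has to decide when a window is complete enough to emit and how long to wait for stragglers, which is exactly the part you do not want to write yourself.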
  • 46. Google Cloud Dataflow: BEAM with no-operations. Google developed BEAM internally as a closed-source product. Then they realised it would make sense to open-source it, and they donated it to the Apache community. Anyone can use BEAM completely for free and choose the runner on which to execute their pipelines. Google Cloud Dataflow is a BEAM runner that executes your pipelines with no operations, and with logging, monitoring, auto-scaling, shuffling, and dynamic re-balancing. It’s like BEAM, but as a managed service.
  • 48. Three instances in three continents
  • 49. Our dataflow pipeline ready to accept data
  • 51. Our dataflow pipeline seems to be working fine. 64 elements per second should be easy.
  • 52. Let’s send much more data from all over!
  • 53. Our dataflow pipeline starts feeling the heat. Receiving 2440 elements per second now.
  • 54. And now we are processing over 20,000 elements per second at the first step. But the lag starts to increase
  • 55. Auto scaling to the rescue. Three workers now
  • 56. And lag goes back to normal
  • 57. And back to just one worker
  • 58. Monitoring: what I can’t see doesn’t exist
  • 62. Visualization: because an image is worth a thousand logs
  • 64. Google Data Studio: my dashboard
  • 65. Google Data Studio: data sources
  • 67. Google Data Studio: drag and drop
  • 69. Almost there. Just one more slide
  • 70. All together now! Cloud IoT Core; Cloud Dataprep; Stackdriver Monitoring, Logging, and Error Reporting
  • 71. CHEERS! I’m happy to answer any questions you may have at lunchtime or the coffee breaks. Or ping me at @supercoco9 on Twitter. You get 280 characters now. Demo source code available at: https://github.com/GoogleCloudPlatform/training-data-analyst/tree/master/courses/streaming/process Javier Ramirez