1
Google Cloud & Data Pipeline Patterns
@LynnLangit
2
Google Cloud in Australia
Data center here in 2017
3
GCP and Patterns
1. Modern Cloud by Example
Developer-first
• Fast, flexible and cheap
• Virtual Machines / GCE
• Storage / GCS
Servers ➡ Containers ➡ Functions
2. GCP Data Pipeline Patterns
• Data Warehouse
• Internet of Things (IoT)
• Bioinformatics
**And also, something new…
4
Confidential & Proprietary, Google Cloud Platform
Demo – Storage / GCS
5
6
Demo – Virtual Machines / GCE
7
Virtual Machines / GCE
• Fast
  • Spin up in seconds
  • Tools - SSH, gcloud console
• Flexible
  • Custom sizing – slider
  • OS variety – Linux or Windows
• Cheap and Simple
  • Automatic discount for sustained use
  • Preemptible
Storage / GCS
• Fast
  • Very fast within region
  • Tools included
• Flexible
  • 4 storage options
  • Simple to use / understand
• Cheap
  • Pricing by type
8
9
Pipeline Architectures
10
Data Warehousing
11
Big Data > Data Warehouse
Reference table
Query / Compute
BigQuery
Customer Lists / Reference Data
Export Ad Data
Cloud Storage
ID matching
Cloud Dataflow
Marketing List
DoubleClick Campaign Manager
Google Analytics
Relevant Users
Cloud Storage
Analysts
Data Studio 360
Dashboards
12
Demo – BigQuery
13
Batch
Streaming
Big Data > Log Processing
Log Storage
Cloud Storage
Log Streaming
Cloud Pub/Sub
Log Analytics
BigQuery
Log Processing
Cloud Dataflow
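The streaming path on this slide (Cloud Pub/Sub into Cloud Dataflow into BigQuery) typically groups log events into time windows before the aggregates are written out. The following is a minimal pure-Python sketch of that idea, not the Apache Beam or Dataflow API. The event fields (`ts`, `severity`), the sample events and the 60-second window are assumptions made for illustration.

```python
from collections import Counter

WINDOW_SECONDS = 60  # assumed window size, not taken from the slides


def window_start(ts, size=WINDOW_SECONDS):
    """Return the start of the fixed window that contains timestamp ts."""
    return ts - (ts % size)


def count_by_window(events, size=WINDOW_SECONDS):
    """Count log events per (window start, severity) pair.

    events: iterable of dicts with hypothetical keys 'ts' (epoch seconds)
    and 'severity'. Returns a dict keyed by (window_start, severity).
    """
    counts = Counter()
    for event in events:
        counts[(window_start(event["ts"], size), event["severity"])] += 1
    return dict(counts)


# Hypothetical log events standing in for messages read from a topic.
sample_events = [
    {"ts": 5, "severity": "ERROR"},
    {"ts": 42, "severity": "INFO"},
    {"ts": 59, "severity": "ERROR"},
    {"ts": 61, "severity": "ERROR"},
]
window_counts = count_by_window(sample_events)
```

Each key in `window_counts` would map naturally to one row in an analytics table; the batch path could reuse the same function over files read from storage.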
14
Cloud Dataflow / Apache Beam
15
Big Data > Time Series Analysis
Batch
Storage
BigQuery
Storage
Cloud Storage
Time Series Processing
Cloud Dataflow
Analysis
Cloud Datalab
Storage
Cloud Bigtable*
Processing
Cloud Dataproc
Time Series Files
Cloud Storage
ML
Cloud ML
Streaming
Time Series Streaming
Cloud Pub/Sub
*Note: Use Cloud Bigtable for NoSQL workloads of 1 TB or more
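As a rough illustration of what the "Time Series Processing" stage might compute, here is a minimal pure-Python moving average. It is a sketch under assumptions: the slides do not say which computation is used, and the function name and window size are hypothetical. In a real pipeline this logic would live inside a Cloud Dataflow or Cloud Dataproc job.

```python
from collections import deque


def moving_average(values, window=3):
    """Yield the mean of the last `window` values for each new point."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)
    total = 0.0
    for value in values:
        if len(recent) == window:
            total -= recent[0]  # this value is evicted by the append below
        recent.append(value)
        total += value
        yield total / len(recent)


# Hypothetical sensor readings, smoothed over a window of two points.
smoothed = list(moving_average([2, 4, 6, 8], window=2))
```

The generator keeps only `window` values in memory, so the same function works for a bounded file and for an unbounded stream of readings.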
16
Streaming
Big Data > Complex Event Processing
Cloud Apps
Compute Engine
Streaming
Batch
Push to Devices
App Engine
Rules Engine
Cloud Dataflow
Data Analysis
Cloud Datalab
Mobile Devices
Push Notifications
Report & Share
Business Analysis
Cloud Apps
Compute Engine
On-Premises Databases
On-Premises Applications
Processed Events
Cloud Bigtable
Events Time Series
Data Warehouse
BigQuery
Execution Results
Streaming
Cloud Pub/Sub
Transactions
Processing
Cloud Dataflow
Transaction Streams
Messaging
Cloud Pub/Sub
Rules Actions
ETL
Cloud Dataflow
Transform Data
Cloud Data
Cloud Storage
Rules Engine
Cloud Dataproc
17
Core Products for Data Warehousing
Files
• Cloud Storage
Compute
• BigQuery
• Cloud Dataflow
Other
• 3rd party ETL
• 3rd party dashboards
More on BigQuery…
• Interactive or batch queries
• ANSI SQL compliant
• Cost control - purchase 'slots'
• NoOps data warehouse
18
Big Relational
19
What is Spanner?
20
Demo – Cloud Spanner
21
Internet of Things
22
Internet of Things > MQTT
IoT Warehouse
BigQuery
IoT Application
App Engine
Stream Analytics
Cloud Dataflow
IoT Topic
Cloud Pub/Sub
MQTT
Devices
Auto-scaled Broker Tier
Custom MQTT broker
MQTT Broker
Compute Engine
RabbitMQ
Cloud Load Balancing
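One way to picture the hand-off from the custom MQTT broker tier to the Cloud Pub/Sub topic is a small translation function. This sketch uses only the standard library and calls no MQTT or Pub/Sub client. The topic layout `devices/<device_id>/<metric>`, the JSON payload and the function name are all assumptions; the slides do not describe the bridge.

```python
import json


def mqtt_to_pubsub(topic, payload):
    """Map an MQTT message to a Pub/Sub-style dict (data plus attributes).

    Assumes a hypothetical topic layout 'devices/<device_id>/<metric>'
    and a UTF-8 JSON payload; neither comes from the slides.
    """
    parts = topic.split("/")
    if len(parts) != 3 or parts[0] != "devices":
        raise ValueError(f"unexpected topic layout: {topic!r}")
    _, device_id, metric = parts
    body = json.loads(payload.decode("utf-8"))
    return {
        # Re-serialize so downstream consumers see a normalized payload.
        "data": json.dumps(body, sort_keys=True).encode("utf-8"),
        "attributes": {"device_id": device_id, "metric": metric},
    }


message = mqtt_to_pubsub("devices/pump-7/temperature", b'{"value": 21.5}')
```

Carrying the device and metric as attributes lets the stream analytics stage filter or route messages without parsing every payload.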
23
Internet of Things > Sensor stream ingest and processing
Ingest Pipelines
Storage
Analytics
Application & Presentation
Standard Devices
HTTPS
Constrained Devices
Non-TCP, e.g. BLE
Gateway
App Engine
Container Engine
Cloud Storage
Cloud Pub/Sub
Cloud Dataflow
Monitoring
Logging
Cloud Dataflow
Cloud Datastore
Cloud Bigtable
BigQuery
Cloud Dataproc
Cloud Datalab
Compute Engine
24
Retail > Beacons and Targeted Marketing
Events
Cloud Bigtable
Proximity Events
Analytics
BigQuery
Data Warehouse
Messaging
Cloud Pub/Sub
Proximity Streams
Processing
Cloud Dataflow
Stream Processing
Notifications
App Engine
Push to Devices
Mobile-Push Notifications
Office Business Systems
Beacons
Proximity Notifications
Messaging
Cloud Pub/Sub
Queued Notifications
25
Core Products for IoT
Files & Storage
• Cloud Storage
• Cloud Bigtable
Compute & Ingest
• Cloud Pub/Sub
• BigQuery
• Cloud Dataflow
26
Demo – Machine Learning
27
Bioinformatics
28
Patient
Analytics
Life Sciences > Patient Monitoring
Analytics
Process Data
Prediction API
Ingest
Cloud Pub/Sub
Storage
Cloud Bigtable
Alerts
Notifications
Cloud Pub/Sub
Health Care Professional
Patient Monitors (pulse, blood sugar, exercise)
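The slide routes processed readings to alerts for a health care professional. As a simplified stand-in for the "Process Data" step (the slide names the Prediction API there; this sketch swaps in a plain range check), the following shows how a reading might become alert messages. The field names, patient ID and limits are illustrative assumptions and not clinical guidance.

```python
# Illustrative limits only; these numbers are assumptions, not clinical guidance.
LIMITS = {
    "pulse_bpm": (40, 130),
    "blood_sugar_mg_dl": (70, 180),
}


def check_reading(reading, limits=LIMITS):
    """Return alert strings for metrics outside their (low, high) range."""
    alerts = []
    for metric, (low, high) in limits.items():
        value = reading.get(metric)
        if value is None:
            continue  # metric not present in this reading
        if value < low or value > high:
            alerts.append(
                f"{reading['patient_id']}: {metric}={value} outside {low}-{high}"
            )
    return alerts


# Hypothetical reading from a patient monitor.
alerts = check_reading(
    {"patient_id": "p-001", "pulse_bpm": 142, "blood_sugar_mg_dl": 95}
)
```

Each returned string could be published to the notifications topic; readings with no out-of-range metric produce an empty list and no alert.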
29
Private Datasets Public Datasets
Life Sciences > Variant Analysis
MSSNG Autism
Cloud Storage
Scientist
High Throughput Genome Sequencers
1000 Genomes
Cloud Storage
Patient Data
Cloud Storage
Illumina Platform
Cloud Storage
Ref Genomes
Cloud Storage
TCGA
Cloud Storage
Analytics
Online Analytics
BigQuery
Batch Analytics
Cloud Dataflow
Lab Notebooks
Cloud Datalab
Data Ingest
Genomics
BAM
FASTQ
30
Ingest
Elastic Cluster
Storage
Analytics
Life Sciences > Genomics, Secondary Analysis
Carrier Interconnect
High Throughput Genome Sequencers
Scientist
Raw Datafiles
Cloud Storage
Processed Data
Cloud Storage
Metadata
Cloud SQL
Lab notebooks
Cloud Datalab
HPC Cluster
Compute Engine
10 Nodes
Ingest Server
Compute Engine
Online Analytics
BigQuery
Cloud Load Balancing
Cloud Network
31
Core Products for Bioinformatics
• Cloud Storage
• BigQuery
• Compute Engine
• Cloud Dataflow
• Public datasets on GCP
33
“The Future is Functional”
@LynnLangit

AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 

Recently uploaded (20)

AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 

Google Cloud and Data Pipeline Patterns

  • 1. Google Cloud & Data Pipeline Patterns @LynnLangit
  • 2. Google Cloud in Australia: data center here in 2017
  • 3. GCP and Patterns. 1. Modern Cloud by Example: Developer-first; fast, flexible and cheap; Virtual Machines / GCE; Storage / GCS; Servers ➡ Containers ➡ Functions. 2. GCP Data Pipeline Patterns: Data Warehouse; Internet of Things (IoT); Bioinformatics. **And also, something New…
  • 4. Demo – Storage / GCS
  • 6. Demo – Virtual Machines / GCE
  • 7. Virtual Machines / GCE: Fast (spin up in seconds; tools: SSH, gcloud console); Flexible (custom sizing with a slider; OS variety: Linux or Windows); Cheap and Simple (auto discount for use; pre-emptible). Storage / GCS: Fast (very fast within region; tools included); Flexible (4 storage options; simple to use / understand); Cheap (pricing by type).
  • 10. Data Warehousing
  • 11. Big Data > Data Warehouse: Reference table Query / Compute BigQuery Customer Lists / Reference Data Export Ad Data Cloud Storage Id matching Cloud Dataflow Marketing List DoubleClick Campaign Manager Google Analytics Relevant Users Cloud Storage Analysts DataStudio 360 Dashboards
  • 12. Demo – BigQuery
  • 13. Big Data > Log Processing (Batch, Streaming): Log Storage (Cloud Storage); Log Streaming (Cloud Pub/Sub); Log Analytics (BigQuery); Log Processing (Cloud Dataflow)
  • 15. Big Data > Time Series Analysis. Batch: Storage (BigQuery); Storage (Cloud Storage); Time Series Processing (Cloud Dataflow); Analysis (Cloud Datalab); Storage (Cloud Bigtable*); Processing (Cloud Dataproc); Time Series Files (Cloud Storage); ML (Cloud ML). Streaming: Time Series Streaming (Cloud Pub/Sub). *Note: Use Bigtable with NoSQL workloads of 1 TB or more
  • 16. Big Data > Complex Event Processing (Streaming, Batch): Cloud Apps Compute Engine Push to Devices App Engine Rules Engine Cloud Dataflow Data Analysis Cloud Datalab Mobile Devices Push Notifications Report & Share Business Analysis Cloud Apps Compute Engine On-Premises Databases On-Premises Applications Processed Events Cloud Bigtable Events Time Series Data Warehouse BigQuery Execution Results Streaming Cloud Pub/Sub Transactions Processing Cloud Dataflow Transaction Streams Messaging Cloud Pub/Sub Rules Actions ETL Cloud Dataflow Transform Data Cloud Data Cloud Storage Rules Engine Cloud Dataproc
  • 17. Core Products for Data Warehousing. Files: Cloud Storage. Compute: BigQuery; Cloud Dataflow. Other: 3rd party ETL; 3rd party dashboards. More on BigQuery…: interactive or batch query; ANSI SQL compliant; cost control (purchase ‘slots’); NoOps data warehouse
  • 18. Big Relational
  • 20. Demo – Cloud Spanner
  • 21. Internet of Things
  • 22. Internet of Things > MQTT: IoT Warehouse BigQuery IoT Application App Engine Stream Analytics Cloud Dataflow IoT Topic Cloud Pub/Sub MQTT Devices Auto-scaled Broker Tier Custom MQTT broker MQTT Broker Compute Engine RabbitMQ Cloud Load Balancing
  • 23. Internet of Things > Sensor stream ingest and processing. Stages: Ingest; Pipelines; Storage; Analytics; Application & Presentation. Devices: Standard Devices (HTTPS); Constrained Devices (Non-TCP, e.g. BLE); Gateway. Products: App Engine; Container Engine; Cloud Storage; Cloud Pub/Sub; Cloud Dataflow; Monitoring; Logging; Cloud Dataflow; Cloud Datastore; Cloud Bigtable; BigQuery; Cloud Dataproc; Cloud Datalab; Compute Engine
  • 24. Retail > Beacons and Targeted Marketing: Events Cloud Bigtable Proximity Events Analytics BigQuery Data Warehouse Messaging Cloud Pub/Sub Proximity Streams Processing Cloud Dataflow Stream Processing Notifications App Engine Push to Devices Mobile-Push Notifications Office Business Systems Beacons Proximity Notifications Messaging Cloud Pub/Sub Queued Notifications
  • 25. Core Products for IoT. Files & Storage: Cloud Storage; Bigtable. Compute & Ingest: Cloud Pub/Sub; BigQuery; Cloud Dataflow
  • 26. Demo – Machine Learning
  • 27. Bioinformatics
  • 28. Life Sciences > Patient Monitoring: Patient Analytics Analytics Process Data Prediction API Ingest Cloud Pub/Sub Storage Cloud Bigtable Alerts Notifications Cloud Pub/Sub Health Care Professional Patient Monitors (pulse, blood sugar, exercise)
  • 29. Life Sciences > Variant Analysis. Private Datasets and Public Datasets (each in Cloud Storage): MSSNG Autism; 1000 Genomes; Patient Data; Illumina Platform; Ref Genomes; TCGA. Scientist; High Throughput Genome Sequencers. Analytics: Online Analytics (BigQuery); Batch Analytics (Cloud Dataflow); Lab Notebooks (Cloud Datalab). Data Ingest: Genomics; BAM; FASTQ
  • 30. Life Sciences > Genomics, Secondary Analysis. Stages: Ingest; Elastic Cluster; Storage; Analytics. Carrier Interconnect; High Throughput Genome Sequencers; Scientist; Raw Datafiles (Cloud Storage); Processed Data (Cloud Storage); Metadata (Cloud SQL); Lab notebooks (Cloud Datalab); HPC Cluster (Compute Engine, 10 Nodes); Ingest Server (Compute Engine); Online Analytics (BigQuery); Cloud Load Balancing; Cloud Network
  • 31. Core Products for Bioinformatics: Cloud Storage; BigQuery; Compute Engine; Cloud Dataflow; Public datasets on GCP
  • 32. “The Future is Functional” @LynnLangit
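
The log processing pattern on slide 13 (batch logs from storage and streaming logs from a message topic, both passing through one processing step into an analytics store) can be sketched in plain Python. This is only an illustration under stated assumptions: the log line format, the function names, and the in-memory sources are all hypothetical stand-ins for Cloud Storage, Cloud Pub/Sub, Cloud Dataflow and BigQuery, and no Google Cloud API is called.

```python
from collections import Counter
from typing import Iterable, Iterator


def parse_log_line(line: str) -> dict:
    """Split a hypothetical 'timestamp level message' log line into fields."""
    timestamp, level, message = line.split(" ", 2)
    return {"timestamp": timestamp, "level": level, "message": message}


def process_logs(lines: Iterable[str]) -> Iterator[dict]:
    """Shared processing step (the role Cloud Dataflow plays on the slide).

    It accepts any iterable, so the same code serves a finite batch
    and a generator that yields lines as they arrive.
    """
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            yield parse_log_line(line)


def count_by_level(rows: Iterable[dict]) -> Counter:
    """Analytics step (the role BigQuery plays): count rows per log level."""
    return Counter(row["level"] for row in rows)


# Batch source: stands in for log files kept in Cloud Storage.
batch_source = [
    "2017-06-01T00:00:00Z INFO service started",
    "2017-06-01T00:00:05Z ERROR disk full",
    "",
]


def streaming_source() -> Iterator[str]:
    """Streaming source: stands in for messages from a Cloud Pub/Sub topic."""
    yield "2017-06-01T00:01:00Z INFO request served"
    yield "2017-06-01T00:01:02Z INFO request served"


batch_counts = count_by_level(process_logs(batch_source))
stream_counts = count_by_level(process_logs(streaming_source()))
print(dict(batch_counts))
print(dict(stream_counts))
```

The point of the sketch is the design choice the slide implies: one processing function is reused for both paths, which is the unified batch and streaming model that Cloud Dataflow / Apache Beam is built around.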

Editor's Notes

  1. https://cloud.google.com/spanner/ https://research.google.com/pubs/pub45855.html https://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf
  2. Icon and sample diagrams landing page https://cloud.google.com/icons