SlideShare a Scribd company logo
1 of 50
Download to read offline
(Big) Data Driven at
Eway
Tu Pham - CTO @ Eway
Journey to Google Cloud
-- Ha Noi - 03/2019 --
About Me - CTO at Eway JSC
- Google Developer Expert on Cloud
Platform
- Open source contributor, blogger,
father
- 8 years experience on Big data and
Cloud Computing
44M dollar
transaction
value in
2018
> 5M
Transactions
When You
Have (Big)
Data
How Do We
Use This
Data
Use Case - Reporting
- Business Analytics
- Operational Analytics
- Product Features
- System Monitoring
- Reporting to:
- Partners
- Advertisers
- Publishers
Reporting
Business
Analytics
- Analyzing
- Growth
- Users behavior
- Sign up funnels
- Sign up referrals
- ...
Operational
Analytics
- Analyzing
- Root cause analysis
- Latency analysis
- Error analysis
- Better
- Threshold alerts
- Security alerts
- Capacity planning (server,
bandwidth)
Product
Features
- Product Features
- Top Products
- Adflex publisher challenge
- Signup referrals
- A/B Testing
Samole: End-To-End Flow For Mining
User Behavior
How do we
collect this
data?
Step 1: GC Compute Engine Instances
Collect Raw Data
- Technology: Cloud Load Balancing, Compute Engine
- Why Cloud Load Balancing:
- TCP/UDP Load Balancing
- Seamless Autoscaling
- Scalable
- Why Compute Engine:
- High-Performance
- Scalable
- Low Cost
- Fast Networking
- Custom Machine Types
Step 1: GC Compute Engine Instances
Collect Raw Data
How do we
process this
data?
Step 2: GC Compute Engine Instances
Convert Raw Data To Apache Parquet Files
- Technology: Compute Engine, Parquet file format
- Why Parquet:
- Self-describing, columnar storage format
- Language-independent
- High query-performance
- Spark SQL is much faster with Parquet
- High compression (up to 70%)- less disk IO
Step 2: GC Compute Engine Instances
Convert Raw Data To Apache Parquet Files
Step 2: GC Compute Engine Instances
Convert Raw Data To Apache Parquet Files
- Technology: Compute Engine, Parquet file format, Cloud Storage
- Why Cloud Storage:
- Four storage classes
- Easy to integrate
- Object Lifecycle Management
- Fast Networking
Step 3: GC Compute Engine Upload Parquet
File To GC Cloud Storage
Step 3: GC Compute Engine Upload Parquet
File To GC Cloud Storage
Step 3: GC Compute Engine Upload Parquet
File To GC Cloud Storage
Step 3: GC Compute Engine Upload Parquet
File To GC Cloud Storage
Step 3: GC Compute Engine Upload Parquet
File To GC Cloud Storage
How do we
visualize this
data
Step 4a: Explore Dataset Using GC Datalab
- Technology: Cloud Datalab
- Why Datalab:
- Integrated with: Cloud BigQuery, Cloud Machine Learning Engine, Cloud Storage, and
Stackdriver Monitoring
- IPython Support & Notebook Format
- Interactive Data Visualization
- Multi-Language Support: Python, SQL, and JavaScript (for BigQuery user-defined functions
Step 4b: Explore Dataset Using BI Tools
- Technology: Grafana, PowerBI
- Why:
- Support >40 data sources (File, Database, Log Stream, Zabbix, Google Analytic, Google
Calendar, AWS Cloudwatch, Jira, ...)
- Query, visualize, alert on and understand your metrics
- Create, explore, and share dashboards with your team
Step 4a: Explore Dataset Using GC Datalab
Become
Geek
Where Are
The AI /
ML
Create
Your
Principles
Principles:
- KISS (Keep it simple, stupid)
- DRY (Don’t Repeat Yourself)
- Single Responsibility
- Low Cost
- Scalable
Be 1% better everyday
tips
Create your system
principles
Design system
architecture, data flow,
data model, data
structure first
Separate realtime and
batch flows
Separate data storage
strategies between data
types
Save the cost by
network cost, instances
cost, storage cost by
metric monitoring &
alert system
We Are Hiring
● Product Owner
● Backend Java Developer
● Full Stack PHP Developer
● Full Stack Python Developer
● DevOps Engineer
Thank You - Q&A
● Eway: https://eway.vn
● My Contact: tupp@eway.vn

More Related Content

What's hot

Understanding cloud with Google Cloud Platform
Understanding cloud with Google Cloud PlatformUnderstanding cloud with Google Cloud Platform
Understanding cloud with Google Cloud PlatformDr. Ketan Parmar
 
Scaling Galaxy on Google Cloud Platform
Scaling Galaxy on Google Cloud PlatformScaling Galaxy on Google Cloud Platform
Scaling Galaxy on Google Cloud PlatformLynn Langit
 
Google Cloud Platform Introduction - 2016Q3
Google Cloud Platform Introduction - 2016Q3Google Cloud Platform Introduction - 2016Q3
Google Cloud Platform Introduction - 2016Q3Simon Su
 
Getting started with GCP ( Google Cloud Platform)
Getting started with GCP ( Google  Cloud Platform)Getting started with GCP ( Google  Cloud Platform)
Getting started with GCP ( Google Cloud Platform)bigdata trunk
 
Cloud Developer Days - BigQuery
Cloud Developer Days - BigQueryCloud Developer Days - BigQuery
Cloud Developer Days - BigQueryWlodek Bielski
 
R, Spark, Tensorflow, H20.ai Applied to Streaming Analytics
R, Spark, Tensorflow, H20.ai Applied to Streaming AnalyticsR, Spark, Tensorflow, H20.ai Applied to Streaming Analytics
R, Spark, Tensorflow, H20.ai Applied to Streaming AnalyticsKai Wähner
 
Google Cloud Platform (GCP) At a Glance
Google Cloud Platform (GCP)  At a GlanceGoogle Cloud Platform (GCP)  At a Glance
Google Cloud Platform (GCP) At a GlanceCloud Analogy
 
How Google Does Big Data - DevNexus 2014
How Google Does Big Data - DevNexus 2014How Google Does Big Data - DevNexus 2014
How Google Does Big Data - DevNexus 2014James Chittenden
 
An overview of BigQuery
An overview of BigQuery An overview of BigQuery
An overview of BigQuery GirdhareeSaran
 
#DataUnlimited - Google Big Data Unlimited
#DataUnlimited - Google Big Data Unlimited#DataUnlimited - Google Big Data Unlimited
#DataUnlimited - Google Big Data UnlimitedAudrey Huvet
 
Google Cloud Platform as a Backend Solution for your Product
Google Cloud Platform as a Backend Solution for your ProductGoogle Cloud Platform as a Backend Solution for your Product
Google Cloud Platform as a Backend Solution for your ProductSergey Smetanin
 
StackEngine Demo - Docker Austin
StackEngine Demo - Docker AustinStackEngine Demo - Docker Austin
StackEngine Demo - Docker AustinBoyd Hemphill
 
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...Imam Raza
 
Building Reactive Applications With Akka And Java
Building Reactive Applications With Akka And JavaBuilding Reactive Applications With Akka And Java
Building Reactive Applications With Akka And JavaTu Pham
 
Critical Breakthroughs and Challenges in Big Data and Analytics
Critical Breakthroughs and Challenges in Big Data and AnalyticsCritical Breakthroughs and Challenges in Big Data and Analytics
Critical Breakthroughs and Challenges in Big Data and AnalyticsData Driven Innovation
 
Visualising and Linking Open Data from Multiple Sources
Visualising and Linking Open Data from Multiple SourcesVisualising and Linking Open Data from Multiple Sources
Visualising and Linking Open Data from Multiple SourcesData Driven Innovation
 
Introduction to GCP presentation
Introduction to GCP presentationIntroduction to GCP presentation
Introduction to GCP presentationMohit Kachhwani
 

What's hot (20)

Understanding cloud with Google Cloud Platform
Understanding cloud with Google Cloud PlatformUnderstanding cloud with Google Cloud Platform
Understanding cloud with Google Cloud Platform
 
Scaling Galaxy on Google Cloud Platform
Scaling Galaxy on Google Cloud PlatformScaling Galaxy on Google Cloud Platform
Scaling Galaxy on Google Cloud Platform
 
Google Cloud Platform Introduction - 2016Q3
Google Cloud Platform Introduction - 2016Q3Google Cloud Platform Introduction - 2016Q3
Google Cloud Platform Introduction - 2016Q3
 
Getting started with GCP ( Google Cloud Platform)
Getting started with GCP ( Google  Cloud Platform)Getting started with GCP ( Google  Cloud Platform)
Getting started with GCP ( Google Cloud Platform)
 
Cloud Developer Days - BigQuery
Cloud Developer Days - BigQueryCloud Developer Days - BigQuery
Cloud Developer Days - BigQuery
 
R, Spark, Tensorflow, H20.ai Applied to Streaming Analytics
R, Spark, Tensorflow, H20.ai Applied to Streaming AnalyticsR, Spark, Tensorflow, H20.ai Applied to Streaming Analytics
R, Spark, Tensorflow, H20.ai Applied to Streaming Analytics
 
Google Cloud Platform (GCP) At a Glance
Google Cloud Platform (GCP)  At a GlanceGoogle Cloud Platform (GCP)  At a Glance
Google Cloud Platform (GCP) At a Glance
 
Google Cloud Platform Data Storage
Google Cloud Platform Data StorageGoogle Cloud Platform Data Storage
Google Cloud Platform Data Storage
 
Google Cloud Dataflow
Google Cloud DataflowGoogle Cloud Dataflow
Google Cloud Dataflow
 
How Google Does Big Data - DevNexus 2014
How Google Does Big Data - DevNexus 2014How Google Does Big Data - DevNexus 2014
How Google Does Big Data - DevNexus 2014
 
An overview of BigQuery
An overview of BigQuery An overview of BigQuery
An overview of BigQuery
 
#DataUnlimited - Google Big Data Unlimited
#DataUnlimited - Google Big Data Unlimited#DataUnlimited - Google Big Data Unlimited
#DataUnlimited - Google Big Data Unlimited
 
Google Cloud Platform as a Backend Solution for your Product
Google Cloud Platform as a Backend Solution for your ProductGoogle Cloud Platform as a Backend Solution for your Product
Google Cloud Platform as a Backend Solution for your Product
 
StackEngine Demo - Docker Austin
StackEngine Demo - Docker AustinStackEngine Demo - Docker Austin
StackEngine Demo - Docker Austin
 
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...
 
Building Reactive Applications With Akka And Java
Building Reactive Applications With Akka And JavaBuilding Reactive Applications With Akka And Java
Building Reactive Applications With Akka And Java
 
Critical Breakthroughs and Challenges in Big Data and Analytics
Critical Breakthroughs and Challenges in Big Data and AnalyticsCritical Breakthroughs and Challenges in Big Data and Analytics
Critical Breakthroughs and Challenges in Big Data and Analytics
 
Visualising and Linking Open Data from Multiple Sources
Visualising and Linking Open Data from Multiple SourcesVisualising and Linking Open Data from Multiple Sources
Visualising and Linking Open Data from Multiple Sources
 
Introduction to GCP presentation
Introduction to GCP presentationIntroduction to GCP presentation
Introduction to GCP presentation
 
Google cloud
Google cloudGoogle cloud
Google cloud
 

Similar to Big Data Driven At Eway

Google cloud platform
Google cloud platformGoogle cloud platform
Google cloud platformrajdeep
 
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryCodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryMárton Kodok
 
API and Big Data Solution Patterns
API and Big Data Solution Patterns API and Big Data Solution Patterns
API and Big Data Solution Patterns WSO2
 
Presentation Data Council Meetup: F. Mekkenholt, R. Vlijm
Presentation Data Council Meetup: F. Mekkenholt, R. VlijmPresentation Data Council Meetup: F. Mekkenholt, R. Vlijm
Presentation Data Council Meetup: F. Mekkenholt, R. VlijmAlexander Oppel
 
The role of AWS in the Datalandscape of a fast growing Startup
The role of AWS in the Datalandscape of a fast growing StartupThe role of AWS in the Datalandscape of a fast growing Startup
The role of AWS in the Datalandscape of a fast growing StartupMaximilian Ehrlich
 
Quick Intro to Google Cloud Technologies
Quick Intro to Google Cloud TechnologiesQuick Intro to Google Cloud Technologies
Quick Intro to Google Cloud TechnologiesChris Schalk
 
Voxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQueryVoxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQueryMárton Kodok
 
How to design and implement a data ops architecture with sdc and gcp
How to design and implement a data ops architecture with sdc and gcpHow to design and implement a data ops architecture with sdc and gcp
How to design and implement a data ops architecture with sdc and gcpJoseph Arriola
 
Solving enterprise challenges through scale out storage & big compute final
Solving enterprise challenges through scale out storage & big compute finalSolving enterprise challenges through scale out storage & big compute final
Solving enterprise challenges through scale out storage & big compute finalAvere Systems
 
Key projects Data Science and Engineering
Key projects Data Science and EngineeringKey projects Data Science and Engineering
Key projects Data Science and EngineeringVijayananda Mohire
 
Key projects Data Science and Engineering
Key projects Data Science and EngineeringKey projects Data Science and Engineering
Key projects Data Science and EngineeringVijayananda Mohire
 
ALT-F1.BE : The Accelerator (Google Cloud Platform)
ALT-F1.BE : The Accelerator (Google Cloud Platform)ALT-F1.BE : The Accelerator (Google Cloud Platform)
ALT-F1.BE : The Accelerator (Google Cloud Platform)Abdelkrim Boujraf
 
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQuery
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQueryIntro to new Google cloud technologies: Google Storage, Prediction API, BigQuery
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQueryChris Schalk
 
Introduction to Google Cloud platform technologies
Introduction to Google Cloud platform technologiesIntroduction to Google Cloud platform technologies
Introduction to Google Cloud platform technologiesChris Schalk
 
Introduction to Google's Cloud Technologies
Introduction to Google's Cloud TechnologiesIntroduction to Google's Cloud Technologies
Introduction to Google's Cloud TechnologiesChris Schalk
 
Intro to Google's Cloud Technologies
Intro to Google's Cloud TechnologiesIntro to Google's Cloud Technologies
Intro to Google's Cloud TechnologiesChris Schalk
 
Get Well Prepared for Google Professional Cloud Developer (GCP-PCD) Certifica...
Get Well Prepared for Google Professional Cloud Developer (GCP-PCD) Certifica...Get Well Prepared for Google Professional Cloud Developer (GCP-PCD) Certifica...
Get Well Prepared for Google Professional Cloud Developer (GCP-PCD) Certifica...Amaaira Johns
 

Similar to Big Data Driven At Eway (20)

Google cloud platform
Google cloud platformGoogle cloud platform
Google cloud platform
 
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryCodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
 
Modern Thinking área digital MSKM 21/09/2017
Modern Thinking área digital MSKM 21/09/2017Modern Thinking área digital MSKM 21/09/2017
Modern Thinking área digital MSKM 21/09/2017
 
API and Big Data Solution Patterns
API and Big Data Solution Patterns API and Big Data Solution Patterns
API and Big Data Solution Patterns
 
Engineering Data Pipeline for Data-Driven Analytics
Engineering Data Pipeline for Data-Driven AnalyticsEngineering Data Pipeline for Data-Driven Analytics
Engineering Data Pipeline for Data-Driven Analytics
 
Presentation Data Council Meetup: F. Mekkenholt, R. Vlijm
Presentation Data Council Meetup: F. Mekkenholt, R. VlijmPresentation Data Council Meetup: F. Mekkenholt, R. Vlijm
Presentation Data Council Meetup: F. Mekkenholt, R. Vlijm
 
The role of AWS in the Datalandscape of a fast growing Startup
The role of AWS in the Datalandscape of a fast growing StartupThe role of AWS in the Datalandscape of a fast growing Startup
The role of AWS in the Datalandscape of a fast growing Startup
 
Quick Intro to Google Cloud Technologies
Quick Intro to Google Cloud TechnologiesQuick Intro to Google Cloud Technologies
Quick Intro to Google Cloud Technologies
 
Data Infrastructure in Kumparan
Data Infrastructure in KumparanData Infrastructure in Kumparan
Data Infrastructure in Kumparan
 
Voxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQueryVoxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQuery
 
How to design and implement a data ops architecture with sdc and gcp
How to design and implement a data ops architecture with sdc and gcpHow to design and implement a data ops architecture with sdc and gcp
How to design and implement a data ops architecture with sdc and gcp
 
Solving enterprise challenges through scale out storage & big compute final
Solving enterprise challenges through scale out storage & big compute finalSolving enterprise challenges through scale out storage & big compute final
Solving enterprise challenges through scale out storage & big compute final
 
Key projects Data Science and Engineering
Key projects Data Science and EngineeringKey projects Data Science and Engineering
Key projects Data Science and Engineering
 
Key projects Data Science and Engineering
Key projects Data Science and EngineeringKey projects Data Science and Engineering
Key projects Data Science and Engineering
 
ALT-F1.BE : The Accelerator (Google Cloud Platform)
ALT-F1.BE : The Accelerator (Google Cloud Platform)ALT-F1.BE : The Accelerator (Google Cloud Platform)
ALT-F1.BE : The Accelerator (Google Cloud Platform)
 
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQuery
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQueryIntro to new Google cloud technologies: Google Storage, Prediction API, BigQuery
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQuery
 
Introduction to Google Cloud platform technologies
Introduction to Google Cloud platform technologiesIntroduction to Google Cloud platform technologies
Introduction to Google Cloud platform technologies
 
Introduction to Google's Cloud Technologies
Introduction to Google's Cloud TechnologiesIntroduction to Google's Cloud Technologies
Introduction to Google's Cloud Technologies
 
Intro to Google's Cloud Technologies
Intro to Google's Cloud TechnologiesIntro to Google's Cloud Technologies
Intro to Google's Cloud Technologies
 
Get Well Prepared for Google Professional Cloud Developer (GCP-PCD) Certifica...
Get Well Prepared for Google Professional Cloud Developer (GCP-PCD) Certifica...Get Well Prepared for Google Professional Cloud Developer (GCP-PCD) Certifica...
Get Well Prepared for Google Professional Cloud Developer (GCP-PCD) Certifica...
 

More from Tu Pham

Go from idea to app with no coding using AppSheet.pptx
Go from idea to app with no coding using AppSheet.pptxGo from idea to app with no coding using AppSheet.pptx
Go from idea to app with no coding using AppSheet.pptxTu Pham
 
Secure your app against DDOS, API Abuse, Hijacking, and Fraud
 Secure your app against DDOS, API Abuse, Hijacking, and Fraud Secure your app against DDOS, API Abuse, Hijacking, and Fraud
Secure your app against DDOS, API Abuse, Hijacking, and FraudTu Pham
 
Challenges In Implementing SRE
Challenges In Implementing SREChallenges In Implementing SRE
Challenges In Implementing SRETu Pham
 
IT Strategy
IT Strategy IT Strategy
IT Strategy Tu Pham
 
Set up Learn and Development program
Set up Learn and Development programSet up Learn and Development program
Set up Learn and Development programTu Pham
 
Cost Management For IT Project / Product
Cost Management For IT Project / ProductCost Management For IT Project / Product
Cost Management For IT Project / ProductTu Pham
 
Minimum Viable Product 101
Minimum Viable Product 101Minimum Viable Product 101
Minimum Viable Product 101Tu Pham
 
Understand your customers
Understand your customersUnderstand your customers
Understand your customersTu Pham
 
Let's build great products for mid-size companies
Let's build great products for mid-size companiesLet's build great products for mid-size companies
Let's build great products for mid-size companiesTu Pham
 
Latency Control And Supervision In Resilience Design Patterns
Latency Control And Supervision In Resilience Design Patterns Latency Control And Supervision In Resilience Design Patterns
Latency Control And Supervision In Resilience Design Patterns Tu Pham
 
High Output Tech Management
High Output Tech Management High Output Tech Management
High Output Tech Management Tu Pham
 
Security On The Cloud
Security On The CloudSecurity On The Cloud
Security On The CloudTu Pham
 
Eway Tech Talk #2 Coding Guidelines
Eway Tech Talk #2 Coding GuidelinesEway Tech Talk #2 Coding Guidelines
Eway Tech Talk #2 Coding GuidelinesTu Pham
 
Eway Tech Talk #0 Knowledge Sharing
Eway Tech Talk #0 Knowledge SharingEway Tech Talk #0 Knowledge Sharing
Eway Tech Talk #0 Knowledge SharingTu Pham
 
Php 5.6 vs Php 7 performance comparison
Php 5.6 vs Php 7 performance comparisonPhp 5.6 vs Php 7 performance comparison
Php 5.6 vs Php 7 performance comparisonTu Pham
 
System Security on Cloud
System Security on CloudSystem Security on Cloud
System Security on CloudTu Pham
 
Big data on google cloud
Big data on google cloudBig data on google cloud
Big data on google cloudTu Pham
 
Understanding Kubernetes
Understanding KubernetesUnderstanding Kubernetes
Understanding KubernetesTu Pham
 
Database, data storage, hosting with Firebase
Database, data storage, hosting with FirebaseDatabase, data storage, hosting with Firebase
Database, data storage, hosting with FirebaseTu Pham
 
Recommendation system for ecommerce
Recommendation system for ecommerceRecommendation system for ecommerce
Recommendation system for ecommerceTu Pham
 

More from Tu Pham (20)

Go from idea to app with no coding using AppSheet.pptx
Go from idea to app with no coding using AppSheet.pptxGo from idea to app with no coding using AppSheet.pptx
Go from idea to app with no coding using AppSheet.pptx
 
Secure your app against DDOS, API Abuse, Hijacking, and Fraud
 Secure your app against DDOS, API Abuse, Hijacking, and Fraud Secure your app against DDOS, API Abuse, Hijacking, and Fraud
Secure your app against DDOS, API Abuse, Hijacking, and Fraud
 
Challenges In Implementing SRE
Challenges In Implementing SREChallenges In Implementing SRE
Challenges In Implementing SRE
 
IT Strategy
IT Strategy IT Strategy
IT Strategy
 
Set up Learn and Development program
Set up Learn and Development programSet up Learn and Development program
Set up Learn and Development program
 
Cost Management For IT Project / Product
Cost Management For IT Project / ProductCost Management For IT Project / Product
Cost Management For IT Project / Product
 
Minimum Viable Product 101
Minimum Viable Product 101Minimum Viable Product 101
Minimum Viable Product 101
 
Understand your customers
Understand your customersUnderstand your customers
Understand your customers
 
Let's build great products for mid-size companies
Let's build great products for mid-size companiesLet's build great products for mid-size companies
Let's build great products for mid-size companies
 
Latency Control And Supervision In Resilience Design Patterns
Latency Control And Supervision In Resilience Design Patterns Latency Control And Supervision In Resilience Design Patterns
Latency Control And Supervision In Resilience Design Patterns
 
High Output Tech Management
High Output Tech Management High Output Tech Management
High Output Tech Management
 
Security On The Cloud
Security On The CloudSecurity On The Cloud
Security On The Cloud
 
Eway Tech Talk #2 Coding Guidelines
Eway Tech Talk #2 Coding GuidelinesEway Tech Talk #2 Coding Guidelines
Eway Tech Talk #2 Coding Guidelines
 
Eway Tech Talk #0 Knowledge Sharing
Eway Tech Talk #0 Knowledge SharingEway Tech Talk #0 Knowledge Sharing
Eway Tech Talk #0 Knowledge Sharing
 
Php 5.6 vs Php 7 performance comparison
Php 5.6 vs Php 7 performance comparisonPhp 5.6 vs Php 7 performance comparison
Php 5.6 vs Php 7 performance comparison
 
System Security on Cloud
System Security on CloudSystem Security on Cloud
System Security on Cloud
 
Big data on google cloud
Big data on google cloudBig data on google cloud
Big data on google cloud
 
Understanding Kubernetes
Understanding KubernetesUnderstanding Kubernetes
Understanding Kubernetes
 
Database, data storage, hosting with Firebase
Database, data storage, hosting with FirebaseDatabase, data storage, hosting with Firebase
Database, data storage, hosting with Firebase
 
Recommendation system for ecommerce
Recommendation system for ecommerceRecommendation system for ecommerce
Recommendation system for ecommerce
 

Recently uploaded

Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024SynarionITSolutions
 

Recently uploaded (20)

Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 

Big Data Driven At Eway

  • 1. (Big) Data Driven at Eway Tu Pham - CTO @ Eway Journey to Google Cloud -- Ha Noi - 03/2019 --
  • 2. About Me - CTO at Eway JSC - Google Developer Expert on Cloud Platform - Open source contributor, blogger, father - 8 years experience on Big data and Cloud Computing
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 14.
  • 15.
  • 17. How Do We Use This Data
  • 18. Use Case - Reporting - Business Analytics - Operational Analytics - Product Features - System Monitoring
  • 19. - Reporting to: - Partners - Advertisers - Publishers Reporting
  • 20. Business Analytics - Analyzing - Growth - Users behavior - Sign up funnels - Sign up referrals - ...
  • 21. Operational Analytics - Analyzing - Root cause analysis - Latency analysis - Error analysis - Better - Threshold alerts - Security alerts - Capacity planning (server, bandwidth)
  • 22. Product Features - Product Features - Top Products - Adflex publisher challenge - Signup referrals - A/B Testing
  • 23.
  • 24. Samole: End-To-End Flow For Mining User Behavior
  • 25. How do we collect this data?
  • 26. Step 1: GC Compute Engine Instances Collect Raw Data - Technology: Cloud Load Balancing, Compute Engine - Why Cloud Load Balancing: - TCP/UDP Load Balancing - Seamless Autoscaling - Scalable - Why Compute Engine: - High-Performance - Scalable - Low Cost - Fast Networking - Custom Machine Types
  • 27. Step 1: GC Compute Engine Instances Collect Raw Data
  • 28. How do we process this data?
  • 29. Step 2: GC Compute Engine Instances Convert Raw Data To Apache Parquet Files - Technology: Compute Engine, Parquet file format - Why Parquet: - Self-describing, columnar storage format - Language-independent - High query-performance - Spark SQL is much faster with Parquet - High compression (up to 70%)- less disk IO
  • 30. Step 2: GC Compute Engine Instances Convert Raw Data To Apache Parquet Files
  • 31. Step 2: GC Compute Engine Instances Convert Raw Data To Apache Parquet Files
  • 32. - Technology: Compute Engine, Parquet file format, Cloud Storage - Why Cloud Storage: - Four storage classes - Easy to integrate - Object Lifecycle Management - Fast Networking Step 3: GC Compute Engine Upload Parquet File To GC Cloud Storage
  • 33. Step 3: GC Compute Engine Upload Parquet File To GC Cloud Storage
  • 34. Step 3: GC Compute Engine Upload Parquet File To GC Cloud Storage
  • 35. Step 3: GC Compute Engine Upload Parquet File To GC Cloud Storage
  • 36. Step 3: GC Compute Engine Upload Parquet File To GC Cloud Storage
  • 37. How do we visualize this data
  • 38. Step 4a: Explore Dataset Using GC Datalab - Technology: Cloud Datalab - Why Datalab: - Integrated with: Cloud BigQuery, Cloud Machine Learning Engine, Cloud Storage, and Stackdriver Monitoring - IPython Support & Notebook Format - Interactive Data Visualization - Multi-Language Support: Python, SQL, and JavaScript (for BigQuery user-defined functions
  • 39. Step 4b: Explore Dataset Using BI Tools - Technology: Grafana, PowerBI - Why: - Support >40 data sources (File, Database, Log Stream, Zabbix, Google Analytic, Google Calendar, AWS Cloudwatch, Jira, ...) - Query, visualize, alert on and understand your metrics - Create, explore, and share dashboards with your team
  • 40. Step 4a: Explore Dataset Using GC Datalab
  • 41.
  • 42.
  • 43.
  • 44.
  • 47. Create Your Principles Principles: - KISS (Keep it simple, stupid) - DRY (Don’t Repeat Yourself) - Single Responsibility - Low Cost - Scalable
  • 48. Be 1% better everyday tips Create your system principles Design system architecture, data flow, data model, data structure first Separate realtime and batch flows Separate data storage strategies between data types Save the cost by network cost, instances cost, storage cost by metric monitoring & alert system
  • 49. We Are Hiring ● Product Owner ● Backend Java Developer ● Full Stack PHP Developer ● Full Stack Python Developer ● DevOps Engineer
  • 50. Thank You - Q&A ● Eway: https://eway.vn ● My Contact: tupp@eway.vn