Applying BigQuery ML
on E-commerce Data Analytics
September 2020 - Split, Croatia
Márton Kodok / @martonkodok
Google Developer Expert at REEA.net
● Among the top 3 Romanians on Stack Overflow (175k reputation)
● Google Developer Expert on Cloud technologies
● Crafting Web/Mobile backends at REEA.net
● BigQuery + Redis database engine expert
Slideshare: martonkodok
Twitter: @martonkodok
StackOverflow: pentium10
GitHub: pentium10
Applying BigQuery ML on E-commerce Data Analytics @martonkodok
About me
1. E-commerce Workloads and data models
2. What is BigQuery? - Data warehouse in the Cloud
3. Introduction to BigQuery ML - execute ML models using SQL
4. Practical use cases
5. Predict, recommend and forecast with BigQuery ML
6. Conclusions
Agenda
Shop - products, tagging, features, attributes
Users - profile, preferences, favorites, rating, engagement
Customers - orders, re-orders, profile, associated products, survey, feedback, 360°
Analytics - metrics, event data, page hits, email campaigns, A/B split tests
Upsells - recommendations, price tags, strategy, discounts, vouchers
Enriched data - SKU, sentiment analysis, image parsing, object recognition
E-commerce Workloads and data models
“Where to store all this raw data?”
[Diagram labels: On-Premises Servers, Application Events, Frontend, Metrics / Logs / Streaming, BigQuery, SQL]
Analytics-as-a-Service - Data Warehouse in the Cloud
Familiar DB Structure (table, columns, views, struct, nested, JSON)
Decent pricing (storage: $20/TB, cold storage: $10/TB, queries: $5/TB) *Sep 2020
SQL:2011 + JavaScript UDFs (User Defined Functions)
BigQuery ML enables users to create machine learning models using SQL queries
Scales into Exabytes on Managed Infrastructure
Integrates with Cloud SQL + Cloud Storage + Sheets + Pub/Sub connectors
What is BigQuery?
1. Load from file - either local or from GCS (max 5TB each)
2. Streaming rows - event driven approach - high throughput 1M rows/sec
3. Functions - observer-trigger based (Google Cloud Functions)
4. Join with Cloud SQL - Ability to join with MySQL, Postgres
5. Pipelines - flexibility to do ETL - FluentD, Kafka, Google Dataflow
6. Export from connected services - Firestore, Billing, AuditLogs, Stackdriver
7. Firebase - Analytics - Messaging - Crashlytics - Perf. Monitoring - Predictions
Loading Data into BigQuery
“We have our app outside of GCP.
We need to join with our SQL database.”
Solution: EXTERNAL_QUERY
Combine on-premise with Cloud
[Architecture diagram. Components shown: App, Load Balancing (NGINX), Compute Engine instances with 10GB PD, Database Service (Master/Slave), Cloud VPN Gateway, Cloud SQL Replica (Zone 1, us-east1-a), BigQuery ("Execute combined queries"), Report]
EXTERNAL_QUERY: run a Cloud SQL query from within BigQuery
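A minimal sketch of such a combined query, assuming a Cloud SQL connection has already been registered in BigQuery. The connection ID `my-project.us.my-cloudsql-conn`, the BigQuery table `my-project.analytics.page_hits`, and the Cloud SQL table `customers` are hypothetical names used only for illustration; they do not come from the talk.

```sql
-- Join event data stored in BigQuery with live rows read from Cloud SQL.
-- Assumptions (hypothetical names): connection `my-project.us.my-cloudsql-conn`,
-- BigQuery table `my-project.analytics.page_hits`, Cloud SQL table `customers`.
SELECT
  c.customer_id,
  c.email,
  COUNT(*) AS page_hits
FROM `my-project.analytics.page_hits` AS h
JOIN EXTERNAL_QUERY(
  'my-project.us.my-cloudsql-conn',
  'SELECT customer_id, email FROM customers;'  -- runs inside Cloud SQL
) AS c
  ON c.customer_id = h.customer_id
GROUP BY c.customer_id, c.email;
```

The inner query string is executed by the external database in its own SQL dialect, so only the result set travels to BigQuery for the join.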
● SQL language to run Big Data queries for everyday devs
● run raw ad-hoc queries (either by analysts/sales or devs)
● no more throwing away, expiring, or aggregating old data
● it’s serverless
● no provisioning/deploy
● no running out of resources
● no need to focus on large-scale execution plans
Our benefits
What is BigQuery ML?
BigQuery ML
1. CREATE MODEL in SQL to increase development speed
2. Predict, recommend, forecast on tabular data with SQL
3. Automate common ML tasks and hyperparameter tuning by creating new models as easily as creating tables
● Binary or multiclass logistic regression for classification (labels can have up to 50 unique values)
● K-means clustering for data segmentation (unsupervised learning, does not require labeled training data)
● Matrix factorization for recommendations
● Time-series models for forecasting
● Imported TensorFlow models for prediction in BigQuery
● Linear regression for forecasting, e.g. the sales of an item on a given day
● Boosted Tree (XGBoost) models | Deep Neural Network (DNN) models | AutoML Tables
Supported models in BigQuery ML
Conversion/Purchase prediction. MODEL: Logistic regression
Predict whether a user "converts" or "purchases". It is in the company's interest that many users sign up for this membership, as it helps streamline their ad conversion and also helps with recurring revenue.
Customer Lifetime Value (LTV) prediction. MODEL: Logistic regression
Organisations use it to identify and prioritize the significant customer segments that would be most valuable to the company.
Customer Segmentation. MODEL: K-means clustering
Dividing a client base into groups in specific ways relevant to marketing, such as interests and spending habits. Segmentation allows marketers to better customize their efforts to various audience groups.
E-commerce Use Cases
Create a model that predicts whether a website visitor will make a transaction.
● The CREATE MODEL statement
● The ML.EVALUATE function to evaluate the ML model
● The ML.PREDICT function to make predictions using the ML model
Getting started with BigQuery ML
Create a binary logistic regression model
● CREATE MODEL syntax
● Create training dataset using a label column
● SELECT features
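The query shown on this slide was an image and is not recoverable from the transcript. As a hedged sketch of the three parts named above, the statement below follows the public BigQuery ML getting-started tutorial on the Google Analytics sample dataset. It assumes a dataset named `bqml_tutorial` already exists in your project; the model name `sample_model` is illustrative.

```sql
-- 1. CREATE MODEL syntax: a binary logistic regression model.
-- Assumption: dataset `bqml_tutorial` exists; `sample_model` is an illustrative name.
CREATE OR REPLACE MODEL `bqml_tutorial.sample_model`
OPTIONS (model_type = 'logistic_reg') AS
SELECT
  -- 2. Label column: 1 if the session contained a transaction, else 0.
  IF(totals.transactions IS NULL, 0, 1) AS label,
  -- 3. Features selected from the session record.
  IFNULL(device.operatingSystem, '') AS os,
  device.isMobile AS is_mobile,
  IFNULL(geoNetwork.country, '') AS country,
  IFNULL(totals.pageviews, 0) AS pageviews
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE _TABLE_SUFFIX BETWEEN '20160801' AND '20170630';
```

Naming the target column `label` lets BigQuery ML pick it up without an explicit label option.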
Evaluate your model
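A minimal sketch of the evaluation step, assuming a trained logistic regression model named `bqml_tutorial.sample_model` (an illustrative name) whose features are `os`, `is_mobile`, `country`, and `pageviews`. The held-out date range is an assumption chosen so that it does not overlap the training data.

```sql
-- Evaluate the model on sessions it was not trained on.
-- Assumption: `bqml_tutorial.sample_model` is a trained logistic_reg model.
SELECT *
FROM ML.EVALUATE(
  MODEL `bqml_tutorial.sample_model`,
  (
    SELECT
      IF(totals.transactions IS NULL, 0, 1) AS label,
      IFNULL(device.operatingSystem, '') AS os,
      device.isMobile AS is_mobile,
      IFNULL(geoNetwork.country, '') AS country,
      IFNULL(totals.pageviews, 0) AS pageviews
    FROM `bigquery-public-data.google_analytics_sample.ga_sessions_*`
    WHERE _TABLE_SUFFIX BETWEEN '20170701' AND '20170801'
  )
);
```

For a logistic regression model the result is a single row of classification metrics such as precision, recall, accuracy, f1_score, log_loss, and roc_auc.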
Predict
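A minimal sketch of the prediction step, again assuming a trained model named `bqml_tutorial.sample_model` (illustrative) with a label column called `label`, so that the prediction column is named `predicted_label`. It ranks countries by the number of sessions predicted to end in a purchase.

```sql
-- Predict purchases per country for one month of sessions.
-- Assumption: `bqml_tutorial.sample_model` was trained with a column named `label`.
SELECT
  country,
  SUM(predicted_label) AS total_predicted_purchases
FROM ML.PREDICT(
  MODEL `bqml_tutorial.sample_model`,
  (
    SELECT
      IFNULL(device.operatingSystem, '') AS os,
      device.isMobile AS is_mobile,
      IFNULL(geoNetwork.country, '') AS country,
      IFNULL(totals.pageviews, 0) AS pageviews
    FROM `bigquery-public-data.google_analytics_sample.ga_sessions_*`
    WHERE _TABLE_SUFFIX BETWEEN '20170701' AND '20170801'
  )
)
GROUP BY country
ORDER BY total_predicted_purchases DESC
LIMIT 10;
```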
Use cases:
● Customer segmentation
● Data quality
Options and defaults
● Number of clusters: default is log10(num_rows)
● Distance type: Euclidean (default), Cosine
● Supports all major SQL data types including GIS
K-means clustering
CREATE MODEL yourmodel
OPTIONS (model_type = 'kmeans')
AS SELECT..
FROM
ML.PREDICT maps rows to closest clusters
ML.CENTROIDS for cluster centroids
ML.EVALUATE
ML.TRAINING_INFO
ML.FEATURE_INFO
Available data:
● Encode yes/no features
(e.g. has a microwave, has a kitchen, has a TV, has a bathroom)
● Can apply clustering on the encoded data
K-means clustering: Problem definition
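A minimal sketch of clustering on encoded yes/no features. The table `mydataset.listings`, its boolean columns, and the model name `mydataset.listing_clusters` are hypothetical; the choice of 4 clusters is an arbitrary assumption.

```sql
-- Train a k-means model on yes/no features encoded as 0/1.
-- Assumptions (hypothetical): table `mydataset.listings` with BOOL columns
-- has_microwave, has_kitchen, has_tv, has_bathroom.
CREATE OR REPLACE MODEL `mydataset.listing_clusters`
OPTIONS (
  model_type = 'kmeans',
  num_clusters = 4,            -- arbitrary; omit to use the default
  distance_type = 'EUCLIDEAN'  -- or 'COSINE'
) AS
SELECT
  IF(has_microwave, 1, 0) AS has_microwave,
  IF(has_kitchen, 1, 0) AS has_kitchen,
  IF(has_tv, 1, 0) AS has_tv,
  IF(has_bathroom, 1, 0) AS has_bathroom
FROM `mydataset.listings`;

-- Inspect what each cluster looks like.
SELECT centroid_id, feature, numerical_value
FROM ML.CENTROIDS(MODEL `mydataset.listing_clusters`)
ORDER BY centroid_id, feature;
```

Reading the centroid values per feature gives a quick profile of each segment, for example a cluster where nearly every listing has a kitchen but no TV.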
Premise
We can identify oddities
(potential data quality issues)
by grouping things together
and separating outliers.
K-means clustering: Problem definition
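One way to act on this premise is to rank rows by their distance to the nearest cluster centroid and review the farthest ones. This is a sketch, not the talk's actual query: it assumes a trained k-means model `mydataset.listing_clusters` and a table `mydataset.listings` with a `listing_id` column (all hypothetical names), and it relies on ML.PREDICT passing extra input columns through to its output.

```sql
-- Surface potential data quality issues: rows far from every cluster.
-- Assumptions (hypothetical): model `mydataset.listing_clusters`,
-- table `mydataset.listings` with a `listing_id` key column.
WITH assigned AS (
  SELECT
    listing_id,
    centroid_id,
    -- Smallest distance among the reported centroids.
    (SELECT MIN(d.distance)
     FROM UNNEST(nearest_centroids_distance) AS d) AS distance_to_centroid
  FROM ML.PREDICT(
    MODEL `mydataset.listing_clusters`,
    (
      SELECT
        listing_id,
        IF(has_microwave, 1, 0) AS has_microwave,
        IF(has_kitchen, 1, 0) AS has_kitchen,
        IF(has_tv, 1, 0) AS has_tv,
        IF(has_bathroom, 1, 0) AS has_bathroom
      FROM `mydataset.listings`
    )
  )
)
SELECT listing_id, centroid_id, distance_to_centroid
FROM assigned
ORDER BY distance_to_centroid DESC
LIMIT 20;
```

The cut-off of 20 rows is arbitrary; in practice a distance threshold or a percentile would be tuned per dataset.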
Use cases:
● Product recommendation
● Marketing campaign target optimization tool
Options and defaults
● Input: User, Item, Rating
● Can use L2 regularization
● Specify training-test split (default random 80-20)
Matrix Factorization
CREATE MODEL yourmodel
OPTIONS (model_type = 'matrix_factorization')
AS SELECT..
FROM
ML.RECOMMEND for full user-item matrix
ML.EVALUATE
ML.WEIGHTS
ML.TRAINING_INFO
ML.FEATURE_INFO
Available data:
● User
● Item
● Rating
Problem
● assigning values for previously unknown values
(zeros in our case)
Matrix Factorization: Problem definition
BigQuery ML - Matrix Factorization
CREATE MODEL wr_temp.purchases_mf_model
OPTIONS (model_type = 'matrix_factorization')
AS
SELECT user, item, rating FROM `wr_temp.purchases`;
SELECT * FROM
ML.RECOMMEND(MODEL wr_temp.purchases_mf_model);
Step 1
Create a model from a dataset.
Step 2
To view the rating associated with a
given user-item pair, use
ML.RECOMMEND with the model name.
The output will return a rating
for each user-item pair.
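Because ML.RECOMMEND returns every user-item pair, a common follow-up is to keep only the best few items per user. The sketch below assumes the model was trained on explicit ratings from a column named `rating`, in which case the prediction column should be called `predicted_rating`; the limit of 5 items is an arbitrary choice.

```sql
-- Top 5 recommended items per user.
-- Assumption: explicit-feedback model, so the output column is `predicted_rating`.
SELECT user, item, predicted_rating
FROM (
  SELECT
    user,
    item,
    predicted_rating,
    ROW_NUMBER() OVER (
      PARTITION BY user
      ORDER BY predicted_rating DESC
    ) AS rank_in_user
  FROM ML.RECOMMEND(MODEL wr_temp.purchases_mf_model)
)
WHERE rank_in_user <= 5
ORDER BY user, rank_in_user;
```

To recommend only new items, the result could additionally be anti-joined against the purchases the user has already made.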
Use cases:
● All sorts of time-series forecasting
● Marketing campaign target optimization tool
Options and defaults
● Holiday effects adjustments by Region
● Seasonal and trend decomposition
● Auto data frequency detection
Time Series forecasting with ARIMA model
CREATE MODEL yourmodel
OPTIONS (model_type = 'ARIMA')
AS SELECT..
ML.FORECAST to be used with HORIZON
ML.EVALUATE
ML.ARIMA_COEFFICIENTS
Available data:
● Past Timestamp
● Past Value
Problem
● Forecasts for next X slots (called horizon)
Time Series forecasting with ARIMA model
SELECT forecast_timestamp, forecast_value FROM
ML.FORECAST(MODEL bqml_tutorial.nyc_citibike_arima_model,
STRUCT(300 AS horizon, 0.8 AS confidence_level))
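The forecast above needs a trained model. A hedged sketch of how `bqml_tutorial.nyc_citibike_arima_model` could be created is shown below, following the public BigQuery ML time-series tutorial. It assumes the dataset `bqml_tutorial` exists and uses the model type name `ARIMA` as on the slide; newer BigQuery releases name this model type `ARIMA_PLUS`.

```sql
-- Train a time-series model on daily Citi Bike trip counts.
-- Assumption: dataset `bqml_tutorial` exists; 'ARIMA' is the 2020-era type name.
CREATE OR REPLACE MODEL `bqml_tutorial.nyc_citibike_arima_model`
OPTIONS (
  model_type = 'ARIMA',
  time_series_timestamp_col = 'date',    -- the "past timestamp" input
  time_series_data_col = 'num_trips'     -- the "past value" input
) AS
SELECT
  EXTRACT(DATE FROM starttime) AS date,
  COUNT(*) AS num_trips
FROM `bigquery-public-data.new_york.citibike_trips`
GROUP BY date;
```

Only two input columns are required, matching the "Past Timestamp" and "Past Value" data listed for this problem.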
Use cases:
● Easily add TensorFlow predictions to BigQuery
● Build unstructured data models in TensorFlow,
predict in BigQuery
Key restrictions
● Model size limit of 250MB
Import TensorFlow models for prediction
CREATE MODEL yourmodel
OPTIONS (model_type = 'tensorflow',
model_path = 'gs://')
ML.PREDICT()
DEMO
Search 'QueryIt Smart' on GitHub to learn more.
Google Drive - Colaboratory - Jupyter Notebook
New on BigQuery UI - Evaluation charts
Conclusions
Automation
● Run the process daily
● Determine hyperparameters
● Surface the results and route them somewhere for inspection and improvement
Testing
● A/B test around the impact of data quality on conversion and customer NPS (Net Promoter Score)
Improvements
● Determine and explore outliers
● Repeat, automate
Considerations
● Democratizes the use of ML by empowering data analysts to build and run models using existing
business intelligence tools and spreadsheets
● Generalist team. Models are trained using SQL. There is no need to program an ML solution using
Python or Java.
● Increases the innovation and speed of model development by removing the need to export data from
the data warehouse.
● A Model serves a purpose. Easy to change/recycle.
Benefits of BigQuery ML
The possibilities are endless
Marketing
● Predict customer value
● Predict funnel conversion
● Personalize ads, email, webpage content
Retail
● Optimize inventory
● Forecast revenue
● Enable product recommendations
● Optimize staff promotions
Industrial and IoT
● Forecast demand for parking, traffic utilities, personnel
● Prevent equipment downtime
● Predict maintenance needs
Media/gaming
● Personalize content
● Predict game difficulty
● Predict player lifetime value
Thank you.
Slides available on:
slideshare.net/martonkodok
Reea.net - Integrated web solutions driven by creativity
to deliver projects.
