Yannick Dawant & Vinh Nguyen
Moving from MySQL to Elasticsearch for Analytics
— What is Analytics, and why is it important to Percolate?
— Analytics 1.0 - MySQL
— Analytics 2.0 - Elasticsearch
— Next Steps
Agenda
The System of Record for Marketing
What does Analytics mean to Percolate?

How does it work?
Analytics 1.0 - Design
[Diagram. Components: Facebook, Twitter, Instagram, LinkedIn, […]; Crawlers; MySQL; API; UI. Label: metrics.]
MySQL Data Model

[post]
post_id  service_id  tag            created_at
1        1           blog           2016-01-01 10:11:15
2        1           blog, video    2016-01-01 12:12:30
3        2           election 2016  2016-01-01 10:10:57

[service]
service_id  name
1           facebook
2           twitter
3           instagram

[metric_names]
metric_id  service_id  name
1          1           likes
2          1           comments
3          1           follows
4          2           follows
5          2           mentions
6          2           retweets

[post_metrics]
post_id  metric_id  metric_value  captured_at
1        1          10            2016-01-01 10:11:15
1        1          20            2016-01-01 12:12:30
2        2          5             2016-01-01 10:10:57
2        2          10            2016-01-01 13:12:20
3        1          15            2016-01-01 13:12:45
3        2          30            2016-01-01 17:05:11
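To make the relationships between the four tables concrete, here is a minimal sketch that loads the sample rows from the slide into an in-memory SQLite database and reads back the latest captured value of each metric per post. SQLite stands in for MySQL only so the example is self-contained; the query and the names `build_db` and `latest_metrics` are illustrative assumptions, not Percolate's actual code.

```python
import sqlite3

# SQLite stands in for MySQL; table and column names follow the slide.
SCHEMA = """
CREATE TABLE service (service_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE post (post_id INTEGER PRIMARY KEY, service_id INTEGER,
                   tag TEXT, created_at TEXT);
CREATE TABLE metric_names (metric_id INTEGER PRIMARY KEY, service_id INTEGER,
                           name TEXT);
CREATE TABLE post_metrics (post_id INTEGER, metric_id INTEGER,
                           metric_value INTEGER, captured_at TEXT);
"""

# Hypothetical query: latest value of every metric for every post.
LATEST_METRICS_SQL = """
SELECT p.post_id, s.name, mn.name, pm.metric_value
FROM post_metrics pm
JOIN post p ON p.post_id = pm.post_id
JOIN service s ON s.service_id = p.service_id
JOIN metric_names mn ON mn.metric_id = pm.metric_id
WHERE pm.captured_at = (
    SELECT MAX(x.captured_at) FROM post_metrics x
    WHERE x.post_id = pm.post_id AND x.metric_id = pm.metric_id
)
ORDER BY p.post_id, mn.name
"""


def build_db():
    """Create the four tables and insert the sample rows shown on the slide."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO service VALUES (?, ?)",
                     [(1, "facebook"), (2, "twitter"), (3, "instagram")])
    conn.executemany("INSERT INTO post VALUES (?, ?, ?, ?)", [
        (1, 1, "blog", "2016-01-01 10:11:15"),
        (2, 1, "blog, video", "2016-01-01 12:12:30"),
        (3, 2, "election 2016", "2016-01-01 10:10:57"),
    ])
    conn.executemany("INSERT INTO metric_names VALUES (?, ?, ?)", [
        (1, 1, "likes"), (2, 1, "comments"), (3, 1, "follows"),
        (4, 2, "follows"), (5, 2, "mentions"), (6, 2, "retweets"),
    ])
    conn.executemany("INSERT INTO post_metrics VALUES (?, ?, ?, ?)", [
        (1, 1, 10, "2016-01-01 10:11:15"),
        (1, 1, 20, "2016-01-01 12:12:30"),
        (2, 2, 5, "2016-01-01 10:10:57"),
        (2, 2, 10, "2016-01-01 13:12:20"),
        (3, 1, 15, "2016-01-01 13:12:45"),
        (3, 2, 30, "2016-01-01 17:05:11"),
    ])
    return conn


def latest_metrics(conn):
    """Return (post_id, service, metric, value) rows, one per post and metric."""
    return conn.execute(LATEST_METRICS_SQL).fetchall()
```

Even this small read needs three joins and a correlated subquery, and every row it touches lives in one ever-growing `post_metrics` table.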
— Relational data models
— Very well known pattern
— Application-level objects map cleanly to DB tables
— Joins are easy to do
— Easy to use
— Amazon RDS for managed hosting/deployment/monitoring
— Very familiar to Ops team and other developers, shared knowledge base
— Lots of support available online
— Met product requirements
Why MySQL?
Seems reasonable.

What are the tradeoffs?
— Data Modeling Issues
— Starts easy but becomes complex over time (increasing number of tables)
— Schema inflexibility (dynamic changes, unused columns)
— Hard to modify live schemas, may require downtime
— Slow Queries
— Lots of joins at query time
— Tables grow larger and larger over time
— Hard to partition Time series data
— Expensive post-processing on application side
MySQL Tradeoffs
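As an illustration of the application-side post-processing mentioned above: a normalized metrics table returns one row per metric capture, so the application has to pivot those rows into one record per post. The sketch below is a hypothetical helper (`pivot_metrics` and the row layout are assumptions made for this example) that keeps only the most recent value of each metric.

```python
from collections import defaultdict


def pivot_metrics(rows):
    """Pivot (post_id, metric_name, value, captured_at) rows into
    {post_id: {metric_name: latest_value}}.

    Timestamps are assumed to be 'YYYY-MM-DD HH:MM:SS' strings, which
    compare correctly as plain strings.
    """
    latest = {}
    for post_id, name, value, captured_at in rows:
        key = (post_id, name)
        # Keep the value with the most recent capture time for each metric.
        if key not in latest or captured_at > latest[key][0]:
            latest[key] = (captured_at, value)

    posts = defaultdict(dict)
    for (post_id, name), (_, value) in latest.items():
        posts[post_id][name] = value
    return dict(posts)
```

This work grows with the number of rows returned, which is one reason a denormalized document per capture becomes attractive.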
— Scalability Issues
— Database grows larger and larger over time
— Scaling is mostly vertical (add more CPU/RAM/disk to same node), may require downtime
— Hard to scale horizontally
— Not suitable for our Search needs
MySQL Tradeoffs
Where do we go from here?
Analytics 1.0 - Design
[Diagram. Components: Facebook, Twitter, Instagram, LinkedIn, […]; Crawlers; MySQL; API; UI. Label: metrics.]
Analytics 2.0 - Design
[Diagram. Components: Facebook, Twitter, Instagram, LinkedIn, […]; Crawlers; Kafka; Data Transformation; Elasticsearch; MySQL; API; UI. Label: metrics.]
— Decouples data collection from storage
— Enhances reliability of our data pipelines
— Message queue persistence, replay
— Enhances horizontal scalability of our data pipelines
— Multiple brokers, parallel consumers/producers
Why Kafka?
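The decoupling and replay points can be pictured with a toy model. The class below is not Kafka client code; it is a hypothetical in-memory append-only log (`ToyLog` is a name invented for this sketch) in which producers only append and each consumer group tracks its own read offset, which is the property that lets a consumer rewind and reprocess messages after a failure.

```python
class ToyLog:
    """Toy stand-in for a single topic partition: an append-only list
    plus one read offset per consumer group."""

    def __init__(self):
        self._messages = []
        self._offsets = {}

    def produce(self, message):
        """Append a message and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def consume(self, group, max_messages=10):
        """Return the next batch for this group and advance its offset."""
        start = self._offsets.get(group, 0)
        batch = self._messages[start:start + max_messages]
        self._offsets[group] = start + len(batch)
        return batch

    def seek(self, group, offset):
        """Rewind (or skip ahead) so the group re-reads from `offset`."""
        self._offsets[group] = offset
```

Because messages stay in the log after they are read, a crawler can keep producing while a storage writer is down, and the writer can later replay from its last good offset.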
— Applies data transformation rules
— Validation, enrichment, denormalization, rollups
— Writes data to various indexes in ES
— Error handling
— Network issues, ES load/timeout issues, mapping conflicts
— Multiple workers to increase overall throughput
— Real time and asynchronous workers
Data Transformation
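A minimal sketch of what one transformation step might look like, covering validation and denormalization only. The shape of the incoming event, the `SERVICES` lookup, and the function name `transform` are assumptions made for illustration; the output fields mirror the document shown on the Elasticsearch data model slide.

```python
# Lookup copied from the slide's [service] table; used to denormalize
# service_id into a service name.
SERVICES = {1: "facebook", 2: "twitter", 3: "instagram"}

REQUIRED_FIELDS = ("post_id", "service_id", "captured_at", "metrics")


def transform(event):
    """Validate a raw crawler event (hypothetical shape) and return a flat,
    denormalized document ready for indexing."""
    missing = [field for field in REQUIRED_FIELDS if field not in event]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if event["service_id"] not in SERVICES:
        raise ValueError(f"unknown service_id: {event['service_id']}")

    metrics = {}
    for name, value in event["metrics"].items():
        # Reject booleans explicitly because bool is a subclass of int.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"invalid value for metric {name!r}: {value!r}")
        metrics[name] = value

    return {
        "post_id": str(event["post_id"]),
        "service": SERVICES[event["service_id"]],
        "captured_at": event["captured_at"],
        "metrics": metrics,
        "tags": list(event.get("tags", [])),
    }
```

Raising on bad input keeps the error-handling decision (retry, skip, or park the message) with the worker that calls `transform`.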
{
  "_index": "analytics_2016-11-01",
  "_type": "post",
  "_id": "f6065582-a2d7-11e6-bee7-22000ae51cc9",
  "post_id": "19398339",
  "service": "facebook",
  "captured_at": "2016-10-31T20:32:17+00:00",
  "metrics": {
    "comments": 13,
    "consumptions": 132,
    "engaged": 24,
    "impressions": 132,
    "likes": 50,
    "negative_feedback": 5,
    "reach": 93,
    "shares": 76,
    "video_views": 42
  },
  "tags": ["blog", "video"]
}
Elasticsearch Data Model
— Document based datastore
— Flexible schemas, dynamic mapping, mapping templates
— JSON, rich data structures, nested objects
— REST APIs make integration simple
— Query performance
— Shards spread across nodes (versus entire MySQL DB/table on single node)
— Rolling indexes for Time series data == querying only the indexes needed (versus entire MySQL table)
Why Elasticsearch?
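The rolling-index point can be sketched as follows. The `analytics_YYYY-MM-DD` naming pattern is taken from the sample document; the assumption that there is exactly one index per calendar day, and the helper name `indexes_for_range`, are hypothetical.

```python
from datetime import date, timedelta


def indexes_for_range(start, end, prefix="analytics_"):
    """List the daily index names covering start..end inclusive.

    Assumes one index per day named '<prefix>YYYY-MM-DD'.
    """
    if end < start:
        raise ValueError("end must not be before start")
    days = (end - start).days
    return [f"{prefix}{(start + timedelta(days=i)).isoformat()}"
            for i in range(days + 1)]
```

Elasticsearch search requests accept a comma-separated list of index names, so `",".join(indexes_for_range(...))` restricts a query to just the days it needs.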
— Search
— Rich set of built-in queries
— Powerful aggregations (and sub aggregations)
— Scalability
— More control over shards and indexes
— Horizontally scale by adding more nodes and clusters
— Easy to archive old data/indexes to free up resources
— Meets current and *new* product requirements
Why Elasticsearch?
Seems reasonable.

What are the tradeoffs?
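To show what an aggregation with a sub-aggregation looks like, the sketch below builds a search request body as a plain Python dict: a `terms` bucket per service, with a `max` of `metrics.likes` inside each bucket. Field names come from the sample document; the function name is hypothetical, and the example assumes `service` is mapped as an exact-value (not analyzed) field so it can be bucketed.

```python
import json


def max_likes_by_service_query(start, end):
    """Build a search body: bucket documents by service, then take the
    highest captured likes value within each bucket."""
    return {
        "size": 0,  # only aggregation results are needed, not the hits
        "query": {
            "range": {"captured_at": {"gte": start, "lte": end}}
        },
        "aggs": {
            "by_service": {
                "terms": {"field": "service"},
                "aggs": {
                    "max_likes": {"max": {"field": "metrics.likes"}}
                },
            }
        },
    }


if __name__ == "__main__":
    body = max_likes_by_service_query("2016-10-01", "2016-10-31")
    print(json.dumps(body, indent=2))
```

The equivalent in the relational model would be a join across several tables followed by a GROUP BY; here it is a single request against denormalized documents.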
— Data updates are more complex
— Update by query, upserts, script security issues
— Not truly schema-less
— Reindexing is time consuming
— Adding fields, mapping conflicts
— Still need a custom index management layer
— Index mappings, settings, templates, naming patterns, data retention, backup/restore
— Operating ES requires effort
— Deployment, configuration, performance tuning, monitoring
Elasticsearch Tradeoffs
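As an illustration of why updates are more involved, here is a sketch of a request body for the `_update_by_query` API that adds a tag to every captured document for one post. It assumes a version of Elasticsearch with the Painless scripting language and the `source` script key (older releases used `inline` and other script languages); the function name is hypothetical and the field names come from the sample document.

```python
def add_tag_body(post_id, tag):
    """Build an _update_by_query body that appends `tag` to the tags array
    of every document for `post_id`, skipping documents that already have it."""
    return {
        "query": {"term": {"post_id": post_id}},
        "script": {
            "lang": "painless",
            # The tag is passed through params rather than concatenated into
            # the script text, which keeps user input out of executable code.
            "source": (
                "if (!ctx._source.tags.contains(params.tag)) "
                "{ ctx._source.tags.add(params.tag) }"
            ),
            "params": {"tag": tag},
        },
    }
```

Compared with a single SQL UPDATE on one row, this touches every matching document across the relevant indexes.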
— More index management
— Better support for different types of indexes, each with own settings
— Add APIs + Tools for operations
— Avoid oversharding, which causes cluster stability issues
— More focus on UPDATE operations
— Field updates (e.g. tags) require update by query/script
— Faster reindexing (e.g. adding new fields, changing field mappings)
— Slow updates/reindexing can affect other system operations/transactions
— Data denormalization vs joins
— More production monitoring
Next Steps
https://percolate.com/careers/
We’re Hiring!
