SlideShare a Scribd company logo
1 of 59
Download to read offline
Casting the Spell:
Druid in Practice
October 2020
Yakir Buskilla, Nielsen
Itai Yaffe, Imply
1
@ItaiYaffe, @yakiro
@ItaiYaffe, @yakiro
… …
@ItaiYaffe, @yakiro
… …
@ItaiYaffe, @yakiro
… …
Casting the Spell:
Druid in Practice
October 2020
Yakir Buskilla, Nielsen
Itai Yaffe, Imply
6
@ItaiYaffe, @yakiro
●
●
●
●
●
@ItaiYaffe, @yakiro
01. 02.
03. 04.
@ItaiYaffe, @yakiro
01. 02.
03. 04.
@ItaiYaffe, @yakiro
●
●
●
○
○
@ItaiYaffe, @yakiro
>10B events/day >30TB/day
S3
3000’s nodes/day
10’s of TB
ingested/day
druid
@ItaiYaffe, @yakiro
@ItaiYaffe, @yakiro
Awareness Consideration Intent Purchase
Drop-off
Drop-off
Drop-off
@ItaiYaffe, @yakiro
@ItaiYaffe, @yakiro
●
@ItaiYaffe, @yakiro
●
●
●
●
@ItaiYaffe, @yakiro
●
●
○
●
○
○
●
@ItaiYaffe, @yakiro
@ItaiYaffe, @yakiro
@ItaiYaffe, @yakiro
@ItaiYaffe, @yakiro
@ItaiYaffe, @yakiro
@ItaiYaffe, @yakiro
@ItaiYaffe, @yakiro
@ItaiYaffe, @yakiro
@ItaiYaffe, @yakiro
@ItaiYaffe, @yakiro
●
○
●
●
@ItaiYaffe, @yakiro
@ItaiYaffe, @yakiro
●
○
●
○
@ItaiYaffe, @yakiro
●
○
●
○
@ItaiYaffe, @yakiro
●
○
○
●
@ItaiYaffe, @yakiro
●
○
●
○
○
○
@ItaiYaffe, @yakiro
●
○
○
@ItaiYaffe, @yakiro
●
●
○
○
@ItaiYaffe, @yakiro
@ItaiYaffe, @yakiro
●
○
○
●
○
●
○
@ItaiYaffe, @yakiro
●
○
○
●
○
@ItaiYaffe, @yakiro
●
●
@ItaiYaffe, @yakiro
●
●
●
@ItaiYaffe, @yakiro
●
●
●
@ItaiYaffe, @yakiro
@ItaiYaffe, @yakiro
@ItaiYaffe, @yakiro
●
○
○
●
@ItaiYaffe, @yakiro
●
●
●
@ItaiYaffe, @yakiro
@ItaiYaffe, @yakiro
●
●
@ItaiYaffe, @yakiro
●
●
@ItaiYaffe, @yakiro
● …
●
○
○
○
@ItaiYaffe, @yakiro
●
○
○
■
■
@ItaiYaffe, @yakiro
●
○ SELECT APPROX_COUNT_DISTINCT_DS_THETA(user_id_sketch)
FROM campaign_1012
WHERE tactic = 1 AND
__time BETWEEN TIMESTAMP '2018-02-01' AND TIMESTAMP '2020-09-08'
@ItaiYaffe, @yakiro
●
○ SELECT APPROX_COUNT_DISTINCT_DS_THETA(user_id_sketch)
FROM campaign_1012
WHERE tactic = 1 AND
__time BETWEEN TIMESTAMP '2018-02-01' AND TIMESTAMP '2020-09-08'
○ SELECT APPROX_COUNT_DISTINCT_DS_THETA(user_id_sketch,65536)...
@ItaiYaffe, @yakiro
●
○ SELECT APPROX_COUNT_DISTINCT_DS_THETA(user_id_sketch)
FROM campaign_1012
WHERE tactic = 1 AND
__time BETWEEN TIMESTAMP '2018-02-01' AND TIMESTAMP '2020-09-08'
○ SELECT APPROX_COUNT_DISTINCT_DS_THETA(user_id_sketch,65536)...
○ APPROX_COUNT_DISTINCT_DS_THETA(expr, [size])
@ItaiYaffe, @yakiro
●
●
●
○
@ItaiYaffe, @yakiro
●
○
○
○
●
○
○
○
○
@ItaiYaffe, @yakiro
●
○
■
○
■
○
@ItaiYaffe, @yakiro
…
@ItaiYaffe, @yakiro
●
○
■
○
○
●
●
○
Time for questions
@ItaiYaffe
@yakiro
58
Thank you!
Apache Druid is an independent project of The Apache Software Foundation. More information can be found at https://druid.apache.org.
Apache Druid, Druid, and the Druid logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.
Dates: TBD
imply.io
59
Register Now for
the Next Druid
Virtual Summit

More Related Content

More from Imply

Apache Druid®: A Dance of Distributed Processes
 Apache Druid®: A Dance of Distributed Processes Apache Druid®: A Dance of Distributed Processes
Apache Druid®: A Dance of Distributed ProcessesImply
 
Splunk: Druid on Kubernetes with Druid-operator
Splunk: Druid on Kubernetes with Druid-operatorSplunk: Druid on Kubernetes with Druid-operator
Splunk: Druid on Kubernetes with Druid-operatorImply
 
Zeotap: Data Modeling in Druid for Non temporal and Nested Data
Zeotap: Data Modeling in Druid for Non temporal and Nested DataZeotap: Data Modeling in Druid for Non temporal and Nested Data
Zeotap: Data Modeling in Druid for Non temporal and Nested DataImply
 
Archmage, Pinterest’s Real-time Analytics Platform on Druid
Archmage, Pinterest’s Real-time Analytics Platform on DruidArchmage, Pinterest’s Real-time Analytics Platform on Druid
Archmage, Pinterest’s Real-time Analytics Platform on DruidImply
 
Building a Real-Time Gaming Analytics Service with Apache Druid
Building a Real-Time Gaming Analytics Service with Apache DruidBuilding a Real-Time Gaming Analytics Service with Apache Druid
Building a Real-Time Gaming Analytics Service with Apache DruidImply
 
Building Data Applications with Apache Druid
Building Data Applications with Apache DruidBuilding Data Applications with Apache Druid
Building Data Applications with Apache DruidImply
 
Maximizing Apache Druid performance: Beyond the basics
Maximizing Apache Druid performance: Beyond the basicsMaximizing Apache Druid performance: Beyond the basics
Maximizing Apache Druid performance: Beyond the basicsImply
 
How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experi...
How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experi...How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experi...
How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experi...Imply
 
Building an Enterprise-Scale Dashboarding/Analytics Platform Powered by the C...
Building an Enterprise-Scale Dashboarding/Analytics Platform Powered by the C...Building an Enterprise-Scale Dashboarding/Analytics Platform Powered by the C...
Building an Enterprise-Scale Dashboarding/Analytics Platform Powered by the C...Imply
 
How TrafficGuard uses Druid to Fight Ad Fraud and Bots
How TrafficGuard uses Druid to Fight Ad Fraud and BotsHow TrafficGuard uses Druid to Fight Ad Fraud and Bots
How TrafficGuard uses Druid to Fight Ad Fraud and BotsImply
 
Self Service Analytics at Twitch
Self Service Analytics at TwitchSelf Service Analytics at Twitch
Self Service Analytics at TwitchImply
 
Apache Druid: Lightning Fast Analytics on Real-time and Historical Data (Atla...
Apache Druid: Lightning Fast Analytics on Real-time and Historical Data (Atla...Apache Druid: Lightning Fast Analytics on Real-time and Historical Data (Atla...
Apache Druid: Lightning Fast Analytics on Real-time and Historical Data (Atla...Imply
 
August meetup - All about Apache Druid
August meetup - All about Apache Druid August meetup - All about Apache Druid
August meetup - All about Apache Druid Imply
 
Benchmarking Apache Druid
Benchmarking Apache DruidBenchmarking Apache Druid
Benchmarking Apache DruidImply
 
Druid: Under the Covers (Virtual Meetup)
Druid: Under the Covers (Virtual Meetup)Druid: Under the Covers (Virtual Meetup)
Druid: Under the Covers (Virtual Meetup)Imply
 
Why data warehouses cannot support hot analytics
Why data warehouses cannot support hot analyticsWhy data warehouses cannot support hot analytics
Why data warehouses cannot support hot analyticsImply
 
What’s New in Imply 3.3 & Apache Druid 0.18
What’s New in Imply 3.3 & Apache Druid 0.18What’s New in Imply 3.3 & Apache Druid 0.18
What’s New in Imply 3.3 & Apache Druid 0.18Imply
 
Apache Druid Vision and Roadmap
Apache Druid Vision and RoadmapApache Druid Vision and Roadmap
Apache Druid Vision and RoadmapImply
 
Analytics over Terabytes of Data at Twitter
Analytics over Terabytes of Data at TwitterAnalytics over Terabytes of Data at Twitter
Analytics over Terabytes of Data at TwitterImply
 

More from Imply (19)

Apache Druid®: A Dance of Distributed Processes
 Apache Druid®: A Dance of Distributed Processes Apache Druid®: A Dance of Distributed Processes
Apache Druid®: A Dance of Distributed Processes
 
Splunk: Druid on Kubernetes with Druid-operator
Splunk: Druid on Kubernetes with Druid-operatorSplunk: Druid on Kubernetes with Druid-operator
Splunk: Druid on Kubernetes with Druid-operator
 
Zeotap: Data Modeling in Druid for Non temporal and Nested Data
Zeotap: Data Modeling in Druid for Non temporal and Nested DataZeotap: Data Modeling in Druid for Non temporal and Nested Data
Zeotap: Data Modeling in Druid for Non temporal and Nested Data
 
Archmage, Pinterest’s Real-time Analytics Platform on Druid
Archmage, Pinterest’s Real-time Analytics Platform on DruidArchmage, Pinterest’s Real-time Analytics Platform on Druid
Archmage, Pinterest’s Real-time Analytics Platform on Druid
 
Building a Real-Time Gaming Analytics Service with Apache Druid
Building a Real-Time Gaming Analytics Service with Apache DruidBuilding a Real-Time Gaming Analytics Service with Apache Druid
Building a Real-Time Gaming Analytics Service with Apache Druid
 
Building Data Applications with Apache Druid
Building Data Applications with Apache DruidBuilding Data Applications with Apache Druid
Building Data Applications with Apache Druid
 
Maximizing Apache Druid performance: Beyond the basics
Maximizing Apache Druid performance: Beyond the basicsMaximizing Apache Druid performance: Beyond the basics
Maximizing Apache Druid performance: Beyond the basics
 
How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experi...
How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experi...How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experi...
How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experi...
 
Building an Enterprise-Scale Dashboarding/Analytics Platform Powered by the C...
Building an Enterprise-Scale Dashboarding/Analytics Platform Powered by the C...Building an Enterprise-Scale Dashboarding/Analytics Platform Powered by the C...
Building an Enterprise-Scale Dashboarding/Analytics Platform Powered by the C...
 
How TrafficGuard uses Druid to Fight Ad Fraud and Bots
How TrafficGuard uses Druid to Fight Ad Fraud and BotsHow TrafficGuard uses Druid to Fight Ad Fraud and Bots
How TrafficGuard uses Druid to Fight Ad Fraud and Bots
 
Self Service Analytics at Twitch
Self Service Analytics at TwitchSelf Service Analytics at Twitch
Self Service Analytics at Twitch
 
Apache Druid: Lightning Fast Analytics on Real-time and Historical Data (Atla...
Apache Druid: Lightning Fast Analytics on Real-time and Historical Data (Atla...Apache Druid: Lightning Fast Analytics on Real-time and Historical Data (Atla...
Apache Druid: Lightning Fast Analytics on Real-time and Historical Data (Atla...
 
August meetup - All about Apache Druid
August meetup - All about Apache Druid August meetup - All about Apache Druid
August meetup - All about Apache Druid
 
Benchmarking Apache Druid
Benchmarking Apache DruidBenchmarking Apache Druid
Benchmarking Apache Druid
 
Druid: Under the Covers (Virtual Meetup)
Druid: Under the Covers (Virtual Meetup)Druid: Under the Covers (Virtual Meetup)
Druid: Under the Covers (Virtual Meetup)
 
Why data warehouses cannot support hot analytics
Why data warehouses cannot support hot analyticsWhy data warehouses cannot support hot analytics
Why data warehouses cannot support hot analytics
 
What’s New in Imply 3.3 & Apache Druid 0.18
What’s New in Imply 3.3 & Apache Druid 0.18What’s New in Imply 3.3 & Apache Druid 0.18
What’s New in Imply 3.3 & Apache Druid 0.18
 
Apache Druid Vision and Roadmap
Apache Druid Vision and RoadmapApache Druid Vision and Roadmap
Apache Druid Vision and Roadmap
 
Analytics over Terabytes of Data at Twitter
Analytics over Terabytes of Data at TwitterAnalytics over Terabytes of Data at Twitter
Analytics over Terabytes of Data at Twitter
 

Recently uploaded

Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 

Recently uploaded (20)

Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 

Nielsen: Casting the Spell - Druid in Practice