SlideShare a Scribd company logo
1 of 15
Download to read offline
1
Dr. Stephen Dodson, Tech Lead Machine Learning,
Elastic
Machine Learning and
the Elastic Stack
2
Overview
•  Background
•  Machine Learning Overview
•  Machine Learning and the Elastic Stack
•  Demo
•  Architecture
Background
•  Me
–  Currently, Tech Lead, Machine Learning @ Elastic
–  Formally, Founder and CTO of Prelert (acquired by Elastic
September 2016)
‒  Presented overview of Prelert at Elastic London User Group in
May 2016
•  Prelert
–  VC backed software company, founded 2009
–  Behavioural analytics for machine data based (mainly) on
unsupervised machine learning
–  100+ customers + OEMs with CA, Bluecoat, NetApp + others
‒  IT Operations, IT Security, Retail analytics, IoT etc..
4
Machine Learning
•  Algorithms and methods for data driven prediction, decision making, and
modelling1
‒  Learn models from past behaviour (training, modelling)
‒  Use models to predict future behaviour (prediction)
‒  Use predictions to make decisions
•  Examples
‒  Image Recognition
‒  Language Translation
‒  Anomaly Detection
1Machine Learning Overview, Tommi Jaakkola, MIT
5
How is this relevant to the Elastic Stack?
•  Extracting useful, valuable information is hard
Search
Aggregations
Visualization
Machine Learning
6
How is this relevant to the Elastic Stack?
•  What if we want to search for:
‒  Has my order rate dropped significantly?
‒  Do my application logs contain unusual messages?
‒  Are any users behaving unusually?
‒  What transactions are fraudulent?
•  Goal of ML at Elastic: Extend the Elastic Stack to allow the user to ask these type of
questions and get understandable answers
•  Constraints:
‒  Data may be limited: no markup may be available or relevant
‒  Compute resource dedicated to machine learning may be limited
‒  User should not need to be a machine learning expert or data scientist
7
Has my order rate dropped significantly?
8
Has my order rate dropped significantly?
•  Learn models from past
behaviour (training, modelling)
•  Use models to predict future
behaviour (prediction)
•  Use predictions to make
decisions
Expected value @ 15:05 = 1859
Actual value @ 15:05 = 280
Probability = 0.0000174025
Demo: Simple Time Series
10
Do my application logs contain unusual messages?
11
Do my application logs contain unusual messages?
Classify unstructured log messages by clustering similar messages
NormalLogMessages
UnusuallogMessages
Demo: Multiple Data Sources
13
Analytics Outside of Elastic Architecture
Beats
Logstash
Kibana
X-Pack X-Pack
Elasticsearch Prelert analysis node
Data
Kibana
Prelert UI
•  Issues
–  Data Gravity – data from Elasticsearch needs to be sent to Prelert analytics node
–  Context – anomalies and data are stored in different data stores and viewed in different Uis
–  Scale – Prelert analysis was not easily distributable across nodes
–  Resilience – Prelert analysis needed to be restored manually on failover
14
Architecture
•  Machine Learning will be part of X-Pack
•  Machine Learning jobs will be automatically distributed across
the Elasticsearch cluster
•  Machine Learning jobs will be resilient to failover
•  Machine Learning results and data can be in the same cluster
Beats
Logstash
Kibana
X-Pack X-Pack
Elasticsearch
Security
Alerting
Monitoring
Reporting
Graph
Machine LearningICON
TBD!!
X-Pack
15
Status
•  Demo on Elastic 5.4 available at Elastic{ON} (March 7th 2017)
•  GA shortly after… (ask Sophie!)
•  Focus of initial ML product is time series analysis in real-time
‒  Metric anomaly detection
‒  Log message classification and anomaly detection
‒  Population analysis (entity profiling)
•  Shrink-wrapped configurations on Beats data - full Elastic Stack
experience!
Beats
X-Pack
Elasticsearch AlertingMachine LearningICON
TBD!!
Kibana

More Related Content

What's hot

What's hot (20)

What is MLOps
What is MLOpsWhat is MLOps
What is MLOps
 
Lakehouse in Azure
Lakehouse in AzureLakehouse in Azure
Lakehouse in Azure
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
 
Windows Azure Blob Storage
Windows Azure Blob StorageWindows Azure Blob Storage
Windows Azure Blob Storage
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a Lakehouse
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 
Kafka Streams State Stores Being Persistent
Kafka Streams State Stores Being PersistentKafka Streams State Stores Being Persistent
Kafka Streams State Stores Being Persistent
 
Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
 
Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks Delta
 
Azure SQL Database
Azure SQL Database Azure SQL Database
Azure SQL Database
 
Introdution to Dataops and AIOps (or MLOps)
Introdution to Dataops and AIOps (or MLOps)Introdution to Dataops and AIOps (or MLOps)
Introdution to Dataops and AIOps (or MLOps)
 
What’s New with Databricks Machine Learning
What’s New with Databricks Machine LearningWhat’s New with Databricks Machine Learning
What’s New with Databricks Machine Learning
 
Migration From Oracle to PostgreSQL
Migration From Oracle to PostgreSQLMigration From Oracle to PostgreSQL
Migration From Oracle to PostgreSQL
 
Using AIOps to reduce incidents volume
Using AIOps to reduce incidents volumeUsing AIOps to reduce incidents volume
Using AIOps to reduce incidents volume
 
Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics
 
Ml ops on AWS
Ml ops on AWSMl ops on AWS
Ml ops on AWS
 
How HSBC Uses Serverless to Process Millions of Transactions in Real Time (FS...
How HSBC Uses Serverless to Process Millions of Transactions in Real Time (FS...How HSBC Uses Serverless to Process Millions of Transactions in Real Time (FS...
How HSBC Uses Serverless to Process Millions of Transactions in Real Time (FS...
 
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaReal-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
 

Similar to Machine Learning and the Elastic Stack

Similar to Machine Learning and the Elastic Stack (20)

Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout Session
 
Machine Learning + Analytics in Splunk
Machine Learning + Analytics in Splunk Machine Learning + Analytics in Splunk
Machine Learning + Analytics in Splunk
 
Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout Session
 
AI for Software Engineering
AI for Software EngineeringAI for Software Engineering
AI for Software Engineering
 
Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout Session
 
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
 
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
 
Intelligently Automating Machine Learning, Artificial Intelligence, and Data ...
Intelligently Automating Machine Learning, Artificial Intelligence, and Data ...Intelligently Automating Machine Learning, Artificial Intelligence, and Data ...
Intelligently Automating Machine Learning, Artificial Intelligence, and Data ...
 
Big data expo - machine learning in the elastic stack
Big data expo - machine learning in the elastic stack Big data expo - machine learning in the elastic stack
Big data expo - machine learning in the elastic stack
 
AI hype or reality
AI  hype or realityAI  hype or reality
AI hype or reality
 
Machine Learning and Analytics in Splunk
Machine Learning and Analytics in SplunkMachine Learning and Analytics in Splunk
Machine Learning and Analytics in Splunk
 
Machine learning and big data
Machine learning and big dataMachine learning and big data
Machine learning and big data
 
Rise of the machines -- Owasp israel -- June 2014 meetup
Rise of the machines -- Owasp israel -- June 2014 meetupRise of the machines -- Owasp israel -- June 2014 meetup
Rise of the machines -- Owasp israel -- June 2014 meetup
 
Making Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms ReliableMaking Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms Reliable
 
Making Data Science Scalable - 5 Lessons Learned
Making Data Science Scalable - 5 Lessons LearnedMaking Data Science Scalable - 5 Lessons Learned
Making Data Science Scalable - 5 Lessons Learned
 
Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout Session
 
Philips john huffman
Philips john huffmanPhilips john huffman
Philips john huffman
 
Machine Learning in the Real World
Machine Learning in the Real WorldMachine Learning in the Real World
Machine Learning in the Real World
 
Big Data LDN 2018: HOW AUTOMATION CAN ACCELERATE THE DELIVERY OF MACHINE LEAR...
Big Data LDN 2018: HOW AUTOMATION CAN ACCELERATE THE DELIVERY OF MACHINE LEAR...Big Data LDN 2018: HOW AUTOMATION CAN ACCELERATE THE DELIVERY OF MACHINE LEAR...
Big Data LDN 2018: HOW AUTOMATION CAN ACCELERATE THE DELIVERY OF MACHINE LEAR...
 
AIOUG : ODEVCYathra 2018 - Oracle Autonomous Database What Every DBA should know
AIOUG : ODEVCYathra 2018 - Oracle Autonomous Database What Every DBA should knowAIOUG : ODEVCYathra 2018 - Oracle Autonomous Database What Every DBA should know
AIOUG : ODEVCYathra 2018 - Oracle Autonomous Database What Every DBA should know
 

More from Yann Cluchey

Concurrency Patterns with MongoDB
Concurrency Patterns with MongoDBConcurrency Patterns with MongoDB
Concurrency Patterns with MongoDB
Yann Cluchey
 

More from Yann Cluchey (6)

Implementing Keyword Sort with Elasticsearch
Implementing Keyword Sort with ElasticsearchImplementing Keyword Sort with Elasticsearch
Implementing Keyword Sort with Elasticsearch
 
Annotated Text feature in Elasticsearch
Annotated Text feature in ElasticsearchAnnotated Text feature in Elasticsearch
Annotated Text feature in Elasticsearch
 
Elasticsearch at AffiliateWindow
Elasticsearch at AffiliateWindowElasticsearch at AffiliateWindow
Elasticsearch at AffiliateWindow
 
GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...
GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...
GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...
 
Lightning talk: elasticsearch at Cogenta
Lightning talk: elasticsearch at CogentaLightning talk: elasticsearch at Cogenta
Lightning talk: elasticsearch at Cogenta
 
Concurrency Patterns with MongoDB
Concurrency Patterns with MongoDBConcurrency Patterns with MongoDB
Concurrency Patterns with MongoDB
 

Recently uploaded

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Recently uploaded (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 

Machine Learning and the Elastic Stack

  • 1. 1 Dr. Stephen Dodson, Tech Lead Machine Learning, Elastic Machine Learning and the Elastic Stack
  • 2. 2 Overview •  Background •  Machine Learning Overview •  Machine Learning and the Elastic Stack •  Demo •  Architecture
  • 3. Background •  Me –  Currently, Tech Lead, Machine Learning @ Elastic –  Formally, Founder and CTO of Prelert (acquired by Elastic September 2016) ‒  Presented overview of Prelert at Elastic London User Group in May 2016 •  Prelert –  VC backed software company, founded 2009 –  Behavioural analytics for machine data based (mainly) on unsupervised machine learning –  100+ customers + OEMs with CA, Bluecoat, NetApp + others ‒  IT Operations, IT Security, Retail analytics, IoT etc..
  • 4. 4 Machine Learning •  Algorithms and methods for data driven prediction, decision making, and modelling1 ‒  Learn models from past behaviour (training, modelling) ‒  Use models to predict future behaviour (prediction) ‒  Use predictions to make decisions •  Examples ‒  Image Recognition ‒  Language Translation ‒  Anomaly Detection 1Machine Learning Overview, Tommi Jaakkola, MIT
  • 5. 5 How is this relevant to the Elastic Stack? •  Extracting useful, valuable information is hard Search Aggregations Visualization Machine Learning
  • 6. 6 How is this relevant to the Elastic Stack? •  What if we want to search for: ‒  Has my order rate dropped significantly? ‒  Do my application logs contain unusual messages? ‒  Are any users behaving unusually? ‒  What transactions are fraudulent? •  Goal of ML at Elastic: Extend the Elastic Stack to allow the user to ask these type of questions and get understandable answers •  Constraints: ‒  Data may be limited: no markup may be available or relevant ‒  Compute resource dedicated to machine learning may be limited ‒  User should not need to be a machine learning expert or data scientist
  • 7. 7 Has my order rate dropped significantly?
  • 8. 8 Has my order rate dropped significantly? •  Learn models from past behaviour (training, modelling) •  Use models to predict future behaviour (prediction) •  Use predictions to make decisions Expected value @ 15:05 = 1859 Actual value @ 15:05 = 280 Probability = 0.0000174025
  • 10. 10 Do my application logs contain unusual messages?
  • 11. 11 Do my application logs contain unusual messages? Classify unstructured log messages by clustering similar messages NormalLogMessages UnusuallogMessages
  • 13. 13 Analytics Outside of Elastic Architecture Beats Logstash Kibana X-Pack X-Pack Elasticsearch Prelert analysis node Data Kibana Prelert UI •  Issues –  Data Gravity – data from Elasticsearch needs to be sent to Prelert analytics node –  Context – anomalies and data are stored in different data stores and viewed in different Uis –  Scale – Prelert analysis was not easily distributable across nodes –  Resilience – Prelert analysis needed to be restored manually on failover
  • 14. 14 Architecture •  Machine Learning will be part of X-Pack •  Machine Learning jobs will be automatically distributed across the Elasticsearch cluster •  Machine Learning jobs will be resilient to failover •  Machine Learning results and data can be in the same cluster Beats Logstash Kibana X-Pack X-Pack Elasticsearch Security Alerting Monitoring Reporting Graph Machine LearningICON TBD!! X-Pack
  • 15. 15 Status •  Demo on Elastic 5.4 available at Elastic{ON} (March 7th 2017) •  GA shortly after… (ask Sophie!) •  Focus of initial ML product is time series analysis in real-time ‒  Metric anomaly detection ‒  Log message classification and anomaly detection ‒  Population analysis (entity profiling) •  Shrink-wrapped configurations on Beats data - full Elastic Stack experience! Beats X-Pack Elasticsearch AlertingMachine LearningICON TBD!! Kibana