eventually elasticsearch
dealing with temporal inconsistencies in
the real world ™
Anne Veling | @anneveling | March 25, 2015
agenda
• Introduction
• Bol.com Plaza / Square project
• Using ElasticSearch in a mixed DB landscape
– ES as a DB free-text index or as a separate DB
• Consistency issues and solutions
• Lessons learned
bol.com
• Leading ecommerce platform in the Netherlands and Belgium
– 5M active customers
– 1M visits every day
– 9M products
– €680M revenue
• Growing (pains)
– 750 employees, 37 scrum teams
– moving towards continuous deployment, team independence
• Plaza / Square Seller platform
– 7k sellers, 16% of total revenue
Square ElasticSearch
• Using ElasticSearch to combine Offer and Product information
– Offers from Oracle
– Products from MongoDb
• Replacing Oracle SQL queries
– Too slow for faceting and result sets (for sellers with over 2k offers)
• About 12M productoffer documents
• Scala, Team 1B
• ElasticSearch 1.4
– With Search, Master and Data nodes
• In production now, rolling out to sellers
data model
[Diagram: products + offers → productoffers]
architecture
[Diagram: SDD, PCS, STEP, SSY, ES; products + offers → productoffers; ??]
option: right
• ElasticSearch as a free-text DB index on Offers
• DB update → update ES too
– In the same ‘transaction’
• Benefits
– easier
• Drawbacks
– Less service independence
– Slower (b/c refresh)
[Diagram: SDD, PCS, STEP, SSY, ES]
option: left
[Diagram: SDD, PCS, STEP, SSY, ES]
• ElasticSearch as a separate database
• Updates from DB sent to ES via async queues
• Benefits
– Architecture more loosely coupled
– Search performance
• Drawbacks
– some latency between DB and ES: eventual consistency
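The "left" option can be sketched as a toy model. This is only an illustration under stated assumptions: the `Pipeline` class, its in-memory dicts and the `drain` consumer are hypothetical stand-ins for the real database, queue and ElasticSearch cluster, not the bol.com implementation.

```python
from collections import deque


class Pipeline:
    """Toy model: the DB is written first, ES follows via an async queue."""

    def __init__(self):
        self.db = {}          # stand-in for the source database (e.g. offers)
        self.es = {}          # stand-in for the ElasticSearch index
        self.queue = deque()  # stand-in for the async update queue

    def update_db(self, doc_id, doc):
        # The write is committed to the DB and only announced to ES.
        self.db[doc_id] = doc
        self.queue.append((doc_id, doc))

    def drain(self):
        # A separate consumer applies queued updates to ES at its own pace.
        while self.queue:
            doc_id, doc = self.queue.popleft()
            self.es[doc_id] = doc

    def consistent(self):
        return self.db == self.es


pipeline = Pipeline()
pipeline.update_db(42, {"stock": 1})
before = pipeline.consistent()  # ES has not seen the update yet
pipeline.drain()
after = pipeline.consistent()   # ES has caught up
print(before, after)
```

The window between `update_db` and `drain` is the latency named in the drawback above: the two stores disagree until the consumer catches up.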
architecture
[Diagram: SDD, PCS, STEP, SSY, ES; products + offers → productoffers]
[Diagram labels: update Offer, update Product, SDD, PCS, ES, offer data, product data, facets, results]
eventual consistency
[Diagram: user and db timelines over time; states labeled consistent, inconsistent, consistent; the inconsistent stretch is the temporal inconsistency]
“immediate” consistency?
• Relational databases
– User view vs. DB view
– Take it or leave it
– Only vertical scaling
• ElasticSearch
– Read snapshots by refresh interval
– Caching
– Write once, read many
[Diagram: user 1, db, user 2]
START TRANSACTION;
UPDATE OFFERS SET STOCK=1 WHERE ID=42;
COMMIT TRANSACTION;
sources of temporal inconsistencies
• Internal inconsistencies
– within ElasticSearch
• External inconsistencies
– nature of ElasticSearch
– between Database and ElasticSearch
– between User expectations and Application behavior
[Sequence diagram: user, app, master, replica. Steps: send data to index API; receives new data; updates index; quorum says ‘ok’; got ‘ok’]
curl -XPOST localhost:9200/demo/drinks -d '{"brand":"Glenlivet", "age":18}'
{"_index":"demo","_type":"drinks","_id":"AUxKuw5pxgWzNUrImnD4","_version":1,"created":true}
[Sequence diagram: user, app, master, search]
curl -XPOST localhost:9200/demo/drinks -d '{"brand":"Glenlivet", "age":18}'
{"_index":"demo","_type":"drinks","_id":"AUxKuw5pxgWzNUrImnD4","_version":1,"created":true}
curl -XPOST localhost:9200/demo/drinks/_search -d '{"query":{"match":{"brand":"Glenlivet"}}}'
{"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}
[Diagram: periodic refresh, once every index.refresh_interval]
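Why the search returns zero hits right after a successful index call can be sketched with a toy index. This is a simplified model, not ElasticSearch internals: `ToyIndex` and its methods are hypothetical names, and a single dict copy stands in for opening a new searcher on refresh.

```python
class ToyIndex:
    """Toy model of near-real-time search: writes become searchable on refresh."""

    def __init__(self):
        self.docs = {}        # every write that has been accepted
        self.searchable = {}  # snapshot that searches read from

    def index(self, doc_id, doc, refresh=False):
        self.docs[doc_id] = doc
        if refresh:
            # Comparable in spirit to ?refresh=true: visible at once, but costly.
            self.refresh()
        return {"_id": doc_id, "created": True}

    def refresh(self):
        # Comparable in spirit to the periodic refresh: publish a new snapshot.
        self.searchable = dict(self.docs)

    def search(self, field, value):
        return [d for d in self.searchable.values() if d.get(field) == value]


idx = ToyIndex()
ack = idx.index("1", {"brand": "Glenlivet", "age": 18})
hits_before = idx.search("brand", "Glenlivet")  # acknowledged, not yet searchable
idx.refresh()
hits_after = idx.search("brand", "Glenlivet")   # visible after the refresh
print(len(hits_before), len(hits_after))
```

The acknowledgement and the visibility of a document are two separate events; the refresh interval is the gap between them.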
influencing search refresh
• Set index.refresh_interval
curl -XPUT localhost:9200/demo/_settings -d '{"index":{"refresh_interval":"30s"}}'
• Refresh on demand
curl -XPOST localhost:9200/demo/_refresh
• Refresh after index (be careful!)
curl -XPOST 'localhost:9200/demo/drinks?refresh=true' -d '{"brand":"Famous Grouse", "age":12}'
dealing with search delay
For a user updating a single item in the UI
• On the client
– Wait until refresh_interval has passed before searching again
– Do a get-by-id for changed item (=real time)
• And only change the single item (but: aggregations out of sync)
• On the server
– Wait until refresh_interval has passed
– Show a “done” message and hope user is slow
– Refresh all searchers upon index (all searches slower!)
– Add queue priority
– Update ES too
• Or: accept eventual consistency
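The client-side option (get-by-id for the changed item, then patch only that item) can be sketched as follows. The helper names `patch_results` and `count_by`, and the sample documents, are hypothetical; the point is only to show why the patched result list and the previously fetched aggregations drift apart.

```python
def patch_results(cached_hits, fresh_doc):
    """Return the cached search hits with one hit replaced by its fresh version."""
    return [fresh_doc if hit["_id"] == fresh_doc["_id"] else hit
            for hit in cached_hits]


def count_by(hits, field):
    """Simple facet: number of hits per value of the given field."""
    counts = {}
    for hit in hits:
        counts[hit[field]] = counts.get(hit[field], 0) + 1
    return counts


# Hits and facets as returned by a search before the user's edit.
cached_hits = [
    {"_id": "1", "brand": "Glenlivet", "age": 18},
    {"_id": "2", "brand": "Famous Grouse", "age": 12},
]
cached_facets = count_by(cached_hits, "age")

# Result of a (real-time) get-by-id after the user changed item 1.
fresh_doc = {"_id": "1", "brand": "Glenlivet", "age": 12}
patched_hits = patch_results(cached_hits, fresh_doc)

# The list is now correct, but the facets fetched earlier are stale.
facets_in_sync = cached_facets == count_by(patched_hits, "age")
print(patched_hits[0]["age"], facets_in_sync)
```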
[Diagram: app, ES, db, queue]
async queue issue
Measure DB → ES latency
{"drinks": {"_timestamp": {"enabled": true, "store": "yes"}}}
localhost:9200/demo/_search?fields=_timestamp,_version,_source
measuring DB → ES latency
POST /productoffer-005/_search?fields=_timestamp,_source
{
  "size": 0,
  "query": {
    "range": {
      "modificationDate": {
        "from": "now-7d"
      }
    }
  },
  "aggs": {
    "hokje": {
      "date_histogram": {
        "field": "modificationDate",
        "interval": "10m"
      },
      "aggs": {
        "q": {
          "stats": {
            "script": "doc['_timestamp'].value - doc['modificationDate'].value"
          }
        }
      }
    }
  }
}
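What this aggregation computes can be reproduced offline on a handful of documents. The sketch below assumes that `_timestamp` (time of indexing in ES) and `modificationDate` (time of change in the DB) are both epoch milliseconds; the function name `latency_stats` and the sample values are made up for illustration.

```python
from collections import defaultdict

BUCKET_MS = 10 * 60 * 1000  # 10-minute buckets, mirroring "interval": "10m"


def latency_stats(docs):
    """Group docs by 10-minute modification bucket and summarize DB-to-ES latency."""
    buckets = defaultdict(list)
    for doc in docs:
        modified = doc["modificationDate"]
        latency = doc["_timestamp"] - modified  # same expression as the script
        buckets[modified - modified % BUCKET_MS].append(latency)
    return {
        start: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
        }
        for start, values in sorted(buckets.items())
    }


docs = [
    {"modificationDate": 0, "_timestamp": 1500},
    {"modificationDate": 30000, "_timestamp": 32500},
    {"modificationDate": 600000, "_timestamp": 600400},
]
stats = latency_stats(docs)
for start, summary in stats.items():
    print(start, summary)
```

A rising `max` or `avg` in recent buckets would suggest the queue between DB and ES is falling behind.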
[Diagram: app, ES, db]
async queue issue
[Diagram: app, ES, db, queue]
queue order issue
• Only update if newer (w/ optimistic locking)
– read (with _version) → update → index (with expected _version) → retry
• version_type=external, use DB last-modified timestamp
curl -XPUT 'localhost:9200/demo/drinks/1?version=1427279177904&version_type=external' -d '{"brand": "Glenlivet", "age": 12}'
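The effect of external versioning on out-of-order queue messages can be sketched with a toy index that accepts a write only when its version is strictly greater than the stored one. `ToyVersionedIndex`, `VersionConflict` and `apply_updates` are hypothetical names for this illustration; the versions stand for DB last-modified timestamps.

```python
class VersionConflict(Exception):
    """Raised when an incoming write is not newer than the stored document."""


class ToyVersionedIndex:
    def __init__(self):
        self.docs = {}  # doc_id -> (version, source)

    def index_external(self, doc_id, source, version):
        current = self.docs.get(doc_id)
        # Accept only strictly newer versions, in the spirit of version_type=external.
        if current is not None and version <= current[0]:
            raise VersionConflict(f"{doc_id}: {version} <= {current[0]}")
        self.docs[doc_id] = (version, source)


def apply_updates(index, updates):
    """Apply queued updates in arrival order and count the stale ones dropped."""
    rejected = 0
    for doc_id, source, version in updates:
        try:
            index.index_external(doc_id, source, version)
        except VersionConflict:
            rejected += 1  # an older message arrived late; keep the newer doc
    return rejected


# The newer update (age 12) overtakes the older one (age 18) in the queue.
updates = [
    ("1", {"brand": "Glenlivet", "age": 12}, 1427279177904),
    ("1", {"brand": "Glenlivet", "age": 18}, 1427279170000),
]
index = ToyVersionedIndex()
rejected = apply_updates(index, updates)
final_version, final_source = index.docs["1"]
print(rejected, final_source["age"])
```

Because the version comes from the DB rather than from arrival order, the index converges on the newest DB state regardless of how the queue reorders messages.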
conclusions
• Compromises hurt someone
• Are you sure you want an eventually consistent database?
– Lots of patch work needed by bol.com…
– Choose left, make it look like you chose right
• In real life, consistency concerns
– more than just ES-writes
– Also ES-reads
– How to get data in and keep fresh influences
[Diagram: DB and ES, two options. right: as a free-text index; left: as a separate DB]
ES Consistency
knobs to control “consistency level”
[Chart: axes eventual ↔ immediate and faster ↔ slower, with positions numbered 1 to 4]
1. Optimistic locking & refresh=true
2. -
3. -
4. Eventually consistent
[Diagram: DB, indexer (CUD), ES, searcher (R). Knobs: refresh_interval, ?consistency, _version, action.write_consistency, ?refresh]
[Trade-off: consistency (immediate ↔ eventual) against performance for read & write (slower ↔ faster)]
lessons learned
• Make assumptions even more clear
• There is more to eventual consistency than you think
– User-oriented round-trip consistency latency in a mixed DB context
• Use the ES knobs and dials to make it
– as consistent as you need
– while keeping it as fast as you can
• You have to know what you’re doing
thank you
@anneveling
it's a matter of patience
calmly waiting for the day
that all of Holland talks Elasticsearch
that all of Holland talks Elasticsearch
eventually: Elasticsearch.

Genislab builds better products and faster go-to-market with Lean project man...
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platforms
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 

Eventually Elasticsearch: Eventual Consistency in the Real World

  • 1. eventually elasticsearch: dealing with temporal inconsistencies in the real world ™ Anne Veling | @anneveling | March 25, 2015
  • 2. agenda
      • Introduction
      • Bol.com Plaza / Square project
      • Using ElasticSearch in a mixed DB landscape
          – ES as a DB free-text index or as a separate DB
      • Consistency issues and solutions
      • Lessons learned
  • 3. bol.com
      • Leading ecommerce platform in The Netherlands and Belgium
          – 5M active customers
          – 1M visits every day
          – 9M products
          – €680M revenue
      • Growing (pains)
          – 750 employees, 37 scrum teams
          – moving towards continuous deployment, team independence
      • Plaza / Square Seller platform
          – 7k sellers, 16% of total revenue
  • 6. Square ElasticSearch
      • Using ElasticSearch to combine Offer and Product information
          – Offers from Oracle
          – Products from MongoDB
      • Replacing Oracle SQL queries
          – Too slow for faceting and result sets (for sellers with over 2k offers)
      • About 12M productoffer documents
      • Scala, Team 1B
      • ElasticSearch 1.4
          – With Search, Master and Data nodes
      • In production now, rolling out to sellers
  • 9. option: right
      • ElasticSearch as a free-text DB index on Offers
      • DB update → update ES too
          – In the same ‘transaction’
      • Benefits
          – easier
      • Drawbacks
          – Less service independence
          – Slower (b/c refresh)
      (diagram labels: SDD, PCS, STEP, SSY, ES)
  • 10. option: left
      • ElasticSearch as a separate database
      • Updates from DB sent to ES via async queues
      • Benefits
          – Architecture more loosely coupled
          – Search performance
      • Drawbacks
          – some latency between DB and ES: eventual consistency
      (diagram labels: SDD, PCS, STEP, SSY, ES)
  • 17. “immediate” consistency?
      • Relational databases
          – User view vs. DB view
          – Take it or leave it
          – Only vertical scaling
      • ElasticSearch
          – Read snapshots by refresh interval
          – Caching
          – Write once, read many
      (diagram labels: user 1, db, user 2)
          START TRANSACTION;
          UPDATE OFFERS SET STOCK=1 WHERE ID=42;
          COMMIT TRANSACTION;
  • 18. sources of temporal inconsistencies
      • Internal inconsistencies
          – within ElasticSearch
      • External inconsistencies
          – nature of ElasticSearch
          – between Database and ElasticSearch
          – between User expectations and Application behavior
  • 19. (diagram labels: app, master, replica, user; steps: send data to index, API receives new data, updates index, quorum says ‘ok’, got ‘ok’)
          curl -XPOST localhost:9200/demo/drinks -d '{"brand":"Glenlivet", "age":18}'
          {"_index":"demo","_type":"drinks","_id":"AUxKuw5pxgWzNUrImnD4","_version":1,"created":true}
  • 20. (diagram labels: app, master, search, user; refresh, refresh, index.refresh_interval)
          curl -XPOST localhost:9200/demo/drinks -d '{"brand":"Glenlivet", "age":18}'
          {"_index":"demo","_type":"drinks","_id":"AUxKuw5pxgWzNUrImnD4","_version":1,"created":true}
          curl -XPOST localhost:9200/demo/drinks/_search -d '{"query":{"match":{"brand":"Glenlivet"}}}'
          {"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}
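The behaviour on slide 20 (a document is acknowledged, yet a search right afterwards returns zero hits until the next refresh) can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyIndex` and its methods are hypothetical names invented for this example, not the Elasticsearch API, and the model ignores shards, replicas and the transaction log.

```python
class ToyIndex:
    """Toy model of refresh-based search visibility (not the real Elasticsearch API)."""

    def __init__(self):
        self._docs = {}        # every acknowledged write, readable by id
        self._searchable = {}  # snapshot that searches read from
        self._next_id = 0

    def index(self, doc):
        """Store a document and return its id; search does not see it yet."""
        self._next_id += 1
        doc_id = str(self._next_id)
        self._docs[doc_id] = dict(doc)
        return doc_id

    def get(self, doc_id):
        """Get-by-id reads the latest write, independent of refresh."""
        return self._docs.get(doc_id)

    def refresh(self):
        """Publish a new search snapshot, as the periodic refresh would."""
        self._searchable = {k: dict(v) for k, v in self._docs.items()}

    def search(self, field, value):
        """Return ids of documents in the current snapshot whose field matches."""
        return [k for k, v in self._searchable.items() if v.get(field) == value]


index = ToyIndex()
doc_id = index.index({"brand": "Glenlivet", "age": 18})
hits_before = index.search("brand", "Glenlivet")  # snapshot is still stale
fetched = index.get(doc_id)                       # real-time read by id
index.refresh()
hits_after = index.search("brand", "Glenlivet")   # visible after refresh
print(hits_before, fetched, hits_after)
```

In this model the gap between `index()` and the next `refresh()` plays the role of index.refresh_interval: the write is durable and readable by id, but invisible to search.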
  • 21. influencing search refresh
      • Set index.refresh_interval
          curl -XPUT localhost:9200/demo/_settings -d '{"index":{"refresh_interval":"30s"}}'
      • Refresh on demand
          curl -XPOST localhost:9200/demo/_refresh
      • Refresh after index (be careful!)
          curl -XPOST 'localhost:9200/demo/drinks?refresh=true' -d '{"brand":"Famous Grouse", "age":12}'
  • 22. dealing with search delay
      For a user updating a single item in the UI
      • On the client
          – Wait until refresh_interval has passed before searching again
          – Do a get-by-id for changed item (=real time)
              • And only change the single item (but: aggregations out of sync)
      • On the server
          – Wait until refresh_interval has passed
          – Show a “done” message and hope user is slow
          – Refresh all searchers upon index (all searches slower!)
          – Add queue priority
          – Update ES too
      • Or: accept eventual consistency
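The client-side option on slide 22 (fetch the changed item by id and replace only that item in the list already on screen) might look roughly like the sketch below. The function name `patch_results`, the `id` and `stock` fields and the sample data are all assumptions made for illustration; the fresh document stands in for whatever a real-time get-by-id would return.

```python
def patch_results(stale_hits, fresh_doc, id_field="id"):
    """Return a copy of stale_hits with the changed item replaced by fresh_doc."""
    return [
        fresh_doc if hit[id_field] == fresh_doc[id_field] else hit
        for hit in stale_hits
    ]


# Results and facet counts from a search that ran before the refresh.
stale_hits = [{"id": 41, "stock": 7}, {"id": 42, "stock": 5}]
stale_facets = {"in_stock": 2}

# Hypothetical result of a real-time get-by-id after the user set stock to 0.
fresh_doc = {"id": 42, "stock": 0}

patched = patch_results(stale_hits, fresh_doc)
print(patched)       # the list now shows the user's own change
print(stale_facets)  # the facet count still reflects the old snapshot
```

The second print shows the caveat the slide mentions: the patched list is up to date for the edited item, but aggregations computed by the stale search are not.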
  • 23. async queue issue
      (diagram labels: app, ES, db, queue)
      Measure DB → ES latency
          {"drinks": {"_timestamp": {"enabled": true, "store": "yes"}}}
          localhost:9200/demo/_search?fields=_timestamp,_version,_source
  • 24. measuring DB → ES latency
          POST /productoffer-005/_search?fields=_timestamp,_source
          {
            "size": 0,
            "query": {
              "range": { "modificationDate": { "from": "now-7d" } }
            },
            "aggs": {
              "hokje": {
                "date_histogram": { "field": "modificationDate", "interval": "10m" },
                "aggs": {
                  "q": {
                    "stats": { "script": "doc['_timestamp'].value - doc['modificationDate'].value" }
                  }
                }
              }
            }
          }
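To make the aggregation on slide 24 concrete, the same calculation can be sketched in plain Python: bucket documents by the 10-minute window of their modificationDate, then summarise `_timestamp - modificationDate` per bucket. The function name `latency_stats` and the sample documents are invented for this example; both fields are assumed to be epoch milliseconds.

```python
from collections import defaultdict

BUCKET_MS = 10 * 60 * 1000  # 10-minute buckets, like the date_histogram interval


def latency_stats(docs):
    """Group docs by 10-minute bucket and summarise the DB-to-ES latency in ms."""
    buckets = defaultdict(list)
    for doc in docs:
        latency = doc["_timestamp"] - doc["modificationDate"]
        bucket_start = doc["modificationDate"] // BUCKET_MS * BUCKET_MS
        buckets[bucket_start].append(latency)
    return {
        start: {
            "count": len(vals),
            "min": min(vals),
            "max": max(vals),
            "avg": sum(vals) / len(vals),
        }
        for start, vals in sorted(buckets.items())
    }


# Made-up sample: modificationDate is the DB change time,
# _timestamp is the time the document was indexed.
docs = [
    {"modificationDate": 0, "_timestamp": 1_500},
    {"modificationDate": 60_000, "_timestamp": 62_500},
    {"modificationDate": 600_000, "_timestamp": 640_000},
]
stats = latency_stats(docs)
print(stats)
```

A bucket whose `max` or `avg` jumps, as the second bucket does here, points at a period in which the queue between DB and ES was lagging.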
  • 25. async queue issue
      (diagram labels: app, ES, db)
  • 26. queue order issue
      (diagram labels: app, ES, db, queue)
      • Only update if newer (w/ optimistic locking)
          – read (with _version) → update → index (with expected _version) → retry
      • version_type=external, use DB last-modified timestamp
          curl -XPUT 'localhost:9200/demo/drinks/1?version=1427279177904&version_type=external' -d '{"brand": "Glenlivet", "age": 12}'
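The effect of version_type=external with the DB last-modified timestamp as version can be sketched with a small in-memory model: a write is accepted only when its version is strictly greater than the stored one, so an older update that arrives late from the queue is rejected instead of overwriting newer data. `ToyVersionedStore`, `VersionConflict` and `apply_updates` are hypothetical names for this illustration, not part of any Elasticsearch client, and the second timestamp is made up.

```python
class VersionConflict(Exception):
    """Raised when an incoming version is not newer than the stored one."""


class ToyVersionedStore:
    """Toy model of external versioning (not the Elasticsearch client)."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def index(self, doc_id, source, version):
        """Accept the write only if version is strictly greater than the stored one."""
        current = self._docs.get(doc_id)
        if current is not None and version <= current[0]:
            raise VersionConflict(f"{doc_id}: {version} is not newer than {current[0]}")
        self._docs[doc_id] = (version, dict(source))

    def get(self, doc_id):
        return self._docs[doc_id]


def apply_updates(store, updates):
    """Apply queued updates in arrival order; return the versions rejected as stale."""
    rejected = []
    for doc_id, source, version in updates:
        try:
            store.index(doc_id, source, version)
        except VersionConflict:
            rejected.append(version)
    return rejected


store = ToyVersionedStore()
arrivals = [
    ("1", {"brand": "Glenlivet", "age": 12}, 1427279177904),  # newer change arrives first
    ("1", {"brand": "Glenlivet", "age": 18}, 1427279100000),  # older change arrives late
]
rejected = apply_updates(store, arrivals)
final_version, final_doc = store.get("1")
print(final_version, final_doc, rejected)
```

Using the DB timestamp as version means the consumer does not need to read the current document first; the comparison happens at write time.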
  • 27. conclusions
      • Compromises hurt someone
      • Are you sure you want an eventually consistent database?
          – Lots of patch work needed by bol.com…
          – Choose left, make it look like you chose right
      • In real life, consistency concerns
          – more than just ES-writes
          – Also ES-reads
          – How to get data in and keep fresh influences
      (diagram labels: DB, ES; right: as a free-text index; left: as a separate DB)
  • 28. ES Consistency
      knobs to control “consistency level”
      (diagram: axis labels eventual, immediate, faster, slower; points 1, 4, 2, 3)
      1. Optimistic locking & refresh=true
      2. -
      3. -
      4. Eventually consistent
  • 31. lessons learned
      • Make assumptions even more clear
      • There is more to eventual consistency than you think
          – User-oriented round-trip consistency latency in a mixed DB context
      • Use the ES knobs and dials to make it
          – as consistent as you need
          – while keeping it as fast as you can
      • You have to know what you’re doing
  • 32. thank you
      @anneveling
      it’s a matter of patience
      calmly waiting for the day
      that all of Holland talks Elasticsearch
      that all of Holland talks Elasticsearch
      eventually: Elasticsearch.