SlideShare a Scribd company logo
1 of 23
ELASTICSEARCH INTRODUCTION
Kristijan Duvnjak & Mladen MaravićZagreb, 27.03.2015.
Elasticsearch as a search
alternative to a relational database
PART 1
1
What is Elasticsearch?
What is Elasticsearch (ES)?
Document-oriented schema-free "database"
Built on top of Apache Lucene
Real-time search and data analytics
Full-text search
Distributed (horizontal scalability)
High-avalability
REST API
2
"Open Source (Apache 2)
distributed
RESTful
search engine
built on top of Lucene"
ES for relational database users...
3
Oracle Elasticsearch
Database Index
Partition Shard
Table Type
Row Document
Column Field
Schema Mapping
Index - (everything is indexed)
SQL Query DSL
Clustering – single node cluster
Node = running instance of ES
Cluster = 1+ nodes with the same cluster.name
Every cluster has 1 master node
Clients talk to any node in the cluster
1 Cluster can have any number of indexes
4
About indexes & shards
All data is stored inside one or more indexes
Index has one or more shards (change
requires reindexing)
One index is one folder somewhere on disk
Backup an index? Just tar/zip the folder....
5
Each shard is one full instance of Lucene
Each shard can have zero or more replicas
(can be changed at any time)
Index Shard
Clustering – adding a second node
Example above:
3 indexes
Each index has one primary (P) and one replica (R) shard
6
Clustering – adding a third node
More primary shards:
faster indexing
more scale
More replicas:
faster searching
more failover
7
About documents...
Documents are JSON-based
Schema-free, but not necessarily!
If no schema:
ES guesses field type
and indexes it
With schema (or explicit mapping):
Mapping applies to specific document type (type is just a label)
Mapping defines the following for each field:
─ kind (string, number, date...)
─ to index or not
─ to store data or not
8
About documents...
Each document has an ID (auto-generated or manually assigned)
You can force placement of a document into a specific shard – routing!
Versioning is available – optimistic version control !
9
Index details
inverted index
Elasticsearch Server 1.0 (doc 1)
Mastering Elasticsearch (doc 2)
Apache Solr 4 Cookbook (doc 3)
10
Term Count Document
1.0 1 <1>
4 1 <3>
apache 1 <3>
cookbook 1 <3>
elasticsearch 2 <1>,<2>
mastering 1 <2>
server 1 <1>
solr 1 <3>
Indexing example
GET /blog/_search
{
"took": 6,
"timed_out": false,
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "blog",
"_type": "blog_comment",
"_id": "AUzhH9M9HW_GzrF8oLAj",
"_score": 1,
"_source": {
"user_id": 1,
"date": "2015-04-01T13:12:12",
"comment": "What’s so cool about Elasticsearch?"
}
}
]
}
}
11
POST /blog/blog_comment?routing=1
{
"user_id" : 1,
"date" : "2015-04-01T13:12:12",
"comment" : "What’s so cool about Elasticsearch?"
}
GET /blog/_mapping
{
"blog": {
"mappings": {
"blog_comment": {
"properties": {
"comment": {
"type": "string"
},
"date": {
"type": "date",
"format": "dateOptionalTime"
},
"user_id": {
"type": "long"
}
}
}
}
}
}
Storing data - indexing
data input: REST, Java API, Rivers*
data analysis: tokenizer and one or more filters
types of filters:
lowercase filter – makes all tokens lowercased
synonyms filter – changes one token to another on the basis of synonym rules
language stemming filters - reducing tokens into root or base forms, the stem
different data storing needs
string analyze,not_analyze field configuration
_all in field
memory field data or doc values
segments, segment merging, throttling
routing, indexing with routing
12
We query them!
All the usual stuff (think of WHERE in SQL)
Full text search with support for:
highlighting
stemming
ngrams & edge-ngrams
Aggregations: term facets, date histograms, ranges
Geo search: bounding box, distance,distance ranges, polygons
Percolators (or reverse-search!)
So, we can store documents
and then what?!?
13
Query details
search types (query_then_fetch, query_and_fetch ...)
same type of analysis as indexing
explain plan
sorting,aggregating data with in memory or on disk values
search filters
Boolean
And/Or/Not
filter cache, BitSets
routing, searching with routing
14
PBZ use case
turnovers by account: 600M documents, 200M/year
routing by account number
indexing performance, 30k-40k documents per second
DB performance in seconds, ES performance in ms (3500 queries/sec):
find last 100 turnovers for a given account number: < 50 ms
find last 100 turnovers for a given account number where description contains some words:
<100ms
15
PART 2
16
Cluster architecture
PBZ ES cluster architecture
17
DATA node 1
DATA node 2
Elasticsearch cluster
CLIENT node 1
NETWORK
DISPATCHER
CLIENT node 2
MASTER node 1
MASTER node 2
MASTER node 3
Cluster per datacenter
DATA node 1
DATA node 2Elasticsearch cluster
CLIENT node
NETWORK
DISPATCHER
MASTER node
PBZ ES cluster architecture
18
Cluster per datacenter
Elasticsearch Administration
plugins
Marvel – monitoring console (GC, throttiling, CPU, memory, heap, search/indexing statistics ...)
Sense – REST UI to Elasticsearch
custom plugins (JDBC rivers ...)
security
Apache Web server
Elasticsearch Shield
speeding up queries using warmers
19
PART 3
20
ELK
PART 4
21
Q & A
Mladen Maravić & Kristijan Duvnjak
22

More Related Content

What's hot

Philly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya BhamidpatiPhilly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya BhamidpatiRobert Calcavecchia
 
ElasticSearch: Distributed Multitenant NoSQL Datastore and Search Engine
ElasticSearch: Distributed Multitenant NoSQL Datastore and Search EngineElasticSearch: Distributed Multitenant NoSQL Datastore and Search Engine
ElasticSearch: Distributed Multitenant NoSQL Datastore and Search EngineDaniel N
 
ElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedBeyondTrees
 
Roaring with elastic search sangam2018
Roaring with elastic search sangam2018Roaring with elastic search sangam2018
Roaring with elastic search sangam2018Vinay Kumar
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to ElasticsearchSperasoft
 
ElasticSearch Basic Introduction
ElasticSearch Basic IntroductionElasticSearch Basic Introduction
ElasticSearch Basic IntroductionMayur Rathod
 
What I learnt: Elastic search & Kibana : introduction, installtion & configur...
What I learnt: Elastic search & Kibana : introduction, installtion & configur...What I learnt: Elastic search & Kibana : introduction, installtion & configur...
What I learnt: Elastic search & Kibana : introduction, installtion & configur...Rahul K Chauhan
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to ElasticsearchBo Andersen
 
Elasticsearch for Data Analytics
Elasticsearch for Data AnalyticsElasticsearch for Data Analytics
Elasticsearch for Data AnalyticsFelipe
 
Elasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & AggregationsElasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & AggregationsAlaa Elhadba
 
ElasticSearch - index server used as a document database
ElasticSearch - index server used as a document databaseElasticSearch - index server used as a document database
ElasticSearch - index server used as a document databaseRobert Lujo
 
Elastic Search
Elastic SearchElastic Search
Elastic SearchNavule Rao
 
Introduction to elasticsearch
Introduction to elasticsearchIntroduction to elasticsearch
Introduction to elasticsearchpmanvi
 
Introduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneIntroduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneRahul Jain
 
Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Philips Kokoh Prasetyo
 
Introduction to elasticsearch
Introduction to elasticsearchIntroduction to elasticsearch
Introduction to elasticsearchhypto
 

What's hot (19)

Philly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya BhamidpatiPhilly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
 
Elasticsearch Introduction
Elasticsearch IntroductionElasticsearch Introduction
Elasticsearch Introduction
 
ElasticSearch: Distributed Multitenant NoSQL Datastore and Search Engine
ElasticSearch: Distributed Multitenant NoSQL Datastore and Search EngineElasticSearch: Distributed Multitenant NoSQL Datastore and Search Engine
ElasticSearch: Distributed Multitenant NoSQL Datastore and Search Engine
 
ElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learned
 
Roaring with elastic search sangam2018
Roaring with elastic search sangam2018Roaring with elastic search sangam2018
Roaring with elastic search sangam2018
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
ElasticSearch Basic Introduction
ElasticSearch Basic IntroductionElasticSearch Basic Introduction
ElasticSearch Basic Introduction
 
What I learnt: Elastic search & Kibana : introduction, installtion & configur...
What I learnt: Elastic search & Kibana : introduction, installtion & configur...What I learnt: Elastic search & Kibana : introduction, installtion & configur...
What I learnt: Elastic search & Kibana : introduction, installtion & configur...
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
Elasticsearch for Data Analytics
Elasticsearch for Data AnalyticsElasticsearch for Data Analytics
Elasticsearch for Data Analytics
 
Elasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & AggregationsElasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & Aggregations
 
ElasticSearch - index server used as a document database
ElasticSearch - index server used as a document databaseElasticSearch - index server used as a document database
ElasticSearch - index server used as a document database
 
Elastic Search
Elastic SearchElastic Search
Elastic Search
 
Introduction to elasticsearch
Introduction to elasticsearchIntroduction to elasticsearch
Introduction to elasticsearch
 
Introduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneIntroduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of Lucene
 
Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!
 
Elasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetupElasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetup
 
Introduction to elasticsearch
Introduction to elasticsearchIntroduction to elasticsearch
Introduction to elasticsearch
 
Elastic search
Elastic searchElastic search
Elastic search
 

Viewers also liked

Realtime Analytics and Anomalities Detection using Elasticsearch, Hadoop and ...
Realtime Analytics and Anomalities Detection using Elasticsearch, Hadoop and ...Realtime Analytics and Anomalities Detection using Elasticsearch, Hadoop and ...
Realtime Analytics and Anomalities Detection using Elasticsearch, Hadoop and ...DataWorks Summit
 
Data modeling for Elasticsearch
Data modeling for ElasticsearchData modeling for Elasticsearch
Data modeling for ElasticsearchFlorian Hopf
 
Elasticsearch in Zalando
Elasticsearch in ZalandoElasticsearch in Zalando
Elasticsearch in ZalandoAlaa Elhadba
 
Elasticsearch - Devoxx France 2012 - English version
Elasticsearch - Devoxx France 2012 - English versionElasticsearch - Devoxx France 2012 - English version
Elasticsearch - Devoxx France 2012 - English versionDavid Pilato
 
Elastic Search (엘라스틱서치) 입문
Elastic Search (엘라스틱서치) 입문Elastic Search (엘라스틱서치) 입문
Elastic Search (엘라스틱서치) 입문SeungHyun Eom
 
Logging with Elasticsearch, Logstash & Kibana
Logging with Elasticsearch, Logstash & KibanaLogging with Elasticsearch, Logstash & Kibana
Logging with Elasticsearch, Logstash & KibanaAmazee Labs
 

Viewers also liked (6)

Realtime Analytics and Anomalities Detection using Elasticsearch, Hadoop and ...
Realtime Analytics and Anomalities Detection using Elasticsearch, Hadoop and ...Realtime Analytics and Anomalities Detection using Elasticsearch, Hadoop and ...
Realtime Analytics and Anomalities Detection using Elasticsearch, Hadoop and ...
 
Data modeling for Elasticsearch
Data modeling for ElasticsearchData modeling for Elasticsearch
Data modeling for Elasticsearch
 
Elasticsearch in Zalando
Elasticsearch in ZalandoElasticsearch in Zalando
Elasticsearch in Zalando
 
Elasticsearch - Devoxx France 2012 - English version
Elasticsearch - Devoxx France 2012 - English versionElasticsearch - Devoxx France 2012 - English version
Elasticsearch - Devoxx France 2012 - English version
 
Elastic Search (엘라스틱서치) 입문
Elastic Search (엘라스틱서치) 입문Elastic Search (엘라스틱서치) 입문
Elastic Search (엘라스틱서치) 입문
 
Logging with Elasticsearch, Logstash & Kibana
Logging with Elasticsearch, Logstash & KibanaLogging with Elasticsearch, Logstash & Kibana
Logging with Elasticsearch, Logstash & Kibana
 

Similar to Elasticsearch as a search alternative to a relational database

Deep dive to ElasticSearch - معرفی ابزار جستجوی الاستیکی
Deep dive to ElasticSearch - معرفی ابزار جستجوی الاستیکیDeep dive to ElasticSearch - معرفی ابزار جستجوی الاستیکی
Deep dive to ElasticSearch - معرفی ابزار جستجوی الاستیکیEhsan Asgarian
 
Using elasticsearch with rails
Using elasticsearch with railsUsing elasticsearch with rails
Using elasticsearch with railsTom Z Zeng
 
Elasticsearch, a distributed search engine with real-time analytics
Elasticsearch, a distributed search engine with real-time analyticsElasticsearch, a distributed search engine with real-time analytics
Elasticsearch, a distributed search engine with real-time analyticsTiziano Fagni
 
Wanna search? Piece of cake!
Wanna search? Piece of cake!Wanna search? Piece of cake!
Wanna search? Piece of cake!Alex Kursov
 
ELK - Stack - Munich .net UG
ELK - Stack - Munich .net UGELK - Stack - Munich .net UG
ELK - Stack - Munich .net UGSteve Behrendt
 
About elasticsearch
About elasticsearchAbout elasticsearch
About elasticsearchMinsoo Jun
 
Elasticsearch - basics and beyond
Elasticsearch - basics and beyondElasticsearch - basics and beyond
Elasticsearch - basics and beyondErnesto Reig
 
ELK-Stack-Essential-Concepts-TheELKStack-LunchandLearn.pdf
ELK-Stack-Essential-Concepts-TheELKStack-LunchandLearn.pdfELK-Stack-Essential-Concepts-TheELKStack-LunchandLearn.pdf
ELK-Stack-Essential-Concepts-TheELKStack-LunchandLearn.pdfcadejaumafiq
 
Elasticsearch for beginners
Elasticsearch for beginnersElasticsearch for beginners
Elasticsearch for beginnersNeil Baker
 
ElasticSearch Basics
ElasticSearch BasicsElasticSearch Basics
ElasticSearch BasicsAmresh Singh
 
Elasticsearch and Spark
Elasticsearch and SparkElasticsearch and Spark
Elasticsearch and SparkAudible, Inc.
 
Elasticsearch - under the hood
Elasticsearch - under the hoodElasticsearch - under the hood
Elasticsearch - under the hoodSmartCat
 
ElasticSearch for .NET Developers
ElasticSearch for .NET DevelopersElasticSearch for .NET Developers
ElasticSearch for .NET DevelopersBen van Mol
 
Inside SQL Server In-Memory OLTP
Inside SQL Server In-Memory OLTPInside SQL Server In-Memory OLTP
Inside SQL Server In-Memory OLTPBob Ward
 
Getting to know oracle database objects iot, mviews, clusters and more…
Getting to know oracle database objects iot, mviews, clusters and more…Getting to know oracle database objects iot, mviews, clusters and more…
Getting to know oracle database objects iot, mviews, clusters and more…Aaron Shilo
 
Polyglot metadata for Hadoop
Polyglot metadata for HadoopPolyglot metadata for Hadoop
Polyglot metadata for HadoopJim Dowling
 
Elasticsearch: An Overview
Elasticsearch: An OverviewElasticsearch: An Overview
Elasticsearch: An OverviewRuby Shrestha
 
Elasticsearch Architechture
Elasticsearch ArchitechtureElasticsearch Architechture
Elasticsearch ArchitechtureAnurag Sharma
 

Similar to Elasticsearch as a search alternative to a relational database (20)

Deep dive to ElasticSearch - معرفی ابزار جستجوی الاستیکی
Deep dive to ElasticSearch - معرفی ابزار جستجوی الاستیکیDeep dive to ElasticSearch - معرفی ابزار جستجوی الاستیکی
Deep dive to ElasticSearch - معرفی ابزار جستجوی الاستیکی
 
Using elasticsearch with rails
Using elasticsearch with railsUsing elasticsearch with rails
Using elasticsearch with rails
 
Elasticsearch, a distributed search engine with real-time analytics
Elasticsearch, a distributed search engine with real-time analyticsElasticsearch, a distributed search engine with real-time analytics
Elasticsearch, a distributed search engine with real-time analytics
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
Wanna search? Piece of cake!
Wanna search? Piece of cake!Wanna search? Piece of cake!
Wanna search? Piece of cake!
 
ELK - Stack - Munich .net UG
ELK - Stack - Munich .net UGELK - Stack - Munich .net UG
ELK - Stack - Munich .net UG
 
About elasticsearch
About elasticsearchAbout elasticsearch
About elasticsearch
 
Elasticsearch - basics and beyond
Elasticsearch - basics and beyondElasticsearch - basics and beyond
Elasticsearch - basics and beyond
 
ELK-Stack-Essential-Concepts-TheELKStack-LunchandLearn.pdf
ELK-Stack-Essential-Concepts-TheELKStack-LunchandLearn.pdfELK-Stack-Essential-Concepts-TheELKStack-LunchandLearn.pdf
ELK-Stack-Essential-Concepts-TheELKStack-LunchandLearn.pdf
 
Elasticsearch for beginners
Elasticsearch for beginnersElasticsearch for beginners
Elasticsearch for beginners
 
Elastic search
Elastic searchElastic search
Elastic search
 
ElasticSearch Basics
ElasticSearch BasicsElasticSearch Basics
ElasticSearch Basics
 
Elasticsearch and Spark
Elasticsearch and SparkElasticsearch and Spark
Elasticsearch and Spark
 
Elasticsearch - under the hood
Elasticsearch - under the hoodElasticsearch - under the hood
Elasticsearch - under the hood
 
ElasticSearch for .NET Developers
ElasticSearch for .NET DevelopersElasticSearch for .NET Developers
ElasticSearch for .NET Developers
 
Inside SQL Server In-Memory OLTP
Inside SQL Server In-Memory OLTPInside SQL Server In-Memory OLTP
Inside SQL Server In-Memory OLTP
 
Getting to know oracle database objects iot, mviews, clusters and more…
Getting to know oracle database objects iot, mviews, clusters and more…Getting to know oracle database objects iot, mviews, clusters and more…
Getting to know oracle database objects iot, mviews, clusters and more…
 
Polyglot metadata for Hadoop
Polyglot metadata for HadoopPolyglot metadata for Hadoop
Polyglot metadata for Hadoop
 
Elasticsearch: An Overview
Elasticsearch: An OverviewElasticsearch: An Overview
Elasticsearch: An Overview
 
Elasticsearch Architechture
Elasticsearch ArchitechtureElasticsearch Architechture
Elasticsearch Architechture
 

Recently uploaded

Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Mater
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfFerryKemperman
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceBrainSell Technologies
 
Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfStefano Stabellini
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...Technogeeks
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commercemanigoyal112
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odishasmiwainfosol
 
Buds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in NoidaBuds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in Noidabntitsolutionsrishis
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Velvetech LLC
 

Recently uploaded (20)

Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdf
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. Salesforce
 
Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdf
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Advantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your BusinessAdvantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your Business
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commerce
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 
Buds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in NoidaBuds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in Noida
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 

Elasticsearch as a search alternative to a relational database

  • 1. ELASTICSEARCH INTRODUCTION Kristijan Duvnjak & Mladen MaravićZagreb, 27.03.2015. Elasticsearch as a search alternative to a relational database
  • 2. PART 1 1 What is Elasticsearch?
  • 3. What is Elasticsearch (ES)? Document-oriented schema-free "database" Built on top of Apache Lucene Real-time search and data analytics Full-text search Distributed (horizontal scalability) High-avalability REST API 2 "Open Source (Apache 2) distributed RESTful search engine built on top of Lucene"
  • 4. ES for relational database users... 3 Oracle Elasticsearch Database Index Partition Shard Table Type Row Document Column Field Schema Mapping Index - (everything is indexed) SQL Query DSL
  • 5. Clustering – single node cluster Node = running instance of ES Cluster = 1+ nodes with the same cluster.name Every cluster has 1 master node Clients talk to any node in the cluster 1 Cluster can have any number of indexes 4
  • 6. About indexes & shards All data is stored inside one or more indexes Index has one or more shards (change requires reindexing) One index is one folder somewhere on disk Backup an index? Just tar/zip the folder.... 5 Each shard is one full instance of Lucene Each shard can have zero or more replicas (can be changed at any time) Index Shard
  • 7. Clustering – adding a second node Example above: 3 indexes Each index has one primary (P) and one replica (R) shard 6
  • 8. Clustering – adding a third node More primary shards: faster indexing more scale More replicas: faster searching more failover 7
  • 9. About documents... Documents are JSON-based Schema-free, but not necessarily! If no schema: ES guesses field type and indexes it With schema (or explicit mapping): Mapping applies to specific document type (type is just a label) Mapping defines the following for each field: ─ kind (string, number, date...) ─ to index or not ─ to store data or not 8
  • 10. About documents... Each document has an ID (auto-generated or manually assigned) You can force placement of a document into a specific shard – routing! Versioning is available – optimistic version control ! 9
  • 11. Index details inverted index Elasticsearch Server 1.0 (doc 1) Mastering Elasticsearch (doc 2) Apache Solr 4 Cookbook (doc 3) 10 Term Count Document 1.0 1 <1> 4 1 <3> apache 1 <3> cookbook 1 <3> elasticsearch 2 <1>,<2> mastering 1 <2> server 1 <1> solr 1 <3>
  • 12. Indexing example GET /blog/_search { "took": 6, "timed_out": false, "_shards": { "total": 2, "successful": 2, "failed": 0 }, "hits": { "total": 1, "max_score": 1, "hits": [ { "_index": "blog", "_type": "blog_comment", "_id": "AUzhH9M9HW_GzrF8oLAj", "_score": 1, "_source": { "user_id": 1, "date": "2015-04-01T13:12:12", "comment": "What’s so cool about Elasticsearch?" } } ] } } 11 POST /blog/blog_comment?routing=1 { "user_id" : 1, "date" : "2015-04-01T13:12:12", "comment" : "What’s so cool about Elasticsearch?" } GET /blog/_mapping { "blog": { "mappings": { "blog_comment": { "properties": { "comment": { "type": "string" }, "date": { "type": "date", "format": "dateOptionalTime" }, "user_id": { "type": "long" } } } } } }
  • 13. Storing data - indexing data input: REST, Java API, Rivers* data analysis: tokenizer and one or more filters types of filters: lowercase filter – makes all tokens lowercased synonyms filter – changes one token to another on the basis of synonym rules language stemming filters - reducing tokens into root or base forms, the stem different data storing needs string analyze,not_analyze field configuration _all in field memory field data or doc values segments, segment merging, throttling routing, indexing with routing 12
  • 14. We query them! All the usual stuff (think of WHERE in SQL) Full text search with support for: highlighting stemming ngrams & edge-ngrams Aggregations: term facets, date histograms, ranges Geo search: bounding box, distance,distance ranges, polygons Percolators (or reverse-search!) So, we can store documents and then what?!? 13
  • 15. Query details search types (query_then_fetch, query_and_fetch ...) same type of analysis as indexing explain plan sorting,aggregating data with in memory or on disk values search filters Boolean And/Or/Not filter cache, BitSets routing, searching with routing 14
  • 16. PBZ use case turnovers by account: 600M documents, 200M/year routing by account number indexing performance, 30k-40k documents per second DB performance in seconds, ES performance in ms (3500 queries/sec): find last 100 turnovers for a given account number: < 50 ms find last 100 turnovers for a given account number where description contains some words: <100ms 15
  • 18. PBZ ES cluster architecture 17 DATA node 1 DATA node 2 Elasticsearch cluster CLIENT node 1 NETWORK DISPATCHER CLIENT node 2 MASTER node 1 MASTER node 2 MASTER node 3 Cluster per datacenter
  • 19. DATA node 1 DATA node 2Elasticsearch cluster CLIENT node NETWORK DISPATCHER MASTER node PBZ ES cluster architecture 18 Cluster per datacenter
  • 20. Elasticsearch Administration plugins Marvel – monitoring console (GC, throttiling, CPU, memory, heap, search/indexing statistics ...) Sense – REST UI to Elasticsearch custom plugins (JDBC rivers ...) security Apache Web server Elasticsearch Shield speeding up queries using warmers 19
  • 23. Mladen Maravić & Kristijan Duvnjak 22