SlideShare a Scribd company logo
1 of 71
Download to read offline
Solr Anti - patterns
Rafał Kuć, Sematext Group, Inc.
@kucrafal
@sematext
http://sematext.com
About me
Sematext consultant & engineer
Solr.pl co-founder
Father & husband
The (not so) perfect migration
http://en.wikipedia.org/wiki/Bird_migration
http://www.likesbooks.com/aarafterhours/?p=750
From 3.1 to 4.10 (and hopefully not back)
March 2011 September 2014
The lonely solrconfig.xml
<requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />
<requestHandler name="/update/javabin" class="solr.BinaryUpdateRequestHandler" />
<requestHandler name="/update/csv" class="solr.CSVRequestHandler" />
<requestHandler name="/update/json" class="solr.JsonUpdateRequestHandler" />
<luceneMatchVersion>LUCENE_31</luceneMatchVersion>
<directoryFactory name="DirectoryFactory"
class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
DOC
DOC
DOC
And faulty indexing
EXCEPTIONS :)
And faulty indexing
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">400</int>
<int name="QTime">0</int>
</lst>
<lst name="error">
<str name="msg">missing content stream</str>
<int name="code">400</int>
</lst>
</response>
109173 [qtp1223685984-20] ERROR org.apache.solr.core.SolrCore ľ org.apache.solr.common.SolrException: missing content stream
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:69)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1967)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:777)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:647)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Unknown Source)
Let’s make that right
<requestHandler name="/update" class="solr.UpdateRequestHandler" />
<requestHandler name="/update/json" class="solr.UpdateRequestHandler">
<lst name="defaults">
<str name="stream.contentType">application/json</str>
</lst>
</requestHandler>
<luceneMatchVersion>LUCENE_4.10.0</luceneMatchVersion>
<directoryFactory name="DirectoryFactory"
class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
<nrtMode>true</nrtMode>
<updateLog>
<str name="dir">
${solr.ulog.dir:}
</str>
</updateLog>
The old schema.xml
<fieldType name="int" class="solr.IntField" omitNorms="true"/>
<fieldType name="long" class="solr.LongField" omitNorms="true"/>
<fieldType name="float" class="solr.FloatField" omitNorms="true"/>
<fieldType name="double" class="solr.DoubleField" omitNorms="true"/>
<fieldType name="date" class="solr.DateField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="slong" class="solr.SortableLongField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sfloat" class="solr.SortableFloatField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sdouble" class="solr.SortableDoubleField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="int" class="solr.IntField" omitNorms="true"/>
<fieldType name="long" class="solr.LongField" omitNorms="true"/>
<fieldType name="float" class="solr.FloatField" omitNorms="true"/>
<fieldType name="double" class="solr.DoubleField" omitNorms="true"/>
<fieldType name="date" class="solr.DateField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="slong" class="solr.SortableLongField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sfloat" class="solr.SortableFloatField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sdouble" class="solr.SortableDoubleField" sortMissingLast="true" omitNorms="true"/>
The old schema.xml
The new schema.xml
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="float" class="solr.TrieFloatField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0"/>
<fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" positionIncrementGap="0"/>
<fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" positionIncrementGap="0"/>
<fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" positionIncrementGap="0"/>
<fieldType name="tdate" class="solr.TrieDateField" precisionStep="6" positionIncrementGap="0"/>
Threads? What threads?
<Set name="ThreadPool">
<New class="org.eclipse.jetty.util.thread.QueuedThreadPool">
<Set name="minThreads">10</Set>
<Set name="maxThreads">200</Set>
<Set name="detailedDump">false</Set>
</New>
</Set>
I see deadlocks
Threads? What threads?
<Set name="ThreadPool">
<New class="org.eclipse.jetty.util.thread.QueuedThreadPool">
<Set name="minThreads">10</Set>
<Set name="maxThreads">200</Set>
<Set name="detailedDump">false</Set>
</New>
</Set>
OK, so now we can actually run queries
<Set name="ThreadPool">
<New class="org.eclipse.jetty.util.thread.QueuedThreadPool">
<Set name="minThreads">10</Set>
<Set name="maxThreads">10000</Set>
<Set name="detailedDump">false</Set>
</New>
</Set>
The ZooKeeper
The ZooKeeper
The ZooKeeper
The ZooKeeper
The ZooKeeper
The ZooKeeper – production
The ZooKeeper – production
-DzkHost=zk1:2181,zk2:2181,zk3:2181
The ZooKeeper – production
-DzkHost=zk1:2181,zk2:2181,zk3:2181
The ZooKeeper – production
-DzkHost=zk1:2181,zk2:2181,zk3:2181
The ZooKeeper – production
-DzkHost=zk1:2181,zk2:2181,zk3:2181
Let’s cache everything
<filterCache class="solr.LRUCache"
size="1048576"
initialSize="1048576"
autowarmCount="524288"/>
<queryResultCache class="solr.LRUCache"
size="1048576"
initialSize="1048576"
autowarmCount="524288"/><documentCache class="solr.LRUCache"
size="1048576"
initialSize="1048576"
autowarmCount="0"/>
And now let’s look at the warmup times
And now let’s look at the warmup times
OK, show us the way „Mr. Consultant”
<filterCache class="solr.FastLRUCache"
size="1024"
initialSize="1024"
autowarmCount="512"/>
<queryResultCache class="solr.LRUCache"
size="16000"
initialSize="16000"
autowarmCount="8000"/><documentCache class="solr.LRUCache"
size="16384"
initialSize="16384"
autowarmCount="0"/>
Let’s look at the warmup times again
Let’s look at the warmup times again
Bulks are for noobs
Application Application Application
Doc Doc Doc
Bulks are for noobs
Application Application Application
Doc Doc Doc
But let’s use bulks, just in case
But let’s use bulks, just in case
We need to refresh and hard commit
<autoCommit>
<maxTime>1000</maxTime>
<openSearcher>true</openSearcher>
</autoCommit>
<autoSoftCommit>
<maxTime>1000</maxTime>
</autoSoftCommit>
Maybe we should only refresh?
<autoCommit>
<maxTime>60000</maxTime>
<openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
<maxTime>1000</maxTime>
</autoSoftCommit>
OK, let’s go easy with refreshing
<autoCommit>
<maxTime>60000</maxTime>
<openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
<maxTime>30000</maxTime>
</autoSoftCommit>
But I really need all that data
curl -XGET 'localhost:8983/solr/select?q=*:*&start=3000000&rows=100'
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">9418</int>
<lst name="params">
<str name="start">3000000</str>
<str name="q">*:*</str>
<str name="rows">100</str>
</lst>
</lst>
<result name="response" numFound="3284000" start="3000000">
.
.
.
</result>
</response>
But I really need all that data
curl -XGET 'localhost:8983/solr/select?q=*:*&start=3000000&rows=100'
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">9418</int>
<lst name="params">
<str name="start">3000000</str>
<str name="q">*:*</str>
<str name="rows">5</str>
</lst>
</lst>
<result name="response" numFound="3284000" start="3000000">
.
.
.
</result>
</response>
But I really need all that data
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="error">
<str name="msg">java.lang.OutOfMemoryError: Java heap space</str>
<str name="trace">java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:796)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:448)
.
.
.
Caused by: java.lang.OutOfMemoryError: Java heap space
.
.
.
</str>
<int name="code">500</int>
</lst>
</response>
curl -XGET 'localhost:8983/solr/select?q=*:*&start=3000000&rows=100'
But I really need all that data
Query
But I really need all that data
But I really need all that data
But I really need all that data
Response
Use the scroll Luke
curl -XGET 'localhost:8983/solr/select?q=*:*&cursorMark=*&sort=score+desc,id+desc'
Use the scroll Luke
curl -XGET 'localhost:8983/solr/select?q=*:*&cursorMark=*&sort=score+desc,id+desc'
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">189</int>
<lst name="params">
<str name="sort">score desc,id desc</str>
<str name="q">*:*</str>
<str name="cursorMark">*</str>
</lst>
</lst>
<result name="response" numFound="3284000" start="0">
<doc>
...
</doc>
.
.
.
</result>
<str name="nextCursorMark">AoIIP4AAACY5OTk5OTA=</str>
</response>
Use the scroll Luke
curl -XGET 'localhost:8983/solr/select?q=*:*&sort=score+desc,id+desc
&cursorMark=AoIIP4AAACY5OTk5OTA='
Use the scroll Luke
curl -XGET 'localhost:8983/solr/select?q=*:*&sort=score+desc,id+desc
&cursorMark=AoIIP4AAACY5OTk5OTA='
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">184</int>
<lst name="params">
<str name="sort">score desc,id desc</str>
<str name="q">*:*</str>
<str name="cursorMark">AoIIP4AAACY5OTk5OTA=</str>
</lst>
</lst>
<result name="response" numFound="3284000" start="0">
<doc>
...
</doc>
.
.
.
</result>
<str name="nextCursorMark">AoIIP4AAACY5OTk5ODE=</str>
</response>
Limiting faceting, why bother?
curl -XGET 'localhost:8983/solr/select?q=*:*&facet=true&facet.field=tag&…
facet.limit=-1&facet.mincount=0'
Limiting faceting, why bother?
curl -XGET 'localhost:8983/solr/select?q=*:*&facet=true&facet.field=tag&…
facet.limit=-1&facet.mincount=0'
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">9967</int>
<lst name="params">
...
</lst>
</lst>
<result name="response" numFound="3284000" start="0">
.
.
.
</result>
<lst name="facet_counts">
<lst name="facet_fields">
<lst name="tag">
...
</lst>
</lst>
</lst>
</response>
Limiting faceting, why bother?
curl -XGET 'localhost:8983/solr/select?q=*:*&facet=true&facet.field=tag&…
facet.limit=-1&facet.mincount=0'
<?xml version="1.0" encoding="UTF-8"?>
<response>
.
.
.
<lst name="error">
<str name="msg">Error while processing facet fields: java.lang.OutOfMemoryError: Java heap space</str>
<str name="trace">org.apache.solr.common.SolrException: Error while processing facet fields:
java.lang.OutOfMemoryError: Java heap space
.
.
.
Caused by: java.lang.OutOfMemoryError: Java heap space
at org.apache.solr.request.SimpleFacets.getFieldCacheCounts(SimpleFacets.java:685)
.
.
.
</str>
<int name="code">500</int>
</lst>
</response>
Now let’s look at performance
Now let’s look at performance
Now let’s look at performance
Now let’s look at performance
Now let’s look at performance
Magic happens with small changes
curl -XGET 'localhost:8983/solr/select?q=*:*&facet=true&facet.field=tag&…
facet.limit=100&facet.mincount=1'
Magic happens with small changes
Magic happens with small changes
Magic happens with small changes
Magic happens with small changes
Magic happens with small changes
Magic happens with small changes
Magic happens with small changes
Monitoring in production
http://sematext.com/spm/index.html
And remember…
<luceneMatchVersion>
3.1
</luceneMatchVersion>
Quick summary
http://www.soothetube.com/2013/12/29/thats-all-folks/
We are hiring!
Dig Search?
Dig Analytics?
Dig Big Data?
Dig Performance?
Dig Logging?
Dig working with and in open – source?
We’re hiring world – wide!
http://sematext.com/about/jobs.html
Thank you!
Rafał Kuć
@kucrafal
rafal.kuc@sematext.com
Sematext
@sematext
http://sematext.com
http://blog.sematext.com

More Related Content

What's hot

Mastering solr
Mastering solrMastering solr
Mastering solrjurcello
 
Make your gui shine with ajax solr
Make your gui shine with ajax solrMake your gui shine with ajax solr
Make your gui shine with ajax solrlucenerevolution
 
Solr Indexing and Analysis Tricks
Solr Indexing and Analysis TricksSolr Indexing and Analysis Tricks
Solr Indexing and Analysis TricksErik Hatcher
 
New SPL Features in PHP 5.3
New SPL Features in PHP 5.3New SPL Features in PHP 5.3
New SPL Features in PHP 5.3Matthew Turland
 
Let's write secure Drupal code! DUG Belgium - 08/08/2019
Let's write secure Drupal code! DUG Belgium - 08/08/2019Let's write secure Drupal code! DUG Belgium - 08/08/2019
Let's write secure Drupal code! DUG Belgium - 08/08/2019Balázs Tatár
 
SPL: The Undiscovered Library - DataStructures
SPL: The Undiscovered Library -  DataStructuresSPL: The Undiscovered Library -  DataStructures
SPL: The Undiscovered Library - DataStructuresMark Baker
 
Searching for AI - Leveraging Solr for classic Artificial Intelligence tasks
Searching for AI - Leveraging Solr for classic Artificial Intelligence tasksSearching for AI - Leveraging Solr for classic Artificial Intelligence tasks
Searching for AI - Leveraging Solr for classic Artificial Intelligence tasksAlexandre Rafalovitch
 
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)Alexandre Rafalovitch
 
The Origin of Lithium
The Origin of LithiumThe Origin of Lithium
The Origin of LithiumNate Abele
 
Solr and Lucene at Etsy - By Gregg Donovan
Solr and Lucene at Etsy - By Gregg DonovanSolr and Lucene at Etsy - By Gregg Donovan
Solr and Lucene at Etsy - By Gregg Donovanlucenerevolution
 
The State of Lithium
The State of LithiumThe State of Lithium
The State of LithiumNate Abele
 
Class-based views with Django
Class-based views with DjangoClass-based views with Django
Class-based views with DjangoSimon Willison
 
Drupal for ng_os
Drupal for ng_osDrupal for ng_os
Drupal for ng_osdstuartnz
 

What's hot (20)

Mastering solr
Mastering solrMastering solr
Mastering solr
 
Make your gui shine with ajax solr
Make your gui shine with ajax solrMake your gui shine with ajax solr
Make your gui shine with ajax solr
 
it's just search
it's just searchit's just search
it's just search
 
Solr workshop
Solr workshopSolr workshop
Solr workshop
 
Solr Indexing and Analysis Tricks
Solr Indexing and Analysis TricksSolr Indexing and Analysis Tricks
Solr Indexing and Analysis Tricks
 
New SPL Features in PHP 5.3
New SPL Features in PHP 5.3New SPL Features in PHP 5.3
New SPL Features in PHP 5.3
 
Let's write secure Drupal code! DUG Belgium - 08/08/2019
Let's write secure Drupal code! DUG Belgium - 08/08/2019Let's write secure Drupal code! DUG Belgium - 08/08/2019
Let's write secure Drupal code! DUG Belgium - 08/08/2019
 
JSON in Solr: from top to bottom
JSON in Solr: from top to bottomJSON in Solr: from top to bottom
JSON in Solr: from top to bottom
 
Intro to The PHP SPL
Intro to The PHP SPLIntro to The PHP SPL
Intro to The PHP SPL
 
SPL: The Undiscovered Library - DataStructures
SPL: The Undiscovered Library -  DataStructuresSPL: The Undiscovered Library -  DataStructures
SPL: The Undiscovered Library - DataStructures
 
Searching for AI - Leveraging Solr for classic Artificial Intelligence tasks
Searching for AI - Leveraging Solr for classic Artificial Intelligence tasksSearching for AI - Leveraging Solr for classic Artificial Intelligence tasks
Searching for AI - Leveraging Solr for classic Artificial Intelligence tasks
 
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
 
My shell
My shellMy shell
My shell
 
The Origin of Lithium
The Origin of LithiumThe Origin of Lithium
The Origin of Lithium
 
Solr & Lucene at Etsy
Solr & Lucene at EtsySolr & Lucene at Etsy
Solr & Lucene at Etsy
 
Solr and Lucene at Etsy - By Gregg Donovan
Solr and Lucene at Etsy - By Gregg DonovanSolr and Lucene at Etsy - By Gregg Donovan
Solr and Lucene at Etsy - By Gregg Donovan
 
The State of Lithium
The State of LithiumThe State of Lithium
The State of Lithium
 
Class-based views with Django
Class-based views with DjangoClass-based views with Django
Class-based views with Django
 
Living with garbage
Living with garbageLiving with garbage
Living with garbage
 
Drupal for ng_os
Drupal for ng_osDrupal for ng_os
Drupal for ng_os
 

Viewers also liked

Tuning Elasticsearch Indexing Pipeline for Logs
Tuning Elasticsearch Indexing Pipeline for LogsTuning Elasticsearch Indexing Pipeline for Logs
Tuning Elasticsearch Indexing Pipeline for LogsSematext Group, Inc.
 
Going All-In With Go For CLI Apps
Going All-In With Go For CLI AppsGoing All-In With Go For CLI Apps
Going All-In With Go For CLI AppsTom Elliott
 
Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker
Running High Performance & Fault-tolerant Elasticsearch Clusters on DockerRunning High Performance & Fault-tolerant Elasticsearch Clusters on Docker
Running High Performance & Fault-tolerant Elasticsearch Clusters on DockerSematext Group, Inc.
 
Musings on Secondary Indexing in HBase
Musings on Secondary Indexing in HBaseMusings on Secondary Indexing in HBase
Musings on Secondary Indexing in HBaseJesse Yates
 
MongoDB and Apache HBase: Benchmarking
MongoDB and Apache HBase: BenchmarkingMongoDB and Apache HBase: Benchmarking
MongoDB and Apache HBase: BenchmarkingOlga Lavrentieva
 
Ease of use in Apache Solr
Ease of use in Apache SolrEase of use in Apache Solr
Ease of use in Apache SolrAnshum Gupta
 
Search Analytics with Flume and HBase
Search Analytics with Flume and HBaseSearch Analytics with Flume and HBase
Search Analytics with Flume and HBaseSematext Group, Inc.
 
Apache HBase Application Archetypes
Apache HBase Application ArchetypesApache HBase Application Archetypes
Apache HBase Application ArchetypesCloudera, Inc.
 
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...Sematext Group, Inc.
 
Large Scale Log Analytics with Solr (from Lucene Revolution 2015)
Large Scale Log Analytics with Solr (from Lucene Revolution 2015)Large Scale Log Analytics with Solr (from Lucene Revolution 2015)
Large Scale Log Analytics with Solr (from Lucene Revolution 2015)Sematext Group, Inc.
 
Improvements to Flink & it's Applications in Alibaba Search
Improvements to Flink & it's Applications in Alibaba SearchImprovements to Flink & it's Applications in Alibaba Search
Improvements to Flink & it's Applications in Alibaba SearchDataWorks Summit/Hadoop Summit
 
From Zero to Hero - Centralized Logging with Logstash & Elasticsearch
From Zero to Hero - Centralized Logging with Logstash & ElasticsearchFrom Zero to Hero - Centralized Logging with Logstash & Elasticsearch
From Zero to Hero - Centralized Logging with Logstash & ElasticsearchSematext Group, Inc.
 
Numeric Range Queries in Lucene and Solr
Numeric Range Queries in Lucene and SolrNumeric Range Queries in Lucene and Solr
Numeric Range Queries in Lucene and SolrVadim Kirilchuk
 
Using Morphlines for On-the-Fly ETL
Using Morphlines for On-the-Fly ETLUsing Morphlines for On-the-Fly ETL
Using Morphlines for On-the-Fly ETLCloudera, Inc.
 

Viewers also liked (20)

Tuning Solr for Logs
Tuning Solr for LogsTuning Solr for Logs
Tuning Solr for Logs
 
Tuning Elasticsearch Indexing Pipeline for Logs
Tuning Elasticsearch Indexing Pipeline for LogsTuning Elasticsearch Indexing Pipeline for Logs
Tuning Elasticsearch Indexing Pipeline for Logs
 
Going All-In With Go For CLI Apps
Going All-In With Go For CLI AppsGoing All-In With Go For CLI Apps
Going All-In With Go For CLI Apps
 
Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker
Running High Performance & Fault-tolerant Elasticsearch Clusters on DockerRunning High Performance & Fault-tolerant Elasticsearch Clusters on Docker
Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker
 
Docker Logging Webinar
Docker Logging  WebinarDocker Logging  Webinar
Docker Logging Webinar
 
Top Node.js Metrics to Watch
Top Node.js Metrics to WatchTop Node.js Metrics to Watch
Top Node.js Metrics to Watch
 
Tuning Solr & Pipeline for Logs
Tuning Solr & Pipeline for LogsTuning Solr & Pipeline for Logs
Tuning Solr & Pipeline for Logs
 
Musings on Secondary Indexing in HBase
Musings on Secondary Indexing in HBaseMusings on Secondary Indexing in HBase
Musings on Secondary Indexing in HBase
 
MongoDB and Apache HBase: Benchmarking
MongoDB and Apache HBase: BenchmarkingMongoDB and Apache HBase: Benchmarking
MongoDB and Apache HBase: Benchmarking
 
Ease of use in Apache Solr
Ease of use in Apache SolrEase of use in Apache Solr
Ease of use in Apache Solr
 
Search Analytics with Flume and HBase
Search Analytics with Flume and HBaseSearch Analytics with Flume and HBase
Search Analytics with Flume and HBase
 
Docker Monitoring Webinar
Docker Monitoring  WebinarDocker Monitoring  Webinar
Docker Monitoring Webinar
 
Apache HBase Application Archetypes
Apache HBase Application ArchetypesApache HBase Application Archetypes
Apache HBase Application Archetypes
 
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
 
Large Scale Log Analytics with Solr (from Lucene Revolution 2015)
Large Scale Log Analytics with Solr (from Lucene Revolution 2015)Large Scale Log Analytics with Solr (from Lucene Revolution 2015)
Large Scale Log Analytics with Solr (from Lucene Revolution 2015)
 
Improvements to Flink & it's Applications in Alibaba Search
Improvements to Flink & it's Applications in Alibaba SearchImprovements to Flink & it's Applications in Alibaba Search
Improvements to Flink & it's Applications in Alibaba Search
 
Introduction to solr
Introduction to solrIntroduction to solr
Introduction to solr
 
From Zero to Hero - Centralized Logging with Logstash & Elasticsearch
From Zero to Hero - Centralized Logging with Logstash & ElasticsearchFrom Zero to Hero - Centralized Logging with Logstash & Elasticsearch
From Zero to Hero - Centralized Logging with Logstash & Elasticsearch
 
Numeric Range Queries in Lucene and Solr
Numeric Range Queries in Lucene and SolrNumeric Range Queries in Lucene and Solr
Numeric Range Queries in Lucene and Solr
 
Using Morphlines for On-the-Fly ETL
Using Morphlines for On-the-Fly ETLUsing Morphlines for On-the-Fly ETL
Using Morphlines for On-the-Fly ETL
 

Similar to Solr Anti Patterns

[제1회 루씬 한글분석기 기술세미나] solr로 나만의 검색엔진을 만들어보자
[제1회 루씬 한글분석기 기술세미나] solr로 나만의 검색엔진을 만들어보자[제1회 루씬 한글분석기 기술세미나] solr로 나만의 검색엔진을 만들어보자
[제1회 루씬 한글분석기 기술세미나] solr로 나만의 검색엔진을 만들어보자Donghyeok Kang
 
A noobs lesson on solr (configuration)
A noobs lesson on solr (configuration)A noobs lesson on solr (configuration)
A noobs lesson on solr (configuration)BTI360
 
Apache Solr Search Mastery
Apache Solr Search MasteryApache Solr Search Mastery
Apache Solr Search MasteryAcquia
 
Solr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by CaseSolr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by CaseAlexandre Rafalovitch
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with SolrErik Hatcher
 
Beyond full-text searches with Lucene and Solr
Beyond full-text searches with Lucene and SolrBeyond full-text searches with Lucene and Solr
Beyond full-text searches with Lucene and SolrBertrand Delacretaz
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with SolrErik Hatcher
 
Solr vs. Elasticsearch, Case by Case: Presented by Alexandre Rafalovitch, UN
Solr vs. Elasticsearch,  Case by Case: Presented by Alexandre Rafalovitch, UNSolr vs. Elasticsearch,  Case by Case: Presented by Alexandre Rafalovitch, UN
Solr vs. Elasticsearch, Case by Case: Presented by Alexandre Rafalovitch, UNLucidworks
 
Scaling Solr with SolrCloud
Scaling Solr with SolrCloudScaling Solr with SolrCloud
Scaling Solr with SolrCloudlucenerevolution
 
Cassandra summit
Cassandra summitCassandra summit
Cassandra summitmattstump
 
เกี่ยวกับ Apache solr 4.0
เกี่ยวกับ Apache solr 4.0เกี่ยวกับ Apache solr 4.0
เกี่ยวกับ Apache solr 4.0Somkiat Puisungnoen
 
Dev8d Apache Solr Tutorial
Dev8d Apache Solr TutorialDev8d Apache Solr Tutorial
Dev8d Apache Solr TutorialSourcesense
 
XamarinとAWSをつないでみた話
XamarinとAWSをつないでみた話XamarinとAWSをつないでみた話
XamarinとAWSをつないでみた話Takehito Tanabe
 
Spring Web Service, Spring Integration and Spring Batch
Spring Web Service, Spring Integration and Spring BatchSpring Web Service, Spring Integration and Spring Batch
Spring Web Service, Spring Integration and Spring BatchEberhard Wolff
 

Similar to Solr Anti Patterns (20)

[제1회 루씬 한글분석기 기술세미나] solr로 나만의 검색엔진을 만들어보자
[제1회 루씬 한글분석기 기술세미나] solr로 나만의 검색엔진을 만들어보자[제1회 루씬 한글분석기 기술세미나] solr로 나만의 검색엔진을 만들어보자
[제1회 루씬 한글분석기 기술세미나] solr로 나만의 검색엔진을 만들어보자
 
A noobs lesson on solr (configuration)
A noobs lesson on solr (configuration)A noobs lesson on solr (configuration)
A noobs lesson on solr (configuration)
 
Apache Solr Search Mastery
Apache Solr Search MasteryApache Solr Search Mastery
Apache Solr Search Mastery
 
Solr02 fields
Solr02 fieldsSolr02 fields
Solr02 fields
 
Solr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by CaseSolr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by Case
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
 
Beyond full-text searches with Lucene and Solr
Beyond full-text searches with Lucene and SolrBeyond full-text searches with Lucene and Solr
Beyond full-text searches with Lucene and Solr
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
 
Solr vs. Elasticsearch, Case by Case: Presented by Alexandre Rafalovitch, UN
Solr vs. Elasticsearch,  Case by Case: Presented by Alexandre Rafalovitch, UNSolr vs. Elasticsearch,  Case by Case: Presented by Alexandre Rafalovitch, UN
Solr vs. Elasticsearch, Case by Case: Presented by Alexandre Rafalovitch, UN
 
Scaling Solr with Solr Cloud
Scaling Solr with Solr CloudScaling Solr with Solr Cloud
Scaling Solr with Solr Cloud
 
Scaling Solr with SolrCloud
Scaling Solr with SolrCloudScaling Solr with SolrCloud
Scaling Solr with SolrCloud
 
XML Schemas
XML SchemasXML Schemas
XML Schemas
 
Cassandra summit
Cassandra summitCassandra summit
Cassandra summit
 
เกี่ยวกับ Apache solr 4.0
เกี่ยวกับ Apache solr 4.0เกี่ยวกับ Apache solr 4.0
เกี่ยวกับ Apache solr 4.0
 
Apache solr liferay
Apache solr liferayApache solr liferay
Apache solr liferay
 
Dev8d Apache Solr Tutorial
Dev8d Apache Solr TutorialDev8d Apache Solr Tutorial
Dev8d Apache Solr Tutorial
 
XamarinとAWSをつないでみた話
XamarinとAWSをつないでみた話XamarinとAWSをつないでみた話
XamarinとAWSをつないでみた話
 
Spring Web Service, Spring Integration and Spring Batch
Spring Web Service, Spring Integration and Spring BatchSpring Web Service, Spring Integration and Spring Batch
Spring Web Service, Spring Integration and Spring Batch
 
Eu odeio OpenSocial
Eu odeio OpenSocialEu odeio OpenSocial
Eu odeio OpenSocial
 
Create WSDL
Create WSDLCreate WSDL
Create WSDL
 

More from Sematext Group, Inc.

Tweaking the Base Score: Lucene/Solr Similarities Explained
Tweaking the Base Score: Lucene/Solr Similarities ExplainedTweaking the Base Score: Lucene/Solr Similarities Explained
Tweaking the Base Score: Lucene/Solr Similarities ExplainedSematext Group, Inc.
 
OOPs, OOMs, oh my! Containerizing JVM apps
OOPs, OOMs, oh my! Containerizing JVM appsOOPs, OOMs, oh my! Containerizing JVM apps
OOPs, OOMs, oh my! Containerizing JVM appsSematext Group, Inc.
 
Is observability good for your brain?
Is observability good for your brain?Is observability good for your brain?
Is observability good for your brain?Sematext Group, Inc.
 
Introducing log analysis to your organization
Introducing log analysis to your organization Introducing log analysis to your organization
Introducing log analysis to your organization Sematext Group, Inc.
 
Solr Search Engine: Optimize Is (Not) Bad for You
Solr Search Engine: Optimize Is (Not) Bad for YouSolr Search Engine: Optimize Is (Not) Bad for You
Solr Search Engine: Optimize Is (Not) Bad for YouSematext Group, Inc.
 
Solr on Docker - the Good, the Bad and the Ugly
Solr on Docker - the Good, the Bad and the UglySolr on Docker - the Good, the Bad and the Ugly
Solr on Docker - the Good, the Bad and the UglySematext Group, Inc.
 
Building Resilient Log Aggregation Pipeline with Elasticsearch & Kafka
Building Resilient Log Aggregation Pipeline with Elasticsearch & KafkaBuilding Resilient Log Aggregation Pipeline with Elasticsearch & Kafka
Building Resilient Log Aggregation Pipeline with Elasticsearch & KafkaSematext Group, Inc.
 
Elasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep diveElasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep diveSematext Group, Inc.
 
Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on DockerRunning High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on DockerSematext Group, Inc.
 
Metrics, Logs, Transaction Traces, Anomaly Detection at Scale
Metrics, Logs, Transaction Traces, Anomaly Detection at ScaleMetrics, Logs, Transaction Traces, Anomaly Detection at Scale
Metrics, Logs, Transaction Traces, Anomaly Detection at ScaleSematext Group, Inc.
 
Side by Side with Elasticsearch & Solr, Part 2
Side by Side with Elasticsearch & Solr, Part 2Side by Side with Elasticsearch & Solr, Part 2
Side by Side with Elasticsearch & Solr, Part 2Sematext Group, Inc.
 
Side by Side with Elasticsearch and Solr
Side by Side with Elasticsearch and SolrSide by Side with Elasticsearch and Solr
Side by Side with Elasticsearch and SolrSematext Group, Inc.
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersSematext Group, Inc.
 

More from Sematext Group, Inc. (19)

Tweaking the Base Score: Lucene/Solr Similarities Explained
Tweaking the Base Score: Lucene/Solr Similarities ExplainedTweaking the Base Score: Lucene/Solr Similarities Explained
Tweaking the Base Score: Lucene/Solr Similarities Explained
 
OOPs, OOMs, oh my! Containerizing JVM apps
OOPs, OOMs, oh my! Containerizing JVM appsOOPs, OOMs, oh my! Containerizing JVM apps
OOPs, OOMs, oh my! Containerizing JVM apps
 
Is observability good for your brain?
Is observability good for your brain?Is observability good for your brain?
Is observability good for your brain?
 
Introducing log analysis to your organization
Introducing log analysis to your organization Introducing log analysis to your organization
Introducing log analysis to your organization
 
Solr Search Engine: Optimize Is (Not) Bad for You
Solr Search Engine: Optimize Is (Not) Bad for YouSolr Search Engine: Optimize Is (Not) Bad for You
Solr Search Engine: Optimize Is (Not) Bad for You
 
Solr on Docker - the Good, the Bad and the Ugly
Solr on Docker - the Good, the Bad and the UglySolr on Docker - the Good, the Bad and the Ugly
Solr on Docker - the Good, the Bad and the Ugly
 
Monitoring and Log Management for
Monitoring and Log Management forMonitoring and Log Management for
Monitoring and Log Management for
 
Building Resilient Log Aggregation Pipeline with Elasticsearch & Kafka
Building Resilient Log Aggregation Pipeline with Elasticsearch & KafkaBuilding Resilient Log Aggregation Pipeline with Elasticsearch & Kafka
Building Resilient Log Aggregation Pipeline with Elasticsearch & Kafka
 
Elasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep diveElasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep dive
 
How to Run Solr on Docker and Why
How to Run Solr on Docker and WhyHow to Run Solr on Docker and Why
How to Run Solr on Docker and Why
 
Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on DockerRunning High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker
 
Metrics, Logs, Transaction Traces, Anomaly Detection at Scale
Metrics, Logs, Transaction Traces, Anomaly Detection at ScaleMetrics, Logs, Transaction Traces, Anomaly Detection at Scale
Metrics, Logs, Transaction Traces, Anomaly Detection at Scale
 
Side by Side with Elasticsearch & Solr, Part 2
Side by Side with Elasticsearch & Solr, Part 2Side by Side with Elasticsearch & Solr, Part 2
Side by Side with Elasticsearch & Solr, Part 2
 
(Elastic)search in big data
(Elastic)search in big data(Elastic)search in big data
(Elastic)search in big data
 
Side by Side with Elasticsearch and Solr
Side by Side with Elasticsearch and SolrSide by Side with Elasticsearch and Solr
Side by Side with Elasticsearch and Solr
 
Open Source Search Evolution
Open Source Search EvolutionOpen Source Search Evolution
Open Source Search Evolution
 
Elasticsearch and Solr for Logs
Elasticsearch and Solr for LogsElasticsearch and Solr for Logs
Elasticsearch and Solr for Logs
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud Clusters
 

Recently uploaded

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 

Recently uploaded (20)

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 

Solr Anti Patterns

  • 1.
  • 2. Solr Anti - patterns Rafał Kuć, Sematext Group, Inc. @kucrafal @sematext http://sematext.com
  • 3. About me Sematext consultant & engineer Solr.pl co-founder Father & husband
  • 4. The (not so) perfect migration http://en.wikipedia.org/wiki/Bird_migration http://www.likesbooks.com/aarafterhours/?p=750
  • 5. From 3.1 to 4.10 (and hopefully not back) March 2011 September 2014
  • 6. The lonely solrconfig.xml <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" /> <requestHandler name="/update/javabin" class="solr.BinaryUpdateRequestHandler" /> <requestHandler name="/update/csv" class="solr.CSVRequestHandler" /> <requestHandler name="/update/json" class="solr.JsonUpdateRequestHandler" /> <luceneMatchVersion>LUCENE_31</luceneMatchVersion> <directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
  • 8. And faulty indexing <?xml version="1.0" encoding="UTF-8"?> <response> <lst name="responseHeader"> <int name="status">400</int> <int name="QTime">0</int> </lst> <lst name="error"> <str name="msg">missing content stream</str> <int name="code">400</int> </lst> </response> 109173 [qtp1223685984-20] ERROR org.apache.solr.core.SolrCore ľ org.apache.solr.common.SolrException: missing content stream at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:69) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1967) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:777) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:368) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:647) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Unknown Source)
  • 9. Let’s make that right <requestHandler name="/update" class="solr.UpdateRequestHandler" /> <requestHandler name="/update/json" class="solr.UpdateRequestHandler"> <lst name="defaults"> <str name="stream.contentType">application/json</str> </lst> </requestHandler> <luceneMatchVersion>LUCENE_4.10.0</luceneMatchVersion> <directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/> <nrtMode>true</nrtMode> <updateLog> <str name="dir"> ${solr.ulog.dir:} </str> </updateLog>
  • 10. The old schema.xml <fieldType name="int" class="solr.IntField" omitNorms="true"/> <fieldType name="long" class="solr.LongField" omitNorms="true"/> <fieldType name="float" class="solr.FloatField" omitNorms="true"/> <fieldType name="double" class="solr.DoubleField" omitNorms="true"/> <fieldType name="date" class="solr.DateField" sortMissingLast="true" omitNorms="true"/> <fieldType name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/> <fieldType name="slong" class="solr.SortableLongField" sortMissingLast="true" omitNorms="true"/> <fieldType name="sfloat" class="solr.SortableFloatField" sortMissingLast="true" omitNorms="true"/> <fieldType name="sdouble" class="solr.SortableDoubleField" sortMissingLast="true" omitNorms="true"/>
  • 11. <fieldType name="int" class="solr.IntField" omitNorms="true"/> <fieldType name="long" class="solr.LongField" omitNorms="true"/> <fieldType name="float" class="solr.FloatField" omitNorms="true"/> <fieldType name="double" class="solr.DoubleField" omitNorms="true"/> <fieldType name="date" class="solr.DateField" sortMissingLast="true" omitNorms="true"/> <fieldType name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/> <fieldType name="slong" class="solr.SortableLongField" sortMissingLast="true" omitNorms="true"/> <fieldType name="sfloat" class="solr.SortableFloatField" sortMissingLast="true" omitNorms="true"/> <fieldType name="sdouble" class="solr.SortableDoubleField" sortMissingLast="true" omitNorms="true"/> The old schema.xml
  • 12. The new schema.xml <fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/> <fieldType name="float" class="solr.TrieFloatField" precisionStep="0" positionIncrementGap="0"/> <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/> <fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" positionIncrementGap="0"/> <fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/> <fieldType name="tint" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0"/> <fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" positionIncrementGap="0"/> <fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" positionIncrementGap="0"/> <fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" positionIncrementGap="0"/> <fieldType name="tdate" class="solr.TrieDateField" precisionStep="6" positionIncrementGap="0"/>
  • 13. Threads? What threads? <Set name="ThreadPool"> <New class="org.eclipse.jetty.util.thread.QueuedThreadPool"> <Set name="minThreads">10</Set> <Set name="maxThreads">200</Set> <Set name="detailedDump">false</Set> </New> </Set>
  • 15. Threads? What threads? <Set name="ThreadPool"> <New class="org.eclipse.jetty.util.thread.QueuedThreadPool"> <Set name="minThreads">10</Set> <Set name="maxThreads">200</Set> <Set name="detailedDump">false</Set> </New> </Set>
  • 16. OK, so now we can actually run queries <Set name="ThreadPool"> <New class="org.eclipse.jetty.util.thread.QueuedThreadPool"> <Set name="minThreads">10</Set> <Set name="maxThreads">10000</Set> <Set name="detailedDump">false</Set> </New> </Set>
  • 22. The ZooKeeper – production
  • 23. The ZooKeeper – production -DzkHost=zk1:2181,zk2:2181,zk3:2181
  • 24. The ZooKeeper – production -DzkHost=zk1:2181,zk2:2181,zk3:2181
  • 25. The ZooKeeper – production -DzkHost=zk1:2181,zk2:2181,zk3:2181
  • 26. The ZooKeeper – production -DzkHost=zk1:2181,zk2:2181,zk3:2181
  • 27. Let’s cache everything <filterCache class="solr.LRUCache" size="1048576" initialSize="1048576" autowarmCount="524288"/> <queryResultCache class="solr.LRUCache" size="1048576" initialSize="1048576" autowarmCount="524288"/><documentCache class="solr.LRUCache" size="1048576" initialSize="1048576" autowarmCount="0"/>
  • 28. And now let’s look at the warmup times
  • 29. And now let’s look at the warmup times
  • 30. OK, show us the way „Mr. Consultant” <filterCache class="solr.FastLRUCache" size="1024" initialSize="1024" autowarmCount="512"/> <queryResultCache class="solr.LRUCache" size="16000" initialSize="16000" autowarmCount="8000"/><documentCache class="solr.LRUCache" size="16384" initialSize="16384" autowarmCount="0"/>
  • 31. Let’s look at the warmup times again
  • 32. Let’s look at the warmup times again
  • 33. Bulks are for noobs Application Application Application Doc Doc Doc
  • 34. Bulks are for noobs Application Application Application Doc Doc Doc
  • 35. But let’s use bulks, just in case
  • 36. But let’s use bulks, just in case
  • 37. We need to refresh and hard commit <autoCommit> <maxTime>1000</maxTime> <openSearcher>true</openSearcher> </autoCommit> <autoSoftCommit> <maxTime>1000</maxTime> </autoSoftCommit>
  • 38. Maybe we should only refresh? <autoCommit> <maxTime>60000</maxTime> <openSearcher>false</openSearcher> </autoCommit> <autoSoftCommit> <maxTime>1000</maxTime> </autoSoftCommit>
  • 39. OK, let’s go easy with refreshing <autoCommit> <maxTime>60000</maxTime> <openSearcher>false</openSearcher> </autoCommit> <autoSoftCommit> <maxTime>30000</maxTime> </autoSoftCommit>
  • 40. But I really need all that data curl -XGET 'localhost:8983/solr/select?q=*:*&start=3000000&rows=100'
  • 41. <?xml version="1.0" encoding="UTF-8"?> <response> <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">9418</int> <lst name="params"> <str name="start">3000000</str> <str name="q">*:*</str> <str name="rows">100</str> </lst> </lst> <result name="response" numFound="3284000" start="3000000"> . . . </result> </response> But I really need all that data curl -XGET 'localhost:8983/solr/select?q=*:*&start=3000000&rows=100'
  • 42. <?xml version="1.0" encoding="UTF-8"?> <response> <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">9418</int> <lst name="params"> <str name="start">3000000</str> <str name="q">*:*</str> <str name="rows">5</str> </lst> </lst> <result name="response" numFound="3284000" start="3000000"> . . . </result> </response> But I really need all that data <?xml version="1.0" encoding="UTF-8"?> <response> <lst name="error"> <str name="msg">java.lang.OutOfMemoryError: Java heap space</str> <str name="trace">java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:796) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:448) . . . Caused by: java.lang.OutOfMemoryError: Java heap space . . . </str> <int name="code">500</int> </lst> </response> curl -XGET 'localhost:8983/solr/select?q=*:*&start=3000000&rows=100'
  • 43. But I really need all that data Query
  • 44. But I really need all that data
  • 45. But I really need all that data
  • 46. But I really need all that data Response
  • 47. Use the scroll Luke curl -XGET 'localhost:8983/solr/select?q=*:*&cursorMark=*&sort=score+desc,id+desc'
  • 48. Use the scroll Luke curl -XGET 'localhost:8983/solr/select?q=*:*&cursorMark=*&sort=score+desc,id+desc' <?xml version="1.0" encoding="UTF-8"?> <response> <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">189</int> <lst name="params"> <str name="sort">score desc,id desc</str> <str name="q">*:*</str> <str name="cursorMark">*</str> </lst> </lst> <result name="response" numFound="3284000" start="0"> <doc> ... </doc> . . . </result> <str name="nextCursorMark">AoIIP4AAACY5OTk5OTA=</str> </response>
  • 49. Use the scroll Luke curl -XGET 'localhost:8983/solr/select?q=*:*&sort=score+desc,id+desc &cursorMark=AoIIP4AAACY5OTk5OTA='
  • 50. Use the scroll Luke curl -XGET 'localhost:8983/solr/select?q=*:*&sort=score+desc,id+desc &cursorMark=AoIIP4AAACY5OTk5OTA=' <?xml version="1.0" encoding="UTF-8"?> <response> <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">184</int> <lst name="params"> <str name="sort">score desc,id desc</str> <str name="q">*:*</str> <str name="cursorMark">AoIIP4AAACY5OTk5OTA=</str> </lst> </lst> <result name="response" numFound="3284000" start="0"> <doc> ... </doc> . . . </result> <str name="nextCursorMark">AoIIP4AAACY5OTk5ODE=</str> </response>
  • 51. Limiting faceting, why bother? curl -XGET 'localhost:8983/solr/select?q=*:*&facet=true&facet.field=tag&… facet.limit=-1&facet.mincount=0'
  • 52. Limiting faceting, why bother? curl -XGET 'localhost:8983/solr/select?q=*:*&facet=true&facet.field=tag&… facet.limit=-1&facet.mincount=0' <?xml version="1.0" encoding="UTF-8"?> <response> <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">9967</int> <lst name="params"> ... </lst> </lst> <result name="response" numFound="3284000" start="0"> . . . </result> <lst name="facet_counts"> <lst name="facet_fields"> <lst name="tag"> ... </lst> </lst> </lst> </response>
  • 53. Limiting faceting, why bother? curl -XGET 'localhost:8983/solr/select?q=*:*&facet=true&facet.field=tag&… facet.limit=-1&facet.mincount=0' <?xml version="1.0" encoding="UTF-8"?> <response> . . . <lst name="error"> <str name="msg">Error while processing facet fields: java.lang.OutOfMemoryError: Java heap space</str> <str name="trace">org.apache.solr.common.SolrException: Error while processing facet fields: java.lang.OutOfMemoryError: Java heap space . . . Caused by: java.lang.OutOfMemoryError: Java heap space at org.apache.solr.request.SimpleFacets.getFieldCacheCounts(SimpleFacets.java:685) . . . </str> <int name="code">500</int> </lst> </response>
  • 54. Now let’s look at performance
  • 55. Now let’s look at performance
  • 56. Now let’s look at performance
  • 57. Now let’s look at performance
  • 58. Now let’s look at performance
  • 59. Magic happens with small changes curl -XGET 'localhost:8983/solr/select?q=*:*&facet=true&facet.field=tag&… facet.limit=100&facet.mincount=1'
  • 60. Magic happens with small changes
  • 61. Magic happens with small changes
  • 62. Magic happens with small changes
  • 63. Magic happens with small changes
  • 64. Magic happens with small changes
  • 65. Magic happens with small changes
  • 66. Magic happens with small changes
  • 70. We are hiring! Dig Search? Dig Analytics? Dig Big Data? Dig Performance? Dig Logging? Dig working with and in open – source? We’re hiring world – wide! http://sematext.com/about/jobs.html