This document discusses scaling SolrCloud to support large numbers of document collections. It begins by introducing SolrCloud and some of its key capabilities and terminology. It then describes four problems that can arise at large scale: high cluster state load, overseer performance issues, inflexible data management, and limitations with data export. For each problem, solutions are proposed that were implemented in Apache Solr to improve scalability, such as splitting the cluster state, optimizing the overseer, enabling more flexible data splitting and migration, and allowing distributed deep paging exports. The document concludes by describing efforts to test SolrCloud at massive scale through automated tools and cloud infrastructure.
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
1. Scaling SolrCloud to a large
number of Collections
Shalin Shekhar Mangar, Lucidworks Inc.
shalin@apache.org
twitter.com/shalinmangar
2. Apache Solr has a huge install base and tremendous momentum.
Solr is the most widely used search solution on the planet.
• 8M+ total downloads
• 250,000+ monthly downloads
• Solr is both established and growing
• Tens of thousands of applications in production: you use Solr every day
• Largest community of developers; 2500+ open Solr jobs
3. Solr scalability is unmatched.
• box.com (Dropbox for business)
• 10TB+ Index Size
• 10 Billion+ Documents
• 100 Million+ Daily Requests
5. The traditional search use-case
• One large index distributed across multiple nodes
• A large number of users sharing the data
• Searches across the entire cluster
7. What is SolrCloud?
• A subset of optional features in Solr that enable and
simplify horizontal scaling of a search index
using sharding and replication
• Goals: scalability, performance, high availability,
simplicity, and elasticity
8. Terminology
• ZooKeeper: Distributed coordination service that provides centralised configuration,
cluster state management, and leader election
• Node: JVM process bound to a specific port on a machine
• Collection: Search index distributed across multiple nodes with same configuration
• Shard: Logical slice of a collection; each shard has a name, hash range, leader and
replication factor. Documents are assigned to one and only one shard per collection
using a hash-based document routing strategy
• Replica: A copy of a shard in a collection
• Overseer: A special node that executes cluster administration commands and writes
updated state to ZooKeeper; chosen by leader election, with automatic failover
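The hash-based routing strategy from the terminology above can be sketched in a few lines. Solr's real router hashes the document id with MurmurHash3 over the signed 32-bit space; the CRC32 stand-in and helper names here are illustrative only:

```python
# Sketch of hash-range document routing: the 32-bit hash space is cut
# into contiguous per-shard ranges, and each document lands in exactly
# one shard based on the hash of its id.
import zlib

def shard_ranges(num_shards):
    """Split the signed 32-bit hash space into equal contiguous ranges."""
    span = 2**32 // num_shards
    start = -2**31
    ranges = []
    for i in range(num_shards):
        end = start + span - 1 if i < num_shards - 1 else 2**31 - 1
        ranges.append((start, end))
        start = end + 1
    return ranges

def route(doc_id, ranges):
    """Assign a document to the one shard whose range covers its hash."""
    h = zlib.crc32(doc_id.encode()) - 2**31  # map into signed 32-bit space
    for shard, (lo, hi) in enumerate(ranges):
        if lo <= h <= hi:
            return shard
    raise AssertionError("ranges must cover the full hash space")

ranges = shard_ranges(2)
print(ranges)               # two contiguous ranges covering the space
print(route("doc-1", ranges))
```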
9. Collection with 2 shards across 4 nodes with replication factor 2
[Diagram: the logstash4solr collection split into shard1 and shard2.
shard1’s leader runs on node 1 (Jetty, port 8983) with a replica on
node 2 (port 8984); shard2’s leader runs on node 3 (port 8985) with a
replica on node 4 (port 8986). Each node is a Solr webapp in its own
JVM. A 3-node ZooKeeper ensemble provides leader election and
centralized configuration management. Clients access the cluster over
HTTP APIs (XML/JSON/CSV/PDF) from Java/Ruby/Python/PHP.
Millions of documents, millions of users.]
10. “The limits of the possible can only be
defined by going beyond them into the
impossible” — Arthur C. Clarke
11. The curious case of multi-tenant platforms
• Multi-tenant platform for storage and search
• Thousands of tenant applications
• Each tenant application has millions of users
12. One SolrCloud collection per tenant
• Searches are specialised to a user’s data or the
tenant application’s dataset
• Some tenants create a lot of data, others very little
• Some use CPU intensive geo-spatial queries, some
just perform simple full text searches and sorting
• Some are write-heavy, others read-heavy
• Some have text in a different natural language
13. Measure and optimise
• Analyze and find missing features
• Setup a performance testing environment on AWS
• Devise tests for stability and performance
• Find bugs and bottlenecks and fix ’em
14. Problem #1: Cluster state and updates
• The SolrCloud cluster state has information about the
collections, their shards and replicas
• All nodes and (Java) clients watch the cluster state
• Every state change is broadcast to all watching nodes
• Limited to (slightly less than) 1MB by default
• A single node bounce triggers a few hundred watcher fires and
pulls from ZK in a 100-node cluster (each replica transitions
through three states: down, recovering, active)
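The watch-storm arithmetic above can be sanity-checked with a rough model. The assumption here (one replica on the bounced node, every other node watching the shared state) is mine, not the deck's:

```python
# Back-of-the-envelope for the watch storm described above: with a
# single shared cluster state, every state transition fires a watch on
# every watching node. Numbers mirror the slide's 100-node example.
nodes = 100
transitions_per_bounce = 3          # down -> recovering -> active
watchers = nodes - 1                # every other node watches the state
watcher_fires = transitions_per_bounce * watchers
print(watcher_fires)  # 297: "a few hundred" pulls from ZooKeeper
```

With multiple replicas per bounced node, multiply again by the replica count, which is why this grows painful at thousands of collections.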
15. Solution - Split cluster state and scale
• Each collection gets its own state node in ZK
• Nodes selectively watch only the states of collections
they host
• Clients cache state and use smart cache updates
instead of watching nodes
• http://issues.apache.org/jira/browse/SOLR-5473
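The client-side "smart cache" idea can be sketched as follows. The class and the `fetch_state` callable are illustrative stand-ins, not SolrJ's actual API; the point is lazy refetch on a stale-state signal instead of a ZooKeeper watch per collection:

```python
# Sketch of the smart-cache idea from SOLR-5473: clients cache each
# collection's state and refetch lazily when told it is stale, instead
# of keeping a ZooKeeper watch per collection. fetch_state stands in
# for a read of the collection's state node in ZK.
class CollectionStateCache:
    def __init__(self, fetch_state):
        self._fetch = fetch_state      # callable: name -> (version, state)
        self._cache = {}               # name -> (version, state)

    def get(self, name):
        if name not in self._cache:
            self._cache[name] = self._fetch(name)
        return self._cache[name][1]

    def invalidate(self, name):
        """Called when a node replies that our cached state is stale."""
        self._cache.pop(name, None)

fetches = []
def fetch_state(name):
    fetches.append(name)
    return (len(fetches), {"name": name, "shards": {}})

cache = CollectionStateCache(fetch_state)
cache.get("tenant42")        # first access fetches from ZK
cache.get("tenant42")        # served from cache, no watch needed
cache.invalidate("tenant42") # stale-state reply from a node
cache.get("tenant42")        # forces a single refetch
print(len(fetches))  # 2
```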
16. Problem #2: Overseer performance
• Thousands of collections create a lot of state
updates
• Overseer falls behind and replicas can’t recover or
can’t elect a leader
• Under high indexing/search load, GC pauses can
cause the overseer queue to back up
17. Solution - Improve the overseer
• Harden the overseer code against ZooKeeper
connection loss (SOLR-5325)
• Optimise polling for new items in overseer queue
(SOLR-5436)
• Dedicated overseer nodes (SOLR-5476)
• New Overseer Status API (SOLR-5749)
• Asynchronous execution of collection commands
(SOLR-5477, SOLR-5681)
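The asynchronous execution flow works by submitting a collection command with an `async` request id and then polling `REQUESTSTATUS`. This sketch only builds the query parameters a client would send to `/admin/collections`; no HTTP is performed, and the collection and id names are made up:

```python
# Async collection-command flow (SOLR-5477): submit with an `async`
# request id, then poll REQUESTSTATUS with that id until completion.
def build_async_command(action, collection, request_id, **extra):
    params = {"action": action, "collection": collection,
              "async": request_id}
    params.update(extra)
    return params

def build_status_check(request_id):
    return {"action": "REQUESTSTATUS", "requestid": request_id}

# Submit a long-running split without tying up the overseer caller:
submit = build_async_command("SPLITSHARD", "tenant42", "split-001",
                             shard="shard1")
poll = build_status_check("split-001")
print(submit["async"], poll["action"])
```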
18. Problem #3: Moving data around
• Not all users are born equal: a tenant may have a
few very large users
• We wanted to be able to scale an individual user’s
data, maybe even as its own collection
• SolrCloud can split shards with no downtime but it
only splits in half
• No way to ‘extract’ a user’s data to another collection
or shard
19. Solution: Improved data management
• Shard can be split on arbitrary hash ranges
(SOLR-5300)
• Shard can be split by a given key (SOLR-5338,
SOLR-5353)
• A new ‘migrate’ API to move a user’s data to
another (new) collection without downtime
(SOLR-5308)
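Splitting on arbitrary hash ranges (SOLR-5300) means choosing your own sub-range boundaries instead of always halving. The sketch below formats a `ranges`-style parameter from chosen split points; the unsigned-hex encoding mirrors how Solr prints hash ranges, but treat the exact wire format as an assumption:

```python
# Sketch for SOLR-5300: split a shard's hash range at arbitrary
# points rather than always in half. Each split point ends one
# sub-range; sub-ranges are rendered as unsigned 32-bit hex.
def to_unsigned_hex(h):
    return format(h & 0xFFFFFFFF, "x")

def ranges_param(lo, hi, split_points):
    """Split [lo, hi] at the given points, covering it exactly."""
    bounds = [lo] + [p + 1 for p in sorted(split_points)] + [hi + 1]
    subs = [(bounds[i], bounds[i + 1] - 1)
            for i in range(len(bounds) - 1)]
    return ",".join(f"{to_unsigned_hex(a)}-{to_unsigned_hex(b)}"
                    for a, b in subs)

# Carve a small slice (e.g. one hot tenant's route key) out of a
# shard covering the upper half of the hash space:
print(ranges_param(0, 2**31 - 1, [0x10000000]))
```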
20. Problem #4: Exporting data
• Lucene/Solr are designed for finding top-N search
results
• Trying to export the full result set brings down the
system due to high memory requirements as you
page deeper
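The deck's solution slides for this problem are not reproduced here, but the overview mentions distributed deep-paging exports. The core idea is to carry a cursor (the sort value of the last document returned) instead of an ever-growing `start` offset. A toy in-memory analogue, with all names illustrative:

```python
# Cursor-style deep paging: each request resumes strictly after the
# last document seen, so page cost does not grow with depth the way
# start+rows offsets do.
def cursor_page(docs, after_id, rows):
    """Next `rows` docs sorted by id, strictly after `after_id`."""
    page = [d for d in sorted(docs, key=lambda d: d["id"])
            if d["id"] > after_id]
    return page[:rows]

docs = [{"id": i} for i in range(10)]
out, cursor = [], -1
while True:
    page = cursor_page(docs, cursor, rows=3)
    if not page:
        break
    out.extend(page)
    cursor = page[-1]["id"]  # resume point for the next request
print(len(out))  # 10: full export in fixed-size pages
```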
23. “Testing scale” at scale
• Performance goals: 6 billion documents, 4000 queries/
sec, 400 updates/sec, 2-second NRT sustained
performance
• 5% large collections (50 shards), 15% medium (10
shards), 80% small (1 shard), with a replication factor of 3
• Target hardware: 24 CPUs, 126G RAM, 7 SSDs (460G)
+ 1 HDD (200G)
• 80% traffic served by 20% of the tenants
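The shard budget implied by that mix can be computed directly. The 1000-collection count is taken from the results slide later in the deck, and the small tier is assumed to be 80% so the fractions sum to 100%:

```python
# Rough shard/core budget for the test mix described above.
collections = 1000
mix = [(0.05, 50),   # large: 50 shards each
       (0.15, 10),   # medium: 10 shards each
       (0.80, 1)]    # small: 1 shard each
replication_factor = 3
shards = sum(int(collections * frac) * n for frac, n in mix)
cores = shards * replication_factor
print(shards, cores)  # 4800 14400
```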
27. How to manage large SolrCloud clusters
• Developed Solr Scale Toolkit
• Fabric-based tool to set up and manage SolrCloud
clusters in AWS, complete with collectd and SiLK
• Backup/Restore from S3. Parallel clone commands.
• Open source!
• https://github.com/LucidWorks/solr-scale-tk
28. Gathering metrics and analysing logs
• LucidWorks SiLK (Solr + Logstash + Kibana)
• collectd daemons on each host
• RabbitMQ to queue messages before delivering to Logstash
• Initially started with Kafka but discarded it, thinking it was
overkill
• Not happy with RabbitMQ: crashes/unstable
• Might try Kafka again soon
• http://www.lucidworks.com/lucidworks-silk
29. Generating data and load
• Custom randomized data generator (reproducible
using a seed)
• JMeter for generating load
• Embedded CloudSolrServer (Solr Java client)
using JMeter Java Action Sampler
• JMeter distributed mode was itself a bottleneck!
• Not open source (yet) but we’re working on it!
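The reproducibility trick in such a generator is to derive everything from a seeded RNG, so a failing run can be replayed exactly. The generator described above is not open source yet, so this is a minimal sketch with made-up field names:

```python
# Sketch of a seed-reproducible randomized document generator: the
# same seed always yields the same documents.
import random

def generate_docs(seed, count):
    rng = random.Random(seed)  # isolated, seedable RNG
    words = ["solr", "cloud", "shard", "replica", "tenant"]
    for i in range(count):
        yield {
            "id": f"doc-{seed}-{i}",
            "text": " ".join(rng.choice(words) for _ in range(8)),
            "popularity": rng.randint(0, 100),
        }

run1 = list(generate_docs(42, 3))
run2 = list(generate_docs(42, 3))
print(run1 == run2)  # True: identical seed, identical docs
```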
30. Numb3rs
• 30 hosts, 120 nodes, 1000 collections, 8B+ docs,
15000 queries/second, 2000 writes/second, 2 second
NRT sustained over 24-hours
• More than 3x the numbers our client needed
• Unfortunately, we had to stop testing at that point :(
• Turned out they had a 95-5 traffic ratio rather than an
80-20 ratio, so actual performance is even better :)
• Our biggest cluster cost us just $120/hour :)
31. Not over yet
• We continue to test performance at scale
• Published indexing performance benchmark,
working on others
• 15 nodes, 30 shards, 1 replica, 157195 docs/sec
• 15 nodes, 30 shards, 2 replicas, 61062 docs/sec
• http://searchhub.org/introducing-the-solr-scale-toolkit/
32. Our users are also pushing the limits
https://twitter.com/bretthoerner/status/476830302430437376
33. Up, up and away!
https://twitter.com/bretthoerner/status/476838275106091008
34. Not over yet
• SolrCloud continues to be improved
• SOLR-6220 - Replica placement strategy
• SOLR-6273 - Cross data center replication
• SOLR-5656 - Auto-add replicas
• SOLR-5986 - Don’t allow runaway queries to harm
the cluster
• Many, many more