Elasticsearch has quickly become a leading open source technology for scaling search and building document services. Many software providers rely on it to serve the needs of high-performance, production applications.
In this talk, we'll go deep on lessons learned from three years in production, scaling from a few shards to more than 100 spread across hundreds of nodes on AWS, to serve real-time queries against hundreds of millions of documents.
Attendees will learn:
* How to capacity plan for ES on AWS
* How to scale and reshard on AWS with zero downtime
* What AWS and ES metrics to collect and alert on
* Tips on day to day ES operations
Session sponsored by SignalFx.
2. What to Expect from the Session
• Elasticsearch (ES) usage at SignalFx
• What we use ES for
• How ES is deployed on AWS
• Backup/restore of ES on Amazon S3
• Important ES/AWS metrics to monitor; what to alert on
• ES capacity planning
• Zero-downtime re-sharding
• SignalFx metadata storage architecture overview
• Scaling up and zero-downtime re-sharding on AWS
10. Key Detectors
• High CPU usage, low free disk space
• Sustained high heap usage
• Master node availability
• Cluster state (green/yellow/red)
• Unassigned shards
• Thread pool rejections (search, bulk, and index are the most critical)
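The detectors above can be sketched as a simple evaluation function over a cluster-health snapshot. This is a hypothetical illustration: the thresholds and the shape of the metric arguments are made-up assumptions, not SignalFx's actual detector configuration, though the `status` and `unassigned_shards` fields match what the `_cluster/health` API returns.

```python
# Hypothetical sketch: derive alert conditions (like the detectors above)
# from a _cluster/health response plus node-level metrics.
# Thresholds below are illustrative assumptions.

def evaluate_health(health, heap_used_pct, cpu_pct, disk_free_pct):
    """Return the list of alert names that should fire for this snapshot."""
    alerts = []
    if cpu_pct > 90:
        alerts.append("high-cpu")
    if disk_free_pct < 15:
        alerts.append("low-disk")
    if heap_used_pct > 85:          # sustained high heap usage
        alerts.append("high-heap")
    if health["status"] != "green":  # cluster state green/yellow/red
        alerts.append("cluster-" + health["status"])
    if health.get("unassigned_shards", 0) > 0:
        alerts.append("unassigned-shards")
    return alerts
```

In practice each condition would be a separate detector with its own duration window, rather than a single point-in-time check.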
13. Capacity Factors
• Indexing
• CPU/IO utilization can be considerable
• Merges are CPU/IO intensive. Improved in ES 2.0
• Queries
• CPU load
• Memory load
15. Sizing Shards
• Create an index with one shard
• Simulate what you expect your indexing load to be; measure CPU/IO load, find where it breaks
• Do the same with queries
• Determine disk consumption (average document size)
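Once the single-shard benchmark gives you a breaking point, the shard count falls out of simple arithmetic. A minimal sketch, assuming you size for both indexing throughput and on-disk volume (all the input numbers in the example are made up):

```python
# Back-of-the-envelope shard sizing from single-shard benchmark results,
# following the steps above.
import math

def shards_needed(target_docs_per_s, max_docs_per_s_per_shard,
                  total_docs, avg_doc_bytes, max_shard_bytes):
    """Take the larger of the indexing-driven and disk-driven shard counts."""
    for_indexing = math.ceil(target_docs_per_s / max_docs_per_s_per_shard)
    for_disk = math.ceil(total_docs * avg_doc_bytes / max_shard_bytes)
    return max(for_indexing, for_disk)

# Example: 50k docs/s target, one shard breaks at 8k docs/s,
# 500M docs averaging 2 KB, capping shards at 30 GiB of data each:
shards = shards_needed(50_000, 8_000, 500_000_000, 2_048, 30 * 2**30)
```

Repeat the same exercise for query load; whichever dimension demands the most shards wins.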
17. Why Re-shard?
• Required if you can’t scale up indexing by adding more nodes
• If the index is read-only, you could implement a simpler approach using aliases
• If the index is being written to, it’s more complicated
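For the read-only case, the alias approach boils down to: create a new index with the desired shard count, copy the documents over, then atomically swap the alias with a single `_aliases` call. A sketch of the swap body (the index and alias names here are made up):

```python
# Build the body for POST /_aliases. Putting the remove and add in one
# actions list makes the swap atomic: readers see either the old index
# or the new one, never neither.

def alias_swap_body(alias, old_index, new_index):
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }

# e.g. POST /_aliases with alias_swap_body("docs", "docs-v1", "docs-v2")
```

Clients only ever query the alias, so they never need to know a re-shard happened.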
35. Handling Failures
• Bulk re-indexing can fail (and it does); you don’t want to restart from scratch
• Use a “partition” field
• Migrate partition ranges
• Deletions can be a problem; we handle them by writing “deletion markers” instead of deleting, then cleaning up afterwards
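The partition-field idea above can be sketched as a resumable, range-by-range copy. This is an illustrative simulation, not the actual migration code: the partition count, range size, and document shape are assumptions.

```python
# Sketch of resumable bulk re-indexing using a "partition" field.
# Progress is tracked per range, so a failed run resumes at the last
# completed range instead of restarting from scratch.

def partition_ranges(num_partitions, range_size):
    """Yield (lo, hi) partition ranges to migrate one at a time."""
    for lo in range(0, num_partitions, range_size):
        yield lo, min(lo + range_size, num_partitions)

def migrate(docs, completed_ranges, copy_batch):
    """Copy docs range by range, skipping ranges already marked complete.
    Docs carrying a deletion marker are skipped (cleaned up later)."""
    for lo, hi in partition_ranges(1024, 128):
        if (lo, hi) in completed_ranges:
            continue  # already migrated before the failure
        batch = [d for d in docs
                 if lo <= d["partition"] < hi and not d.get("deleted")]
        copy_batch(batch)
        completed_ranges.add((lo, hi))
```

In the real system the range query would run against the source index and `copy_batch` would issue a bulk request to the target; persisting `completed_ranges` is what makes the restart cheap.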
36. Performance Considerations
• Migrate using partition ranges to avoid holding segments for a long time
• Add temporary nodes to handle the load
• Disable refreshes on the target index (well worth it!)
• Start with no replicas (or just one, to be safe)
• Avoid “hot” shards by sorting on a field (a timestamp, for example)
• Add throttling controls to limit indexing load
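The refresh and replica tweaks above map directly onto index settings you'd PUT during and after the migration. `refresh_interval` and `number_of_replicas` are real Elasticsearch index settings; the exact values are the ones the slide suggests, and the restore values assume the default refresh interval.

```python
# Index-settings bodies for PUT /<target-index>/_settings.

MIGRATION_SETTINGS = {
    "index": {
        "refresh_interval": "-1",   # disable refreshes during the bulk copy
        "number_of_replicas": 0,    # build replicas after the copy finishes
    }
}

RESTORE_SETTINGS = {
    "index": {
        "refresh_interval": "1s",   # back to the default
        "number_of_replicas": 1,
    }
}
```

Restoring the settings at the end triggers one replica build and one refresh instead of paying those costs continuously throughout the copy.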