Strongly Consistent Global Indexes for Phoenix

Strongly Consistent Global Indexes for Nontransactional Tables
Designed by: Kadir Ozdemir
Presenter: Gokcen Iskender

Outline
● Background
● What is new for mutable global indexes
● What is new for immutable global indexes
● Correctness of the new approach
● Performance implications

Terminology
● Global - Indexed data is stored in a separate physical table from the base
table
● Immutable - Once data is written to the base table (and automatically
persisted to the index), no indexed column in a row will ever change (though it
may be deleted or age out due to a TTL setting)
● Mutable - Data can be freely changed.
● Mutation - Upserts and Deletes

Background - Global Mutable Indexes
[Diagram: on the application server, the application uses the Phoenix client and HBase client to send an upsert/delete batch of mutations to the region server for a data table region (Indexer, Region, WAL, HFile). Index updates then go to the region server for an index table region (Region, WAL, HFile). Steps are labeled 1 to 4.]

Background - Global Immutable Indexes
[Diagram: on the application server, the application uses the Phoenix client and HBase client to send an upsert/delete batch of mutations to the region server for a data table region and to the region servers for index table regions (each with Region, WAL, HFile).]

Global Indexes Can Get Out-of-Sync Easily!
MUTABLE Global Indexes
1. The Indexer goes through the data table mutations and prepares the corresponding mutations for the index tables.
2. It applies the mutations to the data table.
3. It applies the mutations to the index tables. These writes are usually remote, because index table regions are likely to be on other region servers, so they can fail due to RPC timeouts, network problems, region server failures, etc.
IMMUTABLE Global Indexes
1. Mutations are prepared on the client side.
2. Data table and index table mutations are sent to the region servers in parallel.
3. There is no deterministic order in which the mutations are applied, so the index and the data table can get out of sync.
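The mutable failure mode above can be illustrated with a toy model. This is only a sketch: the dictionaries stand in for HBase tables, and `old_upsert`, `IndexWriteFailed` and the `index_write_fails` switch are hypothetical names used for illustration, not Phoenix APIs.

```python
# Toy in-memory model (assumption): dicts stand in for the data and index tables.
data_table = {}    # pk -> {"C1": ..., "C3": ...}
index_table = {}   # (C1, pk) -> {"C3": ...}


class IndexWriteFailed(Exception):
    """Stands in for an RPC timeout or a region server failure."""


def old_upsert(pk, row, index_write_fails=False):
    # Old order: the data table is updated first ...
    data_table[pk] = row
    # ... then the index table, usually on another region server.
    if index_write_fails:
        raise IndexWriteFailed("index region unreachable")
    index_table[(row["C1"], pk)] = {"C3": row["C3"]}


old_upsert(1, {"C1": "A", "C3": "Y"})
try:
    old_upsert(2, {"C1": "B", "C3": "Z"}, index_write_fails=True)
except IndexWriteFailed:
    pass

# Data rows that have no matching index row.
missing = [pk for pk, r in data_table.items() if (r["C1"], pk) not in index_table]
print(missing)
```

Row 2 ends up in the data table without an index row, so a query served from the index would silently miss it.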

Consistent Global Index Design Objectives
● Global indexes should always be in sync with their data tables
● Consistency should not result in significant performance or latency impact
● Redesign should not require rewriting of existing Phoenix modules
● Consistent indexes should result in operational simplification by eliminating
index rebuilds
Phoenix JIRAs (PHOENIX-5156 and PHOENIX-5211)

Observations
● An index table row can always be reconstructed from the corresponding data
table row
● In HBase, writes are fast, so we can add an extra write phase without
severely impacting write performance
● Distributed two-phase commit protocols, i.e., transactions, are known to be
expensive. Existing solutions are in Beta.

New Design
● VERIFIED column on Index rows
● Reordered operations
● Extra write phase

Design Change for Mutable Global Indexes
Current Design
Write Path
● Update the data table
● Update the index tables (and hope for the best)
Read Path
● Read the index rows (and assume they are all good)
New Design
Write Path
● Update the index table rows with unverified status
● Update the data table
● Update the index table rows with verified status
Read Path
● Read the index rows and check their verified flag
● If a row is unverified, reconstruct the row from the data table
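The new write and read paths can be sketched as follows. This is a minimal in-memory model under stated assumptions: the dictionaries stand in for tables, `write`, `read_by_c1` and `stop_after_phase` are hypothetical names, and skipping an unverified index row that has no matching data row is a simplification chosen for the sketch.

```python
data_table = {}    # pk -> {"C1": ..., "C3": ...}
index_table = {}   # (C1, pk) -> {"C3": ..., "VERIFIED": bool}


def write(pk, row, stop_after_phase=3):
    """Three-phase write; stop_after_phase simulates a failure after that phase."""
    key = (row["C1"], pk)
    # Phase 1: write the index row as unverified.
    index_table[key] = {"C3": row["C3"], "VERIFIED": False}
    if stop_after_phase == 1:
        return
    # Phase 2: write the data table row.
    data_table[pk] = dict(row)
    if stop_after_phase == 2:
        return
    # Phase 3: mark the index row as verified.
    index_table[key]["VERIFIED"] = True


def read_by_c1(c1):
    """Read through the index; unverified rows are rebuilt from the data table."""
    results = []
    for (index_c1, pk), index_row in sorted(index_table.items()):
        if index_c1 != c1:
            continue
        if not index_row["VERIFIED"]:
            data_row = data_table.get(pk)
            if data_row is None or data_row["C1"] != c1:
                continue  # no matching data row: not returned (sketch assumption)
            index_row["C3"] = data_row["C3"]
            index_row["VERIFIED"] = True
        results.append((pk, index_row["C3"]))
    return results


write(1, {"C1": "A", "C3": "Y"})                      # all three phases succeed
write(2, {"C1": "A", "C3": "Z"}, stop_after_phase=1)  # data write never happened
write(3, {"C1": "A", "C3": "W"}, stop_after_phase=2)  # only the verify step is missing
rows = read_by_c1("A")
print(rows)
```

Row 2 is not returned because its data table write never happened, while row 3 is returned and its index row is repaired during the read.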

Design Change for Immutable Global Indexes
Current Design
Write Path
● Update the data table and the index tables in parallel (and hope for the best)
Read Path
● Read the index rows (and assume they are all good)
New Design (same as Mutable)
Write Path
● Update the index table rows with unverified status
● Update the data table
● Update the index table rows with verified status
Read Path
● Read the index rows and check their verified flag
● If a row is unverified, reconstruct the row from the data table

Global Mutable Indexes - Mutate
[Diagram: on the application server, the application uses the Phoenix client and HBase client to send an upsert/delete batch of mutations to the region server for a data table region (Indexer, Region, WAL, HFile), which interacts with two index table regions (each with Region, WAL, HFile). Steps are labeled 0 to 9.]
Global Mutable Indexes Batch Example - Update
Data Table:
Pk  C1  C2  C3
1   A   X   Y
Index (on C1, include C3):
Pk    C3
A, 1  Y
Update C1 from A to B:
1. Index table rows are written as unverified (index tables are updated in parallel)
Update - Put {{A, 1}, VERIFIED = false}
Insert - Put {{B, 1}, VERIFIED = false}
2. Data table write
3. Index table rows are set to verified or deleted
Delete {A, 1} (the delete is done in the third phase because, if it were done in the first phase and a later phase failed, we could not recover without a rebuild)
Put {{B, 1}, VERIFIED = true}
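The update example above can be traced step by step in a small model. This is a sketch only: the dictionaries mirror the example tables, and `update_c1`, `snapshot` and `trace` are hypothetical helper names.

```python
# Tables from the example: one data row and its index row on C1 (including C3).
data_table = {1: {"C1": "A", "C2": "X", "C3": "Y"}}
index_table = {("A", 1): {"C3": "Y", "VERIFIED": True}}
trace = []


def snapshot(label):
    # Record the index keys and their VERIFIED flags after each phase.
    state = sorted((key, row["VERIFIED"]) for key, row in index_table.items())
    trace.append((label, state))


def update_c1(pk, new_c1):
    old_row = data_table[pk]
    old_key, new_key = (old_row["C1"], pk), (new_c1, pk)
    # Phase 1: old index row becomes unverified, new index row is added unverified.
    index_table[old_key]["VERIFIED"] = False
    index_table[new_key] = {"C3": old_row["C3"], "VERIFIED": False}
    snapshot("phase 1")
    # Phase 2: data table write.
    data_table[pk] = {**old_row, "C1": new_c1}
    snapshot("phase 2")
    # Phase 3: delete the old index row and verify the new one.
    del index_table[old_key]
    index_table[new_key]["VERIFIED"] = True
    snapshot("phase 3")


update_c1(1, "B")
for label, state in trace:
    print(label, state)
```

Until phase 3 completes, both {A, 1} and {B, 1} exist as unverified rows, so a reader can always fall back to the data table to decide which one is current.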

Global Mutable Indexes Batch Example - Delete
Data Table:
Pk  C1  C2  C3
1   A   X   Y
Index (on C1, include C3):
Pk    C3
A, 1  Y
Delete row with Pk = 1:
1. Index table rows are set to unverified (index tables are updated in parallel)
Update - Put {{A, 1}, VERIFIED = false}
2. Delete data table row
Delete {1}
3. Delete index table row
Delete {A, 1}
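A failure between the second and third steps of the delete leaves a stale index row behind. The sketch below shows why that row is harmless to readers. It is an illustrative model only: `delete_row`, `visible_index_keys` and `stop_after_phase` are hypothetical names, and the visibility rule is a simplification of the read path described earlier.

```python
data_table = {1: {"C1": "A", "C2": "X", "C3": "Y"}}
index_table = {("A", 1): {"C3": "Y", "VERIFIED": True}}


def delete_row(pk, stop_after_phase=3):
    """Three-phase delete; stop_after_phase simulates a failure after that phase."""
    key = (data_table[pk]["C1"], pk)
    # Phase 1: mark the index row as unverified.
    index_table[key]["VERIFIED"] = False
    if stop_after_phase == 1:
        return
    # Phase 2: delete the data table row.
    del data_table[pk]
    if stop_after_phase == 2:
        return
    # Phase 3: delete the index row.
    del index_table[key]


def visible_index_keys():
    """Index rows a reader would return: verified rows, or unverified rows
    that still match a data table row."""
    visible = []
    for (c1, pk), index_row in sorted(index_table.items()):
        data_row = data_table.get(pk)
        if index_row["VERIFIED"] or (data_row is not None and data_row["C1"] == c1):
            visible.append((c1, pk))
    return visible


delete_row(1, stop_after_phase=2)   # the index delete never happens
leftover = list(index_table)
visible = visible_index_keys()
print(leftover, visible)
```

The unverified {A, 1} row is still stored, but because the data table no longer has the row, nothing is returned to the reader.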

Global Immutable Indexes - Mutate
[Diagram: on the application server, the application uses the Phoenix client and HBase client to send an upsert/delete batch of mutations to the region server for a data table region and to the region servers for index table regions (Region, WAL, HFile). Steps are labeled 1 to 3.]

Global Mutable & Immutable Indexes - Read
[Diagram: on the application server, a Select from the application becomes a Scan sent through the Phoenix client and HBase client to the region servers. Components shown on the region servers include the Scan Region Observer, Global Index Checker, Ungrouped Aggregate Region Observer and Indexer, alongside the data table and index table regions (Region, WAL, HFile). Steps are labeled 0 to 7.]
Correctness - Without concurrent updates
● VERIFIED = true => the index update completed after the data table update
● VERIFIED = false => the data is read from the data table
● A missing index row is not possible, because:
○ The index table is always updated before the data table, in strict order,
so having a row in the data table implies that the index table update has
been attempted.
○ If the index update fails, the data table update is not attempted. It is
therefore not possible to have a data table row without the corresponding
index row because of an index update failure.
○ An index row is deleted only after the corresponding data table row is
deleted, so data row deletes cannot cause a missing index row.
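The "no missing index row" argument can be checked mechanically on a toy model by stopping the write at every phase boundary. This is a sketch under simplifying assumptions: the data table stores only C1 per primary key, the index stores only the VERIFIED flag, a stop at phase 0 models a failed index update, and `upsert` and `missing_index_rows` are hypothetical names.

```python
def upsert(data_table, index_table, pk, c1, stop_after_phase=3):
    """Three-phase upsert on a toy model; stop_after_phase=0 means the
    first-phase index write failed, so nothing else is attempted."""
    if stop_after_phase == 0:
        return
    old_c1 = data_table.get(pk)
    changed = old_c1 is not None and old_c1 != c1
    # Phase 1: unverified index rows (old key if the indexed column changes).
    if changed:
        index_table[(old_c1, pk)] = False
    index_table[(c1, pk)] = False
    if stop_after_phase == 1:
        return
    # Phase 2: data table write.
    data_table[pk] = c1
    if stop_after_phase == 2:
        return
    # Phase 3: drop the old index row and verify the new one.
    if changed:
        del index_table[(old_c1, pk)]
    index_table[(c1, pk)] = True


def missing_index_rows(data_table, index_table):
    # Data rows whose (C1, pk) key has no index row at all.
    return [pk for pk, c1 in data_table.items() if (c1, pk) not in index_table]


results = {}
for stop in range(4):
    data_table = {1: "A"}
    index_table = {("A", 1): True}
    upsert(data_table, index_table, 2, "A", stop_after_phase=stop)  # new row
    upsert(data_table, index_table, 1, "B", stop_after_phase=stop)  # C1 changes
    results[stop] = missing_index_rows(data_table, index_table)
print(results)
```

Whichever phase the writes stop at, every data table row still has an index row, verified or not, which is the property the bullets above rely on.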

Correctness - With concurrent updates
● Detect the concurrent update and do not proceed with Phase 3
● Read repair reconstructs the index row from the data table
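One way to picture the two bullets above is the sketch below. The slides do not say how concurrent updates are detected, so the `in_flight` bookkeeping here is purely an assumption made for illustration; `Batch` and `read_by_c1` are hypothetical names, and the interleaving is driven by hand rather than by real threads.

```python
data_table = {}    # pk -> C1
index_table = {}   # (C1, pk) -> VERIFIED flag
in_flight = {}     # pk -> batches between phase 1 and phase 3 (assumed mechanism)


class Batch:
    def __init__(self, pk, c1):
        self.pk, self.c1, self.concurrent = pk, c1, False

    def phase1(self):
        others = in_flight.setdefault(self.pk, [])
        if others:
            # Another batch is updating the same row: flag every batch involved.
            self.concurrent = True
            for other in others:
                other.concurrent = True
        others.append(self)
        index_table[(self.c1, self.pk)] = False

    def phase2(self):
        data_table[self.pk] = self.c1

    def phase3(self):
        in_flight[self.pk].remove(self)
        if self.concurrent:
            return  # skip verification; readers will repair from the data table
        index_table[(self.c1, self.pk)] = True


def read_by_c1(c1):
    # Unverified index rows are checked against the data table.
    pks = []
    for (index_c1, pk), verified in sorted(index_table.items()):
        if index_c1 == c1 and (verified or data_table.get(pk) == c1):
            pks.append(pk)
    return pks


first, second = Batch(1, "A"), Batch(1, "B")
first.phase1()
second.phase1()
first.phase2()
second.phase2()   # the data table ends with C1 = "B"
first.phase3()
second.phase3()
print(index_table, read_by_c1("A"), read_by_c1("B"))
```

Both index rows stay unverified, so the reader resolves them against the data table and returns the row only under the value that actually won.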

Upgrade
● No schema change is needed, since the VERIFIED column is an existing empty column.
● It is advised to rebuild indexes after PHOENIX-5156 to make sure that the
index is always consistent for both old and new data.

Performance
Preliminary results:
● 25% increase in write latency
● No noticeable increase in read latency
Test Env:
● Data table with two indexes.
● 200K large rows on data table.
● 10 node AWS cluster
○ 4 core nodes, 2.3 GHz, 10 GB disk, 32 GB memory VMs

Resources
Phoenix Secondary Indexing:
http://phoenix.apache.org/secondary_indexing.html
PHOENIX-5018, PHOENIX-5190, PHOENIX-5156, PHOENIX-5211
Design doc:
https://docs.google.com/document/d/1Vsf23GCT0_CK4q8g_xaXyE_4Dw3aH71BfZypEy3T9iQ/edit?usp=sharing
kozdemir@salesforce.com
