About Me 
● Sr. Engineer at Pythian 
o Lead of Cassandra Practice 
#CassandraSummit 2014 
● Remote in Minnesota 
● Interests 
o Java, Clojure, Python dev 
o Data science 
o Information Security 
o Hobbyist electronics
About Pythian 
Pythian is a global data outsourcing and consulting company that 
specializes in optimizing and managing mission-critical data systems. 
Pythian blends the world’s leading data experts with advanced, secure 
service delivery processes to create the industry’s best standard of care 
for its clients. 
Since its inception, Pythian has managed some of the world’s largest, 
most business-critical data infrastructures. 
● Pythian currently manages more than 10,000 systems. 
● Pythian currently employs more than 350 people in 25 countries worldwide. 
● Pythian was founded in 1997.
About Cassandra 
● No Single Point of Failure 
● Fault Tolerant 
● Awesome properties for an operations team that does not want to get up at 3am
About Cassandra 
● Nothing should be set up and forgotten about 
● Easy to do with Cassandra though 
o Fault tolerance on a properly configured setup handles a single node being down or having temporary performance issues 
o No back pressure on writes until there is a lot of trouble
Utilize the fault tolerance buffer 
● Need to observe and react to current issues 
● Predict future issues 
● Divide this into two approaches 
o Proactive 
o Reactive
Proactive 
● Daily and weekly checkups to prevent and predict problems 
o Capacity 
o Performance bottlenecks 
o Data modeling issues
Reactive 
● Something about best laid plans… 
o Hardware failures 
o Bugs 
o Malicious or Non-Malicious users 
● Alarms, Pager Duty 
Common element 
● Data is needed 
o form alerts 
o find anomalies 
o trending 
o debugging
Metrics 
● Window to the application 
o Bridge the gap - Coda Hale 
Gathering Metrics 
SOURCES 
● Cassandra 
o OpsCenter 
o JMX 
o Nodetool 
● Environment 
o Logs 
o CPU, Disk, Network 
o JVM, GC
Metrics 
but of course… 
Without context, the data is just pretty graphs
JMX 
● Java Management Extensions 
● Complex… very engineered 
● Resources represented as objects with 
attributes and operations 
● Used for monitoring or as input 
JMX 
● The annoying gateway to metrics 
○ Poor tooling - requires java 
○ Slow, Memory Leaks 
○ Historically and currently frustrating for ops (pre 2.0.8) 
[Diagram: JMX connection between Client (You) and Cassandra] 
● Client inits connection to port 7199 
● Cassandra replies with hostname:port for the RMI connection (port in 1024-65535) 
● Client gets new hostname:port, drops old connection and attempts to connect 
● Connected!
JMX 
● Visual 
o jconsole 
o visualvm 
● Command line 
o jmxterm 
o jmxsh 
● MX4J 
● Jolokia
JMX 
[domain]:[key]=[value],[key2]=[value2]... 
com.pythian:site=blog,type=views,target=post1 
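The naming pattern above can be illustrated with a small parser. This is a minimal sketch, not how any JMX client does it: the function name `parse_object_name` is hypothetical, and it assumes a plain name with no quoted values or wildcard patterns.

```python
def parse_object_name(name):
    """Split '[domain]:[key]=[value],...' into (domain, {key: value}).

    Assumption: no quoted values and no wildcards, which real JMX
    object names are allowed to contain.
    """
    domain, _, props = name.partition(":")
    pairs = {}
    for item in props.split(","):
        key, _, value = item.partition("=")
        pairs[key.strip()] = value.strip()
    return domain, pairs


domain, keys = parse_object_name("com.pythian:site=blog,type=views,target=post1")
print(domain)        # the part before the colon
print(keys["type"])  # one of the key properties
```

The same split applies to the Cassandra beans later in the deck, where the domain is `org.apache.cassandra.metrics` and the keys are `type`, `scope`, `name` and so on.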
JMX Domains 
org.apache.cassandra. 
● db 
● internal 
● net 
● request 
JMX Beans 
org.apache.cassandra.metrics 
● db 
● internal 
● net 
● request 
JMX 
org.apache.cassandra.metrics :type= 
● Cache 
● Client 
● ClientRequest 
● ClientRequestMetrics 
● ColumnFamily 
● CommitLog 
● Compaction 
● DroppedMessage 
● FileCache 
● Keyspace 
● Storage 
● ThreadPools
JMX 
org.apache.cassandra.metrics 
type=*, scope=*, name=*, 
type=ThreadPools, path=*, scope=*, name=*, 
type=ColumnFamily, keyspace=*, scope=*, name=*, 
type=Keyspace, keyspace=*, name=*, 
Metrics 
● Toolkit called metrics for metrics 
o By Coda Hale @ Yammer 
● Easy to use 
● Popular 
Types of Metrics 
● Gauge 
o instantaneous value 
● Counter 
o number that can be incremented & decremented 
● Meter 
o rate of events over time (1/5/15 min moving avg) 
● Histogram 
o representation of statistical distribution 
§ 50, 75, 95, 98, 99, 99.9 percentile 
§ average, median, min, max, standard deviation 
● Timer 
o rate of events (meter) 
o histogram of duration
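To make the metric types concrete, here is a toy sketch of a counter and a histogram. It is not the Metrics library's implementation: the class names are illustrative, the histogram keeps every sample (real implementations use a bounded reservoir), percentiles use the simple nearest-rank rule, and the sample latencies are made up.

```python
import math


class Counter:
    """A number that can be incremented and decremented."""

    def __init__(self):
        self.count = 0

    def inc(self, n=1):
        self.count += n

    def dec(self, n=1):
        self.count -= n


class Histogram:
    """Statistical distribution of recorded values (keeps all samples)."""

    def __init__(self):
        self.samples = []

    def update(self, value):
        self.samples.append(value)

    def percentile(self, p):
        # Nearest-rank percentile: smallest value with at least p% of
        # the samples at or below it.
        ordered = sorted(self.samples)
        rank = max(1, math.ceil(p / 100.0 * len(ordered)))
        return ordered[rank - 1]


# Hypothetical latencies in microseconds, for illustration only.
latencies_us = Histogram()
for value in [120, 250, 683, 90, 400, 310, 75, 1500]:
    latencies_us.update(value)

print(latencies_us.percentile(75))  # 75% of samples took this long or less
```

A timer combines the two ideas: it counts events to derive a rate and feeds each event's duration into a histogram.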
JMX 
75th percentile is 683 MICROSECONDS 
(75% took 683us or less) 
One minute rate is 13,915 calls per SECOND
JMX 
● Overwhelming at first 
● Hard to tell what they mean without the source 
● Moves around a lot 
● Fortunately there is nodetool 
Nodetool 
● JMX command line wrapper 
● Many options 
● Operations and diagnostic procedures 
● For reactive analysis 
o ad hoc, spot checks 
Nodetool tpstats 
nodetool tpstats 
org.apache.cassandra.metrics:type=ThreadPools,path={internal|request|transport},scope={*},name={ActiveTasks|PendingTasks|CompletedTasks|CurrentlyBlockedTasks|TotalBlockedTasks} 
Pool Name                Active  Pending  Completed  Blocked  All time blocked 
ReadStage                     0        0     113702        0                 0 
RequestResponseStage          0        0          0        0                 0 
MutationStage                 0        0     164503        0                 0 
... 
InternalResponseStage         0        0          0        0                 0 
HintedHandoff                 0        0          0        0                 0 

Message type       Dropped 
RANGE_SLICE              0 
READ_REPAIR              0 
... 
REQUEST_RESPONSE         0 
COUNTER_MUTATION         0
Staged Event Driven Architecture 
● Decomposes complex event system 
● Set of stages (thread pools) 
● Queue between each 
● Shares many of the pros and cons of SOA
Staged Event Driven Architecture 
[Diagram: a client request moving as tasks between Node 1 and Node 2 through the ReadStage threads (x32), the Messaging Service, the RequestResponse threads, and the ReadRepairStage threads]
Staged Event Driven Architecture 
● It's easy to overrun the processing capabilities of a stage that is not in the request's feedback loop (e.g. ReadRepairStage).
● No write back pressure 
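The stage-and-queue idea can be sketched in a few lines. This is a toy model, not Cassandra's implementation: the `Stage` class, the stage names, and the handlers are all illustrative. Each stage is a small thread pool fed by its own unbounded queue, so a stage that falls behind simply accumulates pending tasks instead of pushing back on its producer.

```python
import queue
import threading


class Stage:
    """A named pool of worker threads fed by its own queue."""

    def __init__(self, name, handler, threads=2):
        self.name = name
        self.handler = handler
        self.tasks = queue.Queue()  # unbounded: producers are never blocked
        self.completed = 0
        self._lock = threading.Lock()
        for _ in range(threads):
            threading.Thread(target=self._work, daemon=True).start()

    def submit(self, task):
        self.tasks.put(task)

    def pending(self):
        return self.tasks.qsize()

    def _work(self):
        while True:
            task = self.tasks.get()
            try:
                self.handler(task)
                with self._lock:
                    self.completed += 1
            finally:
                self.tasks.task_done()


results = []
results_lock = threading.Lock()


def respond(task):
    with results_lock:
        results.append(task.upper())


# Two chained stages: the first hands its output to the second's queue.
response_stage = Stage("RequestResponse", respond, threads=2)
read_stage = Stage("Read", lambda key: response_stage.submit("row-" + key), threads=4)

for key in ["a", "b", "c"]:
    read_stage.submit(key)

read_stage.tasks.join()      # wait until every read task is processed
response_stage.tasks.join()  # then wait for the responses they produced
print(sorted(results), read_stage.completed, response_stage.pending())
```

The per-stage `pending()` and `completed` values are the kind of numbers a thread pool report exposes: a growing pending count on one stage shows where the pipeline is overrun.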
Nodetool tpstats 
nodetool tpstats 
org.apache.cassandra.metrics:type=ThreadPools,path={internal|request|transport},scope={*},name={ActiveTasks|PendingTasks|CompletedTasks|CurrentlyBlockedTasks|TotalBlockedTasks} 
Pool Name                Active  Pending  Completed  Blocked  All time blocked 
ReadStage                     0        0     113702        0                 0 
RequestResponseStage          0        0          0        0                 0 
MutationStage                 0        0     164503        0                 0 
... 
InternalResponseStage         0        0          0        0                 0 
HintedHandoff                 0        0          0        0                 0 

Message type       Dropped 
RANGE_SLICE              0 
READ_REPAIR              0 
... 
REQUEST_RESPONSE         1 
COUNTER_MUTATION         0
More at: 
http://www.evidencebasedit.com/guide-to-cassandra-thread-pools 
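For spot checks it can help to turn the tpstats report into data. The sketch below is one hedged way to do that: the function names are hypothetical, the sample text is a trimmed copy of the slide output with one dropped message, and the parser assumes the whitespace-separated layout shown on the slides (the layout differs between Cassandra versions).

```python
SAMPLE = """\
Pool Name                Active  Pending  Completed  Blocked  All time blocked
ReadStage                     0        0     113702        0                 0
RequestResponseStage          0        0          0        0                 0
MutationStage                 0        0     164503        0                 0

Message type       Dropped
RANGE_SLICE              0
REQUEST_RESPONSE         1
"""

POOL_FIELDS = ["active", "pending", "completed", "blocked", "all_time_blocked"]


def parse_tpstats(text):
    """Return (pools, dropped) dictionaries from tpstats-style text."""
    pools, dropped = {}, {}
    for line in text.splitlines():
        parts = line.split()
        # Pool rows: a name followed by five integer columns.
        if len(parts) == 6 and all(p.isdigit() for p in parts[1:]):
            pools[parts[0]] = dict(zip(POOL_FIELDS, map(int, parts[1:])))
        # Dropped-message rows: a message type followed by one integer.
        elif len(parts) == 2 and parts[1].isdigit():
            dropped[parts[0]] = int(parts[1])
    return pools, dropped


def needs_attention(pools, dropped):
    """Names with pending or blocked tasks, or with dropped messages."""
    flagged = [name for name, p in pools.items()
               if p["pending"] > 0 or p["blocked"] > 0]
    flagged += [name for name, count in dropped.items() if count > 0]
    return flagged


pools, dropped = parse_tpstats(SAMPLE)
print(needs_attention(pools, dropped))
```

Header lines and the `...` elisions are skipped because they do not match either row shape.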
Nodetool cfhistograms 
nodetool cfhistograms {keyspace} {table} 
org.apache.cassandra.metrics:type=ColumnFamily,keyspace={keyspace},scope={table} 
SSTables per Read 
1 sstables: 98554 
2 sstables: 4534 
Write Latency (microseconds) 
No Data 
Read Latency (microseconds) 
10 us: 2 
12 us: 17 
14 us: 96 
17 us: 208 
20 us: 677 
24 us: 3081 
29 us: 4552 
35 us: 3559
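Bucket counts like these can be reduced to percentiles. The following is a sketch under stated assumptions: the function name is hypothetical, each offset is treated as the upper bound of its bucket, and only the read latency rows visible on the slide are used, so the results describe this excerpt rather than a full histogram.

```python
import math

# (bucket offset in microseconds, number of reads) as shown on the slide.
READ_LATENCY = [(10, 2), (12, 17), (14, 96), (17, 208),
                (20, 677), (24, 3081), (29, 4552), (35, 3559)]


def bucket_percentile(buckets, p):
    """Offset of the bucket that contains the p-th percentile."""
    total = sum(count for _, count in buckets)
    rank = math.ceil(p / 100.0 * total)  # nearest-rank position
    seen = 0
    for offset, count in buckets:
        seen += count
        if seen >= rank:
            return offset
    return buckets[-1][0]


for p in (50, 95):
    print(p, bucket_percentile(READ_LATENCY, p))
```

Because the answer is always a bucket offset, the precision of the percentile is limited by the bucket width at that point of the distribution.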
Read Write Path mile high overview 
[Diagram: writes and reads flowing through the Memtable and SSTables]
Nodetool cfhistograms 1.1 
nodetool cfhistograms {keyspace} {table} 
org.apache.cassandra.metrics:type=ColumnFamily,keyspace={keyspace},scope={table} 
Offset    SSTables  Write Latency  Read Latency  Row Size  Column Count 
1             3579              0             0         0             0 
2                0              0             0         0             0 
. . . 
35               0              0             0         0             0 
42               0              0            27         0             0 
50               0              0           187         0             0 
60               0             10           460         0             0 
72               0            200           689         0             0 
86               0            663           552         0             0 
103              0            796           367         0             0 
124              0            297           736         0             0 
149              0            265           243         0             0 
179              0            460           263         0             0 
. . . 
25109160         0              0             0         0             0
Nodetool cfhistograms 
https://gist.github.com/clohfink/6068003
Nodetool cfhistograms 2.1 
nodetool cfhistograms {keyspace} {table} 
org.apache.cassandra.metrics:type=ColumnFamily,keyspace={keyspace},scope={table} 
Keyspace/Table histograms 
Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count 
                           (micros)      (micros)         (bytes) 
50%             1.00          10.00        524.00             310           5 
75%             1.00          11.75        888.00             310           5 
95%             1.00          15.00       4843.75             310           5 
98%             1.00          17.00       9658.90             310           5 
99%             1.00          19.00      12306.47             310           5 
Min             0.00           0.00         68.00              30           0 
Max             2.00     1219386.00      45383.00             310           5
Nodetool cfstats 
nodetool cfstats {-i} {keyspace}.{table} 
org.apache.cassandra.metrics:type=ColumnFamily,keyspace={keyspace},scope={table} 
Keyspace: Keyspace1 
Read Count: 11207 
Read Latency: 0.047931114482020164 ms. 
Write Count: 17598 
Write Latency: 0.053502954881236506 ms. 
Pending Tasks: 0 
Table: Standard1 
SSTable count: 3 
Space used (live), bytes: 9088955 
Space used (total), bytes: 9088955 
Space used by snapshots (total), bytes: 0 
SSTable Compression Ratio: 0.3672150946 
Memtable cell count: 0 
Memtable data size, bytes: 0 
Memtable switch count: 3 
Local read count: 11207 
Local read latency: 0.048 ms 
Local write count: 17598 
Local write latency: 0.054 ms 
Pending tasks: 0 
Bloom filter false positives: 0 
Bloom filter false ratio: 0.00000 
Bloom filter space used, bytes: 11688 
Compacted partition minimum bytes: 1110 
Compacted partition maximum bytes: 126934 
Compacted partition mean bytes: 2730 
Average live cells per slice: 0.0 
Average tombstones per slice: 0.0 
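The `key: value` layout of cfstats lends itself to quick scripted checks. The sketch below is illustrative only: the function names are hypothetical, the sample is a subset of the output above, and the thresholds are placeholder values to tune per table rather than recommendations.

```python
SAMPLE = """\
Table: Standard1
SSTable count: 3
Local read latency: 0.048 ms
Local write latency: 0.054 ms
Compacted partition maximum bytes: 126934
Average tombstones per slice: 0.0
"""


def parse_cfstats(text):
    """Map each 'key: value' line to a dictionary entry (values as text)."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


def number(stats, key):
    """Leading numeric part of a value such as '0.048 ms'."""
    return float(stats[key].split()[0])


def warnings(stats, max_sstables=20, max_partition_bytes=100 * 1024 * 1024):
    # Placeholder thresholds (assumptions): adjust for the table in question.
    found = []
    if number(stats, "SSTable count") > max_sstables:
        found.append("SSTable count")
    if number(stats, "Compacted partition maximum bytes") > max_partition_bytes:
        found.append("Compacted partition maximum bytes")
    return found


stats = parse_cfstats(SAMPLE)
print(number(stats, "Local read latency"), warnings(stats))
```

Running the same check daily and keeping the parsed values gives the trending data the proactive approach relies on.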
Nodetool cfstats 
nodetool cfstats {-i} {keyspace}.{table} 
org.apache.cassandra.metrics:type=ColumnFamily,keyspace={keyspace},scope={table} 
Keyspace: Keyspace1 
Read Count: 11207 
Read Latency: 0.047931114482020164 ms. 
Write Count: 17598 
Write Latency: 0.053502954881236506 ms. 
Pending Tasks: 0 
Table: Standard1 
SSTable count: 3 
SSTables in each level: [14/4, 1, 0, …, 0] 
Space used (live), bytes: 9088955 
Space used (total), bytes: 9088955 
Space used by snapshots (total), bytes: 0 
SSTable Compression Ratio: 0.3672150946 
Memtable cell count: 0 
Memtable data size, bytes: 0 
Memtable switch count: 3 
Local read count: 11207 
Local read latency: 0.048 ms 
Local write count: 17598 
Local write latency: 0.054 ms 
Pending tasks: 0 
Bloom filter false positives: 0 
Bloom filter false ratio: 0.00000 
Bloom filter space used, bytes: 11688 
Compacted partition minimum bytes: 1110 
Compacted partition maximum bytes: 126934 
Compacted partition mean bytes: 2730 
Average live cells per slice: 0.0 
Average tombstones per slice: 0.0 
Nodetool proxyhistograms 
nodetool proxyhistograms 
org.apache.cassandra.metrics:type=ClientRequest,scope={Read|Write|RangeSlice},name=Latency 
$ nodetool proxyhistograms 
proxy histograms 
Read Latency (microseconds) 
61214 us: 1 
Write Latency (microseconds) 
103 us: 22 
124 us: 142 
149 us: 297 
179 us: 1190 
215 us: 1823 
258 us: 2091 
...
Nodetool compactionstats 
nodetool compactionstats 
org.apache.cassandra.metrics:type=Compaction 
pending tasks: 1 
compaction type  keyspace   table      completed  total     unit   Progress 
Compaction       Keyspace1  Standard1  6076415    29605054  bytes  20.06% 
Active compaction remaining time : 0h00m03s
Nodetool 
Much more!! 
http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsNodetool_r.html
OpsCenter 
● Provides visibility to key metrics 
● Alarming 
● Basic orchestration and config management 
● Constantly improving 
● Free* 
● Almost zero barrier to get setup 
● Very few reasons not to run it 
OpsCenter 
● Homogeneous tooling with rest of stack 
o Integrate metrics in with what app is using 
o orchestration and config management 
● (paid version) “Good enough” 
o a mature environment should have more 
Reporting Interface 
● Default: JMX, Console, Csv, Slf4j 
● Addons: Ganglia, Graphite 
● Community: Cassandra, StatsD, NewRelic, Splunk, Cloudwatch, Kafka, Riemann, TempDB, Munin, Riak, InfluxDB, Sematext, MongoDB, OpenTSDB, Librato, … MORE
Reporting Interface 
● Configurable with yaml 
o console, csv, ganglia, graphite 
● Create reporter with premain agent 
o compiling new jar with manifest 
o add to classpath 
o add javaagent in cassandra-env.sh 
Garbage Collection 
● Death, Taxes, and a stop the world GC 
● Common issue to all JVM based applications 
Garbage Collection 
Enable gc logging 
● Virtually no overhead 
● Can be very helpful in diagnosing 
performance issues 
Garbage Collection 
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails" 
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps" 
JVM_OPTS="$JVM_OPTS -XX:+PrintHeapAtGC" 
JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution" 
JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime" 
JVM_OPTS="$JVM_OPTS -XX:+PrintPromotionFailure" 
JVM_OPTS="$JVM_OPTS -XX:PrintFLSStatistics=1" 
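# Two alternative -Xloggc targets: a timestamped file per start, or a fixed name for use with rotation 
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc-`date +%s`.log" 
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log" 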
JVM_OPTS="$JVM_OPTS -XX:+UseGCLogFileRotation" 
JVM_OPTS="$JVM_OPTS -XX:NumberOfGCLogFiles=10" 
JVM_OPTS="$JVM_OPTS -XX:GCLogFileSize=10M" 
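With `-XX:+PrintGCApplicationStoppedTime` enabled, the GC log contains lines reporting how long application threads were stopped. The sketch below pulls those durations out so long pauses stand out. The function names and the 0.5 second threshold are illustrative assumptions, and the sample lines are made-up values written in the shape this flag produces.

```python
import re

STOPPED = re.compile(
    r"Total time for which application threads were stopped: ([0-9.]+) seconds")


def stopped_times(lines):
    """Pause durations in seconds, one per matching log line."""
    return [float(m.group(1)) for m in map(STOPPED.search, lines) if m]


def long_pauses(lines, threshold_s=0.5):
    """Pauses at or above the threshold (placeholder default of 0.5 s)."""
    return [t for t in stopped_times(lines) if t >= threshold_s]


# Hypothetical log excerpt, for illustration only.
sample_log = [
    "2014-09-10T12:00:01.100+0000: 10.500: Total time for which application threads were stopped: 0.0123450 seconds",
    "2014-09-10T12:00:05.900+0000: 15.300: [GC 10.2: [ParNew: 100K->10K(200K), 0.0100000 secs]",
    "2014-09-10T12:00:09.000+0000: 18.400: Total time for which application threads were stopped: 1.2500000 seconds",
]

print(stopped_times(sample_log))
print(long_pauses(sample_log))
```

In practice the lines would come from the file named by `-Xloggc`, for example by iterating over `open("/var/log/cassandra/gc.log")`.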
Garbage Collection 
Could be its own talk 
Honorable mentions: 
● https://github.com/chewiebug/GCViewer 
● http://jworks.idv.tw/GcWeb/ 
● Python, R, Octave 
Logging 
/var/log/cassandra/system.log 
o provides a rolling log 
o log4j 
/var/log/cassandra/output.log 
o captured standard error and standard out 
o truncated on restart 
System Logs 
o syslog, dmesg, etc
OS Metrics 
Shout-out: 
http://www.brendangregg.com/linuxperf.html
JVM 
● Heap 
o GC logs 
o JMX 
● Threads 
o jvmtop 
o Jstack (+htop) 
o kill -3 
o JMX
And Everything 
Questions 
? 
#CassandraSummit 2014

Cassandra Summit 2014: Monitor Everything!

  • 9. Metrics ● Window to the application o Bridge the gap - Coda Hale #CassandraSummit 2014
  • 10. Gathering Metrics SOURCES Cassandra Environment OpsCenter Logs JMX CPU, Disk, Network Nodetool JVM, GC #CassandraSummit 2014
  • 11. Metrics but of course… Without context, the data is just pretty graphs
  • 12. JMX ● Java Management Extensions ● Complex… very engineered ● Resources represented as objects with attributes and operations ● Used for monitoring or as input #CassandraSummit 2014
  • 13. JMX ● The annoying gateway to metrics ○ Poor tooling - requires java ○ Slow, Memory Leaks ○ Historically and currently frustrating for ops (pre 2.0.8) Cassandra Init connection to port 7199 Reply with hostname:port for 1024-65535 #CassandraSummit 2014 RMI connection Client (You) Gets new hostname:port, drops old connection and attempts to connect 7199 7199 Connected!
  • 14. JMX #CassandraSummit 2014 ● Visual o jconsole o visualvm ● Command line o jmxterm o jmxsh ● MX4J ● Jolokia
  • 19. JMX Domains org.apache.cassandra. ● db ● internal ● net ● request #CassandraSummit 2014
  • 20. JMX Beans org.apache.cassandra.metrics ● db ● internal ● net ● request #CassandraSummit 2014
  • 21. JMX org.apache.cassandra.metrics :type= ● Cache ● Client ● ClientRequest ● ClientRequestMetrics ● ColumnFamily ● CommitLog ● Compaction #CassandraSummit 2014 ● DroppedMessage ● FileCache ● Keyspace ● Storage ● ThreadPools
  • 22. JMX org.apache.cassandra.metrics type=*, scope=*, name=*, type=ThreadPools, path=*, scope=*, name=*, type=ColumnFamily, keyspace=*, scope=*, name=*, type=Keyspace, keyspace=*, name=*, #CassandraSummit 2014
  • 23. Metrics ● Toolkit called metrics for metrics o By Coda Hale @ Yammer ● Easy to use ● Popular #CassandraSummit 2014
  • 24. Types of Metrics #CassandraSummit 2014 ● Gauge o instantaneous value ● Counter o number that can be incremented & decremented ● Meter o rate of events over time (1/5/15 min moving avg) ● Histogram o representation of statistical distribution § 50, 75, 95, 98, 99, 99.9 percentile § average, median, min, max, standard deviation ● Timer o rate of events (meter) o histogram of duration
  • 25. JMX #CassandraSummit 2014 75th percentile is 683 MICROSECONDS (75% took 683us or less) One minute rate is 13,915 calls per SECOND
  • 26. JMX ● Overwhelming at first ● Hard to tell what they mean without the source ● Moves around a lot ● Fortunately there is nodetool #CassandraSummit 2014
  • 27. Nodetool ● JMX command line wrapper ● Many options ● Operations and diagnostic procedures ● For reactive analysis o ad hoc, spot checks #CassandraSummit 2014
  • 28. Nodetool tpstats
    nodetool tpstats
    org.apache.cassandra.metrics:type=ThreadPools,path={internal|request|transport},scope={*},name={ActiveTasks|PendingTasks|CompletedTasks|CurrentlyBlockedTasks|TotalBlockedTasks}

    Pool Name                    Active   Pending      Completed   Blocked  All time blocked
    ReadStage                         0         0         113702         0                 0
    RequestResponseStage              0         0              0         0                 0
    MutationStage                     0         0         164503         0                 0
    ...
    InternalResponseStage             0         0              0         0                 0
    HintedHandoff                     0         0              0         0                 0

    Message type           Dropped
    RANGE_SLICE                  0
    READ_REPAIR                  0
    ...
    REQUEST_RESPONSE             0
    COUNTER_MUTATION             0
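For spot checks it can help to turn this output into data. The sketch below parses text shaped like the slide's tpstats output and flags pools with pending or blocked tasks and any dropped messages. The function names and the `max_pending` threshold are illustrative assumptions, and the column layout is assumed to match the slide; other Cassandra versions may print it differently.

```python
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         0         0         113702         0                 0
RequestResponseStage              0         0              0         0                 0
MutationStage                     0         0         164503         0                 0
InternalResponseStage             0         0              0         0                 0
HintedHandoff                     0         0              0         0                 0

Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR                  0
REQUEST_RESPONSE             0
COUNTER_MUTATION             0
"""

POOL_COLUMNS = ("active", "pending", "completed", "blocked", "all_time_blocked")


def parse_tpstats(text):
    """Return ({pool: {column: int}}, {message_type: dropped_count})."""
    pools, dropped = {}, {}
    section = None
    for line in text.splitlines():
        fields = line.split()
        if line.startswith("Pool Name"):
            section = "pools"
        elif line.startswith("Message type"):
            section = "dropped"
        elif section == "pools" and len(fields) == 6:
            pools[fields[0]] = dict(zip(POOL_COLUMNS, map(int, fields[1:])))
        elif section == "dropped" and len(fields) == 2:
            dropped[fields[0]] = int(fields[1])
    return pools, dropped


def find_problems(pools, dropped, max_pending=0):
    """List human-readable warnings; max_pending is a site-specific choice."""
    problems = []
    for name, stats in pools.items():
        if stats["pending"] > max_pending:
            problems.append(f"{name}: {stats['pending']} pending")
        if stats["blocked"] > 0:
            problems.append(f"{name}: {stats['blocked']} blocked")
    for message_type, count in dropped.items():
        if count > 0:
            problems.append(f"dropped {message_type}: {count}")
    return problems


pools, dropped = parse_tpstats(SAMPLE)
print(find_problems(pools, dropped))
```

In practice the text would come from running `nodetool tpstats` on the node; a brief non-zero pending count is normal under load, so the threshold should be tuned rather than left at zero.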
  • 29. Staged Event Driven Architecture ● Decomposes a complex event-driven system ● Set of stages (thread pools) ● Queue between each ● Shares many of the pros and cons of SOA
  • 30. Staged Event Driven Architecture [Diagram. Labels: Client Request; ReadStage (threads x32); RequestResponse (threads); ReadRepairStage (threads); Messaging Service; Node 1; Node 2. Legend: = Task]
  • 31. Staged Event Driven Architecture ● It's easy to overrun the processing capabilities of a stage that is not in the request's feedback loop (e.g. ReadRepairStage). ● No write back pressure
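The effect of a queue with no back pressure can be shown with a toy model. This is a sketch under simplifying assumptions: each thread completes one task per tick, tasks arrive at a constant rate, and the pool size of 4 for the overrun stage is invented for the example (the 32 read threads come from the earlier diagram). It is not how Cassandra schedules work internally.

```python
from collections import deque


class Stage:
    """One SEDA stage: an unbounded queue drained by a fixed-size thread pool."""

    def __init__(self, name, threads):
        self.name = name
        self.threads = threads
        self.queue = deque()
        self.completed = 0

    def submit(self, task):
        # No back pressure: the producer is never blocked or rejected.
        self.queue.append(task)

    def run_tick(self):
        # Each thread finishes at most one queued task per tick.
        for _ in range(min(self.threads, len(self.queue))):
            self.queue.popleft()
            self.completed += 1

    @property
    def pending(self):
        return len(self.queue)


def simulate(stage, arrivals_per_tick, ticks):
    """Feed the stage at a constant rate and record pending after each tick."""
    history = []
    for tick in range(ticks):
        for i in range(arrivals_per_tick):
            stage.submit((tick, i))
        stage.run_tick()
        history.append(stage.pending)
    return history


slow = Stage("ReadRepairStage", threads=4)    # hypothetical pool size
fast = Stage("ReadStage", threads=32)
print(simulate(slow, arrivals_per_tick=6, ticks=5))  # pending keeps growing
print(simulate(fast, arrivals_per_tick=6, ticks=5))  # pending stays at zero
```

The growing list is what a rising Pending column in tpstats looks like: nothing slows the producer down, so the backlog only shows up if someone is watching the metric.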
  • 39. Nodetool tpstats
    nodetool tpstats
    org.apache.cassandra.metrics:type=ThreadPools,path={internal|request|transport},scope={*},name={ActiveTasks|PendingTasks|CompletedTasks|CurrentlyBlockedTasks|TotalBlockedTasks}

    Pool Name                    Active   Pending      Completed   Blocked  All time blocked
    ReadStage                         0         0         113702         0                 0
    RequestResponseStage              0         0              0         0                 0
    MutationStage                     0         0         164503         0                 0
    ...
    InternalResponseStage             0         0              0         0                 0
    HintedHandoff                     0         0              0         0                 0

    Message type           Dropped
    RANGE_SLICE                  0
    READ_REPAIR                  0
    ...
    REQUEST_RESPONSE             1
    COUNTER_MUTATION             0

    [Diagram label: RequestResponse Threads]
  • 40. Nodetool tpstats nodetool tpstats org.apache.cassandra.metrics:type=ThreadPools,path={internal|request|transport},scope={*},name= {ActiveTasks|PendingTasks|CompletedTasks|CurrentlyBlockedTasks|TotalBlockedTasks} More at: http://www.evidencebasedit.com/guide-to-cassandra-thread-pools #CassandraSummit 2014
  • 41. Nodetool cfhistograms
    nodetool cfhistograms {keyspace} {table}
    org.apache.cassandra.metrics:type=ColumnFamily,keyspace={keyspace},scope={table}

    SSTables per Read
    1 sstables: 98554
    2 sstables: 4534

    Write Latency (microseconds)
    No Data

    Read Latency (microseconds)
    10 us: 2
    12 us: 17
    14 us: 96
    17 us: 208
    20 us: 677
    24 us: 3081
    29 us: 4552
    35 us: 3559
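Bucketed output like this is easier to reason about once it is reduced to a few numbers. The sketch below parses "offset unit: count" lines and derives a percentile and a mean. It treats each offset as the upper bound of its bucket, which is an assumption about how the buckets are labelled, and it only uses the buckets visible on the slide, so the results describe the excerpt rather than a real workload. All function names are illustrative.

```python
import math
import re

BUCKET = re.compile(r"^\s*(\d+)\s+\w+:\s+(\d+)\s*$")


def parse_buckets(lines):
    """Turn lines such as '24 us: 3081' into {24: 3081}; other lines are skipped."""
    buckets = {}
    for line in lines:
        match = BUCKET.match(line)
        if match:
            buckets[int(match.group(1))] = int(match.group(2))
    return buckets


def bucket_percentile(buckets, q):
    """Smallest bucket offset that covers a fraction q of all recorded events."""
    total = sum(buckets.values())
    target = max(1, math.ceil(q * total))
    seen = 0
    for offset in sorted(buckets):
        seen += buckets[offset]
        if seen >= target:
            return offset
    raise ValueError("empty histogram")


def bucket_mean(buckets):
    total = sum(buckets.values())
    return sum(offset * count for offset, count in buckets.items()) / total


read_latency = parse_buckets([
    "Read Latency (microseconds)",
    "10 us: 2", "12 us: 17", "14 us: 96", "17 us: 208",
    "20 us: 677", "24 us: 3081", "29 us: 4552", "35 us: 3559",
])
sstables_per_read = parse_buckets(["1 sstables: 98554", "2 sstables: 4534"])

print(bucket_percentile(read_latency, 0.50))   # median bucket, in microseconds
print(bucket_percentile(read_latency, 0.99))
print(round(bucket_mean(sstables_per_read), 3))
```

A mean SSTables-per-read close to 1 is the healthy case; a distribution that drifts toward higher counts is usually the first hint of a compaction or data model problem.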
  • 42. Read Write Path mile high overview [Diagram. Labels: Writes; Reads; Memtable; SSTable]
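As a rough mental model of the diagram: writes land in an in-memory memtable, full memtables are flushed to immutable SSTables on disk, and a read may have to consult the memtable plus one or more SSTables. The toy class below sketches that, with heavy simplifications that are assumptions of this example: whole values are overwritten, a read stops at the newest SSTable holding the key, and there are no bloom filters, commit log or tombstones. `ToyTable` is a made-up name.

```python
class ToyTable:
    """Very small model of a memtable in front of a list of SSTables."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.sstables = []  # oldest first; each entry is treated as immutable
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        if self.memtable:
            self.sstables.append(dict(self.memtable))
            self.memtable = {}

    def read(self, key):
        """Return (value, number_of_sstables_checked)."""
        if key in self.memtable:
            return self.memtable[key], 0
        checked = 0
        for sstable in reversed(self.sstables):  # newest data wins
            checked += 1
            if key in sstable:
                return sstable[key], checked
        return None, checked

    def compact(self):
        merged = {}
        for sstable in self.sstables:  # oldest to newest, so newer values win
            merged.update(sstable)
        self.sstables = [merged] if merged else []


table = ToyTable(memtable_limit=2)
for key, value in [("a", 1), ("b", 2), ("c", 3), ("a", 10), ("d", 4)]:
    table.write(key, value)

print(table.read("d"))  # served from the memtable
print(table.read("a"))  # newest SSTable
print(table.read("b"))  # has to look in two SSTables
table.compact()
print(table.read("b"))  # one SSTable after compaction
```

The second element of each result is the toy equivalent of the "SSTables per Read" histogram: the more SSTables a read touches, the slower it gets, and compaction is what brings the number back down.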
  • 48. Nodetool cfhistograms 1.1
    nodetool cfhistograms {keyspace} {table}
    org.apache.cassandra.metrics:type=ColumnFamily,keyspace={keyspace},scope={table}

    Offset      SSTables  Write Latency  Read Latency  Row Size  Column Count
    1               3579              0             0         0             0
    2                  0              0             0         0             0
    . . .
    35                 0              0             0         0             0
    42                 0              0            27         0             0
    50                 0              0           187         0             0
    60                 0             10           460         0             0
    72                 0            200           689         0             0
    86                 0            663           552         0             0
    103                0            796           367         0             0
    124                0            297           736         0             0
    149                0            265           243         0             0
    179                0            460           263         0             0
    . . .
    25109160           0              0             0         0             0
  • 49. Nodetool cfhistograms #CassandraSummit 2014 https://gist.github.com/clohfink/6068003
  • 50. Nodetool cfhistograms 2.1
    nodetool cfhistograms {keyspace} {table}
    org.apache.cassandra.metrics:type=ColumnFamily,keyspace={keyspace},scope={table}

    Keyspace/Table histograms
    Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
                               (micros)      (micros)         (bytes)
    50%             1.00          10.00        524.00             310           5
    75%             1.00          11.75        888.00             310           5
    95%             1.00          15.00       4843.75             310           5
    98%             1.00          17.00       9658.90             310           5
    99%             1.00          19.00      12306.47             310           5
    Min             0.00           0.00         68.00              30           0
    Max             2.00     1219386.00      45383.00             310           5
  • 51. Nodetool cfstats
    nodetool cfstats {-i} {keyspace}.{table}
    org.apache.cassandra.metrics:type=ColumnFamily,keyspace={keyspace},scope={table}

    Keyspace: Keyspace1
        Read Count: 11207
        Read Latency: 0.047931114482020164 ms.
        Write Count: 17598
        Write Latency: 0.053502954881236506 ms.
        Pending Tasks: 0
            Table: Standard1
            SSTable count: 3
            Space used (live), bytes: 9088955
            Space used (total), bytes: 9088955
            Space used by snapshots (total), bytes: 0
            SSTable Compression Ratio: 0.3672150946
            Memtable cell count: 0
            Memtable data size, bytes: 0
            Memtable switch count: 3
            Local read count: 11207
            Local read latency: 0.048 ms
            Local write count: 17598
            Local write latency: 0.054 ms
            Pending tasks: 0
            Bloom filter false positives: 0
            Bloom filter false ratio: 0.00000
            Bloom filter space used, bytes: 11688
            Compacted partition minimum bytes: 1110
            Compacted partition maximum bytes: 126934
            Compacted partition mean bytes: 2730
            Average live cells per slice: 0.0
            Average tombstones per slice: 0.0
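For a daily or weekly checkup, the per-table block can be parsed into key/value pairs and compared against limits. This is a minimal sketch: the helper names and every threshold are hypothetical and need tuning per cluster, and the field names are taken from the slide's output, which differs between Cassandra versions.

```python
SAMPLE = """\
Table: Standard1
SSTable count: 3
Space used (live), bytes: 9088955
Local read latency: 0.048 ms
Bloom filter false ratio: 0.00000
Compacted partition maximum bytes: 126934
Compacted partition mean bytes: 2730
Average tombstones per slice: 0.0
"""


def parse_cfstats(text):
    """Map each 'key: value' line to a dict entry; values stay as strings."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep:
            stats[key] = value.strip()
    return stats


def check_table(stats, max_sstables=20, max_partition_bytes=100 * 1024 * 1024,
                max_tombstones_per_slice=100.0):
    """Return warnings for values above the given (hypothetical) limits."""
    checks = [
        ("SSTable count", int, max_sstables),
        ("Compacted partition maximum bytes", int, max_partition_bytes),
        ("Average tombstones per slice", float, max_tombstones_per_slice),
    ]
    warnings = []
    for key, convert, limit in checks:
        if key in stats and convert(stats[key]) > limit:
            warnings.append(f"{key} {stats[key]} > {limit}")
    return warnings


stats = parse_cfstats(SAMPLE)
print(check_table(stats))
print(check_table(stats, max_partition_bytes=100000))
```

Wide partitions and high tombstone counts per slice are the data modeling issues this kind of check is meant to surface before they turn into latency problems.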
  • 55. Nodetool cfstats (same output as slide 51, with one extra line after "SSTable count", printed for tables that use leveled compaction) SSTables in each level: [14/4, 1, 0, …, 0]
  • 64. Nodetool proxyhistograms
    nodetool proxyhistograms
    org.apache.cassandra.metrics:type=ClientRequest,scope={Read|Write|RangeSlice},name=Latency

    $ nodetool proxyhistograms
    proxy histograms
    Read Latency (microseconds)
    61214 us: 1
    Write Latency (microseconds)
    103 us: 22
    124 us: 142
    149 us: 297
    179 us: 1190
    215 us: 1823
    258 us: 2091
    ...
  • 65. Nodetool compactionstats
    nodetool compactionstats
    org.apache.cassandra.metrics:type=Compaction

    pending tasks: 1
    compaction type   keyspace    table       completed   total      unit    progress
    Compaction        Keyspace1   Standard1   6076415     29605054   bytes   20.06%
    Active compaction remaining time : 0h00m03s
  • 69. Nodetool Much more!! http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsNodetool_r.html
  • 70. OpsCenter ● Provides visibility into key metrics ● Alarming ● Basic orchestration and config management ● Constantly improving ● Free* ● Almost zero barrier to getting set up ● Very few reasons not to run it
  • 71. OpsCenter ● Homogeneous tooling with rest of stack o Integrate metrics in with what app is using o orchestration and config management ● (paid version) “Good enough” o a mature environment should have more #CassandraSummit 2014
  • 72. Reporting Interface
    ● Default: JMX, Console, Csv, Slf4j
    ● Addons: Ganglia, Graphite
    ● Community: Cassandra, StatsD, NewRelic, Splunk, Cloudwatch, Kafka, Riemann, TempDB, Munin, Riak, InfluxDB, Sematext, MongoDB, OpenTSDB, Librato, … MORE
  • 73. Reporting Interface ● Configurable with yaml o console, csv, ganglia, graphite ● Create reporter with premain agent o compiling new jar with manifest o add to classpath o add javaagent in cassandra-env.sh #CassandraSummit 2014
  • 74. Garbage Collection ● Death, Taxes, and a stop the world GC ● Common issue to all JVM based applications #CassandraSummit 2014
  • 75. Garbage Collection Enable gc logging ● Virtually no overhead ● Can be very helpful in diagnosing performance issues #CassandraSummit 2014
  • 76. Garbage Collection
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
    JVM_OPTS="$JVM_OPTS -XX:+PrintHeapAtGC"
    JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
    JVM_OPTS="$JVM_OPTS -XX:+PrintPromotionFailure"
    JVM_OPTS="$JVM_OPTS -XX:PrintFLSStatistics=1"
    # Log file: use only one of the two -Xloggc forms below.
    # Without rotation, a new timestamped file per start:
    JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc-`date +%s`.log"
    # With rotation, a fixed file name:
    JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"
    JVM_OPTS="$JVM_OPTS -XX:+UseGCLogFileRotation"
    JVM_OPTS="$JVM_OPTS -XX:NumberOfGCLogFiles=10"
    JVM_OPTS="$JVM_OPTS -XX:GCLogFileSize=10M"
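With `-XX:+PrintGCApplicationStoppedTime` enabled, the GC log gains lines reporting how long application threads were stopped, and these are easy to pull out with a script. The sketch below is an illustration under assumptions: the sample lines are invented for the example rather than copied from a real log, the function names are hypothetical, and the 0.2 second threshold is arbitrary.

```python
import re

STOPPED = re.compile(
    r"Total time for which application threads were stopped: (\d+\.\d+) seconds"
)


def stopped_times(lines):
    """Extract every stop-the-world pause, in seconds, from GC log lines."""
    pauses = []
    for line in lines:
        match = STOPPED.search(line)
        if match:
            pauses.append(float(match.group(1)))
    return pauses


def long_pauses(pauses, threshold_seconds=0.2):
    """Pauses at or above the threshold; the default is an arbitrary example."""
    return [p for p in pauses if p >= threshold_seconds]


# Invented sample lines in the shape the flag produces.
sample_log = [
    "2014-09-10T12:00:01.123+0000: 10.512: Total time for which application threads were stopped: 0.0123450 seconds",
    "2014-09-10T12:00:02.456+0000: 11.845: [GC 11.845: [ParNew: ...]",
    "2014-09-10T12:00:02.706+0000: 12.095: Total time for which application threads were stopped: 0.2500000 seconds",
    "2014-09-10T12:00:05.000+0000: 14.389: Total time for which application threads were stopped: 0.0001000 seconds",
]

pauses = stopped_times(sample_log)
print(len(pauses), max(pauses))
print(long_pauses(pauses))
```

To run it against a real log, pass an open file object instead of the list, for example `stopped_times(open("/var/log/cassandra/gc.log"))`.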
  • 78. Garbage Collection Could be its own talk Honorable mentions: ● https://github.com/chewiebug/GCViewer ● http://jworks.idv.tw/GcWeb/ ● Python, R, Octave #CassandraSummit 2014
  • 79. Logging ● /var/log/cassandra/system.log o provides a rolling log o log4j ● /var/log/cassandra/output.log o captured standard error and standard out o truncated on restart ● System Logs o syslog, dmesg, etc
  • 80. OS Metrics #CassandraSummit 2014 Shout-out: http://www.brendangregg.com/linuxperf.html
  • 81. JVM ● Heap o GC logs o JMX ● Threads o jvmtop o jstack (+htop) o kill -3 o JMX