The document discusses the journey of a company's Cassandra platform team from a state of insanity to sanity. Key points:
1) The team initially struggled with multi-tenancy, configuration drift, too many metrics and heroes causing issues. This led to low morale.
2) To regain sanity, the team committed to agreements, pipeline everything, focus on current problems and use their graph database proof-of-concept.
3) The graph database helped provide recommendations, find misconfigurations and indirect views of connected data. Over 300 systems were rebuilt simplifying issues.
4) Learnings included paying attention to the data model, partitioning graphs, using graphs for indirection views and designing for rebuildability
5. A little history …
5
Pets
• Unique and indispensable
• Often long lived
• Scale-up
• Active-passive architecture, pairs of servers, etc
Cattle
• Disposable, one of a herd
• Often short lived
• Scale-out
• Active-active architecture
• ‘Cattle’ datastores include
• Apache Cassandra
• Apache Kafka
6. Databases are much like the transport industry
• Undergone massive transformations
• Various types suitable for different needs
• Trains
• Relatively expensive
• Built on-demand
• Longer lifespan
• Cars
• Manual to mass production
• Almost consumable
• Often cheaper to rebuild, recycle than repair
Our twist – trains versus cars
6
7. Our journey to NoSQL
7
A few years back we started with Cassandra
• Exotic key/value store (column family)
• Distributed active-active architecture
• Tuneable consistency
• Paradigm shift
• Nights and weekends are now free (including LCM)
• However now a liquid expectation!
RDBMSNoSQL
8. Adding graph to the landscape
8
Gap in our DB landscape - connected data
• Another NoSQL database!
Interesting observation
• Key/Value easy to mess up a partition
• RDBMS easy to mess up a table
• Graph easy to mess up the database!
?
RDBMSNoSQL Graph
9. Typical architecture for caching data
9
SOR
Challenges include
• Initial load is usually once-off or manual
• Lifecycle management (LCM) is hard and per component
• Resilience patterns need to be applied everywhere
• Cross DC resilience pattern also required
Real-Time
Sync
Batch
Loader
Cache
Cache
11. Cache ‘Cattle’ Paradigm – Taking it further
11
SOR
Real-Time
Sync
Single
Materialized ViewCommit Log
E.g. Apache Kafka supports compacting topics
• Ideal for rebuilding cache use cases
Batch
Loader
12. Cache ‘Cattle’ Paradigm – Going all the way
12
Real-Time
Sync
Single
Materialized ViewCommit Log
Batch
Loader
SOR
Tip: More details on Pipeline and admin-import utility
http://goo.gl/ZY7NES
16. Spiralling out of control
16
• Victim of our own success
• Too many data-models to review
• Multi-tenancy (extreme)
• Configuration drift
• Way too many metrics
• Repeatable mistakes
• Hero culture
• Quick tools that needed person X
• Morale was taking its toll
17. 17
Doing the same thing
over and over again and
expecting different results
- Albert Einstein
INSANITY !!!
18. • Commit to HARD agreements
• Configuration items
• Ownership
• NO budging
• Track it
• Do the tedious work
• Feel the pain
Breaking insanity
18
19. • Focus on the problem at hand
• Stay away from ‘new’ stuff not working
• Stick to agreements
• Pipeline everything
• We don’t value initial bursts of energy
• Pipeline the heroes
• Reap benefits of the Graph PoC
• Introduce max 1 ‘unknown’ item at a time
Setting the priorities
19
20. Architecture
20
• Simple and effective
• Executes ‘Ops’ commands:
• ps –ef
• netstat
• cat config.json
• Raw output sent as message
• All parsing rules for raw messages
• Store in databases
• Complex rules and analysis
24. Accidentally used same labels
• E.g. Source != Source
• Reads become trickier
• Confuses meta-data
• CALL db.schema()
Solutions
• Property existence feature
• Solve your reads with writes!
• Document graph
• Remove duplicate names
Modify code
Simply rebuild and …
Pay attention to the model
24
27. Consider traversing to reduce density
• (:Source)-[:LAST]->(:Message)->[:PREV]->(:Message)
Modify code
Simply rebuild and
Partitioning in graph
27
34. Connected data
34
Keyspace Client Expected DC Expected Num
Connected Hosts
Found Connection
From DC
Found Num
Connected Hosts
Observation
Keyspace1 Client1 DC1:DC2 5:5 DC1 3:0 Missing DC
Not fully connected.
Keyspace1 Client2 DC1:DC2 5:5 DC1:DC2 3:3 Not fully connected.
Keyspace2 Client3 DC1:DC2 5:5 DC1:DC2 3:5 Not fully connected.
Keyspace2 Client4 DC1:DC2 5:5 DC1:DC2 3:5 Not fully connected.
…
36. Graph proven to be a VERY good fit for this use case!
• Already producing 50+ commands
• Heroes have a place to ‘query’
• Various unexpected findings
• Misconfigured tenants
• 100+ recommendations made
Architecture now helps with
• Allow us to be ‘in’ control without ‘having’ control
• Facilitates being agile
• Pipelining the data
• Learning new technologies
300+ cars were destroyed in our short journey to save our sanity!
So what happened?
36
41. Follow us to stay a step ahead
ING.com
YouTube.com/ING
SlideShare.net/ING@ING_News LinkedIn.com/company/ING
Flickr.com/INGGroupFacebook.com/ING
42. ING Group’s Annual Accounts are prepared in accordance with
International Financial Reporting Standards as adopted by the
European Union (‘IFRS-EU’).
In preparing the financial information in this document, the same
accounting principles are applied as in the 2014 ING Group Annual
Accounts. All figures in this document are unaudited. Small
differences are possible in the tables due to rounding.
Certain of the statements contained herein are not historical facts,
including, without limitation, certain statements made of future
expectations and other forward-looking statements that are based on
management’s current views and assumptions and involve known
and unknown risks and uncertainties that could cause actual results,
performance or events to differ materially from those expressed or
implied in such statements. Actual results, performance or events
may differ materially from those in such statements due to, without
limitation: (1) changes in general economic conditions, in particular
economic conditions in ING’s core markets, (2) changes in
performance of financial markets, including developing markets, (3)
consequences of a potential (partial) break-up of the euro, (4) the
implementation of ING’s restructuring plan to separate banking and
insurance operations, (5) changes in the availability of, and costs
associated with, sources of liquidity such as interbank funding, as
well as conditions in the credit markets generally, including changes
in borrower and counterparty creditworthiness, (6) the frequency and
severity of insured loss events, (7) changes affecting mortality and
morbidity levels and trends,(8) changes affecting persistency levels,
(9) changes affecting interest rate levels, (10) changes affecting
currency exchange rates, (11) changes in investor, customer and
policyholder behaviour, (12) changes in general competitive factors,
(13) changes in laws and regulations, (14) changes in the policies of
governments and/or regulatory authorities, (15) conclusions with
regard to purchase accounting assumptions and methodologies, (16)
changes in ownership that could affect the future availability to us of
net operating loss, net capital and built-in loss carry forwards, (17)
changes in credit ratings, (18) ING’s ability to achieve projected
operational synergies and (19) the other risks and uncertainties
detailed in the Risk Factors section contained in the most recent
annual report of ING Groep N.V. Any forward-looking statements
made by or on behalf of ING speak only as of the date they are
made, and, ING assumes no obligation to publicly update or revise
any forward-looking statements, whether as a result of new
information or for any other reason.
This document does not constitute an offer to sell, or a solicitation of
an offer to purchase, any securities in the United States or any other
jurisdiction. The securities of NN Group have not been and will not
be registered under the U.S. Securities Act of 1933, as amended
(the “Securities Act”), and may not be offered or sold within the
United States absent registration or an applicable exemption from
the registration requirements of the Securities Act.
www.ing.com
Disclaimer
42