NoSQL Database: Apache Cassandra
www.folio3.com @folio_3
Folio3 – Overview
Who We Are
 We are a Development Partner for our customers
 Design software solutions, not just implement them
 Focus on the solution – Platform and technology agnostic
 Expertise in building applications that are:
Mobile, Social, Cloud-based, Gamified
What We Do
 Areas of Focus
 Enterprise
 Custom enterprise applications
 Product development targeting the enterprise
 Mobile
 Custom mobile apps for iOS, Android, Windows Phone, BB OS
 Mobile platform (server-to-server) development
 Social Media
 CMS based websites for consumers and enterprise (corporate, consumer,
community & social networking)
 Social media platform development (enterprise & consumer)
Folio3 At a Glance
 Founded in 2005
 Over 200 full time employees
 Offices in the US, Canada, Bulgaria & Pakistan
 Palo Alto, CA.
 Sofia, Bulgaria
 Karachi, Pakistan
 Toronto, Canada
Areas of Focus: Enterprise
 Automating workflows
 Cloud based solutions
 Application integration
 Platform development
 Healthcare
 Mobile Enterprise
 Digital Media
 Supply Chain
Some of Our Enterprise Clients
Areas of Focus: Mobile
 Serious enterprise applications for Banks,
Businesses
 Fun consumer apps for app discovery,
interaction, exercise gamification and play
 Educational apps
 Augmented Reality apps
 Mobile Platforms
Some of Our Mobile Clients
Areas of Focus: Web & Social Media
 Community Sites based on
Content Management Systems
 Enterprise Social Networking
 Social Games for Facebook &
Mobile
 Companion Apps for games
Some of Our Web Clients
Agenda
 What is NOSQL?
 Motivations for NOSQL?
 Brewer’s CAP Theorem
 Taxonomy of NOSQL databases
 Apache Cassandra
 Features
 Data Model
 Consistency
 Operations
 Cluster Membership
 What does NOSQL mean for RDBMS?
What is NOSQL?
 Refers to databases that differ from traditional relational database
management systems (RDBMS)
 Distributed, flexible, horizontally scalable data stores
 Confusion with the term NOSQL
 NOSQL != No SQL (or Anti-SQL)
 NOSQL = Not Only SQL
 NOSQL is an inaccurate term since it is commonly used to refer to
"non-relational" databases but the term has stuck
Motivations for NOSQL
 Classical RDBMS unsuitable for today's web applications
because:
 Performance (Latency): Variable
 Flexibility: Low
 Scalability: Variable
 Functionality
Brewer's CAP Theorem
 Consistency (C)
 Availability (A)
 Partition Tolerance (P)
 Pick any two
 Most NOSQL databases sacrifice Consistency
in favor of high Availability and Performance
Taxonomy of NOSQL
 Key/Value Stores - Distributed Hash Tables (DHT)
 Memcached, Amazon’s Dynamo, Redis, PStore
 Document Stores
 Semi structured data (stores entire documents)
 CouchDB, MongoDB, RDDB, Riak
 Graph Databases *
 Based on graph theory
 ActiveRDF, AllegroGraph, Neo4J
 Object Database *
 Versant, Objectivity
 Column-oriented Stores
 * These are considered "soft" NOSQL databases; they are usually placed in the NOSQL category
because they are non-relational.
Column-Oriented Data Stores
 Semi-structured column-based data stores
 Stores each column separately so that aggregate operations for one column
of the entire table are significantly quicker than the traditional row storage
model
 Popular examples
 Hadoop/HBASE
 Apache Cassandra
 Google's BigTable
 HyperTable
 Amazon's SimpleDB
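The difference between row storage and column storage described on this slide can be sketched in a few lines. This is only an illustration of the two layouts; the sample data and the names `rows`, `columns`, `row_total`, and `col_total` are made up for the example and do not reflect how any of the listed products store data on disk.

```python
# Row layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "name": "a", "score": 10},
    {"id": 2, "name": "b", "score": 20},
    {"id": 3, "name": "c", "score": 30},
]

# Column layout: one contiguous list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Aggregating one column in the row layout touches every record...
row_total = sum(row["score"] for row in rows)

# ...while the column layout only reads the single "score" list.
col_total = sum(columns["score"])

print(row_total, col_total)
```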
Apache Cassandra
 Fully distributed column oriented data store
 Also provides Map Reduce implementation using Hadoop (increased
performance)
 Based on Google's BigTable (Data Model) and Amazon's Dynamo
(Consistency & Partition Tolerance)
 Cassandra values Availability and Partition tolerance (AP) while
providing tunable consistency levels.
History
 Developed at Facebook
 Released as open source project on Google Code in July 2008
 Became an Apache Incubator Project in March 2009
 Became a top level Apache project in February 2010
 Rumors of Facebook having started working on its own separate
version of Cassandra
Features
 Fully Distributed
 Highly Scalable
 Fault Tolerant (No single point of failure)
 Tunable Consistency (Eventually Consistent)
 Semi-structured key-value store
 High Availability
 No Referential Integrity
 No Joins
Data Model
 KeySpace (Uppermost namespace)
 Column Family / Super Column Family (analogous to table)
 Super Column
 Column (Name, Value, Timestamp)
 Rows are referenced through keys
 Each column family is stored in its own set of files on disk
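The nesting described on this slide can be pictured as dictionaries inside dictionaries. The sketch below is an in-memory illustration only, not Cassandra's storage format or client API; the column family name "Users", the sample rows, and the `insert` helper are assumptions made for the example. A super column family would add one more level of nesting between the row key and the columns.

```python
import time
from typing import NamedTuple


class Column(NamedTuple):
    name: str
    value: str
    timestamp: int  # supplied by the client; used to pick the newest value


# keyspace -> column family -> row key -> column name -> Column
keyspace = {"Users": {}}  # "Users" is a hypothetical column family


def insert(cf, row_key, name, value, ts=None):
    """Store a column under a row key, keeping the newest timestamp."""
    ts = int(time.time() * 1_000_000) if ts is None else ts
    row = keyspace[cf].setdefault(row_key, {})
    current = row.get(name)
    if current is None or ts >= current.timestamp:
        row[name] = Column(name, value, ts)


insert("Users", "alice", "email", "alice@example.com", ts=1)
insert("Users", "alice", "city", "Karachi", ts=2)
insert("Users", "bob", "email", "bob@example.com", ts=3)  # rows need not share columns

print(sorted(keyspace["Users"]["alice"]))
```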
Standard Column Family
Super Column Family
Super Column Family: Static/Static
Super Column Family: Static/Dynamic
Super Column Family: Dynamic/Static
Super Column Family: Dynamic/Dynamic
Apache Cassandra: Consistency
 Consistency refers to whether a system is left in a consistent state
after an operation. In distributed data systems like Cassandra, this
usually means that once a writer has written, all readers will see that
write.
 If W + R > N, you will have strongly consistent behavior; that is, readers
will always see the most recent write
 W is the number of nodes to block for on write
 R is the number to block for on reads
 N is the replication factor (number of replicas)
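The W + R > N rule is plain arithmetic, so it can be checked directly. The helper below is a minimal sketch; the function name `is_strongly_consistent` is made up for this example and is not part of any Cassandra API.

```python
def is_strongly_consistent(w: int, r: int, n: int) -> bool:
    """Return True when every read set must overlap every write set (W + R > N)."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("W and R must be between 1 and N")
    return w + r > n


# With N = 3 replicas:
print(is_strongly_consistent(w=2, r=2, n=3))  # any 2 readers overlap any 2 writers
print(is_strongly_consistent(w=1, r=1, n=3))  # a read may miss the latest write
```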
Apache Cassandra: Consistency
 Relational databases provide strong consistency (ACID)
 Cassandra provides eventual consistency (BASE), meaning the database
will eventually reach a consistent state
 QUORUM reads and writes give consistency while still allowing
availability
 Q = floor(N / 2) + 1 (simple majority)
 If latency is more important than consistency, you can lower values
for either or both W and R.
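As a worked example of the quorum formula, the sketch below computes Q with integer division and confirms that reading and writing at QUORUM always satisfies W + R > N. The function name `quorum` is an assumption made for illustration.

```python
def quorum(n: int) -> int:
    """Simple majority of n replicas, using integer division."""
    return n // 2 + 1


for n in (1, 2, 3, 5):
    q = quorum(n)
    # Reading and writing at QUORUM means W = R = Q, so W + R = 2 * Q.
    print(f"N={n} Q={q} W+R>N: {q + q > n}")
```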
Apache Cassandra: Consistency Levels
 Write
 ZERO
 ANY
 ONE
 QUORUM
 ALL
 Read
 ONE
 QUORUM
 ALL
 ZERO and ANY apply to writes only; they are not supported for reads
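Each consistency level translates into a number of replica acknowledgements the coordinator waits for. The mapping below is a sketch under the usual reading of these levels (ANY can also be satisfied by a stored hint, which this simple count does not model); the function name `replicas_to_block_for` is hypothetical.

```python
def replicas_to_block_for(level: str, n: int) -> int:
    """Number of acknowledgements to wait for, given replication factor n."""
    table = {
        "ZERO": 0,             # fire and forget (writes only)
        "ANY": 1,              # writes only; a stored hint also counts
        "ONE": 1,
        "QUORUM": n // 2 + 1,  # simple majority
        "ALL": n,
    }
    return table[level]


for level in ("ZERO", "ANY", "ONE", "QUORUM", "ALL"):
    print(level, replicas_to_block_for(level, n=3))
```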
Write Operation
 Client sends a write request to a random node; the random node
forwards the request to the proper node (1st replica responsible for
the partition - coordinator)
 Coordinator sends requests to N replicas
 If W replicas confirm the write operation then OK
 Always writable, hinted handoff (If a replica node for the key is down,
Cassandra will write a hint to the live replica node indicating that the
write needs to be replayed to the unavailable node.)
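The write path and hinted handoff can be simulated in memory. This is a toy model, not Cassandra code: the `Replica` class and the `write` and `replay_hints` functions are assumptions made for the example, and the model simply stores each hint on the first live replica.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    up: bool = True
    data: dict = field(default_factory=dict)
    hints: list = field(default_factory=list)  # writes held for replicas that are down


def write(replicas, key, value, timestamp, w):
    """Send the write to every replica; succeed if at least w live replicas store it."""
    live = [r for r in replicas if r.up]
    down = [r for r in replicas if not r.up]
    for r in live:
        r.data[key] = (value, timestamp)
    # Hinted handoff: a live replica remembers the write for each down replica.
    if live:
        for d in down:
            live[0].hints.append((d.name, key, value, timestamp))
    return len(live) >= w


def replay_hints(replicas):
    """Deliver stored hints to replicas that have come back up."""
    by_name = {r.name: r for r in replicas}
    for r in replicas:
        remaining = []
        for target, key, value, ts in r.hints:
            node = by_name[target]
            if node.up:
                node.data[key] = (value, ts)
            else:
                remaining.append((target, key, value, ts))
        r.hints = remaining


nodes = [Replica("a"), Replica("b"), Replica("c", up=False)]
ok = write(nodes, "k1", "v1", timestamp=1, w=2)  # 2 of 3 replicas are up
nodes[2].up = True                               # replica "c" recovers
replay_hints(nodes)
print(ok, nodes[2].data)
```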
Read Operation
 Coordinator sends requests to N replicas, if R replicas respond then
OK
 If different versions are returned then reconcile and write back the
reconciled version (Read Repair)
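Read repair can be sketched the same way: compare the versions the replicas return, keep the one with the newest timestamp, and write it back to any replica that is behind. The function `read_with_repair` and the dictionary-based replicas are assumptions for illustration; in this toy model every replica in the list is reachable.

```python
def read_with_repair(replicas, key, r):
    """replicas: list of dicts mapping key -> (value, timestamp)."""
    if len(replicas) < r:
        raise RuntimeError("not enough replicas responded")
    versions = [rep[key] for rep in replicas if key in rep]
    if not versions:
        return None
    # Reconcile: the version with the highest timestamp wins.
    newest = max(versions, key=lambda vt: vt[1])
    # Read repair: write the reconciled version back to stale replicas.
    for rep in replicas:
        if rep.get(key) != newest:
            rep[key] = newest
    return newest[0]


replicas = [{"k": ("old", 1)}, {"k": ("new", 2)}, {}]
result = read_with_repair(replicas, "k", r=2)
print(result, replicas)
```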
Cluster Membership
 Gossip Protocol
 Every T seconds each node increments its heartbeat counter
and gossips to another node about the state of the cluster;
the receiving node merges the cluster info with its own copy
 Cluster state (node in/out, failure) propagated quickly:
O(log N) where N is the number of nodes in the cluster
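A small simulation shows how heartbeat counters spread through a cluster. This is a simplified sketch, not Cassandra's gossiper: the `gossip_round` function, the node names, and the choice of one random peer per round are assumptions for the example, and the loop is capped as a safety measure.

```python
import random


def gossip_round(states, rng):
    """states: {node: {known_node: heartbeat}}. One gossip exchange per node."""
    nodes = list(states)
    for node in nodes:
        # Each node increments its own heartbeat counter...
        states[node][node] = states[node].get(node, 0) + 1
        # ...and gossips its view of the cluster to one random peer.
        peer = rng.choice([n for n in nodes if n != node])
        for known, beat in states[node].items():
            # The receiver merges by keeping the highest heartbeat it has seen.
            if beat > states[peer].get(known, -1):
                states[peer][known] = beat


rng = random.Random(42)
states = {f"n{i}": {} for i in range(16)}
rounds = 0
# Run until every node has heard about every other node.
while rounds < 1000 and any(len(view) < len(states) for view in states.values()):
    gossip_round(states, rng)
    rounds += 1

print("rounds until every node knows the full cluster:", rounds)
```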
Storage Ring
 Cassandra cluster nodes are organized in a virtual ring.
 Each node has a single unique token that defines its place in the ring
and which keys it is responsible for
 Key ranges are adjusted when the nodes join or leave
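The ring lookup can be sketched with a sorted list of tokens: a key belongs to the first node whose token is greater than or equal to the key's token, wrapping around at the end. The `Ring` class, the node names, and the evenly spaced tokens are assumptions for this example; MD5 is used here as an illustrative hash.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 128  # MD5 produces 128-bit values


def token_for(key: str) -> int:
    """Hash a key onto the ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class Ring:
    def __init__(self):
        self._tokens = []  # sorted list of node tokens
        self._owners = {}  # token -> node name

    def add_node(self, node, token):
        bisect.insort(self._tokens, token)
        self._owners[token] = node

    def remove_node(self, node):
        token = next(t for t, n in self._owners.items() if n == node)
        self._tokens.remove(token)
        del self._owners[token]

    def owner_of_token(self, token):
        # First node token >= the given token, wrapping past the last node.
        i = bisect.bisect_left(self._tokens, token)
        return self._owners[self._tokens[i % len(self._tokens)]]

    def owner_of_key(self, key):
        return self.owner_of_token(token_for(key))


ring = Ring()
for i, name in enumerate(["a", "b", "c"]):
    ring.add_node(name, i * RING_SIZE // 3)

print(ring.owner_of_key("user:42"))
```

Removing a node from this ring hands its key range to the next node clockwise, which is the adjustment the last bullet describes.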
Apache Cassandra: MySQL Comparison
 MySQL (> 50 GB data)
 Read Average: ~ 350 ms
 Write Average: ~ 300 ms
 Cassandra (> 50 GB data)
 Read Average: 15 ms
 Write Average: 0.12 ms
Apache Cassandra: Client API
 Low level API
 Thrift
 High Level API
 Java
 Hector, Pelops, Kundera
 .NET
 FluentCassandra, Aquiles
 Python
 Telephus, Pycassa
 PHP
 phpcassa, SimpleCassie
Apache Cassandra: Where to Use?
 Use Cassandra, if you want/need
 High write throughput
 Near-Linear scalability
 Automated replication/fault tolerance
 Can tolerate low consistency
 Can tolerate missing RDBMS features
Apache Cassandra: Users
 Facebook (of course)
 To power inbox search (previously)
 Twitter
 To handle user relationships, analytics (but not for tweets)
 Digg & Reddit
 Both use Cassandra to handle user comments and votes
 Rackspace
 IBM
 To build scalable email system
 Cisco's WebEx
 To store user feed and activity in near real time
What does NOSQL mean for the future of RDBMS?
 No worries! RDBMSs are here to stay for the foreseeable future
 NOSQL data stores can be used in combination with RDBMS in some
situations
 NOSQL still has a long way to go, in order to reach the widespread
(mainstream) use and support of the RDBMS
Weakness of NOSQL
 No or limited support for complex queries
 No multi-operation transactions available (only individual operations are atomic)
 No standard interface for NOSQL databases (like SQL in relational
databases)
 No or limited administrative features available for NOSQL databases
 Not suitable (yet) for mainstream use
Why Still Use RDBMS?
 All the weaknesses of NOSQL
 Relational databases are widely used and understood
 RDBMS DBAs and developers are easily available in the market
 For big business, relational databases are a safe choice because they
have heavily invested in relational technology
 Many database design and development tools available
References
 http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
 http://wiki.apache.org/cassandra/FrontPage
 http://en.wikipedia.org/wiki/Apache_Cassandra
 http://www.slideshare.net/gdusbabek/cassandra-presentation-for-san-antonio-jug
 http://www.slideshare.net/Eweaver/cassandra-presentation-at-nosql
 http://nosql-database.org/
 http://nosqlpedia.com/
Contact
 For more details about our
services, please get in touch with
us.
contact@folio3.com
US Office: (408) 365-4638
www.folio3.com

Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Velvetech LLC
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfFerryKemperman
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
cpct NetworkING BASICS AND NETWORK TOOL.ppt
cpct NetworkING BASICS AND NETWORK TOOL.pptcpct NetworkING BASICS AND NETWORK TOOL.ppt
cpct NetworkING BASICS AND NETWORK TOOL.pptrcbcrtm
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...Technogeeks
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024StefanoLambiase
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Hr365.us smith
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprisepreethippts
 

Recently uploaded (20)

Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdf
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdf
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
cpct NetworkING BASICS AND NETWORK TOOL.ppt
cpct NetworkING BASICS AND NETWORK TOOL.pptcpct NetworkING BASICS AND NETWORK TOOL.ppt
cpct NetworkING BASICS AND NETWORK TOOL.ppt
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 

Understanding Apache Cassandra and NoSQL Databases

  • 1. NoSQL Database: Apache Cassandra (www.folio3.com, @folio_3)
  • 2. Folio3 – Overview
  • 3. Who We Are
      ◦ We are a Development Partner for our customers
      ◦ Design software solutions, not just implement them
      ◦ Focus on the solution – Platform and technology agnostic
      ◦ Expertise in building applications that are: Mobile, Social, Cloud-based, Gamified
  • 4. What We Do: Areas of Focus
      ◦ Enterprise: custom enterprise applications; product development targeting the enterprise
      ◦ Mobile: custom mobile apps for iOS, Android, Windows Phone, BB OS; mobile platform (server-to-server) development
      ◦ Social Media: CMS based websites for consumers and enterprise (corporate, consumer, community & social networking); social media platform development (enterprise & consumer)
  • 5. Folio3 At a Glance
      ◦ Founded in 2005
      ◦ Over 200 full time employees
      ◦ Offices in the US, Canada, Bulgaria & Pakistan: Palo Alto, CA; Sofia, Bulgaria; Karachi, Pakistan; Toronto, Canada
  • 6. Areas of Focus: Enterprise
      ◦ Automating workflows
      ◦ Cloud based solutions
      ◦ Application integration
      ◦ Platform development
      ◦ Healthcare
      ◦ Mobile Enterprise
      ◦ Digital Media
      ◦ Supply Chain
  • 7. Some of Our Enterprise Clients
  • 8. Areas of Focus: Mobile
      ◦ Serious enterprise applications for Banks, Businesses
      ◦ Fun consumer apps for app discovery, interaction, exercise gamification and play
      ◦ Educational apps
      ◦ Augmented Reality apps
      ◦ Mobile Platforms
  • 9. Some of Our Mobile Clients
  • 10. Areas of Focus: Web & Social Media
      ◦ Community Sites based on Content Management Systems
      ◦ Enterprise Social Networking
      ◦ Social Games for Facebook & Mobile
      ◦ Companion Apps for games
  • 11. Some of Our Web Clients
  • 12. NoSQL Database: Apache Cassandra
  • 13. Agenda
      ◦ What is NOSQL?
      ◦ Motivations for NOSQL
      ◦ Brewer's CAP Theorem
      ◦ Taxonomy of NOSQL databases
      ◦ Apache Cassandra: Features; Data Model; Consistency; Operations; Cluster Membership
      ◦ What Does NOSQL Mean for RDBMS?
  • 14. What is NOSQL?
      ◦ Refers to databases that differ from traditional relational database management systems (RDBMS)
      ◦ Distributed, flexible, horizontally scalable data stores
      ◦ Confusion with the term NOSQL: NOSQL != No SQL (or Anti-SQL); NOSQL = Not Only SQL
      ◦ NOSQL is an inaccurate term, since it is commonly used to refer to "non-relational" databases, but the term has stuck
  • 15. Motivations for NOSQL
      ◦ Classical RDBMS unsuitable for today's web applications because:
      ◦ Performance (Latency): Variable
      ◦ Flexibility: Low
      ◦ Scalability: Variable
      ◦ Functionality
  • 16. Brewer's CAP Theorem
      ◦ Consistency (C)
      ◦ Availability (A)
      ◦ Partition Tolerance (P)
      ◦ Pick any two
      ◦ Most NOSQL databases sacrifice Consistency in favor of high Availability and Performance
  • 17. Taxonomy of NOSQL
      ◦ Key/Value Stores - Distributed Hash Tables (DHT): Memcached, Amazon's Dynamo, Redis, PStore
      ◦ Document Stores: semi structured data (stores entire documents); CouchDB, MongoDB, RDDB, Riak
      ◦ Graph Databases *: based on graph theory; ActiveRDF, AllegroGraph, Neo4J
      ◦ Object Database *: Versant, Objectivity
      ◦ Column-oriented Stores
      ◦ * These are considered soft NOSQL databases and are usually placed in the NOSQL category because they are "non-relational".
  • 18. Column-Oriented Data Stores
      ◦ Semi-structured column-based data stores
      ◦ Stores each column separately so that aggregate operations for one column of the entire table are significantly quicker than the traditional row storage model
      ◦ Popular examples: Hadoop/HBASE, Apache Cassandra, Google's BigTable, HyperTable, Amazon's SimpleDB
  • 19. Apache Cassandra
      ◦ Fully distributed column oriented data store
      ◦ Also provides Map Reduce implementation using Hadoop (increased performance)
      ◦ Based on Google's BigTable (Data Model) and Amazon's Dynamo (Consistency & Partition Tolerance)
      ◦ Cassandra values Availability and Partition tolerance (AP) while providing tunable consistency levels.
  • 20. History
      ◦ Developed at Facebook
      ◦ Released as open source project on Google Code in July 2008
      ◦ Became an Apache Incubator Project in March 2009
      ◦ Became a top level Apache project in February 2010
      ◦ Rumors of Facebook having started working on its own separate version of Cassandra
  • 21. Features
      ◦ Fully Distributed
      ◦ Highly Scalable
      ◦ Fault Tolerant (No single point of failure)
      ◦ Tunable Consistency (Eventually Consistent)
      ◦ Semi-structured key-value store
      ◦ High Availability
      ◦ No Referential Integrity
      ◦ No Joins
  • 22. Data Model
      ◦ KeySpace (Uppermost namespace)
      ◦ Column Family / Super Column Family (analogous to table)
      ◦ Super Column
      ◦ Column (Name, Value, Timestamp)
      ◦ Rows are referenced through keys
      ◦ Each column family is stored in its own separate physical files
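The hierarchy on the Data Model slide can be pictured as nested maps. The sketch below is only an illustration in plain Python, not Cassandra's storage engine or client API: the `insert` and `get` helpers, the `Users` column family, and the sample values are all hypothetical. It does rely on one real property, namely that each column carries a timestamp and the newest timestamp wins when two writes target the same column.

```python
# Hypothetical in-memory picture of the hierarchy:
# keyspace -> column family -> row key -> column name -> (value, timestamp)
keyspace = {}

def insert(keyspace, column_family, row_key, name, value, timestamp):
    """Store one column; a write with an older timestamp is ignored."""
    row = keyspace.setdefault(column_family, {}).setdefault(row_key, {})
    current = row.get(name)
    if current is None or timestamp >= current[1]:
        row[name] = (value, timestamp)

def get(keyspace, column_family, row_key, name):
    """Return the column value, or None if the column does not exist."""
    column = keyspace.get(column_family, {}).get(row_key, {}).get(name)
    return None if column is None else column[0]

insert(keyspace, "Users", "alice", "email", "alice@example.com", timestamp=1)
insert(keyspace, "Users", "alice", "email", "alice@new.example.com", timestamp=2)
insert(keyspace, "Users", "alice", "email", "stale@example.com", timestamp=1)
print(get(keyspace, "Users", "alice", "email"))  # alice@new.example.com
```

A super column family would add one more level of nesting between the row key and the column name.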
  • 25. Super Column Family: Static/Static
  • 26. Super Column Family: Static/Static
  • 27. Super Column Family: Static/Dynamic
  • 28. Super Column Family: Static/Dynamic
  • 29. Super Column Family: Dynamic/Static
  • 30. Super Column Family: Dynamic/Static
  • 31. Super Column Family: Dynamic/Dynamic
  • 32. Super Column Family: Dynamic/Dynamic
  • 33. Apache Cassandra: Consistency
      ◦ Consistency refers to whether a system is left in a consistent state after an operation. In distributed data systems like Cassandra, this usually means that once a writer has written, all readers will see that write.
      ◦ If W + R > N, you will have strongly consistent behavior; that is, readers will always see the most recent write
      ◦ W is the number of nodes to block for on writes
      ◦ R is the number of nodes to block for on reads
      ◦ N is the replication factor (number of replicas)
  • 34. Apache Cassandra: Consistency
      ◦ Relational databases provide strong consistency (ACID)
      ◦ Cassandra provides eventual consistency (BASE), meaning the database will eventually reach a consistent state
      ◦ QUORUM reads and writes give consistency while still allowing availability
      ◦ Q = floor(N / 2) + 1 (simple majority)
      ◦ If latency is more important than consistency, you can lower the values of W, R, or both.
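The two formulas on the consistency slides can be checked with a few lines of arithmetic. This is a minimal sketch; the function names `quorum` and `is_strongly_consistent` are made up for illustration and are not part of any Cassandra API.

```python
def quorum(n):
    """Simple majority of n replicas: floor(n / 2) + 1."""
    return n // 2 + 1

def is_strongly_consistent(w, r, n):
    """True when every read set must overlap every write set (W + R > N)."""
    return w + r > n

n = 3                                   # replication factor
q = quorum(n)
print(q)                                # 2
print(is_strongly_consistent(q, q, n))  # True: QUORUM writes + QUORUM reads
print(is_strongly_consistent(1, 1, n))  # False: ONE + ONE may read stale data
print(is_strongly_consistent(n, 1, n))  # True: ALL writes + ONE reads
```

With N = 3, QUORUM on both sides tolerates one unavailable replica while still guaranteeing that a read overlaps the latest write.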
  • 35. Apache Cassandra: Consistency Levels
      ◦ Write: ZERO, ANY, ONE, QUORUM, ALL
      ◦ Read: ONE, QUORUM, ALL (ZERO and ANY are not supported for reads)
  • 36. Write Operation
      ◦ Client sends a write request to a random node; the random node forwards the request to the proper node (1st replica responsible for the partition - coordinator)
      ◦ Coordinator sends requests to N replicas
      ◦ If W replicas confirm the write operation then OK
      ◦ Always writable, hinted handoff (if a replica node for the key is down, Cassandra will write a hint to the live replica node indicating that the write needs to be replayed to the unavailable node)
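The write path and hinted handoff described on this slide can be sketched as a toy simulation. Everything here is hypothetical (the `Replica` class, `coordinate_write`, `replay_hints`); it models replicas as in-memory objects and is not how Cassandra is implemented internally.

```python
class Replica:
    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.data = {}   # key -> (value, timestamp)
        self.hints = []  # writes held on behalf of replicas that were down

    def apply(self, key, value, timestamp):
        """Keep the write only if it is newer than what is stored."""
        current = self.data.get(key)
        if current is None or timestamp > current[1]:
            self.data[key] = (value, timestamp)

def coordinate_write(replicas, key, value, timestamp, w):
    """Send the write to all N replicas; report OK if at least W confirm."""
    live = [replica for replica in replicas if replica.up]
    for replica in live:
        replica.apply(key, value, timestamp)
    if live:
        # Hinted handoff: the first live replica remembers the write
        # for every replica that could not be reached.
        for replica in replicas:
            if not replica.up:
                live[0].hints.append((replica, key, value, timestamp))
    return len(live) >= w

def replay_hints(holder):
    """Replay stored hints to targets that are back up; keep the rest."""
    pending = []
    for target, key, value, timestamp in holder.hints:
        if target.up:
            target.apply(key, value, timestamp)
        else:
            pending.append((target, key, value, timestamp))
    holder.hints = pending

a, b, c = Replica("a"), Replica("b"), Replica("c", up=False)
print(coordinate_write([a, b, c], "user:1", "alice", timestamp=1, w=2))  # True
c.up = True
replay_hints(a)
print(c.data["user:1"])  # ('alice', 1)
```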
  • 37. Read Operation
      ◦ Coordinator sends requests to N replicas; if R replicas respond then OK
      ◦ If different versions are returned then reconcile and write back the reconciled version (Read Repair)
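Read repair can be sketched in the same toy style. In this illustration each replica is a plain dict mapping a key to `(value, timestamp)`, an unreachable replica is represented by `None`, and reconciliation simply picks the newest timestamp. The function name `read_with_repair` is hypothetical.

```python
def read_with_repair(replicas, key, r):
    """Read from the reachable replicas, reconcile, and write back the winner."""
    responders = [replica for replica in replicas if replica is not None]
    if len(responders) < r:
        raise RuntimeError("fewer than R replicas responded")
    versions = [replica[key] for replica in responders if key in replica]
    if not versions:
        return None
    newest = max(versions, key=lambda version: version[1])
    # Read repair: any responder holding a stale or missing copy is updated.
    for replica in responders:
        if replica.get(key) != newest:
            replica[key] = newest
    return newest[0]

a = {"user:1": ("old", 1)}
b = {"user:1": ("new", 2)}
c = {}
print(read_with_repair([a, b, c], "user:1", r=2))  # new
print(a["user:1"], c["user:1"])                    # ('new', 2) ('new', 2)
```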
  • 38. Cluster Membership
      ◦ Gossip Protocol
      ◦ Every T seconds each node increments its heartbeat counter and gossips to another node about the state of the cluster; the receiving node merges the cluster info with its own copy
      ◦ Cluster state (node in/out, failure) propagated quickly: O(log N), where N is the number of nodes in the cluster
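The heartbeat-and-merge step can be simulated in a few lines. This is a simplified sketch under stated assumptions: each node's view is a dict of node name to highest heartbeat seen, every node gossips to exactly one random peer per round, and failure detection is left out. The names `merge`, `gossip_round`, and `rounds_until_converged` are hypothetical.

```python
import random

def merge(local, remote):
    """Keep the highest heartbeat seen for every node."""
    for name, heartbeat in remote.items():
        if heartbeat > local.get(name, -1):
            local[name] = heartbeat

def gossip_round(views, rng):
    """One tick: every node bumps its own heartbeat and gossips to one peer."""
    names = list(views)
    for name in names:
        views[name][name] += 1
        peer = rng.choice([other for other in names if other != name])
        merge(views[peer], views[name])

def rounds_until_converged(node_count, seed=0, max_rounds=1000):
    """Count rounds until every node knows about every other node."""
    rng = random.Random(seed)
    views = {f"node{i}": {f"node{i}": 0} for i in range(node_count)}
    for round_number in range(1, max_rounds + 1):
        gossip_round(views, rng)
        if all(len(view) == node_count for view in views.values()):
            return round_number
    return None

for size in (8, 64, 512):
    print(size, rounds_until_converged(size))
```

Running it for growing cluster sizes gives a feel for how slowly the number of rounds grows compared with the number of nodes.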
  • 39. Storage Ring
      ◦ Cassandra cluster nodes are organized in a virtual ring.
      ◦ Each node has a single unique token that defines its place in the ring and which keys it is responsible for
      ◦ Key ranges are adjusted when the nodes join or leave
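A token ring lookup can be sketched with a sorted list and binary search. The `Ring` class below is hypothetical and uses small integer tokens for readability; it assumes each node owns the range from the previous node's token (exclusive) up to its own token (inclusive), wrapping around at the end of the ring.

```python
import bisect

class Ring:
    def __init__(self):
        self.tokens = []  # sorted node tokens
        self.nodes = {}   # token -> node name

    def add_node(self, name, token):
        bisect.insort(self.tokens, token)
        self.nodes[token] = name

    def remove_node(self, token):
        self.tokens.remove(token)
        del self.nodes[token]

    def owner(self, key_token):
        """First node whose token is >= the key's token, wrapping around."""
        index = bisect.bisect_left(self.tokens, key_token)
        if index == len(self.tokens):
            index = 0
        return self.nodes[self.tokens[index]]

ring = Ring()
ring.add_node("A", 0)
ring.add_node("B", 100)
ring.add_node("C", 200)
print(ring.owner(60))   # B
print(ring.owner(250))  # A (wraps around the ring)

ring.add_node("D", 50)  # a joining node takes over part of B's range
print(ring.owner(40))   # D
print(ring.owner(60))   # B

ring.remove_node(100)   # B leaves; its range falls to the next node
print(ring.owner(60))   # C
```

Only the ranges adjacent to the joining or leaving node change owner, which is what keeps rebalancing local.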
  • 40. Apache Cassandra: MySQL Comparison
      ◦ MySQL (> 50 GB data): Read Average: ~ 350 ms; Write Average: ~ 300 ms
      ◦ Cassandra (> 50 GB data): Read Average: 15 ms; Write Average: 0.12 ms
  • 41. Apache Cassandra: Client API
      ◦ Low level API: Thrift
      ◦ High Level API
      ◦ Java: Hector, Pelops, Kundera
      ◦ .NET: FluentCassandra, Aquiles
      ◦ Python: Telephus, Pycassa
      ◦ PHP: phpcassa, SimpleCassie
  • 42. Apache Cassandra: Where to Use?
      ◦ Use Cassandra, if you want/need
      ◦ High write throughput
      ◦ Near-Linear scalability
      ◦ Automated replication/fault tolerance
      ◦ Can tolerate low consistency
      ◦ Can tolerate missing RDBMS features
  • 43. Apache Cassandra: Users
      ◦ Facebook (of course): to power inbox search (previously)
      ◦ Twitter: to handle user relationships, analytics (but not for tweets)
      ◦ Digg & Reddit: both use Cassandra to handle user comments and votes
      ◦ Rackspace
      ◦ IBM: to build scalable email system
      ◦ Cisco's WebEx: to store user feed and activity in near real time
  • 44. What does NOSQL mean for the future of RDBMS?
      ◦ No worries! RDBMSs are here to stay for the foreseeable future
      ◦ NOSQL data stores can be used in combination with RDBMS in some situations
      ◦ NOSQL still has a long way to go in order to reach the widespread (mainstream) use and support of the RDBMS
  • 45. Weaknesses of NOSQL
      ◦ No or limited support for complex queries
      ◦ No transactions available (only individual operations are atomic)
      ◦ No standard interface for NOSQL databases (like SQL in relational databases)
      ◦ No or limited administrative features available for NOSQL databases
      ◦ Not suitable (yet) for mainstream use
  • 46. Why Still Use RDBMS?
      ◦ All the weaknesses of NOSQL
      ◦ Relational databases are widely used and understood
      ◦ RDBMS DBAs and developers are easily available in the market
      ◦ For big business, relational databases are a safe choice because they have heavily invested in relational technology
      ◦ Many database design and development tools available
  • 47. References
      ◦ http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
      ◦ http://wiki.apache.org/cassandra/FrontPage
      ◦ http://en.wikipedia.org/wiki/Apache_Cassandra
      ◦ http://www.slideshare.net/gdusbabek/cassandra-presentation-for-san-antonio-jug
      ◦ http://www.slideshare.net/Eweaver/cassandra-presentation-at-nosql
      ◦ http://nosql-database.org/
      ◦ http://nosqlpedia.com/
  • 48. Contact
      ◦ For more details about our services, please get in touch with us.
      ◦ Email: contact@folio3.com
      ◦ US Office: (408) 365-4638
      ◦ www.folio3.com