NoSQL & Couchbase
Sangharsh Agarwal
Relational Databases
• MySQL, PostgreSQL, SQLite, Oracle, etc.
• Good at
–Schemas
–Strong Consistency
–Transactions
–“Mature” and well tested
–Availability of Expertise
What is NoSQL?
• It’s not anti-SQL or “no” SQL.
• It means (N)ot (O)nly SQL.
• A more exact name would be “non-relational database”.
What is NoSQL?
• Carlo Strozzi used the term NoSQL in 1998 to name his lightweight, open-source
relational database that did not expose the standard SQL interface.
• A NoSQL database provides a mechanism for storage and retrieval of data
that is modeled by means other than the tabular relations used in
relational databases.
• Motivations for NoSQL include simplicity of design, horizontal scaling and
finer control over availability.
• Data structures in NoSQL (e.g. key-value, graph, or document) differ from
those of an RDBMS, so some operations are faster in NoSQL and others
in an RDBMS.
“Is NoSQL a complete
replacement of RDBMS?”
“NO”
Common Features of NoSQL
• Open Source
• Schema-less
• Scalability through scale-out, not scale-up.
• Distribution with Sharding.
• Eventual Consistency.
• Commodity Class Nodes
• Parallel Query with MapReduce.
• Cloud Readiness
• High Availability
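To illustrate "Distribution with Sharding" from the list above, here is a minimal sketch that assigns each key to a node by hashing it. The node names, the `shard_for` and `distribute` helpers, and the choice of CRC32 are assumptions made for this example, not the scheme of any particular product. Real systems usually add an indirection layer (virtual buckets or consistent hashing) so that adding a node moves only part of the data.

```python
import zlib
from collections import defaultdict

# Hypothetical node names; any list of servers would do.
NODES = ["node-a", "node-b", "node-c"]


def shard_for(key: str, nodes=NODES) -> str:
    """Pick the node that owns `key` by hashing the key (hash-mod sharding)."""
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


def distribute(keys, nodes=NODES):
    """Group keys by the node that owns them."""
    placement = defaultdict(list)
    for key in keys:
        placement[shard_for(key, nodes)].append(key)
    return dict(placement)


if __name__ == "__main__":
    sample_keys = [f"user::{i}" for i in range(10)]
    for node, owned in sorted(distribute(sample_keys).items()):
        print(node, owned)
```

The same key always maps to the same node, so any client can locate a document without asking a central directory.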
NoSQL Data Models (1/2)
• Distributed Caches: Couchbase, Memcached,
Velocity
• Wide Column Stores: Accumulo, Cassandra,
Druid, HBase
• Document Stores: Clusterpoint, Apache
CouchDB, Couchbase, MarkLogic, MongoDB
NoSQL Data Models (2/2)
• Key-value Stores: Dynamo, FoundationDB,
MemcacheDB, Redis, Riak, FairCom c-treeACE
• Graph Databases: Allegro, Neo4J,
InfiniteGraph, OrientDB, Virtuoso, Stardog
Why NoSQL (1/2)
• Interactive applications have changed dramatically over the last 15
years. In the late ‘90s, large web companies emerged with dramatic
increases in scale on many dimensions:
– The number of concurrent users skyrocketed. (Big Users)
– The amount of data collected and processed soared. (IOT)
– The amount of unstructured or semi-structured data exploded. (Big
Data/Cloud)
• Dealing with these issues became more and more difficult using
relational database technology.
• Relational databases are essentially architected to run on a single
machine and use a rigid schema-based approach to modeling data.
Why NoSQL (2/2)
• Schema-less: an ALTER operation in an RDBMS is
costly.
• RDBMSs are less capable of dealing with Big
Data.
• RDBMSs are a poor fit for object-oriented
programming.
• RDBMSs favor scale-up over scale-out.
• RDBMSs handle unstructured or semi-structured
data poorly.
Big Users
• Not that long ago, 1,000 daily users of an application was a lot and 10,000
was an extreme case.
• Today, with the growth in global Internet use, the increased number of
hours users spend online, and the growing popularity of smartphones and
tablets, it's not uncommon for apps to have millions of users a day.
Internet of Things
• The amount of machine-generated data is increasing with
the proliferation of digital telemetry.
• There are 14 billion things connected to the Internet.
– By 2020, 32 billion things will be connected to the Internet.
– By 2020, 10% of data will be generated by embedded systems.
– By 2020, 20% of target rich data will be generated by
embedded systems.
• Telemetry data is small, semi-structured and continuous.
It’s a challenge for relational databases.
• To address this challenge, the innovative enterprise is
relying on NoSQL technology to scale concurrent data
access to millions of connected things.
Big Data
• The amount of data is growing rapidly, and the nature of data is changing as well
as developers find new data types – most of which are unstructured or
semi-structured – that they want to incorporate into their applications.
• Data is becoming easier to capture and access through third parties such as
Facebook, Dun and Bradstreet, and others.
• NoSQL provides a data model that maps better to the application’s organization
of data and simplifies the interaction between the application and the database.
The Cloud
• Three-Tier Internet Architecture: Applications today are increasingly developed
using a three-tier internet architecture, are cloud-based, and use a
Software-as-a-Service business model that needs to support the collective needs of thousands of
customers.
• This approach requires a horizontally scalable architecture that easily scales with
the number of users and the amount of data the application has.
• NoSQL technologies have been built from the ground up to be distributed,
scale-out technologies and therefore fit better with the highly distributed nature of the
three-tier Internet architecture.
Data Models
• Relational and NoSQL data models are very different.
• The relational model takes data and separates it into many interrelated tables.
• Tables reference each other through foreign keys that are stored in columns as
well.
• NoSQL databases have a very different model.
• For example, a document-oriented NoSQL database takes the data you want
to store and aggregates it into documents using the JSON format.
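The contrast between the two models can be sketched as follows: the same order is first held as rows in three "tables" linked by foreign keys, then aggregated into one JSON document. The table contents, field names, and the `order_as_document` helper are all hypothetical and exist only for illustration.

```python
import json

# Relational shape: one order is split across tables linked by foreign keys.
customers = [{"id": 1, "name": "Alice"}]
orders = [{"id": 100, "customer_id": 1}]
order_items = [
    {"order_id": 100, "sku": "A-1", "qty": 2},
    {"order_id": 100, "sku": "B-7", "qty": 1},
]


def order_as_document(order_id: int) -> dict:
    """Aggregate the rows that describe one order into a single document."""
    order = next(o for o in orders if o["id"] == order_id)
    customer = next(c for c in customers if c["id"] == order["customer_id"])
    items = [
        {"sku": row["sku"], "qty": row["qty"]}
        for row in order_items
        if row["order_id"] == order_id
    ]
    return {
        "type": "order",
        "id": order_id,
        "customer": customer["name"],
        "items": items,
    }


if __name__ == "__main__":
    print(json.dumps(order_as_document(100), indent=2))
```

Reading the order back from a document store is then a single lookup by key, whereas the relational form needs joins across the three tables.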
The CAP Theorem
Conjectured by Eric Brewer in 2000 and formally proved by Seth Gilbert and Nancy Lynch in 2002,
the theorem concerns three properties that any distributed system (not just
storage/database systems) may want to guarantee.
• Consistency - All the servers in the system will have the same data so anyone
using the system will get the same copy regardless of which server answers
their request.
• Availability - The system will always respond to a request (even if the
response does not contain the latest data or is not consistent across the
system).
• Partition Tolerance - The system continues to operate as a whole even if
servers are cut off from one another by network failures.
It is impossible to guarantee all 3 properties at once. Because network partitions
cannot be ruled out in practice, the real choice during a partition is between
consistency and availability, and this is often the deciding factor in what
technology is used.
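The trade-off can be made concrete with a toy two-replica store. This is a minimal sketch under stated assumptions: the `Replica` and `TwoNodeStore` classes and the "CP"/"AP" mode flag are invented for illustration and do not model any real database. In CP mode a write is refused during a partition; in AP mode it is accepted locally and the replicas diverge.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}


class TwoNodeStore:
    """Toy two-replica store; `mode` is "CP" or "AP" (hypothetical flag)."""

    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Replica("a"), Replica("b")
        self.partitioned = False

    def write(self, replica, key, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Give up availability: refuse rather than let replicas diverge.
                raise RuntimeError("unavailable: cannot reach peer")
            # AP: stay available, accept locally; the peer is now stale.
            replica.data[key] = value
            return
        replica.data[key] = value
        other.data[key] = value  # synchronous replication while connected

    def read(self, replica, key):
        return replica.data.get(key)


if __name__ == "__main__":
    store = TwoNodeStore("AP")
    store.write(store.a, "k", 1)
    store.partitioned = True
    store.write(store.a, "k", 2)
    print(store.read(store.a, "k"), store.read(store.b, "k"))
```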
ACID vs BASE
ACID Properties
ACID is a set of properties that apply specifically to database transactions,
defined as follows:
• Atomicity - Everything in a transaction must happen successfully or none
of the changes are committed. This avoids a transaction that changes
multiple pieces of data from failing halfway and only making a few
changes.
• Consistency - The data will only be committed if it passes all the rules in
place in the database (e.g. data types, triggers, constraints).
• Isolation - Transactions won't affect other transactions by changing data
that another operation is counting on; and other users won't see partial
results of a transaction in progress (depending on isolation mode).
• Durability - Once data is committed, it is durably stored and safe against
errors, crashes or any other (software) malfunctions within the database.
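Atomicity and consistency can be seen in a few lines with Python's built-in `sqlite3` module. This is a minimal sketch: the `accounts` table, the `CHECK` constraint, and the `transfer`, `balances`, and `make_db` helpers are assumptions for illustration. The credit is applied first on purpose, so that a failing debit shows the earlier change being rolled back.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint if src cannot cover the amount.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


if __name__ == "__main__":
    db = make_db()
    print(transfer(db, "alice", "bob", 30), balances(db))
    print(transfer(db, "alice", "bob", 500), balances(db))
```

When the second transfer fails, the credit to the destination account is undone as well, so no partial change is ever visible.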
BASE Properties
• Basically Available - The system guarantees availability in the sense of
the CAP theorem: there will be a response to any request. However, that
response could still be a ‘failure’ to obtain the requested data, or the
data may be in an inconsistent or changing
state, much like waiting for a check to clear in your bank account.
• Soft state - The state of the system could change over time, so even during
times without input there may be changes going on due to ‘eventual
consistency,’ thus the state of the system is always ‘soft.’
• Eventual consistency - The system will eventually become consistent once
it stops receiving input. The data will propagate to everywhere it should
sooner or later, but the system will continue to receive input and is not
checking the consistency of every transaction before it moves onto the
next one.
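Soft state and eventual consistency can be sketched with a toy store in which a write lands on one replica and reaches the others later. The `EventuallyConsistentStore` class and its method names are hypothetical; the sketch ignores conflict resolution and simply applies queued updates in order.

```python
from collections import deque


class EventuallyConsistentStore:
    """Toy store: writes land on one replica and propagate to the rest later."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = deque()  # entries are (target replica index, key, value)

    def write(self, key, value, at=0):
        """Apply the write locally and queue it for every other replica."""
        self.replicas[at][key] = value
        for i in range(len(self.replicas)):
            if i != at:
                self.pending.append((i, key, value))

    def read(self, key, at=0):
        """Read from one replica; the result may be stale."""
        return self.replicas[at].get(key)

    def propagate(self, steps=None):
        """Apply queued updates; with steps=None, drain the whole queue."""
        while self.pending and (steps is None or steps > 0):
            i, key, value = self.pending.popleft()
            self.replicas[i][key] = value
            if steps is not None:
                steps -= 1

    def is_consistent(self):
        return all(r == self.replicas[0] for r in self.replicas)


if __name__ == "__main__":
    store = EventuallyConsistentStore()
    store.write("k", "v1")
    print(store.read("k", at=2), store.is_consistent())
    store.propagate()
    print(store.read("k", at=2), store.is_consistent())
```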
Couchbase
Couchbase - The NoSQL document database
• Couchbase Server, originally known as Membase, is an open
source, distributed (shared-nothing architecture) NoSQL
document-oriented database that is optimized for interactive
applications. These applications must serve many concurrent
users who are creating, storing, retrieving, aggregating, manipulating and
presenting data.
• Couchbase is designed to provide easy-to-scale key-value or
document access with low latency and high sustained
throughput. It is designed to be clustered from a single machine
to very large scale deployments.
• In the parlance of Eric Brewer’s CAP theorem, Couchbase is a CP
type system.
Couchbase Features
Easy Scalability
It’s easy to scale your database layer with
Couchbase Server, whether within a cluster
or across clusters in multiple data centers.
With one click of a button, no downtime,
and no changes to your app, you can grow
your cluster from 1 to 25 to 100s of servers
while keeping the workload evenly
distributed.
Consistent High Performance
Couchbase Server’s consistent sub-millisecond
response times mean an
awesome experience for your app users.
Consistent, high throughput lets you
serve more users with fewer servers.
Data and workload are equally spread
across all servers.
Always On
With Couchbase Server, your application is
always online, 24x365. Whether you are
upgrading your database, system software
or hardware – or recovering from a
disaster – you can count on zero app
downtime with Couchbase Server.
Flexible Data Model
You shouldn’t have to worry about the
database when you change your
application. With Couchbase Server, there
is no fixed schema, so documents can have
different structures and be changed at any
time without modifying other
documents in the database.
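A flexible data model can be illustrated without any database at all. In this minimal sketch a plain dictionary stands in for a document bucket; the `bucket`, `upsert`, `get`, and `email_of` names are hypothetical and are not a Couchbase API.

```python
import json

# A hypothetical in-memory "bucket": document id -> JSON text.
bucket = {}


def upsert(doc_id: str, doc: dict) -> None:
    """Store a document as JSON under its id, replacing any previous version."""
    bucket[doc_id] = json.dumps(doc)


def get(doc_id: str) -> dict:
    return json.loads(bucket[doc_id])


def email_of(doc_id: str):
    """Readers must tolerate missing fields, since no schema enforces them."""
    return get(doc_id).get("email")


# The first version of the application stores only a name.
upsert("user::1", {"type": "user", "name": "Alice"})
# A later version adds fields; the existing document is left untouched.
upsert(
    "user::2",
    {"type": "user", "name": "Bob", "email": "bob@example.com", "tags": ["admin"]},
)
```

The cost of this flexibility moves to the application: code that reads documents has to handle fields that are absent in older documents.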
Couchbase Features (continued)
Flexible Data Model
1. JSON Support
2. Indexing and Querying
3. Incremental Map Reduce
Easy Scalability
1. Clone to Grow with Auto-Sharding
2. Cross-Cluster Replication (XDCR)
Consistent High Performance
1. Built-In Object-Level Cache
(memcached)
Always On 24x365
1. Zero Downtime Maintenance
2. Data Replication With Auto-Failover
3. Management and Monitoring UI
4. Reliable Storage Architecture.
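The idea behind the "Incremental Map Reduce" item above can be sketched as follows: a map function emits key/value pairs per document, a reduce function combines the values per key, and only a changed document is re-mapped. This is a generic Python illustration under stated assumptions, not Couchbase's implementation (Couchbase views are written as JavaScript functions); the `map_doc`, `reduce_sum`, and `IncrementalView` names and the document fields are hypothetical.

```python
from collections import defaultdict


def map_doc(doc):
    """Map step: emit (key, value) pairs for one document."""
    if doc.get("type") == "order":
        yield doc["country"], doc["total"]


def reduce_sum(values):
    """Reduce step: combine all values emitted under one key."""
    return sum(values)


class IncrementalView:
    def __init__(self):
        self.rows = {}  # document id -> list of emitted (key, value) pairs

    def update(self, doc_id, doc):
        """Re-run the map step for the changed document only."""
        self.rows[doc_id] = list(map_doc(doc))

    def query(self):
        """Group the stored map output by key and reduce each group."""
        grouped = defaultdict(list)
        for emitted in self.rows.values():
            for key, value in emitted:
                grouped[key].append(value)
        return {key: reduce_sum(values) for key, values in grouped.items()}


if __name__ == "__main__":
    view = IncrementalView()
    view.update("o1", {"type": "order", "country": "IN", "total": 10})
    view.update("o2", {"type": "order", "country": "US", "total": 7})
    print(view.query())
```

In this sketch only the map output is kept incrementally; the reduce step is recomputed at query time to keep the example short.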
Why Couchbase?
• Couchbase provides the world’s most complete,
most scalable and best performing NoSQL
database.
• Couchbase provides a shared-nothing
architecture, a single node type, a built-in caching
layer, true auto-sharding and the world’s first
NoSQL mobile offering.
Couchbase Architecture (1/3)
High-Level Deployment Architecture.
Couchbase Architecture (2/3)
• In Couchbase Server, the data
manager stores and retrieves data
in response to data operation
requests from applications.
• Every server in a Couchbase cluster
includes a built-in multi-threaded
object-managed cache, which
provides consistently low latency for
read and write operations.
• The cluster manager supervises
server configuration and interaction
between servers within a
Couchbase cluster.
Node architecture diagram of Couchbase Server
Couchbase Architecture (3/3)
Data flow within Couchbase during a write operation
1. Client writes a document into the cache,
and the server sends the client a
confirmation.
2. The document is added into the intra-
cluster replication queue to be replicated
to other servers within the cluster.
3. The document is also added into the disk
write queue to be asynchronously
persisted to disk. The document is
persisted to disk after the disk-write
queue is flushed.
4. After the document is persisted to disk,
it’s replicated to other Couchbase Server
clusters using cross datacenter replication
(XDCR) and eventually indexed.
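Steps 1 to 3 of the write flow above can be sketched as a toy model: the client is acknowledged once the document is in the cache, while replication and persistence happen later from queues. The `Node` class and its methods are hypothetical and greatly simplified; XDCR and indexing (step 4) are left out.

```python
from collections import deque


class Node:
    """Toy model of one server: a cache plus two asynchronous queues."""

    def __init__(self):
        self.cache = {}
        self.disk = {}
        self.replication_queue = deque()
        self.disk_write_queue = deque()

    def write(self, key, doc):
        self.cache[key] = doc                       # step 1: write to cache
        self.replication_queue.append((key, doc))   # step 2: queue for replicas
        self.disk_write_queue.append((key, doc))    # step 3: queue for disk
        # The client is acknowledged before any replication or disk I/O runs.
        return "ok"

    def flush_disk_queue(self):
        """Persist queued documents (asynchronous in a real system)."""
        while self.disk_write_queue:
            key, doc = self.disk_write_queue.popleft()
            self.disk[key] = doc

    def replicate_to(self, replica):
        """Send queued documents to another node in the same cluster."""
        while self.replication_queue:
            key, doc = self.replication_queue.popleft()
            replica.cache[key] = doc


if __name__ == "__main__":
    active, replica = Node(), Node()
    print(active.write("user::1", {"name": "Alice"}))
    print("on disk yet?", "user::1" in active.disk)
    active.flush_disk_queue()
    active.replicate_to(replica)
    print("on disk now?", "user::1" in active.disk)
```

Acknowledging from the cache is what gives low write latency; the trade-off is a short window in which an acknowledged write exists only in memory on one node.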
Couchbase Elasticsearch Connector
• Together, Couchbase and Elasticsearch enable you to build richer and more
powerful apps with full-text search, indexing and querying and real-time analytics
for use cases such as content stores or aggregating data from varied data sources.
“The plug-in for Elasticsearch extends Couchbase Server’s flexibility even further,
allowing users to build self-adapting interactive applications.”
Thanks

More Related Content

What's hot

Hadoop and Enterprise Data Warehouse
Hadoop and Enterprise Data WarehouseHadoop and Enterprise Data Warehouse
Hadoop and Enterprise Data Warehouse
DataWorks Summit
 
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
HostedbyConfluent
 

What's hot (20)

Introduction to Database Services
Introduction to Database ServicesIntroduction to Database Services
Introduction to Database Services
 
Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?
 
Hadoop and Enterprise Data Warehouse
Hadoop and Enterprise Data WarehouseHadoop and Enterprise Data Warehouse
Hadoop and Enterprise Data Warehouse
 
NoSQL databases - An introduction
NoSQL databases - An introductionNoSQL databases - An introduction
NoSQL databases - An introduction
 
Cassandra
CassandraCassandra
Cassandra
 
Azure storage
Azure storageAzure storage
Azure storage
 
Delta Lake with Azure Databricks
Delta Lake with Azure DatabricksDelta Lake with Azure Databricks
Delta Lake with Azure Databricks
 
Cassandra an overview
Cassandra an overviewCassandra an overview
Cassandra an overview
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
 
Cassandra
Cassandra Cassandra
Cassandra
 
NoSQL
NoSQLNoSQL
NoSQL
 
Lakehouse in Azure
Lakehouse in AzureLakehouse in Azure
Lakehouse in Azure
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
Couchdb + Membase = Couchbase
Couchdb + Membase = CouchbaseCouchdb + Membase = Couchbase
Couchdb + Membase = Couchbase
 
Azure Database Services for MySQL PostgreSQL and MariaDB
Azure Database Services for MySQL PostgreSQL and MariaDBAzure Database Services for MySQL PostgreSQL and MariaDB
Azure Database Services for MySQL PostgreSQL and MariaDB
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek Berlin
 
Azure SQL Database
Azure SQL DatabaseAzure SQL Database
Azure SQL Database
 
Cloud Data Warehousing presentation by Rogier Werschkull, including tips, bes...
Cloud Data Warehousing presentation by Rogier Werschkull, including tips, bes...Cloud Data Warehousing presentation by Rogier Werschkull, including tips, bes...
Cloud Data Warehousing presentation by Rogier Werschkull, including tips, bes...
 
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)
 

Similar to NoSQL and Couchbase

Nosql-Module 1 PPT.pptx
Nosql-Module 1 PPT.pptxNosql-Module 1 PPT.pptx
Nosql-Module 1 PPT.pptx
Radhika R
 
Data management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunitiesData management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunities
Editor Jacotech
 
How To Tell if Your Business Needs NoSQL
How To Tell if Your Business Needs NoSQLHow To Tell if Your Business Needs NoSQL
How To Tell if Your Business Needs NoSQL
DataStax
 
NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabases
Adi Challa
 

Similar to NoSQL and Couchbase (20)

Nosql-Module 1 PPT.pptx
Nosql-Module 1 PPT.pptxNosql-Module 1 PPT.pptx
Nosql-Module 1 PPT.pptx
 
مقدمة عن NoSQL بالعربي
مقدمة عن NoSQL بالعربيمقدمة عن NoSQL بالعربي
مقدمة عن NoSQL بالعربي
 
Distributed RDBMS: Data Distribution Policy: Part 2 - Creating a Data Distrib...
Distributed RDBMS: Data Distribution Policy: Part 2 - Creating a Data Distrib...Distributed RDBMS: Data Distribution Policy: Part 2 - Creating a Data Distrib...
Distributed RDBMS: Data Distribution Policy: Part 2 - Creating a Data Distrib...
 
Data management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunitiesData management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunities
 
Relational and non relational database 7
Relational and non relational database 7Relational and non relational database 7
Relational and non relational database 7
 
Modern databases and its challenges (SQL ,NoSQL, NewSQL)
Modern databases and its challenges (SQL ,NoSQL, NewSQL)Modern databases and its challenges (SQL ,NoSQL, NewSQL)
Modern databases and its challenges (SQL ,NoSQL, NewSQL)
 
NoSql Brownbag
NoSql BrownbagNoSql Brownbag
NoSql Brownbag
 
BigData, NoSQL & ElasticSearch
BigData, NoSQL & ElasticSearchBigData, NoSQL & ElasticSearch
BigData, NoSQL & ElasticSearch
 
Distributed RDBMS: Data Distribution Policy: Part 1 - What is a Data Distribu...
Distributed RDBMS: Data Distribution Policy: Part 1 - What is a Data Distribu...Distributed RDBMS: Data Distribution Policy: Part 1 - What is a Data Distribu...
Distributed RDBMS: Data Distribution Policy: Part 1 - What is a Data Distribu...
 
How To Tell if Your Business Needs NoSQL
How To Tell if Your Business Needs NoSQLHow To Tell if Your Business Needs NoSQL
How To Tell if Your Business Needs NoSQL
 
Master.pptx
Master.pptxMaster.pptx
Master.pptx
 
NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabases
 
Couchbase 3.0.2 d1
Couchbase 3.0.2  d1Couchbase 3.0.2  d1
Couchbase 3.0.2 d1
 
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
 
System design fundamentals CAP.pdf
System design fundamentals CAP.pdfSystem design fundamentals CAP.pdf
System design fundamentals CAP.pdf
 
CouchBase The Complete NoSql Solution for Big Data
CouchBase The Complete NoSql Solution for Big DataCouchBase The Complete NoSql Solution for Big Data
CouchBase The Complete NoSql Solution for Big Data
 
Nosql- Introduction for Beginners
Nosql-  Introduction for BeginnersNosql-  Introduction for Beginners
Nosql- Introduction for Beginners
 
Rise of NewSQL
Rise of NewSQLRise of NewSQL
Rise of NewSQL
 
Introduction to NoSQL database technology
Introduction to NoSQL database technologyIntroduction to NoSQL database technology
Introduction to NoSQL database technology
 
No sql database
No sql databaseNo sql database
No sql database
 

Recently uploaded

Recently uploaded (20)

Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
Generic or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisionsGeneric or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisions
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 

NoSQL and Couchbase

  • 2.
  • 3. Relational Databases • MySQL, PostgreSQL, SQLite, Oracle etc., • Good at –Schemas –Strong Consistency –Transactions –“Mature” and well tested –Availability of Expertise
  • 4. What is NoSQL? • It’s not Anti SQL or ‘NO’ SQL. • It means (N)ot (O)nly SQL. • Exact name could be Non Relational DB.
  • 5. What is NoSQL? • Carlo Strozzi used the term NoSQL in 1998 to name his lightweight, open- source relational database that did not expose the standard SQL interface. • A NoSQL database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. • Motivation for NoSQL include simplicity of design, horizontal scaling and finer control over availability. • Data structures in NoSQL (e.g. key-value, graph, or document) differs from the RDBMS, and therefore some operations are faster in NoSQL and some in RDBMS.
  • 6. “Is NoSQL a complete replacement of RDBMS?” “NO”
  • 7. Common Features of NoSQL • Open Source • Schema-less • Scalability with Scale Out not Scale Up. • Distribution with Sharding. • Eventual Consistency. • Commodity Class Nodes • Parallel Query with MapReduce. • Cloud Readiness • High Availability
  • 8. NoSQL Data Models (1/2) • Distributed Caches: Couchbase, Memcached, Velocity • Wide Column Stores: Accumulo, Cassandra, Druid, HBase • Document Stores: Clusterpoint, Apache CouchDB, Couchbase, MarkLogic, MongoDB
  • 9. NoSQL Data Models (2/2) • Key-value Stores: Dynamo, FoundationDB, MemcacheDB, Redis, Riak, FairCom c-treeACE • Graph Databases: Allegro, Neo4J, InfiniteGraph, OrientDB, Virtuoso, Stardog
  • 10. Why NoSQL (1/2) • Interactive applications have changed dramatically over the last 15 years. In the late ‘90s, large web companies emerged with dramatic increases in scale on many dimensions: – The number of concurrent users skyrocketed. (Big Users) – The amount of data collected and processed soared. (IOT) – The amount of unstructured or semi-structured data exploded. (Big Data/Cloud) • Dealing with above issues was more and more difficult using relational database technology. • Relational databases are essentially architected to run a single machine and use a rigid schema-based approach to modeling data.
  • 11. Why NoSQL (2/2) • Schema-less: Alter operation in RDBMS is costly. • RDMS are less capable of dealing with Big- Data. • RDMS are not good for Object oriented programmer. • RDMS support Scale-up than Scale-out. • RDMS can-not handle Unstructured or semi- structured data.
  • 12. Big Users • Not that long ago, 1,000 daily users of an application was a lot and 10,000 was an extreme case. • Today, with the growth in global Internet use, the increased number of hours users spend online, and the growing popularity of smartphones and tablets, it's not uncommon for apps to have millions of users a day.
  • 13. Internet of Things • The amount of machine-generated data is increasing with the proliferation of digital telemetry. • There are 14 billion things connected to the Internet. – By 2020, 32 billion things will be connected to the Internet. – By 2020, 10% of data will be generated by embedded systems. – By 2020, 20% of target rich data will be generated by embedded systems. • Telemetry data is small, semi-structured and continuous. It’s a challenge for relational databases. • To address this challenge, the innovative enterprise is relying on NoSQL technology to scale concurrent data access to millions of connected things.
  • 14. Big Data • The amount of data is growing rapidly, and the nature of data is changing as well as developers find new data types – most of which are unstructured or semi- structures – that they want to incorporate into their applications. • Data is becoming easier to capture and access through third parties such as Facebook, Dun and Bradstreet, and others. • NoSQL provides a data model that maps better to the application’s organization of data and simplifies the interaction between the
  • 15. The Cloud • Three-Tier Internet Architecture: Applications today are increasingly developed using a three-tier internet architecture, are cloud-based, and use a Software-as-a- Service business model that needs to support the collective needs of thousands of customers. • Above approach requires a horizontally scalable architecture that easily scales with the number of users and amount of data the application has. • NoSQL technologies have been built from the ground up to be distributed, scale- out technologies and therefore fit better with the highly distributed nature of the three-tier Internet architecture.
  • 16. Data Models • Relational and NoSQL data models are very different. • The relational model takes data and separates it into many interrelated tables. • Tables reference each other through foreign keys that are stored in columns as well. • NoSQL databases have a very different model. • For example, a document-oriented NoSQL database takes the data you want to store and aggregates it into documents using the JSON format.
  • 17. The CAP Theorem Published by Eric Brewer in 2000, the theorem is a set of basic requirements that describe any distributed system (not just storage/database systems). • Consistency - All the servers in the system will have the same data so anyone using the system will get the same copy regardless of which server answers their request. • Availability - The system will always respond to a request (even if it's not the latest data or consistent across the system or just a message saying the system isn't working). • Partition Tolerance - The system continues to operate as a whole even if individual servers fail or can't be reached. It's theoretically impossible to have all 3 requirements met, so a combination of 2 must be chosen and this is usually the deciding factor in what technology is used.
  • 18. ACID vs BASE Theorems
  • 19. ACID Properties ACID is a set of properties that apply specifically to database transactions, defined as follows: • Atomicity - Everything in a transaction must happen successfully or none of the changes are committed. This avoids a transaction that changes multiple pieces of data from failing halfway and only making a few changes. • Consistency - The data will only be committed if it passes all the rules in place in the database (ie: data types, triggers, constraints, etc). • Isolation - Transactions won't affect other transactions by changing data that another operation is counting on; and other users won't see partial results of a transaction in progress (depending on isolation mode). • Durability - Once data is committed, it is durably stored and safe against errors, crashes or any other (software) malfunctions within the database.
  • 20. BASE Theorem • Basically Available - This constraint states that the system does guarantee the availability of the data as regards CAP Theorem; there will be a response to any request. But, that response could still be ‘failure’ to obtain the requested data or the data may be in an inconsistent or changing state, much like waiting for a check to clear in your bank account. • Soft state - The state of the system could change over time, so even during times without input there may be changes going on due to ‘eventual consistency,’ thus the state of the system is always ‘soft.’ • Eventual consistency - The system will eventually become consistent once it stops receiving input. The data will propagate to everywhere it should sooner or later, but the system will continue to receive input and is not checking the consistency of every transaction before it moves onto the next one.
  • 22.
  • 23. Couchbase - The NoSQL document database • Couchbase Server, originally known as Membase, is an open source, distributed (shared-nothing architecture) NoSQL document-oriented database that is optimized for interactive applications. These applications must service many concurrent users; creating, storing, retrieving, aggregating, manipulating and presenting data. • Couchbase is designed to provide easy-to-scale key-value or document access with low latency and high sustained throughput. It is designed to be clustered from a single machine to very large scale deployments. • In the parlance of Eric Brewer’s CAP theorem, Couchbase is a CP type system.
  • 24. Couchbase Features • Easy Scalability - It’s easy to scale your database layer with Couchbase Server, whether within a cluster or across clusters in multiple data centers. With one click of a button, no downtime, and no changes to your app, you can grow your cluster from 1 to 25 to hundreds of servers while keeping the workload evenly distributed. • Consistent High Performance - Couchbase Server’s consistent sub-millisecond response times mean a responsive experience for your app users. Consistent, high throughput lets you serve more users with fewer servers. Data and workload are spread equally across all servers. • Always On - With Couchbase Server, your application is always online, 24x365. Whether you are upgrading your database, system software or hardware, or recovering from a disaster, you can count on zero app downtime with Couchbase Server. • Flexible Data Model - You shouldn’t have to worry about the database when you change your application. With Couchbase Server, there is no fixed schema, so records can have different structures and can be changed at any time without modifying other documents in the database.
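The flexible data model can be sketched without any database at all. The example below uses a plain Python dict as a stand-in for a document bucket and stores each record as JSON text; the `upsert` and `get` helpers, the `user::N` key format and the field names are hypothetical and are not the Couchbase SDK API.

```python
import json

bucket = {}  # stand-in for a document store: key -> JSON text


def upsert(key, doc):
    """Store `doc` under `key`, replacing any previous version."""
    bucket[key] = json.dumps(doc)


def get(key):
    """Return the stored document as a Python dict."""
    return json.loads(bucket[key])


# Two documents of the same logical type with different structures.
upsert("user::1", {"type": "user", "name": "Asha"})
upsert(
    "user::2",
    {
        "type": "user",
        "name": "Ravi",
        "emails": ["ravi@example.com"],
        "address": {"city": "Pune"},
    },
)

# Changing one document needs no schema migration and leaves the
# other document untouched.
doc = get("user::1")
doc["last_login"] = "2015-01-01"
upsert("user::1", doc)

fields = {key: sorted(get(key)) for key in bucket}
print(fields)
```

The trade-off is that the application, not the database, becomes responsible for coping with documents that lack a field.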
  • 25. Couchbase Features (continued) • Flexible Data Model: 1. JSON Support 2. Indexing and Querying 3. Incremental Map Reduce • Easy Scalability: 1. Clone to Grow with Auto-Sharding 2. Cross-Cluster Replication (XDCR) • Consistent High Performance: 1. Built-In Object-Level Cache (memcached) • Always On 24x365: 1. Zero Downtime Maintenance 2. Data Replication With Auto-Failover 3. Management and Monitoring UI 4. Reliable Storage Architecture
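Auto-sharding is commonly built on hash partitioning: each key hashes to a fixed partition, and partitions (not individual keys) are assigned to servers. The sketch below shows that general technique only. The partition count, the use of CRC32 and the round-robin placement are assumptions for illustration and should not be read as Couchbase's actual algorithm.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 1024  # illustrative constant, fixed for the cluster's life


def partition_for(key):
    """Map a document key to a partition; independent of cluster size."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def build_partition_map(nodes):
    """Assign partitions to nodes round-robin (a deliberately naive policy)."""
    return {p: nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)}


def node_for(key, partition_map):
    """Find the node that currently owns the key's partition."""
    return partition_map[partition_for(key)]


four_nodes = build_partition_map(["n1", "n2", "n3", "n4"])
five_nodes = build_partition_map(["n1", "n2", "n3", "n4", "n5"])

key_partition = partition_for("user::42")
load = Counter(four_nodes.values())  # partitions owned by each node
print(key_partition, node_for("user::42", four_nodes), dict(load))
```

Growing the cluster changes only the partition map, so clients keep using the same `partition_for` function. A production system would also try to move as few partitions as possible when a node is added, which this round-robin policy does not attempt.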
  • 26. Why Couchbase? • Couchbase provides the world’s most complete, most scalable and best performing NoSQL database. • Couchbase provides a shared-nothing architecture, a single node type, a built-in caching layer, true auto-sharding and the world’s first NoSQL mobile offering.
  • 27. Couchbase Architecture (1/3) High-Level Deployment Architecture.
  • 28. Couchbase Architecture (2/3) • In Couchbase Server, the data manager stores and retrieves data in response to data operation requests from applications. • Every server in a Couchbase cluster includes a built-in multi-threaded object-managed cache, which provides consistent low latency for read and write operations. • The cluster manager supervises server configuration and interaction between servers within a Couchbase cluster. Figure: Node architecture diagram of Couchbase Server.
  • 29. Couchbase Architecture (3/3) Data flow within Couchbase during a write operation: 1. The client writes a document into the cache, and the server sends the client a confirmation. 2. The document is added to the intra-cluster replication queue to be replicated to other servers within the cluster. 3. The document is also added to the disk write queue to be asynchronously persisted to disk. The document is persisted to disk after the disk write queue is flushed. 4. After the document is persisted to disk, it is replicated to other Couchbase Server clusters using cross datacenter replication (XDCR) and eventually indexed.
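The four write steps above can be mimicked with a small in-memory model. This is only a sketch of the described flow: the `Node` class, its queue names and the `"ack"` return value are hypothetical, and real networking, indexing and failure handling are left out.

```python
from collections import deque


class Node:
    """Toy model of one server following the write path described above."""

    def __init__(self):
        self.cache = {}                   # in-memory object cache
        self.replication_queue = deque()  # intra-cluster replication
        self.disk_write_queue = deque()   # asynchronous persistence
        self.disk = {}                    # stand-in for durable storage
        self.xdcr_queue = deque()         # outbound to other clusters

    def write(self, key, doc):
        # Step 1: store in the cache and confirm to the client.
        self.cache[key] = doc
        # Steps 2 and 3: queue the document for replication and for disk.
        self.replication_queue.append((key, doc))
        self.disk_write_queue.append((key, doc))
        return "ack"

    def drain_replication(self, replica):
        # Step 2 (later): copy queued documents to another server's cache.
        while self.replication_queue:
            key, doc = self.replication_queue.popleft()
            replica.cache[key] = doc

    def flush_disk_queue(self):
        # Step 3 (later): persist, then step 4: hand over to XDCR.
        while self.disk_write_queue:
            key, doc = self.disk_write_queue.popleft()
            self.disk[key] = doc
            self.xdcr_queue.append((key, doc))


node, replica = Node(), Node()
ack = node.write("order::7", {"total": 42})
on_disk_before = "order::7" in node.disk   # acknowledged, not yet persisted
node.drain_replication(replica)
node.flush_disk_queue()
on_disk_after = "order::7" in node.disk
print(ack, on_disk_before, on_disk_after, len(node.xdcr_queue))
```

The model makes the key design choice visible: the client is acknowledged as soon as the document is in memory, so low write latency is paid for with a short window in which the document is not yet replicated or on disk.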
  • 30. Couchbase Elasticsearch Connector • Together, Couchbase and Elasticsearch enable you to build richer and more powerful apps with full-text search, indexing and querying, and real-time analytics for use cases such as content stores or aggregating data from varied data sources. “The plug-in for Elasticsearch extends Couchbase Server’s flexibility even further, allowing users to build self-adapting interactive applications.”