NoSQL Databases
By:
Muluken Sholaye
(mulesho2490@gmail.com)
Sept, 2021
CAP Theorem

Consistency, Availability, Partition Tolerance (CAP)

A distributed system cannot guarantee perfect consistency,
availability, and partition tolerance all at the same time.

The three properties are defined as follows:

Consistency: all nodes see the same data at the same time

Availability: every request receives a response indicating
whether it succeeded or failed

Partition tolerance: the system continues to operate despite
arbitrary message loss between nodes
CAP Theorem

A distributed system can satisfy at most two
of the three CAP guarantees at the same time.
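
The trade-off can be illustrated with a toy simulation. The sketch below is not taken from the slides: the class names `Replica` and `TwoNodeStore` and the mode labels "CP" and "AP" are assumptions made for illustration. It models two replicas during a network partition, where a consistency-first store refuses writes and an availability-first store accepts them and lets the replicas diverge until the partition heals.

```python
class Replica:
    """One node holding a single value."""

    def __init__(self, value):
        self.value = value


class TwoNodeStore:
    """Two replicas of one value. `mode` is "CP" or "AP" (illustrative labels)."""

    def __init__(self, mode, initial="v1"):
        self.mode = mode
        self.a = Replica(initial)
        self.b = Replica(initial)
        self.partitioned = False  # True while A and B cannot exchange messages

    def write(self, value):
        """A client writes through replica A. Returns True if the write is accepted."""
        if self.partitioned:
            if self.mode == "CP":
                # Stay consistent by refusing the write: availability is given up.
                return False
            # AP: stay available by accepting the write on A only.
            # B keeps the old value, so consistency is given up for now.
            self.a.value = value
            return True
        # No partition: both replicas are updated together.
        self.a.value = value
        self.b.value = value
        return True

    def heal(self):
        """The partition ends and B catches up with A."""
        self.partitioned = False
        self.b.value = self.a.value


cp = TwoNodeStore("CP")
cp.partitioned = True
cp_accepted = cp.write("v2")   # rejected, both replicas still hold "v1"

ap = TwoNodeStore("AP")
ap.partitioned = True
ap_accepted = ap.write("v2")   # accepted, A holds "v2" while B still holds "v1"
diverged = ap.a.value != ap.b.value
ap.heal()                      # after healing, both replicas hold "v2"
```

The AP store converging after `heal()` is a very simplified picture of eventual consistency. Real systems also have to reconcile conflicting writes made on both sides of the partition, which this sketch leaves out.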


NoSQL Databases

NoSQL databases are next-generation databases that mostly address
some of the following points:

Being non-relational,

distributed,

open-source, and

horizontally scalable

Further characteristics often apply to NoSQL databases, such as being
schema-free, offering easy replication support and a simple API, and being
eventually consistent/BASE (basically available, soft state, eventual consistency)

Not ACID but BASE
Properties of NoSQL Databases

Non-relational

Distributed

Open-source

Horizontally scalable

Schema-free

Easy replication support

Simple API

BASE not ACID
There are currently more than 225 NoSQL databases.
NoSQL databases are widely used by many well-known companies such as
Google, Yahoo, Facebook, Twitter, Taobao, and Amazon.
Categories of NoSQL Databases
●
Here are the four main types of NoSQL databases:
●
Document databases
●
Key-value stores
●
Column-oriented databases
●
Graph databases
●
According to the DB-Engines Ranking website,
Apache Cassandra and Apache HBase are the most
widely discussed wide-column store databases.
Document-Based Databases
●
A document database stores data in JSON, BSON,
or XML documents.
●
In a document database, documents can be nested.
Particular elements can be indexed for faster
querying.
●
The most widely adopted document databases are
usually implemented with a scale-out architecture,
providing a clear path to scalability of both data
volumes and traffic.
●
Examples of document stores are MongoDB and
CouchDB.
Cont’d
●
A collection is a group of documents. The
documents within a collection are usually related
to the same subject, such as employees, products,
and so on.
●
A document is a set of ordered key-value pairs,
where each key is a string used to reference a
particular value, and each value can be a simple value
such as a string, or another (nested) document.
●
JSON (JavaScript Object Notation), BSON (Binary
JSON), and XML (eXtensible Markup Language) are
formats commonly used to define documents.
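
To make the terms collection, document, and nested document concrete, here is a minimal sketch using plain Python dictionaries and the standard `json` module. The `employees` data, the field names, and the helper `find_by_city` are made up for illustration. This is not the API of MongoDB or CouchDB.

```python
import json

# A collection is a group of documents about the same subject.
# Each document is a set of key-value pairs; "address" is a nested document.
employees = [
    {
        "_id": 1,
        "name": "Abebe",
        "skills": ["python", "sql"],
        "address": {"city": "Addis Ababa", "country": "Ethiopia"},
    },
    {
        "_id": 2,
        "name": "Sara",
        "address": {"city": "Hawassa", "country": "Ethiopia"},
    },
]


def find_by_city(collection, city):
    """Return every document whose nested address.city equals `city`."""
    return [
        doc for doc in collection
        if doc.get("address", {}).get("city") == city
    ]


# Documents need not share a schema: the second one has no "skills" key.
hawassa_staff = find_by_city(employees, "Hawassa")

# Serializing to JSON text and back gives an equal document.
as_json = json.dumps(employees[0])
round_tripped = json.loads(as_json)
```

A real document database would typically answer the city query from an index on the nested field instead of scanning every document as this sketch does.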
Key-Value Stores
●
Key-value stores are the least complex of the NoSQL databases.
They are, as the name suggests, a collection of key-value pairs.
●
The data in this category of NoSQL databases is stored in the
format “Key → Value”,
●
where
●
Key is a string that uniquely identifies a value;
●
Value is an object that can be a simple string, a numeric
value, or a complex BLOB such as a JSON object, image, or audio
file.
●
According to the DB-Engines Ranking website,
Redis and DynamoDB are among the most popular key-value stores.
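
The “Key → Value” format can be sketched in a few lines of Python. The class `KeyValueStore` and its `put`, `get`, and `delete` methods are hypothetical names chosen for this example. They do not reproduce the Redis or DynamoDB client APIs.

```python
class KeyValueStore:
    """A minimal in-memory key-value store: string keys, opaque values."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        """Store `value` under `key`, replacing any previous value."""
        if not isinstance(key, str):
            raise TypeError("keys must be strings")
        self._data[key] = value

    def get(self, key, default=None):
        """Return the value for `key`, or `default` if the key is absent."""
        return self._data.get(key, default)

    def delete(self, key):
        """Remove `key` if present. Returns True if something was removed."""
        return self._data.pop(key, None) is not None


store = KeyValueStore()
store.put("user:1:name", "Abebe")                 # simple string value
store.put("user:1:visits", 42)                    # numeric value
store.put("user:1:avatar", b"\x89PNG...")         # binary blob, never inspected
store.put("user:1:profile", {"lang": "am"})       # JSON-like object

name = store.get("user:1:name")
missing = store.get("user:2:name")
removed = store.delete("user:1:visits")
```

The store never looks inside a value. Lookups are only possible by key, which is what keeps this model simple compared with the other categories.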
Graph Databases
●
Graph databases are the most complex category, geared toward storing
relations between entities in an efficient manner.
●
The graph database model (GDM) is composed of
vertices and edges [5], where
– A vertex is an entity instance, which is equivalent to a
tuple in the relational data model (RDM);
– An edge defines the relationship between
vertices;
– Each vertex and edge can hold any number of attributes
that store the actual data values.
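
The vertex and edge model described above can be sketched as follows. The classes `Vertex`, `Edge`, and `Graph`, and the sample people and relationships, are assumptions made for illustration and do not correspond to any particular graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    vid: str                                    # unique vertex identifier
    attrs: dict = field(default_factory=dict)   # attributes holding the data


@dataclass
class Edge:
    src: str                                    # id of the source vertex
    dst: str                                    # id of the destination vertex
    label: str                                  # kind of relationship
    attrs: dict = field(default_factory=dict)   # edges can carry data too


class Graph:
    def __init__(self):
        self.vertices = {}
        self.edges = []

    def add_vertex(self, vid, **attrs):
        self.vertices[vid] = Vertex(vid, attrs)

    def add_edge(self, src, dst, label, **attrs):
        self.edges.append(Edge(src, dst, label, attrs))

    def neighbors(self, vid, label=None):
        """Ids of vertices reachable from `vid` by one outgoing edge."""
        return [
            e.dst for e in self.edges
            if e.src == vid and (label is None or e.label == label)
        ]


g = Graph()
g.add_vertex("alice", name="Alice", age=30)
g.add_vertex("bob", name="Bob")
g.add_vertex("acme", name="Acme Ltd")
g.add_edge("alice", "bob", "KNOWS", since=2019)
g.add_edge("alice", "acme", "WORKS_AT")

friends = g.neighbors("alice", label="KNOWS")
everything = g.neighbors("alice")
```

Following edges directly from a vertex replaces the join a relational database would need to answer the same question.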
Assignment
●
HBase
●
CouchDB
●
Cassandra
●
Redis
●
MongoDB
●
Note: Take one database from the list and study:
– The basics of the database
– Installation and usage
– Demo
●
ETA = 5 Days
Columnar Databases
●
They organize data into columns rather than fixed rows: values are
grouped into column families and looked up (indexed) by a row key.
●
HBase is among the most commonly used.
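
As a rough picture of how such a store lays out data, the sketch below uses nested dictionaries: row key, then column family, then column, then value. The class `WideColumnTable` and the sample rows are hypothetical. This is not the HBase client API.

```python
class WideColumnTable:
    """Toy wide-column table: row key -> column family -> column -> value."""

    def __init__(self, families):
        self.families = set(families)   # column families are fixed up front
        self.rows = {}

    def put(self, row_key, family, column, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        row = self.rows.setdefault(row_key, {})
        row.setdefault(family, {})[column] = value

    def get(self, row_key, family, column, default=None):
        return self.rows.get(row_key, {}).get(family, {}).get(column, default)

    def scan_family(self, family):
        """Yield (row_key, columns) for every row that has data in `family`."""
        for row_key in sorted(self.rows):
            columns = self.rows[row_key].get(family)
            if columns:
                yield row_key, columns


table = WideColumnTable(families=["info", "stats"])
table.put("user1", "info", "name", "Abebe")
table.put("user1", "stats", "logins", 12)
table.put("user2", "info", "name", "Sara")
table.put("user2", "info", "email", "sara@example.com")  # extra column, only in this row

info_rows = list(table.scan_family("info"))
stats_rows = list(table.scan_family("stats"))
```

Rows in the same table can have different columns within a family, and a scan over one family never touches the data stored in the others.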
Big Data Frameworks
Basics
●
The major challenges associated with big data are as follows:
– Capturing data
– Curation
– Storage
– Searching
– Sharing
– Transfer
– Analysis
– Presentation
●
To meet these challenges, organizations normally rely on
enterprise solutions built as layered frameworks.
Hadoop Ecosystem
●
Apache Hadoop is an open-source framework.
●
Hadoop gives businesses the ability to distribute data storage and
process data in parallel, handling data of higher volume, velocity,
variety, value, and veracity.
●
The Hadoop ecosystem is a platform, or suite, that provides various
services to solve big data problems. It includes many Apache projects:
– HDFS: Hadoop Distributed File System
– YARN: Yet Another Resource Negotiator
– MapReduce: Programming-based data processing
– Spark: In-memory data processing
– Pig, Hive: Query-based processing of data services
– HBase: NoSQL database
– Mahout, Spark MLlib: Machine learning algorithm libraries
– Solr, Lucene: Searching and indexing
– ZooKeeper: Managing the cluster
– Flume, Chukwa, Scribe, Kafka, Sqoop: Data collection
Cont’d
●
All these toolkits or components revolve around one thing:
data.
●
That is the beauty of Hadoop: everything revolves around data,
which makes the data easier to synthesize.
●
There are four major elements
of Hadoop:
– HDFS,
– MapReduce,
– YARN, and
– Hadoop Common.
●
Let’s study each in more detail.
HDFS
●
HDFS is responsible for storing large data sets of structured
or unstructured data across various nodes, and for
maintaining the metadata in the form of log files.
●
HDFS consists of two core components:
– Name Node
– Data Node
●
The Name Node is the prime node. It contains the metadata (data
about data) and requires comparatively fewer resources than the Data
Nodes, which store the actual data.
●
The Data Nodes run on commodity hardware in the distributed
environment, which makes Hadoop cost-effective.
●
HDFS maintains all the coordination between the clusters and
hardware, thus working at the heart of the system.
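
The division of labour between the Name Node and the Data Nodes can be sketched with a toy model. Everything here is an assumption made for illustration: the class and method names, the 4-byte block size, the replication factor of 2, and the round-robin placement. Real HDFS uses far larger blocks and its own placement policy, and clients exchange block data with the Data Nodes directly instead of passing it through the Name Node.

```python
class DataNode:
    """Stores the actual block contents."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block id -> bytes


class NameNode:
    """Stores only metadata: which blocks make up a file and where they live."""

    def __init__(self, datanodes, block_size=4, replication=2):
        self.datanodes = datanodes
        self.block_size = block_size
        self.replication = replication
        self.metadata = {}  # filename -> list of (block id, [data node names])

    def put(self, filename, data):
        entries = []
        n = len(self.datanodes)
        for offset in range(0, len(data), self.block_size):
            index = offset // self.block_size
            block_id = f"{filename}#{index}"
            # Round-robin placement: each block goes to `replication` nodes.
            targets = [self.datanodes[(index + r) % n]
                       for r in range(self.replication)]
            for node in targets:
                node.blocks[block_id] = data[offset:offset + self.block_size]
            entries.append((block_id, [node.name for node in targets]))
        self.metadata[filename] = entries

    def get(self, filename):
        """Reassemble a file by reading the first replica of each block."""
        by_name = {node.name: node for node in self.datanodes}
        return b"".join(
            by_name[names[0]].blocks[block_id]
            for block_id, names in self.metadata[filename]
        )


nodes = [DataNode("dn1"), DataNode("dn2"), DataNode("dn3")]
namenode = NameNode(nodes)
namenode.put("example.txt", b"hello world")   # 11 bytes -> 3 blocks of up to 4 bytes

block_map = namenode.metadata["example.txt"]
contents = namenode.get("example.txt")
```

The Name Node holds only the small `metadata` dictionary while the bytes live on the Data Nodes, which is why it needs comparatively fewer storage resources.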
MapReduce
●
By making use of distributed and parallel algorithms,
MapReduce makes it possible to carry the processing
logic over to the data and helps to write applications that transform big
data sets into manageable ones.
●
MapReduce uses two functions, Map()
and Reduce(), whose tasks are:
– Map() sorts and filters the data, thereby
organizing it into groups. Map() generates key-value
pairs as its result, which are later processed by the Reduce()
method.
– Reduce(), as the name suggests, summarizes the data by
aggregating the mapped data. Put simply, Reduce() takes the output
generated by Map() as input and combines those tuples into a
smaller set of tuples.
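
The roles of Map() and Reduce() can be sketched in a single process. The driver `run_mapreduce`, the `sales` records, and the two job functions are hypothetical names and data invented for this example. A real Hadoop job distributes the same phases across many machines.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Run the map, group, and reduce phases on an in-memory list."""
    # Map phase: each record yields zero or more (key, value) pairs.
    mapped = []
    for record in records:
        mapped.extend(mapper(record))

    # Grouping: sort the pairs and collect the values of each key.
    groups = defaultdict(list)
    for key, value in sorted(mapped):
        groups[key].append(value)

    # Reduce phase: combine each group into a single, smaller result.
    return [reducer(key, values) for key, values in groups.items()]


# Made-up input records: (city, sale amount).
sales = [("Addis", 120), ("Hawassa", 0), ("Addis", 80), ("Hawassa", 50)]


def sales_mapper(record):
    city, amount = record
    if amount > 0:             # filtering: drop empty sales
        yield (city, amount)   # emit a key-value pair


def sales_reducer(city, amounts):
    return (city, sum(amounts))  # aggregation: one tuple per city


totals = run_mapreduce(sales, sales_mapper, sales_reducer)
print(totals)  # [('Addis', 200), ('Hawassa', 50)]
```

Four input records become three mapped pairs and then two output tuples, which is the "smaller set of tuples" that Reduce() produces.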
●
A Word Count Example of MapReduce
●
Let us see how MapReduce works
by taking an example: a text file
called example.txt whose contents are as
follows:
●
Dear, Bear, River, Car, Car, River, Deer, Car
and Bear
●
Now suppose we have to perform a word
count on example.txt using MapReduce. That is,
we will find the unique words and the
number of occurrences of each unique word.
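
The word count on this sample text can be traced phase by phase with a short sketch. The function names are illustrative, the text is counted exactly as written above (so "Dear" and "Deer" are different words), and treating "and" as a separator, not a word, is an assumption of this example.

```python
import re
from collections import defaultdict

TEXT = "Dear, Bear, River, Car, Car, River, Deer, Car and Bear"


def split_words(text):
    """Split on commas and whitespace; 'and' is assumed to be a separator."""
    return [w for w in re.split(r"[,\s]+", text) if w and w != "and"]


def map_phase(words):
    """Map: emit the pair (word, 1) for every word."""
    return [(word, 1) for word in words]


def shuffle_phase(pairs):
    """Shuffle: group all the 1s that belong to the same word."""
    groups = defaultdict(list)
    for word, one in pairs:
        groups[word].append(one)
    return groups


def reduce_phase(groups):
    """Reduce: sum the 1s of each word to get its count."""
    return {word: sum(ones) for word, ones in sorted(groups.items())}


words = split_words(TEXT)
pairs = map_phase(words)
groups = shuffle_phase(pairs)
word_counts = reduce_phase(groups)
print(word_counts)
# {'Bear': 2, 'Car': 3, 'Dear': 1, 'Deer': 1, 'River': 2}
```

Each of the nine words produces one pair in the map phase, and the reduce phase shrinks those nine pairs to five results, one per unique word.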