This presentation discusses the following features of Hadoop:
Open source
Fault tolerance
Distributed processing
Scalability
Reliability
High availability
Economic
Flexibility
Easy to use
Data locality
Conclusion
3. Hadoop is an open-source, Java-based programming framework.
Open source means it is freely available, and you can even change its source code to suit your requirements.
4. Hadoop handles faults through the process of replica creation.
When a client stores a file in HDFS, the Hadoop framework divides the file into blocks.
The data blocks are then distributed across different machines in the HDFS cluster, and a replica of each block is created on other machines in the cluster.
By default, HDFS creates 3 copies of each block on other machines in the cluster.
So if any machine in the cluster goes down or fails due to unfavorable conditions, the user can still easily access that data from the other machines.
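The default of 3 copies comes from the dfs.replication property, and the replication factor can also be changed per file. A minimal sketch using the HDFS Java API (the path /user/demo/data.txt is a hypothetical example):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReplicationExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // dfs.replication defaults to 3; newly written files inherit this value
        conf.set("dfs.replication", "3");
        FileSystem fs = FileSystem.get(conf);
        // Raise the replication factor of one existing file to 4;
        // the NameNode schedules the extra copy in the background
        Path file = new Path("/user/demo/data.txt"); // hypothetical path
        fs.setReplication(file, (short) 4);
        fs.close();
      }
    }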
5. Hadoop stores huge amounts of data in a distributed manner in HDFS and processes the data in parallel on a cluster of nodes.
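The classic WordCount job illustrates this model: the framework splits the input across the cluster, runs the map function in parallel on each split, and aggregates the results in the reduce phase. A standard sketch against the Hadoop MapReduce Java API:

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Mapper: emits (word, 1) for every word in its input split
      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);
          }
        }
      }

      // Reducer: sums the counts collected for each word
      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values,
            Context context) throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

Note that the code only describes the per-record map and reduce logic; splitting the input, scheduling tasks across nodes, and shuffling intermediate data are all handled by the framework.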
6. Hadoop is an open-source platform, and this makes it an extremely scalable platform.
New nodes can be easily added without any downtime.
Hadoop provides horizontal scalability, so new nodes can be added to the system on the fly.
In Apache Hadoop, applications can run on clusters of more than a thousand nodes.
7. Data is reliably stored on the cluster of machines despite machine failure, thanks to the replication of data.
So even if some nodes fail, the data is still stored reliably.
8. Due to the multiple copies of the data, the data is highly available and accessible despite hardware failure.
So if any machine goes down, the data can be retrieved from another path.
9. Hadoop is not very expensive, as it runs on a cluster of commodity hardware.
Since it uses low-cost commodity hardware, scaling out a Hadoop cluster does not require spending a huge amount of money.
10. Hadoop is very flexible in terms of its ability to deal with all kinds of data.
It handles structured, semi-structured, and unstructured data.
11. The client does not need to deal with distributed computing; the framework takes care of all of that.
So Hadoop is easy to use.
12. Data locality refers to the ability to move the computation close to where the actual data resides on a node, instead of moving the data to the computation.
This minimizes network congestion and increases the overall throughput of the system.
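The scheduler can place computation near the data because the NameNode knows which hosts store each block of a file. A minimal sketch that queries those block locations through the HDFS Java API (the path /user/demo/data.txt is a hypothetical example):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockLocations {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/user/demo/data.txt"); // hypothetical path
        FileStatus status = fs.getFileStatus(file);
        // Ask the NameNode which hosts hold each block of the file;
        // MapReduce uses the same information to schedule map tasks
        // on (or near) the nodes that already store the block
        BlockLocation[] blocks =
            fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation block : blocks) {
          System.out.println("Block at offset " + block.getOffset()
              + " hosted on: " + String.join(", ", block.getHosts()));
        }
        fs.close();
      }
    }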
13. Hadoop is highly fault-tolerant.
It reliably stores huge amounts of data despite hardware failure.
It provides high scalability and high availability.
Hadoop is cost-efficient, as it runs on a cluster of commodity hardware.
Hadoop works on data locality, since moving computation is cheaper than moving data.
All these features make Hadoop powerful for Big Data processing.