BDtraining

  1. Introducing You to the Big Data World ● By Jithin S L, Hadoop Technology Lead – DirecTV, CBC Lead – Hadoop Practice Area ● Cross Sector Industry Training to Communication Sector, India: 03-Aug-15
  2. Agenda ● Origin of Big Data ● What is Big Data? ● Dimensions of Big Data ● What is Hadoop? ● Why Hadoop? ● Characteristics of Hadoop
  3. Agenda (cont.) ● Hadoop Business Use ● Verticals of Hadoop ● Hadoop Ecosystem ● Components of Hadoop ● HDFS Architecture
  5. Objective ● Get an overview of Hadoop and its business uses. ● Understand the various components of Hadoop.
  6. Data ● Valuable asset ● Different forms of data ● Decision making
  7. Types of Data
  8. Data Size
  9. Sources of Data ● 12+ TB of tweet data every day
  10. Big Data ● “Big data refers to data sets whose size is beyond the ability of typical database software tools to capture, store, manage and analyze.”
  11. Dimensions of Big Data
  12. 4th V
  13. Verticals – Big Data
  14. Origin of Hadoop ● Dec 2003 – GFS ● July 2005 – Nutch ● April 2007 – Yahoo ● Jan 2008 – Apache
  15. Hadoop ● Open-source software framework ● For storing data and running applications ● Runs on clusters of commodity hardware
  16. Who Uses Hadoop ● Google ● Facebook ● Yahoo ● eBay ● Twitter ● Amazon
  18. Puzzle: True or False ● Is Hadoop a database?
  19. Difference between Big Data and Hadoop ● Big Data: the large sets of data that businesses and other parties put together to serve specific goals and operations. ● Hadoop: one of the tools designed to handle big data.
  20. Benefits of Hadoop ● Computing power ● Flexibility ● Fault tolerance ● Low cost ● Scalability
  21. Puzzle: True or False ● Video, sound, and images are examples of semi-structured data.
  22. Big Data in Business
  23. Big Data in Business (cont.)
  24. Big Data in Business (cont.)
  25. Challenges of Using Hadoop ● MapReduce programming is not a good match for all problems. ● There is a widely acknowledged talent gap. ● Data security. ● Full-fledged data management and governance.
  26. Distributions ● Cloudera ● BigInsights ● Hortonworks ● MapR
  27. Modes ● Standalone – single machine ● Pseudo-distributed – single machine ● Fully distributed – more than one machine (1 master, 1 slave)
  28. Hadoop Ecosystem
  29. Hadoop Components ● HDFS – storage ● MapReduce – processing
  30. HDFS ● Master-slave architecture ● NameNode – master ● DataNode – slave ● Secondary NameNode – periodic checkpoint of the NameNode metadata (often loosely called a backup)
  31. HDFS – Master-Slave
  32. Q&A ● Send your queries to jithinsl@in.ibm.com
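
The "MapReduce – processing" component on slide 29 can be illustrated with a small word count. The sketch below is plain Python running in a single process, not the Hadoop API. The function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for this illustration. It assumes the usual three-step model: map emits key/value pairs, shuffle groups them by key, and reduce aggregates each group.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input standing in for lines of a file stored in HDFS.
    print(word_count(["big data", "Big Data Hadoop"]))
```

On a real cluster, the map and reduce steps would run in parallel on the machines that hold the data blocks; this sketch only shows the data flow between the steps.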
