Big data analysis using map/reduce

This presentation focuses mainly on the map/reduce concept in Big Data.

  1. Big Data Analysis for Page Ranking using Map/Reduce. R. Renuka, R. Vidhya Priya, III B.Sc., IT, The S.F.R. College for Women, Sivakasi.
  2. Overview: Introduction, What is Big Data?, Why Big Data?, 4 V’s of Big Data, Big Data Analytics Technologies, Map/Reduce, Applications, Case Study, Conclusion
  3. Introduction: Data have outgrown the storage and processing capabilities of a single host. Two fundamental challenges: how to store and work with voluminous data sizes, and how to understand the data and turn it into a competitive advantage.
  4. What is Big Data? ‘Big-data’ is similar to ‘Small-data’, but bigger. Having bigger data requires different approaches (techniques, tools and architectures) to solve new problems, and old problems in a better way.
  5. The Blind Men and the Elephant
  6. Why Big Data? Key enablers for the growth of “Big Data” are: increase of processing power, increase of storage capacities, availability of data.
  7. 4 V’s of Big Data
  8. Big Data Analytics Technologies: Hadoop, PLATFORA, WibiData, PIG, Hive, MapReduce, NoSQL databases, column-oriented databases
  9. Hadoop: Hadoop is a distributed file system and data processing engine. Hadoop has two components: the Hadoop Distributed File System (HDFS) and MapReduce programming.
  10. Map/Reduce: A high-level abstracted framework for distributed processing of large datasets, providing fault tolerance and parallelization. Computation consists of two phases: Map and Reduce. It uses a master-slave architecture: computation occurs on multiple slave nodes, and the framework tries to provide data locality as much as possible.
  11. MR model: Map processes a key/value pair to generate intermediate key/value pairs. Reduce merges all intermediate values associated with the same key. Users implement an interface of two primary methods: 1. Map: (key1, val1) → (key2, val2) 2. Reduce: (key2, [val2]) → [val3]
  12. Applications
  13. Homeland Security, Finance, Smarter Healthcare, Multi-channel Sales, Telecom, Manufacturing, Traffic Control, Trading Analytics, Fraud and Risk, Log Analysis, Search Quality, Retail
  14. Case Study
  15. Conclusion: Real-time big data isn’t just a process for storing petabytes or exabytes of data in a data warehouse. It’s about the ability to make better decisions and take meaningful actions at the right time.
  16. Queries?
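
The Map and Reduce signatures from the MR model slide can be illustrated with a minimal in-memory sketch. This is not Hadoop code and not the presenters' case study: the function names (`map_fn`, `reduce_fn`, `run_mapreduce`) and the word-count task are assumptions chosen for illustration, and the grouping of intermediate values by key, which a real cluster performs across nodes, is simulated here with a dictionary on one machine.

```python
from collections import defaultdict


def map_fn(key1, val1):
    """Map: (key1, val1) -> list of (key2, val2).

    Illustrative task: key1 is a document id, val1 is its text;
    one (word, 1) pair is emitted per word.
    """
    return [(word.lower(), 1) for word in val1.split()]


def reduce_fn(key2, vals2):
    """Reduce: (key2, [val2]) -> [val3].

    Illustrative task: sum the counts collected for one word.
    """
    return [sum(vals2)]


def run_mapreduce(records, mapper, reducer):
    """Hypothetical single-process driver, standing in for the framework."""
    intermediate = defaultdict(list)
    # Map phase: every input record yields intermediate key/value pairs.
    for key1, val1 in records:
        for key2, val2 in mapper(key1, val1):
            # Group all values that share the same intermediate key.
            intermediate[key2].append(val2)
    # Reduce phase: merge the grouped values for each key.
    return {key2: reducer(key2, vals2) for key2, vals2 in intermediate.items()}


if __name__ == "__main__":
    docs = [("doc1", "big data is big"), ("doc2", "data about data")]
    print(run_mapreduce(docs, map_fn, reduce_fn))
```

Because each call to `map_fn` depends only on its own record, and each call to `reduce_fn` only on the values for one key, both phases can in principle be spread over many nodes, which is the property the Map/Reduce slide relies on.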