SlideShare a Scribd company logo
1 of 12
MapReduce 
HEART OF HADOOP
What is Map Reduce 
 MapReduce is a programming model designed for 
processing large volumes of data in parallel by dividing 
the work into a set of independent tasks. 
 MapReduce programs are written in a particular style 
inf luenced by functional programming constructs, 
specifically idioms for processing lists of data.
Map and Reduce 
 Conceptually, MapReduce programs transform lists of 
input data elements into lists of output data elements. 
A MapReduce program will do this twice, using two 
different list processing idioms: map, and reduce.
Dataflow 
 A mapreduce job is the a unit of work that the client 
wants to be performed it consist of the input data and 
theMapReduce Program and configuration info . 
 Hadoop runs the job by dividing it into tasks ,of which 
there are two type :map tasks and the reduce tasks
MapReduce 
How much data is 
getting processed 
at a time
MapReduce
Map Only Job
Combiner 
 The Combiner is a "mini-reduce" process which 
operates only on data generated by one machine. 
 Word count is a prime example for where a Combiner 
is useful. The Word Count program in listings 1--3 
emits a (word, 1) pair for every instance of every word it 
sees. So if the same document contains the word "cat" 
3 times, the pair ("cat", 1) is emitted three times; all of 
these are then sent to the Reducer.
FAULT TOLERANCE 
 One of the primary reasons to use Hadoop to run your jobs 
is due to its high degree of fault tolerance. 
 The primary way that Hadoop achieves fault tolerance is 
through restarting tasks. Individual task nodes 
(TaskTrackers) are in constant communication with the 
head node of the system, called the JobTracker. If a 
TaskTracker fails to communicate with the JobTracker for 
a period of time (by default, 1 minute), the 
JobTracker will assume that the TaskTracker in 
question has crashed. The JobTracker knows which map 
and reduce tasks were assigned to each 
TaskTracker. .
Hadoop Streaming 
 Hadoop provides an API to MapReduce that allows you to write your 
map and reduce functions in languages other than Java. Hadoop 
Streaming uses Unix standard streams as the interface between 
Hadoop and your program, so you can use any language that can 
read standard input and write to standard output to write your 
MapReduce program.
MapReduce basic

More Related Content

What's hot

Map reduce paradigm explained
Map reduce paradigm explainedMap reduce paradigm explained
Map reduce paradigm explained
Dmytro Sandu
 
Hadoop - Introduction to map reduce programming - Reunião 12/04/2014
Hadoop - Introduction to map reduce programming - Reunião 12/04/2014Hadoop - Introduction to map reduce programming - Reunião 12/04/2014
Hadoop - Introduction to map reduce programming - Reunião 12/04/2014
soujavajug
 
Intro to Hadoop
Intro to HadoopIntro to Hadoop
Intro to Hadoop
jeffturner
 

What's hot (19)

Mapreduce by examples
Mapreduce by examplesMapreduce by examples
Mapreduce by examples
 
Introduction to Map Reduce
Introduction to Map ReduceIntroduction to Map Reduce
Introduction to Map Reduce
 
Map reduce presentation
Map reduce presentationMap reduce presentation
Map reduce presentation
 
Hadoop MapReduce Fundamentals
Hadoop MapReduce FundamentalsHadoop MapReduce Fundamentals
Hadoop MapReduce Fundamentals
 
Map reduce paradigm explained
Map reduce paradigm explainedMap reduce paradigm explained
Map reduce paradigm explained
 
Hadoop-Introduction
Hadoop-IntroductionHadoop-Introduction
Hadoop-Introduction
 
Hadoop - Introduction to map reduce programming - Reunião 12/04/2014
Hadoop - Introduction to map reduce programming - Reunião 12/04/2014Hadoop - Introduction to map reduce programming - Reunião 12/04/2014
Hadoop - Introduction to map reduce programming - Reunião 12/04/2014
 
introduction to data processing using Hadoop and Pig
introduction to data processing using Hadoop and Pigintroduction to data processing using Hadoop and Pig
introduction to data processing using Hadoop and Pig
 
Map reduce prashant
Map reduce prashantMap reduce prashant
Map reduce prashant
 
Intro to Hadoop
Intro to HadoopIntro to Hadoop
Intro to Hadoop
 
Introduction to MapReduce
Introduction to MapReduceIntroduction to MapReduce
Introduction to MapReduce
 
Mastering Hadoop Map Reduce - Custom Types and Other Optimizations
Mastering Hadoop Map Reduce - Custom Types and Other OptimizationsMastering Hadoop Map Reduce - Custom Types and Other Optimizations
Mastering Hadoop Map Reduce - Custom Types and Other Optimizations
 
Apache Hadoop MapReduce Tutorial
Apache Hadoop MapReduce TutorialApache Hadoop MapReduce Tutorial
Apache Hadoop MapReduce Tutorial
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
MapReduce Design Patterns
MapReduce Design PatternsMapReduce Design Patterns
MapReduce Design Patterns
 
Hadoop & MapReduce
Hadoop & MapReduceHadoop & MapReduce
Hadoop & MapReduce
 
Hadoop/MapReduce/HDFS
Hadoop/MapReduce/HDFSHadoop/MapReduce/HDFS
Hadoop/MapReduce/HDFS
 
An Introduction To Map-Reduce
An Introduction To Map-ReduceAn Introduction To Map-Reduce
An Introduction To Map-Reduce
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 

Viewers also liked

Application of MapReduce in Cloud Computing
Application of MapReduce in Cloud ComputingApplication of MapReduce in Cloud Computing
Application of MapReduce in Cloud Computing
Mohammad Mustaqeem
 
Nadia Jamaica Options
Nadia Jamaica OptionsNadia Jamaica Options
Nadia Jamaica Options
chglat
 
Reagan Costa Rica Options
Reagan Costa Rica OptionsReagan Costa Rica Options
Reagan Costa Rica Options
chglat
 
Коротко о планировании на год для ССУ
Коротко о планировании на год для ССУКоротко о планировании на год для ССУ
Коротко о планировании на год для ССУ
Alexandr Mikhailov
 

Viewers also liked (20)

[150824]symposium v4
[150824]symposium v4[150824]symposium v4
[150824]symposium v4
 
Top 3 design patterns in Map Reduce
Top 3 design patterns in Map ReduceTop 3 design patterns in Map Reduce
Top 3 design patterns in Map Reduce
 
MapReduce: Simplified Data Processing on Large Clusters
MapReduce: Simplified Data Processing on Large ClustersMapReduce: Simplified Data Processing on Large Clusters
MapReduce: Simplified Data Processing on Large Clusters
 
NextGen Apache Hadoop MapReduce
NextGen Apache Hadoop MapReduceNextGen Apache Hadoop MapReduce
NextGen Apache Hadoop MapReduce
 
Map/Reduce intro
Map/Reduce introMap/Reduce intro
Map/Reduce intro
 
Application of MapReduce in Cloud Computing
Application of MapReduce in Cloud ComputingApplication of MapReduce in Cloud Computing
Application of MapReduce in Cloud Computing
 
Cloud storage
Cloud storageCloud storage
Cloud storage
 
Cloud storage slides
Cloud storage slidesCloud storage slides
Cloud storage slides
 
Dominican Republic
Dominican Republic Dominican Republic
Dominican Republic
 
The Royal Playa Del Carmen
The Royal Playa Del CarmenThe Royal Playa Del Carmen
The Royal Playa Del Carmen
 
Nadia Jamaica Options
Nadia Jamaica OptionsNadia Jamaica Options
Nadia Jamaica Options
 
Reagan Costa Rica Options
Reagan Costa Rica OptionsReagan Costa Rica Options
Reagan Costa Rica Options
 
Alison Jones
Alison JonesAlison Jones
Alison Jones
 
Secrets cap cana josh
Secrets cap cana joshSecrets cap cana josh
Secrets cap cana josh
 
Victoria Excellence PM
Victoria Excellence PMVictoria Excellence PM
Victoria Excellence PM
 
DreamWork
DreamWorkDreamWork
DreamWork
 
Ashley akumal
Ashley akumalAshley akumal
Ashley akumal
 
Karen excellence pm
Karen excellence pmKaren excellence pm
Karen excellence pm
 
Curriculumm
CurriculummCurriculumm
Curriculumm
 
Коротко о планировании на год для ССУ
Коротко о планировании на год для ССУКоротко о планировании на год для ССУ
Коротко о планировании на год для ССУ
 

Similar to MapReduce basic

2004 map reduce simplied data processing on large clusters (mapreduce)
2004 map reduce simplied data processing on large clusters (mapreduce)2004 map reduce simplied data processing on large clusters (mapreduce)
2004 map reduce simplied data processing on large clusters (mapreduce)
anh tuan
 
Hadoop interview question
Hadoop interview questionHadoop interview question
Hadoop interview question
pappupassindia
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
Mohamed Ali Mahmoud khouder
 
Seminar Presentation Hadoop
Seminar Presentation HadoopSeminar Presentation Hadoop
Seminar Presentation Hadoop
Varun Narang
 

Similar to MapReduce basic (20)

MapReduce-Notes.pdf
MapReduce-Notes.pdfMapReduce-Notes.pdf
MapReduce-Notes.pdf
 
Hadoop
HadoopHadoop
Hadoop
 
B04 06 0918
B04 06 0918B04 06 0918
B04 06 0918
 
Introduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopIntroduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to Hadoop
 
Hadoop training-in-hyderabad
Hadoop training-in-hyderabadHadoop training-in-hyderabad
Hadoop training-in-hyderabad
 
Hadoop and Mapreduce Introduction
Hadoop and Mapreduce IntroductionHadoop and Mapreduce Introduction
Hadoop and Mapreduce Introduction
 
Map reduce
Map reduceMap reduce
Map reduce
 
2004 map reduce simplied data processing on large clusters (mapreduce)
2004 map reduce simplied data processing on large clusters (mapreduce)2004 map reduce simplied data processing on large clusters (mapreduce)
2004 map reduce simplied data processing on large clusters (mapreduce)
 
Lecture 1 mapreduce
Lecture 1  mapreduceLecture 1  mapreduce
Lecture 1 mapreduce
 
Hadoop interview questions
Hadoop interview questionsHadoop interview questions
Hadoop interview questions
 
Hadoop ppt2
Hadoop ppt2Hadoop ppt2
Hadoop ppt2
 
B04 06 0918
B04 06 0918B04 06 0918
B04 06 0918
 
E031201032036
E031201032036E031201032036
E031201032036
 
Hadoop interview questions - Softwarequery.com
Hadoop interview questions - Softwarequery.comHadoop interview questions - Softwarequery.com
Hadoop interview questions - Softwarequery.com
 
Hadoop interview question
Hadoop interview questionHadoop interview question
Hadoop interview question
 
Seminar_Report_hadoop
Seminar_Report_hadoopSeminar_Report_hadoop
Seminar_Report_hadoop
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
 
Seminar Presentation Hadoop
Seminar Presentation HadoopSeminar Presentation Hadoop
Seminar Presentation Hadoop
 
Mapreduce2008 cacm
Mapreduce2008 cacmMapreduce2008 cacm
Mapreduce2008 cacm
 
map reduce Technic in big data
map reduce Technic in big data map reduce Technic in big data
map reduce Technic in big data
 

More from Chirag Ahuja (10)

Deploy hadoop cluster
Deploy hadoop clusterDeploy hadoop cluster
Deploy hadoop cluster
 
Word count example in hadoop mapreduce using java
Word count example in hadoop mapreduce using javaWord count example in hadoop mapreduce using java
Word count example in hadoop mapreduce using java
 
Big data introduction
Big data introductionBig data introduction
Big data introduction
 
Flume
FlumeFlume
Flume
 
Hbase
HbaseHbase
Hbase
 
Pig
PigPig
Pig
 
Hive : WareHousing Over hadoop
Hive :  WareHousing Over hadoopHive :  WareHousing Over hadoop
Hive : WareHousing Over hadoop
 
Mapreduce advanced
Mapreduce advancedMapreduce advanced
Mapreduce advanced
 
Hdfs
HdfsHdfs
Hdfs
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
 

Recently uploaded

Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
chadhar227
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
ranjankumarbehera14
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
nirzagarg
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
vexqp
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
Health
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Bertram Ludäscher
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
wsppdmt
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
nirzagarg
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
nirzagarg
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Klinik kandungan
 

Recently uploaded (20)

RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
 
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 

MapReduce basic

  • 2. What is Map Reduce  MapReduce is a programming model designed for processing large volumes of data in parallel by dividing the work into a set of independent tasks.  MapReduce programs are written in a particular style inf luenced by functional programming constructs, specifically idioms for processing lists of data.
  • 3. Map and Reduce  Conceptually, MapReduce programs transform lists of input data elements into lists of output data elements. A MapReduce program will do this twice, using two different list processing idioms: map, and reduce.
  • 4. Dataflow  A mapreduce job is the a unit of work that the client wants to be performed it consist of the input data and theMapReduce Program and configuration info .  Hadoop runs the job by dividing it into tasks ,of which there are two type :map tasks and the reduce tasks
  • 5. MapReduce How much data is getting processed at a time
  • 7.
  • 9. Combiner  The Combiner is a "mini-reduce" process which operates only on data generated by one machine.  Word count is a prime example for where a Combiner is useful. The Word Count program in listings 1--3 emits a (word, 1) pair for every instance of every word it sees. So if the same document contains the word "cat" 3 times, the pair ("cat", 1) is emitted three times; all of these are then sent to the Reducer.
  • 10. FAULT TOLERANCE  One of the primary reasons to use Hadoop to run your jobs is due to its high degree of fault tolerance.  The primary way that Hadoop achieves fault tolerance is through restarting tasks. Individual task nodes (TaskTrackers) are in constant communication with the head node of the system, called the JobTracker. If a TaskTracker fails to communicate with the JobTracker for a period of time (by default, 1 minute), the JobTracker will assume that the TaskTracker in question has crashed. The JobTracker knows which map and reduce tasks were assigned to each TaskTracker. .
  • 11. Hadoop Streaming  Hadoop provides an API to MapReduce that allows you to write your map and reduce functions in languages other than Java. Hadoop Streaming uses Unix standard streams as the interface between Hadoop and your program, so you can use any language that can read standard input and write to standard output to write your MapReduce program.