1
Apache Hadoop at 10
(+ The Next 10 Years)
2
3
A Decade of Hadoop History on One Slide
Ten years ago, “Hadoop” referred to a scalable, fault-tolerant
filesystem (HDFS) and programming framework (MapReduce)
for distributed computing.
Today, it refers both to a kernel containing those pieces and
to a constantly evolving ecosystem of 25+ data stores,
execution engines, programming and data access frameworks,
and other components.
Recognize this guy?
4
Fast Historical Facts
• The code that eventually became Hadoop was written by
Doug Cutting and Mike Cafarella, open source developers
working in the search tech community, as part of the
Nutch project.
• The word “hadoop” originated with Cutting’s young son,
who owned a plush toy elephant he gave that name.
• Yahoo! was the first user of Hadoop in large-scale
production, and Cutting did early work on Hadoop there.
• Eventually, Cutting joined Cloudera as its chief architect
and remains there to this day.
5
The Original Inspirations for Hadoop
Google’s “Google File System” paper (2003) and “MapReduce” paper (2004)
6
Hadoop’s Original Architecture
MapReduce
(Data Processing and Resource Management)
HDFS
(Filesystem/Storage)
7
Timeline (Abridged): The Invention Years (2002-2004)
• Doug Cutting and Mike Cafarella create Nutch, an open source
web crawler (October)
• Google publishes its “Google File System” paper (October)
• Cutting & Cafarella implement Nutch features that will become
HDFS (June)
• Google publishes its “MapReduce” paper (October)
8
Timeline (Abridged): The Incubation Years (2005-2007)
• Cafarella spearheads an implementation of MapReduce in Nutch
(February)
• Cutting joins Yahoo!; starts Hadoop subproject by carving code
from Nutch (January)
• Yahoo! creates its first Hadoop cluster for R&D (March)
• Google publishes “Bigtable” paper, which eventually will
inspire creation of HBase (November)
• First Hadoop User Group meeting, in Palo Alto, CA (October)
• Community contributions begin to rise steeply
• First Apache release of Hadoop (April)
9
Timeline (Abridged): The Coming-Out Years (2008-2009)
• Hadoop becomes a Top Level ASF project (January)
• Initial publication of Hadoop: The Definitive Guide, by Tom
White (June)
• Cutting joins Cloudera as its chief architect (August)
• Inaugural Hadoop World conference convenes in New York
(October)
• Yahoo! launches world’s largest Hadoop application (February)
• Hive, Hadoop’s first SQL framework, becomes a Hadoop
sub-project (June)
• Cloudera, first company to commercialize Hadoop, is founded
(August)
• Initial Apache release of Pig (November)
10
Timeline (Abridged): The Rapid Adoption Years (2010-2015)
• The extended Hadoop community busily builds out a plethora of
new components (Crunch, Sqoop, Flume, Oozie, etc.) that extend
Hadoop use cases and usability
• HDFS NameNode HA, a significant new feature for enterprise
adoption, merges into Hadoop trunk (March)
• YARN, another important advance for adoption, becomes a Hadoop
subproject (August)
• Impala, the first native MPP query engine for Hadoop data,
joins the ecosystem (October)
• Spark, the emerging default execution engine for Hadoop,
becomes a Top Level ASF project (February)
• Kudu, the first native storage option for Hadoop since HBase,
joins the ASF Incubator, as does Impala (December)
11
Evolution of the Hadoop Platform
Components added in each year (each year’s stack also includes
everything from the years before it):
• 2006: Core Hadoop (HDFS, MapReduce)
• 2007: Solr, Pig
• 2008: HBase, ZooKeeper
• 2009: Hive, Mahout
• 2010: Sqoop, Avro
• 2011: Flume, Bigtop, Oozie, HCatalog, Hue, YARN
• 2012: Spark, Tez, Impala, Kafka, Drill
• 2013: Parquet, Sentry
• 2014: Knox, Flink
• 2015: Kudu, RecordService, Ibis, Falcon
The stack is continually evolving and growing!
12
Hadoop Architecture Today
13
Why Did Hadoop Succeed?
1. Open source community and license
A large and diverse community of developers has historically made, and continues to
make, the Hadoop ecosystem among the most active and engaged in history, while
the Apache License lowers the barrier to entry for users.
2. Extensibility/adaptability
With the possible exception of Linux, no other complex platform has evolved on so
many levels, and so quickly, to meet user requirements over time.
3. A strong focus on systems
The roots of Hadoop are in making distributed computing infrastructure more
accessible to application developers. That focus continues to bear fruit in
areas like resource management and security.
14
Hadoop’s Next 10 Years
Interest in public-cloud
deployments is driving
native support for them
into the platform.
Rapid hardware
advances are forcing the
community to re-think
Hadoop’s foundations.
Data sources are more
numerous, distributed,
and diverse (IoT), and
Hadoop will adapt.
15
The Use Case Only Gets Stronger
“Much of the progress we will make in this century
will come from increased understanding of the data
we generate.”
- Doug Cutting
16
Learn More
cloudera.com/hadoop10

 
Introduction-to-Software-Development-Outsourcing.pptx
Introduction-to-Software-Development-Outsourcing.pptxIntroduction-to-Software-Development-Outsourcing.pptx
Introduction-to-Software-Development-Outsourcing.pptxIntelliSource Technologies
 
online pdf editor software solutions.pdf
online pdf editor software solutions.pdfonline pdf editor software solutions.pdf
online pdf editor software solutions.pdfMeon Technology
 

Recently uploaded (20)

Sales Territory Management: A Definitive Guide to Expand Sales Coverage
Sales Territory Management: A Definitive Guide to Expand Sales CoverageSales Territory Management: A Definitive Guide to Expand Sales Coverage
Sales Territory Management: A Definitive Guide to Expand Sales Coverage
 
Cybersecurity Challenges with Generative AI - for Good and Bad
Cybersecurity Challenges with Generative AI - for Good and BadCybersecurity Challenges with Generative AI - for Good and Bad
Cybersecurity Challenges with Generative AI - for Good and Bad
 
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLBig Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
 
ARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdf
ARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdfARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdf
ARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdf
 
Top Software Development Trends in 2024
Top Software Development Trends in  2024Top Software Development Trends in  2024
Top Software Development Trends in 2024
 
Optimizing Business Potential: A Guide to Outsourcing Engineering Services in...
Optimizing Business Potential: A Guide to Outsourcing Engineering Services in...Optimizing Business Potential: A Guide to Outsourcing Engineering Services in...
Optimizing Business Potential: A Guide to Outsourcing Engineering Services in...
 
Streamlining Your Application Builds with Cloud Native Buildpacks
Streamlining Your Application Builds  with Cloud Native BuildpacksStreamlining Your Application Builds  with Cloud Native Buildpacks
Streamlining Your Application Builds with Cloud Native Buildpacks
 
Why Choose Brain Inventory For Ecommerce Development.pdf
Why Choose Brain Inventory For Ecommerce Development.pdfWhy Choose Brain Inventory For Ecommerce Development.pdf
Why Choose Brain Inventory For Ecommerce Development.pdf
 
JS-Experts - Cybersecurity for Generative AI
JS-Experts - Cybersecurity for Generative AIJS-Experts - Cybersecurity for Generative AI
JS-Experts - Cybersecurity for Generative AI
 
How Does the Epitome of Spyware Differ from Other Malicious Software?
How Does the Epitome of Spyware Differ from Other Malicious Software?How Does the Epitome of Spyware Differ from Other Malicious Software?
How Does the Epitome of Spyware Differ from Other Malicious Software?
 
Leveraging DxSherpa's Generative AI Services to Unlock Human-Machine Harmony
Leveraging DxSherpa's Generative AI Services to Unlock Human-Machine HarmonyLeveraging DxSherpa's Generative AI Services to Unlock Human-Machine Harmony
Leveraging DxSherpa's Generative AI Services to Unlock Human-Machine Harmony
 
Enterprise Document Management System - Qualityze Inc
Enterprise Document Management System - Qualityze IncEnterprise Document Management System - Qualityze Inc
Enterprise Document Management System - Qualityze Inc
 
Deep Learning for Images with PyTorch - Datacamp
Deep Learning for Images with PyTorch - DatacampDeep Learning for Images with PyTorch - Datacamp
Deep Learning for Images with PyTorch - Datacamp
 
Generative AI for Cybersecurity - EC-Council
Generative AI for Cybersecurity - EC-CouncilGenerative AI for Cybersecurity - EC-Council
Generative AI for Cybersecurity - EC-Council
 
Your Vision, Our Expertise: TECUNIQUE's Tailored Software Teams
Your Vision, Our Expertise: TECUNIQUE's Tailored Software TeamsYour Vision, Our Expertise: TECUNIQUE's Tailored Software Teams
Your Vision, Our Expertise: TECUNIQUE's Tailored Software Teams
 
20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.
20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.
20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.
 
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...
 
Introduction-to-Software-Development-Outsourcing.pptx
Introduction-to-Software-Development-Outsourcing.pptxIntroduction-to-Software-Development-Outsourcing.pptx
Introduction-to-Software-Development-Outsourcing.pptx
 
Salesforce AI Associate Certification.pptx
Salesforce AI Associate Certification.pptxSalesforce AI Associate Certification.pptx
Salesforce AI Associate Certification.pptx
 
online pdf editor software solutions.pdf
online pdf editor software solutions.pdfonline pdf editor software solutions.pdf
online pdf editor software solutions.pdf
 

Apache Hadoop at 10

  • 1. 1 Apache Hadoop at 10 (+ The Next 10 Years)
  • 2. 2
  • 3. 3 A Decade of Hadoop History on One Slide Ten years ago, “Hadoop” referred to a scalable, fault-tolerant filesystem (HDFS) and programming framework (MapReduce) for distributed computing. Today, it refers to both a kernel containing the aforementioned pieces, as well as a constantly evolving ecosystem of 25+ data stores, execution engines, programming and data access frameworks, and other componentry. Recognize this guy?
  • 4. 4 Fast Historical Facts • The code that eventually became Hadoop was written by Doug Cutting and Mike Cafarella, open source developers working in the search tech community, as part of the Nutch project. • The word “hadoop” originated with Cutting’s young son, who owned a plush toy elephant he gave that name. • Yahoo! was the first user of Hadoop in large-scale production, and Cutting did early work on Hadoop there. • Eventually, Cutting joined Cloudera as its chief architect and remains there to this day.
  • 5. 5 The Original Inspirations for Hadoop 2003 2004
  • 6. 6 Hadoop’s Original Architecture MapReduce (Data Processing and Resource Management) HDFS (Filesystem/Storage)
  • 7. 7 Timeline (Abridged): The Invention Years (2002 to 2004)
      • Doug Cutting and Mike Cafarella create Nutch, an open source web crawler (October)
      • Google publishes its “Google File System” paper (October)
      • Cutting & Cafarella implement Nutch features that will become HDFS (June)
      • Google publishes its “MapReduce” paper (October)
  • 8. 8 Timeline (Abridged): The Incubation Years (2005 to 2007)
      • Cafarella spearheads an implementation of MapReduce in Nutch (February)
      • Cutting joins Yahoo!; starts Hadoop subproject by carving code from Nutch (January)
      • Yahoo! creates its first Hadoop cluster for R&D (March)
      • Google publishes “Bigtable” paper, which eventually will inspire creation of HBase (November)
      • First Hadoop User Group meeting (in Palo Alto, CA) (October)
      • Community contributions begin to rise steeply
      • First Apache release of Hadoop (April)
  • 9. 9 Timeline (Abridged): The Coming-Out Years (2008 to 2009)
      • Hadoop becomes a Top Level ASF project (January)
      • Initial publication of Hadoop: The Definitive Guide, by Tom White (June)
      • Cutting joins Cloudera as its chief architect (August)
      • Inaugural Hadoop World conference convenes in New York (October)
      • Yahoo! launches world’s largest Hadoop application (February)
      • Hive, Hadoop’s first SQL framework, becomes a Hadoop sub-project (June)
      • Cloudera, first company to commercialize Hadoop, is founded (August)
      • Initial Apache release of Pig (November)
  • 10. 10 Timeline (Abridged): The Rapid Adoption Years (2010-11, 2012, 2013-14, 2015)
      • The extended Hadoop community busily builds out a plethora of new components (Crunch, Sqoop, Flume, Oozie, etc.) that extend Hadoop use cases and usability
      • HDFS NameNode HA, a significant new feature for enterprise adoption, merges into Hadoop trunk (March)
      • YARN, another important advance for adoption, becomes a Hadoop subproject (August)
      • Impala, the first native MPP query engine for Hadoop data, joins the ecosystem (October)
      • Spark, the emerging default execution engine for Hadoop, becomes a Top Level ASF project (February)
      • Kudu, the first native storage option for Hadoop since HBase, joins the ASF Incubator (as does Impala) (December)
  • 11. 11 Evolution of the Hadoop Platform (each year's stack includes all components from earlier years)
      • 2006: Core Hadoop (HDFS, MapReduce)
      • 2007: adds Solr, Pig
      • 2008: adds HBase, ZooKeeper
      • 2009: adds Hive, Mahout
      • 2010: adds Sqoop, Avro
      • 2011: adds Flume, Bigtop, Oozie, HCatalog, Hue, YARN
      • 2012: adds Spark, Tez, Impala, Kafka, Drill
      • 2013: adds Parquet, Sentry
      • 2014: adds Knox, Flink
      • 2015: adds Kudu, RecordService, Ibis, Falcon
      • The stack is continually evolving and growing!
  • 13. 13 Why Did Hadoop Succeed?
      1. Open source community and license: A large and diverse community of developers has made, and continues to make, the Hadoop ecosystem among the most active and engaged in history, while the Apache License lowers the barrier to entry for users.
      2. Extensibility/adaptability: With the possible exception of Linux, no other complex platform has evolved on so many levels, and so quickly, to meet user requirements over time.
      3. A strong focus on systems: The roots of Hadoop are in making distributed computing infrastructure more accessible to application developers. That focus continues to bear fruit in areas like resource management and security.
  • 14. 14 Hadoop’s Next 10 Years
      • Interest in public-cloud deployments is driving native support for them into the platform.
      • Rapid hardware advances are forcing the community to re-think Hadoop’s foundations.
      • Data sources are more numerous, distributed, and diverse (IoT), and Hadoop will adapt.
  • 15. 15 The Use Case Only Gets Stronger: “Much of the progress we will make in this century will come from increased understanding of the data we generate.” - Doug Cutting

Editor's Notes

  1. Kick off the presentation by playing the Doug Cutting “Hadoop 10” video. https://www.youtube.com/watch?v=XHz_R33QnsI
  2. In the beginning, the word “Hadoop” referred to just two components. Fast forward a decade, and that word now refers to that “kernel” (aka Core Hadoop) as well as to a growing ecosystem of related projects. In that sense, Hadoop now has much in common with Linux, which is also both a kernel and an ecosystem.
  3. These are the very high-level historical facts about Hadoop. The timeline to follow contains much more detail.
  4. As a very basic explanation, Hadoop was originally an open source implementation of internal systems built by Google in the early ’00s to deal with the extraordinarily resource-intensive problem of indexing the Internet every night. Those systems were first described in these papers, and Cutting and Cafarella, who faced similar problems with Nutch, took notice of them quickly. (Later, Google also published its “Bigtable” paper, which led other developers to create HBase.) As Cutting puts it, periodically, “Google sends us messages from the future.”
  5. Cutting & Cafarella’s initial implementation of these systems consisted of just 2 components: MapReduce and HDFS.
  6. This timeline is abridged for brevity, but it contains some major milestones.
  7. The rapid expansion of the Hadoop ecosystem is further evidence of its meteoric adoption.
  8. With the expansion of that ecosystem, “Hadoop” has grown much, much bigger than its original “core.”
  9. Even with this history, one has to ask: Why did Hadoop succeed?
  10. What does the future hold for Hadoop? There are many possible permutations, but these are just a few of the more obvious influences going forward.
  11. Regardless of what Hadoop looks like in 10 or 20 years, it’s indisputable that the use cases for it will grow stronger as data volume, variety, and velocity expand. There will be no clearer driver of progress than the ability to translate raw data into actionable insight.