Apache Hadoop 2.0: Migration from 1.0 to 2.0
Strata conference 2014: http://strataconf.com/strata2014/public/schedule/detail/32247
Vinod Kumar Vavilapalli (Hortonworks)
4:50pm Wednesday, 02/12/2014
Hadoop and Beyond
GA Ballroom
The Hadoop 2.0 revolution is in full force! Organizations, companies, and users are all gearing up for the major move from Hadoop 1.0 to Hadoop 2.0. In this talk, we will discuss what Hadoop 2.0 is about, what YARN is, how YARN turns Hadoop into an all-in-one data processing platform, what features HDFS2 unlocks, and what it means to move to Hadoop 2.0. We’ll discuss this major migration from 1.0 to 2.0 from various perspectives – admins, frameworks, end users and data processing platforms. We’ll cover what it means for existing clusters to upgrade, and how existing applications can move to Hadoop 2.0 while making use of all the great stuff that Hadoop 2.0 unlocks – better utilization, performance, scalability, reliability and more powerful programming models.
Graph processing – Giraph, Hama
Stream processing – Samza, Storm, Spark, DataTorrent
MapReduce
Tez – fast query execution
Weave/REEF – frameworks to help with writing applications

This is a list of some of the applications which already support YARN, in some form. Samza, Storm, S4 and DataTorrent are streaming frameworks. Giraph and Hama are graph processing systems. There are also some GitHub projects, such as caching systems and on-demand web-server spin-up. Weave and REEF are frameworks on top of YARN that make writing applications easier.