Hadoop Present - Open Enterprise Hadoop
Yifeng Jiang
An overview of the current state of Hadoop and of what Open Enterprise Hadoop means.
1.
Hadoop Present - Open Enterprise Hadoop
Yifeng Jiang, Solutions Engineer, Hortonworks, Inc.
July 26, 2015
© Hortonworks Inc. 2011 – 2015. All Rights Reserved
2.
Self-Introduction
蒋 逸峰 (Yifeng Jiang)
• Solutions Engineer @ Hortonworks Japan
• HBase book author
• It has been 10 years since I came to Japan…
• Hobby: mountain climbing
• Twitter: @uprush
3.
Agenda
• Hadoop Core Updates
• Data Access in Hadoop
• Hadoop Security
• Hadoop Management
4.
Hadoop Present
Enterprise Ready Hadoop
5.
Hadoop Community Activity
[Charts: Number of Issues Resolved; Number of Lines of Code Added]
Source: http://ajisakaa.blogspot.jp
6.
Open Leadership
[Chart: Code Contributed in 2014 by Organization, with Hortonworks labeled]
7.
A Team of Experts: Made Up of Core Members Deeply Involved in Development
History
• June 2011: Founded by 24 architects, developers, and operators who worked on the original Hadoop development at Yahoo!
• December 2014: Grew into a team of Hadoop experts with more than 600 employees

Apache Project   Committers   PMC Members
Hadoop           27           21
Pig              5            5
Hive             18           6
Tez              16           15
HBase            6            4
Phoenix          4            4
Accumulo         2            2
Storm            3            2
Slider           11           11
Falcon           5            3
Flume            1            1
Sqoop            1            1
Ambari           36           28
Oozie            3            2
Zookeeper        2            1
Knox             13           3
Ranger           11           n/a
TOTAL            164          109
8.
Hortonworks Data Platform 2.2 Stack
9.
Hadoop Core
HDFS + YARN: Data Operating System
10.
HDFS
Scalable & Efficient Data Lake Storage
11.
HDFS: More Efficient Data Lake Storage
• HDFS NFS Gateway
  – Mount HDFS path
• Erasure Coding (under development)
  – Reduce storage cost from 3x to 1.4x
• Tiered Storage
  – DataNode becomes a collection of tiered storages
  – DISK, SSD, RAM, ARCHIVAL
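
The 3x to 1.4x figure above can be checked with a little arithmetic. The sketch below assumes a Reed-Solomon layout with 10 data blocks and 4 parity blocks, which is one scheme that yields 1.4x; the slide does not name the scheme, and the function name storage_overhead is purely illustrative.

```python
def storage_overhead(data_blocks: int, parity_blocks: int) -> float:
    """Return raw bytes stored per byte of user data.

    Hypothetical helper for illustration; not part of any Hadoop API.
    """
    if data_blocks <= 0 or parity_blocks < 0:
        raise ValueError("data_blocks must be > 0 and parity_blocks >= 0")
    return (data_blocks + parity_blocks) / data_blocks


# 3-way replication behaves like 1 data block plus 2 extra full copies.
replication = storage_overhead(1, 2)

# Assumed layout (not stated on the slide): 10 data blocks + 4 parity blocks.
erasure_coded = storage_overhead(10, 4)

print(f"replication: {replication:.1f}x, erasure coding: {erasure_coded:.1f}x")
```

Under these assumptions the same user data needs a little under half the raw capacity that triple replication requires.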
12.
Storage Growth Challenges
• The storage needs of some clusters grow very fast
  – High volumes of data
  – More users and new use cases coming to Hadoop
• The only way to grow storage is to add more nodes
[Chart: Cluster Storage and Compute Capacity; Cluster Storage Utilization; Compute Utilization]
13.
Archival Storage Scenario
Data Usage
• Hot: less than 7 days old, with very high usage
• Warm: less than 1 month old, used ~20 times per month
• Cold: less than 3 months old, used 5 times per month
• Frozen: 3 months to 7 years old, used approximately 2 times per year
[Chart: Temperature of Data (labels: eBay, Hadoop); x-axis: Time (Data Age); y-axis: Frequency of Data Usage (per Month); regions: Hot Data, Warm Data, Cold Data]
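
The age thresholds above can be expressed as a small classifier. This is a minimal sketch that looks at data age only; it assumes 1 month = 30 days and 3 months = 90 days, and the function name data_temperature is hypothetical. The slide also ties each tier to a usage frequency, which this sketch ignores.

```python
def data_temperature(age_days: int) -> str:
    """Classify a dataset by age, following the tiers on the slide.

    Assumptions (not from the slide): a month is 30 days, and anything
    older than 3 months is reported as Frozen. The slide caps Frozen at
    7 years and does not say what happens to older data.
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days < 7:
        return "Hot"
    if age_days < 30:
        return "Warm"
    if age_days < 90:
        return "Cold"
    return "Frozen"


for age in (1, 10, 45, 400):
    print(age, data_temperature(age))
```

A real tiering job would likely combine age with measured access counts before moving data between tiers.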
14.
Archival Storage for Cost Efficiency
Scale storage independently from compute.
Archival Storage Tier
• Deploy storage-dense hardware nodes
• Utilize storage policies for datasets:
  – Hot, Warm, Cold
• Achieve ~4x lower price point per GB
[Chart: Cluster Storage Capacity; Cluster Storage Utilization; Compute Utilization; Cluster Compute Capacity]
15.
HDFS Storage Architecture - Before
16.
HDFS Storage Architecture - Now
17.
Storage Policy: SSD & Hot
[Diagram: HDP cluster with I2.8x nodes (SSD) and d2.8x nodes (DISK), showing replica placement for DataSets A and B]
• SSD: all replicas on SSD. DataSet A (e.g., HBase)
• Hot: all replicas on DISK. DataSet B (others)
18.
Storage Policy: Trying It Out
In Ambari, create HDFS Configuration Groups:
• A group for I2
• A group for D2
In Ambari, define the DataNode storage types and paths for each group by setting dfs.datanode.data.dir as follows:
• I2 group: [SSD]/hadoop/hdfs/data1,[SSD]/hadoop/hdfs/data2,…
• D2 group: [DISK]/hadoop/hdfs/data1,[DISK]/hadoop/hdfs/data2,…
Restart HDFS.
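
The per-group dfs.datanode.data.dir values above follow a simple pattern: each path is prefixed with its storage type in square brackets, and the entries are joined with commas. The helper below is a hypothetical sketch for generating such a value; the function name and the validation set are assumptions for illustration, not part of Ambari or HDFS.

```python
# Storage type tags assumed for validation in this sketch.
KNOWN_STORAGE_TYPES = {"DISK", "SSD", "ARCHIVE", "RAM_DISK"}


def datanode_data_dirs(storage_type: str, paths: list[str]) -> str:
    """Build a dfs.datanode.data.dir value with a storage type tag per path."""
    if storage_type not in KNOWN_STORAGE_TYPES:
        raise ValueError(f"unknown storage type: {storage_type}")
    if not paths:
        raise ValueError("at least one data directory is required")
    return ",".join(f"[{storage_type}]{path}" for path in paths)


dirs = ["/hadoop/hdfs/data1", "/hadoop/hdfs/data2"]
print("I2 group:", datanode_data_dirs("SSD", dirs))
print("D2 group:", datanode_data_dirs("DISK", dirs))
```

Generating the value this way keeps the two configuration groups consistent when the number of data directories differs between node types.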
19.
Setting a Storage Policy
$ hdfs dfs -mkdir /hbase
$ hdfs dfsadmin -setStoragePolicy /hbase ALL_SSD
Set storage policy ALL_SSD on /hbase
$ hdfs dfsadmin -getStoragePolicy /ssd
The storage policy of /ssd:
BlockStoragePolicy{ALL_SSD:12, storageTypes=[SSD], creationFallbacks=[DISK], replicationFallbacks=[DISK]}
Store all HBase data on SSD (i2)
• Set everything under /hbase to ALL_SSD
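
The policy output above lists preferred storage types plus fallbacks. The sketch below is a deliberately simplified model of how such a policy might resolve to one storage type per replica; it is not the actual HDFS placement code, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePolicy:
    """Simplified stand-in for the fields printed by -getStoragePolicy."""
    name: str
    storage_types: tuple
    creation_fallbacks: tuple


# Values copied from the policy output shown on the slide.
ALL_SSD = StoragePolicy("ALL_SSD", ("SSD",), ("DISK",))


def choose_storage_types(policy, replication, available):
    """Pick one storage type per replica, falling back when needed.

    Assumption: the last listed storage type is reused for any replicas
    beyond the length of the list.
    """
    chosen = []
    for i in range(replication):
        wanted = policy.storage_types[min(i, len(policy.storage_types) - 1)]
        if wanted in available:
            chosen.append(wanted)
            continue
        fallback = next(
            (t for t in policy.creation_fallbacks if t in available), None
        )
        if fallback is None:
            raise RuntimeError(f"no usable storage type for policy {policy.name}")
        chosen.append(fallback)
    return chosen


print(choose_storage_types(ALL_SSD, 3, {"SSD", "DISK"}))
print(choose_storage_types(ALL_SSD, 3, {"DISK"}))
```

In this model, a cluster without SSD volumes would still accept writes under ALL_SSD, but every replica would land on DISK.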
20.
HDFS: Next Step
• Erasure Coding GA (HDFS-7285)
• Ozone: an object store in HDFS (HDFS-7240)
21.
YARN
Extends Hadoop into a Data OS
22.
Recap: What's YARN
Cluster Resource Management
• Resource sharing
  – Capacity scheduler
  – Fair sharing: pluggable queue policies (new)
• Isolation
  – Memory, CPU
  – Node labels (new)
• Workload types
  – Batch, interactive, in-memory
23.
Exclusive Node Labels Enable Isolated Partitions (HDP 2.2)
• Configure partitions
• Exclusive labels enforce isolation
[Diagram: cluster nodes carrying label S for Storm, with apps S (Storm) and B]
24.
Non-Exclusive Node Labels (HDP 2.3, YARN-3214)
• Configure non-exclusive labels
• Schedule if free capacity
[Diagram: cluster nodes carrying label S for Spark, with apps S (Spark) and B]
25.
Working with Labels
• Ambari YARN Guided Configuration: enable node labels
• YARN CLI: create and assign labels
• ResourceManager UI: view node labels in the cluster
• Capacity Scheduler View: define workload management policy with labels
$ yarn rmadmin -addToClusterNodeLabels "spark(exclusive=false)"
$ yarn cluster -list-node-labels
$ yarn rmadmin -replaceLabelsOnNode "node5=spark"
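
The difference between exclusive and non-exclusive labels described on the preceding slides can be summarized as a small decision function. This is a simplified model for illustration only, not the Capacity Scheduler's actual logic; the function name and parameters are hypothetical.

```python
from typing import Optional


def can_schedule(app_label: Optional[str], node_label: Optional[str],
                 exclusive: bool, node_has_free_capacity: bool) -> bool:
    """Decide whether an app may run on a node in this simplified model.

    app_label:  label the app requested, or None for no label
    node_label: label assigned to the node, or None for an unlabeled node
    exclusive:  whether the node's label was created as exclusive
    """
    if not node_has_free_capacity:
        return False
    if node_label is None:
        # Unlabeled nodes serve apps that did not request a label.
        return app_label is None
    if app_label == node_label:
        return True
    # Apps without a label may borrow idle capacity on non-exclusive labels.
    return app_label is None and not exclusive


# App B (no label) on a node labeled "spark":
print(can_schedule(None, "spark", exclusive=True, node_has_free_capacity=True))
print(can_schedule(None, "spark", exclusive=False, node_has_free_capacity=True))
```

With exclusive=True the labeled nodes form an isolated partition; with exclusive=False they can absorb other work whenever they have spare capacity.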
26.
YARN: Next Step
Disk & network isolation (YARN-2619, YARN-2140)
• Isolation only: enforce equal sharing of disk and network I/O across containers running on a node
• Currently in technical preview in HDP 2.3
• Disk resource: local disk IOPS, not HDFS reads/writes
• Network resource: outbound bandwidth only (Mbit/s)
27.
Data Access Innovation
SQL, Spark, Stream Processing, Search
28.
Hive: Enterprise SQL at Hadoop Scale
Native transactions
• Delivered: Insert, Update, Delete
Performance: 100x faster
• ORC File
• Hive on Tez
• Cost Based Optimizer
• Vectorized SQL engine
29.
Hive: Next Step
SQL Enhancement
• Transactions: BEGIN, COMMIT, ROLLBACK
• SQL 2011 Analytics
Performance
• Sub-second response: LLAP, HBase as metastore, etc.
See also: Apache Hiveの今とこれから (Apache Hive: Present and Future)
30.
Spark Features – HDP 2.3.x & Spark 1.3.1
Supported
• Spark Core
• MLlib
• Spark on YARN
• Kerberos
• Ambari support
Tech Preview
• SparkSQL*
• Spark Streaming
• DataFrame
• Spark ML Pipeline API
Unsupported
• GraphX
• BlinkDB
• Spark Standalone/Mesos
31.
Spark and Hadoop – How Can We Do Better?
• Resource Management: YARN for multi-tenant, diverse workloads with predictable SLAs
• Tiered Memory Storage: HDFS in-memory tier – External BlockStore for RDD cache
• SparkSQL & Hive for SQL: Interop with modern Metastore/HS2, optimized ORC support, advanced analytics, e.g. Geospatial
• Spark & NoSQL: Deep integration with HBase via DataSources/Catalyst for predicate/aggregate pushdown
• Connect the Dots – Algorithms to Use Cases: Higher-level ML abstractions, e.g. OneVsRest; validation, tuning, pipeline assembly... e.g. GeoSpatial
[Diagram: YARN: Data Operating System; Storage; Resource Management; Governance; Security; Operations]
32.
Spark and Hadoop – How Can We Do Better?
• Ease of Use: Apache Zeppelin for interactive notebooks
• Metadata & Governance: Apache Atlas for metadata & Apache Falcon support for Spark pipelines
• Security & Operations: Apache Ranger managed authorization and deployment/management via Apache Ambari
• Deployable Anywhere: Linux, Windows, on-premises or cloud
• Self-Service Spark in the Cloud: Easy launch of Data Science clusters via Cloudbreak and Ambari – for Azure, AWS, GCP, OpenStack, Docker
[Diagram: YARN: Data Operating System; Storage; Resource Management; Governance; Security; Operations]
33.
Platform Innovation for Data Access
An integrated, scalable platform for data access powered by HDP
• Limitless storage
• Deep analytics
• Real-time access
34.
Security
End to End Security in Hadoop
35.
Five Security Requirements
[Diagram labels: Authentication (Kerberos); Authorization; Audit; Encryption; HDP 2.3 security support: Ranger, HDFS]
See also: Hadoop Security Overview
36.
HDFS Fully Secure Flow – End to End Security
Numbered steps in the diagram:
1. Original request with user id/password
2. Knox authenticates user/pass
3. Knox gets service ticket for Hive
4. Knox calls as proxy user
5. Ranger AuthZ
6. Hive creates MapReduce using NN ST
Unnumbered steps in the diagram:
• Use Hive ST, submit query
• Hive gets NameNode (NN) service ticket
• Client gets query result
• Ranger: sync users/groups from LDAP
[Diagram components: O/JDBC Client, Apache Knox, HiveServer2, KDC, LDAP, Ranger; connections labeled SSL and SASL]
37.
Ranger: Central Security Administration
• Table/column access control
• Audit logging
• Flexible definition
Control group/user permissions
38.
Hadoop Management
Ambari: Hadoop for Everyone, 100% Open Source
39.
What's Apache Ambari?
A 100% open source operational platform to provision, manage and monitor Hadoop clusters
40.
Apache Ambari Mission
Easy operation at scale
• Install, manage and monitor large-scale clusters
• Efficient and scalable at scale
Easy to extend with community
• Innovate with the community
• Integrate with enterprise software
• Accelerate new features and adoption
Ease of use
• Centralized management for the whole Hadoop stack
• Access point for all Hadoop users, not just cluster management
41.
Ambari 2.1 HDP Stack High Availability

Component                HDP Stack   Mode
HDFS: NameNode           HDP 2.0+    Active/Standby
YARN: ResourceManager    HDP 2.1+    Active/Standby
HBase: HBaseMaster       HDP 2.1+    Multi-master
Hive: HiveServer2        HDP 2.1+    Multi-instance
Hive: Hive Metastore     HDP 2.1+    Multi-instance
Hive: WebHCat Server     HDP 2.1+    Multi-instance
Oozie: Oozie Server      HDP 2.1+    Multi-instance
Storm: Nimbus Server     HDP 2.3     Multi-instance
Ranger: AdminServer      HDP 2.3     Multi-instance

(Ambari 2.0 / Ambari 2.1 support columns: marks not preserved)
42.
Install Wizard
43.
Guided Configs for HDFS
44.
Guided Configs for YARN & MapReduce
45.
Enable Features in YARN
46.
Cluster Dashboard
47.
Service Dashboard
48.
Service Manage - HDFS
49.
Host Manage
50.
Monitor & Alert
Notifications: Email, SNMP, Script (new)
51.
User Views – HDFS File View
Files View: Browse the HDFS file system.
52.
User Views – YARN CS, Tez
Capacity Scheduler View: Browse and manage YARN queues.
Tez View: View information related to Tez jobs that are executing on the cluster.
53.
User Views – Pig, Hive
Pig View: Author and execute Pig scripts.
Hive View: Author, execute and debug Hive queries.
54.
Summary
55.
Open Enterprise Hadoop
Hadoop/YARN-powered data operating system
100% open source, multi-tenant data platform for any application, any data set, anywhere.
Built on a centralized architecture of shared enterprise services:
• Scalable tiered storage
• Resource and workload management
• Trusted data governance & metadata management
• Consistent operations
• Comprehensive security
• Developer APIs and tools
[Diagram: YARN: data operating system; Governance; Security; Operations; Resource management; Data access: batch, interactive, real-time; Storage; deployment options: Commodity, Appliance, Cloud]
56.
Thank you
Yifeng Jiang, Solutions Engineer, Hortonworks
@uprush