SlideShare a Scribd company logo
1 of 24
© Hortonworks Inc. 2014
Hadoop Security Today &
Tomorrow
Amsterdam - April3rd, 2014
Vinay Shukla
Twitter: @NeoMythos
© Hortonworks Inc. 2014
Agenda
• What is Hadoop Security?
– 4 Security Pillars & Rings of Defense
• What security elements exists today?
– Authentication
– Authorization
– Audit
– Data Protection
• What is on the security roadmap?
– Coming soon
– Longer term projects
• Securing Hadoop with Apache Knox Gateway
– Knox overview
– Demo
• How to get involved
© Hortonworks Inc. 2014
Two Reasons for Security in Hadoop
Hadoop Contains Sensitive Data
–As Hadoop adoption grows so too has the types of data
organizations look to store. Often the data is proprietary
or personal and it must be protected.
–In this context, Hadoop is governed by the same
security requirements as any data center platform.
Hadoop is subject to Compliance adherence
–Organizations are often subject to comply with
regulations such as HIPPA, PCI DSS, FISAM that
require protection of personal information.
–Adherence to other Corporate security policies.
1
2
© Hortonworks Inc. 2014
What is Apache Hadoop Security?
Security in Apache Hadoop is
defined by four key pillars:
authentication, authorization, accou
ntability, and data protection.
© Hortonworks Inc. 2014
Security: Rings of Defense
Perimeter Level Security
• Network Security (i.e. Firewalls)
• Apache Knox (i.e. Gateways)
Data Protection
• Core Hadoop
• Partners
Authentication
• Kerberos
OS Security
Authorization
• MR ACLs
• HDFS Permissions
• HDFS ACLs
• HiveATZ-NG
• HBase ACLs
• Accumulo Label Security
Page 5
© Hortonworks Inc. 2014
Authentication in Hadoop Today…
Authentication
Who am I/prove it?
Control access to
cluster.
Authorization
Restrict access
to explicit data
Audit
Understand who
did what
Data Protection
Encrypt data at
rest & motion
Kerberos in native
Apache Hadoop
Perimeter
Security with
Apache Knox
Gateway
© Hortonworks Inc. 2014
Kerberos Authentication in Hadoop
For more than 20 years, Kerberos has been the de-facto
standard for strong authentication.
…no other option exists.
The design and implementation of Kerberos security in native Apache
Hadoop was delivered by Hortonworker Owen O’Malley in 2010.
What does Kerberos Do?
– Establishes identity for clients, hosts and services
– Prevents impersonation/passwords are never sent over the wire
– Integrates w/ enterprise identity management tools such as LDAP & Active Directory
– More granular auditing of data access/job execution
© Hortonworks Inc. 2014
• Single Hadoop
access point
• REST API hierarchy
• Consolidated API
calls
• Multi-cluster
support
• Eliminates SSH
“edge node”
• Central API
management
• Central audit control
• Simple Service
level Authorization
• SSO Integration –
Siteminder, API
Key*, OAuth* &
SAML*
• LDAP & AD
integration
Perimeter Security with Apache Knox
Integrated with
existing systems to
simplify identity
maintenance
Incubated and led by Hortonworks,
Apache Knox provides a simple and open
framework for Hadoop perimeter security.
Single, simple point
of access for a
cluster
Central controls
ensure consistency
across one or more
clusters
© Hortonworks Inc. 2014
Authentication & Audit in Hadoop today…
Authorization
Restrict access
to explicit data
Audit
Understand who
did what
Data Protection
Encrypt data at
rest & motion
Kerberos in native
Apache Hadoop
Perimeter
Security with
Apache Knox
Gateway
Native in Apache Hadoop
• MapReduce Access Control Lists
• HDFS Permissions
• Process Execution audit trail
Cell level access control in
Apache Accumulo
Authentication
Who am I/prove it?
Control access to
cluster.
© Hortonworks Inc. 2014
Authorization: Who can do what in Hadoop?
• Access Control Services exist for each of the Hadoop
components
–HDFS has file Permissions
–YARN, MapReduce, HBase has Access Control Lists (ACL)
–Accumulo Proves more granular label/cell level security
• Improvements to these services are being led by
Hortonworks Team:
–HDFS Improvements – Extended ACL, more flexible via multiple
policies on the same file or directory
–Hive Improvements – Hortonworks initiative called Hive ATZ-
NG, better integration allows familiar SQL/database syntax
(GRANT/REVOKE) and allows more clients (including partner
integrations) to be secure.
© Hortonworks Inc. 2014
Data Protection in Hadoop today…
Authorization
Restrict access
to explicit data
Audit
Understand who
did what
Data Protection
Encrypt data at
rest & motion
Kerberos in native
Apache Hadoop
Perimeter
Security with
Apache Knox
Gateway
Native in Apache Hadoop
• MapReduce Access Control Lists
• HDFS Permissions
• Process Execution audit trail
Cell level access control in
Apache Accumulo
Wire encryption
in native Apache
Hadoop
Orchestrated
encryption with
3rd party tools
Authentication
Who am I/prove it?
Control access to
cluster.
© Hortonworks Inc. 2014
Data Protection in Hadoop
must be applied at three different
layers in Apache Hadoop
Storage: encrypt data while it is at rest
Direct data flows “into” and “out of” 3rd party encryption tools and/or
rely upon hardware specific techniques (i.e. drive-level encryption).
Transmission: encrypt data as it is in motion
Native Apache Hadoop 2.0 provides wire encryption.
Upon Access: apply restrictions when accessed
Direct data flows “into” and “out of” 3rd party encryption tools.
Data Protection
© Hortonworks Inc. 2014
Data Protection – Details - Today
• Encryption of Data at Rest
–Option 1: OS or Hardware Level Encryption (Out of the Box)
–Option 2: Custom Development
–Option 3: Certified Partners
–Work underway for encryption in Hive, HDFS and HBase as core
platform capabilities.
• Encryption of Data on the Wire
–All wire protocols can be encrypted by HDP platform (2.x). Wire-level
encryption enhancements led by HWX Team.
• Column Level Encryption
–No current out of the box support in Hadoop.
–Certified Partners provide these capabilities.
© Hortonworks Inc. 2014
What can be done today?
Authorization
Restrict access
to explicit data
Audit
Understand who
did what
Data Protection
Encrypt data at
rest & motion
Kerberos in
native Apache
Hadoop
Perimeter
Security with
Apache Knox
Gateway
Native in Apache Hadoop
• MapReduce Access Control Lists
• HDFS Permissions
• Process Execution audit trail
Cell level access control in
Apache Accumulo
Service level Authorization with
Knox
Access Audit with Knox
Wire encryption
in native Apache
Hadoop
Wire Encryption
with Knox
Orchestrated
encryption with
3rd party tools
Authentication
Who am I/prove it?
Control access to
cluster.
© Hortonworks Inc. 2014
Hadoop Security
Hortonworks is Delivering Secure Hadoop for the Enterprise
Security for Hadoop must be addressed within
every layer of the stack and integrated into existing frameworks
For a full description of what is available in Enterprise Hadoop
today across Authentication, Authorization, Accountability and
Data Protection please visit our security labs page
Governance
&Integration
Security
Operations
Data Access
Data
Management
HDP 2.1
New: Apache Knox
Perimeter security for Hadoop
 A common place to preform authentication
across Hadoop and all related projects
 Integrated to LDAP and AD
 Currently supports:
WebHDFS, WebHCAT, Oozie, Hive & HBase
 Broad community effort, incubated with
Microsoft, broad set of developers involved
Security Investments
Security Phase 3:
• Audit event correlation and Audit viewer
• Data Encryption in HDFS, Hive & HBase
• Knox for HDFS HA, Ambari & Falcon
• Support Token-Based AuthN beyond Kerb
Security Phase 2:
• ACLs for HDFS
• Knox: Hadoop REST API Security
• SQL-style Hive AuthZ (GRANT, REVOKE)
• SSL support for Hive Server 2
• SSL for DN/NN UI & WebHDFS
• PAM support for Hive
Phase 1
• Strong AuthN with Kerberos
• HBase, Hive, HDFS basic AuthZ
• Encryption with SSL for NN, JT, etc.
• Wire encryption with Shuffle, HDFS, JDBC
© Hortonworks Inc. 2014
Hadoop Security: Phase 2
Page 16
HDP 2.1 Features
Release Theme REST API Security, Improve AuthZ, Wire Encryption
Specific Features • Hadoop REST API Security with Apache Knox
• Eliminates SSH edge node
• Single Hadoop access point
• LDAP, AD based Authentication
• Service-level Authorization
• Audit support for REST access
• SQL style Hive Authorization with fine grain access
• HDFS Access Control Lists
• SSL support in HiveServer2
• SSL support in NN/DN UI & WebHDFS
• Pluggable Authentication Module (PAM) in Hive
Included
Components
Apache Knox, Hive, HDFS
© Hortonworks Inc. 2014
Why Knox?
From fb.com/hadoopmemes
Apache Knox Gateway
• REST/HTTP API security for
Hadoop
• Eliminates SSH edge node
• Single REST API access point
• Centralized Authentication,
Authorization, and Audit for
Hadoop REST/HTTP services
• LDAP/AD Authentication,
Service Authorization, Audit etc.
Knox Eliminates
• Client’s requirements for intimate knowledge of cluster topology
© Hortonworks Inc. 2014
Knox Deployment with Hadoop Cluster
Application Tier
DMZ
Switch Switch
….
Master
Nodes
Rack 1
Switch
NN
SNN
….
Slave
Nodes
Rack 2
….
Slave
Nodes
Rack N
SwitchSwitch
DN DN
Web
Tier
LB
Knox
Hadoop
CLIs
© Hortonworks Inc. 2014
Hadoop REST API Security: Drill-Down
Page 19
REST
Client
Enterprise
Identity
Provider
LDAP/AD
Knox
Gateway
GW
GW
Firewall
Firewall
DMZ
L
B
Edge
Node/H
adoop
CLIs
RPC
HTTP
HTTP HTTP
LDAP
Hadoop Cluster 1
Masters
Slaves
RM
NN
Web
HCat
Oozie
DN NM
HS2
Hadoop Cluster 2
Masters
Slaves
RM
NN
Web
HCat
Oozie
DN NM
HS2
HBase
HBase
© Hortonworks Inc. 2014
Selects appropriate
service filter chain
based on request URL
mapping rules
REST
Client
Protocol
Listener
Listens for requests on the
appropriate protocols
(e.g. HTTP/HTTPS)
Service
Selector
Service Specific Filter Chain
Identity
Asserter
Filter
Dispatch
Rewrite
Filter
AuthN
Filter
Hadoop
Service
Enforces propagation of
authenticated identity to Hadoop
by modifying request
Streams request and
response to and from
Hadoop service based
on rewritten URLs
Translates URLs in request and
response between external and
internal URLs based on service
specific rules
Enterprise
Identity
Provider
Enterprise/Cl
oud SSO
Provider
Challenges client for
credentials and authenticates
or validates SSO Token
Service filter chains are composed
and configured at deployment time
by service specific plugins
What is Knox? Client > Knox > Hadoop Cluster
Page 20
Knox Gateway
© Hortonworks Inc. 2014© Hortonworks Inc. 2014
Knox Gateway in action
Submit MR job via Knox
Page 21
© Hortonworks Inc. 2014
HDFS & MR Operations with Knox
• Create a few directories
curl -iku guest:guest-password -X PUT 'https://localhost:8443/gateway/sandbox/webhdfs/v1/user/guest/test?op=MKDIRS&permission=777'
curl -iku guest:guest-password -X PUT "https://localhost:8443/gateway/sandbox/webhdfs/v1/user/guest/test/input?op=MKDIRS&permission=777"
curl -iku guest:guest-password -X PUT "https://localhost:8443/gateway/sandbox/webhdfs/v1/user/guest/test/lib?op=MKDIRS&permission=777"
• Upload files
curl -iku guest:guest-password -L -T samples/hadoop-examples.jar -X PUT https://localhost:8443/gateway/sandbox/webhdfs/v1/user/guest/test/lib/hadoop-
examples.jar?op=CREATE
curl -iku guest:guest-password -X PUT -L -T README -X PUT
"https://localhost:8443/gateway/sandbox/webhdfs/v1/user/guest/test/input/README?op=CREATE"
• Run MR job
curl -iku guest:guest-password -X POST -d arg=/user/guest/test/input -d arg=/user/guest/test/output -d jar=/user/guest/test/lib/hadoop-examples.jar -d
class=org.apache.hadoop.examples.WordCount https://localhost:8443/gateway/sandbox/templeton/v1/mapreduce/jar
• Query the jobs for a user
curl -iku guest:guest-password https://localhost:8443/gateway/sandbox/templeton/v1/queue
• Query the status of a given job
curl -iku guest:guest-password https://localhost:8443/gateway/sandbox/templeton/v1/queue/<job_id>
• Read the output file
curl -iku guest:guest-password -L -X GET https://localhost:8443/gateway/sandbox/webhdfs/v1/user/guest/test/output/part-r-00000?op=OPEN
• Remove a directory
curl -iku guest:guest-password -X DELETE "https://localhost:8443/gateway/sandbox/webhdfs/v1/user/guest/test?op=DELETE&recursive=true"
Page 22
© Hortonworks Inc. 2014
How to get Involved
Page 23
Resource Location
Security Labs http://hortonworks.com/labs/security/
Security Blogs http://hortonworks.com/blog/category/innovation/security/
Apache Knox
Tutorial
http://hortonworks.com/hadoop-tutorial/securing-hadoop-
infrastructure-apache-knox/
Need help? http://hortonworks.com/community/forums/forum/security/ or
vshukla@hortonworks.com
© Hortonworks Inc. 2014 Page 24
Thank you! Amsterdam - April3rd, 2014
Vinay Shukla
Twitter: @NeoMythos

More Related Content

What's hot

Apache NiFi User Guide
Apache NiFi User GuideApache NiFi User Guide
Apache NiFi User GuideDeon Huang
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveDataWorks Summit
 
Apache kafka performance(latency)_benchmark_v0.3
Apache kafka performance(latency)_benchmark_v0.3Apache kafka performance(latency)_benchmark_v0.3
Apache kafka performance(latency)_benchmark_v0.3SANG WON PARK
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache RangerDataWorks Summit
 
Collaborative Editing Tools for Alfresco
Collaborative Editing Tools for AlfrescoCollaborative Editing Tools for Alfresco
Collaborative Editing Tools for AlfrescoAngel Borroy López
 
NiFi Developer Guide
NiFi Developer GuideNiFi Developer Guide
NiFi Developer GuideDeon Huang
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka IntroductionAmita Mirajkar
 
Presto At Treasure Data
Presto At Treasure DataPresto At Treasure Data
Presto At Treasure DataTaro L. Saito
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiManish Gupta
 
[OpenStack Days Korea 2016] Track1 - 카카오는 오픈스택 기반으로 어떻게 5000VM을 운영하고 있을까?
[OpenStack Days Korea 2016] Track1 - 카카오는 오픈스택 기반으로 어떻게 5000VM을 운영하고 있을까?[OpenStack Days Korea 2016] Track1 - 카카오는 오픈스택 기반으로 어떻게 5000VM을 운영하고 있을까?
[OpenStack Days Korea 2016] Track1 - 카카오는 오픈스택 기반으로 어떻게 5000VM을 운영하고 있을까?OpenStack Korea Community
 

What's hot (20)

Apache NiFi Crash Course Intro
Apache NiFi Crash Course IntroApache NiFi Crash Course Intro
Apache NiFi Crash Course Intro
 
Apache NiFi User Guide
Apache NiFi User GuideApache NiFi User Guide
Apache NiFi User Guide
 
Kafka presentation
Kafka presentationKafka presentation
Kafka presentation
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Apache Nifi Crash Course
Apache Nifi Crash CourseApache Nifi Crash Course
Apache Nifi Crash Course
 
kafka
kafkakafka
kafka
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
 
An Overview of Ambari
An Overview of AmbariAn Overview of Ambari
An Overview of Ambari
 
Apache kafka performance(latency)_benchmark_v0.3
Apache kafka performance(latency)_benchmark_v0.3Apache kafka performance(latency)_benchmark_v0.3
Apache kafka performance(latency)_benchmark_v0.3
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache Ranger
 
HDFS: Optimization, Stabilization and Supportability
HDFS: Optimization, Stabilization and SupportabilityHDFS: Optimization, Stabilization and Supportability
HDFS: Optimization, Stabilization and Supportability
 
Collaborative Editing Tools for Alfresco
Collaborative Editing Tools for AlfrescoCollaborative Editing Tools for Alfresco
Collaborative Editing Tools for Alfresco
 
NiFi Developer Guide
NiFi Developer GuideNiFi Developer Guide
NiFi Developer Guide
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka Introduction
 
Integrating Apache Spark and NiFi for Data Lakes
Integrating Apache Spark and NiFi for Data LakesIntegrating Apache Spark and NiFi for Data Lakes
Integrating Apache Spark and NiFi for Data Lakes
 
Nifi workshop
Nifi workshopNifi workshop
Nifi workshop
 
Presto At Treasure Data
Presto At Treasure DataPresto At Treasure Data
Presto At Treasure Data
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFi
 
[OpenStack Days Korea 2016] Track1 - 카카오는 오픈스택 기반으로 어떻게 5000VM을 운영하고 있을까?
[OpenStack Days Korea 2016] Track1 - 카카오는 오픈스택 기반으로 어떻게 5000VM을 운영하고 있을까?[OpenStack Days Korea 2016] Track1 - 카카오는 오픈스택 기반으로 어떻게 5000VM을 운영하고 있을까?
[OpenStack Days Korea 2016] Track1 - 카카오는 오픈스택 기반으로 어떻게 5000VM을 운영하고 있을까?
 

Viewers also liked

Intro to MySQL Master Slave Replication
Intro to MySQL Master Slave ReplicationIntro to MySQL Master Slave Replication
Intro to MySQL Master Slave Replicationsatejsahu
 
Big Data Security with Hadoop
Big Data Security with HadoopBig Data Security with Hadoop
Big Data Security with HadoopCloudera, Inc.
 
Troubleshooting Kerberos in Hadoop: Taming the Beast
Troubleshooting Kerberos in Hadoop: Taming the BeastTroubleshooting Kerberos in Hadoop: Taming the Beast
Troubleshooting Kerberos in Hadoop: Taming the BeastDataWorks Summit
 
Hdp security overview
Hdp security overview Hdp security overview
Hdp security overview Hortonworks
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...DataWorks Summit
 
Information security in big data -privacy and data mining
Information security in big data -privacy and data miningInformation security in big data -privacy and data mining
Information security in big data -privacy and data miningharithavijay94
 
Big Data and Security - Where are we now? (2015)
Big Data and Security - Where are we now? (2015)Big Data and Security - Where are we now? (2015)
Big Data and Security - Where are we now? (2015)Peter Wood
 
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise UsersApache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise UsersDataWorks Summit
 
Hadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureHadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureUwe Printz
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop SecurityDataWorks Summit
 
Built-In Security for the Cloud
Built-In Security for the CloudBuilt-In Security for the Cloud
Built-In Security for the CloudDataWorks Summit
 
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...Kevin Minder
 
Hadoop and Data Access Security
Hadoop and Data Access SecurityHadoop and Data Access Security
Hadoop and Data Access SecurityCloudera, Inc.
 
Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)Emilio Coppa
 
OAuth - Open API Authentication
OAuth - Open API AuthenticationOAuth - Open API Authentication
OAuth - Open API Authenticationleahculver
 
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY pptsravya raju
 
Cours Big Data Chap1
Cours Big Data Chap1Cours Big Data Chap1
Cours Big Data Chap1Amal Abid
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture EMC
 

Viewers also liked (20)

Intro to MySQL Master Slave Replication
Intro to MySQL Master Slave ReplicationIntro to MySQL Master Slave Replication
Intro to MySQL Master Slave Replication
 
Big Data Security with Hadoop
Big Data Security with HadoopBig Data Security with Hadoop
Big Data Security with Hadoop
 
Hadoop
HadoopHadoop
Hadoop
 
Troubleshooting Kerberos in Hadoop: Taming the Beast
Troubleshooting Kerberos in Hadoop: Taming the BeastTroubleshooting Kerberos in Hadoop: Taming the Beast
Troubleshooting Kerberos in Hadoop: Taming the Beast
 
An Approach for Multi-Tenancy Through Apache Knox
An Approach for Multi-Tenancy Through Apache KnoxAn Approach for Multi-Tenancy Through Apache Knox
An Approach for Multi-Tenancy Through Apache Knox
 
Hdp security overview
Hdp security overview Hdp security overview
Hdp security overview
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...
 
Information security in big data -privacy and data mining
Information security in big data -privacy and data miningInformation security in big data -privacy and data mining
Information security in big data -privacy and data mining
 
Big Data and Security - Where are we now? (2015)
Big Data and Security - Where are we now? (2015)Big Data and Security - Where are we now? (2015)
Big Data and Security - Where are we now? (2015)
 
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise UsersApache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
 
Hadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureHadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, Future
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop Security
 
Built-In Security for the Cloud
Built-In Security for the CloudBuilt-In Security for the Cloud
Built-In Security for the Cloud
 
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
 
Hadoop and Data Access Security
Hadoop and Data Access SecurityHadoop and Data Access Security
Hadoop and Data Access Security
 
Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)
 
OAuth - Open API Authentication
OAuth - Open API AuthenticationOAuth - Open API Authentication
OAuth - Open API Authentication
 
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY ppt
 
Cours Big Data Chap1
Cours Big Data Chap1Cours Big Data Chap1
Cours Big Data Chap1
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
 

Similar to Hadoop Security Today & Tomorrow with Apache Knox

August 2014 HUG : Comprehensive Security for Hadoop
August 2014 HUG : Comprehensive Security for HadoopAugust 2014 HUG : Comprehensive Security for Hadoop
August 2014 HUG : Comprehensive Security for HadoopYahoo Developer Network
 
2014 sept 4_hadoop_security
2014 sept 4_hadoop_security2014 sept 4_hadoop_security
2014 sept 4_hadoop_securityAdam Muise
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop SecurityChris Nauroth
 
TriHUG October: Apache Ranger
TriHUG October: Apache RangerTriHUG October: Apache Ranger
TriHUG October: Apache Rangertrihug
 
Curb your insecurity with HDP - Tips for a Secure Cluster
Curb your insecurity with HDP - Tips for a Secure ClusterCurb your insecurity with HDP - Tips for a Secure Cluster
Curb your insecurity with HDP - Tips for a Secure Clusterahortonworks
 
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHortonworks
 
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...Hortonworks
 
Apache Argus - How do I secure my entire Hadoop cluster? Olivier Renault @ Ho...
Apache Argus - How do I secure my entire Hadoop cluster? Olivier Renault @ Ho...Apache Argus - How do I secure my entire Hadoop cluster? Olivier Renault @ Ho...
Apache Argus - How do I secure my entire Hadoop cluster? Olivier Renault @ Ho...huguk
 
AWS Public Sector Symposium 2014 Canberra | Secure Hadoop as a Service
AWS Public Sector Symposium 2014 Canberra | Secure Hadoop as a ServiceAWS Public Sector Symposium 2014 Canberra | Secure Hadoop as a Service
AWS Public Sector Symposium 2014 Canberra | Secure Hadoop as a ServiceAmazon Web Services
 
Securing Data in Hadoop at Uber
Securing Data in Hadoop at UberSecuring Data in Hadoop at Uber
Securing Data in Hadoop at UberDataWorks Summit
 
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFSDiscover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFSHortonworks
 
HBaseCon 2012 | HBase Security for the Enterprise - Andrew Purtell, Trend Micro
HBaseCon 2012 | HBase Security for the Enterprise - Andrew Purtell, Trend MicroHBaseCon 2012 | HBase Security for the Enterprise - Andrew Purtell, Trend Micro
HBaseCon 2012 | HBase Security for the Enterprise - Andrew Purtell, Trend MicroCloudera, Inc.
 
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?DataWorks Summit
 
Introduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystemIntroduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystemShivaji Dutta
 
Bridle your Flying Islands and Castles in the Sky: Built-in Governance and Se...
Bridle your Flying Islands and Castles in the Sky: Built-in Governance and Se...Bridle your Flying Islands and Castles in the Sky: Built-in Governance and Se...
Bridle your Flying Islands and Castles in the Sky: Built-in Governance and Se...DataWorks Summit
 
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...Cloudera, Inc.
 
BigData Security - A Point of View
BigData Security - A Point of ViewBigData Security - A Point of View
BigData Security - A Point of ViewKaran Alang
 
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks
 

Similar to Hadoop Security Today & Tomorrow with Apache Knox (20)

August 2014 HUG : Comprehensive Security for Hadoop
August 2014 HUG : Comprehensive Security for HadoopAugust 2014 HUG : Comprehensive Security for Hadoop
August 2014 HUG : Comprehensive Security for Hadoop
 
2014 sept 4_hadoop_security
2014 sept 4_hadoop_security2014 sept 4_hadoop_security
2014 sept 4_hadoop_security
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop Security
 
TriHUG October: Apache Ranger
TriHUG October: Apache RangerTriHUG October: Apache Ranger
TriHUG October: Apache Ranger
 
Curb your insecurity with HDP - Tips for a Secure Cluster
Curb your insecurity with HDP - Tips for a Secure ClusterCurb your insecurity with HDP - Tips for a Secure Cluster
Curb your insecurity with HDP - Tips for a Secure Cluster
 
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
 
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
 
Apache Argus - How do I secure my entire Hadoop cluster? Olivier Renault @ Ho...
Apache Argus - How do I secure my entire Hadoop cluster? Olivier Renault @ Ho...Apache Argus - How do I secure my entire Hadoop cluster? Olivier Renault @ Ho...
Apache Argus - How do I secure my entire Hadoop cluster? Olivier Renault @ Ho...
 
AWS Public Sector Symposium 2014 Canberra | Secure Hadoop as a Service
AWS Public Sector Symposium 2014 Canberra | Secure Hadoop as a ServiceAWS Public Sector Symposium 2014 Canberra | Secure Hadoop as a Service
AWS Public Sector Symposium 2014 Canberra | Secure Hadoop as a Service
 
Curb Your Insecurity - Tips for a Secure Cluster (with Spark too)!!
Curb Your Insecurity - Tips for a Secure Cluster (with Spark too)!!Curb Your Insecurity - Tips for a Secure Cluster (with Spark too)!!
Curb Your Insecurity - Tips for a Secure Cluster (with Spark too)!!
 
Curb your insecurity with HDP
Curb your insecurity with HDPCurb your insecurity with HDP
Curb your insecurity with HDP
 
Securing Data in Hadoop at Uber
Securing Data in Hadoop at UberSecuring Data in Hadoop at Uber
Securing Data in Hadoop at Uber
 
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFSDiscover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
 
HBaseCon 2012 | HBase Security for the Enterprise - Andrew Purtell, Trend Micro
HBaseCon 2012 | HBase Security for the Enterprise - Andrew Purtell, Trend MicroHBaseCon 2012 | HBase Security for the Enterprise - Andrew Purtell, Trend Micro
HBaseCon 2012 | HBase Security for the Enterprise - Andrew Purtell, Trend Micro
 
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
 
Introduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystemIntroduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystem
 
Bridle your Flying Islands and Castles in the Sky: Built-in Governance and Se...
Bridle your Flying Islands and Castles in the Sky: Built-in Governance and Se...Bridle your Flying Islands and Castles in the Sky: Built-in Governance and Se...
Bridle your Flying Islands and Castles in the Sky: Built-in Governance and Se...
 
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
 
BigData Security - A Point of View
BigData Security - A Point of ViewBigData Security - A Point of View
BigData Security - A Point of View
 
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
 

Recently uploaded

PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentationvaddepallysandeep122
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identityteam-WIBU
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odishasmiwainfosol
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfFerryKemperman
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfYashikaSharma391629
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsChristian Birchler
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Velvetech LLC
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfDrew Moseley
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Rob Geurden
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationBradBedford3
 

Recently uploaded (20)

PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentation
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identity
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdf
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdf
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion Application
 

Hadoop Security Today & Tomorrow with Apache Knox

  • 1. © Hortonworks Inc. 2014 Hadoop Security Today & Tomorrow Amsterdam - April3rd, 2014 Vinay Shukla Twitter: @NeoMythos
  • 2. © Hortonworks Inc. 2014 Agenda • What is Hadoop Security? – 4 Security Pillars & Rings of Defense • What security elements exists today? – Authentication – Authorization – Audit – Data Protection • What is on the security roadmap? – Coming soon – Longer term projects • Securing Hadoop with Apache Knox Gateway – Knox overview – Demo • How to get involved
  • 3. © Hortonworks Inc. 2014 Two Reasons for Security in Hadoop Hadoop Contains Sensitive Data –As Hadoop adoption grows so too has the types of data organizations look to store. Often the data is proprietary or personal and it must be protected. –In this context, Hadoop is governed by the same security requirements as any data center platform. Hadoop is subject to Compliance adherence –Organizations are often subject to comply with regulations such as HIPPA, PCI DSS, FISAM that require protection of personal information. –Adherence to other Corporate security policies. 1 2
  • 4. © Hortonworks Inc. 2014 What is Apache Hadoop Security? Security in Apache Hadoop is defined by four key pillars: authentication, authorization, accou ntability, and data protection.
  • 5. © Hortonworks Inc. 2014 Security: Rings of Defense Perimeter Level Security • Network Security (i.e. Firewalls) • Apache Knox (i.e. Gateways) Data Protection • Core Hadoop • Partners Authentication • Kerberos OS Security Authorization • MR ACLs • HDFS Permissions • HDFS ACLs • HiveATZ-NG • HBase ACLs • Accumulo Label Security Page 5
  • 6. © Hortonworks Inc. 2014 Authentication in Hadoop Today… Authentication Who am I/prove it? Control access to cluster. Authorization Restrict access to explicit data Audit Understand who did what Data Protection Encrypt data at rest & motion Kerberos in native Apache Hadoop Perimeter Security with Apache Knox Gateway
  • 7. © Hortonworks Inc. 2014 Kerberos Authentication in Hadoop For more than 20 years, Kerberos has been the de-facto standard for strong authentication. …no other option exists. The design and implementation of Kerberos security in native Apache Hadoop was delivered by Hortonworker Owen O’Malley in 2010. What does Kerberos Do? – Establishes identity for clients, hosts and services – Prevents impersonation/passwords are never sent over the wire – Integrates w/ enterprise identity management tools such as LDAP & Active Directory – More granular auditing of data access/job execution
  • 8. © Hortonworks Inc. 2014 • Single Hadoop access point • REST API hierarchy • Consolidated API calls • Multi-cluster support • Eliminates SSH “edge node” • Central API management • Central audit control • Simple Service level Authorization • SSO Integration – Siteminder, API Key*, OAuth* & SAML* • LDAP & AD integration Perimeter Security with Apache Knox Integrated with existing systems to simplify identity maintenance Incubated and led by Hortonworks, Apache Knox provides a simple and open framework for Hadoop perimeter security. Single, simple point of access for a cluster Central controls ensure consistency across one or more clusters
  • 9. © Hortonworks Inc. 2014 Authentication & Audit in Hadoop today… Authorization Restrict access to explicit data Audit Understand who did what Data Protection Encrypt data at rest & motion Kerberos in native Apache Hadoop Perimeter Security with Apache Knox Gateway Native in Apache Hadoop • MapReduce Access Control Lists • HDFS Permissions • Process Execution audit trail Cell level access control in Apache Accumulo Authentication Who am I/prove it? Control access to cluster.
  • 10. © Hortonworks Inc. 2014 Authorization: Who can do what in Hadoop? • Access Control Services exist for each of the Hadoop components –HDFS has file Permissions –YARN, MapReduce, HBase has Access Control Lists (ACL) –Accumulo Proves more granular label/cell level security • Improvements to these services are being led by Hortonworks Team: –HDFS Improvements – Extended ACL, more flexible via multiple policies on the same file or directory –Hive Improvements – Hortonworks initiative called Hive ATZ- NG, better integration allows familiar SQL/database syntax (GRANT/REVOKE) and allows more clients (including partner integrations) to be secure.
  • 11. © Hortonworks Inc. 2014 Data Protection in Hadoop today… Authorization Restrict access to explicit data Audit Understand who did what Data Protection Encrypt data at rest & motion Kerberos in native Apache Hadoop Perimeter Security with Apache Knox Gateway Native in Apache Hadoop • MapReduce Access Control Lists • HDFS Permissions • Process Execution audit trail Cell level access control in Apache Accumulo Wire encryption in native Apache Hadoop Orchestrated encryption with 3rd party tools Authentication Who am I/prove it? Control access to cluster.
  • 12. © Hortonworks Inc. 2014 Data Protection in Hadoop must be applied at three different layers in Apache Hadoop Storage: encrypt data while it is at rest Direct data flows “into” and “out of” 3rd party encryption tools and/or rely upon hardware specific techniques (i.e. drive-level encryption). Transmission: encrypt data as it is in motion Native Apache Hadoop 2.0 provides wire encryption. Upon Access: apply restrictions when accessed Direct data flows “into” and “out of” 3rd party encryption tools. Data Protection
  • 13. © Hortonworks Inc. 2014 Data Protection – Details - Today • Encryption of Data at Rest –Option 1: OS or Hardware Level Encryption (Out of the Box) –Option 2: Custom Development –Option 3: Certified Partners –Work underway for encryption in Hive, HDFS and HBase as core platform capabilities. • Encryption of Data on the Wire –All wire protocols can be encrypted by HDP platform (2.x). Wire-level encryption enhancements led by HWX Team. • Column Level Encryption –No current out of the box support in Hadoop. –Certified Partners provide these capabilities.
  • 14. © Hortonworks Inc. 2014 What can be done today? Authorization Restrict access to explicit data Audit Understand who did what Data Protection Encrypt data at rest & motion Kerberos in native Apache Hadoop Perimeter Security with Apache Knox Gateway Native in Apache Hadoop • MapReduce Access Control Lists • HDFS Permissions • Process Execution audit trail Cell level access control in Apache Accumulo Service level Authorization with Knox Access Audit with Knox Wire encryption in native Apache Hadoop Wire Encryption with Knox Orchestrated encryption with 3rd party tools Authentication Who am I/prove it? Control access to cluster.
  • 15. © Hortonworks Inc. 2014 Hadoop Security Hortonworks is Delivering Secure Hadoop for the Enterprise Security for Hadoop must be addressed within every layer of the stack and integrated into existing frameworks For a full description of what is available in Enterprise Hadoop today across Authentication, Authorization, Accountability and Data Protection please visit our security labs page Governance &Integration Security Operations Data Access Data Management HDP 2.1 New: Apache Knox Perimeter security for Hadoop  A common place to preform authentication across Hadoop and all related projects  Integrated to LDAP and AD  Currently supports: WebHDFS, WebHCAT, Oozie, Hive & HBase  Broad community effort, incubated with Microsoft, broad set of developers involved Security Investments Security Phase 3: • Audit event correlation and Audit viewer • Data Encryption in HDFS, Hive & HBase • Knox for HDFS HA, Ambari & Falcon • Support Token-Based AuthN beyond Kerb Security Phase 2: • ACLs for HDFS • Knox: Hadoop REST API Security • SQL-style Hive AuthZ (GRANT, REVOKE) • SSL support for Hive Server 2 • SSL for DN/NN UI & WebHDFS • PAM support for Hive Phase 1 • Strong AuthN with Kerberos • HBase, Hive, HDFS basic AuthZ • Encryption with SSL for NN, JT, etc. • Wire encryption with Shuffle, HDFS, JDBC
  • 16. © Hortonworks Inc. 2014 Hadoop Security: Phase 2 Page 16 HDP 2.1 Features Release Theme REST API Security, Improve AuthZ, Wire Encryption Specific Features • Hadoop REST API Security with Apache Knox • Eliminates SSH edge node • Single Hadoop access point • LDAP, AD based Authentication • Service-level Authorization • Audit support for REST access • SQL style Hive Authorization with fine grain access • HDFS Access Control Lists • SSL support in HiveServer2 • SSL support in NN/DN UI & WebHDFS • Pluggable Authentication Module (PAM) in Hive Included Components Apache Knox, Hive, HDFS
  • 17. © Hortonworks Inc. 2014 Why Knox? From fb.com/hadoopmemes Apache Knox Gateway • REST/HTTP API security for Hadoop • Eliminates SSH edge node • Single REST API access point • Centralized Authentication, Authorization, and Audit for Hadoop REST/HTTP services • LDAP/AD Authentication, Service Authorization, Audit etc. Knox Eliminates • Client’s requirements for intimate knowledge of cluster topology
  • 18. © Hortonworks Inc. 2014 Knox Deployment with Hadoop Cluster Application Tier DMZ Switch Switch …. Master Nodes Rack 1 Switch NN SNN …. Slave Nodes Rack 2 …. Slave Nodes Rack N SwitchSwitch DN DN Web Tier LB Knox Hadoop CLIs
  • 19. © Hortonworks Inc. 2014 Hadoop REST API Security: Drill-Down Page 19 REST Client Enterprise Identity Provider LDAP/AD Knox Gateway GW GW Firewall Firewall DMZ L B Edge Node/H adoop CLIs RPC HTTP HTTP HTTP LDAP Hadoop Cluster 1 Masters Slaves RM NN Web HCat Oozie DN NM HS2 Hadoop Cluster 2 Masters Slaves RM NN Web HCat Oozie DN NM HS2 HBase HBase
  • 20. © Hortonworks Inc. 2014 Selects appropriate service filter chain based on request URL mapping rules REST Client Protocol Listener Listens for requests on the appropriate protocols (e.g. HTTP/HTTPS) Service Selector Service Specific Filter Chain Identity Asserter Filter Dispatch Rewrite Filter AuthN Filter Hadoop Service Enforces propagation of authenticated identity to Hadoop by modifying request Streams request and response to and from Hadoop service based on rewritten URLs Translates URLs in request and response between external and internal URLs based on service specific rules Enterprise Identity Provider Enterprise/Cl oud SSO Provider Challenges client for credentials and authenticates or validates SSO Token Service filter chains are composed and configured at deployment time by service specific plugins What is Knox? Client > Knox > Hadoop Cluster Page 20 Knox Gateway
  • 21. © Hortonworks Inc. 2014© Hortonworks Inc. 2014 Knox Gateway in action Submit MR job via Knox Page 21
  • 22. © Hortonworks Inc. 2014 HDFS & MR Operations with Knox • Create a few directories curl -iku guest:guest-password -X PUT 'https://localhost:8443/gateway/sandbox/webhdfs/v1/user/guest/test?op=MKDIRS&permission=777' curl -iku guest:guest-password -X PUT "https://localhost:8443/gateway/sandbox/webhdfs/v1/user/guest/test/input?op=MKDIRS&permission=777" curl -iku guest:guest-password -X PUT "https://localhost:8443/gateway/sandbox/webhdfs/v1/user/guest/test/lib?op=MKDIRS&permission=777" • Upload files curl -iku guest:guest-password -L -T samples/hadoop-examples.jar -X PUT https://localhost:8443/gateway/sandbox/webhdfs/v1/user/guest/test/lib/hadoop- examples.jar?op=CREATE curl -iku guest:guest-password -X PUT -L -T README -X PUT "https://localhost:8443/gateway/sandbox/webhdfs/v1/user/guest/test/input/README?op=CREATE" • Run MR job curl -iku guest:guest-password -X POST -d arg=/user/guest/test/input -d arg=/user/guest/test/output -d jar=/user/guest/test/lib/hadoop-examples.jar -d class=org.apache.hadoop.examples.WordCount https://localhost:8443/gateway/sandbox/templeton/v1/mapreduce/jar • Query the jobs for a user curl -iku guest:guest-password https://localhost:8443/gateway/sandbox/templeton/v1/queue • Query the status of a given job curl -iku guest:guest-password https://localhost:8443/gateway/sandbox/templeton/v1/queue/<job_id> • Read the output file curl -iku guest:guest-password -L -X GET https://localhost:8443/gateway/sandbox/webhdfs/v1/user/guest/test/output/part-r-00000?op=OPEN • Remove a directory curl -iku guest:guest-password -X DELETE "https://localhost:8443/gateway/sandbox/webhdfs/v1/user/guest/test?op=DELETE&recursive=true" Page 22
  • 23. © Hortonworks Inc. 2014 How to get Involved Page 23 Resource Location Security Labs http://hortonworks.com/labs/security/ Security Blogs http://hortonworks.com/blog/category/innovation/security/ Apache Knox Tutorial http://hortonworks.com/hadoop-tutorial/securing-hadoop- infrastructure-apache-knox/ Need help? http://hortonworks.com/community/forums/forum/security/ or vshukla@hortonworks.com
  • 24. © Hortonworks Inc. 2014 Page 24 Thank you! Amsterdam - April3rd, 2014 Vinay Shukla Twitter: @NeoMythos

Editor's Notes

  1. BackgroundHortonworks led initiativeUseful for connecting to Hadoop from the outside the clusterWhen more client language flexibility is requiredi.e. Java binding not an optionNot intended for RPC callsCall it REST API Gateway for HadoopDon’t call it a firewallFirewalls are at the network layerDon’t call is perimeter securityPerimeter security is getting discredited as an incomplete security solution
  2. Node the arrows to Hadoop Cluster are simplificationsActually there will be multiple arrow – one per port open between Knox and Hadoop Services it supports (WebHDFS, WebHCAT, HiveServer2, HBase, Oozie) &amp; more in future
  3. Functions as HTTP reverse proxyRe-writes URLs to protect internal network topologyKnox Gateway embeds Jetty containerReads/Writes HTTP