Big Data Consulting
doing hadoop, securely
Rob Gibbon
■ Architect @Big Industries Belgium
■ Focus on designing, deploying & integrating web
scale solutions with Hadoop
■ Deliveries for clients in telco, financial services &
media
Hadoop was built to survive data tsunamis
■ a response to challenges that enterprise vendors
were unable to address
■ focused on data volumes and cost reduction
■ initially, the solution had some serious holes
Confidentiality, Integrity, Availability
■ early prereleases couldn’t really meet any of these
three fundamental infosec objectives
■ basic controls weren’t there
the early days
■ Multiple SPoF
■ No authentication
■ Easily spoofed authorisation
■ No encryption of data at rest nor in transit
■ No accounting
enter the hadoop vendors
■ Vendors like Cloudera focus on making Apache
Hadoop “enterprise ready”
■ Includes building robust infosec controls into
Hadoop core
■ Multilayer security is now available for Hadoop
running a cluster in non-secure mode
■ malicious|mistaken user:
■ recursively delete all the data please
■ by the way, I’m the system superuser
■ hadoop:
■ oh ok then
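
The exchange above can be made concrete with a toy model. In non-secure ("simple") mode, a Hadoop client asserts its own identity, for example through the HADOOP_USER_NAME environment variable, and the cluster does not verify the claim. The sketch below is not Hadoop code: resolve_user, can_delete_recursively, the superuser name "hdfs" and the users "mallory" and "alice" are all illustrative assumptions.

```python
SUPERUSER = "hdfs"  # assumed superuser name for this sketch


def resolve_user(env, os_user):
    """Simple mode: trust whatever identity the client asserts."""
    return env.get("HADOOP_USER_NAME", os_user)


def can_delete_recursively(user, path_owner):
    """The superuser bypasses permission checks; others must own the path."""
    return user == SUPERUSER or user == path_owner


# An ordinary user, mallory, simply claims to be the superuser.
claimed = resolve_user({"HADOOP_USER_NAME": "hdfs"}, os_user="mallory")
# Without the override, mallory is just mallory.
honest = resolve_user({}, os_user="mallory")
```

In this model the claimed identity passes the delete check on data owned by alice, while the honest identity does not. Closing that gap is the job of authentication.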
bad things happen with slack controls in place
average cost of a data breach = $3.8m
running a secure cluster
■ Kerberos is one of the primary security controls you
can use
■ Btw, what’s wrong with this kerberos principal?
■ hdfs@BIGINDUSTRIES.BE
kerberos continued
■ Kerberos uses a three-part principal
■ hdfs/node1.cluster1.bigindustries.be@BIGINDUSTRIES.BE
■ hdfs/node1.cluster2.bigindustries.be@BIGINDUSTRIES.BE
■ Best to use explicit mappings from kerberos principals to local
users
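
A minimal sketch of the idea, assuming one reading of the question on the previous slide: a principal with no host instance does not identify a particular node or cluster, so it would be equally valid everywhere in the realm. The code splits a principal into primary, instance and realm, then resolves local users only through an explicit table. parse_principal, to_local_user and the table entries are hypothetical; on a real cluster this mapping is configured in Hadoop itself rather than in Python.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    primary: str
    instance: Optional[str]
    realm: str


def parse_principal(text):
    """Split 'primary[/instance]@REALM' into its parts."""
    name, _, realm = text.rpartition("@")
    if not name or not realm:
        raise ValueError(f"not a full principal: {text!r}")
    primary, _, instance = name.partition("/")
    return Principal(primary, instance or None, realm)


# Explicit principal -> local user table (hypothetical entries).
LOCAL_USERS = {
    "hdfs/node1.cluster1.bigindustries.be@BIGINDUSTRIES.BE": "hdfs",
    "hdfs/node1.cluster2.bigindustries.be@BIGINDUSTRIES.BE": "hdfs",
}


def to_local_user(text):
    """Map only principals that are listed; reject everything else."""
    if text not in LOCAL_USERS:
        raise PermissionError(f"no explicit mapping for {text}")
    return LOCAL_USERS[text]
```

With this table, the host-qualified principals resolve to the local hdfs user, while the bare hdfs@BIGINDUSTRIES.BE is rejected instead of being mapped by a catch-all rule.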
hive / impala
■ HiveServer doesn’t support Kerberos => use HiveServer2
■ Best to use Sentry to enforce role based access controls from
SQL
■ Users can upload and execute arbitrary [possibly hostile] UDFs
=> enable Sentry
■ Older versions of Metastore don’t enforce permissions on
grant_* and revoke_* APIs => stay up to date
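
Role based access control can be sketched as a lookup chain: users belong to groups, groups hold roles, and roles carry privileges on objects. The toy model below illustrates that chain only. It is not the Sentry API, and every user, group, role and database name in it is made up for the example.

```python
# Hypothetical roles, groups and privileges for illustration only.
ROLE_PRIVILEGES = {
    "analyst": {("sales_db", "SELECT")},
    "etl": {("sales_db", "SELECT"), ("sales_db", "INSERT")},
}
GROUP_ROLES = {"bi_team": {"analyst"}, "ingest_team": {"etl"}}
USER_GROUPS = {"alice": {"bi_team"}, "bob": {"ingest_team"}}


def is_allowed(user, database, action):
    """Allow the action only if one of the user's roles grants it."""
    for group in USER_GROUPS.get(user, ()):
        for role in GROUP_ROLES.get(group, ()):
            if (database, action) in ROLE_PRIVILEGES.get(role, ()):
                return True
    return False
```

The default is deny: a user with no group, or a role with no matching privilege, gets nothing. That is the property you want before letting users run SQL or load UDFs.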
availability
■ Most core components now support HA
■ HDFS
■ YARN
■ Hive
■ HBase
disaster recovery
■ HDFS and HBase offer point-in-time snapshots
■ => consistency!
■ Vendor-tethered solutions for site-to-site replication
are available
encryption at rest
■ HDFS encryption zones
■ transparent to existing applications
■ minimal performance overhead on Intel
architecture
■ key management is externalised
wire encryption
■ SSL encryption is now available for most Hadoop
services
■ Note that AES-256 for SSL and for Kerberos preauth
requires extra JCE policy files on the cluster
accounting
■ Vendor-tethered solutions are available for auditing
■ Navigator for Cloudera clusters
■ Ranger for Hortonworks clusters
tokenization
■ The process of substituting a sensitive data
element with a non-sensitive equivalent
■ Third-party vendor solutions are available that
integrate well with Hadoop
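
The substitution described above can be sketched with a small in-memory vault. This is a hypothetical illustration, not any vendor's product: TokenVault, the "tok_" prefix and the token length are assumptions, and a production vault would be a hardened, access-controlled service rather than a Python dictionary.

```python
import secrets


class TokenVault:
    """In-memory token vault; a real one would be a hardened service."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value):
        """Return a stable random token that reveals nothing about value."""
        if value not in self._token_by_value:
            token = "tok_" + secrets.token_hex(8)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenize(self, token):
        """Recover the original value; only the vault can do this."""
        return self._value_by_token[token]


vault = TokenVault()
card = "4111111111111111"  # well-known test card number, not real data
token = vault.tokenize(card)
```

Only the token would be stored in Hadoop. Because the same input always yields the same token, joins and counts on the tokenized column still work, while the sensitive value stays outside the cluster.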
some places where there’s still some work to do
■ Setting up hadoop security controls is complex and time
consuming
■ Not much support for SELinux around here
■ No general, coherent, policy-based framework for controlling
resource access demands
■ Apache Knox is a starting point
■ => network and host resource access?
Integration
■ Integrating hadoop into an organisation’s services environment
needs careful planning
■ Hadoop can conflict with established governance policies
■ system accounts & privileges
■ remote access
■ firewall flows
■ domains and trust
■ etc.
layered security in hadoop-core
■ Authentication: Kerberos
■ Authorisation: Local unix group or LDAP mappings
■ Authorisation: Sentry RBAC for hive/impala
■ Encryption: HDFS encryption
■ Encryption: SSL encryption for most services
■ Availability: Active/Passive failover HDFS, YARN, HBase
■ Integrity: HDFS block replication & CRC checksum
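
The integrity control in the last bullet can be illustrated with per-chunk checksums: a checksum is stored for each small chunk of a block, so a damaged chunk is detected on read and the data can be fetched from another replica. The sketch below is an assumption-laden stand-in, not HDFS code. It uses zlib.crc32 in place of whichever CRC variant the cluster uses, and the 512-byte chunk size and the helper names are chosen for the example.

```python
import zlib

CHUNK_SIZE = 512  # bytes per checksum in this sketch (an assumption)


def chunk_checksums(data):
    """Compute one CRC32 per fixed-size chunk of data."""
    return [
        zlib.crc32(data[i:i + CHUNK_SIZE])
        for i in range(0, len(data), CHUNK_SIZE)
    ]


def corrupt_chunks(data, expected):
    """Return indexes of chunks whose checksum no longer matches."""
    actual = chunk_checksums(data)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


block = bytes(range(256)) * 8   # 2048 bytes, so 4 chunks
stored = chunk_checksums(block)
damaged = bytearray(block)
damaged[700] ^= 0x01            # flip one bit inside chunk 1
```

Comparing the damaged copy against the stored checksums flags only the chunk that contains the flipped bit, which is what lets a reader discard one bad replica rather than the whole file.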
but what about poodle/heartbleed/shellshock/whatever...
■ underlines the need for a mature information
security governance strategy & architecture
defence-in-depth
■ A layered security architecture for Hadoop clusters
is doable
■ e.g. MasterCard’s Cloudera Hadoop cluster achieved
PCI compliance in 2014 http://goo.gl/FP5DUt
thanks for listening
be.linkedin.com/in/robertgibbon
www.bigindustries.be

  • 3. Hadoop was built to survive data tsunamis ■ a response to challenges that enterprise vendors were unable to address ■ focused on data volumes and cost reduction ■ initially, the solution had some serious holes
  • 4. Confidentiality, Integrity, Availability ■ early prereleases couldn’t really meet any of these three fundamental infosec objectives ■ basic controls weren’t there
  • 5. the early days ■ Multiple SPoF ■ No authentication ■ Easily spoofed authorisation ■ No encryption of data at rest nor in transit ■ No accounting
  • 6. enter the hadoop vendors ■ Vendors like Cloudera focus on making Apache Hadoop “enterprise ready” ■ Includes building robust infosec controls into Hadoop core ■ Multilayer security is now available for Hadoop
  • 7. running a cluster in non-secure mode ■ malicious|mistaken user: ■ recursively delete all the data please ■ by the way, I’m the system superuser ■ hadoop: ■ oh ok then
  • 8. bad things happen with slack controls in place
  • 9. average cost of a data breach = $3.8m
  • 10. running a secure cluster ■ Kerberos is one of the primary security controls you can use ■ Btw, what’s wrong with this kerberos principal? ■ hdfs@BIGINDUSTRIES.BE
  • 11. kerberos continued ■ Kerberos uses a three-part principal ■ hdfs/node1.cluster1.bigindustries.be@BIGINDUSTRIES.BE ■ hdfs/node1.cluster2.bigindustries.be@BIGINDUSTRIES.BE ■ Best to use explicit mappings from kerberos principals to local users
  • 12. hive / impala ■ HiveServer doesn’t support Kerberos => use HiveServer2 ■ Best to use Sentry to enforce role based access controls from SQL ■ Users can upload and execute arbitrary [possibly hostile] UDFs => enable Sentry ■ Older versions of Metastore don’t enforce permissions on grant_* and revoke_* APIs => stay up to date
  • 13. availability ■ Most core components now support HA ■ HDFS ■ YARN ■ Hive ■ HBase
  • 14. disaster recovery ■ HDFS and HBase offer point-in-time snapshots ■ => consistency! ■ Vendor-tethered solutions for site-to-site replication are available
  • 15. encryption at rest ■ HDFS encryption zones ■ transparent to existing applications ■ minimal performance overhead on Intel architecture ■ key management is externalised
  • 16. wire encryption ■ SSL encryption is now available for most Hadoop services ■ Note that AES-256 for SSL and for Kerberos preauth requires extra JCE policy files on the cluster
  • 17. accounting ■ Vendor-tethered solutions are available for auditing ■ Navigator for Cloudera clusters ■ Ranger for Hortonworks clusters
  • 18. tokenization ■ The process of substituting a sensitive data element with a non-sensitive equivalent ■ 3rd Party vendor solutions are available that integrate well with Hadoop
  • 19. some places where there’s still some work to do ■ Setting up hadoop security controls is complex and time consuming ■ Not much support for SELinux around here ■ No general, coherent, policy-based framework for controlling resource access demands ■ Apache Knox is a starting point ■ => network and host resource access?
  • 20. Integration ■ Integrating hadoop into an organisation’s services environment needs careful planning ■ Hadoop can conflict with established governance policies ■ system accounts & privileges ■ remote access ■ firewall flows ■ domains and trust ■ etc.
  • 21. layered security in hadoop-core ■ Authentication: Kerberos ■ Authorisation: Local unix group or LDAP mappings ■ Authorisation: Sentry RBAC for hive/impala ■ Encryption: HDFS encryption ■ Encryption: SSL encryption for most services ■ Availability: Active/Passive failover HDFS, YARN, HBase ■ Integrity: HDFS block replication & CRC checksum
  • 22. but what about poodle/heartbleed/shellshock/whatever... ■ underlines the need for a mature information security governance strategy & architecture
  • 23. defence-in-depth ■ A layered security architecture for Hadoop clusters is doable ■ e.g. MasterCard’s Cloudera Hadoop cluster achieved PCI compliance in 2014 http://goo.gl/FP5DUt
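
Slides 10 and 11 contrast a bare principal (hdfs@BIGINDUSTRIES.BE) with host-qualified ones and recommend explicit mappings from principals to local users. The sketch below illustrates that idea in plain Python. It is not Hadoop's own mapping rule syntax: the `Principal` type, the `EXPLICIT_MAPPINGS` table and the `local_user` helper are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    primary: str             # e.g. "hdfs"
    instance: Optional[str]  # e.g. the host name; None when absent
    realm: str               # e.g. "BIGINDUSTRIES.BE"


def parse_principal(text: str) -> Principal:
    """Split 'primary/instance@REALM' into its parts; the instance may be absent."""
    name, sep, realm = text.rpartition("@")
    if not sep or not name or not realm:
        raise ValueError(f"principal has no realm: {text!r}")
    primary, _, instance = name.partition("/")
    return Principal(primary, instance or None, realm)


# Hypothetical explicit mapping table: (primary, realm) -> local user.
EXPLICIT_MAPPINGS = {
    ("hdfs", "BIGINDUSTRIES.BE"): "hdfs",
}


def local_user(principal: Principal) -> str:
    """Return the mapped local user, refusing anything not listed explicitly."""
    try:
        return EXPLICIT_MAPPINGS[(principal.primary, principal.realm)]
    except KeyError:
        raise PermissionError(f"no explicit mapping for {principal}") from None


p1 = parse_principal("hdfs/node1.cluster1.bigindustries.be@BIGINDUSTRIES.BE")
p2 = parse_principal("hdfs/node1.cluster2.bigindustries.be@BIGINDUSTRIES.BE")
bare = parse_principal("hdfs@BIGINDUSTRIES.BE")
```

Here `p1` and `p2` differ only in their instance part, which is what lets the two clusters be told apart, while `bare` carries no instance at all. Refusing unmapped principals instead of falling back to a default is a design choice of this sketch, not something the slides prescribe.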
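
Slide 18 defines tokenization as substituting a sensitive data element with a non-sensitive equivalent. A minimal in-memory sketch of that substitution might look as follows. The `TokenVault` class and the `tok_` prefix are assumptions made for illustration and do not represent any particular vendor product; a real deployment would keep the mapping in a hardened, access-controlled store.

```python
import secrets


class TokenVault:
    """Toy token vault: swaps sensitive values for random tokens and back."""

    def __init__(self) -> None:
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value: str) -> str:
        """Return the token for a value, creating a random one on first use."""
        token = self._token_by_value.get(value)
        if token is None:
            # The token is random, so it reveals nothing about the value.
            token = "tok_" + secrets.token_hex(16)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return token

    def detokenize(self, token: str) -> str:
        """Look the original value up again; only the vault can do this."""
        return self._value_by_token[token]


vault = TokenVault()
card = "4111-1111-1111-1111"  # made-up sample value
token = vault.tokenize(card)
```

Only the token would be written to the cluster; the original value stays in the vault, so a dataset leaked from Hadoop would contain tokens rather than card numbers.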
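
Slide 21 lists block replication and CRC checksums as the integrity control. The sketch below shows the general idea: a checksum is stored alongside each copy of a block, and a reader skips any copy whose checksum no longer matches. `zlib.crc32` is a stand-in here, and `store_block` and `read_verified` are hypothetical helpers; the checksum variant and chunking that HDFS actually uses are not reproduced.

```python
import zlib


def store_block(data: bytes):
    """Return the block together with the checksum computed at write time."""
    return data, zlib.crc32(data)


def read_verified(replicas):
    """Return the first replica whose payload still matches its stored checksum."""
    for data, checksum in replicas:
        if zlib.crc32(data) == checksum:
            return data
    raise IOError("all replicas failed checksum verification")


good = store_block(b"hello")
corrupted = (b"hellp", good[1])  # payload changed after the checksum was stored
recovered = read_verified([corrupted, good])
```

The corrupted copy is detected and skipped, and the read is served from the intact replica, which is how replication and checksums complement each other.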