Page 1 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Operations With Apache Ambari
We Do Hadoop.
Apache Ambari
Apache Ambari is the open source
operational platform to provision, manage
and monitor Hadoop clusters
How Do People Use Ambari?
•  Cluster Provisioning: Install Wizard (UI), Blueprints (API)
•  Config Management: Host Groups, Versioning, Compare, Revert, Recommendations, Security Setup
•  Service Management: Lifecycle controls, Rolling Restarts, Decommission/Re-commission
•  Monitoring: Health Checks, Alerts
•  Extensibility: Stacks, Views
Recent Ambari Releases
•  Ambari 1.6.0 (May 2014): Introduced Ambari Blueprints
•  Ambari 1.7.0 (Dec 2014): Introduced Ambari Views; HDP 2.2 GA
•  Ambari 2.0.0 (Apr 2015)
What’s New in Ambari 2.0
Core Platform
Simplified Kerberos Setup (AMBARI-7204)
Ambari Alerts (AMBARI-6354)
Ambari Metrics (AMBARI-5707)
Automated (Rolling) Upgrade (AMBARI-7804)
Stack Support
HDP 2.2: Ranger, Spark, Phoenix
Hive Metastore HA (AMBARI-6684)
HiveServer2 HA (AMBARI-8906)
Oozie HA (AMBARI-6683)
Ambari Platform
Handle umask 027 setting (AMBARI-7796)
Ambari Agent non-root (AMBARI-1596)
Blueprints API
Add Host (AMBARI-8458)
For a complete list of changes
https://issues.apache.org/jira/browse/AMBARI/fixforversion/12327486
Lab Setup
Ambari 2.0 Security Lab Steps – 4 node cluster
•  Detailed steps available at: http://bit.ly/1J4IbIs
•  Install Ambari server and agents
•  Deploy custom Ambari service folders for OpenLDAP, KDC/kadmin, NSLCD
•  Use blueprint/API to provision a minimal Hadoop cluster with custom services
•  Use Add service wizard to also install Hive
•  Configure Ambari to sync/recognize business users in OpenLDAP
•  Authentication: Run Ambari security wizard to enable Kerberos using KDC/kadmin service
•  Install Ranger as Ambari service and configure it to recognize LDAP users
•  Authorization/Audit: Set up HDFS/Hive plugins to allow users to set policies to control access and audit consumption
Core Platform
Stack Components Support
Stacks: HDP 2.2, HDP 2.1, HDP 2.0
Components Ambari can install/manage/monitor:
•  HDFS, YARN, MapReduce, Hive, HBase, Pig, ZooKeeper, Oozie, Sqoop
•  Tez, Storm, Falcon, Flume
•  Knox, Slider, Kafka
•  Ranger, Spark, Phoenix (NEW in Ambari 2.0)
Admin > Stack and Versions
List of Stack Services
Installed or Add Service
Ambari 2.0.0 High Availability Support
High Availability Mode Ambari 1.6.1 Ambari 1.7.0 Ambari 2.0.0
HDFS: NameNode HDP 2.0+ Active/Standby
YARN: ResourceManager HDP 2.1+ Active/Standby
HBase: HBaseMaster HDP 2.1+ Multi-master
Hive: HiveServer2 HDP 2.1+ Multi-instance
Hive: Hive Metastore HDP 2.1+ Multi-instance
Oozie: Oozie Server* HDP 2.1+ Multi-instance
* Oozie Server needs external load balancer to complete HA solution
Hive HA
Services > Hive > Service Actions
+ Add Hive Metastore
+ Add HiveServer2
Ambari Agent Non-Root
Non-Root Ambari Agent
Agent Runs Commands From the Ambari Server
•  Configuration Change
•  Service Start
•  Service Stop
Some commands require root-level access
•  /bin/su hdfs -l -s /bin/bash -c /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode
Sudo Leveraged
•  Configuration for:
–  Customizable Users (su hdfs, yarn, etc.)
–  Non-Customizable Users (su mysql)
–  Commands (yum, mkdir, touch, test, etc.)
[Diagram: Ambari Agents (python)]
Configuring Agent for Non-Root
1.  Create and configure a sudoer account
2.  Manually bootstrap Ambari Agents
3.  Set run_as_user in ambari-agent.ini for the sudoer account
Details
http://docs.hortonworks.com/HDPDocuments/Ambari-2.0.0.0/bk_ambari_reference_guide/content/ch_amb_ref_configuring_ambari_for_non-root.html
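Step 3 above can be sketched in Python. This is a minimal illustration, not the documented procedure: it assumes run_as_user belongs in the [agent] section of ambari-agent.ini, and the sample file content and the "ambari" account name are hypothetical.

```python
import configparser
import io

# Hypothetical starting content; a real ambari-agent.ini has more sections and keys.
SAMPLE_INI = """\
[server]
hostname=ambari.example.com

[agent]
prefix=/var/lib/ambari-agent/data
"""


def set_run_as_user(ini_text: str, user: str) -> str:
    """Return ini_text with run_as_user set in the [agent] section (assumed location)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case exactly as written in the file
    parser.read_string(ini_text)
    if not parser.has_section("agent"):
        parser.add_section("agent")
    parser.set("agent", "run_as_user", user)
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# "ambari" stands in for whichever sudoer account was created in step 1.
updated = set_run_as_user(SAMPLE_INI, "ambari")
print(updated)
```

The function works on text rather than on the file itself, so the change can be reviewed before it is written back to the agent host.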
Updated umask Handling
What about umask?
Unix Permissions Basics: (user, group, other)
4 – read
2 – write
1 – execute
rwxr-xr-x == 755
Previous Behavior:
•  If (umask > 022); Warning during agent pre-req check
•  Installations would fail if ignored
New Behavior:
•  If (umask > 027); Warning during agent pre-req check
•  Installation will fail if ignored
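The permission arithmetic behind the two umask values can be worked through in a few lines of Python. This is only an illustration of standard Unix umask masking (requested mode AND NOT umask); it does not reproduce Ambari's own pre-req check.

```python
import stat


def effective_mode(requested: int, umask: int) -> int:
    """Permission bits left after the umask clears its bits from the requested mode."""
    return requested & ~umask


for umask in (0o022, 0o027):
    dir_mode = effective_mode(0o777, umask)   # directories are requested as 777
    file_mode = effective_mode(0o666, umask)  # regular files are requested as 666
    print(
        f"umask {umask:03o}: "
        f"dir {dir_mode:03o} {stat.filemode(stat.S_IFDIR | dir_mode)}, "
        f"file {file_mode:03o} {stat.filemode(stat.S_IFREG | file_mode)}"
    )
```

With umask 022 a new directory ends up as 755 (rwxr-xr-x); with umask 027 it ends up as 750, so "other" users get no access at all.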
Ambari Alerts
Ambari Alerts
Summary
•  Migrated away from Nagios as the Ambari alerting system
•  No longer offer option to install or manage a Nagios service
•  Replaced with built-in alerting system
Motivation
•  Avoids Nagios package conflicts in customer environments
•  More flexibility with alerts in Ambari Stacks
•  Platform independence
Ambari Alerts
•  Ambari Alerts are installed and configured by default
•  Ambari Web provides centralized management of Health Alerts
Modifying Alerts
•  Control thresholds, check intervals and response text
Alert Groups
•  Create and manage groups of alerts
•  Grouping alerts further controls which alerts are dispatched to which notifications
•  Assign group to notifications
•  Only dispatch to interested parties
Alert Notifications
•  What: Create and manage multiple notification targets
•  Control who gets notified when
•  Why: Filter by severity
•  Send only certain notifications to certain targets based on severity
•  How: Control dispatch method
•  Support for EMAIL + SNMP
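How groups and severity filters combine to pick notification targets can be sketched as follows. The NotificationTarget class, the target names and the group names are hypothetical and only model the idea; they are not Ambari's internal data model. WARNING is assumed to be a valid severity alongside CRITICAL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NotificationTarget:
    name: str
    method: str            # dispatch method: "EMAIL" or "SNMP"
    groups: frozenset      # alert groups assigned to this target
    severities: frozenset  # severities this target wants to hear about


# Hypothetical targets for illustration only.
TARGETS = [
    NotificationTarget("hdfs-oncall", "EMAIL",
                       frozenset({"HDFS Default"}), frozenset({"CRITICAL"})),
    NotificationTarget("noc", "SNMP",
                       frozenset({"HDFS Default", "YARN Default"}),
                       frozenset({"WARNING", "CRITICAL"})),
]


def targets_to_notify(alert_group, severity, targets=TARGETS):
    """Names of targets assigned to the alert's group that accept this severity."""
    return [t.name for t in targets
            if alert_group in t.groups and severity in t.severities]


print(targets_to_notify("HDFS Default", "CRITICAL"))
print(targets_to_notify("YARN Default", "WARNING"))
```

A target is notified only when both conditions hold, which is what keeps dispatch limited to interested parties.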
Ambari Alerting System
1.  User creates or modifies cluster
2.  Ambari reads alert definitions from Stack
3.  Ambari sends alert definitions to Agents and Agent schedules instance checks
4.  Agents report alert instance status in the heartbeat
5.  Ambari responds to alert instance status changes and dispatches notifications (if applicable)
[Diagram: Ambari Server, Stack definition (alerts.json), Ambari Agent(s), email and snmp notifications; arrows numbered 1 to 5 as in the steps]
Notable Alert REST APIs
REST Endpoint | Description
/api/v1/clusters/:clusterName/alert_definitions | The list of alert definitions for the cluster.
/api/v1/clusters/:clusterName/alerts | The list of alert instances for the cluster.
    Example: find all alert instances that are CRITICAL
    /api/v1/clusters/c1/alerts?Alert/state.in(CRITICAL)
    Example: find all alert instances for “ZooKeeper Process” alert def
    /api/v1/clusters/c1/alerts?Alert/definition_name=zookeeper_server_process
/api/v1/clusters/:clusterName/alert_groups | The list of alert groups.
/api/v1/clusters/:clusterName/alert_history | The list of alert instance status changes.
/api/v1/alert_targets/ | The list of configured alert notification targets for Ambari.
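The two example queries above can be assembled with a small helper. This sketch only builds the request paths; the function names are made up for illustration, and passing several comma-separated states to .in(...) is an assumption about the predicate syntax rather than something the slide shows.

```python
from urllib.parse import quote

BASE = "/api/v1/clusters"


def alerts_by_state(cluster: str, *states: str) -> str:
    """Path for alert instances whose state is one of the given values."""
    return f"{BASE}/{quote(cluster)}/alerts?Alert/state.in({','.join(states)})"


def alerts_by_definition(cluster: str, definition_name: str) -> str:
    """Path for alert instances that belong to one alert definition."""
    return f"{BASE}/{quote(cluster)}/alerts?Alert/definition_name={definition_name}"


print(alerts_by_state("c1", "CRITICAL"))
print(alerts_by_definition("c1", "zookeeper_server_process"))
```

The printed paths match the two examples in the table and would be appended to the Ambari Server address in an authenticated GET request.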
Ambari Metrics
Ambari Metrics
Summary
•  Migrated from Ganglia as the Ambari metrics collection system
•  No longer offer option to install or manage a Ganglia service
•  Replaced with built-in metrics system “Ambari Metrics”
Motivation
•  Avoids Ganglia package conflicts in customer environments
•  More flexibility to retain metrics in Hadoop
•  Platform independence
Terminology
Term | Definition
Ambari Metrics (“AMS”) | The built-in metrics collection system for Ambari.
Metrics Collector | The standalone server that collects and aggregates metrics from the Hadoop service sinks and the Metrics Monitors, and serves them. Analogous to gmetad.
Metrics Monitor | Installed on each host in the cluster to collect system-level metrics and forward them to the Collector. Analogous to gmond.
Metrics Hadoop Sinks | Plug into the Service sinks to send Hadoop metrics to the Collector.
Ambari Metrics Collection System
1.  Metric Monitors send system-level metrics to Collector
2.  Sinks send Hadoop-level metrics to Collector
3.  Metrics Collector service stores and aggregates metrics
4.  Ambari exposes REST API for metrics retrieval
[Diagram: Metrics Monitor and Sink(s) on each host, Metrics Collector, Ambari Server; arrows numbered 1 to 4 as in the steps]
Metrics Collector
Built using Hadoop technologies: Phoenix, ATS, HBase, Local Filesystem **
By default, uses the local filesystem for metrics storage (“embedded”) **
** Tech Preview “distributed” storage option to use existing HDFS
Ambari Automated Rolling Upgrade
For HDP Stack
Rolling vs. In Place Upgrades
In Place Upgrades | Upgrade Stack with one or more service disruptions. Explicit stop of all services.
Rolling Upgrades (Ambari 2.0) | Upgrade Stack with minimized service disruption and degradation.
Upgrading the Stack with Ambari 2.0
Source HDP Version | Target HDP Versions | Method
HDP 2.0.x | HDP 2.0.x, HDP 2.1.x, HDP 2.2.x | In Place
HDP 2.1.x | HDP 2.1.x, HDP 2.2.x | In Place
HDP 2.2.x | HDP 2.2.x | Rolling (NEW!!!)
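The source/target matrix above can be expressed as a lookup. This is a sketch for reasoning about the table only; the function names are hypothetical, and returning None for pairs the slide does not list is a choice made for this example.

```python
from typing import Optional

# (source line, target line) -> upgrade method, transcribed from the table above.
UPGRADE_METHODS = {
    ("2.0", "2.0"): "In Place",
    ("2.0", "2.1"): "In Place",
    ("2.0", "2.2"): "In Place",
    ("2.1", "2.1"): "In Place",
    ("2.1", "2.2"): "In Place",
    ("2.2", "2.2"): "Rolling",
}


def release_line(version: str) -> str:
    """Reduce a version such as '2.2.4' to its major.minor line, '2.2'."""
    major, minor = version.split(".")[:2]
    return f"{major}.{minor}"


def upgrade_method(source: str, target: str) -> Optional[str]:
    """Method listed for this source/target pair, or None if the pair is not listed."""
    return UPGRADE_METHODS.get((release_line(source), release_line(target)))


print(upgrade_method("2.2.0", "2.2.4"))
print(upgrade_method("2.1.7", "2.2.4"))
print(upgrade_method("2.2.4", "2.1.7"))
```

Only a move within the HDP 2.2.x line maps to Rolling; every other listed pair is In Place.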
Rolling Upgrade Process
Prerequisites > Prepare > Rolling Upgrade > Finalize
Rolling Downgrade
Rollback: NOT Rolling. Shut down all services.
Process: Rolling Upgrade
HDP has a certified process for Rolling Upgrades
Services are switched over to new version in rolling fashion
1.  ZooKeeper
2.  Ranger
3.  Core Masters
4.  Core Slaves
5.  Hive
6.  Oozie
7.  Falcon
8.  Clients
9.  Kafka
10.  Knox
11.  Storm
12.  Slider
13.  Flume
14.  Finalize
Core Masters and Core Slaves: HDFS, YARN, HBase
Clients: HDFS, YARN, MR, Tez, HBase, Pig, Hive
Process: Rolling Downgrade
ZooKeeper
Ranger
Core Masters
Core Slaves
Hive
Oozie
Falcon
Clients
Kafka
Knox
Storm
Slider
Flume
Downgrade
Finalize
Wizard Driven Experience
Register > Install > Perform Upgrade > Finalize
With verification and validation
Process: Service Disruption by Component
Component Service Disruption
ZooKeeper No Service Disruption
Ranger No Service Disruption
HDFS No Service Disruption
YARN No Service Disruption
HBase No Service Disruption
Hive No Service Disruption
Oozie No Service Disruption
Falcon Yes – Requires Stop/Start
Kafka Yes – Requires Stop/Start
Knox Yes – Requires Stop/Start
Storm Yes – Requires Stop/Start
Flume No Service Disruption
Slider applications Yes – Requires Stop/Start
Hue Yes – Requires Stop/Start
Accumulo Yes – Requires Stop/Start
Ambari Extensibility
Stacks, Blueprints and Views
Extensibility Features
Stacks
•  To add new Services (ISV or otherwise) beyond HDP Stack
•  To customize a Stack for customer specific environments
Blueprints
•  To use Ambari for automating cluster installations
•  To share best practices on layout and cluster configuration
Views
•  To extend and customize the Ambari Web UI
•  Add new capabilities, customize existing capabilities
Goal: Extend Ambari without hard-coding in Ambari
New in Ambari 2.0 - Blueprints Add Host
Add hosts to a cluster based on a host group from a Blueprint
Add one or more hosts with a single call
POST /api/v1/clusters/MyCluster/hosts
{
  "blueprint" : "myblueprint",
  "host_group" : "workers",
  "host_name" : "c6403.ambari.apache.org"
}
POST /api/v1/clusters/MyCluster/hosts
[
  {
    "blueprint" : "myblueprint",
    "host_group" : "workers",
    "host_name" : "c6403.ambari.apache.org"
  },
  {
    "blueprint" : "myblueprint",
    "host_group" : "workers",
    "host_name" : "c6404.ambari.apache.org"
  }
]
https://issues.apache.org/jira/browse/AMBARI-8458
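A client-side sketch of the Add Host call, using only the Python standard library. It builds the request without sending it. The server address, the helper name and the second host name are hypothetical; authentication is left out; and the X-Requested-By header is included on the assumption that Ambari expects it on modifying calls.

```python
import json
import urllib.request


def build_add_hosts_request(base_url, cluster, blueprint, host_group, host_names):
    """Build (but do not send) a POST that adds hosts from a Blueprint host group."""
    entries = [
        {"blueprint": blueprint, "host_group": host_group, "host_name": name}
        for name in host_names
    ]
    # One host is sent as a single object, several hosts as an array.
    payload = entries[0] if len(entries) == 1 else entries
    return urllib.request.Request(
        url=f"{base_url}/api/v1/clusters/{cluster}/hosts",
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: Ambari expects an X-Requested-By header on modifying calls.
        headers={"X-Requested-By": "ambari"},
        method="POST",
    )


request = build_add_hosts_request(
    "http://ambari.example.com:8080",  # hypothetical Ambari Server address
    "MyCluster", "myblueprint", "workers",
    ["c6403.ambari.apache.org", "c6404.ambari.apache.org"],
)
print(request.get_method(), request.full_url)
print(json.dumps(json.loads(request.data), indent=2))
```

Sending the request would additionally need credentials, for example through urllib.request.urlopen with a basic-auth handler.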
More on Ambari Extensibility
http://hortonworks.com/partners/learn/#ambari
Simplified Kerberos Setup
Quick Kerberos Overview
REALM
•  EXAMPLE.COM
Principals (Humans)
•  paul@EXAMPLE.COM
Service Principals (Services)
•  hbase@EXAMPLE.COM
•  hbase/r2u3s1.example.local@EXAMPLE.COM
Tickets
•  “paul@EXAMPLE.COM is authenticated and can access the HBASE service”
KDC – Key Distribution Center
•  Grants tickets to authenticated users
Client
•  r1u2m1.example.com (.example.com maps to realm EXAMPLE.COM)
•  EXAMPLE.COM’s KDC is hosted on r1u2m3.example.com
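The principal names above follow the pattern primary[/instance]@REALM. A small parser makes the parts explicit; this is an illustrative sketch with made-up names (Principal, parse_principal), not a replacement for a real Kerberos library, and it ignores escaping rules.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    primary: str             # user or service name, e.g. "paul" or "hbase"
    instance: Optional[str]  # host part of a service principal, if present
    realm: str               # e.g. "EXAMPLE.COM"


def parse_principal(text: str) -> Principal:
    """Split 'primary[/instance]@REALM' into its parts."""
    name, _, realm = text.rpartition("@")
    if not name or not realm:
        raise ValueError(f"not a full principal name: {text!r}")
    primary, _, instance = name.partition("/")
    return Principal(primary, instance or None, realm)


print(parse_principal("paul@EXAMPLE.COM"))
print(parse_principal("hbase/r2u3s1.example.local@EXAMPLE.COM"))
```

The instance part is what ties a service principal to one host, which is why each cluster host needs its own service principals and keytabs.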
KDC Implementation Options
•  Microsoft Active Directory
•  Users
•  Service Principals
•  MIT Kerberos
•  Users
•  Service Principals
•  MIT Kerberos + Microsoft Active Directory (Trust Relationship)
•  Users in Active Directory
•  Service Principals in MIT
What Ambari 2.0 will do
•  Step-by-step wizard to set up Kerberos
•  Supports existing MIT KDC and Active Directory (AD) infrastructure
•  Deploys and manages Kerberos Clients, and configuration
•  First Time Setup as well as New Service/Host/Component
•  Automated creation of principals
•  Automated generation of keytabs
•  Automated distribution of keytabs
•  Support for regeneration and distribution of keytabs
Prerequisites
Category Requirements
General
•  Ambari Server must be part of cluster
•  Ambari Server and all hosts must have JCE installed
•  Ambari Server and all hosts must have network access to the KDC
KDC Admin
•  KDC admin account credentials are on-hand
•  !!! Ambari does not retain KDC admin credentials !!!
Active Directory
•  Secure LDAP (LDAPS) connectivity has been configured
•  User container for principals has been created and is on-hand
•  Admin account has delegated control of “Create, delete and manage user accounts”
Terminology
Term | Definition
Service Principals | Principals required for HDP Service Components.
Ambari Principals | Headless principals used by Ambari to perform “smoke tests” and “health alert checks”.
KDC Admin Account | An administrative account that will be used by Ambari to create principals and generate keytabs in KDC.
Principal and Keytab Generation and Distribution
1.  User provides KDC Admin Account credentials to Ambari
2.  Ambari connects to KDC, creates principals (Service and Ambari) needed for cluster
3.  Ambari generates keytabs for the principals
4.  Ambari distributes keytabs to Ambari Server and cluster hosts
5.  Ambari discards the KDC Admin Account credentials
[Diagram: Ambari Server, KDC, HDP Cluster; arrows numbered 1 to 5 as in the steps]
Ambari + Service Keytab Files
•  KDC: Service Principals and Ambari Principals
•  Ambari Server: Keytabs for Ambari Principals
•  HDP Cluster Hosts: Keytabs for Service + Ambari Principals
Wizard Driven and Automated
KDC and KDC Admin Information
Customizable Principal Attributes
Kerberos Clients
Ambari installs Kerberos clients on cluster hosts
Optionally, Ambari can leave the krb5.conf client config unmanaged
OS | Client
RHEL/CentOS/OEL | krb5-workstation
SLES 11 | krb5-client
Ubuntu 12 | krb5-user, krb5-config
Configure Ambari and Service Identities
Post-Kerberos Scenarios
Ambari does not retain KDC admin credentials
User is prompted for KDC Admin credentials:
•  Add/Delete Host
•  Add Service
•  Add/Delete Component
•  Regenerate Keytabs
•  Disable Kerberos
Security Lab
Security today in Hadoop with HDP
Authentication: Who am I/prove it?
•  Kerberos in native Apache Hadoop
•  HTTP/REST API Secured with Apache Knox Gateway
Authorization: Restrict access to explicit data
•  Fine grain access control
•  HDFS, Hive and HBase, Storm and Knox
Audit: Understand who did what
•  Centralized audit reporting
•  Policy and access history
Data Protection: Encrypt data at rest & in motion
•  Wire encryption in Hadoop
•  Orchestrated encryption with partner tools
HDP2.1
Ranger
Centralized Security Administration
More on Security: http://hortonworks.com/partners/learn/#secure
Ambari 2.0 Security Lab Steps
•  Detailed steps available at: http://bit.ly/1J4IbIs
•  Install Ambari server and agents
•  Deploy custom Ambari service folders for OpenLDAP, KDC/kadmin, NSLCD
•  Use blueprint/API to provision a minimal Hadoop cluster with custom services
•  Use Add service wizard to also install Hive
•  Configure Ambari to sync/recognize business users in OpenLDAP
•  Authentication: Run Ambari security wizard to enable Kerberos using KDC/kadmin service
•  Install Ranger as Ambari service and configure it to recognize LDAP users
•  Authorization/Audit: Set up HDFS/Hive plugins to allow users to set policies to control access and audit consumption

More Related Content

What's hot

HDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical WorkshopHDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical WorkshopHortonworks
 
Apache Ambari - What's New in 2.4
Apache Ambari - What's New in 2.4 Apache Ambari - What's New in 2.4
Apache Ambari - What's New in 2.4 Hortonworks
 
Authoring and Hosting Applications on YARN using Slider
Authoring and Hosting Applications on YARN using SliderAuthoring and Hosting Applications on YARN using Slider
Authoring and Hosting Applications on YARN using SliderDataWorks Summit
 
Hortonworks tech workshop in-memory processing with spark
Hortonworks tech workshop   in-memory processing with sparkHortonworks tech workshop   in-memory processing with spark
Hortonworks tech workshop in-memory processing with sparkHortonworks
 
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...DataWorks Summit
 
Get Started Building YARN Applications
Get Started Building YARN ApplicationsGet Started Building YARN Applications
Get Started Building YARN ApplicationsHortonworks
 
Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive Hortonworks
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache RangerDataWorks Summit
 
Deploying Docker applications on YARN via Slider
Deploying Docker applications on YARN via SliderDeploying Docker applications on YARN via Slider
Deploying Docker applications on YARN via SliderHortonworks
 
Enabling Diverse Workload Scheduling in YARN
Enabling Diverse Workload Scheduling in YARNEnabling Diverse Workload Scheduling in YARN
Enabling Diverse Workload Scheduling in YARNDataWorks Summit
 
Hadoop crashcourse v3
Hadoop crashcourse v3Hadoop crashcourse v3
Hadoop crashcourse v3Hortonworks
 
What s new in spark 2.3 and spark 2.4
What s new in spark 2.3 and spark 2.4What s new in spark 2.3 and spark 2.4
What s new in spark 2.3 and spark 2.4DataWorks Summit
 
Internet of things Crash Course Workshop
Internet of things Crash Course WorkshopInternet of things Crash Course Workshop
Internet of things Crash Course WorkshopDataWorks Summit
 
Apache Ambari - What's New in 2.2
 Apache Ambari - What's New in 2.2 Apache Ambari - What's New in 2.2
Apache Ambari - What's New in 2.2Hortonworks
 
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and TroubleshootingApache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and TroubleshootingDataWorks Summit/Hadoop Summit
 
Hortonworks Hadoop summit 2011 keynote - eric14
Hortonworks Hadoop summit 2011 keynote - eric14Hortonworks Hadoop summit 2011 keynote - eric14
Hortonworks Hadoop summit 2011 keynote - eric14Hortonworks
 
Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureDataWorks Summit
 
Double Your Hadoop Hardware Performance with SmartSense
Double Your Hadoop Hardware Performance with SmartSenseDouble Your Hadoop Hardware Performance with SmartSense
Double Your Hadoop Hardware Performance with SmartSenseHortonworks
 
Hortonworks Technical Workshop - build a yarn ready application with apache ...
Hortonworks Technical Workshop -  build a yarn ready application with apache ...Hortonworks Technical Workshop -  build a yarn ready application with apache ...
Hortonworks Technical Workshop - build a yarn ready application with apache ...Hortonworks
 
Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]Hortonworks
 

What's hot (20)

HDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical WorkshopHDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical Workshop
 
Apache Ambari - What's New in 2.4
Apache Ambari - What's New in 2.4 Apache Ambari - What's New in 2.4
Apache Ambari - What's New in 2.4
 
Authoring and Hosting Applications on YARN using Slider
Authoring and Hosting Applications on YARN using SliderAuthoring and Hosting Applications on YARN using Slider
Authoring and Hosting Applications on YARN using Slider
 
Hortonworks tech workshop in-memory processing with spark
Hortonworks tech workshop   in-memory processing with sparkHortonworks tech workshop   in-memory processing with spark
Hortonworks tech workshop in-memory processing with spark
 
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...
 
Get Started Building YARN Applications
Get Started Building YARN ApplicationsGet Started Building YARN Applications
Get Started Building YARN Applications
 
Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache Ranger
 
Deploying Docker applications on YARN via Slider
Deploying Docker applications on YARN via SliderDeploying Docker applications on YARN via Slider
Deploying Docker applications on YARN via Slider
 
Enabling Diverse Workload Scheduling in YARN
Enabling Diverse Workload Scheduling in YARNEnabling Diverse Workload Scheduling in YARN
Enabling Diverse Workload Scheduling in YARN
 
Hadoop crashcourse v3
Hadoop crashcourse v3Hadoop crashcourse v3
Hadoop crashcourse v3
 
What s new in spark 2.3 and spark 2.4
What s new in spark 2.3 and spark 2.4What s new in spark 2.3 and spark 2.4
What s new in spark 2.3 and spark 2.4
 
Internet of things Crash Course Workshop
Internet of things Crash Course WorkshopInternet of things Crash Course Workshop
Internet of things Crash Course Workshop
 
Apache Ambari - What's New in 2.2
 Apache Ambari - What's New in 2.2 Apache Ambari - What's New in 2.2
Apache Ambari - What's New in 2.2
 
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and TroubleshootingApache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
 
Hortonworks Hadoop summit 2011 keynote - eric14
Hortonworks Hadoop summit 2011 keynote - eric14Hortonworks Hadoop summit 2011 keynote - eric14
Hortonworks Hadoop summit 2011 keynote - eric14
 
Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and Future
 
Double Your Hadoop Hardware Performance with SmartSense
Double Your Hadoop Hardware Performance with SmartSenseDouble Your Hadoop Hardware Performance with SmartSense
Double Your Hadoop Hardware Performance with SmartSense
 
Hortonworks Technical Workshop - build a yarn ready application with apache ...
Hortonworks Technical Workshop -  build a yarn ready application with apache ...Hortonworks Technical Workshop -  build a yarn ready application with apache ...
Hortonworks Technical Workshop - build a yarn ready application with apache ...
 
Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]
 

Similar to Hortonworks technical workshop operations with ambari

Apache Ambari - What's New in 2.1
Apache Ambari - What's New in 2.1Apache Ambari - What's New in 2.1
Apache Ambari - What's New in 2.1Hortonworks
 
Past, Present and Future of Apache Ambari
Past, Present and Future of Apache AmbariPast, Present and Future of Apache Ambari
Past, Present and Future of Apache AmbariArtem Ervits
 
Managing Enterprise Hadoop Clusters with Apache Ambari
Managing Enterprise Hadoop Clusters with Apache AmbariManaging Enterprise Hadoop Clusters with Apache Ambari
Managing Enterprise Hadoop Clusters with Apache AmbariJayush Luniya
 
Managing Enterprise Hadoop Clusters with Apache Ambari
Managing Enterprise Hadoop Clusters with Apache AmbariManaging Enterprise Hadoop Clusters with Apache Ambari
Managing Enterprise Hadoop Clusters with Apache AmbariHortonworks
 
Apache Ambari - What's New in 2.0.0
Apache Ambari - What's New in 2.0.0Apache Ambari - What's New in 2.0.0
Apache Ambari - What's New in 2.0.0Hortonworks
 
Apache Ambari - What's New in 1.7.0
Apache Ambari - What's New in 1.7.0Apache Ambari - What's New in 1.7.0
Apache Ambari - What's New in 1.7.0Hortonworks
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...Data Con LA
 
Hortonworks Data In Motion Series Part 3 - HDF Ambari
Hortonworks Data In Motion Series Part 3 - HDF Ambari Hortonworks Data In Motion Series Part 3 - HDF Ambari
Hortonworks Data In Motion Series Part 3 - HDF Ambari Hortonworks
 
Manage Add-On Services with Apache Ambari
Manage Add-On Services with Apache AmbariManage Add-On Services with Apache Ambari
Manage Add-On Services with Apache AmbariDataWorks Summit
 
Running Cloudbreak on Kubernetes
Running Cloudbreak on KubernetesRunning Cloudbreak on Kubernetes
Running Cloudbreak on KubernetesKrisztián Horváth
 
Ambari metrics system - Apache ambari meetup (DataWorks Summit 2017)
Ambari metrics system - Apache ambari meetup (DataWorks Summit 2017)Ambari metrics system - Apache ambari meetup (DataWorks Summit 2017)
Ambari metrics system - Apache ambari meetup (DataWorks Summit 2017)Aravindan Vijayan
 
Manage Add-on Services in Apache Ambari
Manage Add-on Services in Apache AmbariManage Add-on Services in Apache Ambari
Manage Add-on Services in Apache AmbariJayush Luniya
 
Pivotal cf for_devops_mkim_20141209
Pivotal cf for_devops_mkim_20141209Pivotal cf for_devops_mkim_20141209
Pivotal cf for_devops_mkim_20141209minseok kim
 
Why is My Hadoop Job Slow?
Why is My Hadoop Job Slow?Why is My Hadoop Job Slow?
Why is My Hadoop Job Slow?Bikas Saha
 
Why is My Hadoop Job Slow?
Why is My Hadoop Job Slow?Why is My Hadoop Job Slow?
Why is My Hadoop Job Slow?Bikas Saha
 
Hadoop Operations - Past, Present, and Future
Hadoop Operations - Past, Present, and FutureHadoop Operations - Past, Present, and Future
Hadoop Operations - Past, Present, and FutureDataWorks Summit
 
Hadoop Operations – Past, Present, and Future
Hadoop Operations – Past, Present, and FutureHadoop Operations – Past, Present, and Future
Hadoop Operations – Past, Present, and FutureDataWorks Summit
 

Hortonworks technical workshop operations with ambari

  • 1. Page 1 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Operations With Apache Ambari We Do Hadoop.
  • 2. Apache Ambari: Apache Ambari is the open source operational platform to provision, manage and monitor Hadoop clusters.
  • 3. How Do People Use Ambari?
      • Cluster Provisioning: Install Wizard (UI), Blueprints (API)
      • Service Management: Lifecycle controls, Rolling Restarts, Decommission/Re-commission
      • Config Management: Host Groups, Versioning, Compare, Revert, Recommendations, Security Setup
      • Monitoring: Health Checks, Alerts
      • Extensibility: Stacks, Views
  • 4. Recent Ambari Releases
      • Ambari 1.6.0 (May 2014): introduced Ambari Blueprints
      • Ambari 1.7.0 (Dec 2014): introduced Ambari Views
      • Ambari 2.0.0 (Apr 2015)
      • Timeline milestone: HDP 2.2 GA
  • 5. What’s New in Ambari 2.0
      • Core Platform: Simplified Kerberos Setup (AMBARI-7204), Ambari Alerts (AMBARI-6354), Ambari Metrics (AMBARI-5707), Automated (Rolling) Upgrade (AMBARI-7804)
      • Stack Support: HDP 2.2 (Ranger, Spark, Phoenix), Hive Metastore HA (AMBARI-6684), HiveServer2 HA (AMBARI-8906), Oozie HA (AMBARI-6683)
      • Ambari Platform: Handle umask 027 setting (AMBARI-7796), Ambari Agent non-root (AMBARI-1596)
      • Blueprints API: Add Host (AMBARI-8458)
      • For a complete list of changes: https://issues.apache.org/jira/browse/AMBARI/fixforversion/12327486
  • 6. Lab Setup
  • 7. Ambari 2.0 Security Lab Steps – 4 node cluster
      • Detailed steps available at: http://bit.ly/1J4IbIs
      • Install Ambari server and agents
      • Deploy custom Ambari service folders for OpenLDAP, KDC/kadmin, NSLCD
      • Use blueprint/API to provision a minimal Hadoop cluster with custom services
      • Use Add Service wizard to also install Hive
      • Configure Ambari to sync/recognize business users in OpenLDAP
      • Authentication: Run Ambari security wizard to enable Kerberos using the KDC/kadmin service
      • Install Ranger as an Ambari service and configure it to recognize LDAP users
      • Authorization/Audit: Set up HDFS/Hive plugins to allow users to set policies to control access and audit consumption
  • 8. Core Platform
  • 9. Stack Components Support (HDP 2.2, HDP 2.1, HDP 2.0)
      • HDFS, YARN, MapReduce, Hive, HBase, Pig, ZooKeeper, Oozie, Sqoop
      • Tez, Storm, Falcon, Flume
      • Knox, Slider, Kafka
      • Ranger, Spark, Phoenix: NEW in Ambari 2.0 (install/manage/monitor)
  • 10. Admin > Stack and Versions: List of Stack Services Installed or Add Service
  • 11. Ambari 2.0.0 High Availability Support (across Ambari 1.6.1, Ambari 1.7.0 and Ambari 2.0.0)
      Component | Stack | High Availability Mode
      HDFS: NameNode | HDP 2.0+ | Active/Standby
      YARN: ResourceManager | HDP 2.1+ | Active/Standby
      HBase: HBaseMaster | HDP 2.1+ | Multi-master
      Hive: HiveServer2 | HDP 2.1+ | Multi-instance
      Hive: Hive Metastore | HDP 2.1+ | Multi-instance
      Oozie: Oozie Server* | HDP 2.1+ | Multi-instance
      * Oozie Server needs an external load balancer to complete the HA solution
  • 12. Hive HA: Services > Hive > Service Actions
      • + Add Hive Metastore
      • + Add HiveServer2
  • 13. Ambari Agent Non-Root
  • 14. Non-Root Ambari Agent
      • Agent runs commands from the Ambari Server: Configuration Change, Service Start, Service Stop
      • Some commands require root-level access, for example:
          /bin/su hdfs -l -s /bin/bash -c /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode
      • Sudo is leveraged, with configuration for:
          – Customizable Users (su hdfs, yarn, etc.)
          – Non-Customizable Users (su mysql)
          – Commands (yum, mkdir, touch, test, etc.)
  • 15. Configuring Agent for Non-Root
      1. Create and configure a sudoer account
      2. Manually bootstrap Ambari Agents
      3. Set run_as_user in ambari-agent.ini for the sudoer account
      Details: http://docs.hortonworks.com/HDPDocuments/Ambari-2.0.0.0/bk_ambari_reference_guide/content/ch_amb_ref_configuring_ambari_for_non-root.html
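
Step 3 above (setting run_as_user) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `[agent]` section name, the sample `prefix` key, and the account name `ambari` are not given on the slide and are hypothetical here; the sketch edits an in-memory sample rather than a real ambari-agent.ini.

```python
import configparser
import io

# Hypothetical sample of an agent config; the [agent] section name and the
# "prefix" key are assumptions for illustration, not taken from the slides.
SAMPLE_INI = """
[agent]
prefix=/var/lib/ambari-agent/data
run_as_user=root
"""


def set_run_as_user(ini_text: str, user: str) -> str:
    """Return ini_text with run_as_user set to the given sudoer account."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(ini_text)
    if not parser.has_section("agent"):
        parser.add_section("agent")
    parser.set("agent", "run_as_user", user)
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    # "ambari" is a hypothetical sudoer account name.
    print(set_run_as_user(SAMPLE_INI, "ambari"))
```

The linked reference guide remains the authority on the actual file layout and on the sudoers entries the account needs.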
  • 16. Updated umask Handling
  • 17. What about umask?
      • Unix Permissions Basics (user, group, other): 4 – read, 2 – write, 1 – execute; rwxr-xr-x == 755
      • Previous Behavior: if umask > 022, a warning is shown during the agent pre-req check; installations would fail if the warning was ignored
      • New Behavior: if umask > 027, a warning is shown during the agent pre-req check; installation will fail if the warning is ignored
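
The permission arithmetic on the umask slide can be checked with a short Python sketch. It shows how a umask clears bits from a requested mode, which is why 022 yields 755 and 027 yields 750 for a directory created with mode 777. The function names are illustrative and are not part of Ambari.

```python
import stat


def effective_mode(requested: int, umask: int) -> int:
    """Permission bits left after the umask clears its bits from `requested`."""
    return requested & ~umask & 0o777


def rwx(mode: int) -> str:
    """Render permission bits as a nine-character rwx string."""
    # Add the regular-file type bit so stat.filemode returns a leading '-',
    # then drop that first character.
    return stat.filemode(stat.S_IFREG | mode)[1:]


if __name__ == "__main__":
    for mask in (0o022, 0o027):
        mode = effective_mode(0o777, mask)
        print(f"umask {mask:03o} -> {mode:03o} {rwx(mode)}")
```

With umask 027 the "other" class loses all access, so anything that relied on world-readable files under 022 has to work through user or group permissions instead.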
  • 18. Ambari Alerts
  • 19. Ambari Alerts
      • Summary: Migrated away from Nagios as the Ambari alerting system; no longer offers the option to install or manage a Nagios service; replaced with a built-in alerting system
      • Motivation: Avoids Nagios package conflicts in customer environments; more flexibility with alerts in Ambari Stacks; platform independence
  • 20. Ambari Alerts
      • Ambari Alerts are installed and configured by default
      • Ambari Web provides centralized management of Health Alerts
  • 21. Modifying Alerts
      • Control thresholds, check intervals and response text
  • 22. Alert Groups
      • Create and manage groups of alerts
      • Grouping alerts further controls which alerts are dispatched to which notifications
      • Assign groups to notifications
      • Only dispatch to interested parties
  • 23. Alert Notifications
      • What: Create and manage multiple notification targets; control who gets notified when
      • Why: Filter by severity; send only certain notifications to certain targets based on severity
      • How: Control dispatch method; support for EMAIL + SNMP
  • 24. Ambari Alerting System
      1. User creates or modifies cluster
      2. Ambari reads alert definitions from Stack
      3. Ambari sends alert definitions to Agents and each Agent schedules instance checks
      4. Agents report alert instance status in the heartbeat
      5. Ambari responds to alert instance status changes and dispatches notifications (if applicable)
      (Diagram: Ambari Server, Stack definition alerts.json, Ambari Agent(s), email and SNMP notifications; the numbers match the steps above.)
  • 25. Notable Alert REST APIs
      REST Endpoint | Description
      /api/v1/clusters/:clusterName/alert_definitions | The list of alert definitions for the cluster.
      /api/v1/clusters/:clusterName/alerts | The list of alert instances for the cluster.
          Example: find all alert instances that are CRITICAL: /api/v1/clusters/c1/alerts?Alert/state.in(CRITICAL)
          Example: find all alert instances for the “ZooKeeper Process” alert definition: /api/v1/clusters/c1/alerts?Alert/definition_name=zookeeper_server_process
      /api/v1/clusters/:clusterName/alert_groups | The list of alert groups.
      /api/v1/clusters/:clusterName/alert_history | The list of alert instance status changes.
      /api/v1/alert_targets/ | The list of configured alert notification targets for Ambari.
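
The two example queries on the REST slide can be assembled with a small Python sketch. It only builds request paths and performs no network call; the helper names are hypothetical, and the host, port and credentials of a real Ambari server are left out on purpose.

```python
from urllib.parse import quote

API_ROOT = "/api/v1"


def alerts_path(cluster: str, predicate: str = "") -> str:
    """Path of the alert-instances endpoint, optionally with a query predicate."""
    path = f"{API_ROOT}/clusters/{quote(cluster, safe='')}/alerts"
    return f"{path}?{predicate}" if predicate else path


def state_in(*states: str) -> str:
    """Predicate matching alert instances whose state is one of `states`."""
    return "Alert/state.in(" + ",".join(states) + ")"


def definition_is(name: str) -> str:
    """Predicate matching alert instances of a single alert definition."""
    return f"Alert/definition_name={name}"


if __name__ == "__main__":
    print(alerts_path("c1", state_in("CRITICAL")))
    print(alerts_path("c1", definition_is("zookeeper_server_process")))
```

Prefix the returned path with the server's base URL to form a full request. Passing several states, such as `state_in("CRITICAL", "WARNING")`, follows the same `.in(...)` pattern shown on the slide.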
  • 26. Ambari Metrics
  • 27. Ambari Metrics
      • Summary: Migrated from Ganglia as the Ambari metrics collection system; no longer offers the option to install or manage a Ganglia service; replaced with the built-in metrics system “Ambari Metrics”
      • Motivation: Avoids Ganglia package conflicts in customer environments; more flexibility to retain metrics in Hadoop; platform independence
  • 28. Terminology
      Term | Definition
      Ambari Metrics (“AMS”) | The built-in metrics collection system for Ambari.
      Metrics Collector | The standalone server that collects, aggregates and serves metrics from the Hadoop service sinks and the Metrics Monitor. Analogous to gmetad.
      Metrics Monitor | Installed on each host in the cluster to collect system-level metrics and forward them to the Collector. Analogous to gmond.
      Metrics Hadoop Sinks | Plug into the Service sinks to send Hadoop metrics to the Collector.
  • 29. Ambari Metrics Collection System
      1. Metrics Monitors send system-level metrics to the Collector
      2. Sinks send Hadoop-level metrics to the Collector
      3. Metrics Collector service stores and aggregates metrics
      4. Ambari exposes REST API for metrics retrieval
      (Diagram: Ambari Server, Metrics Collector, and a Metrics Monitor plus Sink(s) on each host; the numbers match the steps above.)
  • 30. Metrics Collector
      • Built using Hadoop technologies: Local Filesystem **, HBase, ATS, Phoenix
      • By default uses the local filesystem for metrics storage (“embedded”) **
      • ** Tech Preview: “distributed” storage option to use existing HDFS
  • 31. Ambari Automated Rolling Upgrade For HDP Stack
  • 32. Rolling vs. In Place Upgrades
      • In Place Upgrades: Upgrade Stack with one or more service disruptions. Explicit stop of all services.
      • Rolling Upgrades (Ambari 2.0): Update Stack with minimized service disruption and degradation.
  • 33. Upgrading the Stack with Ambari 2.0
      Source HDP Version | Target HDP Versions | Method
      HDP 2.0.x | HDP 2.0.x, HDP 2.1.x, HDP 2.2.x | In Place
      HDP 2.1.x | HDP 2.1.x, HDP 2.2.x | In Place
      HDP 2.2.x | HDP 2.2.x | Rolling (NEW!!!)
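
The upgrade table above can be read as a lookup from source and target release line to upgrade method. The Python sketch below encodes only the rows on the slide; the function names are illustrative, and returning `None` for a pair the slide does not list is an assumption of this sketch, not documented Ambari behavior.

```python
from typing import Optional

# Rows of the slide's table: source release line -> {target release line: method}.
UPGRADE_MATRIX = {
    "2.0": {"2.0": "In Place", "2.1": "In Place", "2.2": "In Place"},
    "2.1": {"2.1": "In Place", "2.2": "In Place"},
    "2.2": {"2.2": "Rolling"},
}


def release_line(version: str) -> str:
    """Reduce strings like 'HDP 2.1.x' or '2.2.4.2' to 'major.minor'."""
    text = version.upper().strip()
    if text.startswith("HDP"):
        text = text[3:]
    text = text.strip(" -")
    major, minor = text.split(".")[:2]
    return f"{major}.{minor}"


def upgrade_method(source: str, target: str) -> Optional[str]:
    """Method listed on the slide for this pair, or None if the pair is not listed."""
    return UPGRADE_MATRIX.get(release_line(source), {}).get(release_line(target))


if __name__ == "__main__":
    print(upgrade_method("HDP 2.1.x", "HDP 2.2.x"))
    print(upgrade_method("HDP 2.2.x", "HDP 2.2.x"))
```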
  • 34. Rolling Upgrade Process
      • Steps: Pre-requisites → Prepare → Rolling Upgrade → Finalize
      • Rolling Downgrade
      • Rollback: NOT rolling. Shut down all services.
  • 35. Process: Rolling Upgrade
      • HDP has a certified process for Rolling Upgrades
      • Services are switched over to the new version in rolling fashion
      • Order: ZooKeeper → Ranger → Core Masters → Core Slaves → Hive → Oozie → Falcon → Clients → Kafka → Knox → Storm → Slider → Flume → Finalize
      • Diagram annotations: HDFS, YARN, MR, Tez, HBase, Pig, Hive; HDFS, YARN, HBase
  • 36. Process: Rolling Downgrade
      • Components: ZooKeeper, Ranger, Core Masters, Core Slaves, Hive, Oozie, Falcon, Clients, Kafka, Knox, Storm, Slider, Flume
      • Diagram labels: Downgrade, Finalize
  • 37. Wizard Driven Experience
      • Register → Install → Perform Upgrade → Finalize
      • With verification and validation
  • 38. Process: Service Disruption by Component
      Component | Service Disruption
      ZooKeeper | No Service Disruption
      Ranger | No Service Disruption
      HDFS | No Service Disruption
      YARN | No Service Disruption
      HBase | No Service Disruption
      Hive | No Service Disruption
      Oozie | No Service Disruption
      Falcon | Yes – Requires Stop/Start
      Kafka | Yes – Requires Stop/Start
      Knox | Yes – Requires Stop/Start
      Storm | Yes – Requires Stop/Start
      Flume | No Service Disruption
      Slider applications | Yes – Requires Stop/Start
      Hue | Yes – Requires Stop/Start
      Accumulo | Yes – Requires Stop/Start
  • 39. Ambari Extensibility: Stacks, Blueprints and Views
  • 40. Extensibility Features
      •  Stacks
          •  To add new Services (ISV or otherwise) beyond the HDP Stack
          •  To customize a Stack for customer-specific environments
      •  Blueprints
          •  To use Ambari for automating cluster installations
          •  To share best practices on layout and cluster configuration
      •  Views
          •  To extend and customize the Ambari Web UI
          •  To add new capabilities and customize existing capabilities
      Goal: Extend Ambari without hard-coding in Ambari
  • 41. New in Ambari 2.0 - Blueprints Add Host
      •  Add hosts to a cluster based on a host group from a Blueprint
      •  Add one or more hosts with a single call
      Single host:
        POST /api/v1/clusters/MyCluster/hosts
        { "blueprint" : "myblueprint", "host_group" : "workers", "host_name" : "c6403.ambari.apache.org" }
      Multiple hosts:
        POST /api/v1/clusters/MyCluster/hosts
        [
          { "blueprint" : "myblueprint", "host_group" : "workers", "host_name" : "c6403.ambari.apache.org" },
          { "blueprint" : "myblueprint", "host_group" : "workers", "host_name" : "c6404.ambari.apache.org" }
        ]
      https://issues.apache.org/jira/browse/AMBARI-8458
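The Add Host call on slide 41 could be assembled from Python roughly as follows. This is a minimal sketch under stated assumptions: `build_add_host_request` is a hypothetical helper, the base URL is a placeholder, and authentication is omitted. The request is only built, not sent. The `X-Requested-By` header is included because the Ambari REST API expects it on modifying requests.

```python
import json
import urllib.request


def build_add_host_request(base_url, cluster, blueprint, host_group, host_names):
    """Build (but do not send) the Add Host POST shown on slide 41.

    Hypothetical helper: base_url is a placeholder and no credentials are attached.
    """
    if not host_names:
        raise ValueError("at least one host name is required")
    entries = [
        {"blueprint": blueprint, "host_group": host_group, "host_name": name}
        for name in host_names
    ]
    # The slide shows a single JSON object for one host and an array for several.
    payload = entries[0] if len(entries) == 1 else entries
    return urllib.request.Request(
        url=f"{base_url}/api/v1/clusters/{cluster}/hosts",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"X-Requested-By": "ambari"},
    )


if __name__ == "__main__":
    request = build_add_host_request(
        "http://ambari.example.com:8080",  # assumed server address
        "MyCluster",
        "myblueprint",
        "workers",
        ["c6403.ambari.apache.org", "c6404.ambari.apache.org"],
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it; in practice an authorization header for an Ambari admin user would have to be added first.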
  • 42. More on Ambari Extensibility: http://hortonworks.com/partners/learn/#ambari
  • 43. Simplified Kerberos Setup
  • 44. Quick Kerberos Overview
      •  REALM
          •  EXAMPLE.COM
      •  Principals (Humans)
          •  paul@EXAMPLE.COM
      •  Service Principals (Services)
          •  hbase@EXAMPLE.COM
          •  hbase/r2u3s1.example.local@EXAMPLE.COM
      •  Tickets
          •  “paul@EXAMPLE.COM is authenticated and can access the HBASE service”
      •  KDC – Key Distribution Center
          •  Grants tickets to authenticated users
      •  Client
          •  r1u2m1.example.com (.example.com maps to realm EXAMPLE.COM)
          •  EXAMPLE.COM’s KDC is hosted on r1u2m3.example.com
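The principal names on slide 44 follow the pattern `primary[/instance]@REALM`. The sketch below splits the slide's examples into those parts. It is a simplification for illustration: `Principal` and `parse_principal` are hypothetical names, and real Kerberos libraries also handle escaped characters and default realms, which this does not.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    primary: str             # user or service name, e.g. "paul" or "hbase"
    instance: Optional[str]  # usually the host for a service principal
    realm: str               # e.g. "EXAMPLE.COM"


def parse_principal(text: str) -> Principal:
    """Split 'primary[/instance]@REALM' into its parts (simplified, no escaping)."""
    name, separator, realm = text.rpartition("@")
    if not separator or not name or not realm:
        raise ValueError(f"expected primary[/instance]@REALM, got {text!r}")
    primary, _, instance = name.partition("/")
    if not primary:
        raise ValueError(f"missing primary in {text!r}")
    return Principal(primary, instance or None, realm)


if __name__ == "__main__":
    for example in ("paul@EXAMPLE.COM", "hbase/r2u3s1.example.local@EXAMPLE.COM"):
        print(parse_principal(example))
```

The instance part is what distinguishes the per-host service principal `hbase/r2u3s1.example.local@EXAMPLE.COM` from the plain `hbase@EXAMPLE.COM`.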
  • 45. KDC Implementation Options
      •  Microsoft Active Directory
          •  Users
          •  Service Principals
      •  MIT Kerberos
          •  Users
          •  Service Principals
      •  MIT Kerberos + Microsoft Active Directory (Trust Relationship)
          •  Users in Active Directory
          •  Service Principals in MIT
  • 46. What Ambari 2.0 will do
      •  Step-by-step wizard to set up Kerberos
      •  Supports existing MIT KDC and Active Directory (AD) infrastructure
      •  Deploys and manages Kerberos clients and their configuration
      •  Covers first-time setup as well as new Services, Hosts and Components
      •  Automated creation of principals
      •  Automated generation of keytabs
      •  Automated distribution of keytabs
      •  Support for regeneration and distribution of keytabs
  • 47. Prerequisites
      •  General
          •  Ambari Server must be part of the cluster
          •  Ambari Server and all hosts must have JCE installed
          •  Ambari Server and all hosts must have network access to the KDC
      •  KDC Admin
          •  KDC admin account credentials are on hand
          •  Note: Ambari does not retain KDC admin credentials
      •  Active Directory
          •  Secure LDAP (LDAPS) connectivity has been configured
          •  User container for principals has been created and is on hand
          •  Admin account has delegated control of “Create, delete and manage user accounts”
  • 48. Terminology
      •  Service Principals: principals required for HDP Service Components.
      •  Ambari Principals: headless principals used by Ambari to perform “smoke tests” and “health alert checks”.
      •  KDC Admin Account: an administrative account that will be used by Ambari to create principals and generate keytabs in the KDC.
  • 49. Principal and Keytab Generation and Distribution
      1.  User provides KDC Admin Account credentials to Ambari
      2.  Ambari connects to the KDC and creates the principals (Service and Ambari) needed for the cluster
      3.  Ambari generates keytabs for the principals
      4.  Ambari distributes keytabs to the Ambari Server and cluster hosts
      5.  Ambari discards the KDC Admin Account credentials
      [Diagram: Ambari Server, KDC and HDP Cluster, annotated with step numbers 1 to 5]
  • 50. Ambari + Service Keytab Files
      •  Ambari Server: keytabs for Ambari Principals
      •  HDP Cluster Hosts: keytabs for Service + Ambari Principals
      •  KDC: Ambari and Service Principals
  • 51. Wizard Driven and Automated
  • 52. KDC and KDC Admin Information
  • 53. Customizable Principal Attributes
  • 54. Kerberos Clients
      •  Ambari installs Kerberos clients on cluster hosts
      •  Optionally, Ambari can leave the krb5.conf client configuration unmanaged
      OS              | Client
      RHEL/CentOS/OEL | krb5-workstation
      SLES 11         | krb5-client
      Ubuntu 12       | krb5-user, krb5-config
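For hosts where the Kerberos clients are installed by hand, the package table on slide 54 translates into a small lookup. This is a sketch only: Ambari performs the installation itself, the package names come from the slide, and the package-manager commands and the names `KERBEROS_CLIENT_PACKAGES`, `INSTALLERS` and `install_command` are assumptions added for illustration.

```python
# Package names are taken from slide 54.
KERBEROS_CLIENT_PACKAGES = {
    "RHEL/CentOS/OEL": ["krb5-workstation"],
    "SLES 11": ["krb5-client"],
    "Ubuntu 12": ["krb5-user", "krb5-config"],
}

# Assumption: the usual package manager for each OS family (not stated on the slide).
INSTALLERS = {
    "RHEL/CentOS/OEL": ["yum", "install", "-y"],
    "SLES 11": ["zypper", "--non-interactive", "install"],
    "Ubuntu 12": ["apt-get", "install", "-y"],
}


def install_command(os_family: str) -> list:
    """Return the argument list that would install the Kerberos client packages."""
    if os_family not in KERBEROS_CLIENT_PACKAGES:
        raise KeyError(f"no Kerberos client packages listed for {os_family!r}")
    return INSTALLERS[os_family] + KERBEROS_CLIENT_PACKAGES[os_family]


if __name__ == "__main__":
    for family in KERBEROS_CLIENT_PACKAGES:
        print(family, "->", " ".join(install_command(family)))
```

The function only returns the command; running it (for example with `subprocess.run`) would require root privileges on the host.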
  • 55. Configure Ambari and Service Identities
  • 56. Post-Kerberos Scenarios
      Ambari does not retain KDC admin credentials, so the user is prompted for them on each of these operations:
      •  Add/Delete Host
      •  Add Service
      •  Add/Delete Component
      •  Regenerate Keytabs
      •  Disable Kerberos
  • 57. Security Lab
  • 58. Security today in Hadoop with HDP
      •  Authentication (Who am I/prove it?)
          •  Kerberos in native Apache Hadoop
          •  HTTP/REST API secured with Apache Knox Gateway
      •  Authorization (Restrict access to explicit data)
          •  HDFS, Hive and HBase, Storm and Knox
          •  Fine-grained access control
      •  Audit (Understand who did what)
          •  Centralized audit reporting
          •  Policy and access history
      •  Data Protection (Encrypt data at rest & in motion)
          •  Wire encryption in Hadoop
          •  Orchestrated encryption with partner tools
      •  HDP2.1 Ranger: Centralized Security Administration
      More on Security: http://hortonworks.com/partners/learn/#secure
  • 59. Ambari 2.0 Security Lab Steps
      •  Detailed steps available at: http://bit.ly/1J4IbIs
      •  Install Ambari server and agents
      •  Deploy custom Ambari service folders for OpenLDAP, KDC/kadmin, NSLCD
      •  Use blueprint/API to provision a minimal Hadoop cluster with custom services
      •  Use the Add Service wizard to also install Hive
      •  Configure Ambari to sync/recognize business users in OpenLDAP
      •  Authentication: run the Ambari security wizard to enable Kerberos using the KDC/kadmin service
      •  Install Ranger as an Ambari service and configure it to recognize LDAP users
      •  Authorization/Audit: set up the HDFS/Hive plugins to allow users to set policies to control access and audit consumption