Cloudera Manager – APIs & Extensibility
Bala Venkatrao, Products@Cloudera
December 2013

Cloudera Manager
End-to-End Administration for CDH

1. Manage: Easily deploy, configure & optimize clusters
2. Monitor: Maintain a central view of all activity
3. Diagnose: Easily identify and resolve issues
4. Integrate: Use Cloudera Manager with existing tools

©2013 Cloudera, Inc. All Rights Reserved.
Integrating with your IT Mgmt tools

Various options for integrating Cloudera Manager into your existing Datacenter Operations/Tools:
[Diagram: Cloudera Manager connected to Datacenter Operations tools: installation and deployment tools (e.g. Chef, Puppet etc.) and monitoring and alerting tools (e.g. Nagios, SNMP, Orion, Tivoli, BMC etc.)]

• Cloudera Manager API
  • Introduced in CM4 (June 2012)
  • Installation & deployment
  • Monitoring
  • Hadoop Operations
  • And more…
• SNMP Alerts
  • Introduced in CM4.5 (Feb 2013)
• Monitoring ‘tsquery’ (Feb 2013)
• User-defined triggers/alarms (new for C5!)
• Service extensibility (new for C5!)
Cloudera Manager (CM) API
• API access was introduced in Cloudera Manager 4.0. It provides programmatic access to cluster operations (such as configuration and restart) and monitoring information (such as health and metrics).
• The CM API is an HTTP REST API that uses JSON serialization. It is served on the same host and port as the CM web UI and requires no extra process or configuration. API users have the same privileges as they do in the web UI.
• Docs & Examples
  http://cloudera.github.io/cm_api/
  https://github.com/cloudera/cm_api
• Java/Python clients
  http://blog.cloudera.com/blog/2013/05/how-to-automate-your-hadoop-cluster-from-java/
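
To make the REST interface described on this slide concrete, here is a minimal Python sketch that builds an authenticated request and pulls cluster names out of a JSON reply. The host name, the admin/admin credentials, port 7180, the /api/v1/clusters path and the sample reply are assumptions chosen for illustration, not values from the slides; check the API docs linked above for what your CM release exposes.

```python
import base64
import json
import urllib.request


def build_cm_request(host, path, user, password, port=7180, version="v1"):
    """Build (but do not send) a basic-auth request for a CM API path.

    Port 7180 and the "v1" version segment are assumed defaults.
    """
    url = f"http://{host}:{port}/api/{version}/{path.lstrip('/')}"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


def cluster_names(reply_text):
    """Extract cluster names from a JSON reply shaped like {"items": [...]}."""
    return [item["name"] for item in json.loads(reply_text).get("items", [])]


# Hypothetical reply body, used so the sketch runs without a CM server.
SAMPLE_REPLY = '{"items": [{"name": "Cluster 1", "version": "CDH4"}]}'

if __name__ == "__main__":
    req = build_cm_request("cm.example.com", "clusters", "admin", "admin")
    print(req.full_url)
    # urllib.request.urlopen(req) would send the request to a real server.
    print(cluster_names(SAMPLE_REPLY))
```

Because the API shares the web UI's host, port and user accounts, the only inputs are the ones an administrator already uses to log in.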

Examples of integration with CM API
• Installation & Deployment
  • Chef/Puppet
  • Dell Crowbar
    http://blog.cloudera.com/blog/2013/08/how-to-deploy-hadoop-clusters-automatically-with-dell-crowbar-and-cloudera-manager/
  • StackIQ
    http://web.stackiq.com/blog/bid/312064/StackIQ-Cluster-Manager-now-integrated-with-Cloudera
  • WANdisco – non-stop NN setup
  • Several other customers/partners leveraging the APIs as part of their install & deployment process
• Monitoring & Alerting
  • Oracle Enterprise Manager (via Big Data Appliance)
  • Nagios
    https://github.com/cloudera/cm_api/tree/master/nagios
    https://github.com/harisekhon/nagios-plugins/blob/master/check_hadoop_cloudera_manager_metrics.pl
  • SNMP alerts integration with IBM Netcool

Develop & Contribute your plug-ins using Cloudera Manager API
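
As a rough idea of what a monitoring plug-in built on the API might do, the Python sketch below turns a service's health summary into a Nagios-style exit code and status line. The healthSummary field and the GOOD/CONCERNING/BAD values are assumed from the CM API's service representation, and the mapping to Nagios states is one possible choice rather than what the plug-ins linked above implement.

```python
import json

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = ["OK", "WARNING", "CRITICAL", "UNKNOWN"]

# Assumed mapping from CM health summaries to Nagios states;
# any other summary value is reported as UNKNOWN.
HEALTH_TO_NAGIOS = {"GOOD": OK, "CONCERNING": WARNING, "BAD": CRITICAL}


def nagios_status(service_json):
    """Return (exit_code, status_line) for one service's JSON description."""
    service = json.loads(service_json)
    health = service.get("healthSummary", "NOT_AVAILABLE")
    code = HEALTH_TO_NAGIOS.get(health, UNKNOWN)
    name = service.get("name", "?")
    return code, f"{LABELS[code]} - {name} health is {health}"


if __name__ == "__main__":
    # Hypothetical service description; a real plug-in would fetch it over HTTP.
    code, line = nagios_status('{"name": "hdfs1", "healthSummary": "CONCERNING"}')
    print(line)
    raise SystemExit(code)
```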
Cloudera Manager – Monitoring via ‘tsquery’
• Introduced as part of the CM4.5 release (Feb 2013)
• A way to add charts beyond those provided by default and to monitor the metrics that are relevant to your clusters
• The tsquery language is used to specify statements for retrieving time-series data from the Cloudera Manager time-series data store
• Example: How do I compare disk IO across all the DataNodes that belong to a specific HDFS service?
    select bytes_read, bytes_written where roleType=DATANODE and serviceName=hdfs1
• Retrieved time-series data can be plotted in various ways: line, bar, scatter, heat map, table list etc.
• This concept is being extended to create user-defined triggers/alarms (new for C5!)
• More details
  http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM5/latest/Cloudera-Manager-Diagnostics-Guide/cm5dg_chart_time_series_data.html
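
A tsquery statement is plain text, so it is easy to assemble in a script. The Python sketch below composes the DataNode example from this slide and URL-encodes it for an HTTP call. The /timeseries path, its query parameter, the v6 version segment and the host name are assumptions to verify against the API docs for your release.

```python
from urllib.parse import urlencode


def tsquery(metrics, **filters):
    """Compose 'select <metrics> where <k>=<v> and ...' from its parts."""
    statement = "select " + ", ".join(metrics)
    if filters:
        statement += " where " + " and ".join(f"{k}={v}" for k, v in filters.items())
    return statement


def timeseries_url(base, statement, version="v6"):
    """Build a URL carrying the statement as an encoded query parameter.

    The endpoint path and version segment are assumed, not taken from the slides.
    """
    return f"{base}/api/{version}/timeseries?{urlencode({'query': statement})}"


if __name__ == "__main__":
    q = tsquery(["bytes_read", "bytes_written"], roleType="DATANODE", serviceName="hdfs1")
    print(q)
    print(timeseries_url("http://cm.example.com:7180", q))
```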

Examples of Cloudera Manager ‘tsquery’

Example 1: How do I track the aggregate Cluster Disk IO?
    select dt0(read_bytes_disk_sum), dt0(write_bytes_disk_sum)
    where category = CLUSTER and clusterId = $CLUSTERID

Example 2: How do I compare CPU usage across hosts?
    select dt0(total_cpu_user) / getHostFact(numCores, 1) * 100,
           dt0(total_cpu_system) / getHostFact(numCores, 1) * 100,
           dt0(total_cpu_nice) / getHostFact(numCores, 1) * 100,
           dt0(total_cpu_iowait) / getHostFact(numCores, 1) * 100,
           dt0(total_cpu_irq) / getHostFact(numCores, 1) * 100,
           dt0(total_cpu_soft_irq) / getHostFact(numCores, 1) * 100

Create & Contribute your ‘tsqueries’!
https://github.com/cloudera/cm_charting_scrapbook
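
The arithmetic in Example 2 can be worked through in a few lines of Python. The sketch assumes that dt0 means "rate of change between consecutive samples, with negative rates replaced by zero" and that the CPU metrics are cumulative CPU-seconds; both are readings of the tsquery documentation rather than statements from the slides, and the sample values are made up.

```python
def dt0(samples):
    """Per-second rate between consecutive (timestamp, value) samples.

    Negative rates (e.g. after a counter reset) are clamped to zero.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        rates.append((t1, max(0.0, (v1 - v0) / (t1 - t0))))
    return rates


def cpu_percent(samples, num_cores):
    """Mirror 'dt0(metric) / numCores * 100' for one host."""
    return [(t, rate / num_cores * 100) for t, rate in dt0(samples)]


if __name__ == "__main__":
    # Hypothetical cumulative CPU-seconds sampled every 10 seconds on a 4-core host.
    user_cpu = [(0, 0.0), (10, 20.0), (20, 10.0)]
    print(cpu_percent(user_cpu, num_cores=4))
```

Dividing by the core count normalizes hosts of different sizes to the same 0 to 100 scale, which is what makes the cross-host comparison meaningful.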
Cloudera as an Application Platform

ISV’s view of a Database: Core Database, with
• Workload Mgmt
• Drivers (JDBC/ODBC)
• Security Mgmt
• Data Access APIs
• Systems Mgmt

ISV’s view of an OS: Core OS kernel, with
• Package Mgmt
• Process/Resource Mgmt
• Security Mgmt
• Data Access APIs
• Systems Mgmt

Cloudera as an Application Platform

ISV’s view of Cloudera: CDH, with
• Package Mgmt
• Workload/Process Mgmt
• Security Mgmt
• Data Access APIs
• Drivers (JDBC/ODBC)
• Systems Mgmt

Cloudera Platform Features

Package Mgmt
  Description: Ability to easily package and distribute binaries/jars via “Parcels”
  Examples: Informatica, Syncsort, LZO libraries

Workload/Process Mgmt
  Description: Ability to deploy applications as stand-alone processes or via YARN* on the Hadoop cluster; isolation of cluster resources
  Examples: SAS, 0xData, Accumulo, Spark

Security Mgmt
  Description: Support for Kerberos Mgmt; role-based access control for Tables/Views in Hive/Impala via Sentry

Data Access APIs
  Description: HDFS API, HBase API, Search API, Spark API; Kite (formerly Cloudera Development Kit)
  Examples: Causata, Basis Tech, CounterTack, Amdocs

Drivers
  Description: ODBC/JDBC drivers for Hive/Impala
  Examples: Zoomdata, Tableau, Microstrategy, Qlikview

Systems Mgmt
  Description: End-to-End management of an application via Cloudera Manager (CM)
    • Manage: Deploy and upgrade (rolling) services and pkgs; manage configurations
    • Monitor: Proactive health checks; track resource utilization; custom metrics charts
    • Diagnose: Distributed log collection and searching; tag and track key events
    • Integrate: Access CM via API
  Examples: StackIQ, Dell Crowbar, Oracle OEM

* Support for YARN planned as part of CM5.x in FY14

Example – Deployment via Parcels

The platform for Big Data + The ETL app for Hadoop

• Smarter Deployment & Administration: Seamless integration with Cloudera Manager for one-click deployment and easier administration
• Smarter Architecture: No code generation. ETL engine runs natively within Hadoop MapReduce, via plugin included in CDH 4.2
• Smarter Monitoring: Comprehensive logging capabilities + activity monitoring through Cloudera Manager

How it works
1. Download the Syncsort DMX-h “Parcel” file to your custom repository
   The file contains everything you need to properly deploy Syncsort DMX-h ETL Edition on Cloudera
2. Distribute & activate the DMX-h parcel on your Cloudera cluster
   A. Find Nodes: Enter the names of the hosts which will be included in the Hadoop cluster. Click Continue.
   B. Install Components: Cloudera Manager automatically installs the CDH components on the hosts you specified.
   C. Assign Roles: Verify the roles of the nodes within your cluster. Make changes as necessary.
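
The download, distribute and activate sequence can also be driven through the CM API instead of the wizard. The Python sketch below only builds the three command URLs in order; the endpoint layout, the v3 version segment, the cluster name, the product name and the parcel version are all assumptions for illustration and should be checked against the API docs before use.

```python
from urllib.parse import quote

# Assumed parcel commands, in the order a parcel moves through its lifecycle.
PARCEL_COMMANDS = ["startDownload", "startDistribution", "activate"]


def parcel_command_urls(base, cluster, product, version, api="v3"):
    """Return the command URLs for one parcel, in lifecycle order."""
    prefix = (
        f"{base}/api/{api}/clusters/{quote(cluster)}"
        f"/parcels/products/{quote(product)}/versions/{quote(version)}"
    )
    return [f"{prefix}/commands/{command}" for command in PARCEL_COMMANDS]


if __name__ == "__main__":
    # Hypothetical cluster, product and version names.
    for url in parcel_command_urls("http://cm.example.com:7180", "Cluster 1", "DMX-h", "1.0"):
        print("POST", url)
```

A real script would wait for each stage to finish before issuing the next command.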

Syncsort DMX-h + Cloudera Manager
[Diagram: Cloudera Manager provides support, integration, monitoring, management and installation through its API to the CDH cluster + ISV software, with Syncsort DMX-h on every CDH node.]

Get a 360° View of Your Cluster, Including DMX-h Logs
• View service health & performance
• Get host-level snapshots
• Monitor & diagnose workloads
• Gather, view & search Hadoop & DMX-h logs
• …And more!!

Build and distribute your own Parcels via Cloudera Manager and share them with the community!
Service Extensibility
• Introduced in C5
  • Still in Beta!
• Single management console for CDH, non-CDH services and ISV applications
• Similar look and feel as existing services
• Easy to write (Java-free!)
• Flexible
• Independent release cycle
So... how does it work?
• A JSON file that describes your service
• A set of control scripts
• Packaged as a JAR file
• As promised, Java-free
Example: Cloudera Manager Extensions – Spark
[Screenshots of the Spark extension in Cloudera Manager]
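The Code

Service descriptor (JSON):

    {
      "name": "spark",
      "roles": [{
        "name": "master",
        "startRunner": {
          "program": "scripts/control.sh",
          "args": ["start_master", "./params.properties"]
        },
        "parameters": [{
          "name": "master_port",
          "type": "port",
          "default": 7077
        }],
        "configWriter": {
          "generators": [{
            "filename": "params.properties"
          }]
        }
      }]
    }

Control script (scripts/control.sh):

    #!/bin/bash
    # The first argument names the command to run.
    CMD=$1
    # Read the port from ./params.properties.
    # Assumption: the file holds key=value lines; the slide elides this step.
    MASTER_PORT=$(sed -n 's/^master_port=//p' ./params.properties)
    timestamp=$(date)

    case $CMD in
      (start_master)
        # Replace this shell with the Spark master process.
        exec "$SPARK_HOME/scripts/spark-start.sh" master
        ;;
      (*)
        echo "$timestamp Don't understand [$CMD]"
        exit 1
        ;;
    esac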
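
To show how the two pieces relate, the Python sketch below reads a descriptor like the one on this slide, derives the command line for a role from its startRunner, and renders params.properties from the parameter defaults. How Cloudera Manager actually interprets the descriptor is not shown in the slides; the key=value file layout and both helper functions are hypothetical.

```python
import json

# The slide's descriptor, written as strict JSON (quoted keys).
DESCRIPTOR = """
{
  "name": "spark",
  "roles": [{
    "name": "master",
    "startRunner": {
      "program": "scripts/control.sh",
      "args": ["start_master", "./params.properties"]
    },
    "parameters": [{"name": "master_port", "type": "port", "default": 7077}],
    "configWriter": {"generators": [{"filename": "params.properties"}]}
  }]
}
"""


def start_command(role):
    """Program followed by its arguments, as a list suitable for exec."""
    runner = role["startRunner"]
    return [runner["program"], *runner["args"]]


def render_properties(role, overrides=None):
    """Render parameters as key=value lines (assumed file format)."""
    overrides = overrides or {}
    lines = [
        f"{p['name']}={overrides.get(p['name'], p['default'])}"
        for p in role["parameters"]
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    master = json.loads(DESCRIPTOR)["roles"][0]
    print(" ".join(start_command(master)))
    print(render_properties(master), end="")
```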
Next Steps
• Documentation & SDK as part of C5 Beta2 or later (definitely before GA!)
• Working with select ISVs (SAS, 0xData etc.) as part of Beta to further fine-tune this feature

Develop & Contribute your Cloudera Manager service extensibility plug-ins!
Vision of CM Extensibility
[Diagram: CM and CDH at the centre. Vertical Extension (Service Extensibility): 0xData, SAS, Syncsort, Informatica, Revolution, Accumulo, Spark, Giraph. Horizontal Extension (API, SNMP): Ops Apps, Capacity Mgr, SLA Mgr, Cost Optimizer, Security ISVs, Oracle OEM, Nagios, Dell, Chef/Puppet.]
Q&A
• If you are interested in learning more, participating in the Beta, or contributing plug-ins or Apps, contact: bala@cloudera.com
Appendix/Resources
• Systems Management
  • Cloudera Manager API
    http://cloudera.github.io/cm_api/
    http://blog.cloudera.com/blog/2013/05/how-to-automate-your-hadoop-cluster-from-java/
• Package Management
  • Docs on Parcels
    http://training.cloudera.com/elearning/Parcels/
    http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/latest/Cloudera-Manager-Introduction/cmi_primer.html
    http://blog.cloudera.com/blog/2013/05/faq-understanding-the-parcel-binary-distribution-format/
    http://blog.cloudera.com/blog/2013/07/one-engineers-experience-with-parcel/
• Data Access APIs
    http://blog.cloudera.com/blog/2013/05/cloudera-development-kit-cdk/
    https://github.com/cloudera/cdk
• Workload/Resource Management
  • Cloudera Manager 5 documentation
    http://cloudera.com/content/cloudera-content/cloudera-docs/CM5/latest/Cloudera-Manager-Managing-Clusters/cm5mc_managing_resources.html
    http://blog.cloudera.com/blog/2013/05/how-the-sas-and-cloudera-platforms-work-together/
• Security Management
    http://blog.cloudera.com/blog/2013/07/with-sentry-cloudera-fills-hadoops-enterprise-security-gap/

More Related Content

What's hot

Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski
Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski
Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski
buildacloud
 
Enhancing OpenStack FWaaS for real world application
Enhancing OpenStack FWaaS for real world applicationEnhancing OpenStack FWaaS for real world application
Enhancing OpenStack FWaaS for real world application
openstackindia
 

What's hot (20)

A Tour of Internal Accumulo Testing
A Tour of Internal Accumulo TestingA Tour of Internal Accumulo Testing
A Tour of Internal Accumulo Testing
 
Mmik powershell dsc_slideshare_v1
Mmik powershell dsc_slideshare_v1Mmik powershell dsc_slideshare_v1
Mmik powershell dsc_slideshare_v1
 
Troubleshooting Apache Cloudstack
Troubleshooting Apache CloudstackTroubleshooting Apache Cloudstack
Troubleshooting Apache Cloudstack
 
Installing Hadoop / Spark from scratch
Installing Hadoop / Spark from scratchInstalling Hadoop / Spark from scratch
Installing Hadoop / Spark from scratch
 
Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski
Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski
Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski
 
Virtual Router in CloudStack 4.4
Virtual Router in CloudStack 4.4Virtual Router in CloudStack 4.4
Virtual Router in CloudStack 4.4
 
Enhancing OpenStack FWaaS for real world application
Enhancing OpenStack FWaaS for real world applicationEnhancing OpenStack FWaaS for real world application
Enhancing OpenStack FWaaS for real world application
 
CAPS: What's best for deploying and managing OpenStack? Chef vs. Ansible vs. ...
CAPS: What's best for deploying and managing OpenStack? Chef vs. Ansible vs. ...CAPS: What's best for deploying and managing OpenStack? Chef vs. Ansible vs. ...
CAPS: What's best for deploying and managing OpenStack? Chef vs. Ansible vs. ...
 
Why Your Apache Spark Job is Failing
Why Your Apache Spark Job is FailingWhy Your Apache Spark Job is Failing
Why Your Apache Spark Job is Failing
 
Whats new in Cloudstack 4.11 - behind the headlines
Whats new in Cloudstack 4.11 - behind the headlinesWhats new in Cloudstack 4.11 - behind the headlines
Whats new in Cloudstack 4.11 - behind the headlines
 
Cloud stack troubleshooting
Cloud stack troubleshooting Cloud stack troubleshooting
Cloud stack troubleshooting
 
Deploying OpenStack with Chef
Deploying OpenStack with ChefDeploying OpenStack with Chef
Deploying OpenStack with Chef
 
Guide - Migrating from Heroku to AWS using CloudFormation
Guide - Migrating from Heroku to AWS using CloudFormationGuide - Migrating from Heroku to AWS using CloudFormation
Guide - Migrating from Heroku to AWS using CloudFormation
 
OpenStack Keystone with LDAP
OpenStack Keystone with LDAPOpenStack Keystone with LDAP
OpenStack Keystone with LDAP
 
Introduction openstack-meetup-nov-28
Introduction openstack-meetup-nov-28Introduction openstack-meetup-nov-28
Introduction openstack-meetup-nov-28
 
Chef for OpenStack: OpenStack Spring Summit 2013
Chef for OpenStack: OpenStack Spring Summit 2013Chef for OpenStack: OpenStack Spring Summit 2013
Chef for OpenStack: OpenStack Spring Summit 2013
 
OpenStack in Enterprise
OpenStack in EnterpriseOpenStack in Enterprise
OpenStack in Enterprise
 
YARN
YARNYARN
YARN
 
Compute node HA - current upstream development
Compute node HA - current upstream developmentCompute node HA - current upstream development
Compute node HA - current upstream development
 
OpenStack Deployment with Chef Workshop
OpenStack Deployment with Chef WorkshopOpenStack Deployment with Chef Workshop
OpenStack Deployment with Chef Workshop
 

Viewers also liked

Hadoop trong triển khai Big Data
Hadoop trong triển khai Big DataHadoop trong triển khai Big Data
Hadoop trong triển khai Big Data
Nguyễn Duy Nhân
 

Viewers also liked (7)

Webinar: Productionizing Hadoop: Lessons Learned - 20101208
Webinar: Productionizing Hadoop: Lessons Learned - 20101208Webinar: Productionizing Hadoop: Lessons Learned - 20101208
Webinar: Productionizing Hadoop: Lessons Learned - 20101208
 
What the Enterprise Requires - Usability
What the Enterprise Requires - UsabilityWhat the Enterprise Requires - Usability
What the Enterprise Requires - Usability
 
Cloudera hadoop installation
Cloudera hadoop installationCloudera hadoop installation
Cloudera hadoop installation
 
Inside Flume
Inside FlumeInside Flume
Inside Flume
 
Hadoop trong triển khai Big Data
Hadoop trong triển khai Big DataHadoop trong triển khai Big Data
Hadoop trong triển khai Big Data
 
Livy: A REST Web Service For Apache Spark
Livy: A REST Web Service For Apache SparkLivy: A REST Web Service For Apache Spark
Livy: A REST Web Service For Apache Spark
 
Cloudera + MicrosoftでHadoopするのがイイらしい。 #CWT2016
Cloudera + MicrosoftでHadoopするのがイイらしい。 #CWT2016Cloudera + MicrosoftでHadoopするのがイイらしい。 #CWT2016
Cloudera + MicrosoftでHadoopするのがイイらしい。 #CWT2016
 

Similar to Cloudera User Group SF - Cloudera Manager: APIs & Extensibility

Pa cloudera manager-api's_extensibility_v2
Pa   cloudera manager-api's_extensibility_v2Pa   cloudera manager-api's_extensibility_v2
Pa cloudera manager-api's_extensibility_v2
ClouderaUserGroups
 

Similar to Cloudera User Group SF - Cloudera Manager: APIs & Extensibility (20)

Pa cloudera manager-api's_extensibility_v2
Pa   cloudera manager-api's_extensibility_v2Pa   cloudera manager-api's_extensibility_v2
Pa cloudera manager-api's_extensibility_v2
 
Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility
Cloudera User Group Chicago - Cloudera Manager: APIs & ExtensibilityCloudera User Group Chicago - Cloudera Manager: APIs & Extensibility
Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility
 
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
 
Cloudera GoDataFest Deploying Cloudera in the Cloud
Cloudera GoDataFest Deploying Cloudera in the CloudCloudera GoDataFest Deploying Cloudera in the Cloud
Cloudera GoDataFest Deploying Cloudera in the Cloud
 
BlueData Integration with Cloudera Manager
BlueData Integration with Cloudera ManagerBlueData Integration with Cloudera Manager
BlueData Integration with Cloudera Manager
 
Cloudera training: secure your Cloudera cluster
Cloudera training: secure your Cloudera clusterCloudera training: secure your Cloudera cluster
Cloudera training: secure your Cloudera cluster
 
Cloudera Director: Unlock the Full Potential of Hadoop in the Cloud
Cloudera Director: Unlock the Full Potential of Hadoop in the CloudCloudera Director: Unlock the Full Potential of Hadoop in the Cloud
Cloudera Director: Unlock the Full Potential of Hadoop in the Cloud
 
Cloudera Altus: Big Data in der Cloud einfach gemacht
Cloudera Altus: Big Data in der Cloud einfach gemachtCloudera Altus: Big Data in der Cloud einfach gemacht
Cloudera Altus: Big Data in der Cloud einfach gemacht
 
大数据数据治理及数据安全
大数据数据治理及数据安全大数据数据治理及数据安全
大数据数据治理及数据安全
 
Data platform modernization with Databricks.pptx
Data platform modernization with Databricks.pptxData platform modernization with Databricks.pptx
Data platform modernization with Databricks.pptx
 
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadCloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
 
Introducing Cloudera Director at Big Data Bash
Introducing Cloudera Director at Big Data BashIntroducing Cloudera Director at Big Data Bash
Introducing Cloudera Director at Big Data Bash
 
Hadoop on Cloud: Why and How?
Hadoop on Cloud: Why and How?Hadoop on Cloud: Why and How?
Hadoop on Cloud: Why and How?
 
One Hadoop, Multiple Clouds - NYC Big Data Meetup
One Hadoop, Multiple Clouds - NYC Big Data MeetupOne Hadoop, Multiple Clouds - NYC Big Data Meetup
One Hadoop, Multiple Clouds - NYC Big Data Meetup
 
One Hadoop, Multiple Clouds
One Hadoop, Multiple CloudsOne Hadoop, Multiple Clouds
One Hadoop, Multiple Clouds
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015
 
Multi-Tenant Operations with Cloudera 5.7 & BT
Multi-Tenant Operations with Cloudera 5.7 & BTMulti-Tenant Operations with Cloudera 5.7 & BT
Multi-Tenant Operations with Cloudera 5.7 & BT
 
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
 
Apache Accumulo Overview
Apache Accumulo OverviewApache Accumulo Overview
Apache Accumulo Overview
 

Recently uploaded

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Recently uploaded (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 

Cloudera User Group SF - Cloudera Manager: APIs & Extensibility

  • 1. Cloudera Manager – API’s & Extensibility Bala Venkatrao, Products@Cloudera December 2013 1
  • 2. Cloudera Manager End-to-End Administration for CDH Manage 1 Monitor 2 Diagnose 3 Integrate 4 Easily deploy, configure & optimize clusters Maintain a central view of all activity Easily identify and resolve issues Use Cloudera Manager with existing tools 2 ©2013 Cloudera, Inc. All Rights Reserved.
  • 3. Integrating with your IT Mgmt tools Datacenter Operations Various options of integrating Cloudera Manager into your existing Installation, Datacenter Operations/Tools Monitoring Alerting Deployment Tools tools Tools e.g. Orion, • Cloudera Manager API e.g. Chef, e.g Nagios, Tivoli, BMC Puppet etc. SNMP etc. etc. • Introduced in CM4 (June 2012) • Installation & deployment • Monitoring • SNMP Alerts • Introduced in CM4.5 (Feb 2013) • Hadoop Operations And more… Cloudera • Monitoring ‘tsquery’ (Feb 2013) Manager • User-defined triggers/alarms (new for C5!) • Service extensibility (new for C5!) 3 ©2013 Cloudera, Inc. All Rights Reserved.
  • 4. Cloudera Manager (CM) API • • API access was a feature introduced in Cloudera Manager 4.0, providing programmatic access to cluster operations (such as configuration and restart) and monitoring information (such as health and metrics). The CM API is an HTTP REST API, using JSON serialization. The API is served on the same host and port as the CM web UI, and does not require an extra process or extra configuration. API users have the same privileges as they do in the web UI world. • Docs & Examples http://cloudera.github.io/cm_api/ https://github.com/cloudera/cm_api • Java/Python clients http://blog.cloudera.com/blog/2013/05/how-toautomate-your-hadoop-cluster-from-java/ 4 ©2013Cloudera, Inc. All Rights Reserved.
  • 5. Examples of integration with CM API • Installation & Deployment • • Chef/Puppet Dell Crowbar • • StackIQ • • • • http://blog.cloudera.com/blog/2013/08/how-to-deploy-hadoop-clusters-automatically-withdell-crowbar-and-cloudera-manager/ http://web.stackiq.com/blog/bid/312064/StackIQ-Cluster-Manager-now-integrated-withCloudera WANdisco – non-stop NN setup Several other customers/partners leveraging the API’s as part of their install & deployment process Monitoring & Alerting • • Oracle Enterprise Manager (via Big Data Appliance) Nagios • • https://github.com/cloudera/cm_api/tree/master/nagios https://github.com/harisekhon/nagiosplugins/blob/master/check_hadoop_cloudera_manager_metrics.pl Develop & Contribute your plug-in’s using Cloudera • SNMP alerts integration with IBM Netcool Manager API 5 ©2013 Cloudera, Inc. All Rights Reserved.
  • 6. Cloudera Manager – Monitoring via ‘tsquery’ • Introduced as part of CM4.5 release (Feb 2013) • Great way to add interesting charts (above & beyond what is provided by default) and monitor metrics that are relevant to your clusters • The tsquery language is used to specify statements for retrieving time-series data from the Cloudera Manager time-series data store • Example: How do I compare all disk IO for all the DataNodes that belong to a specific HDFS service? select bytes_read, bytes_written where roleType=DATANODE and serviceName=hdfs1 • Retrieved time-series data can be plotted via various options – line, bar, scatter, heat maps, table list etc. • Extending this concept to create user-defined triggers/alarms (new for C5!). • More details • 6 http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM5/latest/Cloudera-ManagerDiagnostics-Guide/cm5dg_chart_time_series_data.html ©2013 Cloudera, Inc. All Rights Reserved.
  • 7. Examples of Cloudera Manager ‘tsquery’ Example1: How do I track the aggregate Cluster Disk IO? select dt0(read_bytes_disk_sum), dt0(write_bytes_disk_sum) where category = CLUSTER and clusterId = $CLUSTERID Example2: How do I compare CPU usage across hosts? select dt0(total_cpu_user) / getHostFact(numCores, 1) * 100, dt0(total_cpu_system) / getHostFact(numCores, 1) * 100, dt0(total_cpu_nice) / getHostFact(numCores, 1) * 100, dt0(total_cpu_iowait) / getHostFact(numCores, 1) * 100, dt0(total_cpu_irq) / getHostFact(numCores, 1) * 100, dt0(total_cpu_soft_irq) / getHostFact(numCores, 1) * 100 Create & Contribute your ‘tsqueries’! https://github.com/cloudera/cm_charting_scrapbook 7 ©2013 Cloudera, Inc. All Rights Reserved.
  • 8. Cloudera as an Application Platform
      • ISV’s view of a Database: Workload Mgmt, Drivers (JDBC/ODBC), Security Mgmt, Data Access APIs and Systems Mgmt, built around the Core Database
      • ISV’s view of an OS: Package Mgmt, Process/Resource Mgmt, Security Mgmt, Data Access APIs and Systems Mgmt, built around the Core OS kernel
  • 9. Cloudera as an Application Platform
      • ISV’s view of Cloudera: Package Mgmt, Workload/Process Mgmt, Security Mgmt, Data Access APIs, Drivers (JDBC/ODBC) and Systems Mgmt, built around CDH
  • 10. Cloudera Platform Features
      • Package Mgmt
          • Description: Ability to easily package and distribute binaries/jars via “Parcels”
          • Examples: Informatica, Syncsort, LZO libraries
      • Workload/Process Mgmt
          • Description: Ability to deploy applications as stand-alone processes or via YARN* on the Hadoop cluster; isolation of cluster resources
          • Examples: SAS, 0xData, Accumulo, Spark
      • Security Mgmt
          • Description: Support for Kerberos Mgmt; role-based access control for Tables/Views in Hive/Impala via Sentry
      • Data Access APIs
          • Description: HDFS API, HBase API, Search API, Spark API; Kite (formerly Cloudera Development Kit)
          • Examples: Causata, Basis Tech, CounterTack, Amdocs
      • Drivers
          • Description: ODBC/JDBC drivers for Hive/Impala
          • Examples: Zoomdata, Tableau, Microstrategy, Qlikview
      • Systems Mgmt
          • Description: End-to-end management of an application via Cloudera Manager (CM)
          • Examples: StackIQ, Dell Crowbar, Oracle OEM
          • Manage: Deploy and upgrade (rolling) services and pkgs; manage configurations
          • Monitor: Proactive health checks; track resource utilization; custom metrics charts
          • Diagnose: Distributed log collection and searching; tag and track key events
          • Integrate: Access CM via API
      * Support for YARN planned as part of CM5.x in FY14
  • 11. Example – Deployment via Parcels
      The platform for Big Data + The ETL app for Hadoop
      • Smarter Architecture: No code generation. ETL engine runs natively within Hadoop MapReduce, via plugin included in CDH 4.2
      • Smarter Deployment & Administration: Seamless integration with Cloudera Manager for one-click deployment and easier administration
      • Smarter Monitoring: Comprehensive logging capabilities + activity monitoring through Cloudera Manager
  • 12. How it works
      1. Download the Syncsort DMX-h “Parcel” file to your custom repository. The file contains everything you need to properly deploy Syncsort DMX-h ETL Edition on Cloudera
      2. Distribute & activate the DMX-h parcel on your Cloudera cluster
          A. Find Nodes: Enter the names of the hosts which will be included in the Hadoop cluster. Click Continue.
          B. Install Components: Cloudera Manager automatically installs the CDH components on the hosts you specified.
          C. Assign Roles: Verify the roles of the nodes within your cluster. Make changes as necessary.
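The distribute-and-activate step can also be driven through the CM API instead of the UI. The sketch below only assembles the REST paths for the usual parcel command sequence and sends nothing. The cluster name, product name, version string and API version are hypothetical, and the path layout should be checked against the API documentation for your release.

```python
from urllib.parse import quote

# Parcel lifecycle commands, in the order they would be issued.
PARCEL_COMMANDS = ["startDownload", "startDistribution", "activate"]


def parcel_command_paths(cluster, product, version, api_version="v6"):
    """Return the POST paths for downloading, distributing and activating a parcel.

    The path layout is an assumption to verify against your CM API docs.
    """
    base = "/api/%s/clusters/%s/parcels/products/%s/versions/%s" % (
        api_version,
        quote(cluster, safe=""),
        quote(product, safe=""),
        quote(version, safe=""),
    )
    return ["%s/commands/%s" % (base, command) for command in PARCEL_COMMANDS]


# Hypothetical cluster, product and version names.
for path in parcel_command_paths("Cluster 1", "DMXH", "1.0"):
    print("POST", path)
```

Each command is asynchronous on the server side, so a real client would poll the parcel's state between steps before issuing the next command.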
  • 13. Syncsort DMX-h + Cloudera Manager
      [Diagram: Cloudera Manager manages a CDH cluster plus ISV software. Syncsort DMX-h connects through the API for Installation, Management, Monitoring, Integration and Support. DMX-h runs on every CDH node.]
  • 14. Get a 360° View of Your Cluster, Including DMX-h Logs
      • View service health & performance
      • Get host-level snapshots
      • Monitor & diagnose workloads
      • Gather, view & search Hadoop & DMX-h logs
      • …And more!!
      Build and distribute your own Parcels via Cloudera Manager and share them with the community!
  • 15. Service Extensibility
      • Introduced in C5
          • Still in Beta!
      • Single management console for CDH, non-CDH services and ISV applications
          • Similar look and feel as existing services
      • Easy to write (Java-free!)
      • Flexible
          • Independent release cycle
  • 16. So, how does it work?
      • A JSON file that describes your service
      • A set of control scripts
      • Packaged as a JAR file
      • As promised, Java-free
  • 17. Example: Cloudera Manager Extensions - Spark
  • 19. Cloudera Manager Extensions: Spark
  • 20. Cloudera Manager Extensions: Spark
  • 21. Cloudera Manager Extensions: Spark
  • 22. The Code
      Service descriptor (JSON):
          name : "spark",
          roles : [{
            name : "master",
            startRunner : {
              program : "scripts/control.sh",
              args : [ "start_master", "./params.properties" ]
            },
            parameters : [{
              name : "master_port",
              type : "port",
              default : 7077
            }],
            configWriter : {
              generators : [{
                filename : "params.properties"
              }]
            }
          }]
      Control script (scripts/control.sh):
          #!/bin/bash
          CMD=$1
          MASTER_PORT=<read in from ./params.properties>
          case $CMD in
            (start_master)
              exec $SPARK_HOME/scripts/spark-start.sh master
              ;;
            (*)
              echo "$timestamp Don't understand [$CMD]"
              ;;
          esac
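The control script on this slide leaves open how `MASTER_PORT` is read from `params.properties`. As a rough sketch of that step, the Python below parses a properties file and dispatches on the command name. It assumes the generated file holds simple `key=value` lines, which the slide does not confirm, and the helper names and the `/opt/spark` path are hypothetical.

```python
def parse_properties(text):
    """Parse 'key=value' lines, skipping blanks and '#' comments.

    The key=value layout of params.properties is an assumption.
    """
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator:
            props[key.strip()] = value.strip()
    return props


def plan_command(cmd, props, spark_home):
    """Return (argv, env) for a known command, or None if it is not understood."""
    if cmd == "start_master":
        argv = [spark_home + "/scripts/spark-start.sh", "master"]
        env = {"MASTER_PORT": props.get("master_port", "7077")}
        return argv, env
    return None


# Hypothetical file contents and install path, for illustration only.
props = parse_properties("# written by the config writer\nmaster_port=7077\n")
print(plan_command("start_master", props, "/opt/spark"))
```

Returning the argument list and environment instead of executing them keeps the dispatch logic easy to test; a real control script would hand them to `exec`.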
  • 23. Next Steps
      • Documentation & SDK as part of C5 Beta2 or later (definitely before GA!)
      • Working with select ISVs (SAS, 0xData etc.) as part of Beta to further fine-tune this feature
      Develop & contribute your Cloudera Manager service extensibility plug-ins!
  • 24. Vision of CM Extensibility
      [Diagram: CDH and CM at the center, with Vertical Extension and Horizontal Extension axes. Labels shown: Service Extensibility (Accumulo, Spark, Giraph); ISVs (0xData, SAS, Syncsort, Informatica, Revolution); API and Ops Apps (Capacity Mgr, Security, SLA Mgr, Cost Optimizer); SNMP and API integrations (Oracle OEM, Nagios, Dell, Chef/Puppet).]
  • 25. Q&A
      • If you are interested in learning more, participating in the Beta, or contributing plug-ins or Apps, contact: bala@cloudera.com
  • 26. Appendix/Resources
      • Systems Management
          • Cloudera Manager API
              • http://cloudera.github.io/cm_api/
              • http://blog.cloudera.com/blog/2013/05/how-to-automate-your-hadoop-cluster-from-java/
      • Package Management
          • Docs on Parcels
              • http://training.cloudera.com/elearning/Parcels/
              • http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/latest/Cloudera-ManagerIntroduction/cmi_primer.html
              • http://blog.cloudera.com/blog/2013/05/faq-understanding-the-parcel-binary-distribution-format/
              • http://blog.cloudera.com/blog/2013/07/one-engineers-experience-with-parcel/
      • Data Access APIs
          • http://blog.cloudera.com/blog/2013/05/cloudera-development-kit-cdk/
          • https://github.com/cloudera/cdk
      • Workload/Resource Management
          • Cloudera Manager 5 documentation
              • http://cloudera.com/content/cloudera-content/cloudera-docs/CM5/latest/Cloudera-Manager-ManagingClusters/cm5mc_managing_resources.html
          • http://blog.cloudera.com/blog/2013/05/how-the-sas-and-cloudera-platforms-work-together/
      • Security Management
          • http://blog.cloudera.com/blog/2013/07/with-sentry-cloudera-fills-hadoops-enterprise-security-gap/