Please give me your feedback
– Use the mobile app to complete a session survey
1. Access “My schedule”
2. Click on the session detail page
3. Scroll down to “Rate & review”
– If the session is not on your schedule, just find it via the Discover app’s “Session Schedule” menu, click on this session, and scroll down
to “Rate & Review”
– If you have not downloaded our event app, please go to your phone’s app store and search on “Discover 2016 Las Vegas”
– Thank you for providing your feedback,
which helps us enhance content for future events.
Session ID: TB8568 Speaker: Gary Brandt and Ronnie Falgout
#HPEDiscover
Fast-track the value of central IT with HPE Operations Analytics
Ronnie Falgout, IT Delivery Manager
Gary Brandt, OpsA Product Manager
#HPEDiscover
@HPE_Discover
June 2016
Speaker biography/multiple speakers
Gary Brandt
Hewlett Packard Enterprise Software
gary.brandt@hpe.com
– Number of years in IT 17
– Previous experience (industry experience) 4
– Domain knowledge
– Operational Analytics
– IT Operations Management
– Enterprise Architecture
Ronnie Falgout
GIT Global Delivery
ronnie.falgout@hpe.com
– Number of years in IT 16
– Previous experience (industry experience) 26
– Domain knowledge
– IT Operations Management
– IT Automation and Service Management
– Operations Analytics
Outside forces are disrupting businesses and government
Internet of Things,
explosion of devices
New, disruptive business
models
The Idea Economy
Cloud is redefining how applications
and devices are written and delivered
No business, industry or
government is safe
Turning ideas into new
products or services has
never been easier
Inside forces are pushing you to evolve IT
Shadow IT is
everywhere
Technology is business
strategy
Developers are
the new Kingmakers
DevOps driving
culture shifts
Distributed compute
Distributed systems and
containers
Distributed
data
Data locality
and latency
Multi-cloud
brokerage
Management and
governance
Continuous
delivery
DevOps
speed
Analytics and
visualization
Real-time and
predictive
Backend
Frontend
Devices
Humans
App-to-app
Technology architectures are rapidly shifting
Traditional IT Digital enterprise
Provide hardened
systems and networks
Manage and mitigate risk
Efficiently host
workloads and services
Continuously create and
deliver new services
Store and manage data
Software automates
business systems
Software differentiates
products and services
Provide real-time
insight and understanding
IT must bridge the traditional and new
The right
balance between
traditional
and digital
The customer transformation journey
Automate, orchestrate and transform
Traditional
IT
Digital
enterprise
Transform
delivery
Orchestrate
processes
Automate
tasks
Focus on user experience
Gain customer engagement and loyalty
Leverage Big Data
Realize continuous improvement
Once implemented, these three steps will give you efficiency, agility, and experience.
IT Operations Management solutions
Simplified ITOM solution set
Automation
solutions
Intelligently drive efficiency across the
virtualized datacenter
Transform
solutions
Modernize customer experience for
cloud native and traditional applications
Orchestration
solutions
Increase speed of delivery in a
heterogeneous, hybrid cloud environment.
Datacenter automation
Operations bridge
Service management automation
Cloud orchestration
Service Broker
User experience management
Solutions are delivered in a simple, consistent way to drive TTV.
SaaS | software appliance | remote managed service
System Network Storage
Analytics in IT
HPE
Operations
Analytics
Market trends and growth drivers
ITOA
growth drivers
Customer
expectations
– Outdated systems
– Point tools limitations
– Complex diverse environments
– Flexible, scalable architecture
– Transform data into intelligence
– Problem detection and prediction
Servers Network Storage
All green doesn’t always mean
all-clear
Analyzing performance problems is hard
Servers Network Storage
One single problem can
trigger multiple events
?
Limited
view into
resource utilization
Hidden
performance issues
and trends
Low
visibility across
OneView domains
The answer lies in your data
But how do you make sense of it?
siloed data sources
types of data
device types
different operating systems
data per server/day
Mobile app
Network
Cloud
System LOB data
Storage
Results
Reduce outages
Faster resolution
Optimize
resources
Increased
productivity
Introducing HPE Operations Analytics
Predictive
analytics
Machine
learning
Relationship score
Automated
log and event
analysis
Anomaly
detection
and alerting
Visual
analytics
and RCA
HPE Operations
analytics
Standalone, scalable platform
HPE Vertica
Data types
Mobile app
Network
Cloud
System LOB data
Storage
Behavioral learning
Clustering
Predicting
future behavior
Event analytics
Anomaly detection
Unstructured
text indexing,
search and inference
Machine learning powers HPE Operations Analytics
Developed in collaboration with HPE Labs
Machine
Learning
Predictive
analytics
Relationship score
Automated
log and event
analysis
Anomaly
detection
and alerting
Visual
analytics
and RCA
HPE Operations
Analytics
Standalone, scalable platform
Machine
learning
Overview of Operations Analytics
HPE Operations Analytics
Key features
Log and event analytics
Focus on relevant items
for quicker resolution
Automated analysis of
logs and events
HPE Operations Analytics
Key features
Root Cause Analysis (RCA)
Identify when problems start
Visual analytics
Clear, intuitive dashboards
Performance
heat map
Performance
overview
HPE Operations Analytics
Key features
Advanced log search
Deep-dive into messages
Relationship score
Connection between metrics
Smart filter
Relationship score
HPE Operations Analytics
Key features
Predictive analytics
Forecast future performance with one click
Anomaly alerting
Real-time problem warnings
Predict button
Dynamic
baselines
HPE Operations Analytics
Standard use cases
Anomaly detection and
troubleshooting
Historical and
predictive analytics
Business insights
Big Data store and analysis
HPE IT Operations Analytics
How HPE IT uses Big Data in IT operations
HPE IT key operational data
Number of incident
tickets per month
Average help desk
calls per month
Average number of
major incidents/
meetings per
month
Proactive monitoring
of planned changes
per month
Configuration items in uCMDB
68,770
48,000
5,000,000
57,000
66,000
2,000
Servers
Network devices
Applications
900
300/800
Scheduled jobs executed per
month
19,000,000
Event notifications sent
per month
1,500
4 Private cloud
datacenters
>8,000 simulated
transactions
365 global
locations
4 traditional
datacenters
Troubleshooting without Operations Analytics
– Many subject matter experts
involved in major incidents
– Manual analysis in isolation
– Manual correlation of data
– Long time to identify root cause
Operation support
team
Application SME
Network SME
Security SME
Server SME
Application ecosystem
Physical or virtual server
Business
application
Network
Storage
Storage SME
Database SME
Troubleshooting with Operations Analytics
– All relevant data in a single dashboard
– Data is timely and correlated
– Data easily viewed in visual analytics
– Historical view of data instantly
available
– Faster time to identify root cause with
fewer people involved
OpsA Operation support
Application ecosystem
Physical or virtual server
Business
application
Network
Storage
HPE Operations Analytics in HPE IT
Trend
analysis
Predictive
insights
Anomaly
detection
Unknown
root cause
resolution
Application
(SiteScope)
Network
(Network Node Mgr +iSPIs)
Cloud VMs
(Operations Agents)
System Perf
(Operations Agents)
Third-party tools
(SCOM, Lync, Exchange)
Event Data
(OMi)
Metrics
Events
Topology
Logs
Log Data
(ArcSight)
Operations Analytics is an analytics platform that lets IT proactively manage its operational performance and reduce mean time to repair. It can take in data from all of these sources and use different data types: not only performance metrics and events, but also topology data and logs.
Big Data Store
Vertica
HPE Operations Analytics
Visual
analytics
Automated log
analytics
Predictive
analytics
Content
framework
Intelligent
search
Guided
troubleshooting
8 datacenters
2600 apps
25K databases
57K servers
5M objects
66K network devices
Custom Data
(CSV, XML)
HPE IT use cases and scenarios
Widespread production network outage in HPE IT
Widespread production network outage in HPE IT
Transient network outage
Affecting multiple HPE work sites
“All hands on deck” production issue
Network monitoring causing event storm
Saturated support teams
Stream of incidents causing noise
1000s of critical network
events detected in BSM
Analytics-based abnormality detection
Solution
– OpsA is Big Data
– Correlate multiple sources to narrow down the problem
– Identify patterns in huge amounts of network syslog data
– Patterns reveal the leading cause
Pattern detected in log data that reveals the problem
Correlate metrics and logs with problem time (network events)
Benefit
– Huge time savings (less than 30 min. to find the cause)
– Faster restoration of service
– Fewer SMEs required to troubleshoot
HPE IT use cases and scenarios
Troubleshooting application and database
production issues
Troubleshooting application and database production issues
Challenge
– Solving problems before performance is affected
– Siloed teams mean no big picture
ACT
Analyze millions of messages to reveal root cause with automated log
analytics
Automatically reveals time
and count of most significant
log events.
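The slides do not describe how the log analytics decides which events are significant. As a rough illustration only, the sketch below substitutes plain frequency counting: it collapses the variable parts of each message into a template and counts templates per hour, which yields a "time and count" view. The log format, the function names, and the sample lines are all hypothetical.

```python
import re
from collections import Counter

def template(message):
    """Collapse variable parts (hex ids, numbers) so similar messages group together."""
    message = re.sub(r"0x[0-9a-fA-F]+", "<HEX>", message)
    return re.sub(r"\d+", "<N>", message)

def significant_events(lines, top=3):
    """Count (hour, template) pairs and return the `top` most frequent ones."""
    counts = Counter()
    for line in lines:
        # Assumed format: "<ISO timestamp> <free-text message>"
        timestamp, _, message = line.partition(" ")
        hour = timestamp[:13]  # e.g. "2016-06-07T10"
        counts[(hour, template(message))] += 1
    return counts.most_common(top)

# Hypothetical sample log lines, not taken from the presentation.
sample_lines = [
    "2016-06-07T10:01:02 ERROR deadlock detected on session 41",
    "2016-06-07T10:05:40 ERROR deadlock detected on session 57",
    "2016-06-07T10:07:11 WARN connection pool exhausted after 30 s",
    "2016-06-07T11:00:00 ERROR deadlock detected on session 12",
]
top_event = significant_events(sample_lines, top=1)
```

Grouping by template rather than by raw text is what lets two messages that differ only in a session number count as the same event.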
Drill down to actual root cause log messages
Automatically reveals time
and count of most significant
log events.
View the log message
content to identify root cause
Number of occurrences of
significant log data over time
Troubleshooting application and database production issues
Solution
– Real-time dashboards
– Automated log analytics
Benefit
– Fast root cause identification (¼ the time)
– Fewer experts involved (5 SMEs to 1)
– Cuts order backlog by 50%
HPE IT 3PAR use case
OpsA increasing value to LOB
Hewlett Packard Enterprise 3PAR storage line of business
HPE premier storage business
Proactive “phone home” monitoring service available to 3PAR
customers
The service gives 3PAR customers the latest capabilities and
proactive protection against potential problems
HPE IT systems enable/support “phone home” services
Optimizing HPE 3PAR operations using Big Data analytics
Business challenge
– Isolate problems
– Reduce late file transfers
– Reduce overdue file transfers
– Difficult to isolate problem
– Manually interpreting behavior
Solution
– Near real-time metric collection using OpsA
– Define and measure big-picture view of 3PAR ecosystem
– OpsA baselines define “normal” behavior
– OpsA guided troubleshooting
Benefits
– Quickly identify what is not “normal”
– Faster to diagnose problems
– Eliminated manual efforts of collecting and correlating data
– Decreased Mean Time To Recover (MTTR)
Use baselines to define “what’s normal”
Baselining metrics helps IT define normal behavior
Starting point for troubleshooting
Analyze the ecosystem
– Define services that describe the ecosystem
– Quickly analyze the ecosystem
– Correlate metrics from disparate areas
– Identify areas of impact
Network metrics
(NFS call rate)
Application metrics
(processing rate)
Database metrics
(active session count)
Application metrics
(file queues)
File system metrics
(disk queues)
Identify trends and take action before a problem occurs
Dangerous rate of
file system growth
HPE IT use cases and scenarios
Analytics for predictive anomaly detection
@HPE office
@Remote
Remote Microsoft® Lync user experience
Challenge
– Thousands of HPE employees depend on Microsoft® Lync daily
– Complex infrastructure makes issues difficult to diagnose
Total PSTN conferences/week 14,909
PSTN (public switched telephone network)
Mobile users
Remote road warrior
users
Lync
application
Coffee houses
Airports
OpsA advanced troubleshooting
The correlation values vary from -1 to 1. The higher the absolute value of the correlation, the closer the relationship.
Metric A
Metric B
Calculated correlation score of metrics A and B
Challenge
Dozens of seemingly unrelated metrics from many disparate sources
Solution
Real-time analytic scoring determines how closely related multiple disparate metrics are to the problem
Outcome 90% statistical correlation between “Bad Requests Received” and Lync Edge Server authentication failures.
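The slide does not say how the correlation score is calculated. The Pearson correlation coefficient is one common statistic that ranges from -1 to 1, so the minimal sketch below assumes it purely for illustration. The function name and the sample series are hypothetical and are not product output.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical samples of two metrics collected at the same timestamps.
bad_requests = [3, 5, 9, 14, 20, 31]
auth_failures = [1, 2, 4, 7, 9, 15]
score = pearson(bad_requests, auth_failures)
```

A score near 1 or -1 marks the pair as worth investigating together. It does not by itself show that one metric causes the other.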
OpsA advanced troubleshooting, another example
The correlation values vary from -1 to 1. The higher the absolute value of the correlation, the closer the relationship.
Outcome “Sends Outstanding” performance metrics correlate 100% with server network errors.
Metric A Metric B
Calculated correlation
score of metrics A and B
Use analytics to determine how closely related multiple disparate metrics are to the problem
Analytics-based abnormality detection
Take action before a problem occurs
Performance metrics
Dynamic baseline
Zone of prevention
Fixed threshold
Near real-time alerting of anomaly trend
Solution
– Analytics to narrow the focus of troubleshooting
– Abnormality detection
– Alerts triggered by anomaly trends
Benefit
– Faster diagnostic time
– Less reaction, more prevention
– Automated correction
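The difference between a dynamic baseline and a fixed threshold can be sketched in a few lines. The example assumes a simple rolling baseline (mean plus or minus k standard deviations over the preceding samples). The slides do not state which baselining method the product uses, so the method, the function name, the window size, and the sample data are all hypothetical.

```python
import statistics

def baseline_anomalies(values, window=5, k=3.0):
    """Return indices whose value lies outside mean +/- k * stdev
    of the preceding `window` samples."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mean = statistics.fmean(history)
        spread = statistics.pstdev(history)
        # Guard against a zero spread when the history is perfectly flat.
        if abs(values[i] - mean) > k * max(spread, 1e-9):
            anomalies.append(i)
    return anomalies

FIXED_THRESHOLD = 90.0  # hypothetical static alert level, in percent
cpu = [40, 41, 39, 40, 42, 41, 40, 65, 41, 40]

# Samples that break the baseline while still below the fixed threshold
# fall in what the slide calls the zone of prevention.
early_warnings = [i for i in baseline_anomalies(cpu) if cpu[i] < FIXED_THRESHOLD]
```

Here the jump to 65 is flagged even though it never reaches the fixed threshold of 90, which is the point of alerting on deviation from normal rather than on an absolute limit.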
HPE IT use cases and scenarios
Miscellaneous examples
Event analysis
HPIT uses analytics to detect dangerous event patterns
Identified an HPIT application generating events at an increasing rate over a 5-day period.
Detect a specific HPIT application generating events in the 90th percentile (i.e. the ‘noisiest’ application).
Identify the risk of a dangerous pattern: an increasing trend of events spiking at midnight.
Support proactively took action before major problem occurred.
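One way to find the noisiest applications is to count events per application and keep those at or above the 90th percentile of those counts. The sketch below assumes the nearest-rank percentile definition and a hypothetical event format of (application, timestamp) pairs. Neither is stated in the slides.

```python
from collections import Counter

def noisiest_apps(events, percentile=90):
    """Return apps whose event count is at or above the given percentile
    (nearest-rank method) of per-application event counts."""
    counts = Counter(app for app, _ in events)
    if not counts:
        return []
    ranked = sorted(counts.values())
    # Nearest rank = ceil(percentile * n / 100), done with integer arithmetic.
    rank = max(1, -(-percentile * len(ranked) // 100))
    cutoff = ranked[rank - 1]
    return sorted(app for app, n in counts.items() if n >= cutoff)

# Hypothetical data: app0 emits 1 event, app1 emits 2, ... app9 emits 10.
events = [("app%d" % i, t) for i in range(10) for t in range(i + 1)]
noisy = noisiest_apps(events)
```

Running the same count per day over several days would show whether a noisy application is also trending upward, as in the 5-day example above.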
Breach in normal baseline
Breach in normal baseline
Dynamic normal baseline
HPIT applies predictive analytics to prevent problems
Predictive views of server performance behavior under specific workloads
35% Increase
1. Dynamic baselines are automatically created for all metrics collected for a server.
2. Server memory utilization metrics show an increase over time.
3. Applying predictive analytics to the server’s memory metrics predicts a 35% increase in memory utilization under the current workload over the next two weeks.
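The slides do not name the forecasting model behind the prediction. As a minimal sketch, the example below assumes the simplest option, a least-squares straight line fitted to evenly spaced samples and extended forward. The function names and the sample utilization values are hypothetical and do not reproduce the 35% figure from the slide.

```python
def fit_line(ys):
    """Least-squares slope and intercept for ys sampled at x = 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def forecast(ys, steps_ahead):
    """Extrapolate the fitted line `steps_ahead` samples past the last one."""
    slope, intercept = fit_line(ys)
    return intercept + slope * (len(ys) - 1 + steps_ahead)

# Hypothetical daily memory utilization, in percent.
memory_pct = [50.0, 51.0, 52.0, 53.0, 54.0, 55.0, 56.0]
in_two_weeks = forecast(memory_pct, 14)
relative_increase = (in_two_weeks - memory_pct[-1]) / memory_pct[-1]
```

A straight line ignores daily and weekly cycles, so a real forecast would likely need a seasonal model. The sketch only shows the idea of projecting a baseline forward.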
Applying predictive analytics to key applications
Predictive views of HPIT applications performance behavior
Predict future performance
patterns based on
historical baselines.
HPE IT’s OpsA journey
Continuous progress
Early 2014 Mid 2014 Late 2014 2015 2016 (Next)
New data sources
Expand to key
Applications
SiteScope
Integration
Application metrics
Introduce predictive
capabilities into
support
New data sources
Integrated OMi
Event data
Integrated Network
metrics
~60K devices
Database Logs
Analytics
~20K DBs
Analytics on key
applications and
Business (i.e. 3PAR)
OpsA PoCs
Troubleshooting
Microsoft®
Exchange Proof of
Concept
Expanded IT
Private Cloud 18K
virtual servers
Introduced OpsA
Server metrics IT
Private Cloud (~10K
virtual servers).
Cloud Infrastructure
Support team
Expand coverage
Traditional server
metrics (40K
servers) and virtual
cloud (+20K VMs)
Network
Outage
Opportunity
Apply analytics to
large scale
production network
outage.
Global Telecom
Support team
Continue
expansion
Predictive alerting
Event Analytics
Integration into
HPE Helion Cloud
OpsA PoCs
Troubleshooting and
Anomaly detection
Microsoft® Lync
Proof of Concept
8 datacenters
HPE IT Operations Analytics solution
OpsA highly
scalable
collection
framework
Integration
with ArcSight,
OMi, SiteScope,
BPM,
Logstash,
JDBC,
TCP/UDP,
REST WS
Big data
analytics
platform
Highly scalable
Cluster-based
Column-oriented
Visual Analytics
Play back
dashboard
results
Phrased
Search
Guided
Troubleshooting
User defined
topologies
Predictive
analytics
Industry
analytics (R
packages)
Pattern
detection via
correlation
coefficient
Abnormal
behavior
detection
Automated
machine
learning drives
log and event
analytics
Collection Vertica Analytics Visualization Forensics Environment Users
2600 apps
25K databases
66K network devices
56K servers
5M objects
Questions?
Get more information
Attend these sessions:
– HOL9100 Go hands-on with HPE Operations Analytics; reveal
what’s hidden in your data
– BB8013 HPE Operations Analytics; providing validity to
Safeguard Properties’ monitoring footprint
– RT9084 Breaking Bad processes; align central IT and the
business using HPE Operations Analytics
– RT9083 Increase the efficiency of support teams with
automated Analytics-as-a-Service
Visit these demos:
– DEMO8816 HPE Operations Analytics; automated machine learning and predictive analysis at the speed of business
– TPS9206 The Future of Operations Analytics
Follow us on Social Media:
– Twitter @HPE_ITOps
– LinkedIn linkedin.com/company/hpe-software
– Facebook facebook.com/HPESoftware
– Blog http://hpsw.co/BSMblog
Thank you
gary.brandt@hpe.com

TB8568_8568_Presentation

  • 1.
  • 2. Please give me your feedback –Use the mobile app to complete a session survey 1. Access “My schedule” 2. Click on the session detail page 3. Scroll down to “Rate & review” – If the session is not on your schedule, just find it via the Discover app’s “Session Schedule” menu, click on this session, and scroll down to “Rate & Review” – If you have not downloaded our event app, please go to your phone’s app store and search on “Discover 2016 Las Vegas” – Thank you for providing your feedback, which helps us enhance content for future events. Session ID: TB8568 Speaker: Gary Brandt and Ronnie Falgout #HPEDiscover
  • 3. Fast-track the value of central IT with HPE Operations Analytics Ronnie Falgout, IT Delivery Manager Gary Brandt, OpsA Product Manager #HPEDiscover @HPE_Discover June 2016
  • 4. 4 Speaker biography/multiple speakers Gary Brandt Hewlett Packard Enterprise Software gary.brandt@hpe.com – Number of years in IT 17 – Previous experience (industry experience) 4 – Domain knowledge – Operational Analytics – IT Operations Management – Enterprise Architecture Ronnie Falgout GIT Global Delivery ronnie.falgout@hpe.com – Number of years in IT 16 – Previous experience (industry experience) 26 – Domain knowledge – IT Operations Management – IT Automation and Service Management – Operations Analytics #HPEDiscover
  • 5. Outside forces are disrupting businesses and government Internet of Things, explosion of devices 5 New, disruptive business models The Idea Economy Cloud is redefining how applications and devices are written and delivered No business, industry or government is safe Turning ideas into new products or services has never been easier #HPEDiscover
  • 6. Inside forces are pushing you to evolve IT 6 Shadow IT is everywhere Technology is business strategy Developers are the new Kingmakers DevOps driving culture shifts #HPEDiscover
  • 7. Distributed compute Distributed systems and containers Distributed data Data locality and latency Multi-cloud brokerage Management and governance Continuous delivery DevOps speed Analytics and visualization Real-time and predictive 7 Backend Frontend Devices Humans App-to-app #HPEDiscover Technology architectures are rapidly shifting
  • 8. Traditional IT Digital enterprise Provide hardened systems and networks Manage and mitigate risk Efficiently host workloads and services Continuously create and deliver new services Store and manage data Software automates business systems Software differentiates products and services Provide real-time insight and understanding IT must bridge the traditional and new 8 The right balance between traditional and digital #HPEDiscover
  • 9. The customer transformation journey Automate, orchestrate and transform 9 Traditional IT Digital enterprise Transform delivery Orchestrate processes Automate tasks Focus on user experience Gain customer engagement and loyalty Leverage Big Data Realize continuous improvement Once implemented, these three steps will give you Efficiency Agility Experience #HPEDiscover
  • 10. IT Operations Management solutions Simplified ITOM solution set 10 Automation solutions Intelligently drive efficiency across the virtualized datacenter Transform solutions Modernize customer experience for cloud native and traditional applications Orchestration solutions Increase speed of delivery in a heterogeneous, hybrid cloud environment. Datacenter automation Operations bridge Service management automation Cloud orchestration Service Broker User experience management Solutions are delivered in a simple, consistent way to drive TTV. SaaS I software appliance | remote managed service System Network Storage #HPEDiscover
  • 12. HPE Operations Analytics Market trends and growth drivers ITOA growth drivers Customer expectations – Outdated systems – Point tools limitations – Complex diverse environments – Flexible, scalable architecture – Transform data into intelligence – Problem detection and prediction #HPEDiscover
  • 13. Servers Network Storage All green doesn’t always mean all-clear Analyzing performance problems is hard 13 Servers Network Storage One single problem can trigger multiple events ? Limited view into resource utilization Hidden performance issues and trends Low visibility across OneView domains #HPEDiscover
  • 14. The answer lies in your data But how do you make sense of it? 14 siloed data sources types of data of device types of different operating systems data per server/day Mobile app Network Cloud System LOB data Storage #HPEDiscover
  • 15. Results Reduce outages Faster resolution Optimize resources Increased productivity Introducing HPE Operations Analytics Predictive analytics Machine learning Relationship score Automated log and event analysis Anomaly detection and alerting Visual analytics and RCA HPE Operations analytics Standalone, scalable platform HPE Vertica Data types Mobile app Network Cloud System LOB data Storage #HPEDiscover
  • 16. Behavioral learning Clustering Predicting future behavior Event analytics Anomaly detection Unstructured text indexing, search and inference Machine learning powers HPE Operations Analytics Developed in collaboration with HPE Labs 16 Machine Learning Predictive analytics Relationship score Automated log and event analysis Anomaly detection and alerting Visual analytics and RCA HPE Operations Analytics Standalone, scalable platform Machine learning #HPEDiscover
  • 17. Overview of Operations Analytics 17#HPEDiscover
  • 18. HPE Operations Analytics Key features Log and event analytics Focus on relevant items for quicker resolution Automated analysis of logs and events #HPEDiscover
  • 19. HPE Operations Analytics Key features Root Cause Analysis (RCA) Identify when problems start Visual analytics Clear, intuitive dashboards Performance heat map Performance overview #HPEDiscover
  • 20. HPE Operations Analytics Key features Advanced log search Deep-dive into messages Relationship score Connection between metrics Smart filter Relationship score #HPEDiscover
  • 21. HPE Operations Analytics Key features Predictive analytics Forecast future performance with one click Anomaly alerting Real-time problem warnings Predict button Dynamic baselines #HPEDiscover
  • 22. HPE Operations Analytics Standard use cases 22 Anomaly detection and troubleshooting Historical and predictive analytics Business insights Big Data store and analysis #HPEDiscover
  • 23. HPE IT Operations Analytics How HPE IT uses Big Data in IT operations 23
  • 24. HPE IT key operational data 24 Number of incident tickets per month Average help desk calls per month Average number of major incidents/ meetings per month Proactive monitoring of planned changes per month Configuration items in uCMDB68,770 48,000 5,000,000 57,000 66,000 2,000 Servers Network devices Applications 900 300/800 Scheduled jobs executed per month 19,000,000 Event notifications sent per month 1,500 4 Private cloud datacenters >8,000 simulated transactions 365 global locations 4 traditional datacenters #HPEDiscover
  • 25. Troubleshooting without Operations Analytics – Many subject matter experts involved in major incidents – Manual analysis in isolation – Manual correlation of data – Long time to identify root cause Operation support team Application SME Network SME Security SME Server SME Storage SME Database SME Application ecosystem Physical or virtual server Business application Network Storage
  • 26. Troubleshooting with Operations Analytics – All relevant data in a single dashboard – Data is timely and correlated – Data easily viewed in visual analytics – Historical view of data instantly available – Faster time to identify root cause with fewer people involved OpsA Operation support Application ecosystem Physical or virtual server Business application Network Storage
  • 27. HPE Operations Analytics in HPE IT Trend analysis Predictive insights Anomaly detection Unknown root cause resolution Application (SiteScope) Network (Network Node Mgr +iSPIs) Cloud VMs (Operations Agents) System Perf (Operations Agents) Third-party tools (SCOM, Lync, Exchange) Event Data (OMi) Metrics Events Topology Logs Log Data (ArcSight) Operations Analytics is an analytics platform for IT to proactively manage its operational performance and reduce mean time to repair. It is able to take in data from all sources and utilize different data types, not just performance metrics and events, but topology data and logs. Big Data Store Vertica HPE Operations Analytics Visual analytics Automated log analytics Predictive analytics Content framework Intelligent search Guided troubleshooting 8 datacenters 2600 apps 25K databases 57K servers 5M objects 66K network devices Custom Data (CSV, XML)
  • 28. HPE IT use cases and scenarios Widespread production network outage in HPE IT
  • 29. Widespread production network outage in HPE IT Transient network outage Affecting multiple HPE work sites “All hands on deck” production issue Network monitoring causing event storm Saturated support teams Stream of incidents causing noise 1000s of critical network events detected in BSM
  • 30. Analytics-based abnormality detection Solution – OpsA is Big Data – Correlate multiple sources together to narrow problem – Identifying patterns in huge amounts of network syslog data – Patterns reveal leading cause Pattern detected in log data that reveals the problem Correlate metrics and logs with problem time (network events) Benefit – Huge time savings (less than 30 min. to find cause) – Faster restoration of service – Fewer SMEs required to troubleshoot
  • 31. HPE IT use cases and scenarios Troubleshooting application and database production issues
  • 32. Troubleshooting application and database production issues Challenge – Solving problems before performance is affected – Siloed teams mean no big picture
  • 33. Analyze millions of messages to reveal root cause with automated log analytics Automatically reveals time and count of most significant log events.
  • 34. Drill down to actual root cause log messages Automatically reveals time and count of most significant log events. View the log message content to identify root cause Number of occurrences of significant log data over time
  • 35. Troubleshooting application and database production issues Solution – Real-time dashboards – Automated log analytics Benefit – Fast root cause identification (¼ the time) – Fewer experts involved (5 SMEs to 1) – Cuts order backlog by 50%
  • 36. HPE IT 3PAR use case OpsA increasing value to LOB
  • 37. Hewlett Packard Enterprise 3PAR storage line of business HPE premier storage business Proactive “phone home” monitoring service available to 3PAR customers Service provides 3PAR customers with latest capabilities and proactive protection against potential problems HPE IT systems enable/support “phone home” services
  • 38. Optimizing HPE 3PAR operations using Big Data analytics – Isolate problems – Reduce late file transfers – Reduce overdue file transfers – Difficult to isolate problem – Manually interpreting behavior – Near real-time metric collection using OpsA – Define and measure big-picture view of 3PAR ecosystem – OpsA baselines define “normal” behavior – OpsA guided troubleshooting – Quickly identify what is not “normal” – Faster to diagnose problems – Eliminated manual efforts of collecting and correlating data – Decreased Mean Time To Recover (MTTR) Business challenge Solution Benefits
  • 39. Use baselines to define “what’s normal” Baselining metrics helps IT define normal behavior Starting point for troubleshooting
  • 40. Analyze the ecosystem – Define services that describe the ecosystem – Quickly analyze the ecosystem – Correlate metrics from disparate areas – Identify areas of impact Network metrics (nfs call rate) Application metrics (processing rate) Database metrics (active session count) Application metrics (file queues) File system metrics (disk queues)
  • 41. Identify trends and take action before a problem occurs Dangerous rate of file system growth
  • 42. HPE IT use cases and scenarios Analytics for predictive anomaly detection
  • 43. Remote Microsoft® Lync user experience Challenge – Microsoft® Lync depended on daily by thousands of HPE employees – Complex infrastructure means difficulty in diagnosing issues Total PSTN conferences/week: 14,909 PSTN (public switched telephone network) @HPE office @Remote Mobile users Remote road warrior users Lync application Coffee houses Airports
  • 44. OpsA advanced troubleshooting Challenge Dozens of seemingly unrelated metrics from many disparate sources Solution Analytic real-time scoring determines how closely related multiple disparate metrics are to the problem The correlation values vary from -1 to 1. The higher the absolute value of the correlation, the closer the relationship. Metric A Metric B Calculated correlation score of metrics A and B Outcome 90% statistical correlation between “Bad Requests Received” and Lync Edge Server authentication failures.
  • 45. OpsA advanced troubleshooting, another example Use analytics to determine how closely related multiple disparate metrics are to the problem The correlation values vary from -1 to 1. The higher the absolute value of the correlation, the closer the relationship. Metric A Metric B Calculated correlation score of metrics A and B Outcome “Sends Outstanding” performance metrics correlate 100% with server network errors.
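A correlation score of the kind described on slides 44 and 45 can be sketched as follows. The slides only state that the score ranges from -1 to 1; which statistic OpsA actually computes is not given, so the Pearson correlation coefficient used here is an assumption, and `pearson_correlation` and the sample series are hypothetical names and data for illustration only.

```python
import math


def pearson_correlation(metric_a, metric_b):
    """Return the Pearson correlation of two equally sampled metric series.

    Assumption: Pearson's r stands in for the slides' "correlation score";
    the deck does not say which statistic the product uses.
    """
    if len(metric_a) != len(metric_b) or len(metric_a) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(metric_a)
    mean_a = sum(metric_a) / n
    mean_b = sum(metric_b) / n
    # Co-variation of the two series around their means.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(metric_a, metric_b))
    var_a = sum((a - mean_a) ** 2 for a in metric_a)
    var_b = sum((b - mean_b) ** 2 for b in metric_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical samples: two metrics that rise and fall together.
bad_requests = [1, 2, 3, 4]
auth_failures = [2, 4, 6, 8]
score = pearson_correlation(bad_requests, auth_failures)
```

A score near 1 or -1 suggests the two metrics move together (or in opposite directions) and are worth investigating jointly; a score near 0 suggests no linear relationship.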
  • 46. Analytics-based abnormality detection: take action before a problem occurs Performance metrics Dynamic baseline Zone of prevention Fixed threshold Near real-time alerting of anomaly trend Solution – Analytics to narrow focus of troubleshooting – Abnormality detection – Alerts triggered by anomaly trends Benefit – Faster diagnostic time – Less reaction, more prevention – Automated correction
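The contrast between a dynamic baseline and a fixed threshold on slide 46 can be illustrated with a minimal sketch. The deck does not describe how OpsA builds its baselines, so the trailing-window mean and standard deviation used here are an assumption, and `dynamic_baseline_anomalies`, `window`, `k`, and the sample data are hypothetical.

```python
from statistics import mean, stdev


def dynamic_baseline_anomalies(values, window=5, k=3.0):
    """Return indexes whose value leaves a band of mean +/- k * stdev
    computed over the preceding `window` samples.

    Assumption: a rolling mean/stdev band is one simple way to model a
    "dynamic baseline"; it is not taken from the product.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        baseline = mean(history)
        spread = stdev(history)
        if abs(values[i] - baseline) > k * spread:
            anomalies.append(i)
    return anomalies


# Hypothetical metric samples: steady around 10-11 with one jump to 30.
samples = [10, 11, 10, 11, 10, 11, 30, 11]
FIXED_THRESHOLD = 50  # hypothetical static alert level

flagged = dynamic_baseline_anomalies(samples)
breaches_fixed = [i for i, v in enumerate(samples) if v > FIXED_THRESHOLD]
```

In this sample the jump to 30 falls outside the learned band but stays under the fixed threshold, which is the gap the slide labels the "zone of prevention".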
  • 47. HPE IT use cases and scenarios Miscellaneous examples
  • 49. HPIT uses analytics to detect dangerous event patterns Identified HPIT application generating events at an increasing rate over a 5-day period Detect a specific HPIT application generating events in the 90th percentile (i.e. the ‘noisiest’ application). Identify the risk of a dangerous pattern: a trend of events increasing and spiking at midnight. Support proactively took action before a major problem occurred. Breach in normal baseline Breach in normal baseline Dynamic normal baseline
  • 50. HPIT applies predictive analytics to prevent problems Predictive views of server performance behavior under specific workloads 1. Dynamic baselines automatically created for all metrics collected for a server. 2. Server memory utilization metrics show an increase over time. 3. Applying predictive analytics to the server’s memory metrics predicts a 35% increase in memory utilization under current workload over the next two weeks.
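The kind of forecast described on slide 50 can be sketched with a least-squares trend line. The deck does not state which predictive model OpsA applies, so linear extrapolation is an assumption here, and `fit_line`, `forecast`, and the sample readings are hypothetical.

```python
def fit_line(ys):
    """Least-squares fit of y = intercept + slope * x for x = 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(ys, steps_ahead):
    """Extrapolate the fitted line `steps_ahead` samples past the last one.

    Assumption: a straight-line trend; real workloads may need seasonal
    or non-linear models.
    """
    slope, intercept = fit_line(ys)
    return intercept + slope * (len(ys) - 1 + steps_ahead)


# Hypothetical weekly memory utilization readings, in percent.
memory_pct = [40, 42, 44, 46]
predicted = forecast(memory_pct, steps_ahead=2)
```

Comparing `predicted` with the latest reading gives a projected increase that a support team could act on before the resource is exhausted.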
  • 51. Applying predictive analytics to key applications Predictive views of HPIT applications performance behavior Predict future performance patterns based on historical baselines.
  • 52. HPE IT’s OpsA journey Continuous progress Early 2014 Mid 2014 Late 2014 2015 2016 (Next) New data sources Expand to key Applications SiteScope Integration Application metrics Introduce predictive capabilities into support New data sources Integrated OMi Event data Integrated Network metrics ~60K devices Database Logs Analytics ~20K DBs Analytics on key applications and Business (i.e. 3PAR) OpsA PoCs Troubleshooting Microsoft® Exchange Proof of Concept Expanded IT Private Cloud 18K virtual servers Introduced OpsA Server metrics IT Private Cloud (~10K virtual servers). Cloud Infrastructure Support team Expand coverage Traditional server metrics (40K servers) and virtual cloud (+20K VMs) Network Outage Opportunity Apply analytics to large scale production network outage. Global Telecom Support team Continue expansion Predictive alerting Event Analytics Integration into HPE Helion Cloud OpsA PoCs Troubleshooting and Anomaly detection Microsoft® Lync Proof of Concept
  • 53. 8 datacenters HPE IT Operations Analytics solution OpsA highly scalable collection framework Integration with ArcSight, OMi, SiteScope, BPM, Logstash, JDBC, TCP/UDP, REST WS Big data analytics platform Highly scalable Cluster-based Column-oriented Visual Analytics Play back dashboard results Phrased Search Guided Troubleshooting User defined topologies Predictive analytics Industry analytics (R packages) Pattern detection via correlation coefficient Abnormal behavior detection Automated machine learning drives log and event analytics Collection Vertica Analytics Visualization Forensics Environment Users 2600 apps 25K databases 66K network devices 56K servers 5M objects
  • 55. Get more information Attend these sessions: – HOL9100 Go hands-on with HPE Operations Analytics; reveal what’s hidden in your data – BB8013 HPE Operations Analytics; providing validity to Safeguard Properties’ monitoring footprint – RT 9084 Breaking Bad processes; align central IT and the business using HPE Operations Analytics – RT9083 Increase the efficiency of support teams with automated Analytics-as-a-Service Visit these demos: – DEMO8816 HPE Operations Analytics; automated machine learning and predictive analysis at the speed of business – TPS9206 The Future of Operations Analytics Follow us on social media: – Twitter @HPE_ITOps – LinkedIn linkedin.com/company/hpe-software – Facebook facebook.com/HPESoftware – Blog http://hpsw.co/BSMblog