Scott Riley discusses the importance of real-time network management given the challenges of an increasingly mobile workforce, growing app usage, and rising cloud adoption (the "digital tsunami"). Traditional monitoring silos provide only a partial view and force manual correlation of issues, while a single, well-connected monitoring platform can correlate events in real time to diagnose problems more rapidly. Key benefits include real-time monitoring, automatic fault remediation through runbook automation, bandwidth monitoring, and compliance management to track network changes. Scott demonstrates OpManager, a product that provides integrated, real-time IT management.
2. How to Get Real-Time Network Management right
Overcoming the challenges involved
With Scott Riley
3. About Scott
Scott is an IT management professional with 12 years of expertise in IT operations.
During the course of his career, he’s led technical teams to success across the UK in a
number of areas including: Network and Security, Hosting and Datacentres and Product
Development.
As Director of Cloud & Hosting Solutions at GCI, Scott has developed virtualisation
technology solutions and manages the shift from physical servers to virtualised services.
He ensures business processes and cost models are well aligned to a solution roadmap.
Email: scott.riley@gcicom.net
Twitter: @Fauxnuts
4. You Will Soon Discover
1. Importance of network performance management
2. Current challenges in network management
3. Impact of downtime on your business
4. Devising a real-time network monitoring strategy
5. Compliance management
6. Traffic shaping & bandwidth monitoring
7. Fault identification and remediation
5. Why is Network Management so Important?
IT continually evolves, so we need a core monitoring strategy and adaptable tooling
• Current Challenges
• Mobilegeddon!
• Server Growth in APAC
• The Digital Tsunami
6. Current Challenges
Users are working longer hours, in more locations and across multiple devices
• Consumerisation of IT
• Explosion of Apps
• 4G LTE coverage expansion
• Enterprise playing catch-up on the Home User experience
“Telstra now spends 50 per cent of each board meeting discussing future strategy and most of it about how technology is going to change.”
Professor Steve Burdon, University of Technology, Sydney
7. Mobilegeddon!
Mobile/Tablet usage has overtaken desktop
• Users now spend more time on mobile than on Desktop Apps
• Google Ranking lowered if your site is not Mobile-friendly
• 40% of sites are not ready
[Chart: Number of Global Users (Millions), 2007 to 2015. Series: Desktop, Mobile]
8. Continued Server Growth in APAC
APAC (excluding Japan) is showing the biggest growth in server hardware sales
[Chart: Worldwide Server Shipping and Revenue 2014, by region: APAC, East EU, Japan, Latin Am, MEA, North Am, West EU. Series: Shipping, Revenue. Data labels: 8.7, -10.8, -6.7, -6.3, -5.2, 0.7, -2.0 and 7.5, -6.3, -10.6, -4.1, -6.7, 2.6, 3.8]
9. The Digital Tsunami
Cloud uptake in Australia has skyrocketed in 2014
• More Demanding Users
• Faster Network Access
• More Servers in the Datacentres
• More Apps to support
[Chart: Australia: IaaS Cloud Spending, 2013 to 2018. 42% growth 2013-14; $439 million by 2018]
10. The Cost of Downtime
64% of Australian businesses surveyed suffered downtime
• Cost of data loss and downtime in 2014: $65.5 billion
• Average downtime experienced: 27 hours
13. Challenges with Monitoring Silos
Rapidly diagnose issues with a single capable monitoring engine
• Partial view of system performance
• Manual correlation
• Multiple system-hopping to diagnose faults
• Alert flooding
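To illustrate how correlation reduces alert flooding, here is a minimal sketch that groups raw alerts into incidents per device. The `Alert` type, the device names and the 60-second window are assumptions made for this example, not features of any particular monitoring product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    timestamp: float  # seconds since an arbitrary start
    device: str       # hypothetical device name
    message: str


def correlate(alerts, window_seconds=60.0):
    """Group alerts per device into incidents.

    A new incident starts when the gap since the previous alert on the
    same device exceeds window_seconds (an assumed, tunable value).
    """
    by_device = defaultdict(list)
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        by_device[alert.device].append(alert)

    incidents = []
    for device_alerts in by_device.values():
        current = [device_alerts[0]]
        for alert in device_alerts[1:]:
            if alert.timestamp - current[-1].timestamp <= window_seconds:
                current.append(alert)
            else:
                incidents.append(current)
                current = [alert]
        incidents.append(current)
    return incidents


alerts = [
    Alert(0.0, "core-sw-1", "interface down"),
    Alert(5.0, "core-sw-1", "BGP peer lost"),
    Alert(12.0, "core-sw-1", "high latency"),
    Alert(300.0, "core-sw-1", "interface down"),
    Alert(8.0, "edge-rtr-2", "CPU high"),
]
incidents = correlate(alerts)
for incident in incidents:
    print(incident[0].device, len(incident), "alert(s)")
```

Five raw alerts collapse into three incidents, so an operator sees one entry per fault rather than one per symptom.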
14. A Strategy for Real-Time Monitoring
A single, well-connected platform can correlate events in real time
• Monitoring all of our systems
• Establishing baselines and trends
• Alert with specific information
• Guides us to the root cause
• Enables us to take action…quickly!
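The baseline and alerting points above can be sketched as a simple deviation check. This is one possible approach, assuming a mean and standard deviation baseline with a threshold of three standard deviations; the function names, device name and sample values are hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """Return True when latest deviates from the baseline of history
    by more than threshold standard deviations (assumed rule)."""
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return latest != baseline
    return abs(latest - baseline) / spread > threshold


def describe(metric, device, history, latest):
    """Build an alert that carries specific information, not just 'fault'."""
    return (f"{device}: {metric} is {latest} "
            f"(baseline {mean(history):.1f})")


# Hypothetical latency samples in milliseconds for one link.
latency_ms = [20, 22, 19, 21, 20, 23, 18, 21]

for reading in (22, 45):
    if is_anomalous(latency_ms, reading):
        print("ALERT", describe("latency_ms", "edge-rtr-2", latency_ms, reading))
    else:
        print("ok   ", reading)
```

Including the baseline in the alert text gives the operator the context needed to judge severity without opening a second tool.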
17. Compliance Management
Using a system to track configuration greatly improves your adherence to standards
• Measure compliance against industry standards
• Set your own configuration targets
• Track what was changed, when, by whom
• Report on compliance status
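As a rough illustration of tracking what was changed, when and by whom, the sketch below diffs two configuration snapshots with Python's standard `difflib` and checks the result against a list of required lines. The `ChangeRecord` type, the user name, the sample configuration and the `REQUIRED` policy are all assumptions for this example.

```python
import difflib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ChangeRecord:
    device: str
    changed_by: str
    changed_at: datetime
    diff: list


def record_change(device, changed_by, old_config, new_config, changed_at=None):
    """Return a ChangeRecord, or None when the configuration is unchanged."""
    diff = list(difflib.unified_diff(
        old_config.splitlines(), new_config.splitlines(),
        fromfile="before", tofile="after", lineterm=""))
    if not diff:
        return None
    when = changed_at or datetime.now(timezone.utc)
    return ChangeRecord(device, changed_by, when, diff)


def violations(config, required_lines):
    """List required lines that are missing from the configuration."""
    present = {line.strip() for line in config.splitlines()}
    return [line for line in required_lines if line not in present]


# Hypothetical policy and snapshots, for illustration only.
REQUIRED = ["service password-encryption", "no ip http server"]
old = "hostname edge-rtr-2\nservice password-encryption\nno ip http server"
new = "hostname edge-rtr-2\nip http server"

change = record_change("edge-rtr-2", "jsmith", old, new)
print("\n".join(change.diff))
print("Missing required lines:", violations(new, REQUIRED))
```

Storing the diff alongside the user and timestamp is what makes a later compliance report possible.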
18. Real-Time Monitoring
View events up to the second, not as a 5-minute average
• Real-time statistics
• Up to the second information on any metric
• Live bandwidth graphs
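A short worked example shows why a 5-minute average can hide a problem that per-second data reveals. The traffic figures below are invented for illustration: a link with an assumed capacity of 1000 Mbit/s carries 100 Mbit/s for most of the interval and is saturated for ten seconds.

```python
LINK_CAPACITY_MBPS = 1000.0  # assumed link capacity

# 300 one-second samples (5 minutes) of throughput in Mbit/s.
samples = [100.0] * 300
for second in range(120, 130):
    samples[second] = 1000.0  # a ten-second burst that saturates the link

five_minute_average = sum(samples) / len(samples)
peak = max(samples)
saturated_seconds = sum(1 for s in samples if s >= LINK_CAPACITY_MBPS)

print(f"5-minute average: {five_minute_average:.1f} Mbit/s")  # 130.0
print(f"Per-second peak:  {peak:.1f} Mbit/s")                 # 1000.0
print(f"Seconds saturated: {saturated_seconds}")              # 10
```

The average suggests a link running at 13% of capacity, while the per-second view shows ten seconds in which users would have experienced congestion.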
19. Bandwidth Monitoring
From a high level overview, drill straight to areas of concern for rapid investigation
• Build High Level Business Maps
• Maps are colour coded based on availability and performance of links
• Rapid drill down to detailed node statistics
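One common way to derive link utilisation is from two readings of an interface octet counter taken a known interval apart. The sketch below assumes a 32-bit counter that may wrap once between polls, and uses hypothetical thresholds of 70% and 90% to pick a map colour; none of these values come from the presentation.

```python
COUNTER32_MAX = 2 ** 32  # a 32-bit octet counter wraps at this value


def utilisation_percent(octets_start, octets_end, interval_seconds,
                        link_bps, counter_max=COUNTER32_MAX):
    """Utilisation of a link between two counter readings.

    The modulo handles a single counter wrap between the two polls.
    """
    delta_octets = (octets_end - octets_start) % counter_max
    bits_per_second = delta_octets * 8 / interval_seconds
    return 100.0 * bits_per_second / link_bps


def link_colour(available, utilisation, warn_at=70.0, critical_at=90.0):
    """Map availability and utilisation to a colour (assumed thresholds)."""
    if not available or utilisation >= critical_at:
        return "red"
    if utilisation >= warn_at:
        return "amber"
    return "green"


# 75,000,000 octets in 60 seconds on a 100 Mbit/s link.
usage = utilisation_percent(1_000_000, 76_000_000, 60, 100_000_000)
print(f"{usage:.1f}% -> {link_colour(True, usage)}")
```

Shorter polling intervals reduce the chance of more than one counter wrap between readings, which this calculation cannot detect.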
20. Traffic Shaping
Classify applications and apply Quality of Service policies
• Identify and categorise applications
• Apply Quality of Service Policies for priority traffic
• Apply restrictions around non-mission-critical traffic
[Diagram: traffic classes: Anything Else, Mission Apps, Realtime Apps, Voice & Video]
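The classify-then-prioritise idea can be sketched with a strict priority queue. This is an illustration only, not how any specific product shapes traffic: the port-to-class mapping is hypothetical, and the ordering of the four classes (voice and video first, anything else last) is an assumption based on the class names.

```python
import heapq
from itertools import count

# Lower number = served first (assumed ordering of the four classes).
PRIORITY = {"voice_video": 0, "realtime": 1, "mission": 2, "other": 3}

# Hypothetical classification by destination port, for illustration.
PORT_CLASSES = {5060: "voice_video", 3389: "realtime", 1433: "mission"}


def classify(dst_port):
    """Return the traffic class for a destination port."""
    return PORT_CLASSES.get(dst_port, "other")


class PriorityScheduler:
    """Strict priority queuing: always send the highest class first,
    preserving arrival order within a class."""

    def __init__(self):
        self._heap = []
        self._arrival = count()

    def enqueue(self, packet_id, dst_port):
        traffic_class = classify(dst_port)
        heapq.heappush(
            self._heap,
            (PRIORITY[traffic_class], next(self._arrival), packet_id, traffic_class))

    def dequeue(self):
        _, _, packet_id, traffic_class = heapq.heappop(self._heap)
        return packet_id, traffic_class

    def __len__(self):
        return len(self._heap)


scheduler = PriorityScheduler()
for packet_id, port in [("p1", 80), ("p2", 1433), ("p3", 5060), ("p4", 3389)]:
    scheduler.enqueue(packet_id, port)

order = []
while len(scheduler):
    order.append(scheduler.dequeue())
print(order)
```

Strict priority can starve the lowest class under sustained load, which is why real QoS policies usually combine it with bandwidth guarantees or limits per class.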
23. Automatic Fault Remediation
Small Service Provider with around 22,000 DSL subscribers
• Repeated fault with DSL disconnects
• Irritated customers
• Loss of confidence
• Increased Service Desk Tickets
24. Automatic Fault Remediation
Continuous monitoring of SNR and attenuation levels
Alerts triggered on deviation from expected levels | Helpdesk ticket created
Automated script reboots the alerting devices at 00:01 the next morning
Ticket automatically updated after maintenance, confirming the device is online
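The workflow above can be sketched as a small runbook. The thresholds, the `Ticket` type, the device name and the `reboot` and `is_online` callables are all hypothetical stand-ins for this example; a real deployment would call the monitoring and helpdesk systems' own interfaces.

```python
from dataclasses import dataclass, field
from datetime import datetime, time, timedelta

SNR_MIN_DB = 10.0          # assumed lower bound for an acceptable SNR
ATTENUATION_MAX_DB = 45.0  # assumed upper bound for attenuation


@dataclass
class Ticket:
    device: str
    notes: list = field(default_factory=list)
    status: str = "open"


def out_of_range(snr_db, attenuation_db):
    return snr_db < SNR_MIN_DB or attenuation_db > ATTENUATION_MAX_DB


def next_maintenance_slot(now):
    """00:01 on the day after now."""
    return datetime.combine(now.date() + timedelta(days=1), time(0, 1))


def handle_reading(device, snr_db, attenuation_db, now):
    """Open a ticket and schedule a reboot when a reading deviates."""
    if not out_of_range(snr_db, attenuation_db):
        return None, None
    ticket = Ticket(device)
    ticket.notes.append(
        f"SNR {snr_db} dB, attenuation {attenuation_db} dB outside expected levels")
    return ticket, next_maintenance_slot(now)


def complete_maintenance(ticket, reboot, is_online):
    """Reboot the device, then update the ticket with the outcome."""
    reboot(ticket.device)
    if is_online(ticket.device):
        ticket.notes.append("Device rebooted and confirmed online")
        ticket.status = "resolved"
    else:
        ticket.notes.append("Device did not come back online; escalating")
    return ticket


now = datetime(2015, 6, 1, 14, 30)
ticket, slot = handle_reading("dsl-0042", 6.5, 48.0, now)
print("Reboot scheduled for", slot)
complete_maintenance(ticket, reboot=lambda d: None, is_online=lambda d: True)
print(ticket.status, ticket.notes)
```

Passing `reboot` and `is_online` in as functions keeps the runbook logic testable without touching real devices.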
25. Automatic Fault Remediation
By using Runbook automation, they:
1. Reduced their helpdesk callouts
2. Proactively repaired faults before they impacted service
3. Improved overall customer experience
Automation saves costs, reduces Mean Time To Repair (MTTR) and increases customer satisfaction
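For reference, MTTR is the total time spent repairing faults divided by the number of repairs. The sketch below computes it for two sets of repair durations; the figures are invented purely to show the calculation and are not results from the case study.

```python
def mttr_hours(repair_durations_hours):
    """Mean Time To Repair: total repair time divided by number of repairs."""
    if not repair_durations_hours:
        raise ValueError("at least one repair is required")
    return sum(repair_durations_hours) / len(repair_durations_hours)


# Hypothetical repair durations in hours, for illustration only.
manual_repairs = [4.0, 6.0, 2.0]
automated_repairs = [1.0, 0.5, 1.5]

print("Manual MTTR:   ", mttr_hours(manual_repairs), "hours")
print("Automated MTTR:", mttr_hours(automated_repairs), "hours")
```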
26. Summary
Thank you for your time!
The changing IT landscape | “Digital Tsunami”
The cost of downtime
The perils of monitoring silos
Real-time monitoring and event correlation
Automatic fault resolution