AIOps - Steps Towards Autonomous Operations - AWS Summit Sydney 2019
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
AIOps: Steps towards
autonomous operations
Sri (Srichakri) Nadendla
Enterprise Solutions Architect
Amazon Web Services
Agenda
• Effective operational practices
• Enablers for autonomous operations
• Demo using Amazon SageMaker
Operations
School of hard knocks
Whatever can go wrong,
will go wrong.
Operations objectives
Keep the lights on: Reliability (Availability + Performance)
Keep it safe: Security
Ops holy grail
Prevent
Correct
Baseline
Enablers to effective operations
Collection
Patterns
Actions
Foundations to observability
Alerts and notifications
Data, tools and patterns
Ingestion
Metrics, events and logs
Threat intel
Budgets
Planned events
Step 1 – Enable instrumentation
Metrics, events and logs
AWS services in the context
Instrumentation
Step 2 – Ingest and store
Data storage
Ingestion
Metrics, events and logs
AWS services in the context
Instrumentation
Ingestion
Storage
Step 3 – Query and pattern mining
Data storage
Ingestion
Metrics, events and logs
Threat intel
Budgets
Planned events
Analysis
AWS services in the context
Instrumentation
Ingestion
Storage
Analysis
Step 4 – Alerts and Remediation
Alerts & notifications
Data, tools & patterns
Ingestion
Metrics, events and logs
Threat intel
Budgets
Planned events
AWS services in the context
Instrumentation
Ingestion
Storage
Analysis
Alerts & Actions
Some challenges
Metrics galore
Dashboard fatigue
Manual correlation
Static thresholds
Responsiveness matters
Root cause identification
Dynamic detection
Proactive remediation
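To illustrate the contrast between static thresholds and dynamic detection, here is a minimal sketch in Python. It is not taken from the talk: the function names, the trailing-window z-score rule, and the sample CPU values are all assumptions chosen for illustration, and a production system would more likely rely on a managed or trained model.

```python
from statistics import mean, stdev


def static_alerts(values, threshold):
    """Return indices where a value exceeds a fixed threshold."""
    return [i for i, v in enumerate(values) if v > threshold]


def dynamic_alerts(values, window=5, z_limit=3.0, min_sigma=1e-9):
    """Return indices where a value deviates from the mean of the
    preceding `window` points by more than `z_limit` standard deviations.

    `min_sigma` is a hypothetical floor that avoids division by zero
    when the recent history is perfectly flat.
    """
    alerts = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = max(stdev(history), min_sigma)
        if abs(values[i] - mu) / sigma > z_limit:
            alerts.append(i)
    return alerts


# Hypothetical CPU utilisation samples (percent) with one spike.
cpu = [20, 21, 19, 20, 22, 21, 20, 45, 21, 20]

static_hits = static_alerts(cpu, threshold=80)   # the spike stays below 80
dynamic_hits = dynamic_alerts(cpu)               # the spike stands out from its own baseline
print(static_hits, dynamic_hits)
```

In this sample the fixed threshold of 80 never fires, while the baseline-relative rule flags the jump to 45 because it is far outside the recent range.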
Path to autonomous operations
Data, tools & patterns
Ingestion
Metrics, events and logs
Threat intel
Budgets
Planned events
Predictive, actionable and automated remediation
Key techniques
Correlation
Anomaly detection
Forecasting
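Two of the techniques named above, correlation and forecasting, can be sketched in a few lines of plain Python. This is an illustrative sketch only: the slides do not say which algorithms are used, so Pearson correlation and simple exponential smoothing are assumptions, as are the function names and the sample latency and queue-depth series.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)


def ses_forecast(values, alpha=0.5):
    """One-step-ahead forecast using simple exponential smoothing.

    `alpha` weights the newest observation; the rest of the weight
    stays with the running level.
    """
    level = values[0]
    for v in values[1:]:
        level = alpha * v + (1 - alpha) * level
    return level


# Hypothetical metrics sampled at the same timestamps.
latency_ms = [100, 110, 120, 130, 140]
queue_depth = [10, 12, 14, 16, 18]

r = pearson(latency_ms, queue_depth)      # close to 1.0: the series move together
next_latency = ses_forecast(latency_ms)   # smoothed estimate of the next sample
print(round(r, 3), next_latency)
```

A strong correlation between two metrics suggests they can be investigated together instead of by manual dashboard comparison, and a forecast gives a baseline against which an anomaly detector can compare incoming values.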
Amazon SageMaker: Build, train, and deploy machine learning models at scale
Artificial Intelligence
amplifies the possibilities of
human-machine collaboration
Additional considerations
Shift left (CI/CD)
Runbook invocation
Knowledge assist
GOOD intentions should have GOALs
Mean time between failures
Proactive actions
# Problems avoided
Time to detect
Time to resolve
Mean time to recover
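The time-based goals above can be computed from incident records. The following is a minimal sketch under stated assumptions: the `Incident` record and its field names are hypothetical, and the definitions used (detection and recovery measured from the start of the failure, time between failures measured from one recovery to the next failure) are one common convention, not something the slides specify.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical incident record with three timestamps."""
    started: datetime
    detected: datetime
    resolved: datetime


def _mean(deltas):
    """Average a non-empty list of timedeltas."""
    return sum(deltas, timedelta()) / len(deltas)


def mean_time_to_detect(incidents):
    return _mean([i.detected - i.started for i in incidents])


def mean_time_to_recover(incidents):
    return _mean([i.resolved - i.started for i in incidents])


def mean_time_between_failures(incidents):
    """Average healthy time between one recovery and the next failure.

    Needs at least two incidents.
    """
    ordered = sorted(incidents, key=lambda i: i.started)
    gaps = [b.started - a.resolved for a, b in zip(ordered, ordered[1:])]
    return _mean(gaps)


# Two made-up incidents for illustration.
incidents = [
    Incident(datetime(2019, 1, 1, 0, 0), datetime(2019, 1, 1, 0, 10), datetime(2019, 1, 1, 1, 0)),
    Incident(datetime(2019, 1, 2, 1, 0), datetime(2019, 1, 2, 1, 20), datetime(2019, 1, 2, 1, 30)),
]

print(mean_time_to_detect(incidents))         # 0:15:00
print(mean_time_to_recover(incidents))        # 0:45:00
print(mean_time_between_failures(incidents))  # 1 day, 0:00:00
```

Tracking these values over time gives a way to check whether automation is actually shortening detection and recovery.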
What we are really trying to achieve …
[Figure: three stacks of effort. The first two are labeled Infrastructure, Support, Innovation; the third, marked ✅, is labeled Innovation and Support only.]
Thank you!
Sri (Srichakri) Nadendla
nadendls@amazon.com