Managing scalable infrastructure based on monitoring
What is ForthScale?
kick ass team of
Semantics (intent or a desired state)
Dynamics (actual behavior)
Devs think mostly about semantics
Ops think mostly about dynamics
DevOps = complete picture
● Thank you Mark Burgess
● Complexity is THE source of randomness.
● Complexity is the birthplace of chaos
● We all know the KISS principle but fail to follow it.
The maintenance theorem says: you can't
really control anything over time. Best you can
do is to keep it roughly in balance.
"Equilibrium" replaces determinism as the most
important idea in science. It is the definition of
What is scaling?
Scaling is switching between multiple equilibrium points
in accordance with current resource demand.
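One way to read this definition is as a lookup: each equilibrium point is a known-good configuration, and scaling picks the smallest one that covers current demand. The sketch below is only an illustration; the point names, node counts, capacities and the 20% headroom are all hypothetical values, not figures from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EquilibriumPoint:
    name: str
    nodes: int
    capacity_rps: float  # demand this configuration can sustain


# Hypothetical equilibrium points for illustration only.
POINTS = [
    EquilibriumPoint("night", 2, 200.0),
    EquilibriumPoint("day", 4, 400.0),
    EquilibriumPoint("peak", 8, 800.0),
]


def select_point(demand_rps, points=POINTS, headroom=0.2):
    """Return the smallest point whose capacity covers demand plus headroom."""
    required = demand_rps * (1.0 + headroom)
    for point in sorted(points, key=lambda p: p.capacity_rps):
        if point.capacity_rps >= required:
            return point
    # Demand exceeds every known point: fall back to the largest one.
    return max(points, key=lambda p: p.capacity_rps)
```

Calling `select_point(300)` picks the "day" point under these assumed numbers, because 300 plus headroom still fits within 400.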
● People tend to use modern IaaS same way
they used to operate hardware systems.
● If you do not take true advantage of IaaS,
why pay its premium?
● If you are paying for it, maybe it could be a
good idea to actually use it?
Basics - Operator goals
Use IaaS to improve infrastructure lifecycle
Keep user experience
Account for usage
Mainly running an operation with fluctuating
● Web app (externally or internally hosted)
● Mobile app (externally hosted)
● Intranet app (internally hosted)
● Data analysis (externally or internally
Basics - a perfectly good stack
Clear policy on resource consumption
Horizontal scalability of stack parts
Non-sticky nodes (for easy replacement)
The ForthScale way
What is the best way to allocate
Monitoring stack and learning what
resources are needed and when
Controller building blocks
● CloudStack manages the allocation of
compute resources (IaaS).
● Munin monitoring collects analytical data
about infrastructure performance.
● Ansible for infrastructure orchestration.
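The three blocks form a simple control loop: monitoring supplies a metric, a policy decides, and orchestration applies the change. The following is a minimal sketch of one loop iteration. The function names, the thresholds and the one-node step are assumptions made for illustration; `read_load` stands in for a Munin reading and `apply_nodes` for an Ansible run against CloudStack.

```python
def control_step(read_load, current_nodes, apply_nodes,
                 low=0.3, high=0.7, min_nodes=1, max_nodes=10):
    """Run one monitor -> decide -> orchestrate iteration.

    read_load:   callable returning current load as a 0..1 fraction
    apply_nodes: callable that brings the stack to the given node count
    Returns the node count after this step.
    """
    load = read_load()
    target = current_nodes
    if load > high:
        target = min(max_nodes, current_nodes + 1)
    elif load < low:
        target = max(min_nodes, current_nodes - 1)
    # Only touch the infrastructure when the decision changes something.
    if target != current_nodes:
        apply_nodes(target)
    return target
```

Keeping the monitoring and orchestration sides as injected callables matches the point that other platforms and other monitoring tools can be swapped in.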
Is a cloud orchestration platform
Manages computing resources allocation
Has an extensive API
Yes we can use other platforms
Munin is an open source client/server
monitoring application that presents output in
graphs through a web interface.
Yes, any monitoring will do.
Stores to RRD
Very, very simple to deploy
Text-based configuration
CloudStack + Munin = Japanese style
Set of plugins done with CloudMonkey
Can store all metrics provided by api
Correlate IaaS, OS and AppStack issues
Can use Nagios or similar for alerting
But why just alert when you can handle?
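A Munin plugin is just an executable: called with `config` it describes the graph, called without arguments it prints `field.value` lines. The sketch below shows that shape for per-VM CPU usage. It is not the ForthScale plugin set; the graph labels, the sample VM names and the `fetch_cpu_by_vm` placeholder are assumptions, and a real plugin would fill that function from CloudStack API data (for example via CloudMonkey).

```python
import re
import sys


def field_name(vm_name):
    """Munin field names may contain only letters, digits and underscores."""
    name = re.sub(r"[^A-Za-z0-9_]", "_", vm_name)
    return name if re.match(r"[A-Za-z_]", name) else "_" + name


def render_config(vm_names):
    """Build the output for the 'config' call: graph and field descriptions."""
    lines = [
        "graph_title CloudStack VM CPU usage",
        "graph_vlabel percent",
        "graph_category cloudstack",
    ]
    for vm in vm_names:
        lines.append(f"{field_name(vm)}.label {vm}")
    return "\n".join(lines)


def render_values(cpu_by_vm):
    """Build the output for a normal fetch: one value line per field."""
    return "\n".join(
        f"{field_name(vm)}.value {cpu}" for vm, cpu in cpu_by_vm.items()
    )


def fetch_cpu_by_vm():
    """Placeholder (assumption): returns fixed sample data.

    Replace with a real query against the CloudStack API.
    """
    return {"web-1": 42.0, "web-2": 17.5}


def main(argv):
    metrics = fetch_cpu_by_vm()
    if len(argv) > 1 and argv[1] == "config":
        return render_config(metrics)
    return render_values(metrics)


if __name__ == "__main__":
    print(main(sys.argv))
```

Because IaaS metrics land in the same Munin instance as OS and application graphs, correlating issues across the three levels becomes a matter of comparing graphs on one timeline.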
Handling - Orchestration
Issues can arise on any level
Better to use an orchestration tool that supports any
dynamic and any part of the stack.
Why? “Why not just use Ansible instead?”
Generate playbooks to:
Add or remove nodes
Change node sizes
Upgrade components or application
Add / remove nodes to / from arrays
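Playbook generation can be as small as building a data structure and serializing it. The sketch below emits a playbook as JSON, which is also valid YAML, to avoid any dependency outside the standard library. The module name `ngine_io.cloudstack.cs_instance` and its parameters are assumptions to verify against the CloudStack collection documentation for your Ansible version; the offering, template and zone defaults are hypothetical placeholders.

```python
import json

# Assumption: fully qualified name of the CloudStack instance module.
INSTANCE_MODULE = "ngine_io.cloudstack.cs_instance"


def scale_playbook(node_names, state="present",
                   service_offering="Medium Instance",
                   template="base-template", zone="zone1",
                   module=INSTANCE_MODULE):
    """Return a playbook (as a JSON string) that adds or removes nodes."""
    tasks = []
    for name in node_names:
        params = {"name": name, "state": state}
        if state != "absent":
            # Sizing details only matter when the node should exist.
            params.update(
                service_offering=service_offering,
                template=template,
                zone=zone,
            )
        tasks.append({
            "name": f"Ensure instance {name} is {state}",
            module: params,
        })
    play = {
        "name": "Scale nodes",
        "hosts": "localhost",
        "gather_facts": False,
        "tasks": tasks,
    }
    return json.dumps([play], indent=2)
```

Changing a node's size under this sketch means regenerating the playbook with a different `service_offering`; removing nodes means passing `state="absent"`.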
Scalable infrastructure that:
● Operates on true metrics.
● Efficient and accountable.
● Available, Scalable and Redundant.
Use scenarios - Vertica
Vertica is an SQL-compliant grid database.
1. Adding or removing nodes based on amount
of resources needed.
2. Changing node sizes based on amount of
3. Proactive policy for resource allocation.
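A proactive policy, as opposed to a reactive one, sizes the cluster ahead of demand using what monitoring has already recorded. The sketch below derives a per-hour node plan from historical samples. The sample format, the per-node capacity and the 25% headroom are hypothetical; nothing here is specific to Vertica.

```python
import math
from collections import defaultdict


def hourly_plan(samples, per_node_capacity, headroom=0.25, min_nodes=1):
    """Return {hour_of_day: node_count} sized for the peak seen in that hour.

    samples: iterable of (hour_of_day, demand) pairs from monitoring history
    """
    peak = defaultdict(float)
    for hour, demand in samples:
        peak[hour] = max(peak[hour], demand)
    return {
        hour: max(min_nodes,
                  math.ceil(p * (1.0 + headroom) / per_node_capacity))
        for hour, p in sorted(peak.items())
    }
```

The resulting plan can be applied on a schedule, so nodes are added before the daily peak arrives rather than after metrics have already degraded.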
Use scenarios - version release
A new version can also lead to changes in
computing power requirements.
Building a stage environment.
Executing a load test.
Comparing monitoring metrics.
Adjusting resources together with
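The comparison step can be reduced to a ratio: run the same load test against the current and the new version in stage, compare a resource metric, and scale the production node count accordingly. This is a rough sketch under stated assumptions: mean CPU is used as the only metric, scaling is treated as linear, and the 5% tolerance is an arbitrary illustrative value.

```python
import math


def adjusted_nodes(current_nodes, baseline_cpu, candidate_cpu,
                   tolerance=0.05):
    """Suggest a node count for the new release.

    baseline_cpu:  CPU samples from the load test on the current version
    candidate_cpu: CPU samples from the same test on the new version
    """
    base = sum(baseline_cpu) / len(baseline_cpu)
    cand = sum(candidate_cpu) / len(candidate_cpu)
    ratio = cand / base
    # Ignore differences small enough to be measurement noise.
    if abs(ratio - 1.0) <= tolerance:
        return current_nodes
    return max(1, math.ceil(current_nodes * ratio))
```

For example, if the new version shows 60% mean CPU where the old one showed 40% under the same load, a four-node array would be resized to six nodes before the release goes live.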