This presentation outlines the need to expand the notion of continuous delivery to encompass operational excellence. It discusses how a cloud-native platform can automate and simplify many operational concerns and what desirable properties such a platform should possess. The presentation concludes with a brief discussion of Pivotal Cloud Foundry.
Introduction
Describe Pivotal
Describe Cloud Foundry – near the end we will discuss it
Why you are here – You’ve learned about CI and CD. You’ve learned about why it’s important and why you need to do it. You’ve learned how to do it. However, I am here to challenge you to think about what CD really means; to introduce you to new questions and hopefully some answers. CD should expand to include operational issues.
Andreessen - Software is eating the world
Jamie Dimon – Silicon Valley is coming – examples of unicorns – Twitter, Uber, Facebook…
Come out of nowhere
Can deliver at exceptional pace and quality – disrupt industries
Rare breed
Disrupting industries
Everybody wants to be one
Dave McClure – Some Unicorns may be overvalued, but all Dinosaurs Gonna Die
Unicorns overvalued
Dinosaurs risk extinction
Reality
Reality is somewhere in between
Realization – the pace of commerce has accelerated and it is driven not just by unicorns but by technology – the haves and the have nots. Technology, and not just talent available to unicorns, can let your competitors move faster. So you need to learn to move fast…and invest in technology.
Many of us are trying to evolve. That’s why we’re here…to learn to move faster and with better quality
Clear outcome: An imperative to evolve – We must learn to deliver continuously
What does it mean to deliver continuously?
Compile code continuously?
Unit test continuously?
Integration test continuously?
Push bits continuously?
Then what? Can you operate your service to truly deliver?
True Delivery – Speed of Business – Karl Malone
How do we get there
Delivery is not about passing the ball…getting close to the hoop…or taking the shot…
It is about getting the ball in the hoop
Karl Malone delivered. He could be counted on. He was reliable…he delivered.
Delivery is not about pushing the bits…
Delivery is about customer success
We cannot make customers buy or use our product…but we can ensure that our product is available to those who want to buy or use…
That goes a step beyond deployment…
Day 0 – Develop Code
Day 1 – Deploy Code
Day 2 – Maintain and Operate
There is an impedance mismatch – you automate the pipeline until deploy and then what?
You deploy your bits where? Do you want to spray your bits across machines…and pray? What’s next? What are the issues you can expect to encounter?
You might think your job is done after deployment…
Do you still think in terms of load balancer, VMs, in-memory grids, clusters and failover?
Availability
What happens when an app crashes? Does everything go down? Even unrelated parts?
How quickly do you recover?
What happens if you lose infrastructure? Does someone have to spin up a new VM? How long does that take?
Scaling
What do you do if there is a spike in load? Scale up? Scale out? Allocate a server? VM? Who does it? How long does it take? Is it automated?
What is the unit of scale? VM? Container? Microservices? Is it granular? Does it matter?
Do you scale the whole app or just the part that is experiencing load?
What do you do when workload varies by the hour? Minute? Second? One tweet can lead to a frenzy of requests. Who/how does the capacity spin down?
Do you still do manual “capacity planning”?
Manual capacity planning and re-architecture
Manual integration and rebalancing with the rest of the infrastructure, e.g. load balancers
That’s so 20th century
Security
External: Surface area of attack, DDoS
Between applications?
Internal: Employee? Intentional or unintentional; lack of security policies, roles, etc.
Is it consistently applied?
Updates
Can you update without downtime?
How do you fix bugs quickly?
What do you do if your fix introduces new problems? Can you roll back quickly?
How do you see if a new feature is getting traction with just a few users?
How do you know if one layout on your site leads to better customer returns than another?
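The update questions above (release to a few users, then roll back if the fix misbehaves) can be sketched in a few lines. This is an illustrative sketch only, not how any particular platform does it: the function names, the version labels "v1"/"v2" and the 5% error threshold are all assumptions made for the example.

```python
import hashlib


def bucket(user_id: str, buckets: int = 100) -> int:
    """Map a user id to a stable bucket in [0, buckets)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def choose_version(user_id: str, canary_percent: int) -> str:
    """Send a stable slice of users to the new version (hypothetical labels)."""
    return "v2" if bucket(user_id) < canary_percent else "v1"


def should_roll_back(errors: int, requests: int, max_error_rate: float = 0.05) -> bool:
    """Signal a rollback when the new version's error rate exceeds the threshold."""
    if requests == 0:
        return False  # no traffic yet, nothing to judge
    return errors / requests > max_error_rate
```

Because the bucket is derived from a hash of the user id, the same user keeps seeing the same version while the canary percentage is unchanged. The same split can serve an A/B comparison of two layouts.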
Operating without a platform is like bringing a knife to a gun fight
Many of these problems arise because a single point of control is lacking. The solutions to them are custom, non-reproducible, manual and low-level. A platform is necessary to abstract, standardize, automate and reduce redundant work. A well designed platform can address many of these concerns and, even where it does not address a particular issue, provides a centralized control point for future enhancements. A well designed platform can provide the following benefits:
Automation – alleviates manual configuration, fire-fighting, etc.
Standardization – creates a single tested, reproducible process that gives confidence, and also enables sharing of workload by eliminating custom solutions for each group.
Collaboration Platform for DevOps – Creates a single platform that enforces the same process, tools, primitives and vocabulary for dev and ops teams, making collaboration possible with less friction
Ease with Extensibility – Makes common things simple while allowing special cases to still be supported (since every environment is unique).
Encapsulation – Placeholder for Reduction of Concerns; A single place where all concerns that need to be handled in a general manner can be encapsulated to reduce redundancy and create repeatability and confidence.
Elevation of Concerns – abstracts away some of the infrastructure; stop thinking about plumbing (VMs, app servers, DB, OS, etc.) and think about the units of value, e.g. apps and services. A platform makes it easy to deploy and operate the primitives and abstractions that are central to the prevailing philosophy of software design, architecture and development at the time.
Software platforms are developed to support architectural philosophies – these philosophies become the design points for platforms
It is desirable to support these philosophies. Which emergent philosophies have enough consensus to serve as design points?
Two architectural approaches are critical in a cloud-native world: Microservices and the 12-Factor App
Define Microservices (slide)
A Microservices approach has multiple ramifications
Development – an application can be developed as multiple decoupled, independent units with well defined interfaces/contracts. Behind those contracts, teams work in parallel and make design choices unconstrained by others, based on what is best for the service being designed. This yields lower application complexity, faster time to market, a smaller testing surface, more stability and tremendous freedom of choice w.r.t. implementation. It also promotes robustness in application design, since you can replace one service with another as long as the interface is consistent. It also allows new functionality to be released faster, as services update with independent cadences.
Operations – Operations teams can operate at a level of service granularity rather than VMs, load balancers, etc. Ease of comprehension; Services can be independently scaled, secured, updated, etc.
Organizations – Teams can be organized as service teams, which lets them release features at different paces so that no feature is held back by the slowest, most complex components.
A cloud-native platform should make it easy to work with Microservices.
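The point about replacing one service with another behind a consistent interface can be illustrated with a small sketch. Everything here is hypothetical: the PricingService contract, both implementations and the 10% bulk discount are invented for the example, and a real microservice would sit behind a network endpoint rather than a Python class.

```python
from typing import Protocol


class PricingService(Protocol):
    """The contract: callers depend only on this method signature."""

    def quote_cents(self, sku: str, quantity: int) -> int: ...


class FlatPricing:
    """One implementation: a fixed unit price for every SKU."""

    def __init__(self, unit_price_cents: int) -> None:
        self.unit_price_cents = unit_price_cents

    def quote_cents(self, sku: str, quantity: int) -> int:
        return self.unit_price_cents * quantity


class BulkDiscountPricing:
    """A replacement: same contract, 10% off for ten or more units."""

    def __init__(self, unit_price_cents: int) -> None:
        self.unit_price_cents = unit_price_cents

    def quote_cents(self, sku: str, quantity: int) -> int:
        total = self.unit_price_cents * quantity
        return total * 90 // 100 if quantity >= 10 else total


def order_total_cents(pricing: PricingService, sku: str, quantity: int) -> int:
    """The caller is unchanged whichever implementation is supplied."""
    return pricing.quote_cents(sku, quantity)
```

Swapping FlatPricing for BulkDiscountPricing requires no change to order_total_cents, which is the property that lets services evolve on independent cadences.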
Need Backing Services – applications need the extensibility to consume new services, whether native or external, single-tenant or shared
Apps today demand rapidity, simplicity, reliability and scalability of deployment. Collectively, these features are equated with being a “cloud-native” application because they capture the best qualities of cloud: simplicity, speed, scale and reliability. The 12-Factor App approach is a collection of emergent best practices that facilitate the creation of cloud-native applications. Thus, it would be desirable for a platform to make it easy to deploy and manage 12-factor apps.
12-Factor App – some of the best practices can be translated into platform features and workflow requirements
Backing Services – Services
Processes – Containers
Dev/Prod Parity – Inherent feature; can deploy both development and production workloads; same tooling and behavior.
Build/Release/Run – enforced via buildpack, droplet staging, and cloud controller/health monitor model
If you do develop such a platform, it yields support for cloud-native apps that are easily scalable and reliable because processes are independent, stateless and disposable/fungible
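From the application side, two of these practices (backing services attached through configuration, and dev/prod parity) can be sketched as follows. This is a minimal sketch under stated assumptions: the variable names DATABASE_URL and PORT and the default port 8080 are chosen for illustration, not taken from any platform's documentation.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str  # locator of the attached backing service
    port: int          # port the process should listen on


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read all deploy-specific values from the environment, never from code.

    DATABASE_URL and PORT are assumed names for this example.
    """
    return Settings(
        database_url=env["DATABASE_URL"],
        port=int(env.get("PORT", "8080")),
    )
```

The same code runs unchanged in development and production; only the environment differs. Because the process holds no deploy-specific state, any instance can be discarded and replaced by another.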
Must support various languages with ease
Must support various runtimes with ease
Not just current technologies – but a framework that can support future technologies through extension
Rare is the scenario that we need to develop once and never move.
Some components are better off in one place whereas others are better off elsewhere, e.g. due to latency.
New dependencies are created
Better platforms come up
Some workloads are better in a public cloud; Others are better on premise
Do we want to deal with migration friction?
Availability - The need to avoid downtime is critical. And it must be supported at many levels. Platform enabled availability – platform should detect loss and take remedial action (directly or indirectly by informing). Levels: AZ, containers, VMs, health monitor redundancy
Adaptive Scaling – Variable workloads cannot be predicted, and hence scalability should be adaptive rather than prescriptive. Avoid pre-provisioning. Scale fast. The platform should detect demand in real-time and intelligently provision additional resources. Similarly, it should “spin down” when no longer required. Thus the platform can pool resources to achieve maximum utilization across multiple workloads.
Security – The platform should provide a means to secure the apps from multiple threats, e.g. external threats (reduce surface area of attack), from other apps (isolation), from users (security policies, roles, etc.).
Update Management – The platform should make it easy to deploy new functionality incrementally to a few users, while still enabling reversion to a prior version. The platform should have rich monitoring capabilities, e.g. logs, performance, etc.
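The availability and adaptive scaling properties above share one idea: the platform compares the desired state with the observed state and corrects the difference. A rough sketch follows, with all names, the per-instance capacity figure and the instance limits being assumptions made for illustration rather than any platform's actual algorithm.

```python
import math


def desired_instances(
    requests_per_second: float,
    capacity_per_instance: float,
    min_instances: int = 1,
    max_instances: int = 20,
) -> int:
    """Size the app to current demand, clamped to the allowed range."""
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


def reconcile(desired: int, healthy: int) -> int:
    """Return how many instances to start (positive) or stop (negative)."""
    return desired - healthy
```

A crashed container and a spike in load are handled by the same loop: both show up as a gap between desired and healthy instances. When demand falls, the negative result is the "spin down" signal.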
A platform should understand and support an organizational context comprised of multiple, collaborating organizations/users with different requirements for security, resources, etc.
Support multi-tenant context
Notion of workspaces, roles that can support chargebacks…
Encapsulate and enforce a framework – A single platform with multiple deploys allows standardization and repeatability, which are valued in organizational contexts. A focal point for enterprise workflows.
Handle multiple contexts with ease and elegance – allow the platform to adapt to different requirements in different contexts using a framework that supports extensibility, e.g. buildpacks
Given our increasing focus on software delivered as a service, DevOps has assumed a greater importance. A single, well-defined platform should support DevOps by:
Multiple deploys – supporting multiple instances/deploys (dev/test/prod) of an application so it can move through CI/CD pipeline
Support Dev/Prod parity – so there is less translation friction and delay in moving from dev to production
Encourage shared culture – if everyone uses the same tools and is constrained by the same rules, then they are likely to share the same processes, culture and vocabulary to make DevOps collaboration easier
Could you build a platform with these capabilities? Probably. But, do you want to? Some questions to consider:
Is building a platform your business?
How much money do you have?
How lucky do you feel?
And most importantly how much time do you have?
Unicorns did it out of necessity…are you in the same situation? Not really…there’s this thing called Cloud Foundry. It represents:
iPhone experience
A Cloud-Native Platform
An opinion (opinionated cloud platform)
Open source software offers true community-driven outcomes.
Cloud Foundry is guided by a foundation with leading companies as members.
Cloud Foundry Foundation has 40 (and growing) companies that participate in the community with a vested interest in a platform that continues to grow and meet customer needs, now and in the future.
Discuss 1 or 2 customer stories. See CF Summit 2015 videos.
Humana (Digital Experience Center)
Risk averse healthcare insurer, 50+ years old
2 months to obtain server provisioning
Delivered Apple Watch app within 5 weeks by team of 4; launched on day of Apple Watch launch; for all users, not just customers; listed in MacWorld top 20 apps alongside Uber, Evernote (unicorns)
“Cue by Humana” app – rethinking their own industry; reminds you to do simple, healthy things, e.g. drink water, walk around, etc.
Allstate
Risk averse company
Went from deploying servers in 100 days to minutes of self-serve
Release fast, fix bugs fast
Competing with Google
Not a cost argument; it’s an opportunity argument
Can now go global overnight, not just be U.S. focused. No longer technology constrained. Technology drives the business and empowers it to pursue new opportunities