1. FROM ZERO TO DELIVERY IN A LARGE ENTERPRISE
PUPPET CAMP LONDON
ALAN SCHWARZENBERGER (Head of Engineering – Tools & Automation)
CHRIS SPENCE (All round nice guy)
17th NOVEMBER 2014
2. Outline - Background
•Scale
–many applications
–each application has multiple contributing groups
–many data centres around globe
–many servers (>10k)
•Complexity
–many OS image variants
–competing solutions from different teams
–most products are 24x7
•Organisational tension
•Previous failed attempts (not puppet)
3. Outline – Why & Aims
•Why?
–Simplify delivery to decrease cost, effort and time
–Improve quality & transparency of change implementation
–Standardise the way that change is performed
–Introduce common workflows
•Aims
–Automated deployments & configuration management
–Standard open source tools
–Replace non-scalable legacy tools
–Provide consistency from dev through to production
–Modular and loosely coupled
–Self service
–Single scaled platform
4. Outline – Starting point
•Previous toolsets
–didn't scale, hard to fit into workflows, hard to use
•Small pockets of Puppet, Chef, Rundeck etc
•Greenfield site for Puppet
•Other initiatives informed solution
–data aggregator
–release management UI
•External integrations with other tools using REST APIs
•Workflow implied by overall data model and business structures
•Multi-tenant problems
5. Design & Architecture
•Puppet master solution, with Hiera and External Node Classifier
•Resilient, horizontally scaled (mix of DNS, F5, Apache)
•MoM, Compilers, Puppet DB, Puppet Dashboard
•Directory Environments
•Separate artifact distribution with caches around the globe
–This could be a talk all on its own!
–It turns out distributing things is hard.
•Servers modelled in Hiera, templated configuration
    puppetmaster::servers:
      puppetmaster: upg-dev-moma.amers1.ciscloud
      servers:
        compiler1:
          group: compilers
        compiler2:
          group: compilers
        mom1:
          group: mom
          primary: true
        mom2:
          group: mom
6. Filling in the missing bits
•Hiera at scale
•Resilient CA
•Handling server-side Ruby
•Resilient & scaled PuppetDB
•Merging Hiera from multiple contributors into a directory environment
•Aggregating puppet code from multiple contributors for one stack
•How do you release?
–Components and Component Versions
–Component Stacks, Component Stack Versions
–Deployment Groups
7. Filling in the missing bits – Hiera at scale
•Don’t put every value into Hiera
–consider performance implications
–keep values tightly coupled to code in the code
•Use for what changes in different builds
•Our chosen hierarchy:
    ---
    :hierarchy:
      - "host/%{::fqdn}"
      - "server_type/%{server_type}/server_environment_class/%{server_environment_class}/deployment_location_code/%{deployment_location_code}"
      - "server_type/%{server_type}/server_environment_class/%{server_environment_class}"
      - "server_environment_class/%{server_environment_class}/deployment_location_code/%{deployment_location_code}"
      - "server_environment_class/%{server_environment_class}/server_environment_class_instance/%{server_environment_class_instance}"
      - "server_type/%{server_type}"
      - "deployment_location_code/%{deployment_location_code}"
      - "server_environment_class/%{server_environment_class}/region/%{region}"
      - "server_environment_class/%{server_environment_class}"
      - default
    :backends:
      - yaml
    :yaml:
      :datadir: "/platform/puppet/hieradata/%{::environment}"
    :merge_behavior: deeper
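To make the lookup order concrete, here is a minimal Ruby sketch of how a hierarchy like this turns into an ordered list of data files for one node. It is an illustration only, not Hiera's own code: `resolve_hierarchy`, the shortened `HIERARCHY` list and the sample facts are hypothetical, and skipping a level when a fact is missing is a simplification.

```ruby
# Hypothetical helper: expand "%{var}" tokens in hierarchy entries using
# a node's facts, and skip any level that needs a fact the node lacks.
HIERARCHY = [
  'host/%{::fqdn}',
  'server_type/%{server_type}/server_environment_class/%{server_environment_class}',
  'server_type/%{server_type}',
  'deployment_location_code/%{deployment_location_code}',
  'default'
].freeze

def resolve_hierarchy(hierarchy, facts)
  hierarchy.map do |entry|
    missing = false
    # Replace each %{name} or %{::name} token with the matching fact.
    path = entry.gsub(/%\{(?:::)?(\w+)\}/) do
      value = facts[Regexp.last_match(1)]
      missing = true if value.nil?
      value.to_s
    end
    missing ? nil : "#{path}.yaml"
  end.compact
end

# Sample facts (assumed values, not from the talk).
facts = {
  'fqdn' => 'web01.example.com',
  'server_type' => 'web',
  'server_environment_class' => 'prod'
}
puts resolve_hierarchy(HIERARCHY, facts)
```

The most specific file (the per-host one) comes first and `default` last, which is the order in which values are searched.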
8. Filling in the missing bits – Resilient CA
•Building a resilient pair of CA servers
–Uses the Puppet CA
–You can’t just rsync!
–CAsync!
•Copies certificates
•Merges the inventory files
•Works out latest CRL
–Don’t forget the serial number offset!
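The slides do not show how CAsync works internally, so the following is only a rough Ruby sketch of the "merge the inventory files" step. It assumes each inventory line starts with a hexadecimal serial followed by validity dates and the subject, and that the second CA issues serials from a high offset so the two never collide. `parse_inventory`, `merge_inventories` and the sample data are hypothetical.

```ruby
# Assumed line format: "<hex serial> <not_before> <not_after> <subject>".
# Returns a hash of serial number (Integer) => original line.
def parse_inventory(text)
  lines = text.each_line.map(&:strip)
  lines = lines.reject { |line| line.empty? || line.start_with?('#') }
  lines.map { |line| [line.split.first.hex, line] }.to_h
end

# Union of both inventories, de-duplicated by serial, sorted by serial.
def merge_inventories(a_text, b_text)
  merged = parse_inventory(a_text).merge(parse_inventory(b_text))
  merged.sort.map { |_serial, line| line }
end

# CA A issues low serials; CA B (hypothetically) starts at 0x10000.
ca_a = <<~INV
  0x0001 2014-11-17T09:00:00GMT 2019-11-16T09:00:00GMT /CN=ca
  0x0002 2014-11-17T09:05:00GMT 2019-11-16T09:05:00GMT /CN=web01.example.com
INV
ca_b = <<~INV
  0x0001 2014-11-17T09:00:00GMT 2019-11-16T09:00:00GMT /CN=ca
  0x10001 2014-11-17T09:07:00GMT 2019-11-16T09:07:00GMT /CN=db01.example.com
INV

puts merge_inventories(ca_a, ca_b)
```

Without the offset, both CAs could issue serial 0x0002 to different hosts, and a merge keyed on serial would silently drop one of them.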
9. Filling in the missing bits – Server-side Ruby
•Server-side Ruby is immutable – only one version!
–functions, types, providers, core Puppet modules (including Forge modules)
–Directory Environments – use basemodulepath for the server-side Ruby
–Multiple teams must contribute server-side Ruby to a single place
10. Filling in the missing bits – Scaling PuppetDB
•PuppetDB with read slaves
•Puppet Labs modules with a wrapper class
•Set up PostgreSQL write-ahead log (WAL) archiving
11. Filling in the missing bits – Merge & Aggregate
•Merging Hiera from multiple contributors into a directory environment
–hieramerge: deep-merges hashes
–right to left priority wins
•Aggregating puppet code from multiple contributors for one stack
–Components, component stacks and deployment groups
    foo = HieraMerge.new
    foo.dirs = [
      "/base/module/path/hieradata",
      "/thing/otherthing/1.6.6.6/hieradata",
      "/thing/penguinthing/1.2.3/hieradata"
    ]
    foo.target = "/out/put/hieradata"
    foo.merge
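The deep merge itself can be sketched in a few lines of Ruby. This is not the hieramerge source: `deep_merge` and the sample layers are hypothetical, and the sketch assumes "right to left priority" means that directories listed later override earlier ones.

```ruby
# Recursively merge two hashes. Where both sides hold a hash, merge
# key by key; otherwise the right-hand value replaces the left-hand one.
def deep_merge(left, right)
  left.merge(right) do |_key, l, r|
    l.is_a?(Hash) && r.is_a?(Hash) ? deep_merge(l, r) : r
  end
end

# Each layer stands in for the parsed YAML from one contributor's
# hieradata directory, listed left to right.
layers = [
  { 'ntp' => { 'servers' => ['a'], 'panic' => false }, 'motd' => 'base' },
  { 'ntp' => { 'servers' => ['b'] } },
  { 'motd' => 'penguin' }
]

merged = layers.reduce({}) { |acc, layer| deep_merge(acc, layer) }
p merged
```

Note that arrays are replaced rather than concatenated here; whether hieramerge does the same is not stated in the slides.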
12. How do you release?
•How do you release software?
–Components – what we release, e.g. MyWebApp
–Component Versions – a point version MyWebApp-1.3.37
•That’s probably pretty normal
•But we’re releasing multiple apps at a time from different groups
–Component Stacks – MyWebApp + Monitoring
–Component Stack Versions – MyWebApp 1.3.37 + Monitoring version 9
13. Components
•Components
–discrete sets of functionality
–informed by organisation structures & application responsibilities
•Component Versions
–a release, published version of a discrete set of functionality
–expected to be self contained without external dependencies other than on the base module
–packaged into an RPM and become immutable (credit to Jordan Sissel for FPM)
–never deployed on their own unless they are put in a component stack
    {
      component: "winning",
      versionvalidated: "true",
      componentreleaseversion: "1.3.37",
      packageurl: "http://gitserver.mcgitserver/project/winning.git"
    },
    {
      component: "losing",
      versionvalidated: "false",
      componentreleaseversion: "6.6.6",
      packageurl: "http://gitserver.mcgitserver/otherproject/losing.git",
      versionvalidationerror: "failed to download modules"
    },
14. Component Stacks & Deployment Groups
•Component Stacks
–one or more components which together describe the desired overall state of a node
•Component Stack Version
–A release, published version of code, RPM packaged
–rpm contains:
•the hieradata, merged with hieramerge
•an environment.conf
•RPM dependencies to component versions
–at install time use yum for dependency resolution
–reuse of different component versions across stacks
–component stack version becomes a Puppet Directory Environment
•Deployment Group
–nodes that need same component stack version
–nodes that will be upgraded together (e.g. A/B side)
–ENC integration – deploymentgroup controls which component stack version a node gets
    {
      version: "9.1.1",
      versionvalidated: "true",
      componentstackname: "nigel",
      componentversion: [
        {
          component: {
            component: "allyourbase",
            versionvalidated: "true",
            componentreleaseversion: "22.7",
          },
        },
        {
          component: {
            component: "arebelongtous",
            versionvalidated: "true",
            componentreleaseversion: "3.141",
          },
        },
      ],
    }
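The ENC integration could look roughly like the Ruby sketch below: the node's deployment group decides which component stack version, and therefore which directory environment, the node receives. The lookup tables, the `classify` helper and the environment naming scheme (`nigel_9_1_1`) are assumptions for illustration; in the real system this data would come from the data layer.

```ruby
require 'yaml'

# Hypothetical stand-ins for the data layer.
DEPLOYMENT_GROUPS = { 'web01.example.com' => 'web-a-side' }.freeze
STACK_VERSIONS    = { 'web-a-side' => 'nigel_9_1_1' }.freeze

# Build the ENC answer for one node. Puppet expects YAML that may carry
# "classes", "parameters" and "environment" keys.
def classify(certname)
  group = DEPLOYMENT_GROUPS[certname]
  # Unknown nodes fall back to an assumed default environment.
  environment = STACK_VERSIONS.fetch(group, 'production')
  {
    'classes' => {},
    'environment' => environment,
    'parameters' => { 'deployment_group' => group }
  }
end

# Puppet calls the ENC with the node's certname as the only argument.
puts classify(ARGV[0] || 'web01.example.com').to_yaml
```

Moving a deployment group to a new stack version is then a data change only: every node in the group picks up the new environment on its next run.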
15. Build workflow
•Jenkins build server in full control
–Remove deploy-time dependency on version control system & build server
–Enforces immutability at point in time
–BYO version control system (gitlab/github/svn)
•driven from data in the data layer
•we support reading from git directly (r10k)
•we support people doing their own thing and giving us an artifact (tar/zip)
16. What’s the process?
•Workflow
–Component version built into rpm
–Component versions built into a stack rpm with merged hieradata
–Component stacks installed on puppetmasters using puppet
–Stateful description of all versions of all apps/states through data layer
–Validated/deprecated releases are marked
–Data available to puppet master, so we can install and purge stacks using puppet and yum
•We are treating Puppet releases as Software releases
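As an illustration of the "stack RPM with dependencies on component versions" step, the Ruby sketch below assembles an fpm command line. The package naming, the `stack_rpm_command` helper and the `build/nigel` source directory are hypothetical; only the fpm options shown (`-s`, `-t`, `--name`, `--version`, `--depends`) are standard ones. The command is built and printed, not executed.

```ruby
require 'shellwords'

# Build an fpm argument list for a component stack RPM whose RPM
# dependencies pin exact component versions, so yum can resolve them.
def stack_rpm_command(stack, version, components, source_dir)
  deps = components.flat_map do |name, ver|
    ['--depends', "#{name} = #{ver}"]
  end
  ['fpm', '-s', 'dir', '-t', 'rpm',
   '--name', stack, '--version', version,
   *deps, source_dir]
end

cmd = stack_rpm_command(
  'nigel', '9.1.1',
  { 'allyourbase' => '22.7', 'arebelongtous' => '3.141' },
  'build/nigel'
)

# Print a shell-safe rendering; system(*cmd) would actually run it.
puts Shellwords.join(cmd)
```

Because each stack RPM only depends on its component RPMs, the same component version can be shared by several stacks installed on one master.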
17. Other interesting things
•Standard internal Travis CI-like per-repo testing
•Self-service development for groups
–managed dev master pinned to production release
–has own CA, so downstream is independent
–no ENC
–iterate on published componentstackversions
–quick iteration/self-service/pre-publish
•Custom reporting events into the data layer
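    require 'yaml'

    # Load per-repo test settings; fall back to defaults if the file is absent.
    begin
      config = YAML.load_file('.config.yaml')
    rescue Errno::ENOENT
      config = {}
    end

    # Use the configured command(s), or lint and syntax checks by default.
    command = config['command'] || 'rake lint && rake syntax'

    # Run each check in turn and stop at the first failure.
    [command].flatten.each do |check|
      system(check)
      exit(1) unless $?.exitstatus == 0
    end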
18. Future and next steps
•Adoption – going from 200 to >10000 nodes
•Reporting into data layer from PuppetDB for a custom dashboard
•Making component dependencies explicit (they are currently hidden)
•User experience of building stacks needs improving
•Fully automated acceptance testing
–rspec-puppet gives some guarantees
•Open sourcing interesting, novel & useful things
–casync
–hieramerge
•Resolving split responsibility between OS and app
19. Contact us
Alan Schwarzenberger alan.schwarzenberger@thomsonreuters.com Or via LinkedIn
Chris Spence github.com/fiddyspence chris.spence@thomsonreuters.com Or via LinkedIn