Separating data from code is just as important as it has always been for creating reusable code that can be conveniently configured for different situations. This has been possible to do in Puppet for quite some time, using Hiera and automatic parameter lookup. The new release of Hiera 5, introduced late in the Puppet 4.x series, brings new capabilities for data management. Data is no longer just global — it can be defined in an environment and inside a module. Plus, data integration no longer requires special backends — the point of integration is now a function. There are also new ways to reference data files. And there's so much more in Hiera 5. This talk introduces all the features of Hiera 5 now available in Puppet 5, and shows how they can be used. Integrators who want to write their own backends will also learn how to do that.
3. About Me
• Swedish
• Live on the island of Gozo, Malta
• Father of 3
• Author of the Puppet 4 Language, Hiera 5, the Puppet Type system, and Task Plans.
• On the Puppet Core team
helindbe @hel
4. Agenda
• What is Hiera?
• What does Hiera do?
• Authoring Data
• Differences between Hiera 3, (4), and 5
• Writing backends
12. “hiera” - the gem
“hiera” - the command line tool
“hiera” - the function
14. What Hiera is
• A key-value store abstraction with multiple, extensible backends (backend API).
• A key lookup resolution mechanism searching multiple key-value stores (~ query)
• A hierarchical data organization
• A data composition mechanism (defaults, override, merge, unique, etc.)
15. What do we use Hiera for?
• Explicit lookup
• Automatic Parameter Lookup (APL)
17. hiera 3 does this thing…
• Similar code in every backend - COPY PASTA - very hard to fix general things when the logic is in each and every backend.
• Easy to make mistakes and leak memory in a backend
• Global & static architecture - must restart after changes
• Global config pointing into environments - all environments must change at the same time - yeah right, when you have thousands of forever-changing environments…
• Search uses a cartesian product of levels and backends based on file suffix - lots of trickery and lots of file stats
• Different backend versions not supported in different environments.
• No explanation support - need to trace/debug - must restart server
• Use of dynamic variables (because of lack of suitable features) makes it impossible to have efficient caching to speed up performance.
• Has its own backend loading system
• Circular dependency on puppet - made it very hard to fix certain types of issues
19.
 | Hiera 3 (deprecated) | Hiera 5
where | a gem | in puppet
CLI | hiera | puppet lookup
hiera.yaml version | 3 | 5 (supports 3 and 4)
explicit lookup | hiera(), hiera_array(), hiera_hash() | lookup()
backend API | complicated | simple, using function API
APL options | no | lookup_options in data; explicit and APL the same!
explain support | no | --explain
advanced paths | no | globs, mapped paths
20. Explicit lookup
# get value, and…
# …fail if not present
$x = lookup('key')
# …verify data type
$x = lookup('key', Array)
# …return default if not present
$x = lookup('key', Array, 'first', ['blue'])
# …options hash
$x = lookup('key', { <options> })
21. Many Lookup Options
name - the key to look up
value_type - the return type to assert
merge - merge options
default_value - if not found, use this
default_values_hash - if not found, look up the key in this hash
override - look here first
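As a minimal sketch of how these options fit together, the call below passes everything, including the key name, in a single options hash. The key 'ntp::servers' and the fallback value are hypothetical and only there for illustration.

```puppet
# Hypothetical key and default value, for illustration only.
$servers = lookup({
  'name'          => 'ntp::servers',   # the key to look up
  'value_type'    => Array[String],    # assert the type of the result
  'merge'         => 'unique',         # combine arrays from all levels
  'default_value' => ['pool.ntp.org'], # used only if the key is not found
})
notice($servers)
```

Passing the name separately, as in lookup('ntp::servers', { ... }), should behave the same way.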
22. default via code block
$x = lookup('mykey') |$key| {
  # calculate the value (the block is called with the name of the key)
  compute_it($key)
}
23. Merge Behaviour
first - the first found (default)
unique - for Array (v3 = “array merge”)
hash - merge hash, highest prio key wins (no recursion)
deep - merge hash, recursive, higher prio wins on conflict, arrays made unique
24. Deep Options
knockout_prefix - string to match for removal (undef = no knockout)
sort_merged_arrays - sorts arrays (false)
merge_hash_arrays - if hashes in arrays should be merged (false)
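A sketch of a deep merge with some of these options set explicitly. The key 'site::users' and the '--' knockout prefix are assumptions chosen for the example, not values from the talk.

```puppet
# Hypothetical key; the merge hash selects the strategy and its options.
$users = lookup('site::users', {
  'value_type'    => Hash,
  'merge'         => {
    'strategy'          => 'deep',
    'knockout_prefix'   => '--',  # entries prefixed with '--' are removed
    'merge_hash_arrays' => true,  # also merge hashes found inside arrays
  },
  'default_value' => {},
})
```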
26. explicit lookup vs. APL
• APL = “Inversion of control” - “Push don’t Pull”
• Much easier to test
• Can be overridden!
• Parameterized classes are documented - your arbitrary keys are not…
• Use APL in your APIs
27. APL and options
• All lookup options can be set in the data!
• Control “deep merge” etc per key!
• Any backend can return a Hash for the key “lookup_options” with a map of “key” => <options-hash>
• All “lookup_options” are merged
• You can supply defaults in a module for example
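One way the "defaults in a module" idea might look, as a rough sketch: a data_hash function written in the Puppet language that returns both the module's data and a lookup_options entry. The module name mymodule, the function name mymodule::data, and the key mymodule::users are hypothetical, and the module's hiera.yaml is assumed to name this function under data_hash.

```puppet
# Hypothetical data_hash function shipped inside a module named 'mymodule'.
function mymodule::data(Hash $options, Puppet::LookupContext $ctx) {
  $data = {
    # Default value for a class parameter, picked up by APL.
    'mymodule::users' => { 'root' => { 'shell' => '/bin/bash' } },
    # Per-key lookup options: always deep merge this key.
    'lookup_options'  => {
      'mymodule::users' => { 'merge' => 'deep' },
    },
  }
  $data
}
```

Because the options travel with the data, both lookup('mymodule::users') and automatic parameter lookup of the class parameter would use the same merge behaviour.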
31. Global Layer
• For operational use
• Across all environments
• Overrides environment and modules
• Ok to use a deprecated version 3 hiera.yaml
• In Hiera 3, the only layer (with nasty tricks of referencing into each environment).
32. Environment Layer
• The typical place to store data
• Across all modules in the env
• Overrides modules
• Use a hiera version 5 hiera.yaml
33. Module Layer
• Regular hierarchy for overridable and merge-able values
• A default_hierarchy, consulted only when the key is not found in the regular hierarchy
• Only keys for the module’s namespace
• Must use a hiera version 5 hiera.yaml
45.
---
version: 5
defaults:  # Used for any hierarchy level that omits these keys.
  datadir: data         # This path is relative to hiera.yaml's directory.
  data_hash: yaml_data  # Use the built-in YAML backend.

hierarchy:
  - name: "Per-node data"                   # Human-readable name.
    path: "nodes/%{trusted.certname}.yaml"  # File path, relative to datadir.
                                            # ^^^ IMPORTANT: include the file extension!

  - name: "Per-datacenter business group data"  # Uses custom facts.
    path: "location/%{facts.whereami}/%{facts.group}.yaml"

  - name: "Global business group data"
    path: "groups/%{facts.group}.yaml"

  - name: "Per-datacenter secret data (encrypted)"
    lookup_key: eyaml_lookup_key  # Uses non-default backend.
    path: "secrets/%{facts.whereami}.eyaml"
    options:
      pkcs7_private_key: "/etc/puppetlabs/puppet/eyaml/private_key.pkcs7.pem"
      pkcs7_public_key: "/etc/puppetlabs/puppet/eyaml/public_key.pkcs7.pem"

  - name: "Per-OS defaults"
    path: "os/%{facts.os.family}.yaml"

  - name: "Common data"
    path: "common.yaml"

(Slide callout: each named entry under hierarchy is a level.)
46. Inside a layer
Levels are ordered from highest prio to lowest prio.
Each level can have multiple paths/globs etc. (backend only called if file exists).
47.
---
version: 5
defaults:  # Used for any hierarchy level that omits these keys.
  datadir: data         # This path is relative to hiera.yaml's directory.
  data_hash: yaml_data  # Use the built-in YAML backend.

hierarchy:
  - name: "Per-node, datacenter, and business group data"
    paths:
      - "nodes/%{trusted.certname}.yaml"
      - "location/%{facts.whereami}/%{facts.group}.yaml"
      - "groups/%{facts.group}.yaml"

  - name: "Per-datacenter secret data (encrypted)"
    lookup_key: eyaml_lookup_key  # Uses non-default backend.
    path: "secrets/%{facts.whereami}.eyaml"
    options:
      pkcs7_private_key: "/etc/puppetlabs/puppet/eyaml/private_key.pkcs7.pem"
      pkcs7_public_key: "/etc/puppetlabs/puppet/eyaml/public_key.pkcs7.pem"

  - name: "Defaults per os and common"
    paths:
      - "os/%{facts.os.family}.yaml"
      - "common.yaml"

(Slide callout: a level can list multiple paths.)
48. Different ways to reference data files / sources
Key (data type) - expected value
path (String) - One file path.
paths (Array) - Any number of file paths. This acts like a sub-hierarchy: if multiple files exist, Hiera searches all of them, in the order in which they’re written.
glob (String), globs (Array) - One (or several) shell-like glob patterns, which might match any number of files. If multiple files are found, Hiera searches all of them in alphanumerical order (ignoring the order in which multiple globs were given).
uri (String), uris (Array) - One (or several) URIs that are not checked for existence. One call to the backend is performed for every given URI.
mapped_paths (Array or Hash) - A fact that is a collection (array or hash) of values. Hiera expands these values to produce an array of paths.
Example: mapped_paths: [services, tmp, "service/%{tmp}/common.yaml"]
49. Tips and Tricks
• Have keys with ‘.’ in them?
Quote the key when looking it up to prevent the built-in “dig” behaviour:
lookup("'my.dotted.key'")
52. data_hash
Produces all of the key => value pairs at once as a Hash.
Good for small to moderate data volume and where most of the data is always used. Limitation: static in nature.
53. Reading a json file
function mymodule::myjson(Hash $options, Puppet::LookupContext $ctx) {
  $options['path'].file.parsejson()
}
54. lookup_key
Produces values per key - called multiple times.
Slightly more complex because of the added flexibility/power, but still not complicated to implement.
55. A “prefixer” added to our hiera.yaml
  - name: "Using example with prefix"
    path: "examples/%{trusted.certname}.yaml"
    lookup_key: mymodule::myjson_with_prefix  # the example function
    options:
      prefix: "Yo, Waldo! The value is: "
56. transforming values by key…
function mymodule::myjson_with_prefix(
  Variant[String, Numeric] $key,     # the key being looked up
  Hash                     $options, # the options from hiera.yaml
  Puppet::LookupContext    $ctx      # the context/helper
) {
  # Parse the file once; the parsed result is cached and reused for later keys.
  $hash = $ctx.cached_file_data($options['path']) |$content| {
    $content.parsejson()
  }
  case $val = $hash[$key] {
    String  : { "${options['prefix']}${val}" } # prefix string values
    NotUndef: { $val }                         # pass other values through
    default : { $ctx.not_found() }             # no value for this key
  }
}
57. data_dig
Like lookup_key, but is responsible for any digging into the key.
This matters because with lookup_key, lookup('users.jane_doe.pager_nbr') first fetches the entire 'users' value, which would be terrible if there are thousands of users…
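A minimal sketch of the shape a data_dig function could take, assuming it receives the dotted key already split into segments. The function name mymodule::users_dig is hypothetical, and for brevity this version still reads a JSON file named by 'path' in hiera.yaml; a real backend for thousands of users would more likely use the segments to query an external store for just the requested entry.

```puppet
# Hypothetical data_dig function; receives the key as an array of segments,
# e.g. ['users', 'jane_doe', 'pager_nbr'].
function mymodule::users_dig(
  Array[Variant[String, Numeric]] $segments, # the key, split on '.'
  Hash                            $options,  # the options from hiera.yaml
  Puppet::LookupContext           $ctx       # the context/helper
) {
  # Parse the file once and reuse the cached result.
  $data = $ctx.cached_file_data($options['path']) |$content| {
    $content.parsejson()
  }
  # Walk down the structure, one segment at a time.
  $value = $data.dig(*$segments)
  if $value =~ Undef {
    $ctx.not_found()
  }
  $value
}
```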
58. Puppet::LookupContext object
Key - expected value
not_found() - Immediately returns from the function and tells hiera there is no value for the key
interpolate(value) - Performs hiera-style interpolation on the given string value
environment_name(), module_name() - Produce information about the container whose hiera.yaml this function is part of
cache(key, value), cache_all(hash) - Add values to a cache
cached_value(key), cache_has_value(key), cached_entries() - Retrieve value(s) from the cache
cached_file_data(path) |$content| {...} - Reads and caches the contents of a file, or the transformed content of a file
explain() || { 'message' } - Emits an “explain” message if --explain mode is on
59. Ideas for backends
• DRY up data - use lookup inside backend to compose values from lookups - earlier not possible for arrays and hashes.
• Computed values - given input from hiera.yaml (maybe even a key), other values can be derived
• Provide different data sets (in a module; for example “standalone” vs. a “client server” configuration) that can be integrated. While a module cannot directly supply global keys, it can provide the data/backend that does so if added to the env’s hiera.yaml!
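The first idea, composing a value from other lookups, could be sketched as a lookup_key function like the one below. The function name and all three keys are hypothetical; the sketch only assumes that lookup() may be called from inside a backend function.

```puppet
# Hypothetical lookup_key function that derives one key from two others.
function mymodule::composed(
  Variant[String, Numeric] $key,     # the key being looked up
  Hash                     $options, # the options from hiera.yaml
  Puppet::LookupContext    $ctx      # the context/helper
) {
  case $key {
    'mymodule::all_packages': {
      # Compose an array value from two other keys.
      $base  = lookup('mymodule::base_packages', Array[String], 'unique', [])
      $extra = lookup('mymodule::extra_packages', Array[String], 'unique', [])
      $base + $extra
    }
    default: {
      # Every other key is left to the remaining hierarchy levels.
      $ctx.not_found()
    }
  }
}
```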
62. 3 vs 5 - deprecated bad magic
• Nothing good came from using these hiera 3 magic variables:
$calling_module
$calling_class
$calling_class_path
• Hiera 3 could use these as a hacky predecessor of module data, but anything you were doing with them is better accomplished with the module layer. You can continue using these in a version 3 hiera.yaml file, but you’ll need to remove them once you update your global config to version 5.
• If they were used to split up data into multiple files (per module etc.), use the ‘glob’ pattern instead.
63. 3 vs 5
• Use lookup() instead of hiera_xxx()
• Use lookup() + include() instead of hiera_include()
• Use lookup CLI instead of hiera CLI
• No global merge/deep-merge setting (was: horrible!) - use lookup options.
• Move to using hiera 5 backends!
• The ‘data binding terminus’ (advanced hackery) is no longer used - write a backend instead
• Hiera 5 is faster (largely thanks to caching) and has a greatly reduced risk of memory leaks due to mistakes in backends
• You can call lookup() from within backend functions! Can do what hiera 3 alias never could (hiera 3 - limited to strings).
• The lookup_key function opens up for advanced data composition - merge multiple (different) keys into one etc.
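For the first two bullets, the replacements might look roughly like this. The key names are hypothetical, and the merge arguments shown are the ones that most closely match the default behaviour of the old functions.

```puppet
# hiera('some_key')
$value    = lookup('some_key')

# hiera_array('some_list')
$list     = lookup('some_list', Array, 'unique')

# hiera_hash('some_settings')
$settings = lookup('some_settings', Hash, 'hash')

# hiera_include('classes')
lookup('classes', Array[String], 'unique').include
```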