Best practice for locating data in hiera vs. manifest

asked 2015-05-29 03:01:00 -0500

maxwell

updated 2015-05-31 13:05:32 -0500

ramindk

We're completely refactoring our manifests for the 3rd time after studying Gary Larizza's excellent blog post series. We've adopted the roles/profiles pattern and every other best practice we could pick up. However, we've ended up storing data inconsistently between hiera and the profile manifests. Puppet provides a huge amount of rope to hang oneself with in this area. For example, we did the following:

  1. We have a profile::base class that collects all crontab entries across all servers i.e. create_resources(cron, $crontabs)
  2. Similarly we have a profiles::apache class that creates vhosts i.e. create_resources(apache::vhost, $apache_vhosts)
  3. Similarly we have a profile::mysql class that passes a hash of users+grants+databases from hiera to the mysql::server class.
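A minimal sketch of what item 2 might look like. Only create_resources, apache::vhost and $apache_vhosts come from the list above; passing the hash in as a class parameter, the Hiera key in the comment and the example vhost values are all assumptions for illustration.

```puppet
# Hypothetical thin profile: every vhost definition lives in Hiera.
# Assumed Hiera data, found via automatic data binding (Puppet 3):
#
#   profiles::apache::apache_vhosts:
#     'www.example.com':
#       port: 80
#       docroot: '/var/www/www.example.com'
#
class profiles::apache (
  $apache_vhosts = {},
) {
  include ::apache

  # One apache::vhost resource is declared per key in the hash.
  create_resources('apache::vhost', $apache_vhosts)
}
```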

This results in thin profile manifests and fat hiera files. It feels as if our data is split inconsistently across the profile manifests and hiera. In 2013 Gary wrote that the only data that should go into hiera is:

  1. Business-specific data (i.e. internal NTP server, VIP address, per-environment java application versions, etc…)
  2. Sensitive data
  3. Data that you don’t want to share with anyone else

This implies that we're severely abusing create_resources and that we should actually have 95% of the data in the profile manifests. When setting up a webserver, instead of the profile manifest explicitly declaring the required vhosts using apache::vhost, we just include profile::apache. So by looking at the manifest you actually have no idea how the profile is set up and have to consult hiera (which means checking through the whole hierarchy).
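For contrast, the explicit style described here might look like the sketch below. The $servername parameter and its example value are hypothetical; the point is only that the manifest itself shows which vhost the profile sets up, with Hiera supplying the site-specific value.

```puppet
# Hypothetical profile that declares its vhost explicitly.
# Only the site-specific server name would be overridden from Hiera
# (key 'profiles::apache::servername' via automatic data binding).
class profiles::apache (
  $servername = 'www.example.com',
) {
  include ::apache

  apache::vhost { $servername:
    port    => 80,
    docroot => "/var/www/${servername}",
  }
}
```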

Can anyone using or familiar with the roles/profiles pattern comment on how they are using hiera, i.e. are you explicitly defining resources in the manifest or are you using create_resources wherever possible?

3 Answers

answered 2015-05-31 13:11:36 -0500

ramindk

updated 2015-05-31 13:16:42 -0500

You're taking a very narrow interpretation of what business-specific data is. It's everything that makes your system different from mine. You might find it simpler to think about what data isn't in Hiera. Quoting from the blog:

Data that does NOT go in the site-specific Hiera datastore

  • OS-specific data
  • Data that EVERYONE ELSE who uses this module will need to know (paths to config files, package names, etc…)

Everything else is business specific data.

With regard to what goes in the profile, I'm opinionated.

I also did a short talk about the same topic this year at the local meetup.

answered 2015-05-29 05:09:11 -0500

Gary's blog post caused you to refactor your entire code base three times? This post is almost too good. :)

Programming is a bit of an art, and no amount of best practices and rules of thumb can ever replace the best judgement of an experienced programmer. I would like to see some of your profile classes and the hiera hierarchy you ended up with.

I sometimes use the create_resources pattern but not always. If you're going to use it extensively, be consistent, and make sure you use it in conjunction with resource defaults:



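    # Hiera data; the user names are placeholders
    user::users:
      user_a:
        uid: 501
      user_b:
        uid: 502

    # Puppet class
    class user (
      $users = {},
    ) {
      # Resource defaults applied to every user created below
      User {
        ensure     => present,
        managehome => true,
      }
      create_resources('user', $users)
    }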

Happy to offer some more comments if I can see your code.
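As a possible alternative to the resource default block in the class above, create_resources also accepts a hash of defaults as its third argument. A sketch, assuming the same $users parameter:

```puppet
class user (
  $users = {},
) {
  # Defaults passed as the third argument apply only to the resources
  # created by this call, not to other User resources in scope.
  $defaults = {
    'ensure'     => 'present',
    'managehome' => true,
  }
  create_resources('user', $users, $defaults)
}
```

Any attribute set for a user in the Hiera hash takes precedence over the matching default.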


Thanks Alex, much appreciated. I've pasted our base class and several random profiles. You should see the inconsistencies resulting from different developer styles, e.g. shorewall rules defined in both manifests and hiera.

maxwell ( 2015-05-29 05:45:44 -0500 )

While I haven't seen your hierarchy or your hiera data, your use of create_resources looks fine to me. Allowing different developers to have different styles is more of a problem in my opinion. Are you doing pair programming and/or code reviews? 1/3

Alex Harvey ( 2015-05-29 22:54:00 -0500 )

My other comment would be that you have a lot going on in these classes, so I would look at splitting your classes up. E.g. instead of everything in profiles::base, you could have profiles::base::sshd, profiles::base::zabbix etc. Then your profiles::base will be a neat list of includes and 2/3

Alex Harvey ( 2015-05-29 22:55:39 -0500 )

this may also make your hiera data look a bit simpler, although I still haven't seen your hierarchy so not sure. 3/3

Alex Harvey ( 2015-05-29 22:56:41 -0500 )
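The split suggested in the comments above could be sketched roughly as follows. The two subclass names are the ones mentioned in the comment; their bodies are not shown and would each live in their own manifest file.

```puppet
# profiles::base reduced to a list of includes.
# Each included class manages a single concern and reads its own
# parameters from Hiera under its own class namespace.
class profiles::base {
  include ::profiles::base::sshd
  include ::profiles::base::zabbix
}
```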

answered 2015-05-31 05:56:18 -0500

dbschofield

One thing that may help is to keep in mind the type of data you are storing in hiera. When using hiera and create_resources, you are essentially using hiera as an ENC to do classification. This is data that defines what to install. The other type of data defines how classes are installed: the puppet 3 data bindings and overrides that map to variables in the manifest code.

It helps readability if the ENC type of data is stored separately from the binding data. This is not necessary, but it is less confusing. Possible ways to do this are a separate directory structure in the hierarchy for classification data, or a separate hiera backend altogether. In my experience we ended up moving the classification data into an actual ENC for easier integration with other enterprise systems and to inject a security layer around the data. The ENC does similar fact-based filtering to hiera, so technically it could just be a hiera backend. The data that Larizza mentions is in hiera and will stay there.

 This implies that we're severely abusing create_resources and that we should actually have 95% of the data in the profile manifests.

It feels that way because puppet is mostly "class" centric and classes are singletons. Hiera and data bindings are built around classes (automatic data bindings don't happen on defines). create_resources was a "feature" added to address the need to define resources in data. It's my opinion that create_resources is popular because it addresses the shortcomings of class singletons when dynamic resources are what is really needed. It lets you set "define" level parameters with hiera when creating resource instances. So don't worry about abusing it. It adds a dynamic, data-driven element to classification. Many people do it and are successful with it.

You may want to look ahead at puppet 4 though. From what I understand, changes in the DSL make create_resources obsolete. Check out the 4 minute mark of this video. I haven't tried the future puppet 4 parser so I can't speak any more to it.
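The answer does not say which DSL changes are meant. One plausible reading is Puppet 4's iteration and splat attribute, which can express the same thing without create_resources. A sketch under that assumption, reusing the user class from the earlier answer:

```puppet
# Puppet 4 syntax; will not parse with the Puppet 3 default parser.
class user (
  Hash $users = {},
) {
  $users.each |String $username, Hash $attrs| {
    user {
      # Per-expression defaults, overridden by anything in $attrs.
      default:
        ensure     => present,
        managehome => true,
      ;
      # The splat (*) passes the inner hash as resource attributes.
      $username:
        * => $attrs,
      ;
    }
  }
}
```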

That is my two cents and in full disclosure I have been using roles/profiles and create_resources for 2.5 years and haven't needed a major refactoring of the code base.
