Handling a 65k+ line yaml file

asked 2017-09-22 06:52:57 -0500

puser

updated 2017-09-22 06:54:29 -0500


I have a YAML file that is about 65,000 lines long. It contains a hash of server user and group relationships, in the following format:

          - user1
          - user2
          - user3
          - user4
          - user1
          - user2
          - user3
          - user4

My question is: how does Puppet read in YAML files? I assume it uses the Ruby YAML.load function. How severe an impact would keeping this file in memory have on the Puppet master itself?


2 Answers

answered 2017-09-24 04:06:35 -0500

Henrik Lindberg

I would not worry much about the memory consumption unless you have huge data items in your YAML. If your data is just short strings, say 10 characters per line, I would guess it takes about one MB or so (you could measure to be sure). What is worse is the time it takes to read the file, and how many times the file gets read. If you typically use only a fraction of the data during a particular compilation, it would be better to split it up and write your own Hiera 5 backend. Also make sure that you are not using constructs in your Hiera config that require the Hiera implementation to throw out the cache (for example, you should not interpolate local/context-dependent variables into your hiera.yaml).

As with all optimization, the result varies with the use case, and the only way to know is to measure, change, and measure again. It may be that what you gain by splitting the file up is lost to the overhead of handling the split.
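To put a rough number on the "measure to be sure" advice, here is a minimal sketch in plain Ruby. It assumes the real file is a hash of group names to lists of short user names; the key names (`group0`, `user0`), the helper names, and the counts are invented for illustration, and the byte figure is only an approximation that ignores per-process overhead. It also does not reproduce how Puppet itself loads or caches the file.

```ruby
require 'yaml'
require 'benchmark'
require 'objspace'

# Build synthetic YAML text: each (made-up) group key maps to a short list of users.
# 13_000 groups * (1 key line + 4 user lines) gives 65_000 lines.
def synthetic_yaml(groups: 13_000, users_per_group: 4)
  lines = []
  groups.times do |g|
    lines << "group#{g}:"
    users_per_group.times { |u| lines << "  - user#{u}" }
  end
  lines.join("\n") + "\n"
end

# Rough size of the parsed structure: the hash itself plus its keys,
# the arrays, and the strings inside them.
def approx_bytes(hash)
  total = ObjectSpace.memsize_of(hash)
  hash.each do |key, list|
    total += ObjectSpace.memsize_of(key) + ObjectSpace.memsize_of(list)
    list.each { |s| total += ObjectSpace.memsize_of(s) }
  end
  total
end

text = synthetic_yaml
parsed = nil
seconds = Benchmark.realtime { parsed = YAML.safe_load(text) }

puts "lines:  #{text.lines.count}"
puts "parse:  #{(seconds * 1000).round} ms"
puts "approx: #{(approx_bytes(parsed) / 1024.0).round} KiB"
```

Running this against a copy of the real file (replace `synthetic_yaml` with `File.read` on your path) before and after any restructuring gives the "measure, change, measure again" comparison.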

answered 2017-09-22 08:30:21 -0500

DarylW

You can probably make a custom Hiera backend that reads the file in a more efficient manner. From the Puppet documentation on custom Hiera 5 backends:

Hiera backends are Puppet functions

In this version of Hiera, a backend is simply a custom Puppet function that accepts a particular set of arguments and whose return value obeys a particular format. The function can do whatever is necessary to locate its data.

A backend function can use the modern Ruby functions API or the Puppet language. (They can’t use the legacy Ruby functions API.) Among other things, this means you can use different versions of a Hiera backend in different environments, and you can distribute Hiera backends in Puppet modules.

This is a simpler interface than in previous versions of Hiera, where custom backends were globally-loaded Ruby classes that had to define particular methods.
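As an illustration of what "a more efficient manner" could mean, the following plain-Ruby sketch splits the large hash into one small file per top-level key and parses only the file for the key being looked up, remembering the result. This is only the lookup logic that a backend function might wrap; the Puppet function wiring is left out, and `split_by_key`, `KeyedLookup`, and the sample keys are hypothetical names. It assumes keys are safe to use as file names.

```ruby
require 'yaml'
require 'tmpdir'
require 'fileutils'

# Write one small YAML file per top-level key of the big hash.
# Assumption: every key is a plain string that is safe as a file name.
def split_by_key(data, dir)
  FileUtils.mkdir_p(dir)
  data.each do |key, value|
    File.write(File.join(dir, "#{key}.yaml"), YAML.dump(value))
  end
end

# Looks up a single key, parsing only that key's file and caching the result.
class KeyedLookup
  attr_reader :reads

  def initialize(dir)
    @dir = dir
    @cache = {}
    @reads = 0 # number of files actually parsed
  end

  # Returns the value for key, or nil when no file exists for it.
  def lookup(key)
    return @cache[key] if @cache.key?(key)

    path = File.join(@dir, "#{key}.yaml")
    return nil unless File.file?(path)

    @reads += 1
    @cache[key] = YAML.safe_load(File.read(path))
  end
end

Dir.mktmpdir do |dir|
  split_by_key({ 'group1' => %w[user1 user2], 'group2' => %w[user3 user4] }, dir)
  store = KeyedLookup.new(dir)
  p store.lookup('group1') # parses group1.yaml
  p store.lookup('group1') # served from the in-memory cache
  p store.reads
end
```

Whether this beats a single large file depends on how many keys a typical compilation touches, so it is worth timing both layouts on real data.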
