vcsrepo scalable?

asked 2018-03-14 12:23:20 -0500

daniel.serrao


Right now I'm using the Puppet module vcsrepo to update repos on a server. The problem is that we started doing that for 75 repos and Puppet takes about 13 minutes to finish, even when there aren't any changes in the repos. Since I need to implement this for a client that has more than 200 repos, this module doesn't seem to be the best solution.

Right now the puppet code looks like:

$repos_to_install.each | String $repo_path | {
  vcsrepo { $repo_path:
    ensure   => latest,
    provider => git,
    source   => $repo_git_url,
    revision => 'master',
  }
}

As you can see, I iterate over a list of repos and declare a vcsrepo resource for each one to fetch the latest version. I would like to know if there is a better way of using vcsrepo to improve performance. The goal is for a Puppet run to take less than 5 minutes with 250 repos when those repos have no changes; it's not that bad if the first run takes more than 5 minutes.

If you know that it is impossible to get such performance with vcsrepo or if you know a better way please tell me.



1 Answer


answered 2018-03-14 19:19:14 -0500

natemccurdy

The slowness is coming from the fact that git has to fetch and pull all the changes from all of those repos. It's not Puppet being slow, it's your Git server that's slow.

One thing you could do is not manage individual repos with Puppet, but instead, write some script that manages them for you. Then have Puppet manage that script and maybe tell the script to run every x hours with a cron job.
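As a rough illustration of that idea, here is a minimal sketch of such a script in Python. Everything in it is an assumption made for the example rather than something from this thread: the function names, the worker count, the remote being called `origin`, each checkout already sitting on the `master` branch, and the one-path-per-line list file. It fetches each repo in parallel and fast-forwards only when the remote branch has moved.

```python
#!/usr/bin/env python3
"""Update many Git checkouts in parallel; meant to be run from cron."""
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def git(repo_path, *args):
    """Run a git command inside repo_path and return its stripped stdout."""
    result = subprocess.run(
        ["git", "-C", repo_path, *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def update_repo(repo_path, branch="master"):
    """Fetch the branch from origin and fast-forward the checkout.

    Assumes the checkout is already on `branch`. Returns True if HEAD
    moved, False if the checkout was already current.
    """
    before = git(repo_path, "rev-parse", "HEAD")
    git(repo_path, "fetch", "--quiet", "origin", branch)
    git(repo_path, "merge", "--ff-only", "--quiet", "FETCH_HEAD")
    return git(repo_path, "rev-parse", "HEAD") != before


def update_all(repo_paths, workers=8):
    """Update repos concurrently; map each path to True, False, or an error string."""
    def safe_update(path):
        try:
            return update_repo(path)
        except subprocess.CalledProcessError as err:
            # Report the failure for this repo without stopping the others.
            return f"error: {err.stderr.strip()}"

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(repo_paths, pool.map(safe_update, repo_paths)))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: argv[1] is a text file with one checkout path per line.
    with open(sys.argv[1]) as handle:
        paths = [line.strip() for line in handle if line.strip()]
    for path, outcome in update_all(paths).items():
        print(path, outcome)
```

Puppet would then only need to manage this one file, the list of paths, and a cron entry, instead of one resource per repo. The threads are there because most of the time goes to waiting on the network, so several fetches can overlap; whether that gets 250 repos under 5 minutes depends on how the Git server copes with the concurrent connections.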

What's your ultimate goal here? To have the repos always be updated every 30 minutes... or something else? Why do the repos exist on each client machine?



Hi natemccurdy. The goal is to make sure that the repos are updated (if there are any changes) every 30 minutes. The repos should only be on one of the machines, not on all client machines, but that part is already handled by checking the FQDN with a fact.

daniel.serrao ( 2018-03-15 03:14:22 -0500 )

I will try the approach you mentioned with a cron job and then report back here on how it went. Thanks for your help!

daniel.serrao ( 2018-03-15 03:14:33 -0500 )
