503 This website is under heavy load

asked 2017-11-15 10:35:46 -0600

jzoof

I've googled around and haven't been able to find a solid fix for our puppet master repeatedly returning a 503 and falling behind. Hoping someone here has some pointers:

Puppet master specs:

  • 16 cores
  • 30 GB memory
  • CentOS 7.2
  • Puppet version 3.8.7 (yes, I know it's old)

We have multiple server environments, probably close to 900 servers in total running Puppet. About 3/4 of the servers run on a 4-hour interval, and the other 1/4 run on a 1-hour interval.

runinterval = 14400
report      = true
splay       = true
splaylimit  = 60m
environment = env
pluginsync  = true
ordering    = manifest
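As a rough sanity check on the numbers above, the average agent run rate the master has to absorb can be estimated from the fleet size and the two run intervals. This is only a sketch: it assumes exactly 900 agents, an exact 3/4 to 1/4 split, and runs spread evenly over time, none of which is guaranteed in practice. The function name `runs_per_hour` is made up for illustration.

```python
# Rough estimate of the average number of agent runs per hour.
# Assumptions (not measured): 900 agents, 75% on a 4 h interval,
# 25% on a 1 h interval, runs spread evenly over time.
TOTAL_AGENTS = 900


def runs_per_hour(total_agents, groups):
    """groups is an iterable of (fraction_of_fleet, runinterval_seconds)."""
    return sum(total_agents * frac * 3600 / interval for frac, interval in groups)


groups = [(0.75, 14400), (0.25, 3600)]
rate = runs_per_hour(TOTAL_AGENTS, groups)
print(f"{rate:.2f} runs/hour (about {rate / 60:.1f} per minute)")
# prints "393.75 runs/hour (about 6.6 per minute)"
```

An average of roughly 6 to 7 runs per minute says nothing about bursts; if many agents were started at about the same time, their runs can still cluster.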

Httpd puppet.conf

PassengerMinInstances 12   
PassengerMaxPoolSize 20   
PassengerMaxRequests 20000   
PassengerPoolIdleTime 600   
PassengerStatThrottleRate 120   
PassengerPreStart https://server.com:8140   
PassengerHighPerformance on
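To get a feel for whether a pool of 20 Passenger processes is plausibly enough, one can apply Little's law (average busy workers = arrival rate × time per request). The sketch below treats each agent run as a single request that occupies one worker for its whole duration, which is a simplification, since a real run makes several requests. The per-run times are hypothetical placeholders, not measurements, and the run rate assumes about 900 agents split 3/4 on a 4-hour interval and 1/4 on a 1-hour interval.

```python
# Back-of-the-envelope capacity check using Little's law (L = lambda * W).
MAX_POOL_SIZE = 20       # PassengerMaxPoolSize from the config above
RUNS_PER_HOUR = 393.75   # 675 agents / 4 h + 225 agents / 1 h (assumed fleet split)


def busy_workers(runs_per_hour, seconds_per_run):
    """Average number of workers busy at once for a given time per run."""
    return runs_per_hour / 3600 * seconds_per_run


def saturation_seconds(runs_per_hour, pool_size):
    """Time per run at which the whole pool is busy on average."""
    return pool_size * 3600 / runs_per_hour


for secs in (30, 60, 120):  # hypothetical per-run times, not measured values
    avg = busy_workers(RUNS_PER_HOUR, secs)
    print(f"{secs:>3}s per run -> {avg:.1f} of {MAX_POOL_SIZE} workers busy on average")

limit = saturation_seconds(RUNS_PER_HOUR, MAX_POOL_SIZE)
print(f"pool is fully busy on average at about {limit:.0f}s per run")
```

If the measured time per run is well below that limit and 503s still occur, bursts of simultaneous runs are a more likely cause than the average load. As far as I know, Passenger sends this 503 when its request queue is full, so it may also be worth checking the `PassengerMaxRequestQueueSize` setting against the Passenger version in use.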

Comments

If the set of manifests to evaluate per node is large, it is (or, to be fair, _was_ in Puppet 3.8) indeed possible for the master to become overloaded. Can you split the load across multiple masters? Is it really your sysinfo()[loads] being more than 16 (number of CPU cores)?

Kai Burghardt ( 2017-12-11 13:13:04 -0600 )

I was hoping it would be a simple parameter change; we're not really in a position to run a multi-master setup. I'm not sure what you mean by "Is it really your sysinfo()[loads] being more than 16 (number of CPU cores)?"

jzoof ( 2017-12-11 14:01:22 -0600 )

I mean what, e.g., `uptime(1)` reports as “load average”. [sysinfo() is one function that retrieves that information.]

Kai Burghardt ( 2017-12-12 12:23:46 -0600 )
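The comparison asked about in the comments (load average versus number of CPU cores) can be sketched as follows. This uses Python's `os.getloadavg()` rather than calling sysinfo() directly; it reports the same 1-, 5- and 15-minute figures that `uptime(1)` shows and is only available on Unix-like systems. The helper name `load_per_core` is made up for illustration.

```python
import os


def load_per_core(load_avg=None, cores=None):
    """Return the 1-, 5- and 15-minute load averages divided by the core count.

    Values persistently above 1.0 mean more runnable work than cores.
    """
    load_avg = os.getloadavg() if load_avg is None else load_avg
    cores = (os.cpu_count() or 1) if cores is None else cores
    return tuple(value / cores for value in load_avg)


if __name__ == "__main__":
    one, five, fifteen = load_per_core()
    print(f"load per core: 1 min {one:.2f}, 5 min {five:.2f}, 15 min {fifteen:.2f}")
```

On the 16-core master described in the question, a load average of 16 would correspond to 1.00 per core. Sampling this while the 503s are happening would show whether the machine itself is saturated or whether the limit is elsewhere.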