Is there a way to fix a corrupted puppetdb?

2013-08-22 18:30:38

jhossain

Hi everyone;

The root partition (/) on our puppetmaster (running Red Hat Enterprise Linux Server release 6.4) filled up completely while Puppet server 2.7.18 was still running. We believe the puppetdb on this master was corrupted as a result. Since then, the puppet master has not been puppetizing any new clients.

Is there any way to fix the puppetdb on this puppetmaster so that we can continue using it?
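
Since the trouble started when / filled up, a first step is to confirm the partition has free space again. The following is only a minimal sketch of such a check, not something from the original post: the function names `usage_pct` and `check_usage` and the 90% threshold are illustrative assumptions.

```shell
# Print the percentage of space used on a mount point (default: /).
# df -P forces the single-line POSIX output format, so field 5 is "Capacity".
usage_pct() {
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Warn when usage reaches a threshold (90 is an arbitrary example value).
check_usage() {
    mount_point="${1:-/}"
    threshold="${2:-90}"
    pct="$(usage_pct "$mount_point")"
    if [ "$pct" -ge "$threshold" ]; then
        echo "WARNING: $mount_point is ${pct}% full"
        return 1
    fi
    echo "OK: $mount_point is ${pct}% full"
}
```

Running `check_usage /` on the master would show whether the partition is still close to full before any further troubleshooting.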

Can you post more detailed information (log entries, etc.) that indicates how the master is corrupted? What kind of diagnostics and log entries do you see on the agents? Is ...(more)

GregLarkin ( 2013-08-23 01:40:51 -0600 )

Hi GregLarkin; Please see my answer with command output and log information on master. Thanks!

jhossain ( 2013-08-23 13:09:03 -0600 )

2013-08-23 13:08:12

jhossain

updated 2013-08-23 16:21:53

Hi GregLarkin;

With a clean environment on both master and agent (no old, pending certificates), I used the command:

puppet agent --server puppetmaster.<fqdn> --waitforcert 60 --test

I get the message:

info: Caching certificate for ca
info: Creating a new SSL certificate request <agent.fqdn>
info: Certificate Request fingerprint (md5): F9:68:A4:7B:79:96:94:E8:09:3D:42:04:D9:DA:FF:88
notice: Did not receive certificate

On the master, I used the command:

puppet cert --sign <agent.fqdn>

I get the message:

notice: Signed certificate request for <agent.fqdn>
notice: Removing file Puppet::SSL::CertificateRequest <agent.fqdn> at ...

This is significant: "Could not find default node or by name with '<agent.fqdn>, <agent> ,<agent> on node <agent.fqdn>". What does your init.pp look like? Do you have ...(more)

GregLarkin ( 2013-08-23 15:12:06 -0600 )

Hi GregLarkin; Most of my modules have init.pp with classes and file defined. Yes, the nodes.pp has a node that matches agent's fqdn name. The certificate is ...(more)

jhossain ( 2013-08-23 16:52:34 -0600 )

Is your DNS reverse/forward information correct? Also, check the permissions on the configuration files that hold your node definitions. Also, are you running this behind Passenger, or standalone? If ...(more)

Trevor Vaughan ( 2013-08-23 17:33:30 -0600 )
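
The forward/reverse DNS check suggested above could be scripted roughly as follows. This is a sketch under assumptions, not part of the original comment: it relies on `getent hosts` being available on the host, and the function names are hypothetical.

```shell
# Forward lookup: print the first address that a hostname resolves to.
forward_lookup() {
    getent hosts "$1" | awk 'NR == 1 { print $1 }'
}

# Reverse lookup: print the first name that an address maps back to.
reverse_lookup() {
    getent hosts "$1" | awk 'NR == 1 { print $2 }'
}

# Report whether the forward and reverse records agree for a hostname.
check_dns() {
    name="$1"
    addr="$(forward_lookup "$name")"
    if [ -z "$addr" ]; then
        echo "FAIL: no forward record for $name"
        return 1
    fi
    back="$(reverse_lookup "$addr")"
    if [ "$back" = "$name" ]; then
        echo "OK: $name <-> $addr"
    else
        echo "MISMATCH: $name -> $addr -> ${back:-nothing}"
        return 1
    fi
}
```

Calling `check_dns` with the agent's fully qualified name on the master, and with the master's name on the agent, would show whether both sides resolve each other consistently.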

Hi Trevor Vaughan; Yes, DNS information is correct. Not running behind passenger or standalone. The root partition of this host got full and server started acting up since then. Probably ...(more)

jhossain ( 2013-08-26 17:06:25 -0600 )

