PE-console-services fail upon start

asked 2017-08-10 14:54:17 -0600

Good afternoon,

We are having an issue where pe-console-services fails on start, and I'm hoping someone can point me in the right direction. All other Puppet services are running. We are running Puppet Enterprise 2015.2 on CentOS 7.

The console-services log gives the following info:

ERROR [p.t.internal] Error during service init!!!
java.sql.SQLException: Unable to open a test connection to the given database. JDBC url = jdbc:postgresql://puppet-corp-02.domain.com:5432/pe-rbac?ssl=true&sslfactory=org.postgresql.ssl.jdbc4.LibPQFactory&sslmode=verify-full&sslrootcert=/etc/puppetlabs/puppet/ssl/certs/ca.pem&stringtype=unspecified&characterEncoding=UTF-8, username = pe-rbac. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------

...followed by:

Caused by: java.net.ConnectException: Connection refused


I am also seeing various SSL-related errors:
mcollective.log:

ERROR -- : activemq.rb:149:in `onsslconnectfail' SSL session creation with stomp+ssl://mcollective@puppet-corp-02.domain.com:61613 failed: Connection refused - connect(2) for "puppet-corp-02.domain.com" port 61613


puppetdb.log:

2017-08-10 00:00:55,817 ERROR [c.j.b.h.AbstractConnectionHook] Failed to acquire connection Sleeping for 7000ms and trying again. Attempts left: 1. Exception: null
2017-08-10 00:01:02,817 ERROR [c.j.b.PoolWatchThread] Error in trying to obtain a connection. Retrying in 7000ms org.postgresql.util.PSQLException: The server does not support SSL.
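
The console-services and mcollective logs above both report "Connection refused", for port 5432 (PostgreSQL) and port 61613 (ActiveMQ) respectively. A minimal Python sketch for checking whether anything is listening on those ports follows. The helper names `port_is_open` and `check_ports` are hypothetical, and the host and port numbers are simply the ones taken from the log lines:

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

Running `check_ports("puppet-corp-02.domain.com", [5432, 61613])` on the master itself would show whether the database and message broker are listening at all. A `False` result only says the TCP connection failed; it does not say why the service is down.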


I know that this is a lot of scattered information, but I hope that it can at least start a conversation that leads to a solution. Currently no agents can talk to the Puppet server. This all started Tuesday night after a reboot. However, we restored a snapshot of the server from July 5 and are seeing the same issue, so it has apparently been a problem for a while and only showed up once the services had to be started again.
Any suggestions are welcome at this point, as we feel we have exhausted our own troubleshooting options.


Thanks, Mike


Comments

Do you still have valid certs on the master? I have observed similar behavior when I accidentally cleaned up the master's ssl directory, and it wasn't until the process restarted that all of the connections failed (we had an error in our script that cleans up 'stale' entries/certs).

DarylW ( 2017-08-11 10:58:47 -0600 )

Looking at /etc/puppetlabs/puppet/ssl/certs/ca.pem, the cert is valid until 2020. There is an entry in console.conf pointing to /opt/puppetlabs/server/data/console-services/certs/, and the cert there is also valid until 2020.

mcarnes79 ( 2017-08-11 11:51:15 -0600 )

If you accidentally delete the cert directory, a new cert will be created, but when that cert is validated against the one the master expects, it will not match and the master will reject it. You can try this out with a new node against a master (or with a node that you can fix without impacting anything).

DarylW ( 2017-10-15 22:33:39 -0600 )
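
One way to see the mismatch described in the comment above is to compare certificate fingerprints. This is a rough Python sketch using only the standard library; the helper names `cert_fingerprint` and `same_cert` are hypothetical, and it assumes each input holds exactly one PEM-encoded certificate:

```python
import hashlib
import ssl


def cert_fingerprint(pem_text):
    """Return the colon-separated SHA-256 fingerprint of one PEM certificate."""
    # PEM_cert_to_DER_cert strips the BEGIN/END lines and base64-decodes the body.
    der = ssl.PEM_cert_to_DER_cert(pem_text.strip())
    digest = hashlib.sha256(der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def same_cert(pem_a, pem_b):
    """Return True if both PEM strings encode the same certificate bytes."""
    return cert_fingerprint(pem_a) == cert_fingerprint(pem_b)
```

To use it, read the node's cert file and the copy the master holds for that node into strings and pass both to `same_cert`. Where those files live depends on the installation, so the paths are left to the reader. If the result is `False`, the node has regenerated its cert and the master still holds the old one.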

If you take a working node, delete its ssl directory (or mv it), then rerun puppet agent -t, you should see a similar error message. If you moved the folder, delete the new one, move the original back, and rerun puppet. If you deleted the folder, you need to clean up the 'old' cert from the master, and

DarylW ( 2017-10-15 22:34:42 -0600 )

then the master will accept your 'new' cert.

DarylW ( 2017-10-15 22:34:56 -0600 )