Puppet agent on the same host as master gets SSL error during first runs

2017-05-11 08:46:50 -0600

Alina

Hello community!

I'm having trouble configuring a Puppet master with Puppet itself. Setup: Puppet v4.10, PuppetDB v4.4. Architecture: ELB -> Puppet master; ELB -> PuppetDB (PostgreSQL RDS as the backend database). Process:
1) Install puppetserver (dependencies are installed automatically) on Puppet Master host.
2) Clone git repo with puppet code; run puppet apply to configure puppetserver.
3) Install puppet on PuppetDB host; set puppet server value in puppet.conf (DNS of puppet master ELB); run puppet agent to install and configure PuppetDB.
4) Run puppet apply on master node again to apply puppetdb::master::config class (set storeconfigs to "true" and storeconfigs backend to "puppetdb"). Restart puppetserver service.

All the steps above run without any issues. But when I run puppet agent on the master node, I receive the error SSL_connect SYSCALL returned=5 errno=0 state=SSLv2/v3 read server hello A. Interestingly, the issue resolves itself after several puppet agent runs (or perhaps just after some time, independent of the agent runs). I didn't find anything in the logs that would show which requests hit this error, and no events precede the moment when the issue goes away.

I'm new to Puppet, so maybe I'm missing some basics about communication between the Puppet master and PuppetDB. I would be grateful for any thoughts on the issue described.

Thank you!

2 Answers

2017-05-11 14:54:50 -0600

smarlow

Hm. It's a bit tricky to say, but with these sorts of errors it's usually related to establishing the connection properly and SSL certificates.

Often there's an issue with the certificates not matching or the times between servers being out of sync (so certs are not yet valid). But if you're connecting to the local box those don't seem applicable.

If you're hairpinning through the load balancer shortly after a puppetserver restart, it may be that the ELB hadn't yet put the server back in service, so it had no valid backend to send the connection to.
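
One way to test this idea is to wait until a TLS handshake through the load balancer actually completes before starting the agent. A plain TCP connect may succeed against the load balancer even when no backend is in service, so the sketch below attempts the handshake itself. This is only a diagnostic sketch: the function names and the default timings are my own assumptions, certificate verification is deliberately switched off, and it assumes the server does not demand a client certificate during the handshake.

```python
import socket
import ssl
import time


def tls_handshake_ok(host: str, port: int = 8140, timeout: float = 5.0) -> bool:
    """Return True if a TLS handshake with host:port completes."""
    context = ssl.create_default_context()
    # Diagnostic only: skip certificate checks so the probe tests
    # reachability of the TLS endpoint, not trust in its certificate.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            # wrap_socket performs the handshake on an already connected socket.
            with context.wrap_socket(raw, server_hostname=host):
                return True
    except OSError:
        # Covers refused connections, timeouts, resets and ssl.SSLError.
        return False


def wait_for_tls(host: str, port: int = 8140,
                 deadline_s: float = 120.0, interval_s: float = 5.0) -> bool:
    """Poll until the handshake succeeds or deadline_s seconds have passed."""
    end = time.monotonic() + deadline_s
    while True:
        if tls_handshake_ok(host, port):
            return True
        if time.monotonic() >= end:
            return False
        time.sleep(interval_s)


# Hypothetical usage (replace the name with the DNS name of your ELB):
# if wait_for_tls("puppet-elb.example.internal"):
#     print("handshake OK, safe to run puppet agent")
```

If the handshake keeps failing through the load balancer but succeeds when pointed directly at the instance, that points at the load balancer's view of the backend rather than at the certificates.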

Thank you Steve! It seems that the issue is really related to ELB health checks.

Alina ( 2017-05-12 08:12:19 -0600 )

2017-05-12 11:37:53 -0600

Alina

I decided to check smarlow's assumption about the ELB. I also decided to find out whether the issue is related to PuppetDB, so I performed the following experiment. Architecture: ELB -> Puppet server.

1) Installed puppetserver on EC2.
2) Cloned Git repo with puppet code.
3) Ran puppet apply to configure Puppet server.

All the steps above ran successfully, and the ELB showed 1 of 1 instances in service. The ELB health check is configured as follows.

Ping Target          TCP:8140  
Timeout              10 seconds  
Interval             15 seconds  
Unhealthy threshold  5  
Healthy threshold    3
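
For a rough sense of what these settings mean in time: the instance is taken out of service after the unhealthy threshold of consecutive failed checks and put back after the healthy threshold of consecutive successful ones. The sketch below multiplies each threshold by the interval. It is an approximation that ignores the 10-second timeout and the exact moment a check starts, and the class and method names are my own.

```python
from dataclasses import dataclass


@dataclass
class HealthCheck:
    interval_s: int
    unhealthy_threshold: int
    healthy_threshold: int

    def seconds_to_unhealthy(self) -> int:
        # Consecutive failed checks needed, one check per interval.
        return self.unhealthy_threshold * self.interval_s

    def seconds_to_healthy(self) -> int:
        # Consecutive successful checks needed, one check per interval.
        return self.healthy_threshold * self.interval_s


# Values from the health check configuration quoted above.
check = HealthCheck(interval_s=15, unhealthy_threshold=5, healthy_threshold=3)
print(check.seconds_to_unhealthy())  # 75
print(check.seconds_to_healthy())    # 45
```

So, roughly, port 8140 has to be unreachable for more than a minute before the ELB drops the instance, and reachable again for about 45 seconds before the instance returns to service.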

I waited about 20 minutes just in case and then ran puppet agent on the master node. At some point during the agent run it received the same SSL error. I checked the ELB and found that it reported 0 of 1 instances in service...

The puppetserver service is running and iptables has a rule that allows TCP connections to 8140, but a remote telnet connection to 8140 fails. netstat -tapn | grep :8140 returns:

tcp6       0      0 :::8140                 :::*                    LISTEN      10366/java
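
The telnet test can be scripted so that the same TCP connect is tried against several addresses (for example loopback and the instance's private address), which shows whether the port is blocked only from outside. This is a minimal sketch; the function names are my own and the addresses in the usage comment are placeholders.

```python
import socket


def tcp_connect_ok(host: str, port: int = 8140, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name resolution failed.
        return False


def probe(hosts, port: int = 8140, timeout: float = 3.0) -> dict:
    """Map each host to the result of a TCP connect attempt."""
    return {host: tcp_connect_ok(host, port, timeout) for host in hosts}


# Hypothetical usage (10.0.0.5 stands in for the instance's private IP):
# print(probe(["127.0.0.1", "10.0.0.5"]))
```

If loopback connects while the private address does not, a firewall rule or a changed bind address is more likely than a dead service.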

I restarted the puppetserver service and the issue was gone. So it seems that something in the Puppet code that configures the Puppet server causes the problem. I will try to find out what breaks the connection and report back (to share my findings with anyone facing a similar issue).

Actually, I can't understand why the connection gets lost: if I run puppet apply, all is OK, but when I run the same code with the agent, the connection gets lost...

Alina ( 2017-05-12 12:13:59 -0600 )

