Thursday, January 2, 2020

Can't run remote agents from Puppet console. Also shuts down soon after starting.

PROBLEM:
The console won't run the agent, reporting: "Run Puppet has been disabled because Node Manager cannot connect to <fqdn>".
Also, 'puppet job' can't run anything on any node from the MoM command line.

Tried:
Turn on debugging in activemq.

cp -p /etc/puppetlabs/activemq/log4j.properties /etc/puppetlabs/activemq/log4j.properties.orig
vim /etc/puppetlabs/activemq/log4j.properties
Comment this line:          #log4j.rootLogger=INFO, console, logfile
Uncomment this line:     log4j.rootLogger=DEBUG, logfile, console
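
The same edit can be scripted instead of made by hand in vim. This is only a sketch: it assumes the two rootLogger lines appear in the file exactly as quoted above, and the function name enable_activemq_debug is made up for illustration (it is not part of Puppet or ActiveMQ).

```shell
# Hypothetical helper: switch an ActiveMQ log4j.properties file to DEBUG logging.
# Assumes the file has an active "log4j.rootLogger=INFO, ..." line and a
# commented-out "#log4j.rootLogger=DEBUG, ..." line, as in the notes above.
enable_activemq_debug() {
    conf="$1"
    # Keep a one-time backup, like the manual "cp -p" step; never overwrite it.
    [ -e "$conf.orig" ] || cp -p "$conf" "$conf.orig" || return 1
    # Comment the INFO line and uncomment the DEBUG line. The result is written
    # to a temp file and copied back with cat so the original file's owner and
    # mode are left alone.
    sed -e 's/^log4j\.rootLogger=INFO,/#&/' \
        -e 's/^#log4j\.rootLogger=DEBUG,/log4j.rootLogger=DEBUG,/' \
        "$conf" > "$conf.tmp" &&
        cat "$conf.tmp" > "$conf" &&
        rm -f "$conf.tmp"
}
```

Usage would be: enable_activemq_debug /etc/puppetlabs/activemq/log4j.properties, followed by the service bounce. Running it a second time changes nothing, because neither pattern matches once the lines have been swapped.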

Bounce the service:
sv=pe-activemq; echo "== $sv"; puppet resource service "$sv" ensure=stopped
sv=pe-activemq; echo "== $sv"; puppet resource service "$sv" ensure=running

The log rotates, so use -F
tail -F /var/log/puppetlabs/activemq/activemq.log


Certificate expiration messages started appearing:


2019-12-31T08:44:01.440-06:00 | WARN | Transport Connection to: tcp://<node_ip>:44166 failed: javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_expired | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: ssl:///<node_ip>:44166
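
With debug logging on, the log gets noisy, so it can help to boil the handshake failures down to a list of offending peers. A rough sketch, assuming every failure is logged in the same shape as the WARN line above and that peers are IPv4 addresses; the function name expired_cert_peers is hypothetical.

```shell
# Hypothetical helper: count certificate_expired handshake failures per peer.
# Assumes log lines shaped like the WARN entry above:
#   ... Transport Connection to: tcp://<ip>:<port> failed: ... certificate_expired ...
# and IPv4 peer addresses (an IPv6 address would not match the pattern).
expired_cert_peers() {
    grep 'certificate_expired' "$1" |
        sed -n 's#.*Transport Connection to: tcp://\([^: ]*\):[0-9]* failed.*#\1#p' |
        sort | uniq -c | sort -rn
}
```

Usage would be: expired_cert_peers /var/log/puppetlabs/activemq/activemq.log. Each output line is a count followed by a peer address, with the busiest peer first.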

Puppet support says to disable MCollective if we're not using it.

Trying to rekey Orchestration on the MoM:
mv /etc/puppetlabs/orchestration-services/ssl/<files> /tmp
cp <puppet>/ssl <orchestration
chown -R pe-orchestration-services:pe-orchestration-services /etc/puppetlabs/orchestration-services/ssl

CAUSE:
The real cause was the PXP agent.  On one of the failing managed nodes, the PXP agent was throwing errors that it couldn't connect to wss://<compile_master_load_balancer>:8140/pcp2/agent

SOLUTION:
Documentation says to set master_uris and pcp_broker_list for the PE Agent and PE Infrastructure Agent groups in the console (https://puppet.com/docs/pe/2018.1/installing_compile_masters.html).  For example:

Change the PXP agent to connect directly to the MoM, not the compile masters' loadbalancer.
Classification --> PE Infrastructure --> PE Agent --> Configuration tab
Class: puppet_enterprise::profile::agent
master_uris = ["https://<MoM_FQDN>/"]
pcp_broker_list = ["<MoM_FQDN>:8142"]

However, in our experience, removing master_uris and pcp_broker_list altogether had a much higher success rate.

After removing them, run the agent on the managed node.  Successful connections should start appearing in the PXP agent log:
tail -F /var/log/puppetlabs/pxp-agent/pxp-agent.log

Also, the new master_uris and pcp_broker_list values should appear in the pxp-agent.conf file that the MoM manages remotely on the managed node.  Look on the managed node:
cat /etc/puppetlabs/pxp-agent/pxp-agent.conf | python -m json.tool
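
To see at a glance which broker the agent is pointed at, the config can be filtered down to its WebSocket URIs. This is just a sketch: the function name pxp_broker_uris is made up, and it matches on the wss:// scheme rather than on a key name, so it assumes nothing about how the setting is spelled in the file.

```shell
# Hypothetical helper: print every WebSocket (wss://) URI found in a
# pxp-agent.conf, one per line, without depending on the setting's key name.
pxp_broker_uris() {
    grep -o 'wss://[^"]*' "$1" | sort -u
}
```

Usage would be: pxp_broker_uris /etc/puppetlabs/pxp-agent/pxp-agent.conf. If the output still shows the compile masters' load balancer, the classification change has not reached the node yet.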

EOS




