James T McBride: Can't run remote agents from Puppet console. Also shuts down soon after starting.

PROBLEM:
The console won't run the agent and reports, "Run Puppet has been disabled because Node Manager cannot connect to <fqdn>".
Also, 'puppet job' can't run anything on any node from the MoM command line.

Tried:
Turn on debug logging in ActiveMQ:

cp -p /etc/puppetlabs/activemq/log4j.properties /etc/puppetlabs/activemq/log4j.properties.orig

vim /etc/puppetlabs/activemq/log4j.properties

Comment this line: #log4j.rootLogger=INFO, console, logfile

Uncomment this line: log4j.rootLogger=DEBUG, logfile, console
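The backup-and-edit steps above can also be scripted. This is a minimal sketch, not a supported procedure: it assumes the shipped file has the INFO line active and the DEBUG line commented out, exactly as written above, and the function name enable_activemq_debug is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: switch the ActiveMQ root logger from INFO to DEBUG.
# Assumes the file contains these two lines exactly (an assumption):
#   log4j.rootLogger=INFO, console, logfile
#   #log4j.rootLogger=DEBUG, logfile, console
enable_activemq_debug() {
    conf=$1
    # Keep a pristine copy; do not overwrite an existing backup on a re-run.
    [ -e "$conf.orig" ] || cp -p "$conf" "$conf.orig" || return 1
    # Comment out the INFO root logger and uncomment the DEBUG one.
    # Writing back with '>' keeps the original file's owner and mode.
    sed -e 's/^log4j\.rootLogger=INFO, console, logfile/#&/' \
        -e 's/^#\(log4j\.rootLogger=DEBUG, logfile, console\)/\1/' \
        "$conf.orig" > "$conf"
}

# Usage (path taken from the steps above):
#   enable_activemq_debug /etc/puppetlabs/activemq/log4j.properties
```

To undo the change later, copy the .orig file back over log4j.properties and bounce the service again.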

Bounce the service:

sv=pe-activemq; echo == $sv; puppet resource service $sv ensure=stopped

sv=pe-activemq; echo == $sv; puppet resource service $sv ensure=running

The log rotates, so follow it with tail -F, which reopens the file by name after rotation:

tail -F /var/log/puppetlabs/activemq/activemq.log

Certificate expiration messages started appearing:

2019-12-31T08:44:01.440-06:00 | WARN | Transport Connection to: tcp://<node_ip>:44166 failed: javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_expired | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: ssl:///<node_ip>:44166
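With debug logging on, the log gets noisy, so it can help to reduce it to the list of peers that are failing. The sketch below assumes every failure line has the same shape as the sample above ("Transport Connection to: tcp://<address>:<port> failed"); the function name expired_cert_peers is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read ActiveMQ log lines on stdin and print each
# distinct peer address that failed with certificate_expired.
expired_cert_peers() {
    grep 'certificate_expired' |
        sed -n 's|.*Transport Connection to: tcp://\([^:]*\):[0-9]* failed.*|\1|p' |
        sort -u
}

# Usage (log path taken from the tail command above):
#   expired_cert_peers < /var/log/puppetlabs/activemq/activemq.log
```

Each address printed is a node whose MCollective client certificate the broker rejected as expired.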

Puppet support says to disable MCollective if we're not using it:

https://puppet.com/docs/pe/2018.1/disabling_mcollective_nodes.html

Trying to rekey Orchestration on the MoM:

mv /etc/puppetlabs/orchestration-services/ssl/<files> /tmp

cp <puppet>/ssl <orchestration

chown -R pe-orchestration-services:pe-orchestration-services /etc/puppetlabs/orchestration-services/ssl

CAUSE:
The real cause was the PXP agent. On one of the failing managed nodes, the PXP agent was throwing errors that it couldn't connect to wss://<compile_master_load_balancer>:8140/pcp2/agent

SOLUTION:
Documentation says to set master_uris and pcp_broker_list for the PE Agent and PE Infrastructure Agent groups in the console (https://puppet.com/docs/pe/2018.1/installing_compile_masters.html). For example:

Change the PXP agent to connect directly to the MoM, not the compile masters' loadbalancer.
Classification --> PE Infrastructure --> PE Agent --> Configuration tab
Class: puppet_enterprise::profile::agent
master_uris = ["https://<MoM_FQDN>/"]
pcp_broker_list = ["<MoM_FQDN>:8142"]

However, in our experience, removing master_uris and pcp_broker_list altogether had a much higher success rate.

After removing them, run the agent on the managed node. Successful connections should start appearing in

tail -F /var/log/puppetlabs/pxp-agent/pxp-agent.log

Also, the new master_uris and pcp_broker_list values should appear in the pxp-agent.conf file that the MoM manages remotely on the managed node. Look on the managed node:

cat /etc/puppetlabs/pxp-agent/pxp-agent.conf | python -m json.tool
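When checking several nodes, it is quicker to print only the endpoints the agent is configured to use. This is a rough sketch: it matches any wss:// or https:// URI in the file rather than relying on particular key names, and the function name pxp_configured_uris is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print every wss:// or https:// URI found in the
# pxp-agent config, one per line, in the order they appear in the file.
pxp_configured_uris() {
    conf=${1:-/etc/puppetlabs/pxp-agent/pxp-agent.conf}
    grep -Eo '(wss|https)://[^"]+' "$conf"
}

# Usage on a managed node:
#   pxp_configured_uris
```

If the output still points at the compile masters' load balancer, the node has not yet picked up the change from the console.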

EOS

James T McBride

Thursday, January 2, 2020
