Tuesday, January 7, 2020

Puppet console can't run tasks

PROBLEM:
Running a task from the console returns:
Failed - error...
Error: fsxopsx1031.wrk.fs.usda.gov is not connected to the PCP broker


On a broken node, run:
/opt/puppetlabs/puppet/bin/pxp-agent --foreground --loglevel debug
tail /var/log/puppetlabs/pxp-agent/pxp-agent.log

It returns:
2020-01-03 11:48:34.877539 INFO  puppetlabs.pxp_agent.main:189 - pxp-agent 1.9.11 started at debug level
2020-01-03 11:48:34.877742 ERROR puppetlabs.pxp_agent.main:208 - Fatal configuration error: broker-ws-uri or broker-ws-uris must be defined; cannot start pxp-agent
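
The error names the two settings pxp-agent looks for, so checking the agent's config file directly can confirm the diagnosis before anything is changed in the console. The sketch below assumes the config is at /etc/puppetlabs/pxp-agent/pxp-agent.conf (the path used later in this post). has_broker_uri is a made-up helper name, not a Puppet command.

```shell
# Hypothetical helper (not a Puppet command): succeeds when the config
# file defines broker-ws-uri or broker-ws-uris, the keys named in the error.
has_broker_uri() {
    conf="${1:-/etc/puppetlabs/pxp-agent/pxp-agent.conf}"
    # Return 2 when the file cannot be read at all.
    [ -r "$conf" ] || return 2
    grep -Eq '"broker-ws-uris?"[[:space:]]*:' "$conf"
}

# Report on the default config path.
if has_broker_uri; then
    echo "broker URI is defined"
else
    echo "broker URI is missing, or the config file is unreadable"
fi
```

This is only a text match on the key name. It does not validate the JSON or test that the broker is reachable.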

SOLUTION:
Same solution as last post, repeated here:

Documentation says to set master_uris and pcp_broker_list for the PE Agent and PE Infrastructure Agent groups in the console (https://puppet.com/docs/pe/2018.1/installing_compile_masters.html).  For example:

Change the PXP agent to connect directly to the MoM, not the compile masters' load balancer.
Classification --> PE Infrastructure --> PE Agent --> Configuration tab
Class: puppet_enterprise::profile::agent
master_uris = ["https://<MoM_FQDN>/"]
pcp_broker_list = ["<MoM_FQDN>:8142"]

However, in our experience, removing master_uris and pcp_broker_list had a much higher success rate.

After removing them, run the agent on the managed node.  Successful connections should start appearing in the log:
tail -F /var/log/puppetlabs/pxp-agent/pxp-agent.log

Also, the new master_uris and pcp_broker_list values should appear in the pxp-agent.conf file that the MoM manages remotely on the managed node.  Look on the managed node:
cat /etc/puppetlabs/pxp-agent/pxp-agent.conf | python -m json.tool
EOS

Thursday, January 2, 2020

Can't run remote agents from Puppet console. Also shuts down soon after starting.

PROBLEM:
The console won't run the agent, saying, "Run Puppet has been disabled because Node Manager cannot connect to <fqdn>".
Also, 'puppet job' can't run anything on any node from the MoM command line.

Tried:
Turn on debugging in activemq.

cp -p /etc/puppetlabs/activemq/log4j.properties /etc/puppetlabs/activemq/log4j.properties.orig
vim /etc/puppetlabs/activemq/log4j.properties
Comment this line:          #log4j.rootLogger=INFO, console, logfile
Uncomment this line:     log4j.rootLogger=DEBUG, logfile, console
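
The same edit can be scripted rather than made by hand in vim. This is a minimal sketch, assuming the file contains the two rootLogger lines exactly as shown above. enable_activemq_debug is a made-up helper name.

```shell
# Hypothetical helper: comment out the INFO root logger and uncomment the
# DEBUG one in the given log4j.properties file, keeping a .orig backup.
enable_activemq_debug() {
    props="$1"
    # Create the backup only once, so a second run still starts from the original.
    [ -e "$props.orig" ] || cp -p "$props" "$props.orig" || return 1
    # Rewrite the live file from the backup copy.
    sed -e 's/^log4j\.rootLogger=INFO,/#&/' \
        -e 's/^#\(log4j\.rootLogger=DEBUG,\)/\1/' \
        "$props.orig" > "$props"
}

# Usage (as root on the MoM):
# enable_activemq_debug /etc/puppetlabs/activemq/log4j.properties
```

To go back to INFO logging, copy the .orig file over the live one and bounce the service again.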

Bounce the service:
sv=pe-activemq; echo "== $sv"; puppet resource service $sv ensure=stopped
sv=pe-activemq; echo "== $sv"; puppet resource service $sv ensure=running

The log rotates, so use -F
tail -F /var/log/puppetlabs/activemq/activemq.log


Certificate expiration messages started appearing:


2019-12-31T08:44:01.440-06:00 | WARN | Transport Connection to: tcp://<node_ip>:44166 failed: javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_expired | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: ssl:///<node_ip>:44166
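
A certificate_expired alert can be checked against the certificate file itself with openssl. The sketch below is an assumption about how one might do that, not something Puppet support prescribed. cert_valid_for is a made-up helper name, and which certificate to inspect (here the node's own agent certificate) is a guess based on the log line.

```shell
# Hypothetical helper: succeeds when the PEM certificate is still valid
# for at least the given number of seconds (default 0 = valid right now).
cert_valid_for() {
    cert="$1"
    seconds="${2:-0}"
    openssl x509 -checkend "$seconds" -noout -in "$cert" >/dev/null
}

# Usage on a managed node (assumes the agent certificate is the one to check):
# cert=$(puppet config print hostcert)
# openssl x509 -enddate -noout -in "$cert"
# cert_valid_for "$cert" || echo "certificate has expired"
```

Passing a larger second argument, such as 2592000 for 30 days, flags certificates that are about to expire.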

Puppet support says to disable MCollective if we're not using it.

Tried rekeying Orchestration on the MoM:
mv /etc/puppetlabs/orchestration-services/ssl/<files> /tmp
cp <puppet>/ssl <orchestration
chown -R pe-orchestration-services:pe-orchestration-services /etc/puppetlabs/orchestration-services/ssl

CAUSE:
The real cause was the PXP agent.  On one of the failing managed nodes, the PXP agent was throwing errors that it couldn't connect to wss://<compile_master_load_balancer>:8140/pcp2/agent

SOLUTION:
Documentation says to set master_uris and pcp_broker_list for the PE Agent and PE Infrastructure Agent groups in the console (https://puppet.com/docs/pe/2018.1/installing_compile_masters.html).  For example:

Change the PXP agent to connect directly to the MoM, not the compile masters' load balancer.
Classification --> PE Infrastructure --> PE Agent --> Configuration tab
Class: puppet_enterprise::profile::agent
master_uris = ["https://<MoM_FQDN>/"]
pcp_broker_list = ["<MoM_FQDN>:8142"]

However, in our experience, removing master_uris and pcp_broker_list had a much higher success rate.

After removing them, run the agent on the managed node.  Successful connections should start appearing in the log:
tail -F /var/log/puppetlabs/pxp-agent/pxp-agent.log

Also, the new master_uris and pcp_broker_list values should appear in the pxp-agent.conf file that the MoM manages remotely on the managed node.  Look on the managed node:
cat /etc/puppetlabs/pxp-agent/pxp-agent.conf | python -m json.tool

EOS