= Debug Server to Server connections =

Our servers often need to ssh into each other to carry out various tasks. Most commonly:

 * Each server has to rsync, over ssh, to its designated backup servers
 * Each server copies data to jojobe, our nagios server, so we can get alerts if anything is amiss

These connections are configured using the [https://monkeysphere.info monkeysphere], specifically:

 * Each server generates an OpenPGP key and corresponding authentication subkey
 * Each server runs an ssh agent via runit (/etc/sv/ssh-agent-root) that keeps the authentication subkey loaded in memory so it can use it to access remote servers
 * Each remote server is configured with the User IDs of the root OpenPGP keys that should be able to access it.

Unfortunately, sometimes things go wrong and servers are not able to connect to each other.

In these examples I refer to the "connecting" server and the "target" server to distinguish between the two.

Here are the most common causes of failure and their remedies.

Note: you may need to repeat the first one ''after'' fixing the problem with one of the later steps. A failed connection sometimes seems to kill the ssh-agent.

 * Something went wrong with ssh-agent on the connecting server. Fix: stop and restart the service, then check that the socket exists:
   {{{
   sv stop ssh-agent-root
   sv start ssh-agent-root
   ls -l /root/.ssh-agent-socket
   }}}
 * The target server does not have the latest version of the connecting server's OpenPGP key. Fix: on the target server, refresh the key, reload the credentials, and test:
   {{{
   monkeysphere-authentication refresh-keys
   monkeysphere-authentication update-users
   cat /var/lib/monkeysphere/authorized_keys/
   }}}
 * The connecting server has not published the latest version of its key. Fix: determine the keyid of the server's secret key, and then publish it:
   {{{
   gpg --list-secret-key
   gpg --keyserver keys.mayfirst.org --send-key
   }}}
   Then, refresh the key on the target (see above).
 * The connecting server's OpenPGP key is expired. Fix: extend it:
   {{{
   mf-gpg-extend-root-expiration
   }}}
   (This will also publish it). Then, refresh the key on the target (see above).
 * The connecting server's key has not been certified by an allowed key (or the certification has expired). Fix: on the connecting server, refresh the key's certifications:
   {{{
   gpg --recv-key
   }}}
   Then, list the certifications:
   {{{
   gpg --check-sigs
   }}}
   Then, on the target server, see if any of them match the allowed certifiers:
   {{{
   monkeysphere-authentication list-id-certifiers
   }}}
   If not, get someone on the allowed list to sign the key, then repeat the step above that ensures the target server has the latest version of the connecting server's OpenPGP key.
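
Because a failed connection can take the ssh-agent down with it, it may help to check the agent on the connecting server before and after each fix. The following is a minimal sketch, not part of our tooling: the function names `check_agent` and `try_connect` are hypothetical, and the default socket path is assumed from the `ls -l` check in the first fix above.

```shell
#!/bin/sh
# Sketch only: function names and the default socket path are assumptions.

# Default socket path, taken from the "ls -l" check in the first fix.
AGENT_SOCKET="${AGENT_SOCKET:-/root/.ssh-agent-socket}"

# check_agent [socket]: report whether the agent is reachable and holds a key.
check_agent() {
    sock="${1:-$AGENT_SOCKET}"
    if [ ! -S "$sock" ]; then
        echo "no socket at $sock; restart ssh-agent-root"
        return 1
    fi
    # ssh-add -l exits 0 when it lists keys, 1 when the agent holds none,
    # and 2 when it cannot contact the agent.
    rc=0
    SSH_AUTH_SOCK="$sock" ssh-add -l >/dev/null 2>&1 || rc=$?
    case $rc in
        0) echo "agent ok, key loaded" ;;
        1) echo "agent running but no key loaded"; return 1 ;;
        *) echo "cannot contact agent at $sock"; return 1 ;;
    esac
}

# try_connect user@host [socket]: non-interactive login test against the target.
try_connect() {
    # BatchMode makes ssh fail instead of prompting for a password.
    SSH_AUTH_SOCK="${2:-$AGENT_SOCKET}" \
        ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}
```

After sourcing the file on the connecting server, run `check_agent` first; if it reports a problem, apply the first fix before trying anything else. `try_connect` takes the login as `user@host`, and which user to use on the target is an assumption left to whoever runs it. A non-zero exit status from `try_connect` means the connection still fails and the next cause in the list is worth checking.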