Version 4 (modified by Jamie McClelland, 5 years ago)

--

Debug Server to Server connections

Our servers often need to ssh into each other to carry out various tasks. Most commonly:

  • Each server has to rsync, over ssh, to its designated backup servers
  • Each server copies data to jojobe, our nagios server, so we can get alerts if anything is amiss

These connections are configured using the monkeysphere, specifically:

  • Each server generates an OpenPGP key and corresponding authentication subkey
  • Each server runs an ssh agent via runit (/etc/sv/ssh-agent-root) that keeps the authentication subkey loaded in memory, so the server can use it to access remote servers
  • Each remote server is configured with the User IDs of the root OpenPGP keys that should be able to access it
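
To see whether the agent is actually holding a key, one option is to point ssh-add at the agent's socket. The following is a minimal sketch, not part of our tooling: the function name agent_status is made up for illustration, and the socket path it is meant to be given is the /root/.ssh-agent-socket that appears later on this page. It relies on ssh-add's documented exit codes (0 when keys are listed, 1 when the agent holds none, 2 when the agent cannot be contacted).

```shell
# agent_status SOCKET
# Hypothetical helper: report whether the ssh-agent behind SOCKET is usable.
agent_status() {
    sock="$1"
    # Without a socket file there is no agent to talk to.
    if [ ! -S "$sock" ]; then
        echo "no-socket"
        return 1
    fi
    # Ask the agent to list its keys; only the exit code matters here.
    rc=0
    SSH_AUTH_SOCK="$sock" ssh-add -l >/dev/null 2>&1 || rc=$?
    case "$rc" in
        0) echo "keys-loaded" ;;
        1) echo "agent-empty"; return 1 ;;
        *) echo "agent-unreachable"; return 1 ;;
    esac
}
```

Run as root on the connecting server, for example as agent_status /root/.ssh-agent-socket. Any answer other than keys-loaded is a hint to try the ssh-agent restart described on this page.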

Unfortunately, sometimes things go wrong and servers are not able to connect to each other.

In these examples I refer to the "connecting" server and the "target" server to distinguish between the two.

Here are the most common causes of failure, and their remedies. Note: you may need to repeat the first one after fixing the problem with one of the later steps. A failure to connect sometimes seems to kill the ssh-agent.

  • Something went wrong with ssh-agent on the connecting server. Fix: stop and restart the service, then check that the socket exists:
    sv stop ssh-agent-root
    sv start ssh-agent-root
    ls -l /root/.ssh-agent-socket
    
  • The target server does not have the latest version of the connecting server's OpenPGP key. Fix: on the target server, refresh the key, reload the credentials, and test:
    monkeysphere-authentication refresh-keys <username>
    monkeysphere-authentication update-users <username>
    cat /var/lib/monkeysphere/authorized_keys/<username>
    

Note: The last cat command must print the connecting server's key. If it does not, the connection will never work.
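
One way to confirm that, sketched here under assumptions rather than as an established procedure: on the connecting server, ssh-add -L prints the public keys the agent holds, and the second field of each line is the base64 key blob. The hypothetical helper below (key_in_file is a name invented for this example) checks whether a given blob appears in a file such as the target's /var/lib/monkeysphere/authorized_keys/<username>. It matches on any whitespace-separated field, so it does not depend on the exact layout of the lines monkeysphere writes.

```shell
# key_in_file BLOB FILE
# Hypothetical helper: succeed if any whitespace-separated field
# on any line of FILE is exactly BLOB (a base64 ssh public key blob).
key_in_file() {
    awk -v blob="$1" '
        { for (i = 1; i <= NF; i++) if ($i == blob) found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$2"
}
```

To use it, copy the second field of the ssh-add -L output from the connecting server, then on the target run key_in_file with that blob and the authorized_keys path. A non-zero exit status means the key the agent offers is not among the keys the target accepts.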

  • The connecting server has not published the latest version of its key. Fix: determine the keyid of the server's secret key, and then publish it:
    gpg --list-secret-key
    gpg --keyserver keys.mayfirst.org --send-key <keyid>
    
    Then, refresh the key on the target (see above).
  • The connecting server's OpenPGP key is expired. Fix: extend it:
    mf-gpg-extend-root-expiration
    
    (This also publishes the key.) Then, refresh the key on the target (see above).
  • The connecting server's key has not been certified by an allowed key (or the certification has expired). Fix: On the connecting server, refresh the key's certifications:
    gpg --recv-key <keyid>
    
    Then, list the certifications:
    gpg --check-sigs <keyid>
    
    Then, on the target server, see if any of them match the allowed certifiers:
    monkeysphere-authentication list-id-certifiers
    
    If not, get someone on the allowed list to sign the key, then run the step for ensuring the target server has the latest version of the connecting server's OpenPGP key.
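
The expiration and certification checks above can also be read from gpg's machine-readable output. This is only a sketch under stated assumptions: both function names are invented for illustration, and the field positions follow my understanding of GnuPG's colon-delimited format, where field 2 of a pub record is "e" for an expired key and, with --check-sigs, field 2 of a sig record is "!" for a signature that verified and field 5 is the signer's key ID.

```shell
# key_is_expired
# Hypothetical helper: reads `gpg --with-colons --list-keys` output on
# stdin and succeeds if a primary key (pub record) is marked expired.
key_is_expired() {
    awk -F: '
        $1 == "pub" && $2 == "e" { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# good_certifier_keyids
# Hypothetical helper: reads `gpg --with-colons --check-sigs` output on
# stdin and prints the unique key IDs of signatures that verified.
# The key's own self-signatures are included in the list.
good_certifier_keyids() {
    awk -F: '$1 == "sig" && $2 == "!" { print $5 }' | LC_ALL=C sort -u
}
```

For example, pipe gpg --with-colons --list-keys <keyid> into key_is_expired, or gpg --with-colons --check-sigs <keyid> into good_certifier_keyids. The resulting key IDs still have to be compared by eye with what monkeysphere-authentication list-id-certifiers reports on the target server.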