Opened 8 years ago

Last modified 5 years ago

#4138 assigned Task/To do item

automatic nagios backup audits

Reported by: https://id.mayfirst.org/jamie Owned by: https://id.mayfirst.org/jamie
Priority: Urgent Component: Tech
Keywords: backup Cc:
Sensitive: no

Description

Currently, our only backup audit is an email I get (sent to root@…) warning me about backup failures. This email is triggered by the backup process itself, so if the backup never runs or dies early, no warning is sent. That makes it an unreliable audit.

I propose that we configure our nagios server to initiate audits, perhaps by grepping /var/log/backupninja.log on each server and checking dates and success / failure entries.
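A rough sketch of what such a log audit could look like, in Python. It assumes that each backupninja run ends with a summary line of the form "Mar 10 01:00:05 Info: FINISHED: 3 actions run. 0 fatal. 0 error. 0 warning." (the exact format should be checked against a real log), and that a backup older than 26 hours counts as stale. The function name audit_log and the 26 hour threshold are made up for illustration. The return codes follow the usual nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import re
from datetime import datetime, timedelta

# Conventional nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# ASSUMPTION: summary line format; verify against an actual backupninja log.
FINISHED = re.compile(
    r"^(?P<stamp>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) .*FINISHED: "
    r"(?P<actions>\d+) actions run\. (?P<fatal>\d+) fatal\. "
    r"(?P<error>\d+) error\. (?P<warning>\d+) warning\."
)


def parse_stamp(stamp, now):
    """Parse a year-less syslog-style timestamp, guessing the year from `now`.

    Note: %b is locale-dependent; this assumes English month abbreviations.
    """
    parsed = datetime.strptime(f"{now.year} {stamp}", "%Y %b %d %H:%M:%S")
    if parsed > now:
        # The entry was probably written before a year rollover.
        parsed = datetime.strptime(f"{now.year - 1} {stamp}", "%Y %b %d %H:%M:%S")
    return parsed


def audit_log(lines, now, max_age=timedelta(hours=26)):
    """Return (exit_code, message) based on the last FINISHED line in `lines`."""
    last = None
    for line in lines:
        match = FINISHED.search(line)
        if match:
            last = match
    if last is None:
        return CRITICAL, "no FINISHED entry found in log"
    try:
        finished_at = parse_stamp(last["stamp"], now)
    except ValueError:
        return UNKNOWN, f"could not parse timestamp {last['stamp']!r}"
    if now - finished_at > max_age:
        return CRITICAL, f"last backup finished at {finished_at}, which is too old"
    if int(last["fatal"]) or int(last["error"]):
        return CRITICAL, f"{last['fatal']} fatal, {last['error']} error"
    if int(last["warning"]):
        return WARNING, f"{last['warning']} warning"
    return OK, f"{last['actions']} actions run at {finished_at}"
```

A wrapper script would read the log file, call audit_log(lines, datetime.now()), print the message and exit with the returned code. Because the check runs from the nagios side, a backup that never starts shows up as a stale or missing entry rather than as silence.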

It seems like nagios-nrpe-server could provide a daemon on each server that would provide access, although its security seems to be based solely on the IP address of the nagios server (and the conf file has notes saying that even this check is only rudimentary).

Alternatively, we could create a nagios user on each server and provide key-based ssh access that way.

Change History (8)

comment:1 Changed 8 years ago by https://id.mayfirst.org/jamie

At the last support meeting, dkg proposed publishing our backup status publicly via the web.

This would involve ensuring that a small-footprint web server (like nginx) is installed on each server in our network and that this type of status info is published on a special port (e.g. 2525).

This would make backup status info available both to nagios and to our members or anyone in the world.

jamie

comment:2 Changed 8 years ago by https://id.mayfirst.org/jamie

We've tried and abandoned the nginx plan: too much overhead to get a secure connection. Now we are scp'ing the files from each server to jojobe. I still haven't set up the nagios check on the files that are copied over, though...
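For the missing check on the copied files, something along these lines might be a starting point. It is only a sketch: it assumes the files land in a single directory with one file per server named "<host>.log" (a made-up naming scheme, the real layout may differ), and it uses each file's modification time as a proxy for when the server last reported. check_copied_reports and the 26 hour threshold are hypothetical.

```python
import os
import time

# Conventional nagios plugin exit codes.
OK, CRITICAL = 0, 2


def check_copied_reports(directory, expected_hosts, max_age_seconds=26 * 3600, now=None):
    """Return (exit_code, message) listing hosts whose report is stale or missing.

    ASSUMPTION: each host's report is stored as <directory>/<host>.log.
    """
    now = time.time() if now is None else now
    stale, missing = [], []
    for host in expected_hosts:
        path = os.path.join(directory, f"{host}.log")
        try:
            age = now - os.stat(path).st_mtime
        except FileNotFoundError:
            missing.append(host)
            continue
        if age > max_age_seconds:
            stale.append(host)
    if stale or missing:
        parts = []
        if stale:
            parts.append("stale: " + ", ".join(stale))
        if missing:
            parts.append("missing: " + ", ".join(missing))
        return CRITICAL, "; ".join(parts)
    return OK, f"all {len(expected_hosts)} reports fresh"
```

Checking against an explicit host list matters here: a server whose scp silently stops would otherwise just leave an old file behind, or no file at all, and nobody would notice.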

jamie

comment:3 Changed 6 years ago by https://id.mayfirst.org/ross

  • Status changed from new to assigned

comment:4 Changed 6 years ago by https://id.mayfirst.org/jamie

  • Priority changed from Medium to High

comment:5 Changed 5 years ago by https://id.mayfirst.org/dskallman

Is there an update on this? It's been a while since the last response.

comment:6 Changed 5 years ago by https://id.mayfirst.org/jamie

  • Priority changed from High to Urgent

comment:7 Changed 5 years ago by https://id.mayfirst.org/dskallman

What's the urgency?

comment:8 Changed 5 years ago by https://id.mayfirst.org/jamie

Until this is fixed, the only person who learns about failed backups is me :(.
