Opened 5 years ago

Closed 5 years ago

#7044 closed Feature/Enhancement Request (wontfix)

Nagios check for syslog daemon

Reported by: https://id.mayfirst.org/gregl Owned by: https://id.mayfirst.org/ross
Priority: Medium Component: Tech
Keywords: rsyslog syslog Cc:
Sensitive: no

Description

There have been a couple of issues related to recent logging changes, most recently #7309.

Since logging is pretty important, perhaps we should add Nagios checks to make sure there's some logging daemon running on each machine? At the very least, I think we should have such a check until we're sure that our logging changes are finished.

If there's a logging device/socket that syslog and rsyslog (and others?) use, checking that such a device exists and is writable would be one way to handle this.
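
As a rough illustration of that idea, the Python sketch below tests whether a path exists, is a Unix socket, and is writable by the user running the check. The function name and the default path /dev/log (the conventional syslog socket on Linux) are assumptions made for illustration, not an agreed design.

```python
import os
import stat


def socket_exists_and_writable(path="/dev/log"):
    """Return (ok, detail) for a candidate logging socket.

    The default path is an assumption; pass whatever socket the
    local syslog daemon actually uses.
    """
    try:
        mode = os.stat(path).st_mode
    except OSError as exc:
        # Covers a missing path as well as permission problems on stat.
        return False, f"{path}: {exc.strerror}"
    if not stat.S_ISSOCK(mode):
        return False, f"{path} exists but is not a socket"
    if not os.access(path, os.W_OK):
        return False, f"{path} is a socket but is not writable by this user"
    return True, f"{path} is a writable socket"
```

Note that this only inspects the filesystem entry. A socket file can outlive the daemon that created it, so passing this test does not by itself show that anything is reading the logs.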

Change History (3)

comment:1 Changed 5 years ago by https://id.mayfirst.org/ross

  • Owner set to https://id.mayfirst.org/gregl
  • Status changed from new to assigned

Hi greg,

Any chance you could research the best way to test this logging device/socket? If you figure out what to check, I'll write the monitoring interface.

~/ross

comment:2 Changed 5 years ago by https://id.mayfirst.org/gregl

  • Owner changed from https://id.mayfirst.org/gregl to https://id.mayfirst.org/ross

It looks like we could use the /dev/log socket for this:

0 ken:~# ls -l /dev/log
srw-rw-rw- 1 root root 0 Apr  6 06:25 /dev/log
0 ken:~# lsof /dev/log
COMMAND    PID USER   FD   TYPE             DEVICE SIZE/OFF      NODE NAME
rsyslogd 13091 root    0u  unix 0xffff88083c70bf00      0t0 246244638 /dev/log

Here it's held open by rsyslog, and on my laptop it's held open by systemd, so it looks like a socket of this sort is held open by whatever is responsible for recording logs. Checking whether that socket is present seems like a decent first approximation of whether logging is enabled on a given machine.
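
Since the observation is that some process holds the socket open, one possible check is to try connecting to it: on Linux, connecting to a Unix socket path fails if the path is missing or if no process currently has it bound. The sketch below is a hypothetical illustration, not an agreed implementation. The function name is made up, and the return values follow the standard Nagios plugin convention (0 for OK, 2 for CRITICAL). Both datagram and stream sockets are tried because different daemons may use either type.

```python
import socket

# Standard Nagios plugin exit codes.
OK = 0
CRITICAL = 2


def check_log_listener(path="/dev/log"):
    """Return (exit_code, message) depending on whether a process
    is accepting connections on the Unix socket at `path`."""
    last_error = None
    for sock_type in (socket.SOCK_DGRAM, socket.SOCK_STREAM):
        sock = socket.socket(socket.AF_UNIX, sock_type)
        try:
            # Fails if the path is missing, if nothing has the socket
            # bound, or if the socket type does not match.
            sock.connect(path)
            return OK, f"LOG OK - a process is listening on {path}"
        except OSError as exc:
            last_error = exc
        finally:
            sock.close()
    return CRITICAL, f"LOG CRITICAL - cannot connect to {path}: {last_error}"
```

A wrapper script would print the message and exit with the returned code so that Nagios can interpret the result. The check does not send any log message, so it shows only that a listener exists, not that messages reach a log file.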

comment:3 Changed 5 years ago by https://id.mayfirst.org/ross

  • Resolution set to wontfix
  • Status changed from assigned to closed

Given my recent experience with various servers, I think the major issue with logging is that logrotate isn't working properly on some of them. However, this seems to be a problem only on servers whose logging process has not been restarted. When it happens, the /var logical volume fills up, and we already have a Nagios alert for that.

Since most servers have been rebooted at this point, I believe logrotate should now be functioning properly. Thus, I'm inclined not to implement this particular Nagios check.

~/ross
