Opened 3 weeks ago

Closed 11 days ago

#13843 closed Bug/Something is broken (fixed)

php5-fpm not always restarted properly

Reported by: https://id.mayfirst.org/jamie
Owned by: https://id.mayfirst.org/jamie
Priority: Medium
Component: Tech
Keywords:
Cc:
Sensitive: no

Description

This morning I started php5-fpm on several moshes (viewsic, rodolpho, peery, magon, kerr and a few others).

Here is the output of systemctl --failed and systemctl status php5-fpm on kerr before I started it:

1 kerr:~# systemctl --failed
0 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.
0 kerr:~# systemctl status php5-fpm
● php5-fpm.service - The PHP FastCGI Process Manager
   Loaded: loaded (/lib/systemd/system/php5-fpm.service; enabled)
   Active: inactive (dead) since Fri 2018-06-29 04:16:20 EDT; 4h 36min ago
 Main PID: 9285 (code=exited, status=0/SUCCESS)
   Status: "Processes active: 0, idle: 0, Requests: 504, slow: 0, Traffic: 0req/sec"

Jun 29 04:16:20 kerr systemd[1]: Stopping The PHP FastCGI Process Manager...
Jun 29 04:16:20 kerr systemd[1]: Stopped The PHP FastCGI Process Manager.
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
3 kerr:~# 


Change History (5)

comment:1 Changed 3 weeks ago by https://id.mayfirst.org/jamie

  • Owner set to https://id.mayfirst.org/jamie
  • Status changed from new to assigned

Because the service was stopped cleanly (inactive rather than failed), systemctl --failed doesn't pick it up.

I am pretty sure the service was stopped by the needrestart program, which tries to restart services that use libraries that have been upgraded, so I suspect there is a bug to report somewhere around here.

However, for now at least, I think the answer is to write a new mf-monitor script for moshes that simply ensures a set list of services is running at all times.

comment:2 Changed 3 weeks ago by https://id.mayfirst.org/jamie

Because http, smtp and imap/pop connections are monitored by nagios remotely, we would just need to focus on services that don't provide a remote port to probe: php, clamsmtp, clamav, mysql, spampd.

comment:3 Changed 3 weeks ago by https://id.mayfirst.org/jaimev

So if there is an indication from systemd that a service is inactive, even if it hasn't failed, maybe we can still use that? Looking through the systemctl man page I found a few ideas.

You can query any service to find out if it is active.

0 floriberto:~# systemctl is-active php5-fpm
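
A minimal sketch of how that query could be used from a script. It relies only on systemctl is-active printing the unit's state and exiting with 0 when the unit is active; the function name report_if_not_active is made up for illustration.

```shell
# Hypothetical helper: print a line for a unit that systemd does not
# consider active, and return 1 in that case.
report_if_not_active() {
    unit=$1
    # is-active prints the state (active, inactive, failed, ...) and
    # exits with 0 only when the unit is active.
    if state=$(systemctl is-active "$unit" 2>/dev/null); then
        return 0
    fi
    echo "$unit is ${state:-unknown}"
    return 1
}
```

It would be called as: report_if_not_active php5-fpm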

Also, in theory, we could get a list of inactive services.

0 floriberto:~# systemctl --type=service --state=inactive

Additionally, systemctl can query specific properties of any service.

0 floriberto:~# systemctl -p ActiveState -p SubState show php5-fpm
ActiveState=active
SubState=running
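
As a sketch of how those two properties could be read from a script: systemctl show prints one Key=Value line per requested property, so each line can be split on the first "=". The function name unit_state is hypothetical.

```shell
# Hypothetical helper: print "ActiveState/SubState" for one unit,
# for example "active/running".
unit_state() {
    active='' sub=''
    # Read the Key=Value lines from a here-document rather than a pipe,
    # so the variables set inside the loop survive after it ends.
    while IFS='=' read -r key value; do
        case $key in
            ActiveState) active=$value ;;
            SubState)    sub=$value ;;
        esac
    done <<EOF
$(systemctl show -p ActiveState -p SubState "$1")
EOF
    echo "$active/$sub"
}
```

It would be called as: unit_state php5-fpm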

comment:4 Changed 3 weeks ago by https://id.mayfirst.org/jaimev

Ok, I just used the command below to check on postfwd on didier, which I knew had crashed but wasn't failed or inactive. The SubState was helpful.

0 didier:~# systemctl -p ActiveState -p SubState show postfwd
ActiveState=active
SubState=exited

After restarting it shows running again.

didier:~# systemctl -p ActiveState -p SubState show postfwd
ActiveState=active
SubState=running

Other properties that might be useful for getting a timestamp for the time of exit are the following.

 ExecStart={ path=/etc/init.d/postfwd ; argv[]=/etc/init.d/postfwd start ; ignore_errors=no ; start_time=[Thu 2018-06-28 23:21:37 EDT] ; stop_time=[Thu 2018-06-28 23:21:38 EDT] ; pid=12769 ; code=exited ; status=0 }
 ExecStop={ path=/etc/init.d/postfwd ; argv[]=/etc/init.d/postfwd stop ; ignore_errors=no ; start_time=[Thu 2018-06-28 23:21:37 EDT] ; stop_time=[Thu 2018-06-28 23:21:37 EDT] ; pid=12756 ; code=exited ; status=0 }
 InactiveExitTimestamp=Thu 2018-06-28 23:21:37 EDT
 InactiveExitTimestampMonotonic=14859037174801
 ActiveEnterTimestamp=Thu 2018-06-28 23:21:38 EDT
 ActiveEnterTimestampMonotonic=14859037566933
 ActiveExitTimestamp=Thu 2018-06-28 23:21:37 EDT
 ActiveExitTimestampMonotonic=14859037124322
 InactiveEnterTimestamp=Thu 2018-06-28 23:21:37 EDT
 InactiveEnterTimestampMonotonic=14859037171914
 ConditionTimestamp=Thu 2018-06-28 23:21:37 EDT
 ConditionTimestampMonotonic=14859037172361
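
A single timestamp can also be requested on its own instead of being picked out of the full property list. A small sketch, assuming the property names shown above; the function name unit_timestamp is invented for this example.

```shell
# Hypothetical helper: print the value of one timestamp property of a unit.
unit_timestamp() {
    unit=$1 prop=$2
    # systemctl show -p prints a single "Property=value" line.
    line=$(systemctl show -p "$prop" "$unit")
    # Strip everything up to the first "=" so only the value remains.
    echo "${line#*=}"
}
```

It would be called as: unit_timestamp postfwd ActiveEnterTimestamp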

I'm also curious about the WatchdogTimestamp. This was at 0 before restarting.

 WatchdogTimestampMonotonic=0

comment:5 Changed 11 days ago by https://id.mayfirst.org/jamie

  • Resolution set to fixed
  • Status changed from assigned to closed

Thanks for the extra research. I just updated mf-monitor-services so it checks SubState for mysql, postfwd, clamav-daemon, spampd and all versions of php-fpm.
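
The mf-monitor-services change itself is not shown in this ticket. The following is only a rough sketch of that kind of check, assuming that "running" is the only acceptable SubState; the function name check_substate is hypothetical.

```shell
# Hypothetical sketch, not the real mf-monitor-services code.
# Prints each given unit whose SubState is not "running" and
# returns 1 if at least one such unit was found.
check_substate() {
    rc=0
    for unit in "$@"; do
        line=$(systemctl show -p SubState "$unit")
        sub=${line#SubState=}
        if [ "$sub" != "running" ]; then
            echo "$unit: SubState=$sub"
            rc=1
        fi
    done
    return "$rc"
}
```

It would be called as: check_substate mysql postfwd clamav-daemon spampd php5-fpm

Only units that are expected to be running on the host should be passed in, since anything else would be reported as a problem.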

I've initially pushed to viewsic and stokely for testing, but we could sign a tag tomorrow to get it out to everyone.
