Opened 3 years ago

Last modified 2 years ago

#12112 assigned Bug/Something is broken

systemd on rodolpho was confused about the state of postfwd

Reported by: Daniel Kahn Gillmor
Owned by: Jamie McClelland
Priority: Medium
Component: Tech
Keywords: postfwd rodolpho.mayfirst.org systemd yser.mayfirst.org
Cc: Jamie McClelland
Sensitive: no

Description

Restarting services using systemd...
Job for postfwd.service failed. See 'systemctl status postfwd.service' and 'journalctl -xn' for details.
0 rodolpho:~# systemctl status postfwd
● postfwd.service - LSB: start and stop the postfw daemon                                                                                                        
   Loaded: loaded (/etc/init.d/postfwd)
   Active: failed (Result: exit-code) since Tue 2016-09-13 18:19:04 EDT; 1min 14s ago
  Process: 5293 ExecStop=/etc/init.d/postfwd stop (code=killed, signal=PIPE)
  Process: 3888 ExecStart=/etc/init.d/postfwd start (code=exited, status=25)

0 rodolpho:~# 

but in practice, there was a postfwd daemon running. I had to kill it off manually after looking for it in the process table, then:

systemctl stop postfwd.service
systemctl start postfwd.service

and now systemd seems to understand that things are OK:

0 rodolpho:~# systemctl status postfwd
● postfwd.service - LSB: start and stop the postfw daemon
   Loaded: loaded (/etc/init.d/postfwd)
   Active: active (running) since Tue 2016-09-13 18:22:46 EDT; 8s ago
  Process: 5293 ExecStop=/etc/init.d/postfwd stop (code=killed, signal=PIPE)
  Process: 13818 ExecStart=/etc/init.d/postfwd start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/postfwd.service
           ├─13822 /usr/sbin/postfwd --summary=600 --cache=600 --cache-rdomain-only --cache-no-size --daemon --file=/etc/postfix/postfwd.cf --interface=127.0.0.1 --port=10040 --user=postfw --group=postfw --pidfile=/var/run/p...
           ├─13823  postfwd2::cache
           ├─13824  postfwd2::policy
           ├─13825  postfwd2::policy::child
           ├─13826  postfwd2::policy::child
           ├─13827  postfwd2::policy::child
           ├─13828  postfwd2::policy::child
           ├─13829  postfwd2::policy::child
           ├─13830  postfwd2::policy::child
           ├─13831  postfwd2::policy::child
           ├─13832  postfwd2::policy::child
           ├─13833  postfwd2::policy::child
           └─13834  postfwd2::policy::child

Sep 13 18:22:46 rodolpho postfwd[13818]: Starting postfwd: Pid_file "/var/run/postfwd.pid" already exists.  Overwriting!
Sep 13 18:22:46 rodolpho postfwd2/master[13822]: Started cache at pid 13823
Sep 13 18:22:46 rodolpho postfwd2/master[13822]: Started server at pid 13824
Sep 13 18:22:46 rodolpho postfwd2/master[13823]: 2016/09/13-18:22:46 postfwd2::cache (type Net::Server::Multiplex) starting! pid(13823)
Sep 13 18:22:46 rodolpho postfwd[13818]: postfwd.
Sep 13 18:22:46 rodolpho postfwd2/master[13823]: Binding to UNIX socket file "/var/tmp/postfwd2-cache.socket"
Sep 13 18:22:46 rodolpho postfwd2/master[13824]: 2016/09/13-18:22:46 postfwd2::server (type Net::Server::PreFork) starting! pid(13824)
Sep 13 18:22:46 rodolpho postfwd2/cache[13823]: ready for input
Sep 13 18:22:46 rodolpho postfwd2/master[13824]: Binding to TCP port 10040 on host 127.0.0.1 with IPv4
Sep 13 18:22:46 rodolpho postfwd2/policy[13824]: ready for input
0 rodolpho:~# 
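
The mismatch described above (systemd reporting the unit as failed while a daemon is still in the process table) can be checked for with a small shell function. This is only a sketch: the function name is made up for illustration, and it assumes the daemon can be found by matching its command line with `pgrep -f`.

```shell
#!/bin/sh
# Hypothetical helper: compare systemd's view of a unit with the process table.
# Prints one of: consistent, stray-process, missing-process.
# Assumption: PATTERN is specific enough that `pgrep -f` (which matches
# against full command lines) only finds the daemon itself.
unit_vs_process_table() {
    unit=$1
    pattern=$2

    # `systemctl is-active --quiet` exits 0 only if the unit is active.
    if systemctl is-active --quiet "$unit"; then
        systemd_thinks_running=yes
    else
        systemd_thinks_running=no
    fi

    # `pgrep -f` exits 0 if at least one process matches the pattern.
    if pgrep -f "$pattern" >/dev/null; then
        actually_running=yes
    else
        actually_running=no
    fi

    case "$systemd_thinks_running/$actually_running" in
        no/yes) echo stray-process ;;    # the situation reported in this ticket
        yes/no) echo missing-process ;;
        *)      echo consistent ;;
    esac
}
```

Called as `unit_vs_process_table postfwd.service /usr/sbin/postfwd`, it would be expected to report `stray-process` in the failed state shown at the top of this ticket, since the leftover daemon's command line contains `/usr/sbin/postfwd`.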

perhaps systemd could track the daemon's state more reliably if the service were described with a native .service file instead of a wrapper around the older sysvinit script?

I'm open to suggestions.

Change History (5)

comment:1 Changed 3 years ago by JaimeV

Cc: Jamie McClelland added
Owner: set to Daniel Kahn Gillmor
Status: changed from new to assigned

comment:2 Changed 3 years ago by Jamie McClelland

Yes - a .service file would be great (and perhaps should be contributed to the debian package?).

If you can write a file I can add it to puppet and open a bug report against the debian package.

Also, for the record, postfwd has been added to our servers but has not yet been tested and implemented (#9746) - so I'm going to re-prioritize that ticket so we are actually using postfwd instead of debugging it.

And lastly, in #9746 I modified how puppet manages the service (set hasstatus => false) to avoid errors on sysvinit systems. I've updated puppet (but not yet pushed to all servers) to change this behavior on systemd machines because systemctl does have a working status function.

comment:3 Changed 3 years ago by Daniel Kahn Gillmor

I don't know anything about postfwd, either how it works generally, or how it's intended to be implemented on MF/PL servers. I can read the manual pages and #9746 and take a stab at writing up a postfwd.service and post it here.

/usr/share/doc/postfwd/versions.txt suggests that postfwd1 is faster for rate limits, which appears to be what we're using it for, yet postfwd2 appears to be what we're running on rodolpho.

Is there a reason for this? There's a bunch of stuff in postfwd itself that could easily be taken care of by systemd -- no need to specify which interface to use, or which port to open, or which user to switch to, etc.

If i'm writing a postfwd.service file, is there any reason to keep that stuff around, or should we let systemd handle it? For full socket activation, we might need a patch to postfwd itself, so maybe i'll leave the socket activation out for now?

comment:4 Changed 3 years ago by Daniel Kahn Gillmor

Owner: changed from Daniel Kahn Gillmor to Jamie McClelland

I just set this up manually on rodolpho:

0 rodolpho:~# cat /etc/systemd/system/postfwd.service
[Unit]
Description=Postfix firewall daemon
After=network.target
Documentation=man:postfwd(1)

[Service]
Type=forking
EnvironmentFile=-/etc/default/postfwd
# User=postfwd
ExecStart=/usr/sbin/postfwd $ARGS --user $RUNAS --group $RUNAS --daemon --file $CONF --interface $INET --port $PORT --pidfile /run/postfwd/postfwd.pid

[Install]
WantedBy=multi-user.target

to get it going, i did:

systemctl stop postfwd.service
systemctl daemon-reload
systemctl enable postfwd.service
systemctl start postfwd.service

This is a gross hack for many reasons, and would be better with some fixes to upstream postfwd:

  • postfwd should not try to write a pidfile if no --pidfile argument is supplied. this would remove the need for etc/tmpfiles.d/postfwd.conf entirely. We don't need a pidfile if the process is under proper supervision.
  • postfwd should not try to change user or group if it is given no --user or --group argument and is already running as a non-superuser. This would let us get rid of the $RUNAS business and just set User=postfw in postfwd.service as i tried to do initially.
  • if no --interface or --port arguments are supplied, postfwd should check the $LISTEN_FDS environment variable and use any passed-in listening sockets if they're present instead of trying to listen directly on its default configuration. this would permit socket activation.
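
The third suggestion relies on systemd's socket-passing convention: inherited sockets start at file descriptor 3, $LISTEN_FDS holds how many were passed, and $LISTEN_PID names the process they are meant for. postfwd does not do this today (it would need the patch suggested above), so the following is only a shell sketch of the check itself; the function name is hypothetical.

```shell
#!/bin/sh
# systemd passes inherited sockets starting at file descriptor 3.
SD_LISTEN_FDS_START=3

# Hypothetical helper: print the inherited listening fd numbers, one per line.
# Takes the caller's own PID as its argument.
# Returns 1 (and prints nothing) if no sockets were passed to this process.
inherited_listen_fds() {
    self_pid=$1

    # LISTEN_PID must name this process; otherwise the variables were
    # merely inherited from a parent and the fds are not ours to use.
    [ "${LISTEN_PID:-}" = "$self_pid" ] || return 1

    # LISTEN_FDS must be a positive decimal count.
    case "${LISTEN_FDS:-}" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$LISTEN_FDS" -gt 0 ] || return 1

    fd=$SD_LISTEN_FDS_START
    last=$((SD_LISTEN_FDS_START + LISTEN_FDS - 1))
    while [ "$fd" -le "$last" ]; do
        echo "$fd"
        fd=$((fd + 1))
    done
}
```

A daemon following this convention would call the equivalent of `inherited_listen_fds "$$"` at startup and fall back to binding its configured interface and port only when the check fails.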

With the above changes, postfwd.service should look like this:

[Unit]
Description=Postfix firewall daemon
After=network.target
Documentation=man:postfwd(1)

[Service]
Type=forking
EnvironmentFile=-/etc/default/postfwd
User=postfw
ExecStart=/usr/sbin/postfwd $ARGS --daemon --file $CONF

[Install]
WantedBy=multi-user.target

and you could make it socket-activated by adding postfwd.socket:

[Unit]
Description=Postfix firewall daemon network listener
After=network.target
Documentation=man:postfwd(1)

[Socket]
ListenStream=127.0.0.1:10040

[Install]
WantedBy=sockets.target

With these changes, the only thing we'd need to put in /etc/default/postfwd would be:

CONF=/etc/postfix/postfwd.cf
ARGS="--summary=600 --cache=600 --cache-rdomain-only --cache-no-size"

(though if those should be the defaults, i'd be even happier without an /etc/default/postfwd at all)

It looks to me like we should be running postfwd1 instead of postfwd2, but perhaps that should be handled with update-alternatives on debian? See http://postfwd.org/versions.html for more info.

Jamie, i'm reassigning this to you for the postfwd1 vs postfwd2 question and puppet integration.

comment:5 Changed 2 years ago by Daniel Kahn Gillmor

Keywords: yser.mayfirst.org added

We're seeing this problem on yser as well right now.
