Opened 4 years ago

Closed 4 years ago

Last modified 3 years ago

#10378 closed Bug/Something is broken (fixed)

Cannot keep amavis running on mx1

Reported by: https://id.mayfirst.org/erq Owned by: https://id.mayfirst.org/jamie
Priority: Urgent Component: Tech
Keywords: mx1 amavis mail Cc:
Sensitive: no

Description

I found that 'amavis' had stopped early this morning. I followed the process suggested by Jamie here, but later I found it had stopped again, for a reason I cannot find yet.

If I follow the same process again it takes less than a minute for 'amavis' to stop again.

I found the following lines in the mail log

0 mx1:~# cat /var/log/mail.log |grep amavis
(...)
Feb  4 09:02:48 mx1 amavis[29116]: Creating db in /var/lib/amavis/db/; BerkeleyDB 0.42, libdb 4.8
Feb  4 09:04:29 mx1 amavis[29116]: (!!)TROUBLE in pre_loop_hook: db_init: BDB no dbS: __fop_file_setup:  Retry limit (100) exceeded, File exists. at (eval 96) line 272.
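
For anyone repeating this search, here is a small sketch that narrows the log down to the amavis lines carrying the "(!!)" marker seen in the TROUBLE line above. The function name is made up for illustration, and the default log path is simply the one used in this ticket.

```shell
# Hypothetical helper: print only amavis log lines flagged with "(!!)".
# Usage: amavis_trouble_lines [LOGFILE]
amavis_trouble_lines() {
    logfile="${1:-/var/log/mail.log}"
    # First keep lines logged by amavis[PID], then keep those with the
    # literal "(!!)" marker (as in "(!!)TROUBLE in pre_loop_hook").
    grep 'amavis\[' "$logfile" | grep -F '(!!)'
}
```

Running `amavis_trouble_lines` with no argument reads /var/log/mail.log; pass another path to check a rotated log.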

And these are the files I found in that directory:

0 mx1:~# ls -la /var/lib/amavis/db/
total 924
drwxr-x---   2 amavis amavis   4096 Feb  4 09:02 .
drwxr-x--- 257 amavis amavis 249856 Feb  4 09:02 ..
-rw-r-----   1 amavis amavis  12288 Feb  4 09:02 cache.db
-rw-r-----   1 amavis amavis   4096 Feb  4 09:02 cache-expiry.db
-rw-r-----   1 amavis amavis  24576 Feb  4 09:04 __db.001
-rw-r-----   1 amavis amavis 196608 Feb  4 09:04 __db.002
-rw-r-----   1 amavis amavis 270336 Feb  4 09:04 __db.003
-rw-r-----   1 amavis amavis 548864 Feb  4 09:04 __db.004
-rw-r-----   1 amavis amavis  12288 Feb  4 07:16 __db.snmp.db

Change History (7)

comment:1 Changed 4 years ago by https://id.mayfirst.org/erq

  • Owner set to https://id.mayfirst.org/jamie
  • Status changed from new to assigned

Jamie, I'm reassigning the ticket to you, hoping you can take a look. The mail queue is growing fast. I will keep searching.

Thanks in advance, Enrique

comment:2 Changed 4 years ago by https://id.mayfirst.org/erq

Would it be a good idea to configure postfix not to use amavis for now, while we look for a solution to this problem?

comment:3 Changed 4 years ago by https://id.mayfirst.org/jamie

I'll be tied up in a PTP training all day today, but can help tonight. Turning off amavis for today seems reasonable, although it does mean there will be an explosion of undetected spam (and possibly viruses). That still seems better than not getting email at all.

Tonight I can try switching mx1 from amavis (which I think does spamassassin and clamav processing) to running spamassassin and clamsmtp directly (which is how our MOSH'es work).

jamie

comment:4 Changed 4 years ago by https://id.mayfirst.org/erq

Thanks, I just commented out the content_filter line in /etc/postfix/main.cf:

# AMaViS parameters; activate, if avaible/used
# commented temporarily, please see https://support.mayfirst.org/ticket/10378
# content_filter = amavis:[127.0.0.1]:10024

Unfortunately there seems to be another cause of this problem, because the log now shows lines like this one:

Feb  4 10:54:43 mx1 postfix/error[6578]: 42AFA2230154: to=<raulhdezg@laneta.apc.org>, relay=none, delay=8815, delays=8711/103/0/1.5, dsn=4.4.2, status=deferred (delivery temporarily suspended: lost connection with 127.0.0.1[127.0.0.1] while receiving the initial server greeting)
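
One possible explanation, not confirmed in this ticket: Postfix records the content filter request in each queue file when the message is accepted, so mail queued before the main.cf change would still be routed to 127.0.0.1. The sketch below reloads Postfix and re-queues existing mail so that it is processed again under the current configuration. The function name is hypothetical, and it would have to run as root on the mail server.

```shell
# Hypothetical helper: make already-queued mail forget the old content filter.
# Assumes content_filter has already been commented out in main.cf.
requeue_without_filter() {
    # Make the running Postfix pick up the edited main.cf.
    postfix reload || return 1
    # Re-queue every message so it passes through cleanup again and
    # loses the stored content filter request.
    postsuper -r ALL || return 1
    # Ask the queue manager to attempt delivery of queued mail now.
    postqueue -f
}
```

Re-queueing a large backlog can take a while, so it is probably worth watching the queue size afterwards.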

comment:5 Changed 4 years ago by https://id.mayfirst.org/jamie

  • Resolution set to fixed
  • Status changed from assigned to feedback

Getting both clamsmtp and spamassassin working without amavis was a little more complicated than I thought, since on mx1, the mail users are virtual, whereas on the rest of our MOSH'es they are real unix users. If they are real unix users, you can invoke spamassassin via the mail delivery agent (maildrop) and not change any postfix configurations. That means postfix only has to send email through clamsmtp which is rather trivial.

However, with virtual mailboxes, we have to filter email from postfix -> clamsmtp -> postfix -> spamassassin -> postfix (I was loosely following a thread on this topic).

Given the added complexity, I took a closer look at the amavis problem and found a reference to solving it, which I followed, and now amavis seems to be running again without crashing.

It will probably take a few hours to get through the backlog of 8,000+ emails. Hopefully this will hold and we will be back working soon.

comment:6 Changed 4 years ago by https://id.mayfirst.org/erq

  • Status changed from feedback to closed

Thanks a lot, Jamie. I'm checking the described details and monitoring the queue now.

comment:7 Changed 3 years ago by https://id.mayfirst.org/erq

Just to document: we had a similar issue yesterday, and the same solution worked fine (that is, stopping postfix, removing all amavis database files, and starting postfix again so that they are rebuilt). Enrique
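
The removal step can be sketched as a small function. This is only an illustration: the function name is made up, the default directory is the one from the listing in the description, and the file patterns are taken from that listing (cache.db, cache-expiry.db, __db.001 to __db.004, __db.snmp.db).

```shell
# Hypothetical helper: delete the amavis BerkeleyDB files so they are
# recreated on the next start. Run only while the services are stopped.
# Usage: clear_amavis_db [DBDIR]
clear_amavis_db() {
    dbdir="${1:-/var/lib/amavis/db}"
    if [ ! -d "$dbdir" ]; then
        echo "no such directory: $dbdir" >&2
        return 1
    fi
    # Remove *.db and the __db.* environment files; other files are left alone.
    rm -f -- "$dbdir"/*.db "$dbdir"/__db.*
}
```

It would be called between the stop and start steps described above. How the services are stopped and started depends on the init system on mx1, which this ticket does not show.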
