Changes between Initial Version and Version 1 of clearing-spam-backscatter-from-mailq


Timestamp: Sep 18, 2012, 10:32:26 PM (12 years ago)
Author: Nat Meysenburg

[[PageOutline]]
There have been a few tickets lately (#6166, #6199) where an email account was hacked and used to send spam. The result is a flood of bounces from other mail servers that are rejecting the messages. This is commonly referred to as [https://en.wikipedia.org/wiki/Backscatter_%28e-mail%29 backscatter].

This is a breakdown of the process used to resolve #6166 and #6199.

= How it starts =
We generally don't find out about these events until they are well underway and members are complaining about the slowness of the mail server.

The tool for checking the mail queue is `mailq`. Run without arguments, it lists a little information about every email currently in the queue. You can roughly count the emails with `mailq | wc -l`. The gotcha here is that each email takes about three lines, so the number you're looking for is roughly a third of that value. Under normal conditions (as I have observed them) there should only be about 40-100 emails in the queue.

{{{
0 chavez:~/ticket6199# mailq |wc -l
125
}}}

When we looked for #6166, that number was around 985,000, meaning about 300,000 spam messages. #6199 was only about 30,000 messages.

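Dividing the line count by three is only an estimate, since an entry can span a different number of lines. Here is a minimal sketch of a more direct count, assuming each entry in the `mailq` output starts at the beginning of a line with a hexadecimal queue ID (optionally followed by `*` or `!`). The function name `count_queue_entries` is made up for illustration.

```shell
# Sketch: count queue entries in mailq-style output read from stdin.
# Assumption: entry lines start at column 0 with a hex queue ID,
# optionally followed by "*" (active) or "!" (on hold).
count_queue_entries() {
  grep -cE '^[0-9A-F]+[*!]?[[:space:]]'
}

# Hypothetical usage on the mail server:
#   mailq | count_queue_entries
# Postfix also prints a summary on the last line of the output:
#   mailq | tail -n 1
```
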
= Figuring out the culprit =
Generally, the beginning of the `mailq` output contains the highest concentration of backscatter (though at this point the queue is mostly backscatter anyway). The first thing to do is find out which account is receiving all of the backscatter. To get an idea, have a look at the first 20 or so emails.

{{{
mailq |head -60 |more
}}}

Look for a recurring email address. With a backscatter problem, the mail is most likely being delivered to a mailbox on the server (rather than forwarded), so the string to look for is something like `USERNAME@chavez.mayfirst.org` (assuming `chavez` is the server that you're looking at).

For this example let's call the user `spam-account`, with an email of `spam-account@example.com`.

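Spotting the recurring address by eye works, but a frequency count can make it more obvious. Here is a rough sketch, assuming the addresses in the `mailq` output look like ordinary `user@host` strings. The function name `top_addresses` is made up for illustration, and the regular expression is deliberately loose.

```shell
# Sketch: list the most frequent email addresses in text read from stdin.
# The optional first argument is how many addresses to show (default 10).
top_addresses() {
  grep -oE '[[:alnum:]._%+-]+@[[:alnum:].-]+' |
    sort | uniq -c | sort -rn | head -n "${1:-10}"
}

# Hypothetical usage on the mail server:
#   mailq | head -600 | top_addresses 5
```
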
= Figuring out where all the spam lives =
Depending on the nature of the backscatter, the actual messages might be in several places within the spool.

Run this to see which dirs might be full.

{{{
SPOOL=/var/spool/postfix/; for dir in $(ls $SPOOL);do echo "$dir: $(ls $SPOOL/$dir |wc -l)"; done
}}}

It looks something like this when run:
{{{
0 chavez:~# SPOOL=/var/spool/postfix/; for dir in $(ls $SPOOL);do echo "$dir: $(ls $SPOOL/$dir |wc -l)"; done
active: 2
bounce: 0
corrupt: 0
defer: 16
deferred: 16
dev: 1
etc: 6
flush: 0
hold: 0
incoming: 0
lib: 12
maildrop: 0
pid: 17
private: 26
public: 5
saved: 0
trace: 0
usr: 1
}}}

If the server were dealing with queue issues, some of those numbers would be in the thousands.

'''Note:''' There's a gotcha here. The `defer` and `deferred` directories actually have subdirectories, so the counts above only show the number of subdirectories, not messages. If the counts don't look promising in the `incoming` or `active` dirs, look inside those two.

Make a note of which directories are full of spam.

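To get past the subdirectory gotcha, the count can be done recursively. Here is a minimal sketch using `find`, assuming the spool lives at `/var/spool/postfix` unless another path is given. The function name `count_queue_files` is made up for illustration. It counts regular files only, so sockets and pipes under `public` and `private` are not included.

```shell
# Sketch: print "dir: N" for each top-level spool directory, where N is
# the number of regular files at any depth below it.
count_queue_files() {
  local spool dir
  spool=${1:-/var/spool/postfix}
  for dir in "$spool"/*/; do
    [ -d "$dir" ] || continue
    printf '%s: %s\n' "$(basename "$dir")" \
      "$(find "$dir" -type f | wc -l | tr -d ' ')"
  done
}

# Hypothetical usage on the mail server:
#   count_queue_files /var/spool/postfix
```
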
= Getting the mail flow started again =
Since our members seem to actually want their emails delivered in a timely fashion, we need to non-destructively clean out the queue.

Follow each of these steps!

== Step 0: Disable the account ==
Log into the control panel, find `spam-account@example.com`, and disable it.

== Step 1: Stop postfix ==
{{{
service postfix stop
}}}

== Step 2: Create clean spool dirs ==
Now that our postfix spool directories aren't being written to, we can clean out directories. Since we don't want to destroy any real emails that are in the queue, we want to move the spool directories without deleting the mail itself.

Let's say the problem dirs in this case are `incoming` and `active`. First move them to something with a different name that postfix won't write to. For the scripts that I've written I use two variant suffixes: `.spamfull` and `.name-collisions`. I'll get to `.name-collisions` later; for now, we're just moving the dirs.

'''Note:''' Remember to change these if you're not dealing with incoming or active.

{{{
mv /var/spool/postfix/incoming /var/spool/postfix/incoming.spamfull
mv /var/spool/postfix/active /var/spool/postfix/active.spamfull
}}}

Now we need to recreate the original dirs.

{{{
mkdir /var/spool/postfix/incoming
mkdir /var/spool/postfix/active
}}}

And make sure the permissions are correct.

{{{
chmod 700 /var/spool/postfix/incoming
chmod 700 /var/spool/postfix/active
chown postfix:postfix /var/spool/postfix/incoming
chown postfix:postfix /var/spool/postfix/active
}}}

Now we have empty queue directories that postfix can write to. When it restarts all new mail should start getting handled as normal.

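The move, recreate, and permission commands in this step could be wrapped in one helper so that no directory is forgotten halfway. This is only a sketch. The function name `set_aside_queue_dir` and its arguments are made up for illustration, and it assumes postfix has already been stopped as in Step 1. The owner argument is optional so the function can be tried on a scratch directory without root.

```shell
# Sketch: set_aside_queue_dir SPOOL NAME [OWNER]
# Moves SPOOL/NAME to SPOOL/NAME.spamfull and recreates an empty SPOOL/NAME
# with mode 700. OWNER (e.g. postfix:postfix) is applied only when given.
set_aside_queue_dir() {
  local spool name owner src dst
  spool=$1; name=$2; owner=${3:-}
  src=$spool/$name
  dst=$spool/$name.spamfull
  if [ ! -d "$src" ]; then
    echo "no such directory: $src" >&2; return 1
  fi
  if [ -e "$dst" ]; then
    # Never clobber a directory that was set aside earlier.
    echo "refusing to overwrite: $dst" >&2; return 1
  fi
  mv "$src" "$dst" || return 1
  mkdir "$src" || return 1
  chmod 700 "$src" || return 1
  if [ -n "$owner" ]; then
    chown "$owner" "$src" || return 1
  fi
}

# Hypothetical usage, as root, with postfix stopped:
#   set_aside_queue_dir /var/spool/postfix incoming postfix:postfix
#   set_aside_queue_dir /var/spool/postfix active postfix:postfix
```
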
== Step 3: Start Postfix ==
{{{
service postfix start
}}}

Email should once again be flowing at reasonable speeds... and now we can worry about separating out backscatter spam from the ham.

= Reinserting the good messages back into the mail queue =
So now that we have the new mail getting delivered, we need to get the real email messages delivered to their recipients.

This step takes a little bit of art. What we're trying to do is come up with a couple of grep expressions that will hopefully match all of the spam messages. Thankfully, spammers usually have similar stuff in all of the emails they send out.

Have a look at the first few emails in the `.spamfull` directory, particularly looking for mails to `spam-account@example.com`. That's your first grep expression. In #6166 the second pattern was that all the messages were pointing to various domains with a `media.php` file. In #6199 all of them had an email address in the body (`swiftvalue@yahoo.cn`).

Figuring this out takes a little work because the emails themselves are in a binary file format, so we need to run them through `strings`. I tend to work on a subset of mails. The command below uses relative paths, so run it from `/var/spool/postfix` and make sure the `~/ticket6199/testmails` directory exists first.

{{{
for foo in $(ls incoming.spamfull/ |head -20);do strings incoming.spamfull/$foo > ~/ticket6199/testmails/$foo-strings-spamful;done
}}}

That will give you twenty messages, hopefully containing both spam and ham. '''Be sure to delete this dir when you are done, since there are some actual emails there!'''

Test your greps. For example, let's say our two patterns are `spam-account@example.com` and `media.php`. Spam will usually contain both. Note that `grep -c` with two `-e` patterns counts the lines that match either one, so a count of 0 means that neither pattern appears in the message.

This is an edited example of how to look for such things, using the #6199 patterns and assuming that you are in the test mail dir:

{{{
0 chavez:~/ticket6199/testmails# for foo in $(ls); do echo "$foo: $(grep -c -e spam-account@example.com -e swiftvalue@yahoo.cn $foo)"; done
000338CE68-strings-spamful: 3
0003A711C9-strings-spamful: 3
001127DED8-strings-spamful: 3
001387199E-strings-spamful: 3
0014B8C579-strings-spamful: 0
001FD8D839-strings-spamful: 2
002138C8FE-strings-spamful: 3
002278D8B0-strings-spamful: 2
0023D4EF87-strings-spamful: 3
00285C0EC-strings-spamful: 2
0029970F32-strings-spamful: 0
002A38C95D-strings-spamful: 3
002CA8DCD9-strings-spamful: 2
002D7713F4-strings-spamful: 3
002FF4E7D8-strings-spamful: 2
003568D3ED-strings-spamful: 0
003FF8CB54-strings-spamful: 2
0048B8DBD1-strings-spamful: 2
004F37DCD8-strings-spamful: 0
004F87D469-strings-spamful: 0
}}}

The lines with 0 at the end are ham, and everything else is spam. In this case, some results with a 1 could arguably have counted as ham (since there could be legitimate emails to the spam-account address); however, since the account is disabled, it probably doesn't matter.

So assuming we're satisfied with our grep patterns, we now need a handful of scripts to do the actual move of ham back into the mail queue for delivery to its recipients.

Remember the `.name-collisions` directories? These are there out of an abundance of caution. None of us were sure how postfix assigns names to the email files, so the script checks that a file of the same name doesn't already exist before moving a message over the top of it. If there is a name collision, we move the message into that directory instead. With both tickets, this wasn't an issue.

Before running the script(s) we first need to create those dirs. As per our example:

{{{
mkdir /var/spool/postfix/incoming.name-collisions
mkdir /var/spool/postfix/active.name-collisions
}}}

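When the test set is larger than twenty files, a quick tally of how many files match can be easier to read than one count per file. Here is a rough sketch that runs over a directory of `strings` output. The function name `tally_matches` is made up for illustration, and it matches the patterns as fixed strings (`grep -F`), which is an assumption on my part and slightly stricter than the plain `grep` used above.

```shell
# Sketch: tally_matches DIR PATTERN...
# Counts files in DIR that match at least one PATTERN (spam) versus
# files that match none (ham).
tally_matches() {
  local dir patterns f ham spam
  dir=$1; shift
  if [ "$#" -lt 1 ]; then
    echo "usage: tally_matches DIR PATTERN..." >&2; return 2
  fi
  # grep accepts several patterns in one -e argument, one per line.
  patterns=$(printf '%s\n' "$@")
  ham=0; spam=0
  for f in "$dir"/*; do
    [ -f "$f" ] || continue
    if grep -qF -e "$patterns" "$f"; then
      spam=$((spam + 1))
    else
      ham=$((ham + 1))
    fi
  done
  echo "ham=$ham spam=$spam"
}

# Hypothetical usage:
#   tally_matches ~/ticket6199/testmails spam-account@example.com swiftvalue@yahoo.cn
```
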
This script will need to be modified to meet the criteria of your spam search. It will only work on the dirs that don't have subdirectories (i.e. not `defer` and `deferred`).

{{{
#!/bin/bash

# simple script by nat to cleanup the spam backscatter in mailq that was killing the
# server on 9/18/12

SPAM_DIR=/var/spool/postfix/incoming.spamfull
HAM_DIR=/var/spool/postfix/incoming
COLLISION_DIR=/var/spool/postfix/incoming.name-collisions

for message in $(ls -U "$SPAM_DIR"/); do
  # A count of 0 means neither spam pattern appears, so treat the message as ham.
  if [[ $(strings "$SPAM_DIR/$message" | grep -c -e spam-account@example.com -e media.php) -eq 0 ]]; then
    if [[ ! -f "$HAM_DIR/$message" ]]; then
      mv -n "$SPAM_DIR/$message" "$HAM_DIR/"
    else
      # Name already taken in the live queue: set the message aside instead.
      mv -n "$SPAM_DIR/$message" "$COLLISION_DIR/"
    fi
  fi
done
exit 0
}}}

If you have to traverse subdirectories, something like this should work. Again, remember to change the grep patterns. Note that this version only leaves a message behind as spam if it contains both patterns.

{{{
#!/bin/bash

# simple script by nat to cleanup the spam backscatter in mailq that was killing the
# server on 9/5/12

SPAM_DIR=/var/spool/postfix/defer.spamfull
HAM_DIR=/var/spool/postfix/defer
COLLISION_DIR=/var/spool/postfix/defer.name-collisions

for dir in $(ls "$SPAM_DIR"); do
  for message in $(ls -U "$SPAM_DIR/$dir"); do
    # Ham if at least one of the two patterns is missing.
    if [[ $(strings "$SPAM_DIR/$dir/$message" | grep -c media.php) -eq 0 ]] ||
       [[ $(strings "$SPAM_DIR/$dir/$message" | grep -c spam-account) -eq 0 ]]; then
      if [[ -d "$HAM_DIR/$dir" && ! -f "$HAM_DIR/$dir/$message" ]]; then
        mv -n "$SPAM_DIR/$dir/$message" "$HAM_DIR/$dir/"
      else
        # -p also creates $COLLISION_DIR itself if it does not exist yet.
        mkdir -p "$COLLISION_DIR/$dir"
        mv -n "$SPAM_DIR/$dir/$message" "$COLLISION_DIR/$dir/"
      fi
    fi
  done
done
exit 0
}}}

Once these are done, all of the real emails should be delivered, and you now have a directory full of spam.

These scripts are non-destructive. They move things rather than deleting them, so if you screwed up your greps, you can always refine them and repeat.

I have been using one script per directory that I'm cleaning. Future versions could probably do more than one at a time. Future versions may also benefit from turning the grep patterns into variables.
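
As a starting point for such a future version, here is an untested-in-production sketch that keeps the patterns in one variable and handles several directories, with or without subdirectories, in one run. The names `SPAM_PATTERNS`, `is_ham`, and `restore_ham` are made up for illustration. It treats a message as spam if it matches any pattern, matches fixed strings, and uses `grep -a` (read binary data as text) in place of the `strings` pipeline, all of which are assumptions rather than what the scripts above do.

```shell
# Sketch only. SPAM_PATTERNS holds one fixed-string pattern per line.
SPAM_PATTERNS='spam-account@example.com
media.php'

# Succeeds when the file matches none of the patterns (treated as ham).
is_ham() {
  [ -n "$SPAM_PATTERNS" ] || return 2
  ! grep -aqF -e "$SPAM_PATTERNS" "$1"
}

# restore_ham SPOOL NAME...
# For each NAME, move ham from SPOOL/NAME.spamfull back into SPOOL/NAME,
# keeping any subdirectory layout. Spam stays where it is.
restore_ham() {
  local spool name
  spool=$1; shift
  for name in "$@"; do
    find "$spool/$name.spamfull" -type f | while IFS= read -r src; do
      is_ham "$src" || continue
      rel=${src#"$spool/$name.spamfull/"}
      dest_dir=$spool/$name/$(dirname "$rel")
      # Missing target directory or an existing file of the same name:
      # set the message aside instead of touching the live queue.
      if [ ! -d "$dest_dir" ] || [ -e "$dest_dir/$(basename "$rel")" ]; then
        dest_dir=$spool/$name.name-collisions/$(dirname "$rel")
        mkdir -p "$dest_dir"
      fi
      mv -n "$src" "$dest_dir/"
    done
  done
}

# Hypothetical usage, as root:
#   restore_ham /var/spool/postfix incoming active defer
```
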