Opened 10 years ago

Closed 10 years ago

#1481 closed Question/How do I...? (fixed)

Filtering in maildrop

Reported by: https://id.mayfirst.org/workersliberty Owned by: https://id.mayfirst.org/jamie
Priority: Medium Component: Tech
Keywords: email filtering maildrop Cc:
Sensitive: no

Description

In order to not have to create the same filters in Thunderbird on multiple computers, I have today recreated my rule set on the Mayfirst server.

I then disabled my filters in Thunderbird, but now none of my emails get filtered unless I log in on the webmail first.

Is there a way to apply my filters to all emails as soon as they are delivered to the inbox without me having to access the webmail interface?

On Gmail for example this works as expected: when I open Thunderbird, I find all my e-mails already filtered into folders as per my rule set on the server.

Martin

Change History (4)

comment:1 Changed 10 years ago by https://id.mayfirst.org/jamie

  • Keywords email filtering maildrop added

Hi Martin,

The good news is that it is possible (I do it myself). The bad news is that it requires a bit more work than normal - since it is not a frequently requested setup.

The problem is that the filtering that Horde/IMP provides is triggered by the application itself (and the same with Thunderbird). So, filtering only happens when you log in.

What you are after is filtering that is applied when messages are delivered to your account.

For starters - are you using IMAP with Thunderbird? You will want to use IMAP - otherwise your messages will get filtered to mailboxes that Thunderbird won't be able to access.

Our servers use a program called maildrop to deliver messages to your inbox (and other mailboxes). maildrop looks for a file in your home directory on your primary server called .mailfilter. If it finds this file, it will read it and follow whatever directions you provide it. You can create a regular text file with the name .mailfilter (the initial period is important) and sftp it to your home directory.

Here's an example from my file. The first line is pretty important when you are testing - it allows you to make sure what you do doesn't cause any errors. More (generic) examples are available on the courier web site.

Depending on your IMAP client - your mailboxes may appear differently to the client than to maildrop. For example, some clients will show a hierarchy of mailboxes (like Sent -> 2008 -> July). To maildrop, that mailbox would be represented as: .Sent.2008.July.
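As a rough illustration of that naming rule, here is a small Python sketch. The function name maildir_folder is made up for this example, and it assumes the Maildir/ directory name and the leading-period convention described in this ticket; other servers may be set up differently.

```python
def maildir_folder(*parts, base="Maildir"):
    """Return the delivery path for a nested IMAP mailbox.

    Assumes the convention described in this ticket: each level of the
    hierarchy is joined with a period and the name gets a leading period.
    """
    if not parts:
        # No sub-mailbox given: this is the inbox itself.
        return f"{base}/"
    return f"{base}/." + ".".join(parts) + "/"


print(maildir_folder("Sent", "2008", "July"))
print(maildir_folder("daily", "apache"))
```

The resulting string is what would go after the to command in a .mailfilter rule.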

In addition, maildrop uses regular expressions, which are a powerful way to identify strings of text. In my example below, "." matches any single character and "*" means "zero or more of the preceding item", so ".*" matches any run of characters.

If you are unfamiliar with regular expressions - it can be a bit daunting. However, we're happy to help you craft filtering rules that work for you - just post to the ticket and we'll get through it together.

logfile "$HOME/mailfilter.log"
MAILBOX="$HOME/Maildir/"

if (/^From: .*scomp@aol.*/ )
{
        to "Maildir/.daily.feedback-loop/"
}


# get rid of spam
if (/^X-Spam-Flag: YES/)
{
        to "Maildir/.detected-spam/"
}

# filter system messages that should not get caught by spam filter
if (/^To: .*apache@mayfirst.org/ )
{
        to "Maildir/.daily.apache/"
}
if (/^To: .*logcheck@mayfirst.org/ )
{
        to "Maildir/.daily.logcheck/"
}
if (/^To: .*cron-apt@mayfirst.org/ )
{
        to "Maildir/.daily.cron-apt/"
}

comment:2 follow-up: Changed 10 years ago by https://id.mayfirst.org/workersliberty

Thanks. Wow. I know a tiny bit about regex, but daunting indeed.

A few initial questions...

ONE. This bit:

logfile "$HOME/mailfilter.log"
MAILBOX="$HOME/Maildir/"

has to be the first two lines of the .mailfilter file?

But there's no set end line for that file?

TWO. In

if (/X-Spam-Flag: YES/) {

to "Maildir/.detected-spam/"

}

the name of the spam mailbox is detected-spam?

So if I want a filter to send spam to a mailbox called spam, it would be..?

if (/X-Spam-Flag: YES/) {

to "Maildir/.spam/"

}

THREE. A filter to send all mail with awl-chat in the subject line to a mailbox called awl-c would be...?

if (/Subject:.*[:wbreak:]awl-chat[:wbreak:]/) {

to "Maildir/.awl-c/"

}

FOUR. What's with the [:wbreak:]? (Google does not seem to have come across this character string).

FIVE. You have mailbox references in the form

"Maildir/.detected-spam/"

http://www.courier-mta.org/maildrop/maildropex.html

has them in the form

Mail/project

Yours is right and theirs is wrong? Or they're equally valid alternatives? Or the difference reflects some difference in the filtering that your examples are doing and the filtering that maildropex's examples are doing?

SIX. A filter to send all mail from cathyn56@… to a mailbox called a-cathy would be...?

if (/From: *cathyn56@hotmail\.com/ {

to "Maildir/.a-cathy/"

}

SEVEN. A filter that filters on condition A AND condition B has an "if" like this:

if (/From: *boss@domain\.com/ \

&& /Subject:.*[:wbreak:]project status[:wbreak:]/)

i.e. && is logical-AND.

How is logical-OR represented?

Is there a way of representing a logical-NOT, e.g. of constructing a filter which sends any messages I receive which are NOT addressed to martin@… or martin.thomas@… to a particular folder?

EIGHT. You say .* represents any character. So your filter

if (/To: .*apache@…/ ) {

to "Maildir/.daily.apache/"

}

filters all mail to addresses like siouxapache@…, navajoapache@…, chickasawapache@… into your daily.apache mailbox, which is in fact a mailbox called apache as a sub-mailbox of a mailbox called daily?

A filter like

if (/From: .*@mayfirst.org/ ) {

to "Maildir/.stuff/"

}

would filter all mail from ann@…, bill@…, cara@…... zeke@… into the mailbox stuff?

NINE. In this method of filtering, does the system stop filtering each message once it has come across a line in .mailfilter which picks up that message? Or does it put every message through all the lines (filters), one after the other?

I.e. suppose I have the filter to send all mail from cathyn56@… to mailbox a-cathy, and, after that in the .mailfilter file, a filter to send all mail with awl-chat in subject line to mailbox awl-c, what happens to a message from cathyn56@… with awl-chat in the subject line?

Is it filtered by the first filter into mailbox a-cathy, and then not touched by the second filter?

Or it is moved by both filters in turn, thus ending up in mailbox awl-c?

TEN.

Above we have if (/From:, if (/To:, and if (/Subject:.

On maildropex there is if (/Delivered-To:, which I guess is the same as filtering messages which have a particular address in their To: or Cc: or Bcc: fields.

There are also some funky ones which I won't want to use, like an "if" for messages more than a certain number of lines long.

Is there anywhere a list of all the if (/... things available?

ELEVEN.

When I sort out a .mailfilter file and am ready to sftp it to my home directory... where is my home directory? How do I find it via e.g. FileZilla?

TWELVE. Presumably once I sort out this .mailfilter thing I should disable all the filters currently on the webmail, and the (matching) ones in Thunderbird. Is there a quick way to dispose of all the filters on the webmail?

Thanks!

comment:3 in reply to: ↑ 2 Changed 10 years ago by https://id.mayfirst.org/jamie

Replying to https://id.mayfirst.org/workersliberty:

Good questions! I'm learning a lot about maildrop as I delve more into it :).

ONE. This bit:

logfile "$HOME/mailfilter.log"
MAILBOX="$HOME/Maildir/"

has to be the first two lines of the .mailfilter file?

I'm not sure that they have to be the first two lines. The mailfilter man page says that the entire file is parsed before any action is taken, so I suspect they could be placed anywhere.

But there's no set end line for that file?

The man page also says that "In maildrop, the end of line is a lexical token." I take that to mean that you don't need a semi-colon or any other end of line indicator. Just hit return in your editor.

This also means that if you edit the file on a Windows machine or a Macintosh and then upload it without converting the line breaks, you may have trouble. You can learn more than you ever wanted to about line breaks on Wikipedia. The upshot is that if you are copying your file from a Windows or Macintosh computer using an sftp program, be sure to indicate to the sftp program that the file is a text file. Most sftp programs will automatically translate the line breaks for text files.
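If the sftp program does not translate line breaks, the file can be converted before uploading. A minimal Python sketch of that conversion follows; the function name to_unix_newlines is only illustrative, and the sample bytes are the two lines from the example file.

```python
def to_unix_newlines(data: bytes) -> bytes:
    """Convert Windows (CRLF) and old Mac (CR) line endings to Unix (LF)."""
    # Replace CRLF first so the lone-CR pass does not double up newlines.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


sample = b'logfile "$HOME/mailfilter.log"\r\nMAILBOX="$HOME/Maildir/"\r\n'
print(to_unix_newlines(sample))
```

To use it on a real file, read the file in binary mode, pass the bytes through the function, and write the result back in binary mode.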

TWO. In

if (/^X-Spam-Flag: YES/) {

to "Maildir/.detected-spam/"

}

the name of the spam mailbox is detected-spam?

Yes - that's correct.

So if I want a filter to send spam to a mailbox called spam, it would be..?

if (/^X-Spam-Flag: YES/) {

to "Maildir/.spam/"

}

Yes - exactly.

THREE. A filter to send all mail with awl-chat in the subject line to a mailbox called awl-c would be...?

if (/^Subject:.*[:wbreak:]awl-chat[:wbreak:]/) {

to "Maildir/.awl-c/"

}

The [:wbreak:] syntax was unfamiliar to me, although I see it in the man page for maildrop on the courier web site. And, it's also referenced in an old email thread from 2005 that claims it is in the maildrop filter man page. However, I don't see it in either the maildropfilter man page on the courier web site or in my own man page.

Furthermore, my man page says: "Versions of maildrop prior to version 2.0 (included in the Courier Mail Server 0.51, and earlier), used a built-in pattern matching engine, instead of using the PCRE library (see the “Patterns” section)."

My guess is that :wbreak: was part of the old pattern matching engine.

Since we're running a version of maildrop greater than 2, I would suggest instead:

 if (/^Subject:.*[^a-zA-Z0-9_]awl-chat[^a-zA-Z0-9_]/)
 {
   to "Maildir/.awl-c/"
 }

According to the old thread (http://markmail.org/message/jg5mugy27aae6p67) that should be the same thing.
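One thing to watch with the character-class version: it needs an actual character after awl-chat, so a subject line that ends with awl-chat would not match. The sketch below compares it with a \b (word boundary) version. It uses Python's re module rather than maildrop's PCRE engine, so treat it as an approximation; the sample subject lines are invented.

```python
import re

# Pattern suggested in this ticket: a non-word character on each side.
class_pattern = re.compile(r"^Subject:.*[^a-zA-Z0-9_]awl-chat[^a-zA-Z0-9_]")
# Alternative using \b (word boundary), which PCRE also supports.
boundary_pattern = re.compile(r"^Subject:.*\bawl-chat\b")

subjects = [
    "Subject: [awl-chat] meeting notes",  # tag in the middle of the line
    "Subject: Re: awl-chat",              # tag at the very end of the line
    "Subject: awl-chatter",               # a longer word, not the list tag
]

for s in subjects:
    print(s, bool(class_pattern.search(s)), bool(boundary_pattern.search(s)))
```

Both patterns accept the first subject and reject the third; only the \b version accepts the second.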

FOUR. What's with the [:wbreak:]? (Google does not seem to have come across this character string).

Yes - my feelings exactly!

FIVE. You have mailbox references in the form

"Maildir/.detected-spam/"

http://www.courier-mta.org/maildrop/maildropex.html

has them in the form

Mail/project

Yours is right and theirs is wrong? Or they're equally valid alternatives? Or the difference reflects some difference in the filtering that your examples are doing and the filtering that maildropex's examples are doing?

The differences are all valid alternatives.

Wrapping in double quotes appears to be unnecessary. I'm not sure what would happen if you did not use quotes and you had a space in your mailbox name though. Since it works with quotes, I would suggest keeping them in to be safe.

The Mail vs. Maildir is based on our system configuration. We have our systems configured to use Maildir as the name of your mailbox, which is standard if your mailbox is using the Maildir format for storing email. Mbox is an older format that traditionally uses the directory name Mail.

And the whole period thing - I'm not entirely sure how or where or why that's set to be perfectly honest. I just know that when an IMAP client tells our IMAP server: "I want a mailbox named foo" our IMAP server puts a period in front of it.

SIX. A filter to send all mail from cathyn56@… to a mailbox called a-cathy would be...?

if (/^From: *cathyn56@hotmail\.com/ {

to "Maildir/.a-cathy/"

}

Yup - that looks good to me with the only problem that you appear to be missing the closing parenthesis on the if statement.

SEVEN. A filter that filters on condition A AND condition B has an "if" like this:

if (/^From: *boss@domain\.com/ \

&& /Subject:.*[:wbreak:]project status[:wbreak:]/)

i.e. && is logical-AND.

Yup - that's correct.

How is logical-OR represented?

OR is represented with two vertical bars:

||

Is there a way of representing a logical-NOT, e.g. of constructing a filter which sends any messages I receive which are NOT addressed to martin@… or martin.thomas@… to a particular folder?

There are a couple ways to do this.

One way would be to use the fact that maildrop runs the filters in order. So, at the bottom of your list of filters, you could specify that all remaining email messages that are addressed to martin.thomas get put in the inbox (to "Maildir/"). Then, the last line would filter everything else to another box.

Alternatively, you could do something like this:

# Capture everything after "To:" into $MATCH1
if ( /^To:\s*(.*)/ )
{
   if ( $MATCH1 !~ /martin\.thomas@workersliberty\.org/ )
   {
    to "Maildir/.not-martin/"
   }
}

The contents of the first parens are placed into the variable $MATCH1.

EIGHT. You say .* represents any character. So your filter

if (/^To: .*apache@…/ ) {

to "Maildir/.daily.apache/"

}

filters all mail to addresses like siouxapache@…, navajoapache@…, chickasawapache@… into your daily.apache mailbox, which is in fact a mailbox called apache as a sub-mailbox of a mailbox called daily?

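Yes - that's correct (I'm a little sloppy with my filters :). In fact it would also match apache@mayfirstzorg, because the unescaped period matches any character. It really should be /^To:.*[^a-zA-Z0-9_-]apache@mayfirst\.org/, which requires a non-word character before apache and a literal period before org.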
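To see the difference between a loose and a tight pattern, here is a quick check using Python's re module as a stand-in for maildrop's PCRE engine. The tight pattern is one possible tightening (a negated character class before apache plus an escaped period), not necessarily the only one.

```python
import re

# Loose: anything may precede "apache", and "." matches any character.
loose = re.compile(r"^To: .*apache@mayfirst.org")
# Tight: a non-word character must precede "apache"; the period is literal.
tight = re.compile(r"^To:.*[^a-zA-Z0-9_-]apache@mayfirst\.org")

headers = [
    "To: apache@mayfirst.org",
    "To: siouxapache@mayfirst.org",
    "To: apache@mayfirstzorg",
]

for h in headers:
    print(h, bool(loose.search(h)), bool(tight.search(h)))
```

The loose pattern matches all three headers, while the tight one matches only the first.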

A filter like

if (/^From: .*@mayfirst.org/ ) {

to "Maildir/.stuff/"

}

would filter all mail from ann@…, bill@…, cara@…... zeke@… into the mailbox stuff?

Yes - although /^From: .*@mayfirst\.org/ would be better.

NINE. In this method of filtering, does the system stop filtering each message once it has come across a line in .mailfilter which picks up that message? Or does it put every message through all the lines (filters), one after the other?

I.e. suppose I have the filter to send all mail from cathyn56@… to mailbox a-cathy, and, after that in the .mailfilter file, a filter to send all mail with awl-chat in subject line to mailbox awl-c, what happens to a message from cathyn56@… with awl-chat in the subject line?

Is it filtered by the first filter into mailbox a-cathy, and then not touched by the second filter?

Yes - by default, maildrop will act on the first matching filter: after executing a to command it stops processing the message (you can use the cc command in place of the to command if you want to deliver a message to a mailbox and take additional actions).
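The to versus cc behaviour can be modelled in a few lines of Python. This is only a toy simulation, not how maildrop is implemented: the rule table, the deliver function and the example.invalid address are all made up for illustration, and the patterns are simplified versions of the cathyn56 and awl-chat filters discussed in this ticket.

```python
import re

# Each rule is (action, header regex, destination mailbox).
RULES = [
    ("to", r"^From:.*cathyn56@", "Maildir/.a-cathy/"),
    ("to", r"^Subject:.*awl-chat", "Maildir/.awl-c/"),
]


def deliver(headers, rules=RULES, default="Maildir/"):
    """Return the list of mailboxes a message would be written to."""
    destinations = []
    for action, pattern, mailbox in rules:
        if any(re.search(pattern, line) for line in headers):
            destinations.append(mailbox)
            if action == "to":
                # "to" delivers the message and stops processing.
                return destinations
            # "cc" delivers a copy and carries on with later rules.
    # No "to" rule matched, so the message lands in the default mailbox.
    destinations.append(default)
    return destinations


msg = ["From: cathyn56@example.invalid", "Subject: [awl-chat] hello"]
print(deliver(msg))
```

With both rules using to, the message ends up only in a-cathy. Changing the first rule's action to cc would put a copy in a-cathy and then deliver to awl-c as well.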

If you wanted to - you could put the cathy filter first and specify matches from cathy but not with the subject line awl-chat (or vice versa). Or, if you are really bold, it looks like maildrop supports weighted scoring. According to the man page:

 WEIGHTED SCORING
       Patterns are evaluated by maildrop as any other numerical expression.
       If a pattern is found, maildrop’s filter interprets the results of the
       pattern match as number 1, or true, for filtering purposes. If a
       pattern is not found the results of the pattern search is zero. Once a
       pattern is found, the search stops. Second, and subsequent occurrences
       of the same pattern are NOT searched for.

       maildrop can also do weighted scoring. In weighted scoring, multiple
       occurrences of the same pattern are used to calculate a numerical
       score.

       To use a weighted search, specify the pattern as follows:

              /pattern/:options,xxx,yyy
       where xxx and yyy are two numbers. yyy is optional -- it will default
       to 1, if missing.

       The first occurrence of the pattern is evaluated as xxx. The second
       occurrence of the pattern is evaluated as xxx*yyy, the third as
       xxx*yyy*yyy, etc... All occurrences of the pattern are added up to
       calculate the final score.

              Note: maildrop does not recognize multiple occurrences of the
              same pattern in the same line. Multiple occurrences of the same
              pattern in one line count as one occurrence.

I can't quite tell how it works or how you evaluate the score though.
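Going only by the man page excerpt, the arithmetic of the score can be sketched in Python. This reproduces the formula (xxx, then xxx*yyy, then xxx*yyy*yyy, summed, with each line counting at most once); it is not maildrop itself, and the function name weighted_score and the sample lines are invented.

```python
import re


def weighted_score(pattern, lines, xxx, yyy=1.0):
    """Score a pattern the way the man page excerpt describes.

    The first matching line counts xxx, the second xxx*yyy, the third
    xxx*yyy*yyy, and so on. Several hits on one line count once.
    """
    score = 0.0
    weight = xxx
    for line in lines:
        if re.search(pattern, line):
            score += weight
            weight *= yyy
    return score


body = ["buy now", "nothing here", "buy buy buy", "please buy"]
print(weighted_score(r"buy", body, 10, 0.5))
```

Here three lines match, so the score is 10 + 5 + 2.5. Since the excerpt says patterns are evaluated like any other numerical expression, the score would presumably be compared against a threshold inside an if condition, though that part is an assumption.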

TEN.

Above we have if (/^From:, if (/^To:, and if (/^Subject:.

On maildropex there is if (/^Delivered-To:, which I guess is the same as filtering messages which have a particular address in their To: or Cc: or Bcc: fields.

I don't think Delivered-To will be very helpful to you. All messages coming through our system will have Delivered-To set to: martin.thomas@albizu.mayfirst.org.

If your address is in the bcc field - you will have no way of knowing which address the author of the email put in that field.

You might be interested in this:

hasaddr(string) - Search for an address.
                  if ( hasaddr(string) )
                  {
                     ...
                  }

              "string" is of the form user@domain. The hasaddr function
              returns 1 if this address is included in any To:,
              Cc:,Resent-To:, or Resent-Cc:, header in the message, otherwise
              this function returns 0.

              This is more than just a simple text search. Each header is
              parsed according to RFC822. Addresses found in the header are
              extracted, ignoring all comments and names. The remaining
              addresses are checked, and if "string" is one of them, hasaddr
              returns 1, otherwise it returns 0.

              The comparison is case-insensitive. This actually violates
              RFC822 (and several others) a little bit, because the user part
              of the address may be (but is not required to be) case
              sensitive.
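For a feel of what "more than just a simple text search" means, here is a rough Python analogue of the behaviour described in that excerpt, built on the standard email package. It is an approximation for illustration, not maildrop's code; the function name has_addr and the example.invalid addresses are made up.

```python
from email import message_from_string
from email.utils import getaddresses

# The headers the excerpt says hasaddr looks at.
ADDRESS_HEADERS = ("To", "Cc", "Resent-To", "Resent-Cc")


def has_addr(raw_message, wanted):
    """Return True if wanted appears as an address in the recipient headers."""
    msg = message_from_string(raw_message)
    values = []
    for name in ADDRESS_HEADERS:
        values.extend(msg.get_all(name, []))
    # getaddresses() drops display names and comments, leaving bare addresses.
    found = {addr.lower() for _name, addr in getaddresses(values)}
    return wanted.lower() in found


raw = (
    "From: someone@example.invalid\n"
    "To: Martin <Martin.Thomas@example.invalid>\n"
    "Cc: list@example.invalid (discussion list)\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)
print(has_addr(raw, "martin.thomas@example.invalid"))
print(has_addr(raw, "someone@example.invalid"))
```

The first check succeeds despite the display name and the different capitalisation; the second fails because From: is not one of the headers searched.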

Is there anywhere a list of all the if (/... things available?

That list is as long as the regular expressions help page :).

I think you might want to ssh into your primary server and then run the command:

man maildropfilter

It does list a lot of functions (like the hasaddr one above) that allow you to bypass regular expressions altogether. Now that I'm more familiar with the program, I plan to re-write a lot of my filter to use those functions instead of relying on regular expressions.

ELEVEN.

When I sort out a .mailfilter file and am ready to sftp it to my home directory... where is my home directory? How do I find it via e.g. FileZilla?

The first step will be to log in to the members control panel as your workersliberty user. Under the services drop down, select "Server access." Then, create a new item with the username martin.thomas. This will provide your username with access to the server.

Then, run FileZilla, logging in as martin.thomas and using the password you use when getting your email.

You will automatically be placed in your home directory. So, you can simply drag your .mailfilter file over. In addition, after receiving a message, you should see a mailfilter.log file appear, which you can open to check to make sure it's all working ok.

TWELVE. Presumably once I sort out this .mailfilter thing I should disable all the filters currently on the webmail, and the (matching) ones in Thunderbird. Is there a quick way to dispose of all the filters on the webmail?

How many do you have? I think you'll simply have to click on the trash icon next to each one. Looks like you'll just need to click on confirmation for each one - so hopefully it won't be too tedious.

Thanks!

Good luck and let us know how it goes.

jamie

comment:4 Changed 10 years ago by https://id.mayfirst.org/workersliberty

  • Resolution set to fixed
  • Status changed from new to closed
  • Summary changed from Email filters not working on IMAP to Filtering in maildrop
  • Type changed from Bug/Something is broken to Question/How do I...?
