Marek Aaron Sapota's web page

MailChariot

About

MailChariot's mission is quite simple:

  1. Take your incoming mail (from /var/spool/mail for example),
  2. Filter your emails through a spam filter like SpamAssassin,
  3. Depending on the verdict, put your email into the ham or spam mailbox,
  4. If you disagree with the spam filter's choice, you can move the email into the notham or notspam mailbox and MailChariot will let the spam filter know that it was wrong. Mail that you put into the notspam folder will be moved to the ham folder afterwards.
Notice that all of this is mail reader agnostic. Marking messages as spam only requires moving an email to a mailbox, so it's not important if you use Thunderbird, Mutt or whatever.

If you are on a shared host without root privileges and your admins do a bad job of keeping your mailbox spam free, you can use MailChariot to filter the mail yourself.

Reliability

MailChariot makes a best effort to keep your email safe. Your email is never stored only in memory: it is removed from a mailbox only after it has been saved in some other safe place. That said, if another process uses the same mailbox as MailChariot and doesn't do proper locking, the mailbox might get corrupted.
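The "save first, delete afterwards" rule can be illustrated with Python's standard mailbox module. This is only a sketch of the idea, not MailChariot's actual code; the function name move_message is made up for this example.

```python
import mailbox


def move_message(src, dst, key):
    """Copy one message from src to dst and only then delete it from src.

    Hypothetical helper: if anything fails before the copy is flushed,
    the original message is still in src.
    """
    src.lock()  # a no-op for Maildir, a real lock for mbox
    try:
        message = src[key]
        dst.lock()
        try:
            dst.add(message)
            dst.flush()  # make sure the copy is on disk first
        finally:
            dst.unlock()
        src.discard(key)
        src.flush()  # only now drop the original
    finally:
        src.unlock()
```

Locking here only helps if every other program that touches the mailbox locks it in a compatible way, which is exactly the caveat above.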

Usage

If you weren't scared by the note above, below you'll find a guide on how to configure and use MailChariot. Don't be frightened by all these options: the default settings will probably be OK and you won't have to write anything.

Configuration

MailChariot's configuration file should be located at "${HOME}/.MailChariot.conf". If it isn't there, all configuration values are set to their defaults. The configuration file is in the INI format read by Python's ConfigParser module. The following examples show the default values of the configuration options (used when you don't supply one yourself); the "${X}" syntax means the value of the X environment variable.

Mailbox paths

The queue path should be absolute (start with "/"); paths to mailboxes may start with "~/" if you'd like. All intermediate directories should exist before you run MailChariot; with the default settings that means the "${HOME}/.MailChariot" folder has to be created beforehand.

[Paths]
; This is where new mail arrives, MailChariot will get mail from here and put
; it in ham and spam mailboxes.
inbox: /var/spool/mail/${USER}
; Mailboxes where filtered mail will arrive.
ham: ${HOME}/.MailChariot/ham
spam: ${HOME}/.MailChariot/spam
; Mailboxes used for marking message as spam or ham.
notham: ${HOME}/.MailChariot/notham
notspam: ${HOME}/.MailChariot/notspam
; This is where your mail goes from inbox, notham and notspam mailboxes before
; MailChariot actually deals with it.
queue: ${HOME}/.MailChariot/chariot-queue
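To make the rules above concrete, here is a rough Python 3 sketch of how a [Paths] section like this could be read, with defaults, "${X}" expansion and "~/" expansion. It is not MailChariot's real loader: load_paths and missing_parents are hypothetical names, and the use of os.path.expandvars is an assumption about how "${X}" is resolved.

```python
import configparser
import os

# Defaults copied from the listing above.
DEFAULT_PATHS = {
    "inbox": "/var/spool/mail/${USER}",
    "ham": "${HOME}/.MailChariot/ham",
    "spam": "${HOME}/.MailChariot/spam",
    "notham": "${HOME}/.MailChariot/notham",
    "notspam": "${HOME}/.MailChariot/notspam",
    "queue": "${HOME}/.MailChariot/chariot-queue",
}


def load_paths(conf_file):
    """Return the [Paths] options with defaults and expansions applied."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_dict({"Paths": DEFAULT_PATHS})
    parser.read(conf_file)  # a missing file is silently ignored
    return {
        name: os.path.expanduser(os.path.expandvars(value))
        for name, value in parser["Paths"].items()
    }


def missing_parents(paths):
    """List parent directories that still have to be created by hand."""
    parents = {os.path.dirname(path) for path in paths.values()}
    return sorted(p for p in parents if not os.path.isdir(p))
```

Calling missing_parents(load_paths(os.path.expanduser("~/.MailChariot.conf"))) before the first run would show which folders still need to be created.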

Mailbox formats

Maildir and mbox formats are supported, but Maildir is recommended as it's less likely to be corrupted.

[Format]
inbox: mbox
ham: Maildir
spam: Maildir
notham: Maildir
notspam: Maildir
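Both formats are available in Python's standard mailbox module, so a format name from the [Format] section could be mapped to a mailbox class roughly like this. The helper open_mailbox is hypothetical and only illustrates the idea.

```python
import mailbox

# Format names as they appear in the [Format] section, lowercased.
FACTORIES = {"mbox": mailbox.mbox, "maildir": mailbox.Maildir}


def open_mailbox(path, fmt):
    """Open (or create) the mailbox at path using the named format."""
    try:
        factory = FACTORIES[fmt.lower()]
    except KeyError:
        raise ValueError("unsupported mailbox format: %r" % fmt)
    return factory(path)
```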

Spam filter commands

The cleanup command is run before classification or marking as ham/spam. By default it removes existing SpamAssassin markup. All commands receive the message on standard input; the cleanup and classify commands should write the message (probably with changed headers) to standard output. All commands should exit with code 0 on success.

[Commands]
cleanup: spamassassin -d
classify: spamassassin -L
learn_spam: sa-learn --spam
learn_ham: sa-learn --ham
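The contract for these commands (message on standard input, message on standard output, exit code 0 on success) can be sketched with Python 3's subprocess module. The name run_filter is made up for this example, and splitting the command line with shlex is an assumption.

```python
import shlex
import subprocess


def run_filter(command, message_bytes):
    """Pipe a message through a command and return what it prints.

    Raises subprocess.CalledProcessError if the exit code is not 0.
    """
    result = subprocess.run(
        shlex.split(command),
        input=message_bytes,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```

For the defaults above, run_filter("spamassassin -d", raw) would be the cleanup step and run_filter("spamassassin -L", cleaned) the classification step; for the learn commands the returned output can simply be ignored.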

Classification

[Classification]
; If (after running the classify command) the given header contains the given
; string, the message will be treated as spam.
header: X-Spam-Flag
spam: YES
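The check described in the comment amounts to a substring test on one header. A minimal Python 3 sketch, with is_spam as a hypothetical name and the default values taken from the listing above:

```python
import email


def is_spam(message_bytes, header="X-Spam-Flag", needle="YES"):
    """Return True if any occurrence of the header contains the string."""
    message = email.message_from_bytes(message_bytes)
    values = message.get_all(header, [])
    return any(needle in str(value) for value in values)
```

Header names are matched case-insensitively by the email module, while the string itself is compared as written.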

SpamAssassin configuration tips

SpamAssassin (as of version 3.2) has lots of bugs and will probably cause problems. If it hangs, it should be safe to kill the SpamAssassin process: this won't kill MailChariot, and MailChariot should retry the operation that was interrupted.

If you don't want to perform time-consuming synchronization while running spamassassin and sa-learn, add the following lines to your SpamAssassin configuration.

bayes_learn_to_journal 1
bayes_journal_max_size 0

If you use this configuration, it's important to run "sa-learn --sync" from time to time. It is advisable to turn MailChariot off before running the synchronization: as noted before, SpamAssassin is of really poor quality and might deadlock with other SpamAssassin instances or do something else unpredictable.

Running MailChariot

chariot

The above command runs MailChariot, which will start monitoring your mailboxes and moving emails around. The command has some additional options; use the "--help" switch to see them all. To exit MailChariot, send it a SIGINT (usually C-c in a terminal): it will finish whatever it's doing and shut down gracefully. Any other terminating signal, such as SIGKILL or SIGTERM, will shut MailChariot down immediately. This shouldn't damage any of your emails, but it's better not to try it.
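A graceful SIGINT shutdown of this kind is commonly done by letting the signal handler only set a flag that the main loop checks between units of work. The following Python sketch shows that general pattern; StopFlag and run are hypothetical names and this is not taken from MailChariot's source.

```python
import signal


class StopFlag:
    """Remembers that a shutdown was requested."""

    def __init__(self):
        self.requested = False

    def handle(self, signum, frame):
        # Only record the request; the current job is allowed to finish.
        self.requested = True

    def install(self):
        signal.signal(signal.SIGINT, self.handle)


def run(jobs, stop):
    """Run jobs in order, stopping before the next one once asked to."""
    done = []
    for job in jobs:
        if stop.requested:
            break
        done.append(job())
    return done
```

A program would create one StopFlag, call install() once at startup, and pass the flag to its main loop. SIGKILL cannot be caught at all, which is one reason it always ends a process immediately.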

MailChariot should work with Python 2.6, 2.7 and 3.x; if it doesn't, please file a bug report.

Example Mutt configuration

The following configuration opens your ham folder by default (instead of your spoolfile) and adds "S" and "H" macros that mark a message as spam or ham respectively. They also work on several tagged messages with ";S" or ";H".

set mbox_type=Maildir

# move message to notham or notspam folder
macro index,pager S "<save-message>~/.MailChariot/notham<enter>" "mark message as spam"
macro index,pager H "<save-message>~/.MailChariot/notspam<enter>" "mark message as ham"

# Open this folder by default
set spoolfile=~/.MailChariot/ham

mailboxes ~/.MailChariot/ham
mailboxes ~/.MailChariot/spam

Links