Trawling the mail archive redux

Yesterday's instructions seemed so complete

Got a job ticket this week to trawl through 7 days of email, and pull out incoming mail for 7 mailboxes.

Once everything was set up and ready to trawl, the job took 2 hours, of which only 30 minutes was actual work; the rest was spent hanging around for the computer(s) to do their stuff.

Unfortunately, it took 3 hours before I could get started, because I had forgotten some procmail fundamentals that the original documentation presumed. If you really want to understand the notes, please read up on procmail and try a few recipes first. Otherwise, we’ve updated our guide on how to:

  • trawl the archive using procmail recipes
  • use mutt to bounce the messages to an Outlook client

More Information:

[Ref: Procmail FAQ]

Now we have 6 months’ worth of email, and the archives have grown to 30GB or more a month. Someone finally decides they want mail from the archives, and there is a huge amount of mail to wade through.

How do we do it?

formail -s procmail recipe.rc < mailfile

There are fancier tools out there, but procmail and formail were already installed as part of the archiving solution.
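
Before a long run it can help to know roughly how many messages formail will split out of a mail file. A minimal sketch, assuming the archive is in mbox format where each message starts with a "From " separator line; count_messages is a hypothetical helper name, and the count is only approximate if message bodies contain unescaped lines starting with "From ".

```shell
# count_messages FILE: count mbox "From " separator lines (approximate message count)
count_messages() {
    grep -c '^From ' "$1"
}
```

For example, running count_messages /path-to/mailfile before and after a trawl gives a quick sanity check on the amount of mail being processed.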

Recipe - Mail To/From mailclient

A user has these requirements for retrieving emails from the archives:

  • time period of 5 days (specified to us)
  • sent to a particular user (specified to us)

We archive messages by day, which makes selecting the time period simple. We run the procmail recipe below against each day’s mail file:

formail -s procmail /path-to/tofrom.recipe.rc < /path-to/mailfile
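
Since the same command is repeated for every day in the requested period, it can be wrapped in a small loop. This is a sketch only: trawl_days and the DRY_RUN switch are hypothetical names introduced here for illustration, and the file paths are placeholders.

```shell
# trawl_days RECIPE MAILFILE...
# Feed every message of each daily mail file through procmail with RECIPE.
# Set DRY_RUN=1 to print the commands instead of running them.
trawl_days() {
    recipe=$1
    shift
    for mailfile in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            printf 'formail -s procmail %s < %s\n' "$recipe" "$mailfile"
        else
            formail -s procmail "$recipe" < "$mailfile"
        fi
    done
}
```

Called as trawl_days /path-to/tofrom.recipe.rc followed by one mail file per day, it processes the files in the order given. Running it once with DRY_RUN=1 shows exactly which commands would be executed.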

File: tofrom.recipe.rc

# Debugging

CMDFROM="^(From[ 	]|(Old-|X-)?(Resent-)?(From|Reply-To|Sender):)(.*\<)?"

## ---
## Customise Me!!
## ---
# - set MAILDIR to where you will do work (note: all relative paths go from here)

## ---



# --- Check a few paths first
* ? test -d ${MAILDIR}
* ? test -d ${TMPDIR} || mkdir ${TMPDIR}
{ }

	# Bail out if any of the above fails

# Deliver Mail to our file ${MAILBOX}

* $ ${CMD}${USER}

Customisation areas are ‘blocked off’ above.

  • MAILDIR - set the path where files are to be written (i.e. results of the process, lock file, log file) e.g. /var/data/mail/recovery
  • MAILCLIENT - recipient (e.g.
  • CMD - choose either CMDTO (messages sent to $MAILCLIENT) or CMDFROM (messages sent from $MAILCLIENT). We use ${CMDTO}.
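
Putting the customisation variables together, a minimal self-contained recipe might look like the sketch below, written out by a small shell function. Several parts are assumptions rather than the original settings: the CMDTO pattern (modelled on CMDFROM), the file names procmail.log and recovered.mbox, DEFAULT=/dev/null (so that non-matching mail is discarded rather than delivered to the running user's inbox), and the helper name write_recipe.

```shell
# write_recipe RECIPE_PATH MAILDIR MAILCLIENT
# Write a minimal to/from recipe. Values not passed in are placeholders.
write_recipe() {
    recipe=$1
    maildir=$2
    client=$3
    {
        printf 'MAILDIR=%s\n' "$maildir"
        printf 'MAILCLIENT="%s"\n' "$client"
        cat <<'EOF'
# Assumption: throw away anything that does not match
DEFAULT=/dev/null
# Assumption: log and result file names, relative to MAILDIR
LOGFILE=procmail.log
MAILBOX=recovered.mbox

CMDFROM="^(From[ 	]|(Old-|X-)?(Resent-)?(From|Reply-To|Sender):)(.*\<)?"
# Assumption: a simple To/Cc/Bcc pattern in the same style as CMDFROM
CMDTO="^(To|Cc|Bcc):(.*\<)?"
CMD=${CMDTO}

# Deliver matching mail to ${MAILBOX}; the second colon requests a lock file
:0:
* $ ${CMD}${MAILCLIENT}
${MAILBOX}
EOF
    } > "$recipe"
    chmod 0640 "$recipe"
}
```

For example, write_recipe /path-to/tofrom.recipe.rc /var/data/mail/recovery followed by the recipient address produces a recipe ready for the formail command above. Note that the address is used as a regular expression, so an unescaped "." matches any character.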

File Permissions

[Ref: Check for Permission Problems]

One aspect of procmail is important to remember (it wasted far too much of my time until I rediscovered it).

Recipe file permissions:

  • Use a path/file permission of 0640
  • Make sure the running user ‘owns’ the recipe file
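
Both checks can be done in one step before starting a trawl. A minimal sketch, assuming a shell whose test command supports -O (bash and dash do); secure_recipe is a hypothetical helper name.

```shell
# secure_recipe FILE
# Set the recipe file to mode 0640 and confirm the running user owns it.
secure_recipe() {
    recipe=$1
    chmod 0640 "$recipe" || return 1
    if [ -O "$recipe" ]; then
        echo "ok: $recipe is owned by the running user"
    else
        echo "warning: $recipe is not owned by the running user" >&2
        return 1
    fi
}
```

For example, secure_recipe /path-to/tofrom.recipe.rc returns non-zero if the permissions cannot be set or the ownership is wrong, so it can guard the formail run.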


[Ref: tags , Mutt, port]

We now have a ‘maildir’ file as /var/data/mail/recovery/, but our users are Windows/Outlook users.

I can’t readily give them the files (which they would accept as individual EML files) because I don’t have free tools for that conversion, so we use another tool, mutt.

  • Using tags: T to tag messages by a pattern
  • Using patterns: ~A to tag all messages
  • Using “;” followed by bounce (b) to send all tagged messages to the appropriate person

Stepping through

Launch “mutt” with the “-f” option to open the mail message file

mutt -f /var/data/mail/recovery/
q:Quit  d:Del  u:Undel  s:Save  m:Mail  r:Reply  g:Group  ?:Help
1  N   Month Day From-address ( size) Subject line
2  N   Month Day From-address ( size) Subject line
3  N   Month Day From-address ( size) Subject line
---Mutt: [Msgs:xyz Old:xyz xyzM]---(date/date)---

From the email index list, the command sequence looks like this:

  • T
---Mutt: [Msgs:xyz Old:xyz xyzM]---(date/date)---
Tag message matching:
  • ~A

The mail index should show an asterisk “*” beside all messages:

q:Quit  d:Del  u:Undel  s:Save  m:Mail  r:Reply  g:Group  ?:Help
1  N * Month Day From-address ( size) Subject line
2  N * Month Day From-address ( size) Subject line
3  N * Month Day From-address ( size) Subject line
---Mutt: [Msgs:xyz Old:xyz Tag:xyz xyzM]---(date/date)---

and the status bar should include the number of messages tagged (the Tag:xyz field).

To bounce a single message, we would use ‘b’. Because we want to bounce all tagged messages, we precede ‘b’ with the semicolon ‘;’.

  • ;
---Mutt: [Msgs:xyz Old:xyz Tag:xyz xyzM]---(date/date)---
  • b
---Mutt: [Msgs:xyz Old:xyz Tag:xyz xyzM]---(date/date)---
Bounce tagged messages to:

We are prompted for where to bounce the messages; enter the destination address.

We then get a confirmation prompt:

Bounce messages to ([yes]/no):
  • yes
---Mutt: [Msgs:xyz Old:xyz Tag:xyz xyzM]---(date/date)---
Messages bounced.