mailagent(1) an automatic mail-processing tool

SYNOPSIS

mailagent [ -dhilqtFIVU ] [ -s{umaryt} ] [ -f file ] [ -e rule ] [ -c config ] [ -L loglevel ] [ -r rulefile ] [ -o override ] [ mailfile ]

DESCRIPTION

Mailagent allows you to process your mail automatically. Given a set of lex-like rules, you are able to file mail into specific folders, forward messages to a third person, pipe a message to a command or even post the message to a newsgroup. It is also possible to process messages containing some commands. The mailagent is not usually invoked manually but is rather called via the filter program, which is in turn invoked by sendmail. That means you must have sendmail on your system to use this. You also must have perl to run the mailagent scripts.

There is a set of options which may be used when you invoke mailagent yourself. Please refer to the OPTIONS section for a complete description. You may use the -h option to get a cryptic usage reminder.

Product Overview

Mailagent actually has four distinct sets of features, which can be used simultaneously or one at a time. These are:

  • An @SH command processor, to remain compatible with the first implementation. In this simplest usage, all the mail messages are left in your mailbox (or the catch all folder required on Debian systems: Please see /usr/share/doc/mailagent/SECURITY for details), with special processing raised on messages whose subject is Command. Please refer to the section entitled USING THE DEFAULT RULES if you wish to use this feature.
  • A complete mail filter, which helps you sort your mail based on various sorting criteria and actions. Filtering is specified in a rule file and supersedes the default Command mail processing (which may be turned on again by explicitly setting up a rule for it). This should be the most common use of mailagent and is fully documented under the section entitled USING THE FILTER. You may deliver mail to plain Unix-style folders but also to MMDF and MH ones.
  • A replacement for the vacation program, which will automatically answer your mail while you are not there. You only need to supply a message to be sent back and the frequency at which this will occur. Some simple macro substitutions allow you to re-use some parts of the mail header into your vacation message, for a more personalized reply. See the VACATION MODE section for more details.
  • A generic mail server, which will let you implement a real mail server without the hassle of the lower-level concerns like error recovery, logging or command parsing. The full documentation can be found in the section GENERIC MAIL SERVER at the end of this manual page.

It is possible to extend the mailagent filtering commands by implementing them in perl and then having them automagically loaded when used. Those extended commands will behave exactly like built in ones, as documented in the EXTENDING FILTERING COMMANDS section.

Learning From Examples

It is quite possible that you will find this manual page too complex for you. Unfortunately, it is not really meant to be a tutorial but rather reference material. If you wish, you may start by looking at the examples held in the distribution source tree under agent/examples. This directory contains two examples of rule files (look at the README file first), both verbosely commented.

GETTING STARTED

First, you need to install a minimum configuration and see how it works. It would be useless to fully install the program and then discover that it does not work as advertised...

To start the installation, you have to set up a ~/.mailagent file which is the main configuration file, and choose the right filter program.

Choosing The Filter Program

The distribution comes with two filter programs: one written in shell and one in C. The shell version might be the one to use if you can receive your mail on many different platforms where your home directory is NFS-mounted (i.e. shared among all those platforms). The C version is safer and much faster, but you need to install it to a fixed location.

On some platforms, sendmail does not correctly reset its UID when processing mails in its own queue. In that case, you need to get a private copy of the C filter program and make it setuid to yourself. The filter will then correctly reset its UID if invoked with an effective UID different from yours (it may also require the setgid bit to reset GID as well). If this is indeed the case on your system, make sure you use the path configuration variable to set a proper PATH, as the filter will spawn a perl process with the '-S' option, looking for a mailagent script.

Even if you do not need to get a setuid copy of the filter program, it is wise to set up a proper path: someone might break into your account by putting a mailagent Trojan horse in the appropriate location. Also make sure the mailagent program is protected against writing, as well as the directory which holds it, or someone might substitute his own version of the script and break security. I believe the setuid filter program to be safe, but oversights are always possible, so please report any security hole to me.

The filter script can be found in the Lib/mailagent directory. It needs some tailoring so you should copy it into your home directory and edit it to suit your needs. Comments held in it should be self explanatory. There is only a small section at the head of the script which needs to be edited. You'll have to delete shell comments in the filter script by yourself if your shell cannot deal with them.

As of version 3.0 PL44, I advise you to prefer the C version if you are concerned about security. If you are in a position where multiple architectures can process your .forward, then a shell wrapper selecting the proper executable based on the architecture will be required.

Configuring Mailagent

If mailagent is in your path, you may automatically configure a default installation by running:

        mailagent -I

which will create a ~/.mailagent file from an existing template, customize some important variables for your site, and make some basic sanity checks. Everything the command does is output on the screen for checking purposes, and any problem found is reported.

Otherwise, you have to copy the mailagent.cf file held in the mailagent sub-directory /usr/share/mailagent (hereafter named Lib) as a .mailagent in your home directory. Edit it to configure the whole processing. In particular, you have to choose a spool directory (hereafter named Spool) and a log directory (hereafter named Log).

Note that using the automatic installation procedure above does not prevent you from going through the file and modifying it as you wish. In fact, you are greatly encouraged to do this, especially for the home directory setting, the logging level and the path or p_host variables. Once you are done, rerun the mailagent -I command to make sure everything is fine. Still, you will have to plug in mailagent by creating a ~/.forward file, as explained in a few sections.

Following is a description of each of the fields you will find in the ~/.mailagent file, followed by a suggested value, when applicable. Fields marked as optional may not be present in the configuration file. Some fields have a close relationship with others, and that is given too.

agemax
Period after which an entry in the database should be removed (suggested: 1y). This field is optional, but needed if autoclean is on.
authfile
Remote sending authorizations (not implemented yet).
autoclean
Set to ON (case insensitively), mailagent will perform automatic cleaning of the database entries under hash by removing all the items older than agemax. This is an optional field, omitting it defaults to OFF. (suggested: OFF, unless you use ONCE, UNIQUE or RECORD commands, or activate the vacation mode.)
biff
Whether or not biffing is wanted when mailagent delivers mail to a folder. Set it to ON (case insensitively) to allow local biffing if you are logged in. (optional, defaults to: OFF)
biffhead
When biffing is enabled, this variable lists which headers should be printed out. Headers should be given in their normalized format and be separated with commas. (optional, defaults to: From, To, Subject, Date).
bifflen
The maximum length of the message body that should be printed when biffing. (optional, defaults to 560).
bifflines
The maximum number of lines of the message body that should be printed when biffing. Actually, mailagent attempts to print that amount of lines, provided the total amount of characters printed is less than bifflen. (optional, defaults to 7).
biffmh
When turned ON, the body of the message is compacted before biffing by removing consecutive spaces and replacing newlines with a single space. The message itself is not altered physically of course, only the output on the screen is concerned. Since this may yield a difficult-to-read message, I suggest you also turn on biffnice when using this option. (optional, defaults to: OFF).
biffmsg
The path to a file describing the format biffing should use. If not set, a default hardwired format is used. Season to taste. (suggested: ~/.biffmsg).
biffnice
Whether the message should be reformatted to nicely fit into the terminal. (optional, defaults to OFF, suggested: ON when biffmh is also ON).
biffnl
Controls whether "blank" body lines should be printed or not. By "blank" lines, we mean lines not containing words. Set it to ON to print such blank lines, to OFF if you wish to get a more compact view of the body within the limits fixed by bifflen and bifflines. (optional, defaults to ON).
biffquote
Controls whether the leading attribution line introducing a trimmed quotation should be part of the biff message or not. When turned OFF, the attribution line is trimmed along and this is reported in the trimming message, when bifftrim is ON. (optional, defaults to ON).
bifftrim
Controls whether trimmed lines within the biff message should be replaced by a message stating how many of them were trimmed. Only used by the %-T biffing macro. When turned OFF, it automatically turns off biffquote as well. (optional, defaults to ON).
bifftrlen
States how many lines long a leading quotation should be before performing any trimming. Only used by the %-T biffing macro. (optional, defaults to 2).
callout
The name of the callout queue file where batched jobs are kept. This parameter must be defined when using the AFTER command. (suggested: $spool/callout)
cleanlaps
Cleaning period for database entries. The value of the last clean up is saved into the context file. This is optional, but needed if autoclean is on. (suggested: 1M)
comfile
Name of the file containing authorized commands. Needed when PROCESS is used. (suggested: $spool/commands).
compress
Name of the file containing the list of compressed folders. See section about folder compression. This is an optional parameter. (suggested: ~/.compress).
compspecs
Name of the file containing specifications for how to handle different types of compression formats. See section about folder compression. This is an optional parameter. (suggested: $spool/compressors).
comptag
The default compression tag when creating new folders. If not specified, the default is 'gzip'.
comserver
Name of the file containing authorized SERVER commands and their definition. This is an optional parameter if you don't plan to use the generic mail server. (suggested: $spool/server).
context
File holding the mailagent context. The context saves some variables which need to be kept over the life of the process. Needed if auto cleaning is activated. (suggested: $spool/context)
distlist
A list of all the available distributions. See the sample held in Lib/mailagent/distribs. Needed by PROCESS only. (suggested: $spool/distribs)
domain
Your domain name, without the leading dot, as in example.com. The value is appended to the value of email when that variable does not have any '@', to construct a fully qualified e-mail address. See also the hidenet variable. (optional, defaults to the domain name determined at build time).
email
Your electronic mail address. If left unspecified, mailagent will try to guess it. This address is used by mailagent when trying to send something to the user (you!). (suggested: specify your e-mail address).
emergdir
Name of the directory which should be used for dumps, preferably. This is optional. (suggested: ~/tmp/lost+mail)
execsafe
Whether to be strict before using exec() to launch a new process or not. The value of this variable is used in place of secure when checking executable files. (defaults to OFF, suggested: ON if possible).
execskip
Whether to skip the exec() security checks altogether. Don't turn this ON unless you really trust all the users having access to your machine or file server. (optional, defaults to OFF, suggested: OFF).
fromall
Whether or not mailagent should escape all the From lines in the message, not only those it thinks should appear dangerous (i.e. a From after a blank line). This option only makes sense when fromesc is also activated. It is ignored otherwise, and therefore is optional. By default, it is assumed to be OFF. (suggested: OFF, until you have reasons to believe your mail user-agent is confused in this mode: when it happens, your user agent will split mail for no apparent reason).
fromesc
Whether or not mailagent should escape potentially dangerous From lines in mail messages. If you use MH or if your mail reader does not use those lines to separate messages, then you may set it to OFF. (suggested: ON)
fromfake
Whether or not mailagent should fake a From: line into the message header when it is absent. Naturally, it requires a valid leading From line to operate! (optional, defaults to ON, suggested: ON).
groupsafe
If turned OFF, then group-writable files will be managed as if they were secure, from a security point of view. Leave it to ON if possible, or you may pass by a huge security hole without your noticing (optional, defaults to ON, suggested: ON).
hash
The directory used for name hashing by the built-in database used by ONCE, UNIQUE and RECORD commands. Optional, unless you make use of those commands or activate auto cleaning. The directory is placed in the spool area. (suggested: $spool/dbr).
helpdir
Directory where help files for SERVER commands are kept. (suggested: $spool/help)
hidenet
When set to ON, the value of the variable domain is the fully qualified name used. When OFF, the hostname is prepended to the domain. If the hostname is already fully qualified, then the value of domain is ignored. Assuming domain is set to example.com and the hostname is host, then the fully qualified name will be host.example.com if hidenet is OFF, and example.com if ON. (optional, defaults to whatever was determined at build time)
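
The rule can be sketched in a few lines of Python. This is only an illustration of the behaviour described here, not mailagent's own code: the function name qualified_name is hypothetical, and treating any hostname that contains a dot as "already fully qualified" (checked only when hidenet is OFF) is an assumption.

```python
def qualified_name(hostname: str, domain: str, hidenet: bool) -> str:
    """Return the fully qualified name described for hidenet and domain."""
    if hidenet:
        # hidenet ON: the value of domain alone is used.
        return domain
    if "." in hostname:
        # Assumption: a dot means the hostname is already fully qualified,
        # in which case the value of domain is ignored.
        return hostname
    # hidenet OFF: the hostname is prepended to the domain.
    return f"{hostname}.{domain}"


print(qualified_name("host", "example.com", hidenet=False))  # host.example.com
print(qualified_name("host", "example.com", hidenet=True))   # example.com
```
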
home
Defines where the home directory is. This must be accurate.
level
Log level, see below for a definition of available levels (suggested: 9).
linkdirs
When set to ON, carefully checks symbolic links to directories when performing security checks on sensitive files. This will (recursively) check for each symbolic link level that the target directory is not world writable or group writable and that the parent directory of each target link is not world writable. If the secure option is OFF, this parameter is ignored. (optional, defaults to: ON, suggested: ON when secure is also ON).
lockdelay
The delay in seconds between two locking attempts. (optional, defaults to: 2).
lockhold
The maximum delay in seconds for holding a lock. After that time, the lock will be broken. (optional, defaults to: 3600).
lockmax
Maximum number of locking attempts before giving up. (optional, defaults to: 20).
locksafe
When locking a file, mailagent normally makes lockmax attempts separated by lockdelay seconds, and then gives up. When facing a delivery to a mailbox, it may make sense to continue even if no lock was grabbed, or even if only a partial locking was done (e.g. one of the .lock or flock()-style locking succeeded). This variable controls how safe you want to be. Set it to OFF to let mailagent continue its mailbox delivery even though no locking was done, to ON if you want strict locking, to PARTIAL if you can live with partial locking. Messages not saved in a folder are dumped to an emergency mailbox. (optional, defaults to ON). On Debian systems, since mailagent cannot grab locks, it should always be left ON, or else mail garbling may occur. See /usr/share/doc/mailagent/SECURITY for details.
lockwarn
This variable controls the time after which mailagent should start emitting a warning when busy trying to acquire a lock. It is a comma separated list of values, in seconds. If two values are given, the first is the initial time threshold, the second is the repeat period. For instance, a value of "15,60" would cause a warning after 15 seconds, then every 60 seconds until the lock is taken or the locking attempt time is expired (see lockmax and lockdelay). If only one value is given, it is taken as being both the initial threshold and the period. (optional, defaults to: 20,300).
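
To make the timing concrete, here is a small Python sketch that lists the moments at which a warning would be due. It is not taken from mailagent: the name warning_times is hypothetical, and it assumes that the locking attempt expires after lockmax * lockdelay seconds and that a warning falling exactly on that limit is still emitted.

```python
from typing import List


def warning_times(lockwarn: str, lockmax: int = 20, lockdelay: int = 2) -> List[int]:
    """List the seconds (since locking started) at which a warning is due."""
    values = [int(v) for v in lockwarn.split(",")]
    first = values[0]
    # A single value serves as both the initial threshold and the period.
    period = values[1] if len(values) > 1 else first
    # Assumption: the attempt expires after lockmax * lockdelay seconds.
    limit = lockmax * lockdelay
    times = []
    t = first
    while t <= limit:
        times.append(t)
        if period <= 0:
            break  # guard against a non-advancing period
        t += period
    return times


print(warning_times("15,60", lockmax=100, lockdelay=2))  # [15, 75, 135, 195]
print(warning_times("20,300"))                           # [20]
```

With the default lockmax and lockdelay (40 seconds of attempts under the assumption above), the default "20,300" setting would therefore warn only once.
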
log
Name of the log file which will be put in Log directory. (suggested: agentlog).
logdir
Logging directory. (suggested: ~/var/log).
mailbox
The name of the system mailbox file, which by default is the value of the user configuration variable. This is an optional parameter.
maildrop
Location of the system mail spool directory. If none is provided, then the mailagent will use the value determined by Configure.
mailopt
Options to be passed to the mailer (see sendmail). (optional, suggested: -odq -i, when using sendmail).
maxcmds
Maximum number of commands that are allowed to be executed by a SERVER command before flushing the remainder of the mail message. (suggested: 10).
maxerrors
Maximum number of errors for the SERVER command before flushing the remainder of the mail message. (suggested: 10).
maxsize
Maximum size in bytes of files before using kit for sending files. This is used by PROCESS. (suggested: 150000).
mboxlock
The format to be used for locking mailboxes before delivering to them. This string goes through a small macro substitution mechanism to make it more general. The file name derived after macro substitution is the name of the lock that will be used, given the name of the file that is to be locked. Available macros are:

%D: the file directory name
%f: the file name to be locked (full path)
%F: the file base name (last path component)
%p: the current process pid number
%%: a plain % character

Common locking formats are "%f.lock" and "%D/.%F.lock". Of course, to be able to use this feature, mailagent must not have been configured to use flock()-style locking only. (optional, defaults to: %f.lock). This has no effect on Debian systems, where mailagent cannot get a lock anyway because it is not sgid mail.
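
The macro substitution can be illustrated with the following Python sketch. It only mirrors the macro table given for mboxlock and is not mailagent's implementation: the function name expand_mboxlock is hypothetical, and leaving unknown %x sequences untouched is an assumption.

```python
import posixpath
import re


def expand_mboxlock(fmt: str, path: str, pid: int) -> str:
    """Expand the mboxlock macros for the file that is to be locked."""
    values = {
        "D": posixpath.dirname(path),   # the file directory name
        "f": path,                      # the file name to be locked (full path)
        "F": posixpath.basename(path),  # the file base name
        "p": str(pid),                  # the current process pid number
        "%": "%",                       # a plain % character
    }
    # Assumption: unknown macros are left as they are.
    return re.sub(r"%(.)", lambda m: values.get(m.group(1), m.group(0)), fmt)


print(expand_mboxlock("%f.lock", "/var/mail/ram", 1234))      # /var/mail/ram.lock
print(expand_mboxlock("%D/.%F.lock", "/var/mail/ram", 1234))  # /var/mail/.ram.lock
```
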
mhprofile
The name of the MH profile to be used. This is needed only when attempting to save in an MH folder. If this optional parameter is not set, the default value ~/.mh_profile is used.
mmdf
Set this to ON if you wish to be able to save mail in MMDF-style mailboxes. (suggested: OFF, unless you use MMDF or MH). This is invalid on a Debian system.
mmdfbox
The value of this variable only matters when mmdf is on. If set to ON, then new folders will be created as MMDF ones. This variable is not used when saving to an existing folder, since in that case the mailagent will automatically determine the type and save the message accordingly. (suggested: OFF, unless you use MMDF or wish to use MH's mshf).
msgprefix
Name of the file to put in directory folders, specifying the message prefix to be used. Optional, defaults to .msg_prefix.
name
First name of the user, used by mailagent when referring to you. This sets the value of the %U macro.
newcmd
Name of the file describing new filtering commands. See section Extending Filtering Commands for more details. Leave this optional parameter out unless you are a mailagent expert. (suggested: $spool/newcmd).
newsopt
Options to be passed to the news posting program (see sendnews). (optional, suggested: leave empty when using inews).
nfslock
Set it to ON to ensure NFS-secure locks. The difference is that the hostname is used in conjunction with the PID to obtain a lock. However, mailagent has to fork/exec to obtain that information. This is an optional parameter which is set to OFF by default. (suggested: OFF if you deliver mail from only one machine, even though it's via NFS).
passwd
File where SERVER power passwords are kept -- encrypted usually. (suggested: $powers/passwd).
path
Minimum path to be used by C filter program. To set a specific path for a machine host, set up a p_host variable. This will be prepended to the default PATH variable supplied by other programs. (suggested: /bin:/usr/bin:/usr/ucb). Note that the host name must be specified without any domain name appended to it (e.g. for a host name of lyon.eiffel.com, use variable p_lyon). If your host name contains a '-' in it, you must write it as a '_', since '-' is not a valid character for a perl variable name.
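
The naming rule for the per-host variable can be written down as a short Python helper. The function name path_variable_for and the second host name are hypothetical and only serve the illustration.

```python
def path_variable_for(hostname: str) -> str:
    """Derive the p_host variable name for a given host name."""
    # Drop any domain name appended to the host name.
    short = hostname.split(".", 1)[0]
    # '-' is not valid in a perl variable name, so it is written as '_'.
    return "p_" + short.replace("-", "_")


print(path_variable_for("lyon.eiffel.com"))      # p_lyon
print(path_variable_for("mail-gw.example.com"))  # p_mail_gw
```
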
perlib
This variable may be used to change the perl search path for required files. Directories should be separated using a ':' character, just like a shell PATH. This path is prepended to the default perl search path. Any directory not starting with a '/' (after ~name substitution) is taken relatively to the mailagent private lib directory determined at configuration time.
plsave
Name of the file used to save the patchlevels for archived distributions. This is only used by the commands invoked via PROCESS. (suggested: $spool/plsave).
powerdir
Directory listing user clearances for SERVER powers. (suggested: $powers/clearance)
powerlist
Name of file containing SERVER power aliases. Since power names can be arbitrarily long but some filesystems still have a 14 character limitation on filename length, internal aliases are created and maintained by mailagent. (suggested: $powers/aliases).
powerlog
File where SERVER power requests are logged, in addition to the agentlog. Since those are a security concern, it is a good idea to log them separately. If not defined, log them only in agentlog. (suggested: $logdir/powerlog).
powers
Directory for SERVER power administration. (suggested: $spool/powers)
proglist
A small description for the available distributions. See the sample held in Lib/mailagent/proglist. This is used by PROCESS only. (suggested: $spool/proglist)
queue
Queue directory (messages waiting to be processed). Required, of course. (suggested: $spool/queue)
queuehold
Maximum number of seconds a mail can sit in the mailagent queue before being actually processed. During that time, mailagent will not try to process the message even when -q is used. (optional, defaults to: 1800).
queuelost
Maximum number of seconds after which mailagent should flag messages still in its queue as being old. (optional, defaults to: 86400, i.e. a day).
queuewait
Time in seconds telling the C filter program how long it must wait before launching mailagent. (optional, defaults to: 60, but can be lowered to 0 if you don't want to wait to delay getting new messages).
rulecache
The name of the file used to cache the latest compiled rules. Since usually mailagent works mainly with one same rule file, this saves the overhead of recompiling all the rules each time. (optional, suggested: $spool/rulecache).
rulemac
Set this to ON to enable macro substitutions in rule patterns. (optional, defaults to: OFF).
rules
The name of the file holding the filtering rules (optional on non Debian systems, suggested: ~/.rules). On Debian systems, one must have a minimal rules file to prevent mailagent from trying to put messages into /var/spool/mail/$USER, since mailagent can't lock that directory to prevent mail from being garbled. This is because Debian policy requires all entities attempting locks on that directory to be sgid mail, and making mailagent sgid anything would be a security loophole.
    { SAVE incoming };
 is the suggested minimal rules file.
runmax
Timeout for RUN commands and friends. (optional, defaults to: 3600).
scriptcc
Flag indicating whether a copy of the SERVER session transcript should be sent to the user running mailagent. (suggested: OFF).
secure
When set to ON, mailagent and the C filter will perform extensive security checks on sensitive files. This includes checks for group writability, ownerships and protection testing on the directory where the file resides, and checks on symbolic links to directories (mailagent only, when linkdirs is ON too). Note that secure is assumed to be ON, whatever its real setting, when running as super-user. (suggested: ON).
sendmail
The name of the program used to send mail. That program must accept the mail message with headers on its standard input and a list of recipients on the command line. If not specified, will use the mailer chosen at configuration time (sendmail usually). The command line used to mail a message will be sendmail mailopt address(es). (optional, suggested: /usr/lib/sendmail).
sendnews
The name of the program used to post news. That program must accept the news article with headers on its standard input. If not specified, will use the news posting program chosen at configuration time (inews usually). The command line used to post an article will be sendnews -h newsopt. (optional, suggested: /usr/local/bin/inews).
seq
File used to compute job numbers (suggested: .seq).
servdir
The directory name where shell and perl server commands are stored. This is the default lookup place. Optional parameter unless SERVER is used. (suggested: $spool/cmds).
servshell
This is the name of the shell used to launch SERVER shell commands (actually to process the wrapper file that will ultimately exec() the command). On some systems like HPUX 10.x, this has to be set to /usr/old/bin/sh to get the plain old Bourne shell, because /bin/sh is a braindead POSIX shell that closes file descriptors greater than 2 upon exec(), whereas the Bourne shell does not. (optional, suggested: /bin/sh unless you're on HPUX 10.x, as explained before).
spool
Spool directory, required (suggested: ~/var/mailagent).
statfile
File where statistics should be gathered. If no such file exists, no statistics will be recorded (suggested: $spool/mailagent.st).
tofake
Whether or not mailagent should fake a To: line into the message header when it is absent, which will be used for filtering purposes (no physical alteration of the header occurs). It uses Alternate-To: headers if found, otherwise it assumes the message was sent to the user and takes the value from the user configuration variable. (optional, defaults to ON, suggested: ON; turn it OFF only if you want to identify missing To: lines to detect SPAM).
tome
This optional variable may contain a comma separated list of alternate logins that are also valid for the user (mail aliases). This is used in vacation mode to check whether the mail was sent to the user or to a mailing list. Matching is anchored on the login name, so saying "ro*" will match both root and rom.
track
Set to on (case insensitively), this turns on the -t option which tracks all the rule matches and the actions on standard output. This is optional (suggested: OFF).
timezone
The time zone value for environment variable TZ (optional).
tmpdir
Directory for temporary files. Required (suggested: /tmp).
umask
Default umask which is reset by mailagent before processing a message. Assumed to be decimal unless starting with '0' (for octal) or '0x' (for hexadecimal). The octal format is the easiest way to specify it nonetheless. (optional, defaults to: 077).
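
The three notations can be told apart as in the Python sketch below. It follows the rule stated for umask and is not mailagent's own parser: the name parse_umask is hypothetical, and accepting an upper-case '0X' prefix is an assumption.

```python
def parse_umask(text: str) -> int:
    """Interpret a umask setting as decimal, octal ('0...') or hex ('0x...')."""
    text = text.strip()
    if text.lower().startswith("0x"):
        # Assumption: '0X' is accepted as well as '0x'.
        return int(text, 16)
    if text.startswith("0"):
        return int(text, 8)
    return int(text, 10)


# The default 077 (octal), 63 (decimal) and 0x3f (hexadecimal) are the same mask.
print(parse_umask("077"), parse_umask("63"), parse_umask("0x3f"))  # 63 63 63
```
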
user
Login name of the user who runs mailagent. This sets the value of the %u macro.
vacation
A flag set to ON or OFF to switch the vacation mode accordingly.
vacfile
The name of the file to be sent back in vacation mode (suggested: ~/.vacation).
vacfixed
When ON, all changes to the vacation file (even locally) by means of the VACATION command are forbidden. This is useful if you usually have many customized vacation messages for different people but temporarily want to force one unique message (optional, defaults to: OFF).
vacperiod
The minimum time elapsed between two vacation messages to a given address (suggested: 1d).
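
Several of the fields above depend on one another. As an example, the Python sketch below checks the fields that the descriptions say are needed once autoclean is ON (agemax, cleanlaps, context and hash). It works on an already-parsed dictionary and makes no claim about the layout of the ~/.mailagent file; the function name missing_for_autoclean and the sample values are hypothetical.

```python
from typing import Dict, List


def missing_for_autoclean(config: Dict[str, str]) -> List[str]:
    """Return the fields required by autoclean that are unset or empty."""
    # autoclean is optional and defaults to OFF; ON is case insensitive.
    if config.get("autoclean", "OFF").upper() != "ON":
        return []
    needed = ["agemax", "cleanlaps", "context", "hash"]
    return [name for name in needed if not config.get(name)]


sample = {"autoclean": "on", "agemax": "1y", "hash": "dbr"}
print(missing_for_autoclean(sample))  # ['cleanlaps', 'context']
```
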

Available Logging Levels

The following log levels can be used while running mailagent:

0       No logging
1       Major problems only
2       Failed deliveries
3       Successful deliveries
4       Deferred messages
5       Successful filter actions
6       Unusual but benign incidents
7       Informative messages
8       Non-delivery filter actions
9       Mail reception
12      Debug
19      Verbose
20      Lot more verbose
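
For scripts that post-process the log, the table can be kept in a small Python mapping, as sketched below. The comparison used by is_logged (a message is recorded when its level does not exceed the configured level) is an assumption about how the threshold works and is not stated in this manual page; the names LOG_LEVELS and is_logged are hypothetical.

```python
LOG_LEVELS = {
    0: "No logging",
    1: "Major problems only",
    2: "Failed deliveries",
    3: "Successful deliveries",
    4: "Deferred messages",
    5: "Successful filter actions",
    6: "Unusual but benign incidents",
    7: "Informative messages",
    8: "Non-delivery filter actions",
    9: "Mail reception",
    12: "Debug",
    19: "Verbose",
    20: "Lot more verbose",
}


def is_logged(message_level: int, configured_level: int) -> bool:
    """Assumed threshold rule: log anything at or below the configured level."""
    return 1 <= message_level <= configured_level


print(is_logged(3, 9))   # True: successful deliveries at the suggested level 9
print(is_logged(12, 9))  # False: debug output needs a higher level
```
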

Plugging Mailagent

Once you have configured mailagent in a ~/.mailagent (where ~ stands for your home directory), you must tell sendmail how to invoke it. This is done by setting a ~/.forward file which looks like this (leading and trailing double quotes are a mandatory part of it):

"| exec /users/ram/mail/filter >>/users/ram/.bak 2>&1"

This will pipe all your mails to the filter program, redirecting all unusual messages to ~/.bak. A sample filter shell script may be found in Lib/mailagent, as well as a C filter program. On some systems, it may be necessary to move the '|' character before the leading quote, but don't try this unless you have no other choice (i.e. only as a last resort). Also, apparently Exim takes exception to the exec, and even perhaps to the redirection -- which would be a pity.

It is very important to redirect error messages to some file within your home directory. For one thing, that will get you out of trouble if strange things start to happen, but more to the point, it makes your .forward file unique. Older sendmail programs, in a heroic attempt to "optimize" delivery, will silently remove duplicate recipients, and if a recipient has a .forward, its literal content is used in place of his e-mail address. Therefore, two local recipients with the same filtering string will be considered as one unique recipient and only one of them will get the message...
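
The effect can be simulated with a few lines of Python. This is a toy model of the duplicate removal described above, not sendmail's actual algorithm; the function name deliveries, the second user bob and the shared filter path are hypothetical.

```python
from typing import Dict, List, Optional


def deliveries(recipients: Dict[str, Optional[str]]) -> List[str]:
    """Return the logins that still get the message after duplicate removal.

    recipients maps a login to the literal content of its .forward,
    or to None when the user has no .forward file.
    """
    seen = set()
    kept = []
    for login, forward in recipients.items():
        # The .forward content stands in for the address when present.
        key = forward if forward is not None else login
        if key in seen:
            continue  # silently dropped as a "duplicate" recipient
        seen.add(key)
        kept.append(login)
    return kept


same = '"| exec /usr/local/bin/filter"'
print(deliveries({"ram": same, "bob": same}))  # ['ram']

print(deliveries({
    "ram": '"| exec /users/ram/mail/filter >>/users/ram/.bak 2>&1"',
    "bob": '"| exec /users/bob/mail/filter >>/users/bob/.bak 2>&1"',
}))  # ['ram', 'bob']
```

Redirecting to a file under each user's own home directory is what makes the two strings differ in the second call.
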

If your system does not allow shell redirection from within the .forward, you can use this instead (only supported by the C filter):

"| exec /users/ram/mail/filter -o /users/ram/.bak"

which in effect redirects stdout and stderr to the specified file for you, appending data at the end of the file. If the filter runs setuid or setgid, you will not be allowed to create the file, nor to append to it unless the owner of the file is the real uid invoking the program (for security reasons).

Note that the .forward file only pipes the mail to the filter program and does not leave any copy in the mailbox. It is up to you to decide in the rule file whether you want to trash the mail away or leave it in the mailbox. (Note that on Debian systems mailagent cannot lock the spool directory, and letting it leave mail in the mailbox may cause it to get garbled.) If you do not have a rule file (i.e. you left a blank entry in your ~/.mailagent, or you named a non-existent file, or your file is simply empty), the default action is to leave the mail in the mailbox, which is not a good idea for Debian machines. Please install a minimal rules file in any case:
 { SAVE incoming };
 is the suggested minimal rules file.

Allowed Commands

The allowed command file (as specified by the comfile variable in your ~/.mailagent) contains all the recognized and allowed commands. The file commands held in directory Lib/mailagent should be copied as-is into your Spool directory.

Testing Your Installation

Now, assuming you have set a proper ~/.mailagent file and edited the configuration section of the filter, it is time to test your installation. Make sure your .forward is world readable and that the filter has the execution bits set (there is no reason to make the filter world readable). Set a log-level of 20 and disable vacation mode (the vacation entry in the ~/.mailagent should be OFF). Set the name of the rule file to a file containing a catch-all rule:
     { SAVE incoming };
 You are ready to proceed...

Send yourself a mail and give mailagent time to process your mail. The subject of the message should be 'test' (in fact, anything but 'Command'). You may want to run a "tail -f logfile" to see what's happening. At the end of the processing, the logfile should contain something like the following (names of temporaries may -and will- of course differ; timestamps have been removed):

got the right to process mail
building default rules
parsing mail
analyzing mail
in mode 'INITIAL' for ALL
selector 'All' on '<1,->', pattern '/^Subject: [Cc]ommand/'
matching '/^Subject: [Cc]ommand/' on 'All' (<1,->) was false
selector 'All'  on '<1,->'
matching . on 'All' (<1,->) was true
saving in folder incoming
XEQ (LEAVE)
starting LEAVE
starting SAVE /home/ram/mail/incoming
SAVED [qm7831] in folder incoming
FILTERED [qm7831] from ram (Raphael Manfredi)
mailagent continues
mailagent exits

If you do not get that, there is a problem somewhere. Start by looking at the ~/.bak file (or whatever file the .forward uses to redirect output of the filter). If you see something like:

FATAL no valid queue directory
DUMPED in ~/mbox.filter

then it means the queue parameter in your ~/.mailagent does not point to a valid directory. Your mail has been dumped in an emergency mailbox.

The ~/.bak file may also contain error messages stating that perl was not found. In that case, there should be an error message in the logfile:

ERROR mailagent failed, [qm7886] left in queue

In that case, make sure the mail has correctly been queued in a file qm7886. The queue will be processed again when another mail arrives or when the mailagent is invoked with -q (however, to avoid race conditions, only mails which have remained for a while will be processed).

Queuing of mail also happens when another mailagent is running. If the logfile says:

denied right to process mail

then remove the perl.lock file in the Spool directory. Old lock files are automatically discarded by the mailagent anyway (after one hour).

If none of these occurs, then maybe sendmail did not process your ~/.forward at all or the file has a syntax error. Check your mailbox, and if your mail is in there, your .forward has not been processed. Otherwise, ask your system administrator to check sendmail's logfile. A correct entry would appear as (with leading timestamps and syslog stamps removed):

message-id=<[email protected]>
from=ram, size=395, class=0, received from local
to="| /york/ram/mail/filter >>/york/ram/.bak 2>&1", delay=00:00:05, stat=Sent

If you still cannot find why the mail was not correctly processed, you should make sure you normally receive mail by removing (or renaming) your ~/.forward and sending yourself another test mail. Also make sure your home directory is world readable and "executable".

If you are using the C filter, make sure it is running on the right platform. There may be a low-level routing of all your mail to a mailhost machine, responsible for the final delivery, and the filter program will run on that machine, which may be a different platform than the one you compiled filter on. Also make sure your home directory is mounted on that machine, or the mail transport agent will be unable to locate your .forward file, much less process it.

This kind of centralized mail delivery is good only when a few people have mail processing hooks (i.e. .forward files piping mail to a program); otherwise it's better to route mail to each user's workstation or machine, for local processing, to avoid an excessive workload on the mailhost machine, especially if it is a dedicated NFS server. If you are a system administrator installing mailagent and expect many people to use it, keep this in mind.

OPTIONS

There is a limited set of options which may be used when calling the mailagent directly. Only one special option at a time may be specified. Invoking mailagent as mailqueue is equivalent to using the -l option.
-c file
Specify an alternate configuration file (~ substitution occurs). The default is ~/.mailagent.
-d
The mailagent parses the rule file, compiles the rules and dumps them on the standard output. This option is mainly used to check the syntax of the rule file and make sure the rules are what the user really thinks they are.
-e rule
This option lets you specify some rules on the command line, which will override those specified via the ~/.mailagent, if any. There may be as many -e as necessary, all the rules being concatenated together as one happy array, which is then parsed the same way a rule file is. If only one rule is given and there is no action specified between {...} braces, then the whole line is enclosed between braces. Hence saying -e 'SAVE foo' will be understood as -e '{SAVE foo}', which will always match and be executed. Using the -d option in conjunction with this one is a convenient way to debug a set of rules.
-f mailfile
Using mailfile as a UNIX-style mailbox (i.e. one where each mail is preceded by a special From line stating the sender and the date the message was issued), extract all its messages into the queue and process them as if they were freshly arrived from the mail delivery subsystem.
-F
Force processing on already seen messages. Usually, mailagent enters the special _SEEN_ state when it detects an X-Filter: line issued by itself, but this option will have it continue as usual (although vacation messages are disabled). Use this option when post-processing mail already filtered. Also look at the -U switch if you are using the RECORD or UNIQUE actions in some rules.
-h
Print out a usage message on the standard error and exit.
-i
Interactive mode, directs mailagent to print a copy of all the log messages on stderr.
-I
Install a ~/.mailagent file from template, or merge new configuration variables into an existing file; then perform sanity checks and create mandatory files or directories. This option may be viewed as a help in setting up mailagent's environment. In any case, the created/merged ~/.mailagent file should be manually verified before letting mailagent deal with your mail by hooking it into ~/.forward.
-l
List the mailagent queue. Recently queued mails which are waited for by the filter are skipped for about half an hour, to avoid race conditions. This may be configured via the queuehold variable. Really old messages (more than queuelost seconds old) are flagged with a '#' character. Messages out of the queue (queue variable) are flagged with a '*', whilst old messages out of the queue are signaled by an '@'. Locked messages have a '*' appended to their status.
-L level
Override the log level specified in the configuration file.
-o override
This option lets you override a specific configuration option. The option must be followed by a valid configuration line, which will be parsed after the configuration file itself. For instance, the -L 4 option is completely equivalent to -o 'level: 4'. Note that any white space must be protected against shell interpretation by using the appropriate quoting mechanism. There may be as many -o options on the command line as necessary.
-q
Force processing of mailagent's queue. Only the mails not tagged as skipped by the -l option will be processed.
-r file
Specify an alternate rule file.
-s {umaryt}
Build a summary of all the statistics gathered so far. The output can be controlled by appending one or more letters from the set {umaryt}. Using -summary is a convenient way to get the whole history of the filter actions. The u modifier will print only used rules. The m will merge all the statistics at the end while a reports the mode the filter was in when the command was executed. The r asks for rule-based statistics and the y is pretty useless and is here only to get a nice mnemonic option. Note that specifying an option more than once has no effect whatsoever on the option itself (i.e. you may put three Uu and only one m, but you'll still get the summary!). The t letter may be followed by digits specifying how many rule file versions relative to the topmost (most recent) rule file we should extract from the statistics, that amount defaulting to 1: using -surat will print a complete statistics report for the last version of your rules, while -surt12a would do the same for the last twelve versions of those same rules.
-t
Put mailagent in a special tracking mode where all the rule matches and executed actions are printed on the standard output. This is mostly useful for debugging a rule file. See also the track parameter in the configuration file.
-V
Print version number and exit.
-U
Prevent the UNIQUE and RECORD commands from rejecting an already processed Message-ID the first time they are run on a given message. This is useful when processing messages that have been dropped in the emergdir directory due to some abnormal (but transient) condition and you wish to reprocess the message. Also see the -F switch if you are re-processing messages.
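
As a rough illustration of how the modifier string given to -s might be decoded, here is a minimal Python sketch. The function name parse_summary_flags and the shape of its return value are assumptions made for this example only; mailagent itself is written in perl and may well parse the string differently.

```python
def parse_summary_flags(flags):
    """Decode the letters following -s, e.g. 'urt12a' for -surt12a.

    Returns (letters, versions): the set of modifier letters seen and the
    number of rule file versions requested through 't' (1 by default).
    """
    allowed = set("umaryt")
    letters = set()
    versions = 1
    i = 0
    while i < len(flags):
        ch = flags[i]
        if ch not in allowed:
            raise ValueError("unknown modifier: " + ch)
        letters.add(ch)  # repeating a letter has no further effect
        i += 1
        if ch == "t":
            # Optional digits after 't' give the number of versions.
            start = i
            while i < len(flags) and flags[i].isdigit():
                i += 1
            if i > start:
                versions = int(flags[start:i])
    return letters, versions
```

With this sketch, the string for -surt12a yields the letters u, r, t and a together with a version count of 12, while -summary simply yields u, m, a, r and y.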

If you invoke mailagent without options and without any arguments, the program waits for a mail on its standard input. If an argument is provided, it is the name of a file holding one mail to be processed. This is the normal calling procedure from the filter, the argument being the location of the queued mail.

USING THE DEFAULT RULES

If you do not want to use the filtering feature of mailagent, then the default built-in rules will be used. (NOTE: this may cause mail to be garbled on Debian systems, since mailagent cannot lock the spool directory under Debian policy restrictions.) Those rules are really simple: all the mails are left in your mailbox and mails with a line "Subject: Command" anywhere in the message will be processed. Commands are looked for on lines starting with "@SH". The remainder of the line is then given to a shell for execution.

Available commands are read from a file (entry comfile in your configuration file), one command name per line. Only those listed there will be executed, others will produce an error message. The mailagent traps the exit status and will send an error report if a command fails (provided that the command does not issue a message by itself, in which case it should return a zero exit status).

If you do not want to use the default rules, you may skip the remainder of this section.

Configuring Help

The help text mailagent will send to people must be copied from Lib/mailagent/agenthelp into your own spool directory, as specified in your ~/.mailagent. Two macros may be used:

=DEST=
This will be expanded to the sender's address (the one who sent you the mail currently processed by mailagent).
=MAXSIZE=
This stands for the maximum size set before kit is used to send files back (parameter maxsize in your ~/.mailagent file).

You may use the default help file or design one that will give even more details to the poor user.

Distribution Files

The two files proglist and distribs held in Lib/mailagent describe the distributions your mailagent will be able to distribute. The samples given show the expected syntax. In order to clarify things, here is what the format should be:

File proglist contains a small description for programs. The name of the program appears after a single star. It is followed by lines in free format. An optional three-dashes line separates each program's description. Note that a leading tab will be added to each line of description.

The distribs file holds lines of the following form:

progname version path archived compressed patches


where:
progname
is the program name (the same as the one mentioned in proglist).
version
is the current version number. If none, a three-dashed line may be used.
path
is the path where the distribution is stored. The ~ will be expanded into your home directory. Note that if the distribution is stored in archived form, the path name is the one of the archive without the ending extension (which may be .cpio.Z or .tar.Z).
archived
is either y or n depending on whether the distribution is archived or not.
compressed
is either y or n depending on whether the distribution is compressed or not. This could be guessed from the extension's name, but we must think of file systems with short names.
patches
is y or n depending on whether the distribution is maintained or not by you. If you put a p, this means official patches are available, although you do not maintain the distribution. Finally, an o means that this is an old version, where only patches are available, but maildist will not work. In that case, assuming the version number is 1.0, old patches are expected in a bugs-1.0 directory.

You may include comments in both files: all lines starting with a leading # will be ignored.
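
To make the distribs line format concrete, here is a small Python sketch that reads such a file. It is only an illustration of the six-field layout described above: the names Distrib and parse_distribs, the validation it performs, and the sample entries used to try it are all hypothetical, and mailagent's own perl code is not shown here.

```python
from collections import namedtuple

# One record per line: progname version path archived compressed patches
Distrib = namedtuple(
    "Distrib", "progname version path archived compressed patches"
)

def parse_distribs(text):
    entries = []
    for line in text.splitlines():
        # Lines starting with '#' are comments; blank lines are skipped
        # too (the latter is an assumption of this sketch).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 6:
            raise ValueError("expected 6 fields: " + line)
        progname, version, path, archived, compressed, patches = fields
        if archived not in ("y", "n") or compressed not in ("y", "n"):
            raise ValueError("archived/compressed must be y or n: " + line)
        if patches not in ("y", "n", "p", "o"):
            raise ValueError("patches must be y, n, p or o: " + line)
        if version == "---":
            version = None  # three dashes stand for "no version"
        entries.append(
            Distrib(progname, version, path, archived, compressed, patches)
        )
    return entries
```

For example, a file holding a comment line followed by "foo 1.0 ~/dist/foo y y p" and "bar --- /tmp/bar n n n" (made-up entries) would produce two records, the second one with no version.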

Testing Your Mail Agent

It is now time to make sure your mailagent works. Send yourself the following mail:

Subject: Command
@SH mailhelp

You should receive back a mail from yourself with the subject set to: "How to use my mailagent". If you don't, check the file ~/.bak (or whatever file you set in your .forward). If it is empty, look at the log file. If the log file is not empty, then perhaps the mail has been queued. Check the sendmail queue. Also make sure that you removed the '#' comments in the filter script. On some systems, they cause some trouble. If you are using the C filter, maybe your sendmail is broken and you need to make your own setuid copy (or perl might complain that you have a kernel bug, etc...).

If you have done everything right but it still does not work properly, increase log level to 20 and resend your command mail. Then check the log file. The diagnosis should be easier.

Once this works, you should check your distribs and proglist files by sending yourself the following mail:

Subject: Command
@SH maillist

If the list you have in return is incorrect, then your distribution files are wrongly written. If you do not get the list, there is a problem with your mailagent's configuration. Retry with a log level set to 20 and look at the issued log messages in your Log directory. Make sure that the file listed in the plsave entry of your ~/.mailagent is correctly updated after a maillist has been run.

USING THE FILTER

The mailagent can also be used as a filter: mail is parsed and some actions are taken based on simple lex-like rules. Actions range from a simple saving in a folder, a forwarding to another person, or even spawning of a shell command. Before going further, here is a small example of a valid rule file:

From: root { FORWARD postmaster };
To: [email protected] { POST mail.gue };
Subject: /metaconfig/ { SAVE dist };
{ SAVE incoming };

There are four distinct rules. Rules are applied in sequence, until one matches (so the order is important). Any mail coming from root will be forwarded to user postmaster. A mail addressed to [email protected] is a mail coming from a mailing list. The mail is posted on a local newsgroup mail.gue. Mails whose subject contains the word "metaconfig" will be saved in a folder dist for delayed reading and will not appear in the main mailbox. If no rule matched, the mail is saved in the folder incoming.

Rule File Syntax

Here is a non-formal description of the rule file. Parsing of the file is done lexically, hence the choice of non-ambiguous tokens like '{' or ';' which are easily parsed. This introduces some limitations which are silently applied: for instance, no '{' may be used as part of an address.

Comments are introduced by a leading '#', which must be the first non-blank character on the line. Unlike shell comments, a '#' which follows other text on the line will not be understood as a comment; only spaces or tabs are allowed in front of it.

All the statements in the rule file must end with a ';'. There are mainly four parts in each line. A list of comma separated modes, between '<' and '>', which give the set of modes in which the rule applies. The special mode ALL will match everything. The filter begins in the mode INITIAL. Omitting the mode defaults to "<ALL>". It is possible to guard a rule against some specific mode by negating it, which is done by prefixing the mode with '!'. Negated modes take precedence over plain modes, meaning "<!ALL>" will never be matched, ever, and that "<MODE, !MODE>" is equivalent to "<!MODE>".

Then comes a list of selectors. Those selectors must be space separated and end with ':'. They represent the names of header fields which must be looked at by the forthcoming pattern. An empty selector list defaults to "Subject:". Special selectors "All:", "Body:" and "Head:" apply to the whole message, its body or its header. A commonly used selector list is "To Cc:" which tests the recipient fields of the header. If the selector name is preceded by an exclamation mark '!', then the logical value of the test for that selector is negated.

The list of selectors may end with an optional range specification, given as <min, max>, before the final ':' character marking the end of the selector list. The minimum or the maximum may be given as '-', in which case it is replaced with the minimal or maximal possible value. Indices for selection begin at 1 (not 0), for instance: <3, 7>. If no range selection is given, then the default <1, -> is used. Ranges normally select lines within the matching buffer, unless the selector is expecting a list in which case it operates on the list items. For instance, Body <3, 5>: would select lines #3 to #5 (included) from the mail body, whereas To Cc <1,3>: would focus on the first three addresses on each To: or Cc: header lines. Negative values refer to that many lines or addresses back from the end, i.e. Cc <-2,->: selects the last two addresses on the Cc: line. A single number such as <2> is understood as <2, 2>, i.e. it selects only one item in the list, <-> meaning everything (and being therefore redundant).
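
The range rules can be summarized with a short Python sketch operating on a plain list of lines or addresses. The function name select_range is hypothetical, and the clamping of out-of-bounds values to the available items is an assumption of this example rather than documented mailagent behavior.

```python
def select_range(items, spec="1,-"):
    """Return the items selected by a range; spec is the text found
    between '<' and '>' (for instance '3, 5', '-2,-', '2' or '-')."""
    n = len(items)
    parts = [p.strip() for p in spec.split(",")]
    if len(parts) == 1:
        # <2> means <2, 2>, and <-> means everything.
        parts = ["-", "-"] if parts[0] == "-" else [parts[0], parts[0]]
    lo_text, hi_text = parts

    def resolve(text, default):
        if text == "-":
            return default
        value = int(text)
        if value < 0:
            # Negative values count back from the end: -1 is the last item.
            value = n + value + 1
        return value

    lo = max(resolve(lo_text, 1), 1)   # indices start at 1, not 0
    hi = min(resolve(hi_text, n), n)
    return items[lo - 1:hi]
```

Applied to a seven-line body, the specification "3, 5" keeps lines 3 to 5, while "-2,-" keeps the last two lines.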

The selector is then followed by a pattern within '/' or by a single name. In order to ease the writing of the rules, the semantic of a single name varies depending on the selector used. For the special selectors "From:", "To:", "Cc:", "Sender:", their associated "Resent-" fields, "Reply-To:", "Envelope:" and "Apparently-To:", a single name is understood as a match on the login name of the address. Note that if no "To:" field is present in the header, one will be forged from the "Apparently-To:" for the purpose of filtering only (i.e. no physical modification on the header is done). If the login name of the address is a full name of the form First.Last, only the last name is kept, and is lower-cased. If only a single name is given, only shell metacharacters * and ? are allowed, as well as intervals [].

If the pattern is preceded by a single exclamation mark '!', then the matching status is negated (i.e. it will succeed if the pattern is not found). If a single word is used for non-special selectors, the same rules apply but the pattern is anchored at the beginning and the end for an exact match. With a pattern starting with '/', any regular expression understood by perl may be used and your pattern will not be modified in any way. The other special selector "Newsgroups:" works as "To:", except that newsgroup names are expected and a match is attempted on every item in the list. Every pattern match on a single name for an address-type field (i.e. "Newsgroups:" excluded) is made in case-insensitive mode. Otherwise, you can force a case-insensitive match by appending a trailing i option, as in /pattern/i.

There is also a little magic involved when matching on an address field. Namely, if the pattern is not a single word and is anchored at the beginning, then only the address part of the field will be kept. For instance, if we have a From: field whose value is Raphael Manfredi <[email protected]>, then the pattern /Raphael/ would match, but not /^Raphael/. Instead, /^ram@.*$/ would match, but this is more easily done with a single word pattern ram, for it only focuses on the login name of the address and would also match if the address was written as eiffel.com!ram. A single address in Internet form, as in [email protected] is implicitly matching on the address part of the field, and you must not escape the '.' as you would have to in a regular expression.

This may sound a little complex, but this design is meant to make things easier for the user. Here are some other examples:

# Match [email protected] as well as [email protected].
From: ram
# Match [email protected], ram but not [email protected]
From: r[oa]*
# Match [email protected] but not [email protected]
To Cc: /^gue@eiffel\.fr/
# This will match [email protected] as well as [email protected]
To Cc: /gue@eiffel/
# Match comp.lang.perl but not comp.lang.perl.poetry (?)
Newsgroups: comp.lang.perl
# Accept anything but messages coming from root
From: !root

When attempting a match on "To:", "Cc:" or "Apparently-To:", a list of addresses separated by a comma is expected, whereas only one address is expected after "From:". If you omit the pattern, it will be understood as * (recall that a single word uses shell meta-characters), which will match anything.
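
The single-name matching on address fields can be approximated in a few lines of Python. This is only a sketch of the behavior described above: the helper names login_name, match_login and match_any are hypothetical, the example.com addresses are placeholders, and splitting a recipient list on bare commas is a simplification that ignores quoted full names.

```python
import fnmatch
import re

def login_name(address):
    """Extract the lower-cased login name from one address."""
    address = re.sub(r"\([^)]*\)", "", address)    # drop "(Full Name)"
    bracketed = re.search(r"<([^>]*)>", address)   # Full Name <addr>
    if bracketed:
        address = bracketed.group(1)
    local = address.strip().split("@", 1)[0]       # login@host
    local = local.rsplit("!", 1)[-1]               # host!login
    local = local.rsplit(".", 1)[-1]               # First.Last -> Last
    return local.lower()

def match_login(pattern, address):
    """Match a single-name pattern (shell metacharacters *, ? and []
    allowed) against the login name, ignoring case."""
    return fnmatch.fnmatchcase(login_name(address), pattern.lower())

def match_any(pattern, field):
    """Match against each address of a comma-separated To: or Cc: field."""
    return any(
        match_login(pattern, item) for item in field.split(",") if item.strip()
    )
```

Under these assumptions, the pattern r[oa]* accepts root@example.com and ram@example.com but not rms@example.com, and the pattern ram accepts both "Raphael Manfredi <ram@example.com>" and example.com!ram.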

Then comes the action to be taken when a match occurs. There is only a limited set of valid actions which will be described soon in detail. The action is enclosed in curly braces '{' and '}' and actions are separated or terminated (depending on your taste) by a ';'. Action names are spelled in upper-case for readability, but case is irrelevant. If you want to put a ';' within the rule, it must be escaped by preceding it with a backslash. A double backslash is translated into a single one, and any other escape sequence involving the backslash character is ignored (i.e. \n would be kept verbatim).
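
The escaping rules for ';' and backslash inside an action block can be illustrated with the following Python sketch. The function name split_actions is hypothetical and the code only models the three rules stated above (an escaped ';', a doubled backslash, and every other backslash sequence kept verbatim); it makes no claim about how mailagent's own parser is written.

```python
def split_actions(body):
    """Split the text found between '{' and '}' into individual actions."""
    actions, current = [], []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body) and body[i + 1] in ";\\":
            # "\;" yields a literal ';' and "\\" yields a single backslash.
            current.append(body[i + 1])
            i += 2
            continue
        if ch == ";":
            # An unescaped ';' separates (or terminates) actions.
            actions.append("".join(current).strip())
            current = []
        else:
            # Any other character, including a backslash that starts
            # some other sequence such as \n, is kept verbatim.
            current.append(ch)
        i += 1
    actions.append("".join(current).strip())
    return [action for action in actions if action]
```

For instance, the body " ASSIGN host '%1'; REJECT " splits into the two actions ASSIGN host '%1' and REJECT, whether or not a trailing ';' follows REJECT.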

Note that a rule should be ended by a single ';' after the last '}'. It is possible to omit this final ';', but that single token is the re-synchronizing point for error recovery. One could argue however that there should be no syntax error, and thus the ';' ought to be safely omitted. Whenever in doubt, check your rule file with the -d option.

Here is a prototypical rule (using perl regular expressions; please refer to the subsection Regular Expressions for more information):

<ROOT> From: /^\[email protected]$/ { SAVE eiffel };

That rule will only be taken into account when the filter is in the mode ROOT (recall that the processing starts in mode INITIAL; use BEGIN to change the mode, as in lex). So in mode ROOT, anything which comes from a user located in the eiffel.com site is saved in folder eiffel for deferred reading. The mail will not appear in the mailbox.

It is possible to have more than one selection for a rule. Identical selectors are logically or'ed while different ones are and'ed. The selections are comma separated. For instance,

From: root, To: ram, From: ram, Subject: /\btest\b/ { DELETE };

will delete a mail from root or ram if it is sent to ram and has the word test in its subject. It is also possible to write the previous rule as:

From: root, ram, To: ram, Subject: /\btest\b/ { DELETE };

because if no selector is given, the previous one is used (with the first selector being "Subject:" by default).

Anywhere in the rule file, it is possible to define some variables. The list of recognized variables is given later. For now, let's say that maildir is the default folder directory. This variable is used by the SAVE command when the argument is not an absolute path. Setting

maildir = ~/mail;

will direct the filter to use ~/mail as the folder directory (default is ~/Mail). Note the ~ substitution and the final ';'. It is not possible (currently) to modify the environment by setting PATH for instance.

Finally, there is a special construct to load patterns from a file. A pattern enclosed in double quotes means that the patterns to be applied should be taken from the specified file. The file is expected to be in the directory mailfilter if it is not an absolute path (~ substitution occurs). If that variable is not set, maildir will be used. If by chance (!) maildir is not set either, the home directory is used. The file should contain one pattern per line, shell comments (#) being allowed at the beginning of each line.
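
The lookup order for a pattern file can be sketched in Python as follows. The function names pattern_file_path and load_patterns are hypothetical, variables stands for a plain dictionary of rule file variables, only the "~" and "~/..." forms of ~ substitution are handled, and skipping blank lines is an assumption of this example.

```python
import posixpath

def expand_tilde(path, home):
    """Replace a leading '~' with the home directory."""
    if path == "~" or path.startswith("~/"):
        return home + path[1:]
    return path

def pattern_file_path(name, variables, home):
    """Locate a pattern file: mailfilter first, then maildir, then home."""
    name = expand_tilde(name, home)
    if posixpath.isabs(name):
        return name
    base = variables.get("mailfilter") or variables.get("maildir") or home
    return posixpath.join(expand_tilde(base, home), name)

def load_patterns(text):
    """Return one pattern per line, ignoring '#' comment lines."""
    patterns = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        patterns.append(line.strip())
    return patterns
```

With maildir set to ~/mail and no mailfilter variable, the file name users would resolve to the users file under the mail directory of the home directory.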

An action may be followed by other rules. Hence the following is perfectly valid:

From:
        ram             { SAVE ram }
        /plc/i          { SAVE plc }
        root            { SAVE ~/admin }
        /xyz/           { DELETE }
        "users"         { LEAVE }
        ;

Note the use of the file inclusion: all the users listed in file users will have their mail left in the system mailbox. The usual rules apply for these loaded patterns.

Selector Combination

A single rule may have a varied set of selectors. For instance, in the following rule:

From: ram, To Cc: root, !Subject: /test/, From: raphael

we have the following set { From, To Cc, !Subject }. The first two selectors are called direct selectors, !Subject: is called a negated selector. The To Cc: selector is a group selector decomposing into two direct selectors, while From: is an atomic selector. Finally, From: is also a selector with multiple occurrences. The value of a selector is its matching status logical value.

Let D be the set of direct selectors and N the set of negated selectors, which form a partition of R, the set of all the selectors in the rule. That is to say, R is the union of D and N, and D intersected with N is the empty set (trivial proof: a selector is either direct or negated). If either D or N is empty, then it's not a partition but in that case we have either D = R or else N = R.

Let's define the logical value of a set S as being the logical value the filter would return if those rules were actually written. Then the logical value of D is the logical value of each of its items with the AND logical operator distributed among them, i.e. the logical value of { a, b, c } is the value of (a AND b AND c). Let's write it AND(D). The logical value of each of the items is the logical value of the selector itself if it is not multiple, or it is the logical value of all the occurrences of the multiple selector within the rule, with the logical OR operation distributed among them. That is to say, in the above example, the value of From is true iff the From: field contains ram OR raphael. Let's write that OR[From].

To be sound, we have to apply De Morgan's Law on N, hence the following rules: the logical value of N is OR(N) and given a negated selector s, its logical value is AND[s]. And finally, the logical value of R is that of D AND N, with by convention having the logical value of the empty set be true.

For those who do not know De Morgan's Law, here it is: given two logical propositions p and q, then the following identities occur:

NOT (p AND q) <=> (NOT p) OR (NOT q)
NOT (p OR q) <=> (NOT p) AND (NOT q)

While we are in the logic of the propositions, note also that OR and AND are mutually distributive, that is to say, given three logical propositions p, q and r, we have:

p AND (q OR r) <=> (p AND q) OR (p AND r)
p OR (q AND r) <=> (p OR q) AND (p OR r)

To be complete, OR and AND are associative with themselves and commutative. And the B set { 0, 1 } equipped with the set of operations (NOT, OR, AND) is an algebra (a Boolean one). I will spare you the definition of an algebra, which really has nothing to do in this manual page (which is for a mail agent, in case you don't remember :-).

The attentive reader will certainly have noted that I have not specified the logical value of a group selector. Well, given a group selector G, we decompose it into a DG and NG partition, DG being the subset of (atomic) direct selectors of G and NG being the subset of (atomic) negated selectors. Then the logical value of DG is OR(DG) and the logical value of NG is AND(NG); the global logical value of G being that of DG OR NG. In case either DG or NG is empty, then we don't have a partition, but by convention the value of the empty set is false, and one of the sets is equal to G. Note that within a group selector, the rules are exactly the dual of the rules within R.

Now the only rule which is not logical is whether a group selector belongs to D or N. I've chosen, for analogy reasons, to make the group selector belong to D if it does not start with '!' and to N otherwise. That is, !To Cc: belongs to N whilst Cc !To: belongs to D. Apart from that, order within the group selector is irrelevant: To Cc: is equivalent to Cc To:, so the behavior in the quotient set is sound.

Here are some examples:

# Match anything: (not from ram OR not from root) is always true.
From: !ram, !root
# Match anything but reject mails coming from ram OR root
!From: ram, root
# Reject mails whose headers matching /^Re.*/ contain the word test
!^Re.*: /\btest\b/
# Keep mails whose subject contains test AND host
!Subject: !/test/, !/host/
# Matches if ram is listed in the To OR the Cc line
To Cc: ram
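
The combination rules for atomic selectors can be checked with a small Python model. This is a sketch under stated assumptions: the function name rule_value is hypothetical, group selectors such as To Cc: are deliberately left out, and each selection is supplied as a (selector, negated, outcome) tuple where outcome is the result of the pattern test, already taking into account any '!' placed in front of the pattern.

```python
def rule_value(selections):
    """Logical value of a rule made of atomic selectors only."""
    direct, negated = {}, {}
    for name, is_negated, outcome in selections:
        if is_negated:
            # A negated selector inverts the outcome of each occurrence.
            negated.setdefault(name, []).append(not outcome)
        else:
            direct.setdefault(name, []).append(outcome)
    # AND(D), where each direct selector is the OR of its occurrences.
    d_value = all(any(values) for values in direct.values())
    # OR(N), where each negated selector is the AND of its occurrences;
    # by convention the empty set is true.
    n_value = (
        any(all(values) for values in negated.values()) if negated else True
    )
    return d_value and n_value
```

For the rule "!From: ram, root" and a mail coming from root, the selections are ("From", True, False) and ("From", True, True), and the model returns false, so the mail is rejected as the example above states.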

Minimal Header

A minimal set of selectors is guaranteed to be set, regardless of the actual header of the message. This is for the purpose of filtering only, no physical alteration is performed.

Envelope:
This is the address found in the mail envelope, i.e. the address where the mail seems to originate from. This can be different from the From: address field if the mail originates from a trusted user, in sendmail's terminology. If you don't know what that is, simply ignore it.
From:
User who wrote the mail. If this line is missing, uses the address found in the first From line.
Length:
The physical length of the body, in bytes, once content-transfer-encoding (if any) has been removed.
Lines:
The number of lines in the body (decoded, if necessary).
To:
The main recipient(s) of the message. If this line is missing but a set of Apparently-To: lines is found, then those addresses are used instead. If no such line exists, then assume the mail was directed to the user (which seems a reasonable assumption :-).
Sender:
User who sent the mail. This may differ from the From: line. If no such field exists, then the address in the first From line is used (mail envelope).
Relayed:
This computed header is a comma-separated list of all the hosts where the message was relayed, in the proper transmission order. Each item in this list can be a machine name such as mail.hp.com or an IP address such as [15.125.38.12]. The list is derived from the Received: lines present in the message.
Reply-To:
Where any reply should be sent. If no Reply-To: field is present, then the Return-Path is used (with <> stripped out), or the From: line is parsed to extract the e-mail address of the author.

Variables

The mailagent supports user-defined variables, which are global. They are set via the ASSIGN command and referred to with the %# macro. Assuming we set a variable host, then %#host would be replaced by the actual value of the variable. This enables some variable propagation across the rules.

For example, let's say the user receives cron outputs from various machines and wishes to save them on a per-machine basis, differentiating between daily outputs and weekly ones. Here is a solution:

Subject: /output for host (\w+)/        { ASSIGN host '%1'; REJECT };
Subject: /^Daily output/        { SAVE %#host/daily.%D };
Subject: /^Weekly output/       { SAVE %#host/weekly.%m-%d };

Besides variable interpolation via the %# escape, it is also possible to perform substitutions and translations on the content of a variable (or a back-reference, i.e. a number between 1 and 99). The two commands SUBST and TR will respectively perform in-place substitutions and translations. In that case however, the name of the variable must be preceded by a single #. This differentiates the back-reference 1 from the variable #1, although 1 is a funny name for a variable. The need for # also prevents the common mistake of writing %#, as mailagent will loudly complain if the first parameter of SUBST or TR is not a digit between 1 and 99 or does not start with a #.

Here are some actions to canonicalize the host name into lower case and strip down the domain name, if any:

{ TR #host /A-Z/a-z/; SUBST #host /^([^.]*)\..*/$1/ };

Those actions are directly translated into their perl equivalent, and any error in the specification of the regular expression will be reported.
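As a rough illustration of what the two actions above compute, here is a small Python sketch. It is not mailagent code: the function names are hypothetical, and Python's re and str.translate merely stand in for the perl tr and s operators that mailagent actually uses.

```python
import re


def tr_lower(value: str) -> str:
    # Stand-in for: TR #host /A-Z/a-z/
    # Map each upper-case ASCII letter to its lower-case counterpart.
    table = str.maketrans("ABCDEFGHIJKLMNOPQRSTUVWXYZ",
                          "abcdefghijklmnopqrstuvwxyz")
    return value.translate(table)


def strip_domain(value: str) -> str:
    # Stand-in for: SUBST #host /^([^.]*)\..*/$1/
    # Keep only what precedes the first dot; leave dot-less names untouched.
    return re.sub(r"^([^.]*)\..*", r"\1", value, count=1)


def canonical_host(value: str) -> str:
    # Apply the translation first, then the substitution, as in the rule.
    return strip_domain(tr_lower(value))
```

With these definitions, canonical_host("Mail.HP.COM") yields "mail", while a name without any domain part, such as "lyon", comes back unchanged.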

If the variable name begins with a colon ':', then the variable is made persistent. That is to say it will keep its value across different mailagent invocations. The variable is simply stored (with the leading ':' removed) in mailagent's database and is thus subject to the aging policy set up in the ~/.mailagent file.

Within PERL commands or mail hooks using perl (see the MAIL HOOKS section), you can manipulate those (so-called) external variables via a set of interface functions located in the extern package (i.e. you must prefix each function name with its package name, set becoming extern'set). The following three interface functions are provided:

val(name)
Return the value of the variable name (the leading ':' is not part of the name, in any of these three interface functions).
set(name, value)
Set the external variable name to hold value. No interpretation is done by the function on the actual content of the value you are providing.
age(name)
Returns the age of the variable, i.e. the elapsed time in seconds since the last modification made by set.

There is currently no way to erase a variable from the database. However, if you no longer use a variable, it will be removed when its age becomes greater than the maximum age specified by the agemax configuration variable.
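The semantics of val, set and age, together with the aging policy, can be modelled by the following Python sketch. It is only a toy model under stated assumptions: the ExternStore class, its expire method and the injectable clock are hypothetical, the real functions live in perl's extern package, and returning an empty string for an unknown variable is a guess rather than documented behavior.

```python
import time


class ExternStore:
    """Toy model of persistent variables; not mailagent's actual database."""

    def __init__(self, agemax, clock=time.time):
        self.agemax = agemax  # maximum age in seconds (plays the role of agemax)
        self.clock = clock    # injectable time source, handy for experiments
        self.data = {}        # name -> (value, time of last set)

    def set(self, name, value):
        # The value is stored verbatim, with no interpretation.
        self.data[name] = (value, self.clock())

    def val(self, name):
        # Returning "" for an unknown name is an assumption of this sketch.
        return self.data[name][0] if name in self.data else ""

    def age(self, name):
        # Seconds elapsed since the last set(); raises KeyError if unknown.
        return self.clock() - self.data[name][1]

    def expire(self):
        # Drop every variable whose age exceeds agemax.
        for name in [n for n in self.data if self.age(n) > self.agemax]:
            del self.data[name]
```

The point of the model is that a variable cannot be deleted explicitly: it only disappears once it has gone unmodified for longer than the maximum age.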

Regular Expressions

All the regular expressions follow the V8 syntax, as in perl, with all the perl extensions. If a bracketing construct (...) is used inside a rule, then the %digit macro matches the digit's substring held inside the bracket. All those back-references are memorized on a per-rule basis, numbered from left to right. However, great care must be taken when using a back-reference with a selector that carries several patterns, as the patterns are tried only up to the first one that matches, and back-references are computed on the fly while doing pattern matching.

For instance:

To: /(.*)/, Subject: /Output from (\w+)/ { ASSIGN to '%1'; SAVE %2 };

will save the To: field in variable 'to' and save the mail in a folder derived from the host name specified in the subject. However, if we say:

Subject: /host (\w+)/, /from (\w+)/ { ASSIGN match '%1' };

then there will be only one back-reference set, and it will come from the first pattern matching if it succeeds, or from the second. Should the second or the first pattern have no bracketing construct and still match, then the back-reference would not be recorded at all, which means the following is probably not what you want:

Subject: /from/, /host (\w+)/, To: /(.*)/ { SAVE %1; REJECT };

Here, if the /from/ pattern matches, then /host (\w+)/ will not be checked (identical selectors are or'ed and that is optimized), so %1 would refer to the To: field. If instead /host (\w+)/ is the pattern that matches, then %1 will be the host name.

However, this behavior can be used to selectively store a news article which has been mailed to you in a folder whose name is the newsgroup name in dot form. Assuming we want to give priority to comp.lang.perl, we could say:

Newsgroups:
        /(comp.lang.perl)/,
        /(comp.mail.mh)/,
        /(comp.compilers)/,
        /([^,]*)/               { SAVE %1 };

An article cross-posted to both comp.lang.perl and comp.mail.mh would be saved in a comp.lang.perl folder, since this is what would match first. The last pattern takes care of other articles: the folder used is whatever newsgroup appears first.
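The priority scheme relies only on the fact that or'ed patterns are tried in order and that matching stops at the first success. A minimal Python sketch of that idea follows; the PATTERNS list and the folder_for function are hypothetical names, and Python's re module stands in for perl's regular expressions.

```python
import re

# Same patterns as in the rule, in the same order of priority.
PATTERNS = [
    r"(comp.lang.perl)",
    r"(comp.mail.mh)",
    r"(comp.compilers)",
    r"([^,]*)",  # catch-all: whatever newsgroup comes first
]


def folder_for(newsgroups: str):
    # Try each pattern in turn and stop at the first one that matches;
    # its first group plays the role of the %1 back-reference.
    for pattern in PATTERNS:
        match = re.search(pattern, newsgroups)
        if match:
            return match.group(1)
    return None
```

For a Newsgroups: value of "comp.mail.mh,comp.lang.perl" this returns "comp.lang.perl", even though comp.mail.mh is listed first in the header, because the comp.lang.perl pattern is tried first.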

There is also a special macro %&, which lists (it's a comma separated list) all the selectors specified via a regular expression which indeed matched. For instance:

Re.*: /york/            { ASSIGN which '%&' };

would assign to which the list of all the fields matching the /Re.*/ pattern which contained 'york', be it a Received: field or a Resent-From: field (as both match the selector specification). Assuming both those fields contained the word york, the value of %& would be 'Received,Resent-From;' (the fields are alphabetically sorted).

Should you have more than one such selector within a single rule, then it might be worth knowing that all the sets of matching selectors are recorded within %&, each set terminated with a ';'. If a negated selector is used, then %& will record all the fields which did not contain the pattern, assuming the selection succeeded (otherwise nothing is recorded).
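To make the content of %& concrete, here is a Python sketch for the simple case of one non-negated selector. The matched_selectors function is a hypothetical name, header names are compared case-sensitively and the selector is applied to the whole field name; both are simplifying assumptions rather than documented mailagent behavior.

```python
import re


def matched_selectors(headers: dict, selector: str, pattern: str) -> str:
    # Collect, in alphabetical order, the names of the header fields that
    # match the selector and whose value contains the pattern.
    names = sorted(
        name
        for name, value in headers.items()
        if re.fullmatch(selector, name) and re.search(pattern, value)
    )
    # Each set of matching selectors is terminated with a ';'.
    # Nothing is recorded when the selection did not succeed.
    return ",".join(names) + ";" if names else ""


headers = {
    "Received": "from mail.york.ac.uk by mail.hp.com",
    "Resent-From": "someone@york.example",
    "Reply-To": "someone@example.org",
    "Subject": "Greetings from york",
}
which = matched_selectors(headers, r"Re.*", r"york")
```

Here which ends up holding "Received,Resent-From;". The Subject: field contains the word but does not match the selector, and Reply-To: matches the selector but does not contain the word.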

Available Actions

The following actions are available as filtering commands. Case is irrelevant although the recommended style is to spell them upper-cased. As explained later, most of the actions record their exit status in a special variable which may be tested via the -t and -f options of ABORT, REJECT and RESTART. For every command returning such an exit status, the failure or success conditions are given at the end of each description. If nothing is specified, then the command does not return a meaningful status.

ABORT [-tf] [mode]
Abort application of filtering rules immediately. See REJECT for the meaning of the optional parameters. (Does not modify existing status)
AFTER [-sanc] (time) action
Records a callback for after the specified time, where action will be performed. By default, a mailagent filtering action is assumed (-a option), on the current mail message. A shell command (-c) may be given instead, receiving the current mail message as standard input. Finally, a plain shell command may be run (with no input) using the -s option. The option -n may be used when the current mail message does not need to be kept for input. For instance:

AFTER -an (1 day) DO ~/process:proc'run(%u)

would call proc'run defined in the ~/process file in one day from now, without giving any input (the action here does not require any).

When running mailagent commands, the initial working mode is set to _CALLOUT_. This may matter if you call APPLY for instance. If the recorded time is less than or equal to the current time (which is now), the callback will occur when mailagent is done with the messages in its queue, before exiting. This allows for the following cute trick, found out by Randal Schwartz:

AFTER (now)             # fork a copy I can mangle
        STRIP Reply-To \; RESYNC \;
        ANNOTATE -du Reply-To %2 \; RESYNC \;
        NOTIFY message %r \; DELETE \;
        ;

Note that the command is not called AT because the call will only be performed at the next mailagent invocation after the specified time has elapsed. Dates are specified using the same format as in SELECT. (Fails if the action cannot be recorded in the callout queue).
ANNOTATE [-du] field value
Annotate message by adding field into the mail header, with the supplied value. This is like the MH command anno, but the annotation is performed at the end of the header, whereas MH does it at the top. Normally, an extra field is added, with the current date as field value.

This can be suppressed by using the -d option. If value is omitted, only the date field is generated (hence it is an error to use the -d option without supplying a value). As with all the commands which alter the header, a RESYNC is necessary for the filter part to actually see the new header.

The -u option means "unique", and prevents ANNOTATE from executing if the specified field is already present in the header. Don't forget to RESYNC between successive ANNOTATE commands using this option if the field refers to a previous ANNOTATE target. (Fails when no annotation takes place)

APPLY rulefile
Get the rules held in rulefile and apply them to the current message. The filter will begin in whatever mode you were in when using this command, but no feedback will occur, i.e. any mode change will be lost when returning from the command.

Variables (see the %# macro) are propagated back and forth through APPLY, meaning you see variables set by the caller, and you may change their values or create new variables for the caller to later use.

If mail is saved during the application of the rules, then the corresponding flag is set in the main filter (the one that started the APPLY command). You may nest them, of course. (Fails if mail is not saved by the rules held in rulefile)

ASSIGN var value
Assign the value to the user-defined variable var, which may further be accessed as '%#var' for macro substitution or #var in the TR and SUBST commands in place of the variable name. Note that there is no leading # in front of the variable name. The value you provide is first run through perl to see if it contains some arithmetic operations. If the evaluation is successful, the resulting value is used instead. If an error occurs in this evaluation process, then the literal value provided is used. To avoid the evaluation, you may enclose the whole value in single quotes. Those will be trimmed before the assignment takes place. If you actually want single quotes in the first AND last position, you have to double each of them. (Does not modify existing status)
BACK command
Execute command and take its output as new actions to be performed on the mail (hence performing something analogous to `command` in shell). If there is no output, nothing is done. BACK commands can be nested, although this may lead to surprises this manpage will not disclose (but I assure you it will be funny, assuming we have the same sense of humor... :-). Note that both the standard output and the standard error from the command are used.

If the command fails, the output is mailed back to the user and no action is performed. Furthermore, normal feedback does not occur here: any output from the command is taken as filter actions, which means the semantics of PASS, for instance, is changed: we do not take a body back but commands. (The execution status is that of the command)

BEEP [-l] count
This command may be used to tune the number of beeps emitted when biffing on the terminal, for each %a expansion. By default, that number is set to 1. Using the -l option alters the beep count locally for the rule. Otherwise, the default number is changed.

Note that this simply expands %a into the suitable amount of Ctrl-G characters. Your terminal must be allowed to issue consecutive bells for this to work. Very often, terminals are configured so that the first bell received disables further beeps for some period, to avoid cascades of bells. If you use xterm for instance, you should use:

xterm -xrm "XTerm*BellSuppressTime: 0"

to enable consecutive bells. Otherwise, xterm will swallow them for 200 ms, making the BEEP command appear ineffective. (Does not modify existing status)
BEGIN [-ft] state
Enter a new state. An explicit REJECT or RESTART is necessary to abort the processing of the current rule. The processing begins in the state INITIAL. If the -f (resp. -t) flag is specified, then the state change only occurs if the last command status indicated a failure (resp. a success). A state name can contain alphanumeric characters and underscores. (Does not modify existing status)
BIFF [-l] on|off|path
Allow or disallow biffing dynamically. When biffing is turned on via the configuration file or via this command, a message is printed on some of the terminals where the user is logged when mail is received, as explained under the section MAIL BIFFING.

Instead of on or off, you can specify a file name (~ substitution allowed) being the new path to be used for the biffing format template.

If you use the -l option, changes are made locally, for the duration of the rule only. If you REJECT to go to some other rule, your changes will be lost. The global value of the altered parameters is changed on the first local usage and restored when a new rule is entered. (Does not alter execution status)

BOUNCE address(es)
Bounce the message to the specified address(es) and act as if a save had been done. The only difference with FORWARD is that no Resent-like lines are added to the header. If an address is specified in double quotes, it is taken as the name of a file to be loaded to get addresses (one address per line, shell comments (#) allowed). The file name resolving is the same as the one used for pattern loading. (Fails if mail cannot be resent)
DO routine [(arg1, arg2, ... , argn)]
Calls the perl routine, with the supplied arguments if any. This is a very low level hook into mailagent's internals. The routine can be specified by itself (package'name, package being main by default), or identified by a leading tag, followed by a ':', then the routine name as before. The tag can be a path to a file where the routine is defined, or a command name (for user-defined commands which are loaded dynamically). For instance

DO UNKIT:newcmd'unkit('true')

would lookup the user-defined UNKIT command, load the file where it is defined (in the newcmd package), then call the routine with 'true' as argument. The package specified determines where the loading is done, so be sure it is consistent with the definition in the file where the routine is defined. (Fails if the routine cannot be located and executed)
DELETE
Delete the current message. Actually, this does not do anything, it just marks the mail as saved. If no further action involving saving is done, then the mail will never show up in the mailbox. (Never fails)
FEED [-be] program
Feed the whole message to a program and get the output back as the new message. Hence the program appears as a filter for the whole message. It does not tag the message as having been saved. A RESYNC is automatically done upon return. (Returns the status of program)

WARNING: Your program must be able to properly parse a MIME message and must deal with transfer-encoded bodies by itself. To make the program task simpler, you can supply the -b switch which will let mailagent decode the whole body for you, suppressing any Content-Transfer-Encoding header (implying "binary"). This is an invalid message format for sending the message, but it makes processing easier. You still have to parse the MIME parts yourself though.

Using -b does not prevent your program from outputting a valid message back, one that can possibly be sent on the network. You then have two options: either you do not supply any Content-Transfer-Encoding in the headers, and mailagent will recode the body for you using the initial transfer encoding present in the message (a relatively safe option if you only make changes in the body at well-defined spots without introducing 8-bit chars), or you supply the Content-Transfer-Encoding yourself and perform the body encoding manually.

To be completely safe and minimize the work in your program, the -e switch will let mailagent analyse the message body you are returning and select the proper transfer encoding automatically. Since this will cause the whole body to be analysed, and it can be potentially huge, that behaviour must be explicitly asked for. If you need -e then you probably want -b as well (you can supply both by saying -be naturally).

If you do not supply any switch, mailagent will give you the message as-is and will get your message as-is without any additional magic.

FORWARD address(es)
Forward mail to the specified address(es). This acts as if a save had been done, in order to avoid the DELETE. Usually when you forward a mail, you do not wish to keep it. The command adds Resent-like lines in the header. As for BOUNCE, file inclusion is possible (i.e. use an address "forward_list" to forward a mail to all the users listed in the file forward_list). (Fails if mail cannot be resent)
GIVE program
Give the body of the message to the specified program by feeding its standard input. Any output is mailed to the user who runs the mailagent. Note that the message is not tagged as having been saved. (Returns the status of program)

NOTE: If the message had a body that was encoded for transport (using one of the base64 or quoted-printable transfer encoding), mailagent will transparently decode it and supply a version that can be properly handled. In other words, the program does not need to care about the body being encoded in the message, as it will get a plain one. (Since no headers are supplied, this is the only possible option).

Caution though for MIME messages: you should use PIPE for them to give a chance to the program to properly handle the body, but then it needs to be fully MIME-aware.

KEEP header_fields_list
Keeps only the corresponding lines in the header of the mail. For instance, a "KEEP From To Cc Subject" will keep only the principal fields from the mail message. This is suitable for archiving mailing list messages. You may add a ':' after each header field name if you wish, but that is not strictly necessary. Headers may be specified using shell-style regular expressions, and file inclusion is allowed to get headers from a file. (Does not modify existing status)
LEAVE
Leave incoming mail in the system mailbox. This is the default action if no rule matched or if no saving occurred. This is not recommended on Debian systems. (Fails if mail cannot be saved)
MACRO [-rdp] name [= (value, type)]
Lets you specify user-defined macros, of the form %-(name). See the paragraph on user-defined macros for explanation about the available types (SCALAR, EXPR, CONST, FN, PROG, PROGC). A perl interface to the underlying user macros is available for your perl commands. The -r option is used to replace an existing macro (instead of pushing a new instance on the stack), the -d is to delete all the instances of a named macro (in that case it takes only the first argument), and -p pops the last instance of the macro from the stack and reverts to the previous definition, if any (otherwise, it acts as -d). If you wish to define a simple SCALAR macro, you may omit the = (value, type) part and simply continue with the macro value. (Does not modify existing status)
MESSAGE file
Send message file back to the sender of the message (as derived from the header of the message). The text of the message is run through the macro substitution mechanism (described later on). (Fails if message cannot be sent)
NOP [-ft]
No operation. If this seems a bit odd, think of it in terms of a ONCE command. (Does not alter existing status unless -f or -t is used, in which case it forces a false (failure) or true (success) status respectively)
NOTIFY file address(es)
Send a notification message file to a given address list. The text of the message is run through the macro substitution mechanism (described later on). As with FORWARD, file inclusion for address specification is possible. (Fails if message cannot be sent)
ON (day list) command
Execute the specified filter command only on the specified day list. That list is a space-separated list of days, specified using the English names. Only the first three characters are taken into account, case-insensitively. Therefore, the shortest valid day specifications are Mon, Tue, Wed, Thu, Fri, Sat and Sun.

This command can be used in conjunction with SELECT to do time-based selective bouncing of messages to, for instance, your home address:

ON (Mon Tue Wed Thu) SELECT (18:30 .. 23:00) BOUNCE [email protected];
ON (Fri) SELECT (18:30 .. 23:59) BOUNCE [email protected];
ON (Sat Sun) BOUNCE [email protected];

That would bounce messages only on week-ends and during the week, after 18:30, and until 23:00 (assuming that's bed time, other messages will be seen at work the next day). Note that on Fridays, we go as far as 23:59. (Propagates status from command. If the command is not executed, always return success)
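The way a day list is compared against the current day can be pictured with this Python sketch. The day_matches function and the DAYS table are hypothetical names; only the rule that the first three characters count, case-insensitively, comes from the description above.

```python
import datetime

# Index 0 is Monday, which is what date.weekday() returns for a Monday.
DAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]


def day_matches(day_list: str, when: datetime.date) -> bool:
    # Only the first three characters of each day name are significant,
    # and the comparison ignores case.
    wanted = {word[:3].lower() for word in day_list.split()}
    return DAYS[when.weekday()] in wanted
```

Under this reading, "Friday", "FRI" and "fri" all designate the same day, so day_matches("Friday", datetime.date(2024, 1, 5)) is true, that date being a Friday.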
ONCE (name, tag, period) command
Execute the specified filter command once per period. The name and tag fields are used to record timestamps of the last ONCE command. More on this later. (Propagates status from command. If the command is not executed, always return success)
PASS program
Feed the body of the message to the specified program and get a new body back from the output of the program. Note that the message is not tagged as having been saved. (Returns the status of program)

NOTE: If the message had a body that was encoded for transport (using one of the base64 or quoted-printable transfer encoding), mailagent will transparently decode it and supply a version that can be properly handled. The body generated by the program will then be automatically encoded back using the same transfer encoding.

Caution though for MIME messages: you should use FEED for them to give a chance to the program to properly handle the body, but then it needs to be fully MIME-aware.

PERL script [arguments]
Escape to a perl script to perform some actions on the message. This is fully described further in the manpage, and is very different from a RUN perl script command. (Returns failure if the script did not compile or returned a non-zero status).
PIPE [-b] program
Pipe the whole message to the specified program, but do not get anything back. Any output is mailed to the user who runs the mailagent. The message is not tagged as having been saved in any case, so you must explicitly DELETE it if piping was enough and it did not fail: "REJECT -f" is your friend here to avoid unwanted deletion. (Returns the status of program)

WARNING: Your program must be able to properly parse a MIME message and must deal with transfer-encoded bodies by itself. To make the program task simpler, you can supply the -b switch which will let mailagent decode the whole body for you, suppressing any Content-Transfer-Encoding header (implying "binary"). This is an invalid message format for sending the message, but it makes processing easier. You still have to parse the MIME parts yourself though.

POST [-lb] newsgroup(s)
Post the message to the specified newsgroup(s) after having cleaned-up the header: mail-related fields like Received: or In-Reply-To: are removed, a valid From: line is generated, the original To: and Cc: are renamed with an X- prefix, the References: line is updated/generated if necessary based on existing In-Reply-To, and NNTP-specific fields are stripped so that the server can add its own.

Running POST successfully acts as a saving.

If the first name is -l as in "POST -l comp.mail.mh", then a "Distribution: local" header is added to force a local delivery. Otherwise, the default inews distribution will be used (world, usually).

When the -b switch is given, a successful POST will result in biffing being activated (see section MAIL BIFFING) for the resulting news article.

If more than one newsgroup is specified, they should be space separated. It is possible to get a newsgroup list via file inclusion. (Fails if message cannot be posted)

PROCESS
Run the mailagent processing which looks for @SH commands and executes them. This was described before in the section dealing with default rules. The action associated by default to a mail having [Cc]ommand as its subject is PROCESS. (Always returns success)
PROTECT [-lu] mode
Sets the default protection mode that should be set on created folders (or created files when saving into an MH folder or a directory). By default, permissions are governed by the UMASK command, but this lets you override the default. The specified mode should be preceded by a 0 as in 0644 to give the familiar octal permissions. Otherwise, it is interpreted as a decimal number, so beware!

The -l option may be used to specify a mode locally for one rule. Otherwise, the protection mode is set globally. The -u option unsets the global (or local when combined with -l) mode, reverting to the default behaviour where only the umask is taken into account by the system.

Note that when saving into an MH folder, the PROTECT command takes precedence over the Msg-Protect field from your ~/.mh_profile file. (Does not alter execution status)
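The warning about the leading 0 is easy to underestimate, so here is a short Python sketch of the parsing convention described for the mode argument. The parse_mode function is a hypothetical helper written for illustration, not something taken from mailagent.

```python
def parse_mode(text: str) -> int:
    # A leading 0 means the familiar octal notation, as in 0644.
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)
    # Anything else is read as a plain decimal number.
    return int(text, 10)


intended = parse_mode("0644")  # rw-r--r--
mistaken = parse_mode("644")   # decimal 644, which is 1204 in octal
```

Forgetting the 0 therefore does not produce rw-r--r-- but the bits of octal 1204, which is almost certainly not the protection you had in mind.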

PURIFY program
Feed the header into a program and get new header back. RESYNC is done automatically upon return. This may be used to indeed purify the header by removing all the verbose stuff added by so many mail transport agents (X-400 like lines for instance). Obviously, this does not flag the message as having been saved. (Returns the status of program)

If your program removes the Content-Transfer-Encoding header in a MIME message, mailagent will properly transform the message to have a non-encoded body. If you change the value of the Content-Transfer-Encoding header, mailagent will also correctly recode the body for you. The only supported encodings are base64 and quoted-printable.
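The body recoding that follows a change of Content-Transfer-Encoding amounts to decoding with the old encoding and encoding again with the new one. The Python sketch below illustrates this with the standard base64 and quopri modules; the function names are hypothetical and this is not how mailagent itself is implemented (mailagent is written in perl).

```python
import base64
import quopri


def decode_body(body: bytes, encoding: str) -> bytes:
    # Undo the transfer encoding; anything else is passed through untouched.
    encoding = encoding.lower()
    if encoding == "base64":
        return base64.b64decode(body)
    if encoding == "quoted-printable":
        return quopri.decodestring(body)
    return body


def encode_body(raw: bytes, encoding: str) -> bytes:
    # Apply the requested transfer encoding; an empty or unknown encoding
    # leaves the body as is (the case where the header was removed).
    encoding = encoding.lower()
    if encoding == "base64":
        return base64.encodebytes(raw)
    if encoding == "quoted-printable":
        return quopri.encodestring(raw)
    return raw


def recode(body: bytes, old: str, new: str) -> bytes:
    # Header changed from 'old' to 'new': decode, then encode again.
    return encode_body(decode_body(body, old), new)
```

Passing an empty string as the new encoding models the case where the program removed the header altogether: the body simply comes back decoded.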

QUEUE
Queue mail again. A successful queuing counts as if mail has been saved. Mail queued that way will not be processed during the next 30 minutes. Note that unless mailagent is invoked on a regular basis by cron, the mail will remain in the queue until another mail arrives. (Fails when mail cannot be queued)
RECORD [-acr] [state] [(tag-list)]
Record message in the history and enter state _SEEN_ if the message was already present there. If the message is recorded for the first time, processing continues normally. Otherwise a REJECT is performed. This behavior may be somewhat modified by using some options. See UNIQUE for a complete description of the options and arguments. Naturally, when a state is specified, that overrides the default _SEEN_. A state name can contain alphanumeric characters and underscores.

When a tag-list (comma-separated list of names) is specified, the message is recorded and checked against those tags, and only those. Not specifying any tag list means any occurrence, whether it is tagged or not. See paragraph Using Tags in Record and Unique for more information. (Returns a failure status if mail was already recorded)

REJECT [-tf] [state]
Abort execution of current action, and continue matching. If -t is specified, the reject will occur only if the previous action was successfully completed (return status of true), whilst -f would cause the reject only when a failure occurred. If a state is specified, we enter that state before rejection. REJECT resets the matching flag, which means that if no further match occurs, the default action will apply. A state name can contain alphanumeric characters and underscores. (Does not alter execution status)
REQUIRE file [package]
Behaves like the perl require operator by loading a perl file into memory. By default, the file is read in the newcmd package, but you may specify whatever package you wish to load it in. This command will only perform the loading once per (file, package) tuple. Unlike its perl equivalent, the file "value" is not important, i.e. it does not have to end with a statement returning a true value. (Fails if file cannot be loaded)
RESTART [-tf] [state]
Abort execution of current action and restart the matching process from the beginning. To avoid loops, each rule may be walked through once in a given state. See REJECT for the meaning of the optional parameters. RESTART resets the matching flag, which means that the default action will apply, should no further match occur. (Does not alter execution status)
RESYNC
Re-synchronize header used for matching with the header of the mail. This is probably useful only when a SUBST or ANNOTATE command was run. (Does not alter execution status)

NOTE: At RESYNC time, mailagent will check whether the Content-Transfer-Encoding header was changed and will transparently recode the body if required, so that the whole message remains valid despite header mangling. It will also take care of updating Content-Length if required. Whenever you do change these important headers via SUBST or ANNOTATE, be sure to call RESYNC before disposing of the message or you run the risk of saving a corrupted version that will not be properly understood by your mail user agent.

RUN program
Run the specified program and mail any output to the user who runs mailagent. This action does not flag the message as having been saved. (Returns the status of program)
SAVE folder
Save message in the specified folder. If folder name starts with a '+', it is handled as an MH-style folder and rcvstore is emulated to deliver the message into that folder. If folder is a directory, message is delivered in a single file within that directory. See the FOLDERS section. (Fails if message cannot be saved)
SELECT (start .. end) command
Execute the command only within the time selection period specified. Dates can be specified in a wide range of formats. The output of the date(1) command is an example of a valid specification. If the date, the year or the month is missing, then the current one is substituted in place of it. The following dates are valid specifications: '10:04:25', 'now', 'April 1 1992', 'Dec 25', 'July 14 1789, 07:40' (err... it's valid according to the grammar, but it's before the Epoch so it does not mean anything). Other fancy dates like 'last month - 5 minutes' or '3 weeks ago' are also enabled. (Isn't that great to have a real parser? The filtering rules could have been more elaborate if only I had known about this Berkeley yacc producing a perl parser...). (Returns the status of command, if run, otherwise returns true).
SERVER [-t] [-d disabled commands]
Activate server processing. The body of the message is interpreted as a list of commands to execute. See section GENERIC MAIL SERVER for more information about the server itself. The -t option turns the server into trusted mode, where powers may be gained. The -d option must be followed by a list of disabled commands, separated by commas with no intervening spaces between them.
SPLIT [-adeiw] folder
Split a mail in digest format into the specified folder (same naming conventions as in SAVE). If no folder is specified, each digest item is queued and will be analyzed as a single mail by itself. The -d option deletes the digest header. The -i option means split is done in-place and the original mail is discarded. All the options may be used simultaneously provided they are stuck together at the beginning (option parsing being really rudimentary).

If the mail is not in digest format and a folder is specified, then it is saved in that folder. Otherwise, the SPLIT action fails and nothing occurs (the filter continues its processing though). The SPLIT command will correctly burst RFC-934 digest messages and will try to do its best otherwise. If the digest was not RFC-934 compliant and there is a chance SPLIT might have produced something incorrect, then the original message is also saved if -i, otherwise it is not tagged as saved (so that the default LEAVE command may apply). The -w (watch) option requests special care and will detect every non RFC-934 digest, even when the non-compliance is otherwise harmless; furthermore, any trailing garbage longer than 100 bytes will be saved as a digest item by itself.

The -a option annotates every digest item with an X-Digest-To: header line, which is the concatenation of the To: and Cc: fields of the original digest message. This may be used for instance to burst the digest into the queue and then re-process each of its items according to this added field. Finally, the -e option will discard the digest header only if its body is empty (i.e. the moderator did not include any leading comment). (Returns success if mail was in digest format and correctly split without any error)

STORE folder
Save message in the specified folder and leave a copy in the system mailbox. The folder parameter follows the same naming conventions as in SAVE. Again, because of locking issues, leaving mail in the mailbox is not recommended on Debian machines. (Fails if message cannot be saved either in the folder or in the mailbox)
STRIP header_fields_list
Remove the corresponding lines in the header of the mail. For instance, a "STRIP Newsgroups Apparently-To" will remove the appropriate lines to wipe out any Newsgroups: or Apparently-To: header. You may add a ':' after each header field name if you wish, but that is not strictly necessary. Headers may be specified via shell-style regular expressions or via "file" inclusion. (Does not alter execution status)
SUBST var/header expression
Applies the substitution expression to the specified user-defined variable (name starting with a #), back-reference (digit), or header field (optionally ending with ':'). For instance

SUBST #foo /w/y/g

would replace every w by y in the user-defined variable foo. See also ASSIGN and TR.

For substitutions on header fields, like:

SUBST Subject: /\[foo\]\s+//;

matching header lines will be reformatted when the substitution is successful, which likely means original continuations will not be preserved. The target of the substitution is the whole header, with continuations normalized to one space. You are therefore guaranteed to be independent from the actual header formatting in the original.

Do not forget to issue a RESYNC after a header field SUBST, since some routines (like POST) probe into the parsed header hash table to generate the saved message.

(Fails if error in expression)
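
The effect of normalizing continuations before substituting can be pictured with the following minimal sketch. It is plain perl run outside mailagent; the helper unfold_header and the sample subject are hypothetical and only illustrate the behaviour described above, not mailagent's actual code.

```perl
use strict;
use warnings;

# Hypothetical helper, not mailagent's code: unfold a header field so that
# each continuation (newline followed by spaces or tabs) becomes one space.
sub unfold_header {
    my ($raw) = @_;
    (my $flat = $raw) =~ s/\n[ \t]+/ /g;
    $flat =~ s/\n\z//;    # drop the final newline, if any
    return $flat;
}

# A Subject: value folded over two lines in the original message.
my $subject = "[foo]   weekly report for\n\tthe filter project\n";

my $value = unfold_header($subject);
$value =~ s/\[foo\]\s+//;    # same expression as in the SUBST example

print "$value\n";    # prints "weekly report for the filter project"
```

Because the expression is applied to the unfolded value, a pattern spanning the original line break would still match.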

TR var/header translation
Perform the translation on the specified variable, back-reference or header field. For instance

TR 1 /A-Z/a-z/

would canonicalize content of reference 1 into lowercase. Successfully transliterated headers are reformatted, even when their overall size is not changed. See also ASSIGN and SUBST. (Fails if error in translation)
UMASK [-l] mode
Changes the process's umask to the specified mode, which can be decimal, octal (if preceded by '0') or hexadecimal (starting with '0x'). The octal notation is the clearest way to specify the umask anyway. Aren't rumors saying that octal was invented for that purpose only? ;-) Use the -l option to change the umask for the duration of the current action rule only. Note that the default umask specified in your config file is used to reset mailagent's umask at the start of each mail processing. (Does not alter execution status)
UNIQUE [-acr] [state] [(tag-list)]
Record message in the history and tag message as saved if it was already present there. If the message is recorded for the first time, processing continues normally. Otherwise a REJECT is performed. If -r was used, a RESTART is used instead whilst -a would run an ABORT. For instance, to remove duplicate messages from mailing lists, run a UNIQUE -a before saving the mail. The -c option may be used alone to actually prevent the command from disturbing the execution flow, and to later use the return status to see what happened: UNIQUE returns a failure status if the message was already recorded. If an optional state argument is given, then the automaton will enter that state if the mail was previously in the database. See also RECORD, and the paragraph entitled Using Tags in Record and Unique for more information about the tag-list. (Fails if mail was already recorded)
VACATION [-l] on|off|path [period]
Allow or disallow a vacation message. When vacation mode is turned on via the configuration file, a message is sent whenever the user receives a mail meeting some requirements, as explained under the section VACATION MODE. One of the conditions is that the vacation flag modified by this command be true. This makes it easy to disallow vacation messages, ever, to a group of people for instance.

Instead of on or off, you can specify a file name (~ substitution allowed) being the new path to be used for locating the vacation file. Optionally, you may specify a last parameter, which will be taken as the period to apply when sending the vacation message. Changes to the vacation message path are forbidden when the configuration variable vacfixed is set to ON.

If you use the -l option, changes are made locally, for the duration of the rule only. If you REJECT to go to some other rule, your changes will be lost. The global value of the altered parameters is changed on the first local usage and restored when a new rule is entered. (Does not alter execution status)

WRITE folder
Write the message in the specified folder, removing any pre-existing folder with the same name. Hence, successive WRITE commands will overwrite the previous one. This is useful to store output of system commands run by cron. Don't try to use it with an MH folder or a directory folder or it will behave like SAVE. (Fails if message cannot be written)

Execution Status

Almost all the actions modify a variable which keeps track of the execution status (analogous to the $? variable in the shell). This variable can be tested via the -t or -f option of the REJECT command for instance. To give but a single example, the SAVE action would return failed if it could not save the mail in the specified folder. If that SAVE command was followed by a "REJECT -f FAILED", then the execution of the current rule would stop and the automaton would continue to analyze the mail in the FAILED state.

Some of the actions however do not modify this last execution status. Typically, those are actions which make decisions based on that status, or simply actions which may never fail. Those special actions are: ABORT, ASSIGN, BEGIN, KEEP, MACRO, NOP, REJECT, RESTART, RESYNC, STRIP and VACATION.

It is unfortunate that the ONCE and SELECT commands cannot tell the difference between a non-execution and a successful execution of the specified command. There may be a change in the way this scheme works, but it should remain backward compatible.

Perl Escape

By using the PERL command, you have the ability to perform filtering and other sophisticated actions directly in perl. This is really different from what you could do by feeding your mail to a perl script. First of all, no extra process is created: the script is loaded directly into mailagent and compiled in a special package called mailhook. Secondly, you have a perl interface to all the filtering commands: each filtering action is associated to a perl function (spelled lower-cased). Finally, some pre-defined variables are set for you by mailagent.

Before we go any further, please note that as there is no extra process created, you must not call the perl exit function. Use &exit instead, so that the exit may be trapped. &exit takes one argument, the exit code. If you use 0, this is understood as a success, any other value meaning failure (i.e. the PERL command will return a failure status). Using the perl exit function directly would kill mailagent and would probably incur some mail losses.

The scripts used should remain simple. In particular, you should avoid using the package directive or defining functions with a package name other than mailhook (i.e. the package where your script is loaded). Failure to do so may cause name clashes with mailagent's own routines. In particular, avoid the main package. Note that since the compilation environment is set up in mailhook, not specifying package names in your variables and subroutines is fine (in fact, it's meant to work that way).

Your script is free to do whatever it wants to the mail. Most of the time however, you end up using the mailagent primitives to save the mail or forward it (but you are free to redesign your own and call them instead, of course). The interface is simple: each function takes but one argument, a string holding the arguments to the command, if any. For instance, in a perl escape script, you would express:

{ SAVE list; FORWARD "users"; FEED ~/bin/newmail -tty; REJECT }

with:

&save('list');
&forward('"users"');
&feed('~/bin/newmail -tty');
&reject;

The rule is simple: each command is replaced by a function call, with the remaining parameters enclosed in a string, if any. Alternatively, you may specify parameters as a list: all the arguments you provide are joined into a big happy string, using a space character as separator. The macro substitution mechanism is then run on this resulting argument string.

Each function returns a boolean success status of the command (i.e. 1 means success). For those functions which usually do not modify the filter's last execution status variable, a success is always returned. This makes it possible to (intuitively) write:

&exit(0) if &save('uucp');
&bounce('root') || &save('emergency');

and get the expected result. The mail will be saved in the emergency folder only when saving in uucp folder failed and the mail could not be bounced to root.

It is important to understand that these commands have exactly the same effect on the filtering process when they are run from a perl escape script or from within the rule file as regular actions. A &reject call will simply abandon the execution of the current perl script and the filter automaton will regain control and attempt a new match. But perl brings you much more power, in particular system calls, control structures like if and for, raw regular expressions, etc...

The special perl @INC array (which controls the search path for require) is slightly modified by prepending mailagent's own private library path. This leaves the door open for future mailagent library perl scripts which may be required by the perl script. Furthermore, the following special variables are set up by mailagent before invoking your script:

@ARGV
The arguments of the script, which were given by the PERL command. This array is set up the exact same way you would expect it to be set up if you invoked the command directly from the shell, except that $ARGV[0] is the name of the script (since you cannot use perl's $0 to get at it; that would give you mailagent's name).
$address
The address part of the From: line.
$cc
The raw content of the Cc: line.
@cc
The list of addresses on the Cc: line, with comments suppressed.
$envelope
The mail envelope, as computed using the first From line of the message.
$friendly
The comment part of the From: line, if any.
$from
The content of the From: line, with address and comment part.
%header
This table, indexed by field name, returns the raw content of the corresponding header line. See below.
$msgpath
The full path name of the folder (or message within an MH folder) where the last saving operation has occurred. This is intended to be used if you wish to construct your own mail reception notification.
$length
The message length, in bytes.
$lines
The number of lines in the message.
$login
The login name of the address on the From: line.
$precedence
The content of the Precedence: line, if any at all.
@relayed
The list of host names (possibly raw IP addresses if no DNS mapping) listed in the (computed) Relayed: header line.
$reply_to
The e-mail address where a reply should be sent to, with comment suppressed.
$sender
The sender of the message (may have a comment), derived in the same way the Sender: line is computed by mailagent.
$subject
The subject of the message.
$to
The raw content of the To: line.
@to
The list of addresses on the To: line, with comments suppressed.

The associative array %header gives you access to all the fields in the header of the message. For instance, $to is really the value of $header{'To'}. The key is specified using a normalized case, i.e. the first letter of each word is uppercased, the remaining being lowercased. This is independent of the actual physical representation in the message itself.
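
The key normalization rule can be sketched as follows. This is a standalone illustration, not mailagent's own routine: the function name normalize_field is hypothetical, and treating '-' as the word separator is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: compute the normalized-case key for a header field
# name. Words are assumed to be separated by '-'.
sub normalize_field {
    my ($name) = @_;
    $name =~ s/:\z//;    # tolerate a trailing colon
    return join '-', map { ucfirst(lc($_)) } split /-/, $name;
}

print normalize_field('x-MAILER'), "\n";    # prints "X-Mailer"
print normalize_field('to:'), "\n";         # prints "To"
```

With such a rule, a script looks up $header{'X-Mailer'} whether the message spelled the field X-MAILER, x-mailer or X-Mailer.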

The pseudo keys Head, Body and All respectively give you access to the raw header of the message, the body and the whole message. The %header array is really a reference to the mailagent's internal data structure, so modifying the values will influence the filtering process. For instance, the SAVE command writes the Head, the X-Filter: line, the end of header (a single newline) and then the Body (this is an example only, not a documented feature :-). The =Body= key is special: it is a Perl reference to a scalar containing the body with any content transfer encoding removed.

Note that the $msgpath variable holds only a snapshot of the folder path at the time when the PERL escape was called. If you perform your own savings in perl, then you need to look at the $main'folder_saved variable instead to get the up-to-date folder path value.

As a final note, resist the temptation of reading the internals of the mailagent and directly calling the routines you need. If it is not documented in the manual page, it may be changed without notice by any further patch. (And this does not say that documented features may not change also... It's just more unlikely, and patches would clearly state that, of course.)

Program Environment

All the programs started by mailagent via RUN and friends inherit the following environment variables: HOME, USER and NAME, respectively set from the configuration parameters home, user and name. If the mailagent is invoked by the filter, then the PATH is also set according to the configuration file (if you are using the C filter) or to whatever you set PATH (if you are using the shell filter).

All the programs are executed from within the home directory. This includes scripts started via the PERL command and mail hooks. The latter will be described in detail further down.

File inclusion

Some commands like FORWARD or KEEP allow you to specify a file name between double quotes to actually load parameters from this file. Unless a full path is given, the following method is used to locate the file: first in the location pointed to by the mailfilter variable if set, otherwise in maildir and finally in the home directory. Note that this is not a search path in the sense that if mailfilter is defined and the file is not there, an error will be reported.

The file should list each parameter (be it an address, a header or a pattern) on a line by itself. Shell-style comments (#) are allowed within that file and leading white spaces are trimmed (but not trailing spaces).
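
The lookup order for included files can be sketched like this. It is a standalone model, not mailagent's code: the function name locate_include and the sample paths are hypothetical, and ~ expansion is left out.

```perl
use strict;
use warnings;

# Hypothetical model of the lookup rule: exactly one directory is chosen,
# there is no fallback if the file is missing from it.
sub locate_include {
    my ($file, %conf) = @_;
    return $file if $file =~ m{^/};    # full path: used as is
    my $dir = defined $conf{mailfilter} ? $conf{mailfilter}
            : defined $conf{maildir}    ? $conf{maildir}
            :                             $conf{home};
    return "$dir/$file";
}

my %conf = (maildir => '/home/ram/Mail', home => '/home/ram');
print locate_include('friends', %conf), "\n";    # prints "/home/ram/Mail/friends"
```

Note how maildir is never consulted once mailfilter is defined, which is why a file missing from the mailfilter directory is reported as an error.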

Macro Substitutions

All the commands go through a macro substitution mechanism before being executed. The following macros are available:

%%
A real percent sign
%A
The internet address extracted out of the From: field (a.b.c in user@a.b.c), converted to lower-case.
%C
CPU name on which mailagent runs. That is a fully qualified hostname with the domain name, e.g. lyon.eiffel.com.
%D
Day of the week (0-6)
%H
Host name (name of the machine on which the mailagent runs), without any domain name. Always in lower-case, regardless of the machine name.
%I
The internet domain name extracted out of the From: field (b.c in user@a.b.c), converted to lower-case.
%L
Length of the body part, in bytes, with content-transfer-encoding removed.
%N
Full name of the sender (login name if none)
%O
The organization name extracted out of the From: field (b in user@a.b.c), converted to lower-case.
%R
Subject of the original message with leading Re: suppressed
%S
Re: subject of original message
%T
Time of the last modification on mailed file (commands MESSAGE and NOTIFY)
%U
Full name of the user
%Y
Full year, with four digits (so-called yyyy format)
%_
A white space (useful to put white spaces in single patterns)
%&
List of selectors which incurred a match, among those specified via a regular expression such as 'X-*: /foo/i'. If we find the foo substring in the X-Mailer: header line, then %& will be set to that selector name (X-Mailer). Values in the list are comma separated.
%~
A null character, wiped out from the resulting string.
%digit
Value of the corresponding back reference from the last match.
%#var
Value of user-defined variable var
%=var
Value of the mailagent configuration variable var as specified in the ~/.mailagent file.
%d
Day of the month (01-31)
%e
The user's e-mail address (yours!).
%f
Contents of the "From:" line, something like %N <%r> or %r (%N) depending on how the mailer is configured.
%h
Hour of the day (00-23)
%i
Message ID, if available (otherwise, this is a null string)
%l
Number of lines in the message, once content-transfer-encoding has been removed
%m
Month of the year (01-12)
%n
Lower-case login name of sender
%o
Organization (where mailagent runs)
%r
Return address of message
%s
Subject of original message
%t
Current hour and minute (in HH:MM format)
%u
Login name of the user
%y
Year (last two digits)
%[To]
Value of the header field (here To:)

User-defined Macros

The mailagent lets you define your own macros in two ways: at the filter level via the MACRO command, or at the perl level in your own commands or perl actions.

Once defined, a user macro (say foo) can be substituted by using %-(foo). In the case of a single-letter macro, that can be optimized into %-f for instance, i.e. the parentheses can be omitted.

There are six types of macros:

SCALAR
A scalar value is given, e.g: red. The macro's value is the literal scalar value, no further interpretation is performed on the data.
EXPR
A perl expression will be evaluated to get the value, e.g: $red. Note that the evaluation will be performed within the usrmac package, so if you are referring to a variable in another package, it would be wise to specify it, as in $foo'bar.
CONST
It's really the same as EXPR, but the value is known to be a constant. So the first time a substitution is made, the expression will be evaluated, and then its result is cached.
FN
A perl function name (without the leading &), such as main'do_this. The function will be called with a single parameter: the name of the macro itself. That leaves the door open for further user-defined conventions by forcing evaluation through one single perl function.
PROG
A program to run to get the actual value. Only trailing newline is chopped, others are preserved. The program is forked each time. In the argument list given to the program, %n is expanded as the macro name we are trying to evaluate. If you specify that in the filtering rules, don't forget to escape the first %.
PROGC
Same as PROG really, but the program is forked only once and the value is cached for later perusal.

At the perl level, four functions let you manipulate and define your macros (all part of the usrmac package):

new(name, value, type)
Replace or create a %-(name) macro. For instance:

new('foo', "\$mailhook'header{'X-Foo'}", 'EXPR');

would create a new macro foo that would expand into the value of a hypothetical X-Foo header.
delete(name)
Delete all values recorded for the macro.
push(name, value, type)
Stack a new macro, creating it if necessary.
pop(name)
Remove last macro definition on the stack.

One macro stack is allocated for each macro, so that some kind of crude dynamic scoping may be implemented. Creating a macro via push is like taking a local variable in perl, while creating one by new is simply assigning to a variable. Likewise, pop is like exiting a block with a local variable definition and delete frees all the macros bearing that name, i.e. it deletes the whole stack.
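
The stack discipline can be pictured with the toy model below. It is a standalone sketch and not the usrmac package: the mac_* function names are hypothetical, only SCALAR values are handled, and the reading of new as "replace the current definition" is an assumption drawn from the analogy with assignment.

```perl
use strict;
use warnings;

# Toy model of the per-macro stacks; this is NOT the usrmac package itself.
my %stack;    # macro name => list of [value, type], newest definition last

sub mac_push {                      # like entering a block with a local
    my ($name, $value, $type) = @_;
    push @{ $stack{$name} }, [$value, $type];
}

sub mac_new {                       # like assigning to a variable
    my ($name, $value, $type) = @_;
    my $defs = $stack{$name} ||= [];
    pop @$defs;                     # replace the current definition, if any
    push @$defs, [$value, $type];
}

sub mac_pop {                       # like leaving the block
    my ($name) = @_;
    my $defs = $stack{$name} or return;
    pop @$defs;
    delete $stack{$name} unless @$defs;
}

sub mac_delete { delete $stack{ $_[0] } }    # drops the whole stack

sub mac_value {                     # value seen by %-(name), SCALAR type only
    my ($name) = @_;
    my $defs = $stack{$name} or return;
    return $defs->[-1][0];
}

mac_new('color', 'red', 'SCALAR');
mac_push('color', 'blue', 'SCALAR');
print mac_value('color'), "\n";     # blue: the pushed definition hides red
mac_pop('color');
print mac_value('color'), "\n";     # red: the previous definition is back
mac_delete('color');
print(defined(mac_value('color')) ? "defined\n" : "undefined\n");    # undefined
```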

At the filter level, the MACRO command has three options. By default, the command defines a new macro by using push, and the other options each let you access one of the other interface functions. Note that macro definitions persist across APPLY commands.

User-defined Logging

Most of the time when writing a new mailagent filtering command or a perl hook, you will have a need for specific logging, either to report a problem or to keep track of what you are performing.

Normally, logs are appended into the agentlog file by calling &main'add_log(string) (see subsection General Purpose Routines). For plain mailagent actions, this is fine.

But mailagent lets you define alternate logging files, referred to by name. This generic logging interface is defined in the usrlog package:

new(name, file, flag)
Records a new log file known as name and written to file. If the pathname given for this file is not absolute, it is rooted under the logdir directory. If flag is set to true, any logging done to this file will also be copied to the default system-wide logfile. Nothing is done if a logfile with the same name has already been defined.
delete(name)
Deletes the logfile known as name. Further logging done to that file is redirected to the default logfile.
main'usr_log(name, string)
Adds an entry to the logfile name. The default logfile is known as default and cannot be redefined nor deleted. Note that this function is available from the main package. Calling it with name set to the string 'default' is mostly equivalent to calling directly main'add_log with the notable exception that the -i mailagent option will not be honored in that case. This may or may not be useful to you.

If you call &main'usr_log with a non-existent logfile name, logging is redirected to the default system-wide logfile defined in your ~/.mailagent.

Dynamically Loading New Code

In your perl routines (user-defined commands, perl hooks, etc...), you may feel the need to dynamically load some new code into mailagent. You have direct access to the internal routine used by mailagent to implement the REQUIRE command or load your new filtering commands for example.

Using the so-called dynload interface buys you some extra features:

  • The mailagent public library path is automatically prepended to the @INC array, which lets you define your own system-wide or private perl library files (the private library path is defined by the perlib configuration variable, the public library path was defined at installation time).
  • Like perl's require, mailagent keeps track of which files were loaded into which packages and will not reload the same file in the same package twice.
  • It is possible to make sure that a specific function be defined in the loaded file, with an error reported if this is not the case.
  • You benefit from the default logging done by dynload when some error occurs.

In order to do all this, you call:

&dynload'load(package, file, function)


specifying the package into which you wish to load the file, and optionally the name of a function that must be defined once the file has been loaded (leave this field to undef if you do not have such a constraint). The routine returns undef if the file cannot be loaded (non-existent file, most probably), 0 if the file was loaded but contained a syntax error or did not define the specified function, and 1 for success.
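
Since the routine has three possible results, a caller must test for undef before testing for falsehood. The helper below is a hypothetical, standalone sketch of that dispatch; it does not call dynload itself, and the wording of the messages is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the three-way result described above into a
# message. The undef test must come first, since undef is also false.
sub load_status {
    my ($rc) = @_;
    return 'file could not be loaded'         unless defined $rc;
    return 'syntax error or missing function' unless $rc;
    return 'loaded';
}

print load_status(undef), "\n";    # prints "file could not be loaded"
print load_status(0), "\n";        # prints "syntax error or missing function"
print load_status(1), "\n";        # prints "loaded"
```

In a real hook, the argument would be the value returned by &dynload'load.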

Using Once Commands

The ONCE construct lets you specify a given command to be run once every period (day, week...). The command is identified by a name and a tag, the combination of the two being unique. Why not just a single identifier? Well, that would be fine, but assume you want to send a message in reply to someone once every week. You could use the e-mail address of the person as the command identifier. But what if you also want to send another message to the same address, this time once a month?

Here is a prototypical usage of a ONCE, which acts like the vacation program, except that it sends a reply only once a day for a given address:

{ ONCE (%r, message, 1d) MESSAGE ~/.message };

This relies on the macro substitution mechanism to send only once a day the message held in ~/.message. Do not use the tag vacation, unless you know what you are doing: this is the tag used internally by mailagent in vacation mode. Recall that no selector nor pattern is understood as "Subject: *", hence the rule is always executed because that pattern always matches.

The timestamps associated with each command are kept in files under the Hash directory. The name is used as a hashing key to compute the name of the file (the first two letters are used). Inside the file, timestamps are sorted by name, then by tag. Of course, you could say (inverting tag and name):

{ ONCE (message, %r, 1d) MESSAGE ~/.message };

but that would be likely to be less efficient, as the first hashing would be done on a fixed word, hence all the timestamps would be located in the file Hash/m/e (where Hash is the name of your hashing directory, which is the hash parameter in the configuration file).
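
The hashing of names into files can be sketched as below. This is a standalone illustration: the function hash_file is hypothetical, and how mailagent treats letter case or unusual characters in the name is not covered here.

```perl
use strict;
use warnings;

# Hypothetical helper: derive the timestamp file from the first two
# letters of the ONCE name, as in the Hash/m/e example above.
sub hash_file {
    my ($hashdir, $name) = @_;
    return join '/', $hashdir, substr($name, 0, 1), substr($name, 1, 1);
}

print hash_file('Hash', 'message'), "\n";           # prints "Hash/m/e"
print hash_file('Hash', 'ram@eiffel.com'), "\n";    # prints "Hash/r/a"
```

Using the sender's address as the name spreads the timestamps over many files, whereas a fixed word sends them all to the same one.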

Using Tags in Record and Unique

Both the RECORD and UNIQUE commands let you specify a comma-separated tag list between '(' and ')'. For each tag present in the list, there is a separate entry in the database associated with the message ID. When the message is recorded for at least one of the tags, the command "fails". Not specifying any tags means looking for any occurrence of that message ID, whether it is tagged or not.

This is very useful when receiving mail cross-posted to distinct mailing lists and you want to save one instance of the message in each folder, but still guard against duplicates. You may say:

To Cc: unix-wizards     {
        UNIQUE (wizards);
        SAVE wizards;
        REJECT;
};
To Cc: majordomo-users  {
        UNIQUE (majordomo);
        SAVE majordomo;
        REJECT;
};

and only one instance of the message will end up in each folder. When you have folders with conflicting interests, you might use a tag list, instead of a single tag. For instance, assuming you wish to keep a single copy for messages cross-posted to both dist-users and agent-users, but have a separate copy if also cross-posted to majordomo-users, then say:

To Cc: majordomo-users  {
        UNIQUE (majordomo);
        SAVE majordomo;
        REJECT;
};
To Cc: dist-users {
        UNIQUE (dist, agent);
        SAVE dist-users;
        REJECT;
};
To Cc: agent-users {
        UNIQUE (dist, agent);
        SAVE agent-users;
        REJECT;
};

If you have some rule using UNIQUE without any tags, it will match when at least one instance of the message has been recorded, no matter what tag (if any at all) was used in the first place.
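
The tag semantics can be modelled with the short standalone sketch below. It is not mailagent's database code: the function unique and the in-memory %db are hypothetical, and recording the message only when the call succeeds is an assumption.

```perl
use strict;
use warnings;

my %db;    # message ID => { tag => 1 }; the empty string stands for "no tag"

# Hypothetical model: returns 1 (and records the message) when it is new
# for all the given tags, 0 when it was already recorded (command "fails").
sub unique {
    my ($id, @tags) = @_;
    my $seen  = $db{$id} ||= {};
    my $known = @tags ? scalar(grep($seen->{$_}, @tags))
                      : scalar(keys %$seen);
    return 0 if $known;
    $seen->{$_} = 1 for (@tags ? @tags : (''));
    return 1;
}

print unique('<1@x>', 'wizards'), "\n";        # 1: first time for wizards
print unique('<1@x>', 'majordomo'), "\n";      # 1: separate entry for majordomo
print unique('<1@x>', 'wizards'), "\n";        # 0: duplicate for wizards
print unique('<1@x>', 'dist', 'agent'), "\n";  # 1: new for both tags
print unique('<1@x>', 'agent'), "\n";          # 0: already recorded under agent
print unique('<1@x>'), "\n";                   # 0: some instance was recorded
```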

Specifying A Period

The period parameter of the ONCE commands or the vacperiod parameter of your configuration file has the following format: a number followed by a modifier. The modifier is an atomic period like a day or a week, the number is the number of atomic periods the final period should be equal to. The available modifiers are:

m
minute
h
hour (60 minutes)
d
day (24 hours)
w
week (7 days)
M
month (30 days)
y
year (365 days)

All the periods are converted internally in seconds, although you do not really care... Examples of valid periods range from "1m" to "136y" on a 32-bit machine (why?).
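
The conversion, and a likely answer to the question above, can be worked out with this standalone sketch. The function period_seconds is hypothetical and only follows the modifier table; the explanation assumes the limit comes from storing seconds in an unsigned 32-bit integer.

```perl
use strict;
use warnings;

# Seconds per atomic period, following the modifier table above.
my %unit = (
    m => 60,
    h => 60 * 60,
    d => 24 * 60 * 60,
    w => 7 * 24 * 60 * 60,
    M => 30 * 24 * 60 * 60,
    y => 365 * 24 * 60 * 60,
);

# Hypothetical helper: convert "number + modifier" into seconds, or return
# nothing when the period is malformed.
sub period_seconds {
    my ($period) = @_;
    my ($count, $modifier) = $period =~ /^(\d+)([mhdwMy])$/ or return;
    return $count * $unit{$modifier};
}

print period_seconds('1w'), "\n";      # prints 604800
print period_seconds('136y'), "\n";    # prints 4288896000
print period_seconds('136y') < 2**32 ? "fits\n" : "overflows\n";    # fits
print period_seconds('137y') < 2**32 ? "fits\n" : "overflows\n";    # overflows
```

Under that assumption, 136 years is the largest whole number of years whose length in seconds stays below 2^32 (4294967296).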

Timeouts

In order to avoid having a mailagent waiting for a command forever, a maximum execution time of one hour is allowed by default. Past that amount of time, the child is sent a SIGTERM signal. If it does not die within the next 30 seconds, a SIGKILL is sent. Output from the program, if any so far, is mailed back to the user. This default behaviour may be altered by setting a proper runmax variable in your configuration file to allow more time for the command to complete.

There is also a filter queue timeout. In order to moderate system load, the C filter program waits 60 seconds by default (or whatever queuewait was set to in the config file) before launching mailagent. To avoid conflicts, messages queued by the first filter (which will then sleep for queuewait seconds) are not processed by mailagent's -q option until they are at least queuehold seconds old. Another queue-related parameter is queuelost, the number of seconds after which mailagent will flag messages as "lost" when listing the queue.

Finally, the locking timeout policy may also be configured. By default, a lock is broken when it is one hour old (configured by the lockhold variable) and mailagent will only make lockmax attempts, spaced by lockdelay seconds to acquire the lock. It will then proceed whether or not it got that lock. If you want a secure locking policy, make sure lockmax times lockdelay is greater than lockhold, that parameter being "large" enough.
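
The rule of thumb for a secure locking policy is a simple inequality, sketched below as a standalone check. The function name is hypothetical and the lockmax and lockdelay figures are made-up sample values, not mailagent's defaults; only the one-hour lockhold comes from the text.

```perl
use strict;
use warnings;

# Hypothetical check: the total time spent retrying must exceed the age at
# which a stale lock is broken, so that giving up never precedes breaking.
sub lock_policy_is_secure {
    my (%conf) = @_;
    return $conf{lockmax} * $conf{lockdelay} > $conf{lockhold} ? 1 : 0;
}

# Sample values only: 20 attempts 2 seconds apart against a one-hour hold.
print lock_policy_is_secure(lockmax => 20, lockdelay => 2, lockhold => 3600), "\n";    # 0
# 1900 attempts 2 seconds apart last 3800 seconds, longer than the hold.
print lock_policy_is_secure(lockmax => 1900, lockdelay => 2, lockhold => 3600), "\n";  # 1
```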

Avoiding Loops

The mailagent leaves an "X-Filter:" header on each filtered message, which in turn is used to detect loops. If a message already filtered is to be processed, the mailagent enters a special state _SEEN_. This state is special in the sense it is built-in, it is not matched by ALL, and some actions are not made available, namely: BACK, BOUNCE, FEED, FORWARD, GIVE, NOTIFY, PASS, PIPE, POST, PURIFY, QUEUE and RUN. Also note that although the ONCE and SELECT constructs are enabled, they will not let you execute disallowed commands. Otherwise, the _SEEN_ state behaves like any other state you can select or negate, so a <!_SEEN_> guard will not select the rule when we are in state _SEEN_.

The _SEEN_ state makes it easy to deal with mails which loop because of an alias loop over which you have no control. If no action is found in the _SEEN_ state, the mail is left in the mailbox, as usual. Moreover, if no saving is done, a LEAVE is executed. This is the normal behavior.

The "X-Filter:" header is only added when the message is saved. Actions such as PIPE or GIVE do not flag the message as being saved and therefore they do not add that header line. You can add one via ANNOTATE if you wish to prevent loops, in case the program to which you are feeding the message might return it to you in some strange way.

Message Files

The text of the message to be sent back (for MESSAGE or NOTIFY) is read from a file and passed through the macro substitution mechanism. The special macro %T is set to the date of last modification made on that file. The format is month/day, and the year is added before the month only if it differs from the current year.

At the head of the message, you may put header lines. Those lines will overwrite the default supplied lines. That may be useful to change the default subject or add some additional fields like the name of your organization. The end of your header is given by the first blank line encountered. If the top of the message you wish to send looks like a mail header, you may protect it by adding a blank line at the very top of the file. This dummy line will be removed from the message and the whole file will be sent as a body part.

Here is an example of a vacation file. We add a carbon copy as well as the name of our organization in the header:

Cc: ram
Organization: %o
Precedence: bulk

[Last revision made on %T]

Dear %N:

I've received your mail regarding "%R".
It will be read as soon as I come back from vacation.

Sincerely,
--
%U <%u@%C>

VACATION MODE

When it's time to take some vacation, it is possible to set up mailagent in vacation mode. Every vacperiod, the message vacfile will be sent back to the sender (with macro substitutions) if the user is explicitly listed in the To or Cc field and if the sender is not a special user (root, uucp, news, daemon, postmaster, newsmaster, usenet, Mailer-Daemon, Mailer-Agent or nobody). Matches are done in a case insensitive manner, so MAILER-DAEMON will also be recognized as a special user. Furthermore, any message tagged with a Precedence: field set to bulk, list or junk will not trigger a vacation message. This built-in behavior can of course be overloaded by suitable rules (by testing and issuing the vacation message yourself via MESSAGE).

Internally, mailagent uses a ONCE command tagged (%r, vacation, $vacperiod). This implies you must not use the vacation tag in your own ONCE commands, unless you know what you are doing.

Besides, the vacation message is sent only if no "VACATION off" commands were issued, or if another "VACATION on" overwrote the previous one. Note that whether a rule matched or not is irrelevant to the algorithm. By default, of course, the vacation message is allowed when the vacation configuration parameter is set to on.
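
The conditions listed in this section can be gathered into one predicate, as in the standalone sketch below. It is a model, not mailagent's code: the function vacation_allowed and its named parameters are hypothetical, matching Precedence case-insensitively is an assumption, and the once-per-vacperiod limit is left out.

```perl
use strict;
use warnings;

# Special senders listed above, stored in lower-case for matching.
my %special;
$special{ lc $_ } = 1 for qw(root uucp news daemon postmaster newsmaster
                             usenet Mailer-Daemon Mailer-Agent nobody);

# Hypothetical predicate. Parameters: flag (vacation flag), user (your
# login), to_cc (array ref of logins in To and Cc), sender (sender login),
# precedence (content of the Precedence: line, if any).
sub vacation_allowed {
    my (%m) = @_;
    return 0 unless $m{flag};                   # a VACATION off is in effect
    return 0 unless grep { lc($_) eq lc($m{user}) } @{ $m{to_cc} };
    return 0 if $special{ lc $m{sender} };      # tested after the flag
    return 0 if defined $m{precedence}
             && $m{precedence} =~ /^(?:bulk|list|junk)$/i;
    return 1;
}

my %mail = (flag => 1, user => 'ram', to_cc => ['ram', 'joe'], sender => 'joe');
print vacation_allowed(%mail), "\n";                           # 1
print vacation_allowed(%mail, sender => 'MAILER-DAEMON'), "\n";    # 0
print vacation_allowed(%mail, precedence => 'bulk'), "\n";     # 0
```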

If you are not pleased by the fact that a vacation message is sent to people who addressed you a carbon copy only, then you may write at the top of your rule file:

Cc: ram  { VACATION off; REJECT };

Of course, you have to substitute your own login name in place of ram. You cannot use the same scheme to allow vacation messages to special users like root, because the test for "specialness" occurs after the vacation mode flag. This is construed as a feature as it prevents stupid mistakes, like using r* instead of ram in the previous rule.

You may also want to setup a different vacation message, meant only for people in your organization given the sensitive nature of the information revealed ;-). A simple way of doing that is:

From: /^\w+$/, /^\w+@\w+$/, /^[\w.-]+@.*\.hp\.com$/i
        { VACATION ~/.hp_vacation 1w; REJECT HP };

Assuming the domain of my organization is .hp.com and that messages not bearing any domain are local messages, the above rule sets up the file ~/.hp_vacation, sent once a week, for all HP employees.

The VACATION command will not let you change the message path (but will allow frequency changes anyway) when the vacfixed configuration variable is set to ON. This is meant to be used in emergency situations, when only one vacation message will fit. For instance, when you are on a sick leave, a simple trigger message to your mailagent from home could change your ~/.mailagent configuration to force the ~/.i_am_sick message, regardless of what the various rules have to say. Actually, this is precisely why this feature was added, amazing... :-)

VARIABLES

The following variables are paid attention to: they may come from the environment or be set in the rule file:
mailfilter
indicates where loaded patterns are to be looked for, if the name of the file is not fully qualified. If it is not set, maildir will be used instead. If maildir is not set either, the home directory is used.
maildir
is the location of your mail folders. Any relative path is understood as starting from maildir. If it is not set, ~/Mail is used.

Those variables remain active while in the scope of the rule file. Should an alternate rule file be used (via a rules hook or the APPLY command), the current values are propagated to the new rule set unless overridden in the alternate rule file. In any case, the previous value is restored when control is transferred back to the previous set of rules. That is, those variables are dynamically rather than statically scoped.

AUTOMATIC ACKNOWLEDGMENTS

Anywhere in the mail, there can be an @RR left-justified line which will send back an acknowledgment to the sender of the mail. The @RR may optionally be followed by an address, in which case the acknowledgment will be sent to that address instead. In fact (but let's keep that a secret), this is a way for me to be able to see who runs my mailagent program and who doesn't...

The sendmail program usually implements such a feature via a Return-Receipt-To: header line, which sends the whole header back upon successful delivery. However, this is not implemented on all mail transport agents, and @RR is a good alternative :-).

NOTA BENE

Throughout this manual page, I have always written header fields with the first letter of each word uppercased, as in Return-Receipt-To. But RFC-822 does not impose this spelling convention, and a mailer could legally rewrite the previous field as return-receipt-to (and in fact so does sendmail in its own private mail queue files).

However, you must always specify the headers in what could be called a normalized case (for headers anyway). The mailagent will correctly recognize cc:, CC: or Cc: in a mail message and will allow you to select those fields via the normalized Cc: selector. In fact, it operates the normalization for you, and a cc: selector would not be recognized as such. Of course, no physical alteration is ever made on the header itself.

This is also true for headers specified in the STRIP or KEEP command. If you write STRIP Cc, it will correctly remove any cc: line. Likewise, if you use regular expressions to specify a selector, Re.*: would match both original received: and Return-path: fields, internally known through their normalized representation.
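The normalization described here can be pictured with a few lines of perl. This is only a sketch: the helper name normalize_header is made up for the example and is not a mailagent routine, and it assumes that normalizing simply means capitalizing each dash-separated word.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of mailagent): put a header field name into
# the normalized case used throughout this manual page.
sub normalize_header {
    my ($name) = @_;
    $name =~ s/:\s*$//;     # drop a trailing colon, if any
    # Lowercase each dash-separated word, then uppercase its first letter
    return join '-', map { ucfirst(lc($_)) } split /-/, $name;
}

print normalize_header($_), "\n" for qw(cc: CC return-receipt-to RECEIVED:);
```

With such a mapping, cc:, CC: and Cc: all end up under the single selector Cc, which is why the selector itself must be written in normalized form.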

MAIL HOOKS

The mail hooks allow mailagent to transparently invoke some scripts or perform further processing on the message. Those hooks are activated via the SAVE, STORE or LEAVE commands. Namely, saving in a folder whose executable bit is set triggers special processing. By default, the folder is taken as a program where the mail should be piped to. If the "folder" program returns a zero status, then the message is considered saved by the mailagent. Otherwise, all the processing attached to failed save commands is started (including emergency saving attempts). Executable folders provide a transparent way (from the rule file point of view) to deal with special kinds of messages.

In fact, five different types of hooks are available. The first one is the plain executable folder we have just spoken about. But in fact, here is what really happens when a saving command detects an executable folder: the mailagent scans the first line of the folder (in fact, the first 128 bytes) and looks for something starting with #: and followed by a single word, describing a special kind of hook. This is similar to the way the kernel deals with the #! hook in executable programs. If no #: is found or #: is followed by some garbage, then mailagent decides it is a simple program and feeds the mail message to this program. End of the story.

But if the #: token is followed (spaces allowed, case is irrelevant) by one of the following words, then special actions are taken:

rules
The file holds a set of mailagent rules which are to be applied. A new mailagent process is created to actually deal with those and the exit status is propagated back to the original mailagent.
audit
This is similar in spirit to what Martin Streicher's audit.pl package does, hence the name of this hook. The special variables which are set up by the PERL filter commands are initialized and the script is loaded in the special mailhook package name space, which also gives you an interface to the mailagent's own routines. You may safely use the exit function here, since an extra fork is done. This is the only difference between an audit and a perl hook.
deliver
Same thing as for the audit hook, but the standard output of your script is monitored by mailagent and understood as mailagent filtering commands. Upon successful return, a mailagent process will be invoked to actually execute those commands on the message. Again, this is similar in spirit to Chip Salzenberg's deliver package and gave the name of this hook.
perl
This hook is the same as audit but it is executed without forking a new mailagent, and you have the perl interface to mailagent's filtering commands. There is no difference with the PERL command, because it is implemented that way, by calling a mailagent and forcing the PERL command to be executed. This is similar in spirit to Larry Wall's famous perl language and it is responsible for the name of this hook :-).

As mentioned earlier in this manual page, the hook is invoked from within the home directory specified in your ~/.mailagent (which may differ from your real home directory, as far as mailagent or mailhook are concerned).

For those hooks which are finally run by perl, the special @INC array has mailagent's own private library path prepended to it, so that require first looks in this place.
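The way an executable folder is classified can be sketched as follows. This is not mailagent's own code: the function name hook_type and the 'program' return value are assumptions made for the illustration, which only follows the description given in this section (first line within the first 128 bytes, #: token, one word, spaces allowed, case ignored).

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from mailagent: classify the leading bytes
# of an executable folder according to the "#:" convention.
my %KNOWN = map { $_ => 1 } qw(rules audit deliver perl);

sub hook_type {
    my ($content) = @_;
    my $head = substr($content, 0, 128);      # only the first 128 bytes count
    my ($first) = split /\n/, $head, 2;       # ...and only the first line
    $first = '' unless defined $first;
    if ($first =~ /^#:\s*(\w+)\s*$/) {
        my $word = lc $1;                     # case is irrelevant
        return $word if $KNOWN{$word};
    }
    return 'program';                         # plain executable folder
}

print hook_type("#: rules\nSubject: test { LEAVE };\n"), "\n";
print hook_type("#!/bin/sh\ncat >> saved\n"), "\n";
```

Anything that does not carry a recognized word after #: falls back to the plain program case, matching the "End of the story" behaviour described earlier.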

FOLDERS

A folder is a file or a directory which can be the target of a delivery by the mailagent, that is to say the argument of SAVE-like commands.

Folder Format

By default, mails are written into folders according to the standard UNIX-style mailbox format: each mail starts with a leading From line bearing the sender's address and the date. However, by setting the mmdf parameter from the ~/.mailagent to ON, the mailagent will be able to save messages in MMDF format: each message is sandwiched between two lines of four Ctrl-A characters (ASCII code 1) and the leading From line is removed.
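To make the difference between the two layouts concrete, here is a small perl sketch that turns one UNIX-style mailbox entry into its MMDF form. The function name to_mmdf is invented for the example and mailagent's actual saving code may differ in its details.

```perl
use strict;
use warnings;

# Sketch only: convert one UNIX-style mailbox entry to the MMDF layout
# described above. The function name is made up for this example.
sub to_mmdf {
    my ($msg) = @_;
    $msg =~ s/\AFrom [^\n]*\n//;          # remove the leading From line
    $msg .= "\n" unless $msg =~ /\n\z/;   # make sure the message ends a line
    my $sep = "\001" x 4 . "\n";          # four Ctrl-A characters on one line
    return $sep . $msg . $sep;            # sandwich the message
}

my $unix = "From ram Mon Jan  1 00:00:00 1990\nSubject: hi\n\nbody\n";
print length(to_mmdf($unix)), " bytes in MMDF form\n";
```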

When MMDF mode is activated, each folder will be scanned to see if it is a UNIX-style or MMDF-style mailbox and the message will be saved accordingly. When saving to a new folder, the default is to create a UNIX-style mailbox, unless the mmdfbox configuration variable was set to ON, in which case the MMDF format prevails.

Note that the MMDF format is also the standard for MH packed folders, so by enabling the MMDF mode, you can actually deliver directly to those packed folders. The MH command inc is able to incorporate mail from either form anyway, i.e. it does not matter whether the folder is in UNIX format (also called UUCP-style) or in MMDF format.

MH-style folders are also supported. An MH folder is essentially a directory in which messages are stored in individual files. To save directly into an MH folder, simply prefix the folder name with '+', just as you would do with MH commands. The unseen sequences specified in your MH profile (the mhprofile parameter in your ~/.mailagent, default is ~/.mh_profile) will be correctly updated, as rcvstore would.

When the target folder is a directory, mailagent attempts the delivery in an individual numbered file. If a prefix file is present (config parameter msgprefix, default is .msg_prefix), its first line is used to specify the base name of the message, then a number is appended to give the name of the message file to use. That is, if there is no such file, the folder will look like an MH one, without any MH sequence file though.

Folder Compression

If you have one or more of the widely available file compression utilities such as compress or gzip in your PATH (as set up by ~/.mailagent), then you may wish to use folder compression to save some disk space, especially when you are away for some time and do not want to see your mail fill up the filesystem.

To achieve folder compression, you have to set up a file, referred to by the compress configuration variable. This file must list folder names, one per line, with blank lines ignored and shell-style (#) comments allowed. You may use shell-style patterns to specify the folders, and the match will be attempted on the full pathname of the folder (~ substitution occurs). If you do not specify a pattern starting with a leading '/' character, then the match will be attempted on the basename of the folder (i.e. the last component of the folder path). If you want to compress all your folders, then simply put a single '*' inside this file.
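For illustration, a compress file could look like the following. The folder names are hypothetical, and the sketch assumes that a pattern starting with ~ is expanded to a full path before matching, as the ~ substitution mentioned above suggests:

```
# Hypothetical folder names, for illustration only
# Matched against the basename: every folder starting with "list-"
list-*
# Matched against the full pathname (after ~ substitution)
~/Mail/archive/*
```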

Mailagent uses the filename extension to determine what compression scheme is used for a particular folder. The file referred to by the compspecs configuration variable (default is $spool/compressors) is used to define the commands that mailagent will use to perform the compress, uncompress, and cat operations for a particular extension.

The compressors file holds lines of the following form:

tag extension compression_prog uncompress_prog cat_prog


where:
tag
is the logical name for the compression scheme. This is typically the same as the name of the program used to provide the compression, but could be different for some unforeseen reason. This must be unique across all records in the file.
extension
is the extension to recognize as belonging to the specified tag. This must be unique across all records in the file.
compression_prog
is the name of the command to run to compress a folder. The program must replace the uncompressed file with the compressed one with the extension appended to the filename (like compress or gzip).
uncompress_prog
is the name of the command to run to uncompress a folder. The program must replace the compressed file with the uncompressed one without the extension (like uncompress or gunzip).
cat_prog
is the name of the command to output the uncompressed contents of a compressed folder to stdout (like zcat or gzcat).

The fields are separated by TABS to allow for the use of space characters in the command fields.

If the file referred to by the compspecs configuration variable cannot be accessed for whatever reason, a default entry is hard-wired into mailagent (knows about both compress and gzip programs):

compress <TAB> .Z <TAB> compress <TAB> uncompress <TAB> zcat
gzip <TAB> .gz <TAB> gzip <TAB> gunzip <TAB> gunzip -c

If you wish to add more compressors, you can copy the default compressors file from mailagent's private library directory and set up a correct entry for your alternate compressor. Keep in mind that the trailing extension needs to be unique amongst all the listed programs, since that extension is used to determine the type of compression performed on the folder.
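As an example of such an additional entry, a line for bzip2 might read as follows. This is not one of the hard-wired defaults; it assumes the bzip2, bunzip2 and bzcat programs are installed and reachable through your PATH:

```
bzip2 <TAB> .bz2 <TAB> bzip2 <TAB> bunzip2 <TAB> bzcat
```

Setting comptag to bzip2 would then select this entry for newly created folders.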

If the folder is created without any existing compressed form around, a default compressor is selected for you, as defined by the comptag configuration variable. That refers to the tag name of the compspecs file, i.e. the first word on the line (usually the name of the compression program, but not necessarily).

When attempting delivery, mailagent will check the folder name against the list of patterns in the compress file. If there is a match, the folder is flagged as compressed. Then mailagent attempts decompression if there is already a compressed form (i.e. the file has a recognized filename extension) and if no uncompressed form is present. Delivery is then made to the uncompressed folder. However, re-compression is not done immediately, since it is still possible to get messages to that folder in a single batch delivery. Should disk space become so tight that decompression of other folders is impossible, mailagent will re-compress the folders it has already uncompressed. Otherwise, it waits until the last moment.

If for some reason there is a compressed folder which cannot be decompressed, mailagent will deliver the mail to the plain folder. Further delivery to that folder will be faced with both a compressed and a plain version of the folder, and that will get you a warning in the log file, but delivery will be made automatically to the plain file.

On newly created folders the comptag configuration variable is referenced to determine the compression type to use for the folder.

MAIL BIFFING

If you are receiving and processing mail on your own machine, then you have access to local mail biffing where mailagent can warn you about new messages and tell you about where they have been saved, printing a small subset of the header and the first few lines of the body.

To use biffing, set the relevant biff parameters in your ~/.mailagent and make sure biff is set to ON. Actually, this is the only parameter you need to set to get minimal default biffing behaviour. Don't forget to run the shell command "biff y" on the terminals where you want to get notification (you may do that on several ttys, one for each virtual display for instance).

Upon mail reception and saving on a folder or posting to a newsgroup, mailagent locates all the ttys where you are logged on, then selects those where biffing was requested, finally emitting a message and making a beeping sound (if your terminal supports this and you are using the standard format--see below).

Customizing Biffing Output

Should the default format not suit your needs, you may customize the biffing message freely, setting the biffmsg parameter to point to the file where the format is stored. Standard macro substitutions will be performed on your message, the following macro set superseding and completing the standard set:

%-A
Same as writing %-H, new line, %-B
%-B
The body part of the biffing message, with content-transfer-encoding removed. If the message is a MIME multipart one, the text/plain part is shown. If only a text/html part is available, the HTML markup is stripped for biffing.
%-H
The header part of the biffing message. It shows only the From:, To:, Subject: and Date: headers, or whatever you have set the biffhead configuration variable to. All headers are shown as one line of text, regardless of their actual length. There will be three trailing dots at the end to signal that truncation occurred. For a news article (biffing after a POST -b), the To: and Cc: fields are never shown, even if specified in biffhead.
%-T
Same as %-B, but trimming is activated. The purpose of trimming is to remove any leading quotation in the message, to get only the most meaningful part. This assumes the quoting character is a single non-alphanumeric character. The leading attribution line that may introduce the quotation can also be removed, and a minimum length for the quotation can be set in the configuration file.
%B
The relative path under %d of the message folder, full path (%p) if not saved under that directory. The newsgroup name for news articles.
%D
The directory where the message is stored. If an MH folder, this is the folder full path. The home directory is replaced by a ~. Empty for news articles.
%F
The base name (last path component) of the message. For an MH message, this is the message number. Empty for news articles.
%P
The folder path. It has the correct semantics for MH and directory folders, i.e. it points to the folder directory itself. Otherwise, the same as %p.
%a
Alarm characters (^G). May expand to more than one under the control of the BEEP filtering command. Use %b if you only want a single bell.
%b
A beeping character (^G). As opposed to %a, this only expands to give one bell.
%d
Full path where folders such as the one being saved into are stored if not qualified (i.e. your MH path for MH folders, or something like ~/Mail for other folders). Empty for news articles.
%f
Folder where mail was saved, home replaced by ~ for short. For news, the newsgroup to which the article was posted.
%m
A '+' sign if the folder is an MH one, empty otherwise.
%p
The full path name (same as %f) of the message, but without any ~ shortcut. The newsgroup name for news articles.
%t
The type of message: usually "mail", but set to "article" for biffing after a POST command.

You can get the standard macro expansion by using %:f for instance, since the %f macro is superseded. The %: form lets you obtain the standard macro definition anyway, no matter what, so you don't have to remember whether a given macro is superseded in this context or not. Besides, it is safer since new macros may be added here without notice. Note that macros related to the message content all start with %- and therefore do not conflict with the standard ones.

Here is the format you need to use to get the same behaviour as the default hardwired format:

%b
New %t for %u has arrived in %f:
----
%-A
----%b

Note that the string ...more... appears at the end of the body when it has not been completely printed out on the screen and the remaining lines are not blank or similar.
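As a variation, a format that shows the header followed by the trimmed body could be written by expanding %-A into its two parts and using %-T instead of %-B. This is only a sketch, assumed to be stored in the file named by biffmsg:

```
%b
New %t for %u has arrived in %f:
----
%-H
%-T
----%b
```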

Trimming Leading Quotation

It is a standard practice, when replying to a message, to include an excerpt of the sentences being replied-to, using a non-alphanumeric character such as '>' to prefix quoted lines. Something like:

Quoting John Doe:
> This is quoted material.
> Another line from John's mail.
This is part of the reply to John.

The leading "Quoting ..." line, called the attribution line, is optional and may be missing or take another free form.

However, when biffing, this may be seen as useless noise, especially nowadays where people freely quote more and more in their replies. Since the biff message only shows the top lines of the message, it may be desirable to automatically trim those quoted lines.

Via the %-T macro in the customized biff format, you may request trimming of the leading quotation material, keeping the attribution line or not, and even replace trimmed material with a notification that so many lines have been removed.

All this customization is done from the ~/.mailagent configuration file, using the bifftrim, bifftrlen and biffquote variables.

You first need to turn trimming on by using a customized biff format using the %-T macro. By setting bifftrlen to 3, you may request that only quotations of at least 3 lines be trimmed. Turning bifftrim off will remove the trimming notification, whilst turning biffquote off will also strip the attribution line, when present.

For instance, assuming the following settings:

bifftrim : ON
bifftrlen: 2
biffquote: OFF

then the above example would produce the following biffing output (header of the message notwithstanding):

[trimmed 3 lines starting with a leading '>' character & attribution line]
This is part of the reply to John.

because the blank line following the quoted material is counted as being part of the quotation. The "[trimmed ..]" message can be turned off by setting bifftrim to OFF.

The trimming algorithm considers the first line of the body to see if it starts with a non-alphanumeric character. If it does, then all the following lines starting with that same character, and any blank lines, are removed, up to the first non-blank line starting with another character. Optionally, the first line (and that line only) is skipped if the second one starts with a non-alphanumeric character, and the first line is taken as being the attribution line.
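The steps above can be sketched in perl as follows. This is an illustration under stated assumptions, not mailagent's implementation: the function name trim_quote, its argument order and its return values are invented here, and an attribution line is only recognized when it starts with an alphanumeric character.

```perl
use strict;
use warnings;

# Sketch of the trimming steps; not mailagent's own code.
# Arguments: minimum quotation length, then the body lines.
# Returns: reference to the kept lines, number of trimmed quotation lines
# (blank lines included), and the attribution line if one was recognized.
sub trim_quote {
    my ($min_len, @body) = @_;
    return (\@body, 0, undef) unless @body;

    my ($start, $attribution) = (0, undef);
    # First line unquoted but second line quoted: take it as the attribution
    if (@body > 1 && $body[0] =~ /^[A-Za-z0-9]/
                  && $body[1] =~ /^[^A-Za-z0-9\s]/) {
        ($start, $attribution) = (1, $body[0]);
    }
    return (\@body, 0, undef) unless $body[$start] =~ /^([^A-Za-z0-9\s])/;
    my $quote = $1;                       # the quoting character

    # Swallow lines starting with the quoting character, and blank lines
    my $end = $start;
    $end++ while $end < @body
        && ($body[$end] =~ /^\s*$/ || index($body[$end], $quote) == 0);

    my $count = $end - $start;
    return (\@body, 0, undef) if $count < $min_len;   # quotation too short
    return ([ @body[$end .. $#body] ], $count, $attribution);
}

my ($kept, $count) = trim_quote(2, "> quoted", "> more", "", "My reply.");
print "trimmed $count lines, kept: @$kept\n";
```

The caller is left to decide whether to print the attribution line and a trimming notice, which is the role played by biffquote and bifftrim.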

Using Compact MH-style Biffing

The so-called MH-style biffing is a way of presenting a compacted body where all the lines are joined together into a big happy string with successive spaces turned into a single space character. To enable it, you need to set the biffmh variable to ON.

Since this compacting is output verbatim on the tty, line breaks will occur randomly and this may make reading difficult. You may request an automatic reformatting of the compacted body by turning biffnice to ON and the biff output will fit nicely within the terminal.

Unfortunately, it is not possible to customize the number of columns that should be used for formatting: since you may biff to any tty you are logged on, that would force mailagent to probe the tty for its column size, for each possible tty where output may go, and there is no reliable portable way of doing that. Sorry.
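The effect of the two settings can be imitated with a few lines of perl. This is only a sketch: the function name compact_body is invented, and the 40-column width is chosen arbitrarily for the demonstration (it says nothing about the width mailagent uses). Text::Wrap is a standard perl module.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

# Hypothetical helper: join all lines into one string, with every run of
# white space (newlines included) turned into a single space.
sub compact_body {
    my ($body) = @_;
    $body =~ s/\s+/ /g;
    $body =~ s/^ | $//g;      # no leading or trailing space
    return $body;
}

my $compact = compact_body("Hello,\n\n  this   is\na test.\n");
print "$compact\n";                       # what biffmh alone would give

$Text::Wrap::columns = 40;                # arbitrary width for the example
print wrap('', '', $compact), "\n";       # reformatted, in the spirit of biffnice
```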

EXTENDING FILTERING COMMANDS

Once you've reached the expert level, and provided you have a fair knowledge of perl, you may feel the need for more advanced commands which are not part of the standard set. This section explains how you can achieve this dynamically, without the need of diving deep inside the source code.

Once you have extended the filtering command set, you may use those commands inside the rule file as if they were built-in. You may even choose to redefine the standard commands if they do not suit you (however, if you wish to do that, you should know exactly what you are doing, or you may start losing some mail or get an unexpected behavior -- this also voids your warranty :-).

The ability to provide external commands without actually modifying the main source code is, I believe, a strong point in favor of having a program written in an interpreted language like perl. That holds, of course, only once you have convinced yourself that it is a Good Thing to customize and extend a program in the same language as the one used for the core, which usually means a fairly low-level language with fewer user-friendly hooks.

Overview

In order to implement a new command, say FOLD, you will need to do the following:

  • Write a perl subroutine to implement the FOLD action and put that into an external file. Say we write the subroutine fold and we store that in a fold.pl file. This is naturally the difficult part, where you need to know some basic things about mailagent internals.
  • Choose where you want to store your fold.pl file. Then check the syntax with perl -c, just to be sure...
  • Edit the newcmd file (as given by the configuration file) to record your new command. Then make sure this file is tightly protected. You must own it, and it should not be writable by any other individual but you.
  • Additionally, you may want to specify whether FOLD is to modify the existing execution status and whether or not it will be allowed within the special _SEEN_ state.
  • Write some rules using the new FOLD command. This is the easy part! Note that your command may also be used within perl hooks as if it were a builtin command (this means there is an interface function built for you within the mailhook package).

In the following sections, we're going to describe the syntax of the newcmd file, and we'll then present some low-level internal variables which may be used when implementing new commands.

New Command File Format

The newcmd file consists of a series of lines, each line describing one command. Blank lines are ignored and shell-style comments introduced by the sharp (#) character are allowed.

Each line is formed by 3 principal fields and 2 optional ones; fields are separated by spaces or tabs. Here is a skeleton:

<cmd_name> <path> <function> <status_flag> <seen_flag>

The cmd_name is the name of the command you wish to add. In our previous example, it would be FOLD. The next field, path, tells mailagent where the file containing the command implementation is located. Say we store it in ~/mail/cmds/fold.pl. The function field is the name of the perl function implementing FOLD, which may be found in fold.pl. Here, we named our function fold. Note that if your function has its name within the newcmd package, which is the default behavior if you do not specify any, then there is no need to prefix the function name with the package. Otherwise, you must use a fully qualified name.

The last two fields are optional, and are boolean values which may be specified by true or yes to express truth, and false or no to express falsehood. If status_flag is set to true, then the command will modify the last execution status variable. If seen_flag is true, then the command may be used when the filter is in _SEEN_ state. The default values are respectively true and false.

So in our example, we would have written:

FOLD  ~/mail/cmds/fold.pl  fold  no  yes

to allow FOLD even in _SEEN_ state and have it executed without modifying the current value of the last-command-status variable.

Writing An Implementation

Your perl function will be loaded when needed into the special package newcmd, so that its own name-space is protected and does not accidentally conflict with other mailagent routines or variables. When you need to call the perl interface of some common mailagent functions, you will have to remember to use the fully qualified routine name, for instance &mailhook'leave to actually execute the LEAVE command.

(Normally, in PERL hooks, there is no need for this prefixing since the perl script is loaded in the mailhook package. When you are extending your mailagent, you should be extra careful however, and it does not really hurt to use this prefixing. You are free to use the perl package directive within your function, hence switching to the mailhook package in the body of the routine but leaving its name in the newcmd package.)

Since mailagent will dynamically load the implementation of your command the first time it is run, by loading the specified perl script into memory and evaluating it, I suggest you put each command implementation in a separate file, to avoid storing potentially unneeded code in memory.

Each command is called with one argument, namely the full command string as read from the filter rules. Additionally, the special @ARGV array is set by performing a shell-style parsing of the command line (which will fail if quotes are mismatched, but then you can do the parsing by yourself since you get the command line). At the end of your routine, you must return a failure status, i.e. 0 for success and 1 to signal failure.

Those are your only requirements. You are free to do whatever you want inside the routine. To ease your task however, some variables are pre-computed for you, the same ones that are made available within mail hooks, only they are defined within the newcmd package this time. There are also a few special variables which you need to know about, and a set of standard routines you may want to call. Please avoid calling something which is not documented here, since it may change without prior notice. If you would like to use one routine and it is not documented in this manual page, please let me know.

Each command is called from within an eval construct, so you may safely use die or call external library routines that use die. If you use require, be aware that mailagent is setting up a special @INC array by putting its private library path first, so you may place all your mailagent-related library files in this place.
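Putting these requirements together, a fold.pl file could start from a skeleton like the one below. It is a sketch under several assumptions: this manual page does not say what FOLD actually does, so the body only validates an optional numeric width (with an arbitrary default of 72) and returns a status; it also assumes the command string begins with the command name, and it parses that string itself with the standard Text::ParseWords module instead of relying on @ARGV.

```perl
use strict;
use warnings;

package newcmd;

use Text::ParseWords qw(shellwords);

# Hypothetical implementation of the FOLD example command.
# Receives the full command string and returns 0 on success, 1 on failure.
sub fold {
    my ($cmd) = @_;
    # Assumption: the string starts with the command name, e.g. "FOLD 72"
    my ($name, @args) = shellwords($cmd);
    return 1 unless defined $name;        # empty line or mismatched quotes

    my $width = @args ? $args[0] : 72;    # 72 is an arbitrary default
    unless ($width =~ /^\d+$/ && $width > 0) {
        # A real command would report this through mailagent's logging
        warn "FOLD: invalid width '$width'\n";
        return 1;
    }

    # ... the actual processing of the message would go here ...
    return 0;
}

1;   # a file loaded by perl should end with a true value
```

Running perl -c on such a file, as suggested in the overview, catches syntax errors before the command is registered in the newcmd file.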

Special Variables

The following special variables, made available directly within the newcmd package, are pre-set by the filter automaton and are used to control the filtering process. Some of them are marked read-only, meaning you should not modify them (and indeed you cannot):

$mfile
The base name of the mail file being processed. This variable is read-only. It is mainly used in log messages, as in [$mfile] to tag each log, since a single mailagent process may deal with multiple messages.
$ever_saved
This is a boolean, which should be set to 1 once a successful saving operation has been completed. If at the end of the filtering, this variable is still 0, then the default LEAVE will be executed.
$folder_saved
The value of that variable governs the $msgpath convenience variable set for PERL escapes. It is updated whenever a message is written to a file, to hold the path of the written file.
$cont
This is the continuation status, a variable of the utmost importance when dealing with the control flow. Four constants from the main package can be used to specify whether we should continue with the current rule ($FT_CONT), abandon current rule ($FT_REJECT), restart filtering from the beginning ($FT_RESTART) or simply abort processing ($FT_ABORT). More on this later.
$lastcmd
The last failure status recorded by the last command (among those which do modify the execution status). You should not have to update this by yourself unless you are implementing some encapsulation for other commands, like BACK or ONCE, since by default $lastcmd will be set to the value you return at the end of the command.
$wmode
This records the current state of the filter automaton (working mode), in a literal string form, typically modified by the BEGIN command or as a side effect, as in REJECT for instance.

All the special variables set-up for PERL escapes are also installed within the newcmd package. Those are $login, %header, etc... You may peruse them at will.

Other variables you might have a need for are configuration parameters, held in the ~/.mailagent configuration file. Well, the rule is simple. The value of each parameter param from the configuration file is held in variable $cf'param. Variable $main'loglvl is the copy of $cf'level, since it's always shorter to type in $'loglvl after each call to the logging routine &add_log.

There is one more variable worth knowing about: $main'FILTER, which is the suitable X-Filter line that should be appended in all the mail you send via mailagent, in order to avoid loops. Also, when you save mails to a folder, it is wise to add this line in case a problem arises: you may then identify the culprit.

Rule Environment

An action might have a legitimate desire of altering the environment for the scope of one rule only, reverting to the previous value when exiting the rule. Or you might want to change the value forever.

When we speak about altering the environment, we refer to the one set up via the configuration file, whose values end-up in the cf package. Well, some of those variables are copied in the env package before filtering of a message starts (under the control of the @env'Env array).

All rules should then refer to the version in the env package, and not in the cf package, to see alterations. Global changes are made by assigning directly to the variable in the env package, while local changes are requested by calling the &env'local routine.

For instance, the cf'umask value is copied as env'umask because umask is held in @env'Env. Global changes are made by setting that copy directly, while local changes may be made with:

        &env'local('umask', 0722);

to set up a new local value. The first time &env'local is called on a variable, its value is saved somewhere, and will be restored upon exiting the scope of the rule. Then the new value is assigned to the variable.

Variables requiring a side effect when their value is changed (such as the umask variable, which requires a system call to let the kernel see the change) may specify it by accessing the %env'Spec array, the key being the name of the variable requiring a side effect, the value being interpreted as a bit of perl code run once the original value is restored. For instance, we say somewhere (in &env'init):

        package env;
        $Spec{'umask'} = 'umask($umask)';

to update the kernel view when leaving scope. Note that the side effect is evaluated once the variable has recovered its original value, and within the env package.
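The save-once, restore-on-exit behaviour described above can be tried outside mailagent. What follows is a minimal stand-alone sketch and not mailagent's actual code: the EnvSketch package and its local_set and restore routines are hypothetical names, the side effect is stored as a code reference rather than a string of perl code, and it only records a message instead of calling umask.

```perl
#!/usr/bin/perl
# Stand-alone sketch of per-rule environment scoping; NOT mailagent's code.
# EnvSketch, local_set and restore are hypothetical names for illustration.
use strict;
use warnings;

package EnvSketch;

our %Value = (umask => 077);    # current values, as seen by the rules
our %Saved;                     # originals, saved on the first local change
our %Spec;                      # side-effect code, keyed by variable name
our @Trace;                     # records side effects so they can be observed

# Side effect to run once 'umask' has recovered its original value.
# A real hook would call umask(); here we only record the restored value.
$Spec{umask} = sub {
    push @Trace, sprintf("umask restored to %04o", $Value{umask});
};

# Change a variable for the scope of the current rule only.
sub local_set {
    my ($name, $new) = @_;
    $Saved{$name} = $Value{$name} unless exists $Saved{$name};   # save once
    $Value{$name} = $new;
}

# Undo every local change, then run the side effect of each restored variable.
sub restore {
    for my $name (sort keys %Saved) {
        $Value{$name} = delete $Saved{$name};
        $Spec{$name}->() if $Spec{$name};
    }
}

package main;

EnvSketch::local_set('umask', 022);   # first call saves 077
EnvSketch::local_set('umask', 027);   # later calls keep the saved original
EnvSketch::restore();                 # back to 077, then the side effect runs
print "$_\n" for @EnvSketch::Trace;   # prints "umask restored to 0077"
```

As in the text, the side effect sees the variable only after it has recovered its original value, which is why the hook reads the value back instead of receiving it as an argument.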

Internally, the &analyze_mail routine calls &env'setup before starting its processing to initialize the env package, and &env'cleanup at the end before returning. Before running the actions specified on a rule match, &apply_rules calls &env'restore to ensure a coherent view of the environment while running the actions for that particular rule.

Altering Control Flow

When you want to alter control flow to perform a REJECT, a RESTART or an ABORT, you have three choices. If you wish to control that action via an option, the same way the standard UNIQUE does (with -c, -r or -a), you may call &main'alter_execution(option, state) giving it two parameters: the option letter and the state you wish to change to before altering the control flow.

You may also want to directly alter the $wmode and $cont variables, but then you'll have to do your own logging if you want some. Or you may call low-level routines &main'do_reject, &main'do_restart and &main'do_abort to perform the corresponding operation (with logging).

Remember that the _SEEN_ state is special and directly handled at the filter level, and the filter begins in the INITIAL state. The default action is to continue with the current rule, which is why there is no routine to perform this task.

The preferred way is to invoke the mailhook interface functions, &mailhook'begin, &mailhook'reject, etc..., and that will work even if you redefine those functions yourself. Besides, that's the only interface which is likely not to be changed by new versions.

General Purpose Routines

The following is a list of all the general routines you may wish to call when performing some low-level tasks. Note that this information is version-dependent. Since I document them, I'll try to keep them in new versions, but I cannot guarantee I will not have to slightly change some of their semantics. There is a good chance you will never have to worry about that anyway.

&header'format(rfc822-field)
Return a formatted RFC822 field to fit in 78 columns, with proper continuations introduced by eight spaces.
&header'normalize(rfc822-header-name)
Normalize case in RFC822 header and return the new header name with every first letter uppercased.
&header'reset
This is part of an RFC822 header validation, mainly used when splitting a digest. This resets the recognition automaton (see &header'valid).
&header'valid(line)
Returns a boolean status, indicating if all the lines given so far to this function since the last &header'reset are part of a valid RFC822 header. The function understands the first From line which is part of UNIX mails. At any time, the variable $header'maybe may be checked to see if so far we have found at least one essential mail header field.
&main'acs_rqst(file)
Perform a .lock locking on the file, returning 0 on success and -1 on failure. If an old lock was present, it is removed (time limit set to one hour). Use &main'free_file to release the lock.
&main'add_log(string)
Add the string to the logfile. The usual idiom is to postfix that call with the if $'loglvl > value, where value is the logging level you wish to have before emitting that kind of log ($'loglvl is a short form for $main'loglvl).
&main'free_file(file)
Remove a .lock on a file, obtained by &main'acs_rqst. It returns 0 if the lock was successfully removed, -1 if it was a stale lock (obtained by someone else).
&main'header_found(file)
Scan the head of a file and try to determine whether there is a mail header at the beginning or not. Return true if a header was found.
&main'history_record
Record the message ID of the current message and return 0 if the message had not been previously seen, 1 if it is a duplicate.
&main'hostname
Return the value of the hostname, lowercased, with possible domain name appended to it. The hostname is cached, since its value must initially be obtained by forking. (see also &main'myhostname)
&main'internet_info(email-address)
Parse an e-mail internet address and return a three-element array containing the host, the domain and the country part of the internet host. For instance, if the address is user@d.c.b.a, it will return (c, b, a).
&main'login_name(email-address)
Parse the e-mail internet address and return the login name.
&main'macros_subst(*line)
Perform in-place macro substitution (line passed as a type glob) using the information currently held in the %main'Header array. Do not pass *_ as a parameter, since internally macros_subst uses a local variable bearing that name to perform the substitutions and you would end up with an unmodified version. If you really want to pass *_, then you must use the returned value from macros_subst which is the substituted text, but that's less efficient than having it modified in place.
&main'makedir(pathname, mode)
Make directory, creating all the intermediate directories needed to make pathname a valid directory. Has no effect if the directory already exists. The mode parameter is optional, 0700 is used (octal number) if not specified.
&main'myhostname
Returns the hostname of the current machine, without any domain name. The hostname is cached, since its value must initially be obtained by forking.
&main'run_command(filter-command)
Execute the single filter command specified and return the continuation status, which should normally be assigned to the $cont variable. You will need this routine when trying to implement commands which encapsulate other commands, like ONCE or SELECT.
&main'seconds_in_period(period)
Return the number of seconds in the period specified. See section Specifying A Period to get valid period strings.
&main'shell_command(program, input, feedback)
Run a shell command and return a failure status (0 for OK). The input parameter may be one of the following constants (defined in the main package): $NO_INPUT to close standard input, $BODY_INPUT to pipe the body of the current message, $MAIL_INPUT to pipe the whole mail as-is, $MAIL_INPUT_BINARY to pipe the whole mail after having removed any content transfer-encoding and $HEADER_INPUT to pipe the message header. The feedback parameter may be one of $FEEDBACK or $NO_FEEDBACK depending whether or not you wish to use the standard output to alter the corresponding part of the message. If no feedback is wanted, the output of the command is mailed back to the user. The $FEEDBACK_ENCODING is handled like $FEEDBACK but will tell mailagent to look at the best suitable body encoding when the input is the whole message.
&main'parse_address(rfc822-address)
Parse an RFC822 e-mail address and return a two-element array containing the internet address and the comment part of that address.
&main'xeqte(filter-actions)
Execute a series of actions separated by the ';' character, calling run_command to actually perform the job. Return the continuation status. Note that $FT_ABORT will never be returned, since mailagent usually stops after having executed one set of actions, only continuing if it saw a RESTART or a REJECT. What ABORT does is skip the remaining commands on the line and exit as if all the commands had been run. You could say xeqte is the equivalent of the eval function in perl, since it interprets a little filter script and returns control to the caller once finished, and ABORT is perl's die.

You may also use the three functions from the extern package which manipulate persistent variables (already documented in the section dealing with variables) as well as the user-defined macro routines.

Example

Writing your own commands is not easy, since it requires some basic knowledge regarding mailagent internals. However, once you are familiar with that, it should be relatively straightforward.

Here is a small example. We want to write a command to bounce back a mail message to the original sender, the way sendmail does, with some leading text to explain what happened. The command would have the following syntax:

SENDBACK reason


and we would like that command to modify the existing status, returning a failure if the mail cannot be bounced back. Since this command actually sends something back, we do not want it to be executed in the _SEEN_ state. Here is my implementation (untested):

sub sendback {
        local($cmd_line) = @_;
        # Arguments following the command name form the reason
        local($reason) = join(' ', @ARGV[1..$#ARGV]);
        unless (open(MAILER, "|/usr/lib/sendmail -odq -t")) {
                &'add_log("ERROR cannot run sendmail to send message")
                        if $'loglvl;
                return 1;
        }
        # Header lines (including the X-Filter line, to avoid loops),
        # then a blank line, then the body
        print MAILER <<EOF;
From: mailagent
To: $header{'Sender'}
Subject: Returned mail: Mailagent failure
$main'FILTER

  --- Transcript Of Session
$reason
  --- Unsent Message Follows
$header{'All'}
EOF
        close MAILER;
        $ever_saved = 1;        # Don't want it in mailbox
        $? == 0 ? 0 : 1;        # Failure status
}

Assuming this command is put into ~/mail/cmds/sendback.pl, the line describing it in the newcmd file would be:

SENDBACK  ~/mail/cmds/sendback.pl  sendback  yes  no

Now this command may be used freely in any rule, and will be logged as a user-defined command by the command dispatcher. Who said it was not easy to do? :-)

Note the use of the $ever_saved variable to mark the mail as saved once it has been bounced. Indeed, should the SENDBACK action be the only action to be run, we do not want mailagent to LEAVE the mail in the mailbox because it has never been saved (this default behavior being a precaution only -- better safe than sorry).

Conclusion

If along the way you imagine some useful commands which could be made part of the standard command set, please e-mail them to me and I'll consider integrating them. In the future, I would also like to provide a standard library of perl scripts to implement some weird commands which could be needed in special cases.

Note that you may also use the information presented here inside the perl escape scripts. Via the require operator, it is easy to get the new command implementation into your script and perform the same task. You may need to set up @ARGV yourself if you rely on that feature in your command implementation.

Command extension can also be viewed as a way to reuse some other perl code, the mailagent providing a fixed and reliable frame and the external program providing the service. One immediate extension would be mailing list handling, using this mechanism to interface with some mailing list management software written in perl.

GENERIC MAIL SERVER

One nice thing about mailagent is that it provides you with the basic tools to implement a generic mail server. Indeed, via the SERVER command, you can process a mail message, extract and then execute some predefined commands. For instance, you may implement an archive server, or a mailing list manager, etc...

The major limitation currently is that only plain commands are accepted, or commands taking some additional info as standard input or equivalent. There is no notion of modes, with separate command sets for each mode or limited name-space visibility, at least for now, so it is not easy (albeit possible) to implement an ftpmail server, for instance, since this implies the notion of mode.

Overview

In order to implement a mail server command (say send file, which would send an arbitrary file from the file system in a separate mail message), you need to do the following:
  • Think about the command from a security point of view. Here, the command we want to implement is a potentially dangerous one since it can give access to any file on the machine the individual running mailagent has access to. So we want to restrict that command to a limited number of trusted people, who will be granted the power to run this command. More on this later.
  • Choose whether you want to implement the command in perl or in another programming language. If you do the latter, your command will be known as a shell command (i.e. a command runnable directly from a shell), while in the former case, you have the choice of making it appear as a shell command, or have it hooked to the mailagent in which case it is known as a perl command. In that last case, your command will be dynamically loaded into mailagent with all the advantages that brings you. Here, we are going to write our command as a shell script.
  • Write the command itself. That's the most difficult part in this scheme. Later on, we will see a straightforward implementation of the send command.
  • Edit the comserver file (defined in your ~/.mailagent) to record your new command. Then make sure this file is tightly protected. You must own it, and be the only one allowed to modify it.
  • Additionally, you may want to hide some of the arguments in the session transcript (more on this later), allow the command to take a flow of data as its standard input, assign a path to the command, etc... All those parameters take place in your comserver file.
  • Start using the command... which of course is the nicest part in this scheme!

In the following sections, we'll learn about the syntax of the comserver file, what powers are, how the session transcript is built, what the command environment is, etc...

Builtin Commands Overview

The mail server has a limited set of builtin commands, dealing with user authentication and command environment settings. User authentication is password based and is not extremely strong since passwords are specified in clear within the mail message itself, which could be easily intercepted.

The server maintains the notion of powers. One user may have more than one power at a time, each power granting only a limited access to some sensitive area. A few powers are hardwired in the server, but the user may create new ones when necessary. Those powers are software-enforced, meaning the command must check for itself whether it has the necessary power(s) to perform correctly.

Powers are protected by a password and a clearance file. Having the right password is not enough: you have to be cleared in order to (ab)use it. The clearance file is a list of e-mail address patterns, using the shell metacharacters scheme, someone being cleared if and only if his e-mail address matches at least one of the patterns from the clearance file. The more use you make of metacharacters, the weaker this clearance scheme will be, so be careful.
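To make the clearance check concrete, here is a minimal stand-alone sketch of shell-style pattern matching against an address. It is not mailagent's implementation: glob_to_regexp and is_cleared are hypothetical helpers, only the * and ? metacharacters are handled, and the comparison is assumed to be case-sensitive.

```perl
#!/usr/bin/perl
# Sketch of shell-style clearance matching; NOT mailagent's implementation.
# glob_to_regexp and is_cleared are hypothetical helpers.
use strict;
use warnings;

# Translate a shell pattern into an anchored perl regexp.
# Only '*' and '?' are treated as metacharacters in this sketch.
sub glob_to_regexp {
    my ($pattern) = @_;
    my $re = join '', map {
        $_ eq '*' ? '.*' : $_ eq '?' ? '.' : quotemeta($_)
    } split //, $pattern;
    return qr/\A$re\z/;
}

# True if the address matches at least one pattern of the clearance list.
sub is_cleared {
    my ($address, @patterns) = @_;
    for my $pattern (@patterns) {
        return 1 if $address =~ glob_to_regexp($pattern);
    }
    return 0;
}

my @clearance = ('login@*domain.top');
for my $address ('login@host.domain.top',
                 'login@otherdomain.top',
                 'login@domain.org') {
    printf "%-24s %s\n", $address,
        is_cleared($address, @clearance) ? 'cleared' : 'refused';
}
```

The second address is cleared as well, because * also swallows the "other" prefix: a small demonstration of how metacharacters weaken the scheme.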

Your commands and the output resulting from their execution are normally mailed back to you as a session transcript. For security reasons, passwords are hidden from the command line. Likewise, failure to get a power will not indicate whether you lacked authorization or whether your password was bad.

A user with the system power is allowed to create new powers, delete other powers, change power passwords, and list, remove or change power clearances. This is a rather important power, which should be held by a small number of users with very strict clearance (no meta-characters in the address, if possible). A good password should also protect that power.

However, a user with the system power is not allowed to directly get another power without specifying its password and being allowed to do so by the associated clearance file. But it would be possible to achieve that indirectly by removing the power and creating a new one bearing the same name. In order to control people with the system power and also for some tricky situations, there is another more god-like power: the root power.

A user with the root power can do virtually anything, since it instantly grants that individual all the powers available on the server (except security). The only limitation is that root cannot remove the root power alone. One needs to specify the security password (another hardwired power) in order to proceed. Needless to say, only one individual should have both root and security clearance, and only one individual should know the security password and be listed in the clearance file. The system power cannot harm either of those two powers. If need be, more than one user could have the root power, but do not grant that lightly...

Getting the root power is necessary when system has messed with the system configuration in a hopeless way, or when a long atomic sequence of commands has to be issued: root is not subject to the maximum number of commands that can be issued in one single message.

In case you think this mailagent feature is dangerous for your account, do not create the root and security powers, and do not write any sensitive commands.

Builtin Commands Definition

Now let's have a look at those builtin commands. Passwords of sensitive commands will be concealed in the session transcript. Some commands accept input by reading the mail message up to the EOF marker, which is a simple EOF string on a line by itself (analogous with shell's here documents).

addauth power password
Add users to clearance file for power. If the power password is given, no special power is needed, otherwise the system power is required. For root or security powers, the corresponding power is required, or the password must be specified. The command reads the standard input up to the EOF marker to get the new users.
approve password command
Records the password in the command environment, then executes the command. If a power is required and not yet obtained, the command will look for the password in the environment and try to get the relevant power using that password. Hence, an approved command (with proper password) will transparently execute without the hassle of requesting the power, issuing the command and then releasing the power. It is up to the command to perform the approve password test by looking at the approve variable in the command environment (see below). Since clearance checks (such as those performed when requesting a power) are not performed, no sensitive command should ever deal with the approve construct.
delpower power password [security]
Delete a power from the system, and its associated clearance list. The system power is required to delete most powers except root and security. The security power may only be deleted by itself and the root power may only be deleted when the security password is also specified.
getauth power password
Get current clearance file for a given power. No special power is required if the password is given or the power is already held. Otherwise, the system power is needed for all powers but root or security, where the corresponding power is mandatory.
newpower power password [alias]
Add a new power to the system. The command then reads the standard mail input until the EOF marker to get the power clearance list. The system power is required to create a new power, unless it's root or security: The security power is required to create root and the root power is required to create security.
passwd power old new
Change power password. It does not matter if you already hold the corresponding power, you must give the proper old password. See also the password command.
password power new
Change power password. The corresponding power is required, or you have to get the system power. To change the root or security passwords, you need the corresponding power.
power name password
Ask for a new power. Of course, root does not need to request any other power but security, much less give any password. This command is not honored when the server is not in trusted mode, unbeknownst to the user: the error message in the transcript file is no different from the one obtained with an invalid password.
powers regexp
List all the powers matching the perl regular expression, along with their respective clearance file. The system power is required to get the list. The root or security power is required to get access to the root or security information, respectively. If no arguments are given, all the powers are listed.
release power
Get rid of some power.
remauth power password
Remove users from clearance file, getting the list by reading the standard mail input until the EOF marker. This command does not require any special power if the proper password is given or if the power is already held. Otherwise, the system power is needed. For root and security clearance, the corresponding power is needed as well.
set variable value
Set the variable to the corresponding value. Useful to alter internal variables like the EOF marker value, or change some command environment. The user may define his own variables for his commands. For flag-type variables, a value of on, yes or true sets the variable to 1, any other string sets it to 0 (false). Used all by itself as set, the list of all the defined variables along with their respective values is returned.
setauth power password
Replace power clearance file with one obtained from standard mail input up to the EOF mark. The system power is needed unless you specify the proper password or the power is already yours. As usual, root or security clearances can only be changed when the power is held.
user [e-mail [command]]
Execute command by assuming the e-mail identity specified. Powers are lost while executing the command. The e-mail identity may be checked by the command itself, which may impose further restrictions on the execution, like getting user-defined powers. Note that this command only modifies the global environment, and that it's up to the command implementation to make use of that information. If no command is specified, the new identity is assumed until changed by another user command and all the powers currently held by the user are released. If no e-mail address is given, the original user ID is restored.
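As an illustration of the syntax above, the body of a message sent to a server running in trusted mode might look as follows. The power name list, both passwords and the address pattern are made up for the example; only the command names and argument order come from the definitions above.

```
power system sys-pass
newpower list list-pass
login@*domain.top
EOF
getauth list list-pass
release system
```

Here newpower reads its clearance list from the lines that follow it, up to the EOF marker, and both passwords would be concealed in the session transcript.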

Command Environment

There are six types of commands and variables that can be specified in server mode. Two of them, end and help types are special and handled separately. Two types var and flag refer to variables and the last two types perl and shell refer to commands.

Whenever mailagent fires a server command, it sets up an environment for that command: if it is a perl-type command, then a set of perl variables are set before loading the command; if it is a shell-type command, some environment variables are initialized and file descriptor #3 is set up to point directly to the mailagent session transcript.

A shell-type command is forked, whilst a perl-type command is loaded directly in mailagent within the cmdenv package. This operates much like the PERL filtering command, only the target package differs and a distinct set of variables is preset.

Some commands collect additional data up to an end-of-file marker (by default the string EOF on a line by itself) and those data are fed to shell commands via stdin and to perl commands via the @buffer variable set up in the environment package named cmdenv (in which the command is loaded and run).

If you define your own variables (types var or flag), you may use the builtin set command to modify their values. Note that no default value can be provided when defining your variable. A suitable default value must be set within commands making use of them, with the advantage that different default values may be used by different commands.
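The two conventions just described (flag values as interpreted by set, and defaults supplied by the command itself) can be sketched as follows. This is a stand-alone illustration, not mailagent code: flag_value and with_default are hypothetical helpers, and matching the words on, yes and true exactly, in lower case, is an assumption.

```perl
#!/usr/bin/perl
# Sketch of flag interpretation and per-command defaults; illustration only.
# flag_value and with_default are hypothetical helpers.
use strict;
use warnings;

# 'on', 'yes' or 'true' give 1, any other string gives 0.
# Assumption: the words are compared exactly, in lower case.
sub flag_value {
    my ($text) = @_;
    return $text =~ /\A(?:on|yes|true)\z/ ? 1 : 0;
}

# A command picks its own default when the variable was never set.
sub with_default {
    my ($value, $default) = @_;
    return defined $value ? $value : $default;
}

my %vars;                                   # user-defined variables, unset so far
print with_default($vars{verbose}, 0), "\n";   # prints 0 (command's default)
$vars{verbose} = flag_value('yes');
print with_default($vars{verbose}, 0), "\n";   # prints 1
```

Because the default lives in the command, two commands sharing the same variable may each fall back to a different value.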

The following environment variables are defined. Most are read-only, unless noted otherwise, in which case the builtin set command may be used on them.

approve
The approve password for approve commands, empty if not within a builtin approve construct.
auth
A flag set to true when a valid envelope was found in the mail message. When this flag is false, the server cannot be put in trusted mode.
cmd
The command line, as written in the message.
collect
Internal flag set to true while collecting input from a here-document. It is normally reset to false before calling the command.
debug
True when debug mode is activated (may be set).
disabled
A comma separated list of disabled commands, with no space between them. This is initialized when the SERVER command is invoked and the -d option is used.
eof
The current end-of-file marker for here-document commands. By default set to 'EOF' (may be changed).
errors
Number of errors so far.
jobnum
The job number assigned to the current mailagent.
log
What was logged in the transcript, with some args possibly concealed.
name
The command name.
pack
Packing mode for file sending (may be set).
path
Destination address for file sending or notification (may be set).
powers
A colon (:) separated list of powers the user currently has successfully requested and got.
requests
Number of requests processed so far.
trace
True when shell commands want to be traced in transcript (may be set).
trusted
True when server is in trust mode, where powers may be gained. This is activated by the -t option of the SERVER command, provided a valid mail envelope was found.
uid
Address of the sender of the message, where transcript is to be sent. By extension, the real user ID for the server, which is the base of the power clearance mechanism.
user
The effective user ID, originally the same as the uid, but may be changed via the user builtin command.

Session Transcript

A session transcript is mailed back automatically to the user who requested a server access. This transcript shows the commands run by the user and their status: OK or FAILED. Between those two lines, the transcript shows any output explicitly made by the command to the transcript. Typically, the transcript may be used to forward error messages back to the user, but even commands executing correctly may want to issue an explicit message, stating what has just been done.

A perl command may access the transcript via the MAILER file handle, defined in the cmdenv package, whilst a shell command may access it via its file descriptor #3.
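A shell-type command does not have to be a shell script: any program that can write to file descriptor #3 will do. The following is a minimal sketch of such a command written in perl. The message text is made up, transcript_handle is a hypothetical helper, and the fallback to standard error when descriptor 3 is not open is a convenience for testing by hand, not something mailagent requires.

```perl
#!/usr/bin/perl
# Sketch of a shell-type server command written in perl; illustration only.
# transcript_handle is a hypothetical helper and the message is made up.
use strict;
use warnings;

# Return a handle on the session transcript (file descriptor 3), or a
# duplicate of STDERR when descriptor 3 cannot be used, e.g. when the
# command is run by hand from a shell.
sub transcript_handle {
    my $fh;
    return $fh if open($fh, '>&=', 3);
    open($fh, '>&', \*STDERR) or die "cannot dup STDERR: $!";
    return $fh;
}

my $transcript = transcript_handle();
my $file = @ARGV ? $ARGV[0] : '(no file given)';
print {$transcript} "request for $file has been queued\n";
```

Run from an interactive shell, the message goes to the terminal; appending 3>transcript.txt to the command line sends it to that file instead, which is a simple way to check what the user would see.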

Note that the session transcript is mailed to the sender of the message, i.e. whoever the envelope header line says it is. As far as the server is concerned, this e-mail address is used as the user ID, just like a plain login name can be thought of as the user id. For sensitive commands, authentication based on that information is really weak. A more "secure" authentication is provided by the server powers, which is password-based. Unfortunately, the clear password has to be transmitted in the message itself and could be eavesdropped.

Recording New Commands and Variables

Server commands and variables are recorded in the comserver file defined in your ~/.mailagent. The file is a table, with items on a row separated by tab characters. Each line defines one command or variable. Any irrelevant field may be entered as a single '-' (minus) character. The format allows for shell-style (#) comments.

Each row has the following fields:

name type hide collect-data path extra


where:
name
is the name of the command or variable as recognized by the server.
type
is one of perl, shell, var, flag, help or end.
hide
indicates which arguments in the command are to be hidden (the command name being argument zero) in the session transcript. Use '-' if no arguments need to be hidden. Typically, this is used to hide clear passwords in commands. If more than one argument has to be hidden, then a list of numbers separated by a ',' (comma) may be specified, with no spaces between them. For instance '2,4' would hide arguments 2 and 4 in the transcript.
collect-data
is a flag (specify as either 'y' or 'n', but you may use complete words 'yes' or 'no') indicating whether the command collects additional data in a here-document until the EOF marker. Alternatively, you may specify '-' in place of 'n'.
path
specifies the path of the command (~name substitution allowed). If not relevant (e.g. when defining a variable) or when you want to leave it blank, use '-'. If a blank path is specified for a perl or shell command, then the implementation of that command is expected to be found in servdir, as defined in ~/.mailagent. If the command name is cmd for instance, then perl commands are expected there in a file named cmd or cmd.pl, whereas shell commands are expected to be found in a cmd or cmd.sh file. Note that a command is disabled if it cannot be located at the time the comserver file is parsed.
extra
is any extra parameter needed for the command. Unlike other fields, this should be left blank if not needed. Anything up to the end of the line is grabbed by this field. Perl commands should specify the name of the perl function to call to execute the command; if none is specified, the name of the command itself is called. Shell commands may use that field to supply additional options, which will be inserted right after the command name and before any other user-supplied arguments. Others should leave this alone.
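Putting the fields together, a small comserver file might look like the sketch below (fields are separated by single tabs). All the names, paths and the do_store function are made up for the example: quit ends processing, verbose is a user-defined flag, send is a shell command, and store is a hypothetical perl command whose second argument is hidden in the transcript and which collects data up to the EOF marker.

```
# name	type	hide	collect	path	extra
quit	end	-	-	-
verbose	flag	-	-	-
send	shell	-	n	~/mail/cmds/send.sh
store	perl	2	y	~/mail/cmds/store.pl	do_store
```

The extra field is simply left empty on the first three rows, as recommended above.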

Special Command Types

There are currently two special command types.

The simplest is the end type. This is used to specify commands which may end the server processing. By default, processing continues until the end of the file is reached or a signature delimiter '--' is found. For instance, you may wish to define the command quit and give it the end type. As soon as the server reaches that command, it aborts processing and discards the remainder of the message.

The help type is usually attached to a help command and prints help on a command basis, help for each command being stored under the helpdir variable (defined in your ~/.mailagent) in a file bearing the same name as the command itself. For example, assuming a command shoot, its help file would be expected in helpdir/shoot. If no file is found there, mailagent looks in its public library (/usr/share/mailagent) for a help file. Help is provided only when the help file exists and is not zero-sized.

Creating the Root Power

In order to bootstrap the server, you need to create the root power. All the other powers may then be created by using the server interface, which ensures consistency and logs your actions. If you don't plan to use powers at all, you may skip this section.

First, you need to pick a good password for the root power. Someone with the root power can do virtually anything with the server, so be careful. Let's assume you choose root-pass as a password.

Edit passwd (defined in your ~/.mailagent) and add the following line:

root:<root-pass>:

i.e. enter the password in clear between '<' and '>'. It won't stay in that form for long, but this is the easiest way to bootstrap it. Protect the passwd file tightly (read-write permissions only for you). Then create a powerdir/root file, protect it the same way and add your e-mail address to it, on a line by itself. That must be the address that will show up in the From: line of your mails. Since clearance files support shell-style patterns, you may use login@*domain.top to allow mails from your login from any machine in your domain.

You are almost done. Now simply issue the following command:

mailagent -i -e 'SERVER -t'

and feed its standard input with:

From your e-mail address
From: your e-mail address

power root root-pass
password root root-pass
^D

Note that the first From line is mandatory here, since it's the envelope on which authentication is based. Since we're feeding mailagent with a handcrafted message, we must provide a valid envelope or the server will not switch into trusted mode...

The side effect of re-instantiating your password will be to crypt it in the passwd file, so that anybody looking at that file cannot guess your root password, hopefully.

Once you have a valid root power installed, you may create the system power by using newpower. Further powers may then be created and deleted using the system power only.

You should also create the security power and give it a different password from the root password. This is really needed only if you wish to administer the server remotely. If you have local access and things get corrupted, it's always possible to change the root password manually by repeating this bootstrapping sequence.

Note that clearance checks are made using the envelope address of the message, which is a little harder to forge than plain header fields like Sender:. The envelope is extracted by looking at the first header line, which on Unix systems looks like:

        From envelope-address send-date


and is inserted by the mail transport agent (MTA). If you are using sendmail as the MTA, then only trusted users declared in the sendmail.cf file are able to create a "fake" envelope address, a feature typically used by mailing list dispatchers, since that address is then used as the bounce target in case the mail cannot be delivered. If that first header line is absent, the sender is computed using the Sender: field if present, then the From: field, but the auth variable is set to false and the server will not switch into trusted mode; in other words, it will not be possible to gain powers in that session.
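The envelope lookup described above can be approximated as follows. This is a rough sketch, not the code mailagent uses: envelope_address is a made-up name, and the function only looks at the very first line of the message.

```shell
#!/bin/sh
# Sketch only: read a message on stdin and print the envelope address
# found on a leading "From " line. Exit status 1 means there is no
# envelope, which is the case where auth would be set to false.
envelope_address() {
    first=''
    IFS= read -r first || :
    case "$first" in
    'From '*)
        rest=${first#From }
        # The address is everything up to the next space.
        printf '%s\n' "${rest%% *}"
        ;;
    *)
        return 1
        ;;
    esac
}

# Hypothetical message with an envelope line.
printf 'From login@host.domain.top Mon Oct  1 14:24:00 1990\nFrom: login@host.domain.top\n' |
    envelope_address

# Same message without the envelope: nothing to authenticate against.
if ! printf 'From: login@host.domain.top\n' | envelope_address; then
    echo "no envelope, session would not be trusted"
fi
```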

Moreover, since the session transcript is sent to that same envelope address used to authenticate the eligibility for a power, the server feature can hardly be used to retrieve confidential information held at the site where the mailagent is run, since the information would be sent to one of the users cleared for that power. It is the responsibility of you, the user, to make sure this cannot happen or you could get into legal trouble.

Finally, sensitive commands should be protected by a proper power, and great care should be taken in writing the command implementation to ensure the security cannot be circumvented. That said, this mailagent feature is not believed to be dangerous for the system or site it is used on, since a determined user could implement one trivially via a five-line shell script. If security is really an issue, .forward files using the piping feature should be prohibited and access to cron forbidden in order to avoid automatic mail processing (since it would be possible to have cron invoke a mailagent process, or any other program for that matter, to process the incoming mail in a comparable way).

Example

Here is an example showing the steps involved in creating a shell command, which would take a script by collecting lines until an EOF mark and feed it to a real shell for execution. Since allowing this feature without any safeguards would be a real security hole, we protect that by requesting the power shell before allowing the execution.

Here is my implementation of the shell command (available in the mailagent distribution under misc/shell):

#!/bin/sh
# Execute commands from stdin, as transmitted by the mailagent server.
# File descriptor #3 is a channel to the session transcript.
# Make sure we have the shell power.
# Don't even allow the root power to bypass that for security reasons.
case ":$powers:" in
*:shell:*) ;;
*)
        echo "Permission denied." >&3
        exit 1
        ;;
esac
# Perhaps a shell was defined... Otherwise, use /bin/sh
case "$shell" in
'') shell='/bin/sh';;
esac
# Normally, a shell command has its output included in the transcript only in
# case of error or when the user requests the trace. Here however, we need to
# see what happened, so everything is redirected to the session transcript.
exec $shell -x >&3 2>&3

Note how we access the $powers and $shell environment variables. The latter is user-defined to allow dynamic set-up of a shell.

Assuming we store that command under servdir/shell.sh (don't forget to add the execution bit on the file...), here is how we declare it and its variable in the comserver file.

shell   shell   -       y       -
shell   var     -       -       -

This example shows that there is a separate name-space for variables and commands. Moreover, the command bears the same name as its type -- don't let that confuse you :-).

Now, assuming you have already created a system power and protected it with a password (let's assume sys-pass for the purpose of this example), you need to create the shell power. Although you could do it manually (like when you handcrafted the root power), it's better to use the SERVER interface since it ensures consistency.

In order to create the shell power required to use the newly created shell command, you need to add the following rule to your rule file:

Subject: Server         { SAVE server; SERVER -t };

which will save all server mail in a dedicated folder and process them. Note the -t option, which allows trusted mode, in which powers may be gained. Now send yourself the following mail:

Subject: Server

power system sys-pass
newpower shell shell-pass
[email protected]
EOF

which requests the system power (needed to create most powers), and then creates a new power shell, assigning shell-pass as its password and clearing [email protected] for it. Note the here-document fill-in for the newpower command, up to the EOF marker. Of course, you need to replace the address by your real address.

You will receive a session transcript along these lines:

    ---- Mailagent session transcript for [email protected] ----
----> power system ********
OK.
====> newpower shell ********
OK.
====> --
End of processing (.signature)
    ---- End of mailagent session transcript ----

Note the concealed passwords, and the prompt change once the system power has been granted. Since my mailer automatically appends a signature, the processing stops on it.

Now let's use this new command... Send yourself the following mail:

Subject: Server

set shell /bin/ksh
set eof END
shell
ls -l /etc/passwd
END
power shell shell-pass
shell
ls -l /etc/passwd
END

If everything is right, you should receive back a transcript looking like this:

    ---- Mailagent session transcript for [email protected] ----
----> set shell /bin/ksh
OK.
----> set eof END
OK.
----> shell
Permission denied.
Command returned a non-zero status (1).
FAILED.
----> power shell ********
OK.
====> shell
+ ls -l /etc/passwd
-rw-r--r--   1 root     system       691 Oct 01 14:24 /etc/passwd
OK.
====> --
End of processing (.signature)
    ---- End of mailagent session transcript ----

The first invocation of the shell command fails since we lack the shell power. The string "Permission denied." is echoed by the command itself into file descriptor #3 and makes it to the transcript.
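Before installing a command, it may help to exercise its power check outside mailagent by supplying the environment and file descriptor #3 yourself. The sketch below assumes, as the script above suggests, that $powers is a colon-separated list; check_power is a cut-down, hypothetical stand-in for the real command, and the temporary file plays the role of the session transcript.

```shell
#!/bin/sh
# Sketch: test a power check by hand, with fd 3 pointing to a scratch
# file instead of the session transcript.
check_power() {
    case ":$powers:" in
    *:shell:*) echo "power granted" >&3 ;;
    *)         echo "Permission denied." >&3; return 1 ;;
    esac
}

transcript=$(mktemp)

# run_demo POWERS: run the check with the given power list and report.
run_demo() {
    powers=$1
    if check_power 3>"$transcript"; then status=0; else status=1; fi
    echo "powers='$powers' status=$status transcript: $(cat "$transcript")"
}

run_demo ''              # no power at all: denied
run_demo 'system:shell'  # shell power present: granted

rm -f "$transcript"
```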

Conclusion

The generic mail server implemented in mailagent can be used to implement a mailing list manager, a vote server, an archive server, etc... Unfortunately, it does not currently have the notion of state, with a command set dedicated to each state, so it is not possible to implement an intelligent archive server.

If you implement new simple server commands and feel they are generic enough to be contributed, please send them to me and I will gladly integrate them.

EXAMPLES

Here are some examples of rule files. First, if you do not specify a rule file or if it is empty, the following built-in rule applies:

All: /^Subject: [Cc]ommand/ { LEAVE; PROCESS };

Every mail is left in the mailbox. In addition, any mail with a line starting with "Subject: Command" anywhere in the message is processed.
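To check by hand whether a given message would trigger the built-in rule, the same pattern can be given to grep. This is merely an approximation of the rule's selection step; is_command_mail is a name made up for this sketch.

```shell
#!/bin/sh
# Sketch: does the message on stdin contain a line matching the pattern
# of the built-in rule? (Approximation, not mailagent's matching code.)
is_command_mail() {
    grep -q '^Subject: [Cc]ommand'
}

# Hypothetical message, for demonstration purposes.
if printf 'From: someone\nSubject: command\n\nbody\n' | is_command_mail; then
    echo "LEAVE and PROCESS"
else
    echo "LEAVE only"
fi
```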

The following rule file is the one I am currently using:

maildir = ~/mail;
All: /^Subject: [Cc]ommand/     { SAVE cmds; PROCESS };
To: /^[email protected]/            { POST -l mail.gue };
Apparently-To: ram,
Newsgroups: mail.gue            { BOUNCE [email protected] };
<_SEEN_>
        Apparently-To: ram,
        Newsgroups: mail.gue    { DELETE };
From: root, To: root            { BEGIN ROOT; REJECT };
<ROOT> /^Daily run output/      { WRITE ~/var/log/york/daily.%D };
<ROOT> /^Weekly run output/     { WRITE ~/var/log/york/weekly };
<ROOT> /^Monthly run output/    { WRITE ~/var/log/york/monthly };
From: ram               { BEGIN RAM; REJECT };
<RAM> To: ram           { LEAVE };
<RAM> X-Mailer: /mailagent/     { LEAVE };
<RAM>                   { DELETE };

The folder directory is set to ~/mail. All command mails are saved in the folder ~/mail/cmds and processed. They do not show up in my mailbox. Mails directed to the gue mailing list (French Eiffel's Users Group, namely Groupe des Utilisateurs Eiffel) are posted on the local newsgroup mail.gue and do not appear in my mailbox either. Any follow-up made on this group is mailed to me by inews (and not directly to the mailing list, because those mails would get back to me again and be fed to the newsgroup, which in turn would have them mailed back to the list, and so on, and so forth). Hence the next rule which catches those follow-ups and bounces them to the mailing list. Those mails will indeed come back, but the _SEEN_ rule will simply delete them.

On my machine, the mails for root are forwarded to me. However, every day, the cron daemon starts some processes to do some administration clean-up (rotating log files, etc...), and mails the results back. They are redirected into specific folders with the WRITE command, to ensure they do not grow without limit. Note the macro substitution for the daily output (on Mondays, the output is stored in daily.1 for instance).
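To see which file today's daily output would end up in, the substitution can be mimicked from the shell. The sketch assumes that %D is the weekday number, with Monday being 1 as in the example above, which is also what date +%w prints (0 for Sunday through 6 for Saturday). The daily_folder helper is hypothetical.

```shell
#!/bin/sh
# Sketch: compute the name the WRITE rule would use today, assuming %D
# is the weekday number (0 = Sunday ... 6 = Saturday).
# daily_folder DIRECTORY
daily_folder() {
    printf '%s/daily.%s\n' "$1" "$(date +%w)"
}

daily_folder "$HOME/var/log/york"
```

With seven possible names, each file is overwritten a week later, which is what keeps the logs from accumulating.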

The next group of rules prevents the mail system from sending back mails when I am in a group alias expansion. This is a sendmail option which I disabled on my machine. Care is taken however to keep mails coming from the mailagent which I receive as a blind carbon copy.

CAVEAT

In order to limit the load overhead on the system, only one mailagent process is allowed to run the commands. If some new mail arrives while another mailagent is running, that mail is queued and will be processed later by the main mailagent.

For the same reason, messages sent back by mailagent are queued by sendmail, to avoid the cost of mail transfer while processing commands.

SECURITY

First, let me discuss what security means here. It does not mean system safety against intruder attacks. If your system allows .forward hooks and/or cron jobs to be set by regular users, then your system is not secure at all. Period. So we're not bothering with security at the system level, but rather at your own account level where all sorts of precious data are held.

To avoid any pernicious intrusion via Trojan horses, the C filter will refuse to run if the configuration file ~/.mailagent or the rule file specified are world writable or not owned by the user. Those tests are enforced even if the filter does not run setuid, because they compromise the security of your account. The mailagent will also perform some of those checks, in case it is not invoked via the C filter.

Indeed, if someone can write into your ~/.mailagent file, then he can easily change your rules configuration parameter to point to another faked rule file and then send you a mail, which will trigger mailagent, running as you. Via the RUN command, this potential intruder could run any command, using your privileges, and could set a Trojan horse for later perusal. Applying the same logic, the rule file must also be protected tightly.

And, no surprise, the same rules apply for your newcmd file, which is used to describe extended filtering commands. Otherwise it would allow someone to quietly redefine a commonly used standard command like LEAVE and later be able to assume your identity.
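You can run comparable checks yourself before trusting a set-up. The following is a minimal sketch of the two tests mentioned above (ownership and world-writability), not the code used by the C filter; is_tightly_protected is a hypothetical helper built on standard find and id.

```shell
#!/bin/sh
# Sketch of the two checks described above; not the C filter's code.
# is_tightly_protected FILE -> status 0 if FILE exists, is owned by the
# current user and is not world writable.
is_tightly_protected() {
    # -prune keeps find from descending if FILE is a directory.
    [ -n "$(find "$1" -prune -user "$(id -un)" ! -perm -002 2>/dev/null)" ]
}

for f in "$HOME/.mailagent"; do
    if is_tightly_protected "$f"; then
        echo "$f: ok"
    else
        echo "$f: missing, not yours, or world writable"
    fi
done
```

Add your rule file and newcmd file to the list as needed; their locations depend on your ~/.mailagent.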

Versions after 3.0 PL44 come with an improved (from a security point of view) C filter that will not only perform the aforementioned checks but will also ensure that the perl executable and the mailagent script it is about to exec are not loosely protected (when execsafe is ON or when running with superuser privileges). Furthermore, if the filter is set up in your .forward as described in this man page, it will be able to check itself for safety and will warn you loudly if it can be tampered with, which could defeat all security checks.

Mailagent was also extended so that all programs executed via RUN and friends, as well as mail hooks, are checked for obvious protection flaws before being actually run. Interpreted scripts (starting with the #! magic token) and perl scripts following the magic "exec perl if $under_shell" incantation are specially checked for further security of the relevant interpreter. Those checks are performed systematically (when execsafe is ON or when running with superuser privileges) even if the secure parameter was not set to ON. Also, all files about to be exec()ed are checked using the same extended check method used when secure is ON (ownership tests are skipped however when checking for exec()-ness of a file).
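To find out which interpreter would be involved for one of your own scripts, you can read its first line. The sketch below is only an illustration of the #! convention and does not reproduce mailagent's checks; script_interpreter is a made-up helper.

```shell
#!/bin/sh
# Sketch: print the interpreter named on the #! line of a script.
# Exit status 1 when the file does not start with the #! token.
script_interpreter() {
    line=''
    IFS= read -r line < "$1" || :
    case "$line" in
    '#!'*)
        line=${line#'#!'}
        # Drop blanks between the token and the path.
        line=${line#"${line%%[! ]*}"}
        # Keep the path only, leaving out any interpreter argument.
        printf '%s\n' "${line%% *}"
        ;;
    *)
        return 1
        ;;
    esac
}

# Demonstration on a scratch script.
demo=$(mktemp)
printf '#! /usr/bin/perl -w\nprint "hello\\n";\n' > "$demo"
script_interpreter "$demo"
rm -f "$demo"
```

The path printed is the file whose protection matters in addition to the script itself.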

FILES

~/.mailagent
configuration file for mailagent.
~/agent.trace
trace dump from a PROCESS command when error cannot be mailed back.
~/mbox.filter
mailbox used by filter in case of error
~/mbox.urgent
mailbox used by mailagent in case of error
~/mbox.<username>
mailbox used if writing access is denied in the mail spool directory
/usr/share/mailagent/mailagent
directory holding templates and samples.
Log/agentlog
mailagent's log file.
Spool/agent.wait
list of mails waiting to be processed and stored outside of mailagent's queue directory. Even when logically empty, this file is kept around and still holds one blank line to reserve a block on the filesystem.
Queue/qmXXXXX
mail spooled by filter.
Queue/fmXXXXX
mail spooled by mailagent.
Queue/cmXXXXX
mail spooled by the AFTER command.
Hash/X/Y
hash files used by RECORD, UNIQUE, ONCE commands and vacation mode.

BUGS

There is a small chance that mail arrives while the main mailagent is about to finish its processing. That mail will be queued and not processed until another mail arrives (the main mailagent always processes the queue after having dealt with the message that invoked it).

A version number must currently contain a dot. Moreover, an old system (i.e. a system with an o in the patches column) must have a version number, so that mailagent can compute the name of the directory holding the patches.

The lock file is deliberately ignored when the -q option is used (in fact, it is ignored whenever an option is specified). This may result in having mails processed more than once.

Mailagent is at the mercy of any perl bug, and there is little I can do about it. Some spurious warnings may be emitted by the data-loaded version, although they do not appear with the plain version.

Parsing of the rule file should be done by a real parser and not lexically. Or at least, it should be possible to escape otherwise meaningful characters like ';' or '}' within the rules.

AUTHOR

Raphael Manfredi <[email protected]>.