PMDF FAQ: Filtering Mail

 

Should I use MAPS RBL or other blacklists?

The RBL (Realtime Blackhole List) is maintained by the Mail Abuse Prevention System (MAPS) organization, and can be found at http://mail-abuse.org/rbl/. A subscription to the RBL can reduce spam because the IP address of each connecting host is looked up in the list through DNS; if the address is listed, the e-mail is not delivered. Related DNS checks can also verify the sender's domain, which helps because spammers often use forged e-mail addresses from non-existent domains.

Warning: Performing DNS checks may result in the rejection of some valid messages. For instance, this could include mail from legitimate sites that simply have not yet registered their domain name, or during periods of bad information in DNS.

Also, if DNS or connections to the sites being used for DNS verification become unavailable then mail delivery will be impacted. Use of these spam blocking techniques can impact performance as well as result in unreliable mail reception due to the dependency on multiple DNS lookups for every incoming SMTP connection.

PMDF supports RBL and other blackhole lists via the dispatcher options DNS_VERIFY_DOMAIN or ENABLE_RBL. Note that ENABLE_RBL=1 is the same as DNS_VERIFY_DOMAIN=blackholes.mail-abuse.org (the MAPS RBL list). Therefore, ENABLE_RBL has effectively been obsoleted by DNS_VERIFY_DOMAIN.

The DNS_VERIFY shareable image can also be used to validate domain names or IP addresses via DNS. For example, it can be used to verify that an entry in DNS exists for the domain used in the SMTP MAIL FROM: command, or to look up an IP address in the MAPS RBL list and other blackhole lists. The message can be rejected or accepted based on the presence or absence of a corresponding DNS record, or a new header can be added to the message to indicate the problem.

DNS_VERIFY is supplied as a shareable image on VMS, and as a shareable object library on UNIX.

DNS_VERIFY has four routines that can be called:

  • dns_verify is the most general of the routines, but also the most complicated to set up. It simply does a DNS lookup of the domain name that you specify, which could be, for example, the domain name corresponding to an IP address in the RBL list.
  • The dns_verify_domain and dns_verify_domain_port routines are used to query the specified blackhole list and return pre-defined success, failure, and unknown messages.
  • The dns_verify_domain_warn routine performs the same DNS lookup as the dns_verify_domain and dns_verify_domain_port routines, but instead of rejecting the message if the DNS entry exists, it adds a new header line to the message.

Please see Chapter 16 in the PMDF System Manager's Guide for more information about how to use each of these routines.


What are some examples of spam that PMDF can eliminate?

There are many ways PMDF can be used to eliminate spam. Here are just a few examples:

  • PMDF can eliminate e-mail that has no date in the headers. Spam typically does not include a Date header. Since RFC 822 requires a date to be present, PMDF inserts a Date header and adds a "Date-warning" header to the message. It is advisable to discard mail that carries this header.
  • Mail whose subject line contains certain phrases, such as pornographic terms or offers of illicit services, can be eliminated.
  • Mail that contains certain key words or phrases in the message body can be eliminated.
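
The three cases above could be sketched in a single Sieve filter roughly as follows. This is an illustration only: the subject and body phrases are placeholders rather than a recommended list, and it assumes the header added by PMDF is named Date-warning, as described above.

```sieve
# Sketch only: the subject and body phrases are placeholders, not a vetted list.
# PMDF adds a Date-warning header when it has to supply a missing Date header.
if exists "Date-warning"
  { discard;}

# Subject line contains an unwanted phrase.
elsif header :contains "subject"
  ["placeholder phrase one","placeholder phrase two"]
  { discard;}

# Message body contains an unwanted phrase.
elsif body :contains
  ["placeholder body phrase"]
  { discard;}
```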


How do I minimize filtering mail that is not spam (false positives)?

It is advisable to review the filtering mechanisms to make sure that you only trap mail that is considered spam. One way to avoid accidentally discarding legitimate mail is to specify that mail from certain domains or e-mail addresses, such as bigboss@homeoffice.com, always be delivered.
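
One possible way to express such an exception is a test placed ahead of the spam tests, so that trusted mail is kept before any discard rule runs. The sketch below reuses the example address above; since the From header can be forged, treat it as a convenience rather than a guarantee.

```sieve
# Sketch only: deliver mail from a trusted sender before any spam test runs.
# Place this ahead of the discard tests; "stop" ends processing of the script.
if header :contains ["from","sender"] ["bigboss@homeoffice.com"]
  { keep; stop;}
```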


What are my PMDF filtering options?

PMDF filtering can occur on three levels.

1. System Level Filters - System level filters are exactly as the name implies. All mail that comes into the system is run through this filter. The system level filter is applied whenever a message gets enqueued to a channel. If a message is enqueued a number of times (once, say, on tcp_local and once on conversion and then the l or msgstore channel), then the filter is processed each time. This does not cause a problem since messages can be enqueued to different channels and the amount of overhead to do this operation is minimal. Some channels, such as the conversion channel, may change the content of a message, so it can be advantageous to run through the system filter another time.

To activate the system level filter, simply create a file called PMDF_TABLE:PMDF.FILTER (on VMS) or /pmdf/table/pmdf.filter (on Linux). This can be done via the PMDF web-based interface or by using a text editor. The system level filter is part of the configuration. If you use a compiled configuration, it will need to be recompiled after this file is changed (pmdf cnbuild). On OpenVMS only, the compiled configuration will then need to be reinstalled (install replace pmdf_config_data). Finally, PMDF SMTP will have to be restarted (pmdf restart smtp).

2. Channel Level Filters - These filters are invoked when a message is dequeued (sourcefilter) or enqueued (destinationfilter) on a particular channel. The file names of these filters are specified by the value of the keyword. Channel filters are not part of the configuration so no restart is necessary. The channel level filters will override the system level filters. If a message is evaluated and recommended to be discarded using the system filter, and that same message is also evaluated and recommended to be kept using the channel filter, then the message will be kept.

3. Mailbox Filters - The e-mail administrator maintains the system and channel level filter files. End users maintain their own mailbox filters if the e-mail administrator enables this feature. All mail destined for a particular user is run through that user's mailbox filter. Use of the filter keyword on the l, popstore, or msgstore channel (whichever is appropriate for your system) activates this feature. Mailbox filters override both channel level and system level filters.


What is the format of a Sieve filter?

Sieve filtering is a series of tests. These tests are structured as "if" statements. For example:

if header :contains
  ["return-path","from","sender","resent-from","resent-sender"]
  ["unwanted.com","spamsite.com"]
  { discard;}

# :matches compares against the whole header value, so wildcards are needed
elsif header :matches
  ["return-path","from","sender","resent-from","resent-sender"]
  ["*trash.com*"]
  { discard;}

else { keep;}

The elsif and else clauses are optional but, if used, must follow an if. There are a number of actions that can be taken on a message. Some actions must first be declared with a require statement at the top of the script.
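
As an illustration of the require statement, the sketch below uses the fileinto action, which the Sieve specification defines as one that must be declared before use. The folder name "Junk" and the subject phrase are hypothetical, and whether folder delivery is available depends on the message store in use.

```sieve
# Sketch only: "Junk" is a hypothetical folder name.
require ["fileinto"];

if header :contains "subject" ["placeholder phrase"]
  { fileinto "Junk";}
else { keep;}
```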


Can you provide an example of a Sieve Filter for dealing with spam?

Spam can be identified not only by the content of a message, but also by its Content header fields. For example, messages with a Content-Type of text/html are normally sent as plain, readable text. To get around anti-spam filters, some spammers send the same message with a Content-Transfer-Encoding of base64. This encoding hides the text from the anti-spam filters. It has been my experience that ONLY spammers do this.

The allof test is a "logical and": it succeeds only if every test inside it succeeds.

if allof (header :contains "Content-type" ["text/html"],
  header :contains "Content-transfer-encoding" ["base64"])
{ discard;}

Below is another example of a simple Sieve test:

if body :contains
    ["You've got to see this page! It's really cool ;O)",
    "we don't want to waste your time",
    "I'll make you a promise. READ THIS E-MAIL TO THE END!",
    "CFGWIZ32.EXE",
    "Klez.E is the most common world-wide",
    "README.EXE",
    "Section 301",
    "Worlds First Absolutely FreeAdultSupersite",
    "src=3Dcid",
    "src=cid",
    "this is not virus mail"]
      { discard;}

There are three match-type arguments: :contains, :matches, and :is.

The short script above looks for certain phrases within the body of the message. The :contains argument flags a line in which the phrase appears as a substring. All characters are taken literally, including the wildcard character (*). So to catch the phrase "this is not virus mail" you could not use "this is * mail" with :contains, since the message text does not have an asterisk in it. To use wildcards, use the :matches argument. However, use such a rule with caution: the wildcard pattern would catch not only the offending mail, but also legitimate mail containing the following sentence:

"Hi Bill, this is the very important client mail message you asked me about."

The :is argument flags only a line that matches the phrase exactly.
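
The difference between the three match types can be seen side by side in the sketch below. It tests the Subject header rather than the body, and the phrases are placeholders chosen only for illustration.

```sieve
# Sketch only: the phrases are placeholders.
# :contains - literal substring; a * here would be an ordinary character.
if header :contains "subject" ["limited time offer"]
  { discard;}

# :matches - * stands for any run of characters, ? for a single character.
# The leading and trailing * let the phrase appear anywhere in the subject.
if header :matches "subject" ["*act now * expires*"]
  { discard;}

# :is - the whole subject must equal the string.
if header :is "subject" ["hello"]
  { discard;}
```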


Can I perform a case-sensitive test with Sieve filters?

By default, the tests are case-insensitive. With the :comparator argument, case sensitivity can be controlled. For example, the first line of a test would be:

if body :contains :comparator "i;ascii-casemap" 

to keep the default (case-insensitive) or use:

if body :contains :comparator "i;octet"

to force case-sensitive tests.
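
Put together as a complete test, a case-sensitive rule might look like the sketch below. The phrase is taken from the earlier example; with "i;octet" the rule would fire on "README.EXE" but not on "readme.exe".

```sieve
# Sketch only: case-sensitive match on an attachment name seen in the body.
if body :contains :comparator "i;octet"
  ["README.EXE"]
  { discard;}
```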


How do I block emails with file attachments?

To set up your conversion channel to remove unwanted file types that come through as attachments, first create a CONVERSIONS table in your PMDF_TABLE:MAPPINGS. file:

CONVERSIONS

  IN-CHAN=TCP_*;OUT-CHAN=*;CONVERT  Yes
  IN-CHAN=*;OUT-CHAN=*;CONVERT      No

The actual conversions performed by the conversion channel are controlled by rules specified in the PMDF conversions file. The conversions file is located via the PMDF_CONVERSION_FILE logical name (OpenVMS), or the corresponding PMDF tailor file option (UNIX), and is usually the file PMDF_TABLE:CONVERSIONS. on OpenVMS, or /pmdf/table/conversions on UNIX.

You have to be very precise about the format of this file. The first line of each entry begins flush left in column 1, while the second and subsequent lines are indented by at least one space. Entries are separated from each other by a blank line. The correct form of the conversions would then be:

! CONVERSIONS - Table of conversions for the CONVERSION channel to perform
!
! For getting rid of the .exe attachments
out-channel=*; in-type=application; in-subtype=*;
 in-parameter-name-0=name; in-parameter-value-0=*.exe;
 delete=1

! Same test against the Content-Disposition filename parameter
out-channel=*; in-type=application; in-subtype=*;
 in-dparameter-name-0=filename; in-dparameter-value-0=*.exe;
 delete=1


How do I determine which section of my Sieve file caught a message, and how do I bypass the filters?

To log your tests to determine what types of words or phrases cause messages to be filtered, use one or all of the following actions: debug, discard, and/or reject.

{debug "Sieve: message contains in BODY-3 - discard"; discard;}

This will place the text of the debug action into the slave or master log file (e.g. tcp_local_slave.log) for those messages that matched a value in that particular test. If you keep these files for one day, you can write a script that copies these messages out to a separate log file. Please note that the debug action is a PMDF extension, and not part of the Sieve RFC. You will also need to have the slave_debug keyword on the tcp_local channel and MM_DEBUG=2 in the option.dat file.
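
In a complete filter, each test can carry its own label so that the log shows which section caught the message. The sketch below assumes the debug action behaves as described above; the SUBJECT-1 label and the subject phrase are hypothetical.

```sieve
# Sketch only: the label text after "Sieve:" is free-form; use one per test.
if header :contains "subject" ["placeholder phrase"]
  { debug "Sieve: message contains in SUBJECT-1 - discard"; discard;}

elsif body :contains ["CFGWIZ32.EXE","README.EXE"]
  { debug "Sieve: message contains in BODY-3 - discard"; discard;}
```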

By default, a message that gets trapped is discarded immediately (if that is the action). Since no system is perfect, you will get some "false positives", that is, mail that was discarded but should not have been. Process Software recommends that you quarantine e-mail on your system for a period of time so that it can be reviewed. To accomplish this, follow the steps below:

1. Add another entry to your Sieve file. It is best to place this near the top:

if exists "X-Filter-File" { keep; stop;}

2. Discarded mail should be filed into the filter_discard channel. You may have to define this channel in your pmdf.cnf file. It should be defined as:

! Filter channel
filter_discard notices 7
FILTER-DISCARD

The notices 7 keyword specifies that a message is kept in the filter_discard channel for 7 days before it is deleted.

3. So that PMDF is aware of the filter_discard channel, add or change the following line in your option.dat file:

filter_discard=2

Since both of these files are part of your configuration, you will have to recompile your configuration.

4. To review what is in the filter_discard channel, enter pmdf qm maintenance mode:

qm.maint> dir/env filter_discard

This will show you what messages are filtered, including the envelope from and to addresses for easy identification.

If you find a message that you suspect should not have been flagged (e.g. message number 34), then the following is recommended:

1. Determine what the message filename is for that message:

qm.maint> read 34

or

qm.maint> read/content 34

to read the content of the message.

The file name will have the form ZZnnnn.00.

2. Edit the file and somewhere in the headers add the line:

X-Filter-File: x

The value after the colon is immaterial, but "header: value" is the required syntax. Put this header line just before the MIME-Version: 1.0 header line. Make sure not to leave any blank lines.

3. Save this edited file to the pmdf_queue:[process] (/pmdf/queue/process/) directory but change the extension to something other than 00, say 05. You can also use the reprocess channel.

4. Perform a pmdf cache /sync (pmdf cache -sync on Linux)

5. Then run or submit the pmdf process channel:

pmdf submit process or pmdf run process

The mail will bypass all the filters (since the X-Filter-File header is found) and be delivered.