Mail::SpamAssassin::Plugin::LDAPfilter
LDAPfilter is a SpamAssassin plugin that extracts common attribute values from incoming email messages and then queries a shared LDAP directory for the spam or ham markers associated with those values. This approach is similar to other comprehensive blacklist and whitelist models that can filter email according to one or more message elements, but the use of LDAP allows for a level of scalability well beyond what most other approaches can provide.
Specifically, the LDAPfilter module examines incoming messages for each of the following attributes:
- SMTP sender IP address. The IP address of the remote sending system, as determined by SpamAssassin from data in the Received headers.
- SMTP sender reverse-DNS hostname. The hostname associated with the IP address above. This data is not always recorded by SMTP servers and may not always be available. It is not always reliable either, although it can be very useful when it is accurate.
- SMTP sender HELO identifier. The string provided by the sending system in the SMTP handshake. This data is not very trustworthy but is sometimes useful.
- SMTP envelope MAIL FROM email address. SMTP exchanges use an envelope construct that includes a return address, which tells the server where to send errors and delivery-status messages. The last SMTP server stores this address in the Return-Path header. This data is not reliable either, but can be useful when it is accurate. It is particularly useful for whitelisting email messages from mailing list remailers that use a consistent email address.
- SMTP envelope RCPT TO email addresses. The email address(es) of the message recipient(s), as specified in the envelope. This data determines who actually gets the message. SMTP servers do not normally record this data in the message headers, so you must be able to configure the server to record this data in a user-defined field before it can be analyzed. Note that this data will identify all of the recipients (including group members, people in the BCC header, and so forth), so it may be problematic to permanently record this data in the message. However, sometimes an administrative address needs to be whitelisted, or sometimes a honey-pot address needs to be blacklisted, and the envelope data is more reliable than the message's To and CC address fields.
- Message From: email address. The email address stored in the From: header, which may or may not be the same as the envelope MAIL FROM email address. For example, a mailing list remailer may list a message's author in the From: header, but list itself as the sender in the SMTP envelope, thereby causing delivery notifications to be sent to the remailer instead of the author.
- Message Reply-To: email address. The email address stored in the Reply-To: header, if it exists. This field is often used in phishing attacks that present the message as being from a respectable party but then force replies to be sent to an unrelated free email service.
- Message To: and CC: email addresses. The email addresses stored in the To: and/or CC: headers. Sometimes an administrative address needs to be whitelisted, or sometimes a honey-pot address needs to be blacklisted. Note that these fields may not list all of the actual recipients, so the envelope data is more reliable.
- Message body URIs. SpamAssassin has routines that extract URIs from inside message bodies, and LDAPfilter also has the ability to check these URIs against the LDAP database. This is particularly useful with some types of professional-marketing mailing lists that use their own domain name and redirection techniques to track message viewership and click-through rates, and can also be useful for whitelisting email that contains a fairly consistent URI.
Note that administrators can selectively enable or disable any of the above attributes when scanning email messages. Furthermore, in those cases where multiple attribute values may be present (as with multiple To: or CC: recipients), LDAPfilter allows the administrator to specify a cutoff number.
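The header-based attributes and the cutoff idea can be illustrated with a minimal Python sketch. This is not the plugin's Perl code: the function name header_addresses and the cutoff parameter are hypothetical, and the sketch assumes the cutoff simply keeps the first N values, which the text above does not confirm.

```python
from email import message_from_string
from email.utils import getaddresses


def header_addresses(msg, header_names, cutoff=None):
    """Collect lower-cased email addresses from the named headers.

    cutoff is a hypothetical knob: when set, only the first `cutoff`
    addresses are kept (an assumption about how a cutoff might work).
    """
    values = []
    for name in header_names:
        values.extend(msg.get_all(name, []))
    addresses = [addr.lower() for _, addr in getaddresses(values) if addr]
    return addresses if cutoff is None else addresses[:cutoff]


# A small made-up message used only for illustration.
raw = (
    "From: Author <author@example.org>\n"
    "Reply-To: replies@example.net\n"
    "To: a@example.com, b@example.com\n"
    "Cc: c@example.com\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
msg = message_from_string(raw)

print(header_addresses(msg, ["From"]))                # ['author@example.org']
print(header_addresses(msg, ["To", "Cc"], cutoff=2))  # ['a@example.com', 'b@example.com']
```

Each address collected this way would then be looked up in the directory, subject to whichever attributes the administrator has enabled.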
For each attribute value, LDAPfilter constructs searches for the value itself as well as its parent domains. For example, an attribute value that contains an email address is searched as the explicit address, the domain name of the address, and each of that domain's parent domains, with all of these values submitted for comparison. Similarly, an attribute value that contains a host IP address is combined with its possible parent subnet addresses, all of which are submitted in the same LDAP search. This recursion feature allows administrators to specify filters for entire networks or for individual assets, as appropriate for a specific filter. Recursion depth can also be restricted if the administrator desires.
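The expansion into parent domains and parent subnets can be sketched in Python as follows. This is an illustration, not the plugin's implementation: the function names, the max_depth parameter, and the min_prefix parameter are all hypothetical, the sketch handles IPv4 only, and it emits every prefix length, which may differ from the exact set of subnets the plugin searches.

```python
import ipaddress


def parent_domains(name, max_depth=None):
    """Return the name followed by each of its parent domains."""
    labels = name.lower().rstrip(".").split(".")
    candidates = [".".join(labels[i:]) for i in range(len(labels))]
    # max_depth is a hypothetical way to restrain recursion depth.
    return candidates if max_depth is None else candidates[:max_depth]


def email_candidates(address, max_depth=None):
    """Return the explicit address plus its domain and parent domains."""
    _, _, domain = address.rpartition("@")
    return [address.lower()] + parent_domains(domain, max_depth)


def parent_subnets(ip, min_prefix=8):
    """Return the host route and each enclosing IPv4 subnet in CIDR form."""
    return [
        str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))
        for prefix in range(32, min_prefix - 1, -1)
    ]


print(parent_domains("hermes.apache.org"))
# ['hermes.apache.org', 'apache.org', 'org']
print(email_candidates("user@lists.example.org"))
# ['user@lists.example.org', 'lists.example.org', 'example.org', 'org']
print(parent_subnets("209.237.227.199")[:4])
# ['209.237.227.199/32', '209.237.227.198/31', '209.237.227.196/30', '209.237.227.192/29']
```

Listing the most specific value first keeps the candidates in the same narrow-to-broad order that appears in the plugin's debug output.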
Searches specifically look for entries with an attribute called "mailFilterName", which is defined in the included schema. Entries in the LDAP directory can use mailFilterName as a naming attribute if the administrator wants an individual entry for each filter, or entries can use another naming attribute (such as commonName) with one or more mailFilterName attributes if the administrator wishes to associate multiple filters with a single entry.
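A search of this kind amounts to a single OR filter over mailFilterName. The Python sketch below shows one way to assemble such a filter string from a list of candidate values. The function names are hypothetical, and the escaping step follows the general LDAP filter-string rules (RFC 4515) rather than anything stated about the plugin itself.

```python
def escape_filter_value(value):
    """Escape characters that are special inside an LDAP filter value."""
    special = "\\*()\x00"
    return "".join(
        "\\%02x" % ord(ch) if ch in special else ch
        for ch in value
    )


def build_filter(candidates, attribute="mailFilterName"):
    """Combine all candidate values into one OR filter."""
    terms = "".join(
        f"({attribute}={escape_filter_value(c)})" for c in candidates
    )
    return f"(|{terms})"


print(build_filter(["hermes.apache.org", "apache.org", "org"]))
# (|(mailFilterName=hermes.apache.org)(mailFilterName=apache.org)(mailFilterName=org))
```

Submitting all candidates in one filter means a single directory query can match either a specific entry or any broader entry that covers it.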
Each entry in the LDAP directory can also have "filter" attributes for each of the individual message attributes, which are used to score the message. Specifically, the filter attributes store data values that indicate whether the message attribute is spam or ham, with four possible weightings. If any of the filter attributes are returned, LDAPfilter assigns a generic SpamAssassin rule to the message, which is then incorporated into the overall score. Since these values are assigned for each message element (such as envelope data, or a URI in the body), this allows for one rule per element. In this model, it is possible to assign a negative score to one attribute that is outweighed by a larger positive score for another attribute, with both scores being mixed with other rules for a final score.
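The per-element scoring idea can be sketched in Python as shown below. Only the LIGHTLISTED marker appears in this document; the other three marker names, all of the numeric scores, and the element labels ("client", "helo", "uri") are illustrative assumptions, not the plugin's actual values.

```python
# Hypothetical marker-to-score table. Only LIGHTLISTED is taken from this
# document; the remaining names and every number are assumptions.
ILLUSTRATIVE_SCORES = {
    "BLACKLISTED": 10.0,
    "DARKLISTED": 2.5,
    "LIGHTLISTED": -2.5,
    "WHITELISTED": -10.0,
}


def score_message(markers):
    """Map each message element's marker to a score and total them.

    markers maps an element label to the marker returned by the
    directory, or None when no filter attribute was returned.
    """
    hits = {
        element: ILLUSTRATIVE_SCORES[marker]
        for element, marker in markers.items()
        if marker in ILLUSTRATIVE_SCORES
    }
    return hits, sum(hits.values())


hits, total = score_message(
    {"client": "LIGHTLISTED", "helo": None, "uri": "BLACKLISTED"}
)
print(hits)   # {'client': -2.5, 'uri': 10.0}
print(total)  # 7.5
```

In this example the small negative score for the client is outweighed by the larger positive score for the URI. In SpamAssassin the per-element scores would be combined with the scores of all other rules that fire, not summed in isolation.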
Installation
To use this plugin, perform the following steps:
- Review the SpamAssassin threshold value, as defined by the "required_score" field in one of the SpamAssassin configuration files. Messages with a spam score that is equal to or higher than this value will trigger additional processing in SAGrey, while messages with a lower score will be ignored. SpamAssassin uses a default value of "5" if this setting is not explicitly defined.
Debug Output
LDAPfilter supports SpamAssassin debug output, which can be activated
with the --debug parameter on the SpamAssassin command line. The
module-specific debug messages are marked with the string "LDAPfilter:" as
shown in the example below:
LDAPfilter: SMTP client IP address determined as "209.237.227.199/32"
LDAPfilter: searching for (|(mailFilterName=209.237.227.199/32)
(mailFilterName=209.237.227.192/29)(mailFilterName=209.237.227.192/28)
(mailFilterName=209.237.227.192/27)(mailFilterName=209.237.227.192/26)
(mailFilterName=209.237.227.128/25)(mailFilterName=209.237.227.0/24)
(mailFilterName=209.237.226.0/23)(mailFilterName=209.237.224.0/22)
(mailFilterName=209.237.224.0/21)(mailFilterName=209.237.224.0/20)
(mailFilterName=209.237.224.0/19)(mailFilterName=209.237.192.0/18)
(mailFilterName=209.237.128.0/17)(mailFilterName=209.237.0.0/16)
(mailFilterName=209.236.0.0/15)(mailFilterName=209.236.0.0/14)
(mailFilterName=209.232.0.0/13)(mailFilterName=209.224.0.0/12)
(mailFilterName=209.224.0.0/11)(mailFilterName=209.192.0.0/10)
(mailFilterName=209.128.0.0/9)(mailFilterName=209.0.0.0/8))
LDAPfilter: no entries were returned
LDAPfilter: SMTP client domain name determined as "hermes.apache.org"
LDAPfilter: searching for (|(mailFilterName=hermes.apache.org)
(mailFilterName=apache.org)(mailFilterName=org))
LDAPfilter: "cn=Friendly Nets,ou=Mail-Filters,ou=Serv..." has a
spamAssassinFilterClient attribute with the value of LIGHTLISTED
LDAPfilter: SMTP client HELO identifier determined as "mail.apache.org"
LDAPfilter: searching for (|(mailFilterName=mail.apache.org)
(mailFilterName=apache.org)(mailFilterName=org))
LDAPfilter: "cn=Friendly Nets,ou=Mail-Filters,ou=Serv..." does not have a
spamAssassinFilterHelo attribute
This plugin was developed to store blacklist and whitelist data in an LDAP server. It was originally intended to provide a way to reuse the Postfix LDAP filters inside SpamAssassin, but has since grown into a generalized front-end for LDAP filtering mechanisms in SpamAssassin.
Using the LDAPfilter model, the author is able to define global entries that are accessible to all of the front-line SMTP servers, while users are also able to define additional entries in their personal LDAP views.
The following resources are available for downloading:
- the current pod documentation in HTML format -- START HERE!
- the ldapfilter.pm Perl module (v0.08 -- August 20, 2005)
- the ldapfilter.cf SpamAssassin configuration file (v0.04 -- August 20, 2005)
- the mailFilter.schema schema definition file (v0.01 -- May 24, 2005)
- the spamAssassinFilter.schema schema definition file (v0.02 -- June 20, 2005)
