LDAPfilter is a SpamAssassin plugin that reads a variety of common markers from incoming email messages and then queries a shared LDAP directory for spam or ham tags associated with those markers. This approach is similar to other comprehensive blacklist and whitelist models that can filter email according to one or more attributes, but the use of LDAP allows for a level of scalability that goes well beyond what most other approaches are able to provide.
Specifically, the LDAPfilter module examines incoming messages for each of the following attributes:
- SMTP sender IP address. The IP address of the remote sending system, as determined by SpamAssassin from data in the Received headers.
- SMTP sender reverse-DNS hostname. The hostname associated with the IP address above. This data is not always recorded by SMTP servers and may not always be available. It is not always reliable either, although it can be very useful when it is provided and accurate.
- SMTP sender HELO identifier. The string provided by the sending system in the SMTP handshake. This data is not very trustworthy but can be useful as a negative indicator (such as when a remote host identifies as "localhost").
- SMTP envelope MAIL FROM email address. SMTP exchanges use an envelope construct that includes a return address which tells the server where to send errors and delivery-status messages. The last server in the delivery path extracts this data from the envelope and stores it in the Return-Path header of the message. This data is not reliable, but can be useful when it is accurate, for example when whitelisting specific sender addresses.
- SMTP envelope RCPT TO email addresses. The email address(es) of the message recipient(s), as specified in the envelope. This data determines who actually gets the message. SMTP servers do not normally record this data in the message headers, so you must be able to configure the server to record this data in a user-defined field before it can be used by LDAPfilter. Note that this data will identify all of the recipients (including group members, people in the BCC header, and so forth), so it may be problematic to permanently record this data in the message. However, sometimes an administrative address needs to be whitelisted, or sometimes a honey-pot address needs to be blacklisted, and the envelope data is more reliable than the recipient To and CC address fields in the message headers.
- Message From: email address. The email address stored in the From: header, which may or may not be the same as the envelope MAIL FROM email address. For example, a mailing list remailer may list a message's author in the From: header, but list itself as the sender in the SMTP envelope.
- Message Reply-To: email address. The email address stored in the Reply-To: header, if it exists. This field is often used in phishing attacks that present the message as being "from" a respectable party but then force replies to be sent to an unrelated email service.
- Message To: and CC: email addresses. The email addresses stored in the To: and/or CC: headers. Sometimes an administrative address needs to be whitelisted, or sometimes a honey-pot address needs to be blacklisted. Note that these fields may not list all of the actual recipients, so the envelope data is more reliable.
- Message body URIs. SpamAssassin has routines that extract URIs from inside message bodies, and LDAPfilter can check these URIs against the LDAP database. This is particularly useful for blacklisting some of the third-party marketers that use their own domain names to track message viewership and click-through rates. By the same token, this data can also be useful for whitelisting desirable newsletters that contain a consistent URI.
Administrators can selectively enable or disable any of the above attributes. In those cases where multiple attribute values may be present (as with multiple To: or CC: recipients), LDAPfilter allows the administrator to specify a cutoff number.
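As a rough illustration of how header-based attributes and a cutoff might interact, the following Python sketch collects addresses from selected headers and keeps only the first few. It is not taken from the plugin: the function name header_addresses and the cutoff parameter are hypothetical, and treating the cutoff as "check only the first N values" is an assumption about how such a limit could behave.

```python
from email import message_from_string
from email.utils import getaddresses


def header_addresses(message, header_names, cutoff=None):
    """Collect lowercased addresses from the named headers.

    cutoff is a hypothetical limit: when set, only the first
    `cutoff` addresses are returned (an assumed interpretation).
    """
    values = []
    for name in header_names:
        # get_all returns every occurrence of the header, or the default
        values.extend(message.get_all(name, []))
    addresses = [addr.lower() for _, addr in getaddresses(values) if addr]
    if cutoff is not None:
        addresses = addresses[:cutoff]
    return addresses


raw = (
    "From: Alice <alice@example.com>\n"
    "To: bob@example.net, Carol <carol@example.org>\n"
    "Cc: dave@example.net\n"
    "\n"
    "body\n"
)
msg = message_from_string(raw)
recipients = header_addresses(msg, ["To", "Cc"], cutoff=2)
```

With a cutoff of 2, only the first two To:/CC: recipients would be passed on for lookup, which keeps the number of directory queries bounded for messages with long recipient lists.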
LDAPfilter builds multi-part searches that include the exact attribute value and all of the likely parent values. This allows administrators to build filters for any scope, and even for overlapping scopes. For example, if a From: header field contains the email address firstname.lastname@example.co.uk, then LDAPfilter will construct a search string that uses that email address, the example.co.uk mail domain, and the co.uk and uk parent domains. For data that uses an IP address, LDAPfilter starts the search string with the host address with a /32 subnet mask appended, then applies each of the subnet masks from /29 through /8 to the address to calculate the parent networks and includes those network addresses in the search string. This recursion feature allows administrators to specify filters for individual assets or for entire portions of the Internet, as may be appropriate for a specific filter. Recursion depth can also be adjusted if the administrator desires.
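The expansion described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the plugin's code: the function names are hypothetical, only IPv4 is handled, and the escaping follows the general LDAP search filter rules rather than anything documented for LDAPfilter.

```python
import ipaddress


def escape_filter_value(value):
    """Escape characters that are special in LDAP search filters."""
    return "".join(
        "\\%02x" % ord(ch) if ch in "\\*()\0" else ch for ch in value
    )


def domain_candidates(value):
    """Return the exact value followed by each parent domain."""
    value = value.strip().lower()
    candidates = [value]
    domain = value.rsplit("@", 1)[-1]
    if domain != value:
        # The value was an email address; add its mail domain.
        candidates.append(domain)
    labels = domain.split(".")
    for i in range(1, len(labels)):
        candidates.append(".".join(labels[i:]))
    return candidates


def network_candidates(address, prefixes=(32,) + tuple(range(29, 7, -1))):
    """Return the /32 host entry followed by the /29../8 parent networks."""
    host = ipaddress.IPv4Address(address)
    return [
        str(ipaddress.ip_network(f"{host}/{prefix}", strict=False))
        for prefix in prefixes
    ]


def build_filter(candidates, attribute="mailFilterName"):
    """Combine all candidates into one LDAP OR filter."""
    terms = "".join(
        f"({attribute}={escape_filter_value(c)})" for c in candidates
    )
    return f"(|{terms})"


search = build_filter(domain_candidates("firstname.lastname@example.co.uk"))
```

Passing a different prefixes tuple to network_candidates is one way a recursion depth setting could be modeled; for example, (32, 24, 16) would check only the host and two parent networks.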
The string values are stored in an LDAP attribute called "mailFilterName". Entries in the LDAP directory can use mailFilterName as a naming attribute, or the entries can use another naming attribute (such as CommonName) with multiple mailFilterName attributes providing multiple matches to the entry. For example, a "cn=Spammers" entry may have hundreds of mailFilterName attributes that each identify a specific domain name, while an entry for a local network that only has a single mailFilterName for the subnet address might simply use that attribute as the naming attribute. The point here is that LDAPfilter searches by attribute and does not make any decisions based on how the entry is named.
As LDAPfilter parses through the incoming email message, it issues search requests for the entry by name and also by filter type. More specifically, entries can have multiple filter weights associated with them, with each weight being incorporated into the final score as the message data is analyzed. For example, an LDAP entry with a mailFilterName attribute for example.com can have filter attributes that apply to the SMTP HELO string and a URI in the message body (among others), with each filter value being returned as LDAPfilter works through the message data. As such, it's possible for a message to be weighted multiple times, based on the full message.
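The lookup by name and by filter type can be modeled with a small in-memory stand-in for the directory. In this sketch the lookup function, the entry layout, and the DN are hypothetical; only the attribute names mailFilterName, spamAssassinFilterClient, and spamAssassinFilterHelo and the LIGHTLISTED value come from the plugin's debug output.

```python
# Hypothetical directory contents; the DN is invented for illustration.
ENTRIES = [
    {
        "dn": "cn=Friendly Nets,ou=Mail-Filters,dc=example,dc=com",
        "mailFilterName": ["apache.org", "192.0.2.0/24"],
        "spamAssassinFilterClient": "LIGHTLISTED",
    },
]


def lookup(entries, candidates, filter_attribute):
    """Return (dn, tag) pairs for entries whose mailFilterName matches.

    tag is None when the entry matched by name but carries no
    attribute for the requested filter type.
    """
    wanted = {c.lower() for c in candidates}
    hits = []
    for entry in entries:
        names = {n.lower() for n in entry.get("mailFilterName", [])}
        if names & wanted:
            hits.append((entry["dn"], entry.get(filter_attribute)))
    return hits


names = ["hermes.apache.org", "apache.org", "org"]
client_hits = lookup(ENTRIES, names, "spamAssassinFilterClient")
helo_hits = lookup(ENTRIES, names, "spamAssassinFilterHelo")
```

Here the same entry matches both searches by name, but contributes a tag only for the client hostname filter, since it has no HELO filter attribute.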
Furthermore, each filter can have one of up to four scoring values, ranging from full blacklist to full whitelist, with two shades of grey in between. Each of the filter values will be converted to a numeric score when SpamAssassin finishes processing the message. In this regard, it's possible for a message to be blacklisted due to one value but whitelisted due to another, and for the final score to be neutral because the values sum to zero.
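The way opposing tags can cancel out can be shown with a short Python sketch. The tag names other than LIGHTLISTED and all of the numeric weights below are assumptions chosen for illustration; the actual names and scores are set by the plugin's configuration. The only convention relied on is SpamAssassin's own: positive scores indicate spam and negative scores indicate ham.

```python
# Hypothetical tag names and weights. Only LIGHTLISTED appears in the
# plugin's debug output; the other names and all numbers are assumed.
SCORES = {
    "BLACKLISTED": 10.0,
    "DARKLISTED": 2.0,
    "LIGHTLISTED": -2.0,
    "WHITELISTED": -10.0,
}


def total_score(tags, scores=SCORES):
    """Sum the numeric score for every tag collected from a message."""
    return sum(scores[tag] for tag in tags if tag in scores)


# One filter blacklists the message, another whitelists it.
neutral = total_score(["BLACKLISTED", "WHITELISTED"])
leaning_ham = total_score(["DARKLISTED", "LIGHTLISTED", "LIGHTLISTED"])
```

In the first case the two tags cancel and the message ends up neutral; in the second, two light matches outweigh one dark match and the result leans toward ham.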
To use this plugin, perform the following steps:
- Review the manual page for LDAPfilter to learn about the module's configuration options.
- Download spamassassin-ldapfilter.0.9.tar.gz to a temporary directory on the SpamAssassin host system.
- Expand the archive with the command tar -xvzf spamassassin-ldapfilter.0.9.tar.gz, and copy the files from the spamassassin-ldapfilter directory that is created to the main SpamAssassin directory.
- Import the spamAssassinFilter.schema file into the LDAP server's configuration, and restart the LDAP server if necessary. The schema definition file is available separately if you wish to review it beforehand.
- Change to the main SpamAssassin directory, and use a text editor of your choice to review the contents of the ldapfilter.cf configuration file, especially the LDAP session options. By default, LDAPfilter will try to use DNS SRV resource records to locate a local LDAP server, and will then attempt to bind to that server with anonymous credentials. If either of these actions fails, LDAPfilter will exit gracefully. If you need to explicitly identify the LDAP server, the transport protocol, or the authentication credentials to use for the LDAP queries, edit the appropriate fields in the ldapfilter.cf configuration file.
- Execute the command spamassassin --lint to verify that SpamAssassin operates as expected.
- Create the appropriate entries in your LDAP server, and assign the appropriate security permissions.
Once the software has been installed and configured, incoming messages will begin to be processed through LDAPfilter.
LDAPfilter implements support for SpamAssassin debug output, which can be activated by adding the --debug parameter to the SpamAssassin command line. The module-specific debug messages are marked with the string "LDAPfilter:" as shown in the example below:
LDAPfilter: SMTP client IP address determined as "22.214.171.124/32"
LDAPfilter: searching for (|(mailFilterName=126.96.36.199/32)
    (mailFilterName=188.8.131.52/29)(mailFilterName=184.108.40.206/28)
    (mailFilterName=220.127.116.11/27)(mailFilterName=18.104.22.168/26)
    (mailFilterName=22.214.171.124/25)(mailFilterName=126.96.36.199/24)
    (mailFilterName=188.8.131.52/23)(mailFilterName=184.108.40.206/22)
    (mailFilterName=220.127.116.11/21)(mailFilterName=18.104.22.168/20)
    (mailFilterName=22.214.171.124/19)(mailFilterName=126.96.36.199/18)
    (mailFilterName=188.8.131.52/17)(mailFilterName=184.108.40.206/16)
    (mailFilterName=220.127.116.11/15)(mailFilterName=18.104.22.168/14)
    (mailFilterName=22.214.171.124/13)(mailFilterName=126.96.36.199/12)
    (mailFilterName=188.8.131.52/11)(mailFilterName=184.108.40.206/10)
    (mailFilterName=220.127.116.11/9)(mailFilterName=18.104.22.168/8))
LDAPfilter: no entries were returned
LDAPfilter: SMTP client domain name determined as "hermes.apache.org"
LDAPfilter: searching for (|(mailFilterName=hermes.apache.org)
    (mailFilterName=apache.org)(mailFilterName=org))
LDAPfilter: "cn=Friendly Nets,ou=Mail-Filters,ou=Serv..." has a spamAssassinFilterClient attribute with the value of LIGHTLISTED
LDAPfilter: SMTP client HELO identifier determined as "mail.apache.org"
LDAPfilter: searching for (|(mailFilterName=mail.apache.org)
    (mailFilterName=apache.org)(mailFilterName=org))
LDAPfilter: "cn=Friendly Nets,ou=Mail-Filters,ou=Serv..." does not have a spamAssassinFilterHelo attribute