The SpamAssassin Network Protocol
=================================

(This document is perpetually in draft status)

Synopsis
--------

The protocol for communication between spamc/spamd is somewhat HTTP-like.  The
conversation looks like:

               spamc --> PROCESS SPAMC/1.2\r\n
               spamc --> Content-length: <size>\r\n
  (optional)   spamc --> User: <username>\r\n
               spamc --> [optional \r\n-delimited headers...]
               spamc --> \r\n [blank line]
               spamc --> --message sent here--

               spamd --> SPAMD/1.1 0 EX_OK\r\n
               spamd --> Content-length: <size>\r\n
               spamd --> [optional \r\n-delimited headers...]
               spamd --> \r\n [blank line]
               spamd --> --processed message sent here--

After each side is done writing, it shuts down its side of the connection.

The first line from spamc is the command for spamd to execute (PROCESS a
message is the command in protocol<=1.5) followed by the protocol version.

There may be additional headers following the command, which are as yet
undefined.  Servers should ignore these, and keep looking for headers which
they do support, or the "\r\n\r\n" end-of-headers marker.
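
As a rough illustration of the request framing described above, the following
Python sketch assembles a request as bytes.  The function name build_request
and its parameters are hypothetical (they are not part of spamc or spamd), and
the protocol version 1.5 is chosen arbitrarily for the example.

```python
def build_request(command, message, user=None, version="1.5"):
    """Assemble a request: command line, headers, blank line, then the body.

    Hypothetical helper for illustration; `message` must be bytes.
    """
    lines = [
        f"{command} SPAMC/{version}",
        f"Content-length: {len(message)}",
    ]
    if user is not None:
        lines.append(f"User: {user}")  # optional header
    # A blank line ("\r\n\r\n" overall) marks the end of the headers.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + message


request = build_request("PROCESS", b"Subject: hi\r\n\r\nhello\r\n", user="alice")
```

After sending these bytes, a client would shut down the writing side of its
socket and then read the response.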

The first line of the response from spamd is the protocol version (note this is
SPAMD here, where it was SPAMC on the other side) followed by a response code
from sysexits.h followed by a response message string which describes the error
if there was one.  If the response code is not 0, then the processed message
will not be sent, and the socket will be closed after the first line is sent.

Again, there may be additional headers following the response line, which are
as yet undefined.  Clients should ignore these, and keep looking for headers
which they do support, or the "\r\n\r\n" end-of-headers marker.
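
The response framing can be handled along the following lines.  This is a
minimal sketch, not spamc's actual parser: parse_response is a hypothetical
name, header names are lower-cased as a convenience (an assumption, not a
stated rule of the protocol), and a repeated header simply overwrites the
earlier value.  Unknown headers are kept in the dictionary so that callers can
ignore them, as the text above asks.

```python
def parse_response(data):
    """Split raw response bytes into (version, code, message, headers, body)."""
    # Everything before the first blank line is the response line plus headers.
    head, _sep, body = data.partition(b"\r\n\r\n")
    lines = head.decode("ascii", "replace").split("\r\n")

    proto, code, message = lines[0].split(" ", 2)
    if not proto.startswith("SPAMD/"):
        raise ValueError("not a spamd response: %r" % lines[0])

    headers = {}
    for line in lines[1:]:
        name, colon, value = line.partition(":")
        if colon:  # skip empty or malformed lines instead of failing
            headers[name.strip().lower()] = value.strip()

    return proto[len("SPAMD/"):], int(code), message, headers, body
```

A caller would check that the returned code is 0 before using the body, since
an error response consists of the first line only.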


Commands
--------

The following commands are defined as of protocol 1.5:

CHECK         --  Just check if the passed message is spam or not and reply as
                  described below

SYMBOLS       --  Check if message is spam or not, and return score plus list
                  of symbols hit

REPORT        --  Check if message is spam or not, and return score plus report

REPORT_IFSPAM --  Check if message is spam or not, and return score plus report
                  if the message is spam

SKIP          --  Ignore this message -- client opened connection then changed
                  its mind

PING          --  Return a confirmation that spamd is alive.

PROCESS       --  Process this message as described above and return modified
                  message

TELL          --  Tell what type of message we are to process and what should
                  be done with that message.  This includes setting or
                  removing the message in a local or a remote database
                  (learning, reporting, forgetting, revoking).

HEADERS       --  Same as PROCESS, but return only modified headers, not body
                  (new in protocol 1.4)


The CHECK command returns just a header (terminated by "\r\n\r\n") whose first
line is as for PROCESS (i.e. a response code and message), followed by a header
called "Spam:" with a value of either "True" or "False", then a semicolon, then
the score for this message, " / ", and then the threshold.  So the entire
response looks like either:

	SPAMD/1.1 0 EX_OK\r\n
	Spam: True ; 15 / 5\r\n

or

	SPAMD/1.1 0 EX_OK\r\n
	Spam: False ; 2 / 5\r\n

There may be additional headers following the "Spam:" header, which
are as yet undefined.  Clients should ignore these, and keep looking for
headers which they do support, or the "\r\n\r\n" end-of-headers marker.
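
The value of the "Spam:" header can be picked apart with a small regular
expression.  The sketch below is illustrative only: parse_spam_header is a
hypothetical name, and accepting decimal or negative scores and matching
"True"/"False" case-insensitively are assumptions made for leniency rather
than rules stated in this document.

```python
import re

# "True ; 15 / 5"  ->  verdict ; score / threshold
SPAM_RE = re.compile(
    r"^\s*(True|False)\s*;\s*(-?\d+(?:\.\d+)?)\s*/\s*(-?\d+(?:\.\d+)?)\s*$",
    re.IGNORECASE,
)


def parse_spam_header(value):
    """Return (is_spam, score, threshold) from the value of a Spam: header."""
    match = SPAM_RE.match(value)
    if match is None:
        raise ValueError("unrecognised Spam: header value: %r" % value)
    is_spam = match.group(1).lower() == "true"
    return is_spam, float(match.group(2)), float(match.group(3))
```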


The SYMBOLS command returns the same as CHECK, followed by a line listing all
the rule names, separated by commas.  Note that, due to an oversight, some
versions of the protocol terminate this line with "\r\n" and some do not; so
clients should be flexible on whether or not a CR-LF pair follows the symbol
text, and how many CR-LFs there are.  From protocol version 1.3 onwards, the
line is never terminated with "\r\n".
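
One way for a client to be flexible about trailing CR-LFs is to strip them
before splitting on commas.  A minimal sketch, assuming the body has already
been separated from the headers; parse_symbols is a hypothetical name and the
rule names in the usage line are only examples.

```python
def parse_symbols(body):
    """Return the list of rule names from a SYMBOLS response body (bytes)."""
    # Drop any number of trailing (or leading) CR/LF characters.
    text = body.decode("ascii", "replace").strip("\r\n")
    # Ignore empty fields so that an empty body yields an empty list.
    return [name.strip() for name in text.split(",") if name.strip()]


symbols = parse_symbols(b"BAYES_99,HTML_MESSAGE\r\n")
```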


REPORT command returns the same as CHECK, followed immediately by the report
generated by spamd:

	SPAMD/1.1 0 EX_OK\r\n
	Spam: True ; 15 / 5\r\n
	\r\n
	This mail is probably spam.  The original message has been attached
	along with this report, so you can recognize or block similar unwanted
	mail in future.  See http://spamassassin.apache.org/tag/ for more details.
	[...]

Note that the superfluous-score/threshold-line bug that appeared in
SpamAssassin 2.5x is fixed.

Clients should be flexible on whether or not a CR-LF pair follows
the report text, and how many CR-LFs there are.


REPORT_IFSPAM returns the same as REPORT if the message is spam, or nothing at
all if the message is non-spam.


The PING command does not actually trigger any spam checking, and (as with
SKIP) no additional headers are expected. It returns a simple confirmation
response, like this:

	SPAMD/1.5 0 PONG\r\n

This facility may be useful for monitoring programs which wish to check that
the daemon is alive and providing at least a basic response within a reasonable
time frame.

Note that since the protocol version 1.5, a client sending a PING command
is required to follow the command (and a null header) with an empty line,
for consistency with other commands (fixes bug 6187).
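
A monitoring program could use PING roughly as follows.  This is a sketch
under stated assumptions, not a reference client: the names PING_REQUEST,
is_pong and ping are hypothetical, and the host and port defaults (localhost,
783) are assumptions about the local setup that should be adjusted as needed.

```python
import socket

# Command line followed by an empty line, as required since protocol 1.5.
PING_REQUEST = b"PING SPAMC/1.5\r\n\r\n"


def is_pong(response):
    """True if the first response line looks like 'SPAMD/<version> 0 PONG'."""
    first_line = response.split(b"\r\n", 1)[0]
    parts = first_line.split(b" ", 2)
    return (
        len(parts) == 3
        and parts[0].startswith(b"SPAMD/")
        and parts[1:] == [b"0", b"PONG"]
    )


def ping(host="localhost", port=783, timeout=5.0):
    """Send PING to a daemon and report whether it answered with PONG."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(PING_REQUEST)
        sock.shutdown(socket.SHUT_WR)  # done writing; now read the reply
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return is_pong(b"".join(chunks))
```

The timeout gives the "reasonable time frame" a concrete value; a timed-out or
refused connection raises an OSError that the caller can treat as a failed
check.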


TELL accepts three new headers, Message-class, Set and Remove, and may return
two headers, DidSet and DidRemove, which indicate which action was taken.  It
is up to the caller to determine whether the proper action happened.
Here are some examples:

To learn a message as spam:

	TELL SPAMC/1.3\r\n
	Message-class: spam\r\n
	Set: local\r\n

To forget a learned message:

	TELL SPAMC/1.3\r\n
	Remove: local\r\n

To report a spam message:

	TELL SPAMC/1.3\r\n
	Message-class: spam\r\n
	Set: local, remote\r\n

To revoke a ham message:

	TELL SPAMC/1.3\r\n
	Message-class: ham\r\n
	Set: local\r\n
	Remove: remote\r\n
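
The TELL examples above show only the command line and the TELL-specific
headers.  The sketch below builds a complete request around them; build_tell
and its parameters are hypothetical names, and appending a Content-length
header and the message body is an assumption based on the general request
format described in the Synopsis.

```python
def build_tell(message, message_class=None, set_=(), remove=(), version="1.3"):
    """Assemble a TELL request as bytes.  `message` must be bytes."""
    lines = [f"TELL SPAMC/{version}"]
    if message_class:
        lines.append(f"Message-class: {message_class}")
    if set_:
        lines.append("Set: " + ", ".join(set_))
    if remove:
        lines.append("Remove: " + ", ".join(remove))
    # Assumption: the body is framed like any other request.
    lines.append(f"Content-length: {len(message)}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii") + message


# "Report a spam message": learn locally and report remotely.
report = build_tell(b"Subject: x\r\n\r\nbody\r\n", "spam", set_=("local", "remote"))
```

The caller would then look for DidSet and DidRemove in the response headers to
see which actions were actually taken.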

HEADERS returns the same as PROCESS, up to and including the double-newline
separator between message headers and body -- but stops there.  It was
added in SpamAssassin 3.2.0.  Note that this requires protocol version
1.4.



Headers
-------

The following optional headers are defined as of protocol 1.4:

Content-length

    Length of a request or response body, in bytes (generally a requirement
    as of protocol version 1.2 onwards)

Spam

    See above; used in server responses to the CHECK command.

User

    Username of the user on whose behalf this scan is being performed. The
    meaning of this is up to the server; format is that of a traditional UNIX
    username ([-A-Za-z0-9_]+).

Compress

    An optional header, sent by the client to the server, whose value may
    consist of the string "zlib", indicating that the message body transmitted
    by the client is compressed using Zlib compression.  (This is new in
    SpamAssassin 3.2.0.)

As-yet-undefined headers should not be treated as errors, and instead
should be ignored.  Multiple headers can appear in requests and responses
(this was not clearly defined until protocol version 1.3).
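
As an illustration of the Compress header, the sketch below compresses a
message body with Python's standard zlib module before framing it.  The
function name build_compressed_request is hypothetical, and it assumes that
Content-length counts the bytes actually transmitted (that is, the compressed
size); this document does not state that explicitly.

```python
import zlib


def build_compressed_request(command, message, version="1.5"):
    """Assemble a request whose body is zlib-compressed.  `message` is bytes."""
    payload = zlib.compress(message)
    head = (
        f"{command} SPAMC/{version}\r\n"
        "Compress: zlib\r\n"
        # Assumption: length of the compressed payload, not of the original.
        f"Content-length: {len(payload)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + payload


original = b"Subject: hi\r\n\r\nhello\r\n"
compressed_request = build_compressed_request("CHECK", original)
```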