How to use Email Spam Filter
The Email Spam Filter lets your services handle inbound emails in a reliable and structured way. Instead of writing your own parser or classifier, you can send raw EML (RFC 822) files to Sentinel’s HTTP API and receive back a parsed email object along with a detailed classification report.
This guide walks you through the setup, API usage, and configuration options to help you integrate the Email Spam Filter into your applications and workflows.
Setup
- Ensure you have Sentinel installed.
- Create a new API key and assign it to a Security Group with Restricted access level (the EML API is non-public).
Receiving Emails
Sentinel does not include an SMTP server. You need to handle email reception separately and then forward the messages to Sentinel for analysis. Common options include:
- Fetching new emails directly from your mail server
- Using a cloud provider like AWS SES for inbound email
In all cases, emails are exchanged in EML (RFC 822) format, which is the raw format you must submit to Sentinel’s API for parsing and classification.
Parsing and Classifying Emails
To process an email in EML format, use the POST /v1/eml endpoint:
curl -X POST http://localhost:8080/v1/eml \
  -H "Content-Type: application/octet-stream" \
  -H "Authorization: Bearer {API_KEY}" \
  --data-binary @email.eml
{
  "authentication": {
    "ARC": {
      "comment": "i=1 spf=pass dkim=pass dkdomain=example.com dmarc=pass fromdomain=example.com",
      "result": "pass"
    },
    "DKIM": [
      { "comment": "invalid public key", "result": "neutral", "signingDomain": "example.com" }
    ],
    "SPF": {
      "comment": "example.host: domain of hello@example.com designates 1.2.3.4 as permitted sender",
      "result": "pass"
    }
  },
  "classification": {
    "classification": "GOOD",
    "score": 0.5,
    "email": {
      "rules": {
        "DISPOSABLE": { "score": 0 },
        "DMARC": { "score": 0 },
        "FREE_PROVIDER": { "score": 0 },
        "MX": { "score": 0 }
      },
      "score": 0,
      "time": 0.334,
      "triggeredRules": []
    },
    "ip": null,
    "location": null,
    "rateLimit": null,
    "similarity": null,
    "text": {
      "classifier": "en",
      "language": "en",
      "rules": {
        "CAPITALIZATION": { "score": 0 },
        "CURRENCY": { "score": 0 },
        "EMOJI": { "score": 0 },
        "EXCLAMATION": { "score": 0 },
        "HASH_TAGS": { "score": 0 },
        "HTML": { "score": 0 },
        "HTML_INJECTION": { "score": 0 },
        "NUMBERS_ONLY": { "score": 0 },
        "PROFANITY": { "score": 0 },
        "RANDOM_CHARS": { "score": 0 },
        "SHORT_TEXT": { "score": 1 },
        "SPAM_WORDS": { "score": 0 },
        "SPECIAL_CHARS": { "score": 0 },
        "SQL_INJECTION": { "score": 0 },
        "UNEXPECTED_LANGUAGE": { "score": 0 },
        "URL": { "score": 0 }
      },
      "score": 1,
      "time": 0.173,
      "triggeredRules": [ "SHORT_TEXT" ]
    },
    "triggeredRules": [ "SHORT_TEXT" ]
  },
  "mail": {
    "attachments": [],
    "cc": null,
    "from": [ { "address": "test@example.com", "name": "" } ],
    "headers": [
      { "name": "subject", "value": "Test email" },
      { "name": "from", "value": { "address": "test@example.com", "name": "" } },
      { "name": "content-type", "params": { "boundary": "aaaaa" }, "value": "multipart/mixed" }
    ],
    "html": "<h1>Heading 1</h1>\n\n<p>Paragraph</p>",
    "inReplyTo": null,
    "messageId": null,
    "priority": null,
    "replyTo": null,
    "subject": "Test email",
    "text": "Hello World",
    "to": null
  },
  "rules": {
    "ARC": { "score": null },
    "CLASSIFICATION": { "score": 0.5 },
    "DELIVERED_TO_MISMATCH": { "score": 0 },
    "DKIM": { "score": null },
    "FROM_SPOOFING": { "score": 0 },
    "NO_SUBJECT": { "score": 0 },
    "NO_TEXT": { "score": 0 },
    "REPLY_TO_SPOOFING": { "score": 0 },
    "SPF": { "score": null },
    "UNDISCLOSED_RECIPIENTS": { "score": 1 }
  },
  "score": 1.5,
  "spam": false,
  "time": 5.189
}
The command above submits a raw EML file to the endpoint: the --data-binary @email.eml parameter sends the file email.eml from your current working directory as the request body.
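If your service is written in Python, the same request can be sketched with the standard library only. The helper names build_eml_request and classify_eml are hypothetical, and the URL is simply the one from the curl example; adjust it to your deployment.

```python
import json
import urllib.request

# Same host and port as the curl example; change for your deployment.
SENTINEL_URL = "http://localhost:8080/v1/eml"


def build_eml_request(eml_bytes: bytes, api_key: str,
                      url: str = SENTINEL_URL) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/eml request."""
    return urllib.request.Request(
        url,
        data=eml_bytes,  # raw EML bytes, sent unmodified as the request body
        method="POST",
        headers={
            "Content-Type": "application/octet-stream",
            "Authorization": f"Bearer {api_key}",
        },
    )


def classify_eml(eml_bytes: bytes, api_key: str,
                 url: str = SENTINEL_URL) -> dict:
    """Send the raw EML and return the decoded JSON response."""
    request = build_eml_request(eml_bytes, api_key, url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Reading the file in binary mode (open("email.eml", "rb").read()) mirrors what --data-binary does and avoids altering line endings in the message.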
Options
Processing configuration can be provided using HTTP headers:
- X-Authenticate: Whether to perform ARC, DKIM, and SPF authentication (boolean, defaults to false).
- X-Attachments-Upload: Whether to upload attachments to the configured upload storage. If enabled, the content property is null and contentUri is returned instead (boolean, defaults to false).
- X-Attachments-Size-Limit: The maximum size of an attachment to be processed (integer, defaults to 5000000, i.e. 5 MB).
- X-Disable-Rules: A comma-separated list of rules to disable.
- X-Similarity-Groups: A comma-separated list of similarity group names (training data) to check against.
- X-Mail-From: The sender address received from MAIL FROM (defaults to the Return-Path value).
- X-Smtp-Ip: The IP address of the remote SMTP relay or client.
- X-Smtp-Helo: The hostname provided in the HELO/EHLO command.
- X-Smtp-Mta: The hostname of the server performing the authentication (defaults to the host’s own hostname).
- X-Trust-Authentication: Whether to parse and trust the last Authentication-Results header for authentication (boolean, defaults to true).
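As an illustration, the options can be collected into a header dictionary and merged into the headers of the POST /v1/eml request. This is a minimal sketch: build_option_headers is a hypothetical helper, it covers only some of the headers listed above, and it assumes booleans are sent as the lowercase strings true and false (as in X-Authenticate: true) and that list values are joined with commas without spaces.

```python
from typing import Optional, Sequence


def build_option_headers(
    authenticate: bool = False,
    attachments_upload: bool = False,
    attachments_size_limit: Optional[int] = None,
    disable_rules: Sequence[str] = (),
    similarity_groups: Sequence[str] = (),
    mail_from: Optional[str] = None,
    smtp_ip: Optional[str] = None,
    smtp_helo: Optional[str] = None,
) -> dict:
    """Return only the option headers that differ from the documented defaults."""
    headers = {}
    if authenticate:
        headers["X-Authenticate"] = "true"
    if attachments_upload:
        headers["X-Attachments-Upload"] = "true"
    if attachments_size_limit is not None:
        headers["X-Attachments-Size-Limit"] = str(attachments_size_limit)
    if disable_rules:
        headers["X-Disable-Rules"] = ",".join(disable_rules)
    if similarity_groups:
        headers["X-Similarity-Groups"] = ",".join(similarity_groups)
    # SMTP envelope details, relevant when full authentication is enabled.
    for name, value in (
        ("X-Mail-From", mail_from),
        ("X-Smtp-Ip", smtp_ip),
        ("X-Smtp-Helo", smtp_helo),
    ):
        if value is not None:
            headers[name] = value
    return headers
```

Omitting headers that match the defaults keeps requests small and leaves the server-side defaults in charge.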
Authentication
Email authentication refers to verifying ARC, DKIM, and SPF, the mechanisms that establish the authenticity of the sender and the content.
By default, authentication is performed by checking the last Authentication-Results header, which should be added by your receiving mail server.
Alternatively, you can enable full authentication by sending X-Authenticate: true, which performs the necessary verifications. This requires DNS lookups, which can negatively impact performance.
When using X-Authenticate: true, include the X-Mail-From and X-Smtp-* headers to enable proper SPF authentication.
The results of the authentication are returned in the authentication property of the response:
{
  "authentication": {
    "ARC": {
      "comment": "i=1 spf=pass dkim=pass dkdomain=example.com dmarc=pass fromdomain=example.com",
      "result": "pass"
    },
    "DKIM": [
      { "comment": "invalid public key", "result": "neutral", "signingDomain": "example.com" }
    ],
    "SPF": {
      "comment": "example.host: domain of hello@example.com designates 1.2.3.4 as permitted sender",
      "result": "pass"
    }
  }
}
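Note that ARC and SPF are single objects while DKIM is a list with one entry per signature. A consumer might normalize both shapes before inspecting them. The sketch below assumes only the structure shown in the sample; authentication_results and not_passing are hypothetical helper names, and treating a missing or null entry as "no results" is an assumption.

```python
def authentication_results(auth) -> dict:
    """Map each mechanism name to the list of its result strings."""
    summary = {}
    for name, value in (auth or {}).items():
        # DKIM is a list of signature results; ARC and SPF are single objects.
        entries = value if isinstance(value, list) else [value]
        summary[name] = [entry["result"] for entry in entries if entry]
    return summary


def not_passing(auth) -> list:
    """Names of mechanisms with at least one result other than 'pass'."""
    return [
        name
        for name, results in authentication_results(auth).items()
        if any(result != "pass" for result in results)
    ]
```

For the sample above, not_passing reports only DKIM, because its single signature has the result neutral.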
Parsing
The response from the POST /v1/eml endpoint contains the mail property, which holds the parsed email data, including a list of attachments. Services that process inbound email can consume this JSON directly, without implementing their own parsing logic.
{
  "mail": {
    "attachments": [],
    "date": "2025-09-01T10:16:09.000Z",
    "from": [{ "address": "hello@example.com", "name": "Hello" }],
    "headers": [],
    "html": null,
    "subject": "Test email",
    "text": "Hello world...",
    "to": [{ "address": "me@example.com", "name": "Me" }]
  }
}
For the full response schema, see the POST /v1/eml endpoint documentation.
Attachments
Sentinel automatically parses attachments included in the EML file.
By default, the contents of the attachments are returned as Base64-encoded strings. This method is not suitable for large attachments, and it is recommended to upload attachments to the upload storage instead.
To upload attachments to the upload storage, send the X-Attachments-Upload: true header. Each attachment in the response will then contain a contentUri property, whose target can be downloaded using the GET /v1/blobs/{key} API endpoint.
The X-Attachments-Size-Limit header controls the maximum size of attachments that will be processed. If an attachment exceeds the limit, its metadata still appears in the parsed response under attachments, but its content is ignored: the API returns content: null and contentUri: null.
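The three cases described above (inline Base64 content, uploaded content, and content skipped because of the size limit) can be told apart from the content and contentUri properties alone. A minimal sketch, in which attachment_source is a hypothetical helper and no other attachment fields are assumed:

```python
import base64


def attachment_source(attachment: dict) -> tuple:
    """Classify one entry of mail.attachments by where its content lives."""
    content = attachment.get("content")
    content_uri = attachment.get("contentUri")
    if content is not None:
        # Default mode: the content is a Base64-encoded string.
        return ("inline", base64.b64decode(content))
    if content_uri is not None:
        # X-Attachments-Upload: true, so the content lives in upload storage.
        return ("blob", content_uri)
    # Both null: the attachment exceeded X-Attachments-Size-Limit.
    return ("skipped", None)
```

Handling the "skipped" case explicitly avoids treating an oversized attachment as an empty file.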
Downloading Attachments
When using X-Attachments-Upload: true, attachments are uploaded to the configured upload storage and a contentUri property is returned with each attachment in the following format:
blob://uploads/eml/attachments/2025-09-01/2c39908c3aa32a6484fd405a7c2f782e.png?size=11998&type=image%2Fpng&filename=image.png
To download an attachment, use the GET /v1/blobs/{key} endpoint and pass the path returned in contentUri as the key parameter. For example, for the blob URI above, the download request is:
GET /v1/blobs/eml/attachments/2025-09-01/2c39908c3aa32a6484fd405a7c2f782e.png?size=11998&type=image%2Fpng&filename=image.png
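The mapping from contentUri to the download path can be sketched as follows. The rule used here (drop the blob:// scheme and the storage name uploads, keep the path and query string unchanged) is inferred from the single example above rather than from a specification, and blob_download_path is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def blob_download_path(content_uri: str) -> str:
    """Turn a blob:// contentUri into a GET /v1/blobs/{key} request path."""
    parts = urlsplit(content_uri)
    if parts.scheme != "blob":
        raise ValueError(f"not a blob URI: {content_uri!r}")
    # parts.netloc is the storage name ("uploads" in the example) and is
    # dropped; parts.path starts with "/" and becomes the key.
    path = "/v1/blobs" + parts.path
    # The query string is kept as-is, without decoding or re-encoding.
    return f"{path}?{parts.query}" if parts.query else path
```

The returned path is relative to the Sentinel base URL.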
Classification
The email text is classified using the built-in Classifier, and the result is provided in the classification property of the response.
The response also includes an overall score and a spam flag. A score of 2 or higher is classified as spam and returned with spam: true.
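A service consuming the response usually needs only a few fields to act on the verdict. The sketch below pulls them out using the field names from the sample response earlier in this guide; summarize_verdict is a hypothetical helper, and the keys of the returned dictionary are chosen for illustration.

```python
def summarize_verdict(response: dict) -> dict:
    """Extract the overall verdict and the text classification summary."""
    classification = response.get("classification") or {}
    return {
        "spam": response.get("spam"),    # overall verdict
        "score": response.get("score"),  # overall score
        "text_label": classification.get("classification"),
        "text_triggered": classification.get("triggeredRules") or [],
    }
```

For the sample response (score 1.5), the summary reports spam as false with SHORT_TEXT as the only triggered text rule.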
In addition to the Classifier’s rules, there are several email-specific rules listed below.
Rules
- ARC: This rule matches if the ARC authentication does not successfully pass.
- CLASSIFICATION: The overall score of the text-based classification (see the classification property).
- DELIVERED_TO_MISMATCH: This rule matches if the Delivered-To and To headers do not match.
- DKIM: This rule matches if the DKIM authentication does not successfully pass.
- FROM_SPOOFING: This rule matches if the From address includes a different address in the name field.
- NO_SUBJECT: This rule matches if the Subject is empty.
- NO_TEXT: This rule matches if the email does not contain any text or HTML message.
- REPLY_TO_SPOOFING: This rule matches if the Reply-To address does not match the sender.
- SPF: This rule matches if the SPF authentication does not successfully pass.
- UNDISCLOSED_RECIPIENTS: This rule matches if there is no valid To address.
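To see which email-specific rules contributed to a result, you can filter the top-level rules property for non-zero scores. This sketch assumes, based on the sample response, that an unmatched rule has a score of 0 and that a rule which was not evaluated has a score of null; triggered_email_rules is a hypothetical helper name.

```python
def triggered_email_rules(response: dict) -> dict:
    """Return the email-specific rules with a non-zero score."""
    rules = response.get("rules") or {}
    return {
        name: rule["score"]
        for name, rule in rules.items()
        # Skip rules with score 0 (not matched) or null (not evaluated).
        if rule and rule.get("score")
    }
```

In the sample response, this yields CLASSIFICATION (0.5) and UNDISCLOSED_RECIPIENTS (1), which together add up to the overall score of 1.5.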
Detecting Phishing
There are two rules that indicate phishing attempts with a high degree of certainty:
- FROM_SPOOFING
- REPLY_TO_SPOOFING
Spoofing the sender’s and/or the reply-to address is a common practice in phishing emails: it lets the attacker impersonate a known or high-profile identity while directing replies to their own address.
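A service that wants to treat likely phishing differently from ordinary spam could check these two rules directly. The sketch assumes that a matched rule has a score greater than 0, which is inferred from the sample response, where unmatched rules have a score of 0; phishing_indicators is a hypothetical helper name.

```python
PHISHING_RULES = ("FROM_SPOOFING", "REPLY_TO_SPOOFING")


def phishing_indicators(response: dict) -> list:
    """Return the spoofing rules that matched in a POST /v1/eml response."""
    rules = response.get("rules") or {}
    return [
        name
        for name in PHISHING_RULES
        # A missing, null, or zero score is treated as "not matched".
        if (rules.get(name) or {}).get("score")
    ]
```

An empty list means neither spoofing rule matched; it says nothing about the other spam rules.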
Learning Spam
Email classification works out of the box without training data, but detection accuracy can be enhanced by using the Similarity and Training Data feature. This allows Sentinel to detect unwanted phrases or text segments in emails.
To enable similarity matching with training data, set the X-Similarity-Groups header and specify the groups to check against (in “partial” mode of the similarity detection).
This feature makes it possible to apply a more traditional approach to spam detection, using samples of known spam or user-reported messages. To add such examples, use the Training Data API.
Server Configuration
The body size limit for the POST /v1/eml endpoint is set by the environment variable EML_BODY_LIMIT, which defaults to 5MB. The API returns an error if the EML file is larger than this limit. To allow submission of larger EML files, increase the limit.