How to use Email Spam Filter
The Email Spam Filter lets your services handle inbound emails in a reliable and structured way. Instead of writing your own parser or classifier, you can send raw EML (RFC 822) files to Sentinel’s HTTP API and receive back a parsed email object along with a detailed classification report.
This guide walks you through the setup, API usage, and configuration options to help you integrate the Email Spam Filter into your applications and workflows.
Setup
- Ensure you have Sentinel installed.
- Create a new API key and assign it to a Security Group with Restricted access level (the EML API is non-public).
Receiving Emails
Sentinel does not include an SMTP server. You need to handle email reception separately and then forward the messages to Sentinel for analysis. Common options include:
- Fetching new emails directly from your mail server
- Using a cloud provider like AWS SES for inbound email
In all cases, emails are exchanged in EML (RFC 822) format, which is the raw format you must submit to Sentinel’s API for parsing and classification.
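As one way of fetching new emails directly from a mail server, the sketch below reads unseen messages over IMAP with Python's standard imaplib module and returns each one as raw RFC 822 bytes, ready to submit to Sentinel. It assumes the mailbox is reachable over IMAP with SSL; the helper names `open_inbox` and `fetch_unseen_eml` are illustrative and not part of Sentinel.

```python
import imaplib


def open_inbox(host, user, password):
    """Open an SSL IMAP connection and select the INBOX folder."""
    conn = imaplib.IMAP4_SSL(host)
    conn.login(user, password)
    conn.select("INBOX")
    return conn


def fetch_unseen_eml(conn):
    """Return the raw RFC 822 bytes of every unseen message.

    `conn` is an authenticated IMAP connection with a mailbox selected.
    Fetching RFC822 normally marks the messages as seen on the server.
    """
    status, data = conn.search(None, "UNSEEN")
    if status != "OK":
        return []

    messages = []
    for num in data[0].split():
        status, parts = conn.fetch(num, "(RFC822)")
        if status != "OK":
            continue
        # imaplib returns a list mixing (envelope, payload) tuples with bare
        # bytes such as b")"; only the tuples carry the message itself.
        for part in parts:
            if isinstance(part, tuple):
                messages.append(part[1])
    return messages
```

Each returned item can be sent unchanged as the request body of POST /v1/eml.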
Parsing and Classifying Emails
To process an email in EML format, use the POST /v1/eml endpoint:
curl -X POST http://localhost:8080/v1/eml \
  -H "Content-Type: application/octet-stream" \
  -H "Authorization: Bearer {API_KEY}" \
  --data-binary @email.eml
Example response:
{
  "authentication": {
    "ARC": {
      "comment": "i=1 spf=pass dkim=pass dkdomain=example.com dmarc=pass fromdomain=example.com",
      "result": "pass"
    },
    "DKIM": [
      {
        "comment": "invalid public key",
        "result": "neutral",
        "signingDomain": "example.com"
      }
    ],
    "SPF": {
      "comment": "example.host: domain of hello@example.com designates 1.2.3.4 as permitted sender",
      "result": "pass"
    }
  },
  "classification": {
    "classification": "GOOD",
    "score": 0.5,
    "email": {
      "rules": {
        "DISPOSABLE": { "score": 0 },
        "DMARC": { "score": 0 },
        "FREE_PROVIDER": { "score": 0 },
        "MX": { "score": 0 }
      },
      "score": 0,
      "time": 0.334,
      "triggeredRules": []
    },
    "ip": null,
    "location": null,
    "rateLimit": null,
    "similarity": null,
    "text": {
      "classifier": "en",
      "language": "en",
      "rules": {
        "CAPITALIZATION": { "score": 0 },
        "CURRENCY": { "score": 0 },
        "EMOJI": { "score": 0 },
        "EXCLAMATION": { "score": 0 },
        "HASH_TAGS": { "score": 0 },
        "HTML": { "score": 0 },
        "HTML_INJECTION": { "score": 0 },
        "NUMBERS_ONLY": { "score": 0 },
        "PROFANITY": { "score": 0 },
        "RANDOM_CHARS": { "score": 0 },
        "SHORT_TEXT": { "score": 1 },
        "SPAM_WORDS": { "score": 0 },
        "SPECIAL_CHARS": { "score": 0 },
        "SQL_INJECTION": { "score": 0 },
        "UNEXPECTED_LANGUAGE": { "score": 0 },
        "URL": { "score": 0 }
      },
      "score": 1,
      "time": 0.173,
      "triggeredRules": [ "SHORT_TEXT" ]
    },
    "triggeredRules": [ "SHORT_TEXT" ]
  },
  "mail": {
    "attachments": [],
    "cc": null,
    "from": [ { "address": "test@example.com", "name": "" } ],
    "headers": [
      { "name": "subject", "value": "Test email" },
      { "name": "from", "value": { "address": "test@example.com", "name": "" } },
      { "name": "content-type", "params": { "boundary": "aaaaa" }, "value": "multipart/mixed" }
    ],
    "html": "<h1>Heading 1</h1>\n\n<p>Paragraph</p>",
    "inReplyTo": null,
    "messageId": null,
    "priority": null,
    "replyTo": null,
    "subject": "Test email",
    "text": "Hello World",
    "to": null
  },
  "rules": {
    "ARC": { "score": null },
    "CLASSIFICATION": { "score": 0.5 },
    "DELIVERED_TO_MISMATCH": { "score": 0 },
    "DKIM": { "score": null },
    "FROM_SPOOFING": { "score": 0 },
    "NO_SUBJECT": { "score": 0 },
    "NO_TEXT": { "score": 0 },
    "REPLY_TO_SPOOFING": { "score": 0 },
    "SPF": { "score": null },
    "UNDISCLOSED_RECIPIENTS": { "score": 1 }
  },
  "score": 1.5,
  "spam": false,
  "time": 5.189
}
Submit raw EML file to the endpoint.
The --data-binary @email.eml parameter submits the file email.eml in your current working directory as the request body.
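For services that call the API from code rather than from the shell, the same request can be sketched with Python's standard library. The function names are illustrative, and the base URL simply mirrors the localhost address used in the curl example; adjust both to your deployment.

```python
import json
import urllib.request


def build_eml_request(eml_bytes, api_key, base_url="http://localhost:8080"):
    """Build (but do not send) the POST /v1/eml request for a raw EML payload."""
    return urllib.request.Request(
        url=f"{base_url}/v1/eml",
        data=eml_bytes,
        method="POST",
        headers={
            "Content-Type": "application/octet-stream",
            "Authorization": f"Bearer {api_key}",
        },
    )


def classify_eml(eml_bytes, api_key, base_url="http://localhost:8080"):
    """Send the EML to Sentinel and return the decoded JSON report."""
    request = build_eml_request(eml_bytes, api_key, base_url)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Reading the file with `open("email.eml", "rb").read()` and passing the bytes to `classify_eml` corresponds to `--data-binary @email.eml` in the curl command.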
Options
Processing configuration can be provided using HTTP headers:
- X-Authenticate: Whether to perform ARC, DKIM, SPF authentication (boolean, defaults to false).
- X-Attachments-Upload: Whether to upload attachments to the configured upload storage. If enabled, the content property is null and contentUri is returned instead (boolean, defaults to false).
- X-Attachments-Size-Limit: The maximum file size of an attachment to be processed (integer, defaults to 5000000, i.e. 5MB).
- X-Disable-Rules: A comma-separated list of rules to disable.
- X-Similarity-Groups: A comma-separated list of similarity group names (training data) which should be checked.
- X-Mail-From: The sender address received from MAIL FROM (defaults to the Return-Path value).
- X-Smtp-Ip: The IP address of the remote SMTP relay or client.
- X-Smtp-Helo: The hostname provided in the HELO/EHLO command.
- X-Smtp-Mta: Hostname of the server performing the authentication (defaults to the host’s own hostname).
- X-Trust-Authentication: Whether to parse and trust the last Authentication-Results header for authentication (boolean, defaults to true).
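The option headers above carry booleans, integers, and comma-separated lists, so a small encoder can keep call sites tidy. The sketch below is one possible helper, not part of Sentinel. It assumes booleans are sent as lowercase true/false (as in the X-Authenticate: true example) and that list items are joined with a bare comma; both are assumptions about the accepted wire format.

```python
def encode_option_headers(options):
    """Encode {header name: Python value} into string header values.

    None values are dropped so that the server-side defaults apply.
    """
    headers = {}
    for name, value in options.items():
        if value is None:
            continue
        # bool must be tested before the generic branch: str(True) is "True".
        if isinstance(value, bool):
            headers[name] = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            headers[name] = ",".join(value)
        else:
            headers[name] = str(value)
    return headers
```

The resulting dictionary can be merged into the request headers next to Content-Type and Authorization.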
Authentication
Email authentication refers to checks such as ARC, DKIM, and SPF, which verify the authenticity of the sender and the content.
By default, authentication is performed by checking the last Authentication-Results header, which should be added by your receiving mail server.
Alternatively, you can enable full authentication by sending X-Authenticate: true, which performs the necessary verifications. This requires DNS lookups, which can negatively impact performance.
When using X-Authenticate: true, include the X-Mail-From and X-Smtp-* headers to enable proper SPF authentication.
The results of the authentication are present in the authentication parameter of the response:
{
  "authentication": {
    "ARC": {
      "comment": "i=1 spf=pass dkim=pass dkdomain=example.com dmarc=pass fromdomain=example.com",
      "result": "pass"
    },
    "DKIM": [
      {
        "comment": "invalid public key",
        "result": "neutral",
        "signingDomain": "example.com"
      }
    ],
    "SPF": {
      "comment": "example.host: domain of hello@example.com designates 1.2.3.4 as permitted sender",
      "result": "pass"
    }
  }
}
Parsing
The response from the POST /v1/eml endpoint contains the mail property, which holds the parsed email data, including a list of attachments. Services processing inbound email can consume this JSON directly, without implementing their own parsing logic.
{
  "mail": {
    "attachments": [],
    "date": "2025-09-01T10:16:09.000Z",
    "from": [{ "address": "hello@example.com", "name": "Hello" }],
    "headers": [],
    "html": null,
    "subject": "Test email",
    "text": "Hello world...",
    "to": [{ "address": "me@example.com", "name": "Me" }]
  }
}
For the full response schema, see the POST /v1/eml endpoint documentation.
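To illustrate how a service might consume the mail property, the sketch below flattens it into the few fields a typical inbound handler needs. The function name and the shape of its return value are illustrative choices; the field names come from the sample responses in this guide, and address fields are treated as either null or a list of address/name objects, as those samples show.

```python
def summarize_mail(report):
    """Extract sender, recipients, subject, and body from a /v1/eml report."""
    mail = report["mail"]

    def addresses(field):
        # Address fields are null or a list of {"address": ..., "name": ...}.
        return [entry["address"] for entry in (mail.get(field) or [])]

    return {
        "from": addresses("from"),
        "to": addresses("to"),
        "subject": mail.get("subject"),
        # Prefer the plain-text body and fall back to HTML when it is missing.
        "body": mail.get("text") or mail.get("html"),
        "attachment_count": len(mail.get("attachments") or []),
    }
```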
Attachments
Sentinel automatically parses attachments included in the EML file.
By default, the contents of the attachments are returned as Base64-encoded strings. This method is not suitable for large attachments, and it is recommended to upload attachments to the upload storage instead.
To upload attachments to the upload storage, send the X-Attachments-Upload: true header. The response will contain the property contentUri which can be downloaded using the GET /v1/blobs/{key} API endpoint.
The X-Attachments-Size-Limit header controls the maximum size of attachments that will be processed. If an attachment is larger than the limit, its metadata still appears in the parsed response under attachments, but its content is ignored: the API returns content: null and contentUri: null.
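The three possible outcomes for an attachment (inline Base64 content, an uploaded blob, or content skipped because of the size limit) can be told apart from the content and contentUri properties alone. The sketch below is a minimal illustration with hypothetical helper names. The mapping from a blob URI to a download path, which strips the blob://uploads/ prefix, is an assumption inferred from the single example in the Downloading Attachments section.

```python
import base64

BLOB_PREFIX = "blob://uploads/"  # assumption, taken from the documented example URI


def blob_download_path(content_uri):
    """Turn a contentUri into the request path of GET /v1/blobs/{key}."""
    if not content_uri.startswith(BLOB_PREFIX):
        raise ValueError(f"unexpected contentUri: {content_uri!r}")
    return "/v1/blobs/" + content_uri[len(BLOB_PREFIX):]


def read_attachment(attachment):
    """Return (kind, value) describing how the attachment content was delivered.

    kind is "uploaded" (value: download path), "inline" (value: decoded
    bytes), or "skipped" (value: None, e.g. the size limit was exceeded).
    """
    if attachment.get("contentUri"):
        return "uploaded", blob_download_path(attachment["contentUri"])
    if attachment.get("content") is not None:
        return "inline", base64.b64decode(attachment["content"])
    return "skipped", None
```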
Downloading Attachments
When using X-Attachments-Upload: true, attachments are uploaded to the configured upload storage, and the contentUri parameter is returned with each attachment in the following format:
blob://uploads/eml/attachments/2025-09-01/2c39908c3aa32a6484fd405a7c2f782e.png?size=11998&type=image%2Fpng&filename=image.png
To download an attachment using contentUri, use the GET /v1/blobs/{key} endpoint and pass the path returned in contentUri as the key parameter.
For example, from the blob URI above, the download URL will be:
GET /v1/blobs/eml/attachments/2025-09-01/2c39908c3aa32a6484fd405a7c2f782e.png?size=11998&type=image%2Fpng&filename=image.png
Classification
The email text is classified using the built-in Classifier and the result is provided in the classification parameter of the response.
If spam is detected, the response includes spam: true along with a score. A score of 2 or higher is classified as spam.
In addition to the Classifier’s rules, there are several email-specific rules listed below.
Rules
- ARC: This rule matches if the ARC authentication does not successfully pass.
- CLASSIFICATION: The overall score of the text-based classification (see classification property).
- DELIVERED_TO_MISMATCH: This rule matches if the Delivered-To and To headers do not match.
- DKIM: This rule matches if the DKIM authentication does not successfully pass.
- FROM_SPOOFING: This rule matches if the From address includes a different address in the name field.
- NO_SUBJECT: This rule matches if the Subject is empty.
- NO_TEXT: This rule matches if the email does not contain any text or HTML message.
- REPLY_TO_SPOOFING: This rule matches if the Reply-To address does not match the sender.
- SPF: This rule matches if the SPF authentication does not successfully pass.
- UNDISCLOSED_RECIPIENTS: This rule matches if there is no valid To address.
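To see which email-specific rules contributed to a result, the top-level rules property of the response can be filtered for positive scores. The sketch below is illustrative only. It treats a null score as not contributing, which is an assumption based on the sample response, where the ARC, DKIM, and SPF rules carry null scores.

```python
def triggered_email_rules(report):
    """Return {rule name: score} for email rules with a positive score.

    Rules whose score is null or 0 are left out.
    """
    return {
        name: rule["score"]
        for name, rule in report["rules"].items()
        if (rule.get("score") or 0) > 0
    }
```

Applied to the sample response from this guide, the function returns CLASSIFICATION (0.5) and UNDISCLOSED_RECIPIENTS (1), which add up to the reported overall score of 1.5, below the spam threshold of 2.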
Detecting Phishing
There are two rules that indicate phishing attempts with a high degree of certainty:
- FROM_SPOOFING
- REPLY_TO_SPOOFING
Spoofing the sender’s and/or the reply-to address is a common practice in phishing emails, allowing the attacker to impersonate a known or high-profile identity while directing replies to their own address.
Additionally, when a verified phishing URL is detected with the Phishing Detection feature, the Classifier triggers the URL_PHISHING rule, providing a strong indication of a phishing attempt.
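A service that wants to quarantine likely phishing separately from ordinary spam could collect these indicators from the response. The sketch below is a hedged illustration: the function name is hypothetical, a spoofing rule is considered matched when its score is positive, and URL_PHISHING is assumed to appear in the triggeredRules list of the classification property, which this guide does not state explicitly.

```python
SPOOFING_RULES = ("FROM_SPOOFING", "REPLY_TO_SPOOFING")


def phishing_indicators(report):
    """Return the names of phishing-related rules that matched in a report."""
    rules = report.get("rules") or {}
    hits = [
        name
        for name in SPOOFING_RULES
        if ((rules.get(name) or {}).get("score") or 0) > 0
    ]
    # Assumption: the Classifier reports URL_PHISHING among its triggered rules.
    classification = report.get("classification") or {}
    if "URL_PHISHING" in (classification.get("triggeredRules") or []):
        hits.append("URL_PHISHING")
    return hits
```

An empty list means none of the indicators matched; it does not by itself mean the email is safe.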
Learning Spam
Email classification works out of the box without training data, but detection accuracy can be enhanced by using the Similarity and Training Data feature. This allows Sentinel to detect unwanted phrases or text segments in emails.
To enable similarity matching with training data, set the X-Similarity-Groups header and specify the groups to check against (similarity detection runs in “partial” mode).
This feature makes it possible to apply a more traditional approach to spam detection, using samples of known spam or user-reported messages. To add such examples, use the Training Data API.
Server Configuration
The body size limit for the POST /v1/eml endpoint is set by the environment variable EML_BODY_LIMIT, which defaults to 5MB. The API returns an error if the EML file is larger than this limit. To allow submission of larger EML files, increase the limit.
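A client can avoid a round trip that is certain to fail by checking the payload size before submitting. The sketch below assumes that 5MB means 5,000,000 bytes, matching the way the attachment size limit is expressed in this guide; the exact byte count used by the server is not confirmed here, so pass your configured value explicitly if it differs.

```python
DEFAULT_EML_BODY_LIMIT = 5_000_000  # assumption: "5MB" read as 5,000,000 bytes


def fits_body_limit(eml_bytes, limit=DEFAULT_EML_BODY_LIMIT):
    """Return True if the raw EML is not larger than the body size limit."""
    return len(eml_bytes) <= limit
```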