Classifier

The Classifier classifies text and related information to help filter spam and identify legitimate messages. It analyzes textual and contextual data using a built-in natural language processing engine and machine learning, and returns a numeric score: lower scores indicate legitimate messages, and scores above 2 indicate spam.

Note: The Classifier is an improved version of the Spam Filter previously offered as SaaS.

Feature Highlights

  • Comprehensive analysis of text, email addresses, device information, and IP addresses
  • Spam detection through pattern recognition and phrase analysis
  • Security protection against HTML/SQL injection patterns
  • Language detection (supports 160+ languages)
  • Geo-location identification from IP addresses or time zones
  • Data matching against training datasets
  • Full classification support for 19 languages, listed under Language Support (partial support for others)
  • High performance suitable for real-time classification (~10 ms HTTP round-trip for 10 KB texts)

Use Cases

  • Comprehensive anti-spam: Detect spam submitted through online forms or APIs by analyzing text and validating factors like email addresses and IP addresses
  • Email address validation: Identify fake or suspicious email addresses and distinguish between “free” and “work” emails
  • IP address validation: Determine if an IP address is associated with a proxy or TOR exit, and check against blocklists
  • Security firewall: Protect against common HTML and SQL injection attempts in text
  • Language detection: Automatically detect the language of the provided text (160+ languages supported)
  • Geo-location: Detect user location, commonly spoken languages, currency, and other information from IP addresses or time zones
  • Geo-fencing: Block specific countries, regions, or continents from accessing your website or APIs

Implementation Guide

The classifier endpoint is designed for back-end services. Below are common use cases:

Text Classification

Provide the text to classify (or individual fields as name-value pairs) to check the input for spam and security patterns.

POST /v1/classifier
Content-Type: application/json

{
  "text": "Example text to classify"
}
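
As a minimal sketch, the request above could be built from a back-end service in Python using only the standard library. The base URL `https://classifier.example.com` and the helper name `build_classifier_request` are placeholders rather than part of the documented API, and any authentication your deployment requires is omitted.

```python
import json
import urllib.request

# Hypothetical base URL; replace it with the address of your own deployment.
BASE_URL = "https://classifier.example.com"


def build_classifier_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request to /v1/classifier."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/classifier",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_classifier_request({"text": "Example text to classify"})

# Sending is left commented out because it needs a reachable endpoint
# (and possibly credentials that are not shown here):
# with urllib.request.urlopen(request, timeout=5) as response:
#     result = json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect and unit-test without network access.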

IP Address Classification

Provide the user’s IP address to resolve geo-location and other properties.

POST /v1/classifier
Content-Type: application/json

{
  "ip": "10.0.0.1"
}
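
Before sending an address, a back end may want to reject malformed input locally. The sketch below does this with Python's standard `ipaddress` module; the helper name `build_ip_payload` is illustrative only. Note that `ipaddress` accepts both IPv4 and IPv6, whereas this page only shows an IPv4 example, so IPv6 handling by the endpoint is an assumption to verify.

```python
import ipaddress


def build_ip_payload(raw_ip: str) -> dict:
    """Return an {"ip": ...} payload, raising ValueError for malformed input."""
    # ip_address() parses IPv4 and IPv6 literals and raises ValueError otherwise.
    address = ipaddress.ip_address(raw_ip.strip())
    return {"ip": str(address)}


payload = build_ip_payload(" 10.0.0.1 ")
```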

Email Address Classification

Provide the email domain to verify DNS records and check against known disposable or free email providers.

POST /v1/classifier
Content-Type: application/json

{
  "email": "@gmail.com"
}
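
Because the example sends only the domain part, a back end that holds full addresses would need to strip the local part first. The following is a minimal sketch of that step; the helper name `build_email_payload` is hypothetical, and whether the endpoint also accepts full addresses is not stated here.

```python
def build_email_payload(address: str) -> dict:
    """Reduce an email address to the "@domain" form used in the example."""
    # rpartition splits on the last "@", so the local part is discarded.
    _local, at, domain = address.strip().rpartition("@")
    if not at or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return {"email": "@" + domain.lower()}


payload = build_email_payload("Jane.Doe@Gmail.com")
```

Sending only the domain also avoids sharing the user's full address with the classifier when it is not needed.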

Device Classification

Provide HTTP headers from the user’s device to verify device information.

POST /v1/classifier
Content-Type: application/json

{
  "headers": {
    "Accept": "...",
    "Accept-Language": "...",
    "User-Agent": "..."
  },
  "ip": "10.0.0.1"
}
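
One way to assemble this payload is to copy the relevant headers from the incoming request your back end received. The sketch below forwards only the three headers shown in the example; whether the Classifier uses additional headers is not stated here, and the names `FORWARDED_HEADERS` and `build_device_payload` are illustrative.

```python
# The three header names that appear in the example request.
FORWARDED_HEADERS = ("Accept", "Accept-Language", "User-Agent")


def build_device_payload(request_headers: dict, client_ip: str) -> dict:
    """Pick the example's headers out of an incoming request's headers."""
    # HTTP header names are case-insensitive, so index them by lower-case name.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    headers = {
        name: lowered[name.lower()]
        for name in FORWARDED_HEADERS
        if name.lower() in lowered
    }
    return {"headers": headers, "ip": client_ip}


incoming = {
    "user-agent": "ExampleBrowser/1.0",
    "ACCEPT": "text/html",
    "Cookie": "session=abc",  # not forwarded
}
payload = build_device_payload(incoming, "10.0.0.1")
```

Using an explicit allow-list keeps cookies and authorization headers out of the classification request.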

Rate Limiter

Send a custom rateLimit object to limit user requests.

POST /v1/classifier
Content-Type: application/json

{
  "rateLimit": {
    "limit": "10/1h"
  }
}

For more about rate limiters, see the Rate-Limiters guide.
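
The sketch below composes a rateLimit object shaped like the example. It assumes that the limit string reads as "<count>/<window>" (so "10/1h" would mean 10 requests per hour); that reading is inferred from the single example on this page, so check the Rate-Limiters guide for the authoritative syntax. The helper name `build_rate_limit_payload` is hypothetical.

```python
def build_rate_limit_payload(max_requests: int, window: str) -> dict:
    """Compose a rateLimit object shaped like the documented example."""
    if max_requests <= 0:
        raise ValueError("max_requests must be a positive integer")
    if not window:
        raise ValueError("window must be a non-empty string such as '1h'")
    # Assumption: the limit string reads as "<count>/<window>", as in "10/1h".
    return {"rateLimit": {"limit": f"{max_requests}/{window}"}}


payload = build_rate_limit_payload(10, "1h")
```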

Response Format

The response includes the classification result with overall score and triggered rules.

{
  "classification": "BAD",
  "score": 3.4,
  "triggeredRules": ["PROFANITY"],
  ...
}

Response details:

  • classification - GOOD (score < 1), NEUTRAL (score from 1 to 2), or BAD (score > 2)
  • score - Numeric score (scores > 2 indicate spam)
  • triggeredRules - Array of matching rules, sorted by score

For details, see the API Documentation.
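
The thresholds listed under the response details can be sketched as a small decision helper. The function names are illustrative, and treating scores of exactly 1 and 2 as NEUTRAL is an assumption that follows from the "< 1" and "> 2" bounds; in practice the classification field returned by the service should take precedence.

```python
def label_for_score(score: float) -> str:
    """Map a numeric score to the labels described in the response details."""
    if score < 1:
        return "GOOD"
    if score <= 2:  # assumption: the boundaries 1 and 2 count as NEUTRAL
        return "NEUTRAL"
    return "BAD"


def should_reject(response: dict) -> bool:
    """Reject a submission when the classification is BAD."""
    # Prefer the label returned by the service; derive it only if it is absent.
    label = response.get("classification")
    if label is None:
        label = label_for_score(response["score"])
    return label == "BAD"


sample = {"classification": "BAD", "score": 3.4, "triggeredRules": ["PROFANITY"]}
rejected = should_reject(sample)
```

A stricter policy could also hold NEUTRAL submissions for manual review instead of accepting them outright.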

Alternative: Widget Integration

To enable form field classification in the ALTCHA Widget:

  1. Open the Security Group configuration and go to the Advanced tab under Rules
  2. Create a Set-type rule
  3. Select Classify Form Fields and set it to true

Language Support

Fully supported languages:

  • Bulgarian
  • Czech
  • Danish
  • Dutch
  • English
  • Finnish
  • French
  • German
  • Greek
  • Hungarian
  • Italian
  • Norwegian
  • Polish
  • Portuguese
  • Romanian
  • Russian
  • Slovak
  • Spanish
  • Swedish

For languages that are not fully supported, the system falls back to English-based analysis with basic functionality.