How to Detect Spam with Similarity Matching

The Similarity Detection feature in ALTCHA Sentinel lets you compare new messages against known spam-like examples using semantic similarity — going beyond exact matches to catch variations in meaning. This guide walks you through how to set it up, train it, and use it effectively to moderate content in chats, forums, and user-generated platforms.

How It Works

The similarity engine uses cosine similarity on text embeddings generated by the open-source model all-MiniLM-L6-v2. This model is optimized for encoding sentences and short paragraphs. Input text longer than 256 word pieces is automatically truncated.
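
For reference, the cosine similarity of two embedding vectors a and b is:

  similarity(a, b) = (a · b) / (‖a‖ × ‖b‖)

The closer the result is to 1, the more similar the two texts are in meaning.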

Rather than checking for exact word matches, the model evaluates semantic similarity. For example, phrases like “message me”, “text me”, and “contact us” are considered closely related in meaning.

Although primarily trained on English data, the model supports multiple languages with varying accuracy.

Setup

  1. Ensure you have Sentinel installed.
  2. Create a new API key and assign it to a Security Group with the Restricted access level (the Similarity API is non-public).

Training Data

The Similarity API accepts either:

  • examples (an array of strings to compare against, passed directly), or
  • groups (names of predefined training data sets).

To create or manage training data, use the Training Data API, or add it manually through the interface.

Weights and Thresholds

Training data items can include:

  • threshold: Minimum similarity score required to count as a match (e.g., 0.7 = 70% similarity).
  • weight: Multiplier applied to the raw similarity score (e.g., a raw similarity of 0.75 multiplied by the weight gives the final score).
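
For example, a raw similarity of 0.8 against a training item with weight 0.75 yields a final score of 0.8 × 0.75 = 0.6, while an item with threshold 0.7 only counts as a match once the similarity score reaches at least 0.7.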

Match Against Examples

The most basic use case is to compare a given input text against a list of examples.

curl -X POST http://localhost:8080/v1/similarity \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {API_KEY}" \
  -d '{
    "examples": [
      "Claim your exclusive reward now by clicking the link below!",
      "Get your exclusive prize now by visiting this link!",
      "Don'\''t miss out—claim your unique prize by clicking below!",
      "The weather today is sunny and perfect for a walk in the park."
    ],
    "text": "Claim your exclusive prize now by clicking the link below!"
  }'

Partial Matching

To match phrases within longer text, enable partial matching by setting "partial": true. In this mode, examples should be short (e.g. 1–5 words).

curl -X POST http://localhost:8080/v1/similarity \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {API_KEY}" \
  -d '{
    "examples": [
      "message me on WhatsApp",
      "text me on Telegram",
      "contact me on Facebook"
    ],
    "text": "Don'\''t waste your money on expensive sneakers! Text me on WhatsApp +123123123 for better deals.",
    "partial": true
  }'

Using Predefined Training Data

Instead of passing examples directly, you can reference training groups. In the example below, the system matches the input text against the chat_spam training group.

curl -X POST http://localhost:8080/v1/similarity \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {API_KEY}" \
  -d '{
    "groups": ["chat_spam"],
    "text": "Don'\''t waste your money on expensive sneakers! Text me on WhatsApp +123123123 for better deals.",
    "partial": true
  }'

Creating Training Data via API

You can submit new training examples through the API—for instance, when a user clicks a “Report Spam” button. Use the POST /v1/training-data endpoint.
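
As a minimal sketch, a report-spam handler could forward the offending message like this. The request body fields ("group", "text") are assumptions for illustration only; consult the Training Data API reference for the exact schema.

# Hypothetical request body: the field names are assumptions; see the Training Data API reference.
curl -X POST http://localhost:8080/v1/training-data \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {API_KEY}" \
  -d '{
    "group": "chat_spam",
    "text": "Text me on WhatsApp +123123123 for better deals."
  }'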

Always review submitted data to ensure quality. High-quality training examples result in more accurate spam detection.

Evaluating Similarity Scores

The Similarity API returns a list of matched examples along with their similarity scores, values between 0.0 and 1.0. A higher score means the input text is more semantically similar to the example.

However, the API does not label content as “spam” automatically. It’s up to you to evaluate the response and decide what threshold is appropriate for your use case.

How to Detect Spam

You should treat a message as spam if it exceeds a threshold score that you define. A good starting point is:

  • 0.7 and above → likely spam
  • 0.4 – 0.7 → possibly suspicious, depending on context
  • below 0.4 → usually safe

These values are not strict rules. You can adjust them based on your tolerance for false positives.

For partial matches, you may want to use a lower threshold (e.g. 0.6) since matches tend to be fuzzier.
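
As a rough sketch, this decision can be scripted around the API response. The response field names used below ("matches", "score") are assumed for illustration; adjust them to the actual shape returned by your Sentinel instance.

# Flag a message as spam when the best match scores 0.7 or higher.
# NOTE: ".matches[].score" uses assumed response field names for illustration;
# check the actual Similarity API response shape before relying on this.
SCORE=$(curl -s -X POST http://localhost:8080/v1/similarity \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {API_KEY}" \
  -d '{
    "groups": ["chat_spam"],
    "text": "Text me on WhatsApp +123123123 for better deals.",
    "partial": true
  }' | jq '[.matches[]?.score] | max // 0')

if [ "$(echo "$SCORE >= 0.7" | bc -l)" -eq 1 ]; then
  echo "likely spam (score: $SCORE)"
else
  echo "not flagged (score: $SCORE)"
fi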

Tips

  • Use Partial Matching for Short Phrases
    Enable partial matching when scanning for short phrases within longer messages. Keep your examples concise—ideally 1 to 5 words—for the most accurate results.

  • Keep Texts Short
    To ensure optimal performance and accuracy, limit the length of text, examples, and training data to 256 tokens (word pieces). For partial matching, keep examples especially brief and targeted.

  • Stop on First Match
    For performance or early spam detection, consider using the stopOnMatch parameter. It sets a score threshold (0.0–1.0); once a result exceeds it, the system stops processing further matches (see the sketch below).
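
For example, the request below reuses the chat_spam group and stops scoring once a match exceeds 0.85. It assumes stopOnMatch is passed as a top-level field of the request body; check the Similarity API reference to confirm.

curl -X POST http://localhost:8080/v1/similarity \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {API_KEY}" \
  -d '{
    "groups": ["chat_spam"],
    "text": "Don'\''t waste your money on expensive sneakers! Text me on WhatsApp +123123123 for better deals.",
    "partial": true,
    "stopOnMatch": 0.85
  }'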