Language Detection

Sentinel's Language Detection feature identifies the language of a given text quickly and accurately.

It is available as a standalone API endpoint that you can call from your own services.

Resources

API Documentation

Feature Highlights

Detects over 160 languages
High-accuracy detection
Fast response times (~3 ms HTTP round-trip for short texts)

Implementation Guide

In addition to the standalone endpoint, Language Detection is built into the Classifier, where it helps identify potential spam.

To use language detection directly in your applications or services, call the POST /v1/language endpoint with the text you want to analyze:

POST /v1/language
Content-Type: application/json

{
  "text": "Hallo, hoe gaat het?"
}

Example response:

{
  "languages": [
    {
      "language": "nl",
      "probability": 0.428
    },
    {
      "language": "sv",
      "probability": 0.085
    },
    {
      "language": "af",
      "probability": 0.075
    }
  ],
  "time": 1.547
}
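
The request and response above could be wrapped in a small client. The following is a minimal Python sketch using only the standard library, not an official SDK. The base URL is a placeholder, the function names (build_request, detect, best_guess) are hypothetical, and any authentication your deployment requires is not shown because this page does not describe it. The helper picks the highest-probability entry explicitly and does not assume the list is sorted.

```python
import json
import urllib.request

# Hypothetical base URL (an assumption); replace with your Sentinel deployment's address.
BASE_URL = "http://localhost:8080"


def build_request(text, base_url=BASE_URL):
    """Build the POST /v1/language request without sending it."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/language",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def detect(text, base_url=BASE_URL, timeout=5.0):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def best_guess(response, min_probability=0.0):
    """Return (language, probability) for the most probable candidate.

    Returns None when there are no candidates or when the best one falls
    below min_probability (a caller-chosen threshold, not part of the API).
    """
    candidates = response.get("languages", [])
    if not candidates:
        return None
    top = max(candidates, key=lambda c: c["probability"])
    if top["probability"] < min_probability:
        return None
    return top["language"], top["probability"]
```

With a reachable deployment, best_guess(detect("Hallo, hoe gaat het?")) would return the top candidate from the response. A threshold such as min_probability=0.5 is one way to treat short or ambiguous inputs as undetermined. The right value depends on your use case.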

Accuracy

Accuracy varies by language and input length. In general:

Detection is more accurate with longer text inputs.
For major languages, short phrases (2–3 words) are often sufficient.
Inputs longer than 400 characters will be truncated.

For best results, submit full sentences or paragraphs when possible.
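
Because inputs over 400 characters are truncated, it can help to tidy text on the client before sending it. The sketch below is one possible approach, not something the API requires. It collapses runs of whitespace so more words fit within the limit, then clips at a word boundary. The helper names are hypothetical, and it assumes the limit counts characters as Python counts them in a str, which this page does not specify.

```python
MAX_CHARS = 400  # truncation limit stated in the Accuracy section


def normalize(text):
    """Collapse all whitespace runs (including newlines) into single spaces."""
    return " ".join(text.split())


def clip_at_word_boundary(text, max_chars=MAX_CHARS):
    """Shorten text to at most max_chars without cutting a word in half.

    Falls back to a hard cut when the first max_chars contain no space.
    """
    if len(text) <= max_chars:
        return text
    clipped = text[:max_chars]
    # The cut already falls between words, so keep everything before it.
    if text[max_chars].isspace():
        return clipped.rstrip()
    head, sep, _ = clipped.rpartition(" ")
    return head.rstrip() if sep else clipped


def prepare(text, max_chars=MAX_CHARS):
    """Normalize whitespace, then clip to the limit."""
    return clip_at_word_boundary(normalize(text), max_chars)
```

Clipping on the client keeps the last word intact. A hard truncation at the limit could leave a partial word at the end of the analyzed text.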