Logging to ClickHouse
ALTCHA Sentinel supports logging to ClickHouse, enabling high-performance data storage and real-time analytics.
What is ClickHouse?
ClickHouse is a high-performance, columnar database management system designed for real-time analytical queries. It excels at handling large volumes of log data efficiently, enabling fast aggregation and reporting without impacting your primary operational databases.
Why Enable ClickHouse?
Storing request logs in ClickHouse enables scalable, high-speed analytics and reporting while reducing the load on your primary database. This separation improves overall system performance and allows you to leverage powerful analytics tools.
For detailed tuning tips, see Performance Tuning.
Key Benefits
- Offloads logging workload from the primary database, improving its responsiveness
- Provides efficient querying and aggregation optimized for analytics
- Reduces disk I/O on container volumes by using ClickHouse’s optimized storage engine
- Enables integration with external tools like ClickHouse clients or Business Intelligence (BI) platforms for advanced analysis
Configuration
Requirements
- ClickHouse version 22.6 or later
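If you are not sure which version your server is running, you can check it with a quick query (a minimal sketch, to be run with any ClickHouse client connected to the server):

```sql
-- Returns the server version string; confirm it is 22.6 or later
SELECT version();
```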
Database Setup
- Create the ClickHouse database:
```sql
CREATE DATABASE IF NOT EXISTS altcha_sentinel
ENGINE = Atomic;
```
- Create the logs table with appropriate schema and indexes:
```sql
CREATE TABLE altcha_sentinel.logs (
  accountId LowCardinality(String),
  apiKeyId LowCardinality(String),
  time DateTime,
  browser UInt8,
  context Map(String, String),
  countryCode LowCardinality(FixedString(2)),
  device UInt8,
  endpoint UInt8,
  ip IPv6,
  method UInt8,
  network UInt8,
  path LowCardinality(String),
  referrer LowCardinality(String),
  triggeredRules Array(UInt8),
  serverLatency UInt32,
  statusCode UInt16,
  verified Bool,
  INDEX idx_statusCode statusCode TYPE set(1000) GRANULARITY 1,
  INDEX idx_countryCode countryCode TYPE set(1000) GRANULARITY 1,
  INDEX idx_method method TYPE set(1000) GRANULARITY 1
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(time)
ORDER BY (accountId, toStartOfHour(time), apiKeyId)
TTL time + INTERVAL 5 YEAR DELETE
SETTINGS index_granularity = 8192, flatten_nested = 0;
```
Retention Policy
The `TTL` clause defines how long data is kept in the table. In this example, logs older than 5 years are automatically deleted by ClickHouse to control storage size and comply with data retention requirements. You can adjust the retention period by modifying the interval in the `TTL` expression, for example:

```sql
TTL time + INTERVAL 1 YEAR DELETE
```

to keep logs for 1 year only.
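If the logs table already exists, the retention period can also be changed in place. The following is a sketch using ClickHouse's `ALTER TABLE ... MODIFY TTL` statement; adjust the interval to match your own retention requirements:

```sql
-- Change the retention of the existing logs table to 1 year
ALTER TABLE altcha_sentinel.logs
MODIFY TTL time + INTERVAL 1 YEAR DELETE;
```

Depending on your ClickHouse version, existing parts may only be re-evaluated against the new TTL during background merges.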
Enabling ClickHouse Integration
Set the `CLICKHOUSE_URL` environment variable to enable ClickHouse logging. For advanced configurations, review the related environment variables.

Example connection URL format:

```
http://user:password@localhost:8123/altcha_sentinel
```
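How you set the variable depends on how Sentinel is deployed. The snippet below is only a hypothetical sketch for a Docker-based setup; the image name is a placeholder, and the credentials and host must match your own ClickHouse instance:

```sh
# Pass the ClickHouse connection URL to the Sentinel container
docker run -d \
  -e CLICKHOUSE_URL="http://user:password@clickhouse:8123/altcha_sentinel" \
  <your-sentinel-image>
```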
Batching Configuration
To optimize performance, the ClickHouse client batches log entries before sending them. Adjust these environment variables as needed:
- `CLICKHOUSE_BATCH_MAX`: Maximum number of logs per batch
- `CLICKHOUSE_BATCH_INTERVAL`: Maximum flush interval before sending a batch
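A hypothetical example of the batching variables in an environment file; the values are placeholders and the unit expected by `CLICKHOUSE_BATCH_INTERVAL` is not specified here, so check the environment variables reference before relying on them:

```sh
# Flush a batch once it reaches this many log entries
CLICKHOUSE_BATCH_MAX=1000
# Maximum time to wait before flushing a partially filled batch
CLICKHOUSE_BATCH_INTERVAL=5
```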
Identifier Mappings
For performance reasons, certain fields are stored as integers. Use the following mappings to decode these identifiers:
browser
Number | Browser |
---|---|
1 | chrome |
2 | firefox |
3 | edge |
4 | safari |
5 | brave |
6 | vivaldi |
7 | opera |
device
Number | Device |
---|---|
1 | desktop |
2 | console |
3 | mobile |
4 | tablet |
5 | smarttv |
6 | wearable |
7 | embedded |
8 | bot |
network
Number | Network |
---|---|
1 | fixed |
2 | mobile |
3 | hosting |
4 | proxy |
5 | tor |
method
Number | HTTP Method |
---|---|
1 | GET |
2 | POST |
3 | PATCH |
4 | PUT |
5 | DELETE |
6 | OPTIONS |
7 | QUERY |
8 | HEAD |
endpoint
Number | Endpoint |
---|---|
1 | challenge |
2 | verify |
rule
Number | Rule |
---|---|
1 | CAPITALIZATION |
2 | CURRENCY |
3 | DMARC |
4 | EMOJI |
5 | EXCLAMATION |
6 | FREE_PROVIDER |
7 | HASH_TAGS |
8 | HIGH_RISK_COUNTRY |
9 | HOSTING |
10 | HTML |
11 | HTML_INJECTION |
12 | MALICIOUS |
13 | MX |
14 | NUMBERS_ONLY |
15 | PROFANITY |
16 | PROXY |
17 | RANDOM_CHARS |
18 | SHORT_TEXT |
19 | SPAM_WORDS |
20 | SPECIAL_CHARS |
21 | SQL_INJECTION |
22 | TOR |
23 | UNEXPECTED_LANGUAGE |
24 | UNKNOWN_LANGUAGE |
25 | URL |
26 | BOT |
27 | ACCEPT_HEADER_MISSING |
28 | ACCEPT_LANGUAGE_HEADER_MISSING |
29 | USER_AGENT_HEADER_MISSING |
30 | DISPOSABLE |
31 | LOCATION_DISTANCE |
32 | TIMEZONE_MISMATCH |
33 | RATE_LIMIT |
34 | SIMILARITY |
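These mappings can be applied directly in analytical queries. The sketch below (assuming the schema created earlier) decodes the numeric browser identifier with ClickHouse's transform function, using the label array taken from the browser table above:

```sql
-- Requests per browser over the last 7 days, with numeric IDs decoded to names
SELECT
    transform(
        browser,
        [1, 2, 3, 4, 5, 6, 7],
        ['chrome', 'firefox', 'edge', 'safari', 'brave', 'vivaldi', 'opera'],
        'unknown'
    ) AS browser_name,
    count() AS requests
FROM altcha_sentinel.logs
WHERE time >= now() - INTERVAL 7 DAY
GROUP BY browser_name
ORDER BY requests DESC;
```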
Implementation Notes
- `context`: A custom key-value map containing additional metadata set during the verification process.
- `endpoint` and `path`: When `endpoint = 0`, the full request path is recorded in the `path` field.
- `verified`: Present only when `endpoint = 2` (i.e., the `/v1/verify` endpoint). Indicates whether the verification attempt was successful (`1` equals success).
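As an illustration, here is a sketch of how the `verified` field can be combined with the endpoint mapping to compute a daily verification success rate (table and column names as defined in the schema above):

```sql
-- Daily verification success rate on the /v1/verify endpoint (endpoint = 2)
SELECT
    toDate(time) AS day,
    countIf(verified) AS successful,
    count() AS total,
    round(successful / total * 100, 2) AS success_rate_pct
FROM altcha_sentinel.logs
WHERE endpoint = 2
GROUP BY day
ORDER BY day;
```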