KaireonAI ships with production-grade operational infrastructure built into the platform. Rate limiting, circuit breakers, dead letter queues, and Prometheus metrics all work out of the box — configure them through environment variables and tenant settings.
All metrics are exposed at GET /api/metrics in Prometheus text format. Scrape this endpoint from your Prometheus server or any compatible collector. Default metrics (Node.js process stats) are auto-collected with the kaireon_ prefix.
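If you want a quick sanity check outside Prometheus, the exposition format is plain text and easy to inspect. The sketch below (`kaireonMetricNames` is a hypothetical helper, not part of the platform) pulls the `kaireon_`-prefixed series names out of a scraped payload:

```typescript
// Minimal scan of Prometheus text exposition format: collect the names
// of all kaireon_-prefixed series. Illustrative only — a real scraper
// (Prometheus itself) handles parsing for you.
function kaireonMetricNames(exposition: string): string[] {
  const names = new Set<string>();
  for (const line of exposition.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) continue; // skip comments and blanks
    const name = trimmed.split(/[{\s]/, 1)[0]; // metric name ends at '{' or whitespace
    if (name.startsWith("kaireon_")) names.add(name);
  }
  return [...names].sort();
}

const sample = [
  "# HELP kaireon_process_cpu_seconds_total Total CPU time.",
  "kaireon_process_cpu_seconds_total 12.5",
  'kaireon_cache_hits_total{cache="offers"} 42',
  "up 1",
].join("\n");

console.log(kaireonMetricNames(sample));
// → ["kaireon_cache_hits_total", "kaireon_process_cpu_seconds_total"]
```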
The platform uses a sliding window algorithm to enforce per-key request limits. Each request timestamp is recorded; when the count within the window exceeds the configured maximum, subsequent requests are rejected with 429 Too Many Requests.
Each rate-limit decision exposes allowed (a boolean), remaining (requests left in the window), and retryAfterMs (milliseconds until the earliest window slot frees up, set only when the request is rejected) so callers can surface helpful retry guidance to clients.
Rate limiters are instantiated with two parameters:
```typescript
new RateLimiter({
  maxRequests: 100, // requests per window
  windowMs: 60_000, // window size in milliseconds
});
```
If Redis is not configured (REDIS_URL not set), rate limiting falls back to in-memory mode. The platform still works, but limits are per-process rather than global.
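The sliding-window algorithm described above can be sketched as a small in-memory limiter, matching the Redis-less fallback mode. Class and method names here are illustrative, not the platform's actual API:

```typescript
interface RateLimitDecision {
  allowed: boolean;
  remaining: number;
  retryAfterMs?: number; // set only when the request is rejected
}

// In-memory sliding-window limiter (per-process, like the fallback mode
// described above). Each request's timestamp is recorded per key.
class SlidingWindowLimiter {
  private timestamps = new Map<string, number[]>();

  constructor(
    private maxRequests: number,
    private windowMs: number,
  ) {}

  check(key: string, now = Date.now()): RateLimitDecision {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have slid out of the window.
    const recent = (this.timestamps.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.maxRequests) {
      this.timestamps.set(key, recent);
      // The earliest slot frees up when the oldest timestamp expires.
      return { allowed: false, remaining: 0, retryAfterMs: recent[0] + this.windowMs - now };
    }
    recent.push(now);
    this.timestamps.set(key, recent);
    return { allowed: true, remaining: this.maxRequests - recent.length };
  }
}

const limiter = new SlidingWindowLimiter(2, 60_000);
const t0 = 1_000_000;
console.log(limiter.check("api-key-1", t0));      // allowed, remaining 1
console.log(limiter.check("api-key-1", t0 + 10)); // allowed, remaining 0
console.log(limiter.check("api-key-1", t0 + 20)); // rejected, retryAfterMs 59980
```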
Circuit breaker state is persisted to Redis (key prefix kaireon:cb:) when REDIS_URL is set, so state survives process restarts. If Redis is unavailable, state is maintained in-memory only.
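The document does not spell out the breaker's internals, but a standard closed/open/half-open state machine looks roughly like this. The threshold and timeout defaults are assumptions, and the Redis persistence described above is omitted for brevity:

```typescript
type CircuitState = "closed" | "open" | "half-open";

// Standard three-state breaker sketch. failureThreshold and resetTimeoutMs
// are illustrative defaults, not the platform's actual configuration.
class CircuitBreaker {
  private state: CircuitState = "closed";
  private failures = 0;
  private openedAt = 0;

  constructor(
    public readonly name: string,
    private failureThreshold = 5,
    private resetTimeoutMs = 30_000,
    private onTransition?: (from: CircuitState, to: CircuitState) => void,
  ) {}

  private transition(to: CircuitState): void {
    if (to === this.state) return;
    this.onTransition?.(this.state, to); // e.g. increment the state-change counter
    this.state = to;
  }

  currentState(now = Date.now()): CircuitState {
    if (this.state === "open" && now - this.openedAt >= this.resetTimeoutMs) {
      this.transition("half-open"); // allow one probe request through
    }
    return this.state;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.transition("closed");
  }

  recordFailure(now = Date.now()): void {
    this.failures += 1;
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.openedAt = now;
      this.transition("open");
    }
  }
}

const breaker = new CircuitBreaker("scoring-model", 2, 1_000, (from, to) =>
  console.log(`transition ${from} -> ${to}`),
);
breaker.recordFailure(0);
breaker.recordFailure(0); // threshold hit: closed -> open
console.log(breaker.currentState(0)); // → "open"
```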
Every state transition emits a kaireon_circuit_breaker_state_change_total counter increment with labels name, from, and to. Alert on transitions to open:
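For example, an alerting expression along these lines (a sketch — tune the window and threshold to your environment):

```promql
# Fire when any breaker transitions to open in the last 5 minutes
increase(kaireon_circuit_breaker_state_change_total{to="open"}[5m]) > 0
```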
Events that fail processing after retries are moved from the outbox to the dead letter queue (DLQ). DLQ entries are persisted, scoped per tenant, and organized by topic so admins can triage failed events by source.
GET /api/v1/admin/dlq — Retrieve DLQ summary and events (admin role required).
| Parameter | Type  | Default | Description                          |
|-----------|-------|---------|--------------------------------------|
| limit     | query | 50      | Max events returned (capped at 200)  |
| topic     | query | —       | Filter by topic                      |
Response includes totalEvents, a byTopic breakdown, the event list, and an alert field:
| Alert Level  | Condition            |
|--------------|----------------------|
| "OK"         | 10 or fewer events   |
| "WARNING"    | 11-100 events        |
| "CRITICAL"   | More than 100 events |
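The thresholds above translate directly into a small helper; a sketch (the function name is illustrative, not a platform API):

```typescript
type DlqAlert = "OK" | "WARNING" | "CRITICAL";

// Maps total DLQ depth to the alert levels in the table above.
function dlqAlertLevel(totalEvents: number): DlqAlert {
  if (totalEvents > 100) return "CRITICAL";
  if (totalEvents > 10) return "WARNING";
  return "OK";
}

console.log(dlqAlertLevel(10));  // → "OK"
console.log(dlqAlertLevel(11));  // → "WARNING"
console.log(dlqAlertLevel(101)); // → "CRITICAL"
```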
POST /api/v1/admin/dlq — Retry or purge DLQ events (admin role required).
```json
{
  "action": "retry",    // "retry" or "purge"
  "eventIds": ["..."],  // optional: specific event IDs
  "topic": "decisions"  // optional: all events for a topic
}
```
Retry re-enqueues events back to the outbox with status: "pending" and retryCount: 0, then deletes the DLQ entry (transactional).
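The retry semantics can be sketched with in-memory maps standing in for the outbox and DLQ tables. In the real platform the re-enqueue and delete happen in one database transaction; this sketch only simulates that, and all names are illustrative:

```typescript
interface OutboxEvent {
  id: string;
  topic: string;
  status: "pending" | "failed";
  retryCount: number;
  payload: unknown;
}

// In-memory stand-ins for the outbox and DLQ tables.
const outbox = new Map<string, OutboxEvent>();
const dlq = new Map<string, OutboxEvent>();

// Re-enqueue a DLQ entry: reset status and retryCount, then delete the
// DLQ row. In the real platform both writes share one transaction.
function retryDlqEvent(eventId: string): boolean {
  const event = dlq.get(eventId);
  if (!event) return false;
  outbox.set(event.id, { ...event, status: "pending", retryCount: 0 });
  dlq.delete(eventId);
  return true;
}

dlq.set("evt-1", { id: "evt-1", topic: "decisions", status: "failed", retryCount: 5, payload: {} });
retryDlqEvent("evt-1");
console.log(outbox.get("evt-1")?.status, dlq.has("evt-1")); // → pending false
```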
The platform caches offers, qualification rules, and contact policies to reduce database load during decision execution. An emergency flush endpoint is available for situations where cached data becomes stale.

POST /api/v1/admin/cache — Emergency cache invalidation (admin role required).
If no body is provided, all caches are flushed. Every flush is audit-logged.

Monitor cache effectiveness with kaireon_cache_hits_total and kaireon_cache_misses_total.
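A common way to watch effectiveness is the hit ratio over a recent window; a sketch of the query:

```promql
# Cache hit ratio over the last 5 minutes
sum(rate(kaireon_cache_hits_total[5m]))
  / (sum(rate(kaireon_cache_hits_total[5m])) + sum(rate(kaireon_cache_misses_total[5m])))
```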
- Candidate-scoped creative queries — Creative queries are filtered by the set of candidate offer IDs rather than loaded for the entire tenant, preventing unbounded memory usage when a tenant has thousands of creatives across many offers.
- Flow route caching — Flow route resolution is cached with a 120-second TTL, avoiding repeated database lookups for the same flow across concurrent requests.
- Chunked inserts — CSV ingestion writes to the database in batches of 1,000 rows, preventing memory exhaustion on large files and reducing transaction lock duration.
- Streaming batch execution — The batch executor uses summary counters (rows loaded, failed, skipped) instead of accumulating all row results in memory, allowing pipelines to process files larger than available RAM.
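The chunked-insert pattern described above can be sketched as follows; `insertBatch` is a hypothetical stand-in for the actual database write:

```typescript
// Split rows into fixed-size batches so a single INSERT never holds
// more than batchSize rows. insertBatch is a hypothetical stand-in
// for the real database write.
async function insertInChunks<T>(
  rows: T[],
  insertBatch: (batch: T[]) => Promise<void>,
  batchSize = 1_000,
): Promise<number> {
  let written = 0;
  for (let i = 0; i < rows.length; i += batchSize) {
    const batch = rows.slice(i, i + batchSize);
    await insertBatch(batch); // one transaction per batch
    written += batch.length;
  }
  return written;
}
```

A 2,500-row file, for example, would be written as three batches of 1,000, 1,000, and 500 rows.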
Decision traces provide forensic visibility into every stage of the decision pipeline. Configure tracing in Settings > General > Retention > Decision Trace.
| Setting                 | Description                                  |
|-------------------------|----------------------------------------------|
| decisionTraceEnabled    | Master toggle for trace capture              |
| decisionTraceSampleRate | Percentage of requests to trace (0-100)      |
| Retention period        | How long traces are retained before cleanup  |
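Percentage-based sampling like decisionTraceSampleRate is typically a per-request random draw; a minimal sketch (`shouldTrace` is illustrative, not the platform's function):

```typescript
// Decide whether to capture a trace for this request, given a sample
// rate expressed as a percentage (0-100). The rng parameter exists so
// the decision is testable; production code would use Math.random().
function shouldTrace(sampleRate: number, rng: () => number = Math.random): boolean {
  return rng() * 100 < sampleRate;
}

console.log(shouldTrace(100)); // → true  (always traces)
console.log(shouldTrace(0));   // → false (never traces)
```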
Traces are persisted to the decision-trace store and viewable from the Decision Flows detail page. Each trace records the full pipeline execution: candidates at each stage, filter reasons, scores, rankings, and timing breakdowns.
Decision latency suddenly increased. Identify which pipeline stage is the bottleneck:
```promql
# Overall decision latency p99
histogram_quantile(0.99, rate(kaireon_decision_pipeline_duration_ms_bucket[5m]))

# Break down by sub-stage to find the bottleneck
histogram_quantile(0.99, rate(kaireon_qualification_filter_latency_ms_bucket[5m]))
histogram_quantile(0.99, rate(kaireon_scoring_latency_ms_bucket[5m]))
histogram_quantile(0.99, rate(kaireon_ranking_latency_ms_bucket[5m]))
```
Check for scoring model circuit breakers opening (model inference may be slow or failing):
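A query along these lines surfaces recent opens per breaker (a sketch; adjust the window to your environment):

```promql
# Circuit breaker transitions to open in the last 15 minutes, by breaker name
sum by (name) (increase(kaireon_circuit_breaker_state_change_total{to="open"}[15m]))
```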
```promql
# DLQ depth by tenant
kaireon_dlq_depth

# Outbox processing failures
rate(kaireon_outbox_processed_total{status="failed"}[5m])

# Age of oldest unprocessed event
histogram_quantile(0.99, rate(kaireon_outbox_event_age_seconds_bucket[5m]))
```
Remediation steps:

1. Check the DLQ admin endpoint (GET /api/v1/admin/dlq) to identify failing topics.
2. Investigate the root cause (downstream service outage, schema mismatch).
3. Fix the underlying issue.
4. Retry events with POST /api/v1/admin/dlq and action: "retry".