# Decision Sentinel

> Continuous decision-stream health monitoring — suppression spikes and empty-decision rates, with alerts and opt-in auto-pause.

## Overview

The Decision Sentinel is a background watcher that answers a question dashboards can't: *is the decisioning stream silently going wrong right now?* It runs every 30 minutes (`GET /api/v1/cron/ai-sentinel`, gated by `CRON_SECRET`) and, for each tenant, evaluates two metrics over the last 60 minutes against the preceding 60-minute window.
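
To make the windowing concrete, here is a minimal sketch of how the two 60-minute windows could be derived from the time of a run. The function name `sentinel_windows` and the half-open interval convention are assumptions for illustration, not the Sentinel's actual code.

```python
from datetime import datetime, timedelta, timezone

# 60-minute window length, as described in the overview.
WINDOW = timedelta(minutes=60)


def sentinel_windows(now):
    """Return (previous, current) windows as (start, end) pairs.

    Hypothetical helper: each window is treated as half-open, [start, end),
    so a trace on the boundary is counted in exactly one window.
    """
    current_start = now - WINDOW
    previous_start = current_start - WINDOW
    return (previous_start, current_start), (current_start, now)


# Example run at 12:00 UTC.
now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
previous, current = sentinel_windows(now)
```

Because the job runs every 30 minutes while each window spans 60 minutes, consecutive runs evaluate overlapping data.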

## Metrics

Both metrics are computed from `decision_traces` and are also registered as standard **alert-rule metrics**, so you can build your own alert rules on them in Settings > Alerts:

| Metric                 | Definition                                                                                                                                      | Warn  | Hard breach |
| ---------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- | ----- | ----------- |
| `suppression_rate`     | Qualified candidates removed by the suppression + contact-policy stages: `(Σ afterQualification − Σ afterContactPolicy) / Σ afterQualification` | ≥ 80% | ≥ 95%       |
| `empty_candidate_rate` | Decisions that returned zero offers: traces with `finalCount = 0` / total traces                                                                | ≥ 30% | ≥ 50%       |
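
The two definitions in the table can be sketched as plain functions over a list of trace records. The field names `afterQualification`, `afterContactPolicy`, and `finalCount` come from the definitions above; representing each trace as a dictionary and returning `None` when the denominator is zero are assumptions made for this example.

```python
def suppression_rate(traces):
    """(sum afterQualification - sum afterContactPolicy) / sum afterQualification."""
    qualified = sum(t["afterQualification"] for t in traces)
    if qualified == 0:
        return None  # assumption: the rate is undefined with no qualified candidates
    surviving = sum(t["afterContactPolicy"] for t in traces)
    return (qualified - surviving) / qualified


def empty_candidate_rate(traces):
    """Traces with finalCount = 0, divided by total traces."""
    if not traces:
        return None  # assumption: the rate is undefined with no traces
    empty = sum(1 for t in traces if t["finalCount"] == 0)
    return empty / len(traces)


# Illustrative data only, not real decision traces.
traces = [
    {"afterQualification": 10, "afterContactPolicy": 2, "finalCount": 2},
    {"afterQualification": 10, "afterContactPolicy": 0, "finalCount": 0},
    {"afterQualification": 5, "afterContactPolicy": 3, "finalCount": 3},
    {"afterQualification": 5, "afterContactPolicy": 0, "finalCount": 0},
]

s_rate = suppression_rate(traces)      # 25 of 30 qualified candidates removed
e_rate = empty_candidate_rate(traces)  # 2 of 4 decisions returned no offers
```

Note that the suppression rate sums candidate counts across traces before dividing, so a high-volume decision weighs more than a low-volume one, whereas the empty-candidate rate counts each trace once.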

New tenants get default alert rules for both (suppression ≥ 80%, empty-candidate ≥ 30%) alongside the existing 5xx and degraded-scoring defaults.

The Sentinel requires at least 20 traces across the two windows before trusting a rate — low-traffic tenants are skipped rather than alerted on noise.
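
As a rough sketch, the minimum-traffic guard might look like the following. It reads "across the two windows" as the combined trace count of both windows, which is an interpretation rather than a confirmed detail; `has_enough_traffic` is a hypothetical name.

```python
MIN_TRACES = 20  # minimum trace count stated above


def has_enough_traffic(current_count, previous_count):
    """Return True when the two windows together hold enough traces to trust a rate.

    Assumption: the threshold applies to the sum of both windows.
    """
    return current_count + previous_count >= MIN_TRACES


# A tenant with 12 + 8 traces is evaluated; one with 12 + 7 is skipped.
evaluated = has_enough_traffic(12, 8)
skipped = not has_enough_traffic(12, 7)
```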

## What happens on a breach

* **Warn breach** — a System Health alert (source `sentinel`) is written and appears on AI > Insights and the notification surfaces. Alerts are deduplicated within the window.
* **Hard breach** — the alert severity is `critical` (which also routes to configured side channels). If — and only if — the tenant has opted in via **Settings > AI Configuration > AI Autonomy > "Sentinel may auto-pause active flows"** (`aiAutopilot.sentinelAutoPause: true`), the Sentinel pauses all active decision flows using the same optimistic-locking pattern as the fairness recheck, with a full audit trail (`auto_pause` / `sentinel_breach`).

Without the opt-in, the Sentinel never mutates anything.
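
The breach handling described above can be summarized as a small decision function. This is a sketch under stated assumptions: the thresholds are taken from the metrics table and expressed as fractions, while `classify`, `plan_actions`, and the shape of the returned dictionary are hypothetical and do not reflect the Sentinel's internal API.

```python
# (warn, hard) thresholds from the metrics table, as fractions.
THRESHOLDS = {
    "suppression_rate": (0.80, 0.95),
    "empty_candidate_rate": (0.30, 0.50),
}


def classify(metric, rate):
    """Return "hard", "warn", or None for a given metric value."""
    warn, hard = THRESHOLDS[metric]
    if rate >= hard:
        return "hard"
    if rate >= warn:
        return "warn"
    return None


def plan_actions(metric, rate, sentinel_auto_pause):
    """Describe what a breach would trigger, without performing any of it.

    sentinel_auto_pause mirrors the tenant's aiAutopilot.sentinelAutoPause opt-in.
    """
    level = classify(metric, rate)
    if level is None:
        return None  # no breach: nothing is written or changed
    return {
        "alert": True,                # every breach writes a System Health alert
        "critical": level == "hard",  # hard breaches are raised as critical
        "pause_flows": level == "hard" and sentinel_auto_pause,
    }


warn_only = plan_actions("suppression_rate", 0.85, sentinel_auto_pause=True)
hard_no_opt_in = plan_actions("empty_candidate_rate", 0.60, sentinel_auto_pause=False)
hard_opted_in = plan_actions("empty_candidate_rate", 0.60, sentinel_auto_pause=True)
```

Only the last case sets `pause_flows`, matching the rule that flows are paused solely on a hard breach for a tenant that has opted in.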

## Why these two metrics

Both failure modes are *config-induced silence*: a mis-scoped contact policy, an aggressive frequency cap, a broken qualification rule, or an expired offer schedule doesn't throw errors; it just quietly stops decisions from going out. Latency and 5xx alerts never notice. The Sentinel watches the funnel itself.
