
KaireonAI can turn any persisted decision trace into a written explanation of why that decision was made, suitable for three different audiences:
  • Regulator — formal, compliance-grade prose with full factor detail. Writes to the AuditLog for DSAR / regulator use.
  • Agent — structured JSON for internal tooling (call-center consoles, troubleshooting UIs). Machine-readable.
  • Customer — one or two plain-language sentences you can show to the end customer in-product.
Narratives are generated on-demand against persisted decision traces. They never run during a /recommend call, so LLM latency or availability cannot affect live decisioning.

When to use each mode

Mode        Audience                             Format          Typical length   Side effects
regulator   Compliance, auditors, DSAR exports   Prose           ~400 words       Writes an audit-log entry
agent       Internal agents / tooling            JSON            ~300 tokens      None
customer    End customer                         Short sentence  ~30–60 words     None
Pick the mode that matches the downstream consumer. The same trace can be explained in all three modes and the results are cached independently.

How to enable it

LLM explanations are off by default for every tenant. Enable them in Settings > AI explanations, or via the API:
curl -X PUT https://your-host/api/v1/ai/explanations-settings \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: $KAIREON_API_KEY" \
  -d '{ "llmExplanationsEnabled": true }'
The flag is stored at tenantSettings.aiAnalyzerSettings.llmExplanationsEnabled. While disabled, the narrative endpoint returns 403:
{
  "error": {
    "code": "FORBIDDEN",
    "message": "LLM explanations are not enabled for this tenant...",
    "status": 403
  }
}
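As a minimal client-side sketch (Python; the helper names here are hypothetical, not part of the API), you can read the stored flag from a tenant-settings payload and recognize the disabled-tenant error:

```python
def explanations_enabled(tenant_settings: dict) -> bool:
    # Walks tenantSettings.aiAnalyzerSettings.llmExplanationsEnabled,
    # defaulting to False (the flag is off for every tenant by default).
    return bool(
        tenant_settings.get("aiAnalyzerSettings", {})
        .get("llmExplanationsEnabled", False)
    )

def is_opt_in_error(response_body: dict, status: int) -> bool:
    # The narrative endpoint signals a disabled tenant with 403 FORBIDDEN.
    return status == 403 and response_body.get("error", {}).get("code") == "FORBIDDEN"
```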

End-to-end flow

   Decision pipeline                         Narrative endpoint
   ─────────────────                         ───────────────────
   /recommend  ─►  DecisionTrace  ──────►    /decisions/{id}/narrative
                   (persisted)                │

                                       tenant opt-in? ── no ─► 403
                                              │ yes

                                       cache hit for
                                       (tenant × trace × mode      ─ yes ─► return cached
                                        × model × inputsHash)?
                                              │ no

                                       redactPII(context)


                                       LLM provider (generateText)


                                       cache in Redis, TTL 7d


                                       mode = regulator?  ─► write AuditLog


                                       NarrativeResult (JSON)
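The flow above can be sketched as a single handler in Python. Everything here is illustrative (the stand-in `redact`, `generate`, and cache arguments are not the real internals), but the branch order matches the diagram: opt-in check, cache lookup, redaction, LLM call, cache write, and the regulator-only audit write.

```python
def handle_narrative_request(trace_id, mode, opted_in, cache, redact, generate, audit_log):
    """Illustrative sketch of the narrative-endpoint flow; names are hypothetical."""
    if not opted_in:
        return {"status": 403, "code": "FORBIDDEN"}          # tenant opt-in? -> no
    key = (trace_id, mode)  # the real key also includes tenant, model, inputsHash
    if key in cache:
        return {"status": 200, "cached": True, "narrative": cache[key]}
    prompt = redact(f"context for {trace_id}")               # redactPII(context)
    narrative = generate(prompt)                             # LLM provider call
    cache[key] = narrative                                   # Redis, TTL 7d in production
    if mode == "regulator":
        audit_log.append({"action": "generate_narrative", "entityId": trace_id})
    return {"status": 200, "cached": False, "narrative": narrative}
```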

PII redaction

Before any outbound LLM call, the input context runs through a PII-redaction pass:
  • Customer identifiers, email addresses, phone numbers, and free-text address fields are replaced with typed placeholders.
  • Only offer IDs, feature contributions, scores, policy reasons, and experiment assignment remain in the prompt.
  • The prompt explicitly notes “customer attributes redacted before LLM call; features reflected via topFactors on each offer” so the model does not invent missing attributes.
Redaction is best-effort defense-in-depth; the primary safeguard is the per-tenant opt-in. If your data-handling policies forbid sending any customer-derived data to external models, leave the opt-in disabled.
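A minimal sketch of such a redaction pass (the patterns and placeholder names here are illustrative only, not the actual redactor):

```python
import re

# Illustrative typed-placeholder patterns; a production redactor would
# cover customer IDs and free-text address fields as well.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact_pii(text: str) -> str:
    # Replace each PII match with its typed placeholder, e.g. [EMAIL].
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```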

Caching

Aspect   Value
Store    Redis
TTL      7 days
Key      tenantId × decisionTraceId × mode × model × inputsHash
Bypass   Pass "noCache": true in the request body
The inputsHash component guarantees that if the underlying trace is re-scored or updated, the next request will generate a fresh narrative.
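One way such a key could be assembled (a sketch under the assumption that inputsHash is a stable digest of the narrative inputs; the exact layout is not specified here):

```python
import hashlib
import json

def cache_key(tenant_id: str, trace_id: str, mode: str, model: str, inputs: dict) -> str:
    # Canonical JSON (sorted keys) makes the digest stable across dict ordering,
    # while any change to the underlying trace inputs produces a new hash.
    inputs_hash = hashlib.sha256(
        json.dumps(inputs, sort_keys=True).encode()
    ).hexdigest()[:16]
    return f"narrative:{tenant_id}:{trace_id}:{mode}:{model}:{inputs_hash}"
```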

Audit log for regulator mode

Every successful mode = "regulator" call produces an audit-log row:
{
  "action": "generate_narrative",
  "entityType": "decision_trace",
  "entityId": "trace_001",
  "entityName": "regulator narrative",
  "changes": {
    "mode": "regulator",
    "model": "claude-sonnet-4-7",
    "cached": false,
    "narrativePreview": "The system recommended offer_premium_card..."
  }
}
This makes regulator narratives discoverable during DSAR assembly and compliance review. Agent and customer narratives do not write audit rows.

Worked example — one trace, three views

Given a trace where offer_premium_card was selected with score 0.89, and offer_gold_plus came second with 0.71, the three modes produce:

Regulator

curl -X POST https://your-host/api/v1/decisions/trace_001/narrative \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: $KAIREON_API_KEY" \
  -d '{ "mode": "regulator" }'
{
  "narrative": "On 2026-04-18, the KaireonAI decision engine evaluated 47 candidate offers for the referenced customer request. After qualification filters removed 15 ineligible offers and contact-policy checks suppressed 4 additional offers, 28 offers entered the scoring stage. The Premium Card offer was ranked first with a composite score of 0.89; the top contributing factors were recent-transaction-count (positive), tenure-months (positive), and segment-match (positive). The Gold Plus alternative (score 0.71) was ranked lower due to a lower recent-transaction contribution. No experimental holdout was applied.",
  "mode": "regulator",
  "model": "claude-sonnet-4-7",
  "cached": false,
  "tokens": { "input": 1420, "output": 280 },
  "createdAt": "2026-04-18T21:04:18.000Z"
}

Agent

curl -X POST https://your-host/api/v1/decisions/trace_001/narrative \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: $KAIREON_API_KEY" \
  -d '{ "mode": "agent" }'
{
  "narrative": "{\n  \"selected\": { \"offerId\": \"off_premium_card\", \"score\": 0.89 },\n  \"topFactors\": [\n    { \"feature\": \"recent_transaction_count\", \"direction\": \"positive\" },\n    { \"feature\": \"tenure_months\", \"direction\": \"positive\" }\n  ],\n  \"alternatives\": [\n    { \"offerId\": \"off_gold_plus\", \"score\": 0.71, \"whyNotChosen\": \"ranked lower than selected offer(s)\" }\n  ],\n  \"policiesFired\": [\"frequency_cap\"]\n}",
  "mode": "agent",
  "model": "claude-sonnet-4-7",
  "cached": false,
  "tokens": { "input": 1420, "output": 190 },
  "createdAt": "2026-04-18T21:04:19.000Z"
}

Customer

curl -X POST https://your-host/api/v1/decisions/trace_001/narrative \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: $KAIREON_API_KEY" \
  -d '{ "mode": "customer" }'
{
  "narrative": "We thought the Premium Card would be most useful for you right now, based on how you've been using your account. You can see other available offers any time in your dashboard.",
  "mode": "customer",
  "model": "claude-sonnet-4-7",
  "cached": false,
  "tokens": { "input": 1420, "output": 55 },
  "createdAt": "2026-04-18T21:04:19.500Z"
}
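The three curl calls above differ only in the mode field, so a client can build them from one helper. A sketch (the function name is hypothetical; it mirrors the documented endpoint, headers, and noCache body field):

```python
import json

def build_narrative_request(host: str, api_key: str, trace_id: str,
                            mode: str, no_cache: bool = False):
    # Returns (url, headers, body) for POST /decisions/{id}/narrative.
    body = {"mode": mode}
    if no_cache:
        body["noCache"] = True  # bypass the 7-day Redis cache
    return (
        f"{host}/api/v1/decisions/{trace_id}/narrative",
        {"Content-Type": "application/json", "X-Api-Key": api_key},
        json.dumps(body),
    )
```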

Exact SHAP for gradient_boosted models

The narrative endpoint produces prose explanations. For numerical, mathematically grounded per-feature attributions on gradient_boosted models, KaireonAI also exposes an exact TreeSHAP endpoint:
curl -X POST https://your-host/api/v1/decisions/trace_001/shap \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: $KAIREON_API_KEY" \
  -d '{
    "modelId": "mdl_premium_card_gbm",
    "offerId": "off_premium_card",
    "attributes": {
      "recent_transaction_count": 12,
      "tenure_months": 36,
      "average_balance": 15400
    }
  }'
{
  "shapValues": {
    "recent_transaction_count": 1.314,
    "tenure_months": 0.428,
    "average_balance": -0.211
  },
  "baseline": -0.602,
  "rawMargin": 0.929,
  "additivityResidual": 0.0
}
Implementation: Lundberg, Erion, Lee 2018 (path-dependent TreeSHAP, Algorithm 2). Sum of shapValues plus baseline equals rawMargin exactly — the additivity invariant is verified per-call and reported as additivityResidual. Use this when:
  • A regulator demands a defensible per-feature breakdown for an audit (EU AI Act Art. 13 / 22, GDPR Art. 15).
  • You want to feed numerical attributions into a downstream dashboard, CSV export, or your own NLG layer.
  • The cheap path-heuristic explanations field on DecisionTrace.scoringResults is not exact enough — TreeSHAP is the consistent, axiomatically-grounded alternative.
Pair /shap with /narrative for regulator exports: the SHAP numbers are the math, the narrative is the prose. Both flow through the same per-tenant llmExplanationsEnabled opt-in. See the Decision Traces API for full request/response and audit-log details.
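The additivity invariant can also be checked client-side. A sketch, using illustrative values chosen so the invariant holds (rawMargin = ΣshapValues + baseline):

```python
def additivity_residual(shap_values: dict, baseline: float, raw_margin: float) -> float:
    # rawMargin minus (sum of per-feature attributions + baseline);
    # for exact TreeSHAP this should be ~0 up to floating-point noise.
    return raw_margin - (sum(shap_values.values()) + baseline)

shap = {
    "recent_transaction_count": 1.314,
    "tenure_months": 0.428,
    "average_balance": -0.211,
}
# 1.314 + 0.428 - 0.211 = 1.531; 1.531 + (-0.602) = 0.929
residual = additivity_residual(shap, baseline=-0.602, raw_margin=0.929)
```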

Rate limits

  • 20 requests / minute / tenant on the narrative endpoint.
  • 30 requests / minute / tenant on the SHAP endpoint.
  • Exceeding a limit returns 429 TOO_MANY_REQUESTS. Retry after the window clears (the limiter is a sliding 60-second window).
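A client-side retry sketch for the 429 case (the helper is illustrative; `call` is any zero-argument function you supply that returns a status and body):

```python
import random
import time

def with_retry(call, max_attempts=5, base_delay=1.0):
    # Retry on 429 with exponential backoff plus jitter, capped at the
    # 60-second sliding window so we never wait longer than one full window.
    for attempt in range(max_attempts):
        status, body = call()
        if status != 429:
            return status, body
        time.sleep(min(60.0, base_delay * (2 ** attempt + random.random())))
    return status, body  # still rate-limited after max_attempts
```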

In-app usage

Open any row in Studio > Decision Traces and click Explain. The dialog has tabs for all three modes, a regenerate button that sets noCache: true, and a metadata footer showing the model used, cache status, and token counts. See also: Decision Traces API | Security Model | AI Configuration