Overview

KaireonAI exposes ~110 primitive MCP tools (one per entity + CRUD + intelligence). Agent playbooks are a higher-level layer on top: named, composable operations that chain 5-10 primitive tools into one structured invocation. Agents call a single playbook instead of orchestrating the chain themselves. Every playbook:
  • Validates its input with Zod.
  • Enforces a tenantId (from input or the MCP server’s configured tenant).
  • Is dry-run by default for any side-effecting operation — callers pass apply: true to commit.
  • Writes an entry to AuditLog with action: "playbook.<name>" when it commits.
  • Has per-(tenant, playbook) sliding-window rate limits.
  • Runs LLM work on-demand only — never on the /recommend hot path.
Playbooks are registered as first-class MCP tools under the playbook_* namespace.
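The shared guarantees above can be sketched as a single wrapper. This is a minimal, dependency-free sketch with hypothetical names (`runPlaybook`, `PlaybookCtx`); the real implementation validates with Zod and writes through `logAudit()`, while this sketch hand-rolls the check and appends to an in-memory log:

```typescript
// Sketch of the shared playbook wrapper (hypothetical names; the real code
// validates with Zod — the check is hand-rolled here to stay dependency-free).
interface BaseInput {
  tenantId?: string;
  apply?: boolean; // dry-run unless explicitly true
}

interface PlaybookCtx {
  defaultTenantId: string;
  auditLog: { tenantId: string; action: string }[];
}

function runPlaybook<I extends BaseInput, R>(
  name: string,
  input: I,
  ctx: PlaybookCtx,
  body: (input: I & { tenantId: string }, dryRun: boolean) => R,
): R {
  // Tenant comes from the input or falls back to the server's configured tenant.
  const tenantId = input.tenantId ?? ctx.defaultTenantId;
  if (!tenantId) throw new Error("tenantId is required");
  const dryRun = input.apply !== true; // side effects are opt-in
  const result = body({ ...input, tenantId }, dryRun);
  if (!dryRun) {
    // Only committed runs land in the audit trail.
    ctx.auditLog.push({ tenantId, action: `playbook.${name}` });
  }
  return result;
}
```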

The 10 playbooks

| Tool | Purpose | Dry-run default |
| --- | --- | --- |
| playbook_run_shadow_experiment | Create a shadow challenger + replay recent traces + return uplift + recommendation | yes |
| playbook_arbitrate_policy_conflict | Detect policy conflicts + propose priority resolution | yes |
| playbook_explain_decision_chain | Fetch a trace + generate regulator/agent/customer narratives | read-only |
| playbook_promote_challenger_if_winning | Run uplift z-test + promote winner if thresholds met | yes |
| playbook_rebuild_offer_qualification | Suggest tighter/looser qualification thresholds from recent outcomes | yes |
| playbook_audit_tenant_data_health | Score connectors + pipelines + schemas + DLQ + trace coverage | read-only |
| playbook_bootstrap_new_offer_campaign | Build Category + SubCategory + Offer + Creative + QualRule + Journey skeleton | yes |
| playbook_simulate_weight_change | Replay traces with proposed arbitration weights; winners/losers | read-only |
| playbook_explain_algorithm_upgrade | Auto-upgrade evaluator + LLM rationale + projected AUC gain | read-only |
| playbook_generate_dsar_export | Assemble GDPR Art. 15 bundle: traces + regulator narratives + subject data | always applies |

run_shadow_experiment

Input
{
  "championModelId": "abc-123",
  "challengerConfig": { "rules": [...] },
  "sampleSize": 200,
  "apply": false
}
Output
{
  "playbook": "run_shadow_experiment",
  "dryRun": true,
  "shadowModelId": null,
  "replay": { "tracesReplayed": 187, "offersCompared": 1244, "championWins": 612, "challengerWins": 604, "ties": 28 },
  "uplift": { "uplift": -0.006, "pValue": 0.61, "significant": false },
  "recommendation": "keep_champion"
}
Chain: load champion model → replay traces → re-score each offer with champion and challenger configs → compute uplift (two-proportion z-test) → recommend promote, keep_champion, or gather_more_data.
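The uplift step in that chain can be sketched as a two-proportion z-test over the win rates. Because both configs are re-scored on the same replayed offers, both arms share the same n; the function names and the erf-based normal-CDF approximation are illustrative, not the actual implementation:

```typescript
// Sketch of the uplift computation: a two-proportion z-test on
// champion vs challenger win rates over the same set of compared offers.
function twoProportionZTest(
  championWins: number,
  challengerWins: number,
  comparisons: number, // offers compared (same for both arms)
) {
  const p1 = championWins / comparisons;
  const p2 = challengerWins / comparisons;
  const pooled = (championWins + challengerWins) / (2 * comparisons);
  const se = Math.sqrt(pooled * (1 - pooled) * (2 / comparisons));
  const z = se === 0 ? 0 : (p2 - p1) / se;
  // Two-sided p-value from the standard normal CDF.
  const pValue = 2 * (1 - normalCdf(Math.abs(z)));
  return { uplift: p2 - p1, pValue, significant: pValue < 0.05 };
}

function normalCdf(x: number): number {
  // Abramowitz–Stegun erf approximation (7.1.26); plenty for reporting.
  const t = 1 / (1 + (0.3275911 * Math.abs(x)) / Math.SQRT2);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t -
      0.284496736) * t + 0.254829592) * t;
  const erf = 1 - poly * Math.exp(-(x * x) / 2);
  return x >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}
```

With the example counts above (612 vs 604 wins over 1244 comparisons), the uplift comes out at -8/1244 ≈ -0.006 and is not significant, matching the `keep_champion` recommendation.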

arbitrate_policy_conflict

Runs the conflict detector across offers, qualification rules, contact policies, and experiments. For priority_tie conflicts, it proposes a priority bump on the alphabetically first rule. With apply: true, it commits the priority changes.

explain_decision_chain

Given a decisionTraceId, fetches the trace and calls generateNarrative() for each requested mode (regulator, agent, customer). Narratives are cached per-(tenant × trace × mode × model).
Input
{ "decisionTraceId": "trace-1", "modes": ["regulator", "customer"] }
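The cache behaviour can be sketched as a lookup keyed on exactly those four dimensions. `narrativeFor` and its `generate` callback are illustrative stand-ins; generateNarrative() is async in the real code, and the generator is synchronous here only to keep the sketch minimal:

```typescript
// Sketch of the narrative cache, keyed on (tenant × trace × mode × model),
// so repeat explain calls reuse an already-generated narrative.
type NarrativeMode = "regulator" | "agent" | "customer";

const narrativeCache = new Map<string, string>();

function narrativeFor(
  tenantId: string,
  traceId: string,
  mode: NarrativeMode,
  model: string,
  generate: () => string, // stands in for generateNarrative()
): string {
  const key = [tenantId, traceId, mode, model].join("::");
  let text = narrativeCache.get(key);
  if (text === undefined) {
    text = generate(); // LLM work happens on-demand only
    narrativeCache.set(key, text);
  }
  return text;
}
```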

promote_challenger_if_winning

Runs the uplift z-test on experiment.results (champion vs each challenger). Promotes the best challenger only if it clears:
  • samples >= minSamples (default 100) for champion AND challenger
  • significant === true at 95% confidence
  • uplift >= minUplift (default 0.02)
Before/after champion model IDs are logged to the audit trail.
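The promotion gate reduces to a conjunction of the three thresholds above. A minimal sketch, assuming the uplift stats come from the same z-test result shape (`shouldPromote` and the interfaces are hypothetical names):

```typescript
// Sketch of the promotion gate; all three conditions must hold.
interface ArmStats { samples: number; }
interface Uplift { uplift: number; significant: boolean; }

function shouldPromote(
  champion: ArmStats,
  challenger: ArmStats,
  uplift: Uplift,
  opts = { minSamples: 100, minUplift: 0.02 }, // defaults from the text
): boolean {
  return (
    champion.samples >= opts.minSamples &&
    challenger.samples >= opts.minSamples &&
    uplift.significant && // z-test at 95% confidence
    uplift.uplift >= opts.minUplift
  );
}
```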

rebuild_offer_qualification

For an offer, walks its assigned qualification rules and examines recent InteractionHistory to compute conversion rates. Suggests threshold deltas:
  • Low conversion → tighten threshold (+0.1)
  • High conversion → loosen (-0.1)
  • Middle band → no change
Confidence reported as low / medium / high based on sample size.
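The suggestion logic can be sketched as follows. The ±0.1 deltas and the low/medium/high confidence tiers mirror the text; the conversion-band edges and sample-size cutoffs are assumptions for illustration:

```typescript
// Sketch of the threshold suggestion. Band edges (2% / 20%) and the
// sample-size cutoffs for confidence are illustrative, not the real values.
function suggestThresholdDelta(
  conversions: number,
  samples: number,
  bands = { low: 0.02, high: 0.2 }, // assumed conversion-rate band edges
): { delta: number; confidence: "low" | "medium" | "high" } {
  const rate = samples === 0 ? 0 : conversions / samples;
  // Low conversion → tighten (+0.1); high → loosen (-0.1); middle → no change.
  const delta = rate < bands.low ? +0.1 : rate > bands.high ? -0.1 : 0;
  const confidence = samples >= 500 ? "high" : samples >= 100 ? "medium" : "low";
  return { delta, confidence };
}
```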

audit_tenant_data_health

Five probes, each scored 0-100:
| Probe | Pass criteria |
| --- | --- |
| connector_status | No connectors in error state |
| pipeline_freshness | All pipelines ran within pipelineFreshnessHours (default 24) |
| schema_validation | No empty schemas |
| dlq_depth | DLQ count == 0 |
| decision_trace_coverage | > 100 traces in last 24h |
Returns overallScore (average) and overallStatus (worst of pass/warn/fail).
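The roll-up is a plain average plus a worst-of status. A minimal sketch (the `Probe` shape and `rollUp` name are illustrative):

```typescript
// Sketch of the health roll-up: overallScore averages the probe scores,
// overallStatus takes the worst individual status (fail > warn > pass).
type Status = "pass" | "warn" | "fail";
interface Probe { name: string; score: number; status: Status; }

const severity: Record<Status, number> = { pass: 0, warn: 1, fail: 2 };

function rollUp(probes: Probe[]) {
  const overallScore =
    probes.reduce((sum, p) => sum + p.score, 0) / probes.length;
  const overallStatus = probes.reduce<Status>(
    (worst, p) => (severity[p.status] > severity[worst] ? p.status : worst),
    "pass",
  );
  return { overallScore, overallStatus };
}
```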

bootstrap_new_offer_campaign

Given a brief, produces a plan of entities to create. Reuses existing Category / SubCategory if they match by name. With apply: true, creates:
  • Offer (draft)
  • Creative (draft, bound to supplied channelId)
  • QualificationRule (draft, scope=offer, propensity_threshold >= 0.5)
  • Journey skeleton (optional; scaffolding only)
Returns yamlDiff — a YAML-ish rendering of the plan.

simulate_weight_change

Replays traces with proposed arbitration weights applied to each offer’s stored objectives. Reports per-offer impression delta (before vs after) and the top 20 movers. Read-only — never commits weight changes.
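The impression deltas come from replaying full arbitration, but the core re-weighting and mover ranking can be sketched as below. `score` and `topMovers` are illustrative names, and the sketch ranks by score delta rather than replayed impressions:

```typescript
// Sketch: re-score each offer's stored objectives under proposed weights
// and rank the biggest movers by absolute delta.
interface OfferObjectives { offerId: string; objectives: Record<string, number>; }

function score(o: OfferObjectives, weights: Record<string, number>): number {
  return Object.entries(o.objectives).reduce(
    (s, [k, v]) => s + (weights[k] ?? 0) * v,
    0,
  );
}

function topMovers(
  offers: OfferObjectives[],
  before: Record<string, number>, // current arbitration weights
  after: Record<string, number>,  // proposed weights
  limit = 20,
) {
  return offers
    .map((o) => ({ offerId: o.offerId, delta: score(o, after) - score(o, before) }))
    .sort((a, b) => Math.abs(b.delta) - Math.abs(a.delta))
    .slice(0, limit);
}
```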

explain_algorithm_upgrade

Calls evaluateUpgrade() (deterministic) and optionally generates an LLM explanation of the recommendation. Projected AUC gain is a heuristic based on confidence tier and the current AUC gap.

generate_dsar_export

Assembles a GDPR Art. 15 data-subject access bundle:
  • All DecisionTrace rows for the customer in the last N months (default 12)
  • A regulator-mode narrative for each trace (capped at maxNarratives, default 25)
  • Qualification + contact-policy outcomes attached to each trace
  • Offers presented
  • The output of exportSubjectData() (interactions, summaries, deliveries, etc.)
Always writes an audit log entry — DSAR requests are non-discardable.

Architecture notes

  • Implementation lives in platform/src/lib/mcp/playbooks/.
  • Each playbook is one file; index.ts wires registerPlaybooks(ctx) into the MCP server.
  • Playbooks call into Prisma / @/lib services directly (in-process), bypassing HTTP transport for efficiency.
  • Audit logging reuses logAudit() — same store and hash chain as all other audit events.
  • Rate limits are a simple in-memory sliding window keyed on (tenant, playbook).
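The last point can be sketched as follows. The `allow` function, the limit of 10 calls per 60 s, and the injected clock are all illustrative; only the keying on (tenant, playbook) and the sliding window come from the text:

```typescript
// Minimal in-memory sliding-window limiter keyed on (tenant, playbook).
// Limit and window values are illustrative defaults.
const windows = new Map<string, number[]>();

function allow(
  tenantId: string,
  playbook: string,
  now: number, // ms timestamp, injected so the window is testable
  limit = 10,
  windowMs = 60_000,
): boolean {
  const key = `${tenantId}:${playbook}`;
  // Drop hits that have slid out of the window.
  const hits = (windows.get(key) ?? []).filter((t) => now - t < windowMs);
  if (hits.length >= limit) {
    windows.set(key, hits);
    return false;
  }
  hits.push(now);
  windows.set(key, hits);
  return true;
}
```

Being purely in-memory, limits reset on server restart and are not shared across instances, which is a reasonable trade-off for per-tenant playbook throttling.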