

What this solves

Most A/B tests confuse “which variant won” with “did the engine actually help vs doing nothing?” A holdout group answers the second question: a known percentage of traffic gets zero personalization (or a fixed-rule fallback). Comparing engaged-rate across variant × (in-experiment vs holdout) gives you causal uplift, not just relative ranking.

Why this works

The platform has two complementary mechanisms:
  1. Champion / Challenger on the Score node: championChallenger.{champion, challengers[]} routes per-customer via a deterministic hash, so the same customer always lands in the same variant.
  2. Tenant-level holdout percentage: tenant.settings.holdoutPercentage (0-100) reserves that share of traffic for a “no NBA” fallback that returns offers sorted by priority weight only (the same path NBA-disabled tenants take).
Combine them and you get: champion-vs-challenger inside the experiment, control group outside, all variants persisted on every decision trace.
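The two mechanisms above can be sketched as one deterministic assignment step. This is an illustration only: the hash function, seed format, and the `assign` helper are assumptions, not the platform's actual implementation.

```typescript
import { createHash } from "crypto";

interface Variant { modelKey: string; weight: number; }

// Deterministic roll in [0, 100) from a seed string.
// Illustrative only; the platform's real hash is not documented here.
function roll(seed: string): number {
  return createHash("sha256").update(seed).digest().readUInt32BE(0) % 100;
}

// First the tenant-wide holdout gate, then the weight-proportional
// bucket keyed by customerId x experimentId.
function assign(
  customerId: string,
  experimentId: string,
  holdoutPercentage: number,
  variants: Variant[], // champion first, then challengers; weights sum to 100
): string {
  if (roll(`holdout:${customerId}`) < holdoutPercentage) return "holdout";
  const r = roll(`${customerId}:${experimentId}`);
  let cumulative = 0;
  for (const v of variants) {
    cumulative += v.weight;
    if (r < cumulative) return v.modelKey;
  }
  return variants[variants.length - 1].modelKey; // guard for rounding
}
```

Because both rolls are pure functions of stable identifiers, the same customer always resolves to the same bucket across sessions, which is what makes the uplift comparison causal.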

Step 1 — Set the holdout percentage

curl -X PUT https://playground.kaireonai.com/api/v1/tenant-settings \
  -H "Content-Type: application/json" -H "X-Requested-With: XMLHttpRequest" \
  -d '{ "holdoutPercentage": 10 }'
This routes roughly 10% of customers into the priority-only fallback, selected by each customer’s deterministic-random roll. Verify with GET /api/v1/tenant-settings afterward.

Step 2 — Configure champion/challenger on the Score node

{
  "id": "score",
  "type": "score",
  "config": {
    "method": "formula",
    "championChallenger": {
      "enabled": true,
      "experimentId": "cards-q4-uplift",
      "champion":    { "modelKey": "scorecard-v2",   "weight": 50 },
      "challengers": [
        { "modelKey": "bayesian-v2",         "weight": 30 },
        { "modelKey": "gradient_boosted-v2", "weight": 20 }
      ]
    }
  }
}
The weights sum to 100. Each customer’s customerId × experimentId hash falls into one bucket; the routing is persistent across sessions for that customer until you change the configuration.
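Weight-proportional bucketing means the config above partitions the 0-99 roll into contiguous ranges: champion gets [0, 50), the first challenger [50, 80), the second [80, 100). A small sketch of that mapping (the `bucketRanges` helper is hypothetical; exact boundaries depend on the platform's hash):

```typescript
// Turn { weight } entries into contiguous [start, end) ranges over 0-100.
function bucketRanges(variants: { modelKey: string; weight: number }[]) {
  let start = 0;
  return variants.map((v) => {
    const range = { modelKey: v.modelKey, start, end: start + v.weight };
    start += v.weight;
    return range;
  });
}
```

Changing a weight shifts the range boundaries, so customers near a boundary can be reassigned; keep weights stable for the life of the experiment.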

Step 3 — Capture the variant on each decision

The recommend response includes:
{
  "experimentVariant": "challenger-bayesian-v2",
  "controlGroup": false,
  ...
}
controlGroup: true means this customer was in the holdout — the engine ran the priority-only fallback path. The decision_traces.experimentAssignment JSONB persists the variant for later analysis.
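If you log outcomes client-side, you can partition engagement by the returned fields. A minimal sketch, assuming only the two documented fields; the tally helpers and row shape are hypothetical:

```typescript
interface RecommendResponse {
  experimentVariant: string | null;
  controlGroup: boolean;
}

// Holdout customers get their own key so uplift can be computed against them.
function bucketKey(r: RecommendResponse): string {
  return r.controlGroup ? "holdout" : r.experimentVariant ?? "unassigned";
}

// Count decisions and engagements per bucket.
function tally(
  rows: Array<{ response: RecommendResponse; engaged: boolean }>,
): Map<string, { n: number; engaged: number }> {
  const out = new Map<string, { n: number; engaged: number }>();
  for (const { response, engaged } of rows) {
    const key = bucketKey(response);
    const cell = out.get(key) ?? { n: 0, engaged: 0 };
    cell.n += 1;
    if (engaged) cell.engaged += 1;
    out.set(key, cell);
  }
  return out;
}
```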

Step 4 — Measure uplift

The platform’s /api/v1/experiments/uplift endpoint computes z-tested uplift between in-experiment and holdout:
curl https://playground.kaireonai.com/api/v1/experiments/cards-q4-uplift/uplift \
  -H "X-Requested-With: XMLHttpRequest"
It returns the conversion rate for each variant and for the holdout, along with a confidence interval. The math lives in platform/src/lib/experimentation/uplift.ts.
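The core statistic can be reproduced offline with a standard two-proportion z-test. A sketch of that calculation; the actual implementation in uplift.ts may differ in details such as continuity correction:

```typescript
// Two-proportion z-test: a variant's conversions vs the holdout's.
function upliftZ(
  variantConv: number, variantN: number,
  holdoutConv: number, holdoutN: number,
): { uplift: number; z: number } {
  const p1 = variantConv / variantN;   // variant conversion rate
  const p2 = holdoutConv / holdoutN;   // holdout conversion rate
  const pooled = (variantConv + holdoutConv) / (variantN + holdoutN);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / variantN + 1 / holdoutN));
  return { uplift: p1 - p2, z: se === 0 ? 0 : (p1 - p2) / se };
}
```

A |z| above roughly 1.96 corresponds to significance at the 5% level for a two-sided test.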

Gotchas

  • Holdout is tenant-wide. Setting holdoutPercentage affects every flow for that tenant; if you need per-flow holdouts use the experiment.holdoutPercent field on the Experiment resource instead.
  • Variant assignment is persistent. The same customer always sees the same variant — even after the experiment ends, until you flip championChallenger.enabled to false.
  • autoPromote on the Experiment resource (when enabled) automatically promotes the winning challenger to champion after the experiment meets its success criteria. Combine with four-eyes approval for governance.

What the trace will show

customerId   | experimentVariant         | controlGroup | finalCount
cust-A-001   | champion-scorecard-v2     | false        | 3
cust-A-002   | challenger-bayesian-v2    | false        | 3
cust-A-003   | (none)                    | true         | 3      ← holdout, priority-only path

Proof reference

T11 (bulk respond) + T15 (scoring strategy resolution) + the experiment fixture in T1 of the proof bundle cover this end-to-end.