# A/B test with holdout

> Run a champion-vs-challenger comparison while reserving a deterministic control group for causal uplift measurement. The platform handles routing, assignment persistence, and uplift calculation.

## What this solves

Most A/B tests answer "which variant won?" but not "did the engine actually help compared with doing nothing?" A holdout group answers the second question: a known percentage of traffic gets zero personalization (or a fixed-rule fallback). Comparing the engagement rate of each variant against the holdout gives you causal uplift, not just a relative ranking.

## Why this works

The platform has two complementary mechanisms:

1. **Champion / Challenger on the Score node** — `championChallenger.{champion, challengers[]}` routes per-customer via deterministic hash so the same customer always lands in the same variant.
2. **Tenant-level holdout percentage** — `tenant.settings.holdoutPercentage` (0-100) reserves that share of traffic for a "no NBA" fallback that returns offers sorted by priority weight only (the same path NBA-disabled tenants take).

Combine them and you get: champion-vs-challenger inside the experiment, control group outside, all variants persisted on every decision trace.

## Step 1 — Set the holdout percentage

```bash
curl -X PUT https://playground.kaireonai.com/api/v1/tenant-settings \
  -H "Content-Type: application/json" -H "X-Requested-With: XMLHttpRequest" \
  -d '{ "holdoutPercentage": 10 }'
```

This routes roughly 10% of customers, chosen by a deterministic pseudo-random roll per customer, to the priority-only fallback. Verify with `GET /api/v1/tenant-settings` afterward.
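One way such a deterministic roll could work is sketched below. This is an illustration only: the choice of SHA-256, the `holdout` salt, and the helper names `holdout_roll` and `in_holdout` are assumptions, not the platform's documented hashing scheme.

```python
import hashlib


def holdout_roll(customer_id: str, salt: str = "holdout") -> int:
    """Map a customer to a stable integer in [0, 100). Hypothetical helper."""
    digest = hashlib.sha256(f"{salt}:{customer_id}".encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce to a percentile.
    return int.from_bytes(digest[:8], "big") % 100


def in_holdout(customer_id: str, holdout_percentage: int) -> bool:
    """True when the customer's roll falls below the configured percentage."""
    if not 0 <= holdout_percentage <= 100:
        raise ValueError("holdout_percentage must be between 0 and 100")
    return holdout_roll(customer_id) < holdout_percentage


# Synthetic customer IDs, used only to show that the share lands near 10%.
customers = [f"cust-{i:05d}" for i in range(10_000)]
share = sum(in_holdout(c, 10) for c in customers) / len(customers)
print(f"holdout share: {share:.3f}")
```

Because the roll depends only on the customer ID and the salt, the same customer gets the same answer on every request, which is the property a stable control group needs.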

## Step 2 — Configure champion/challenger on the Score node

```json
{
  "id": "score",
  "type": "score",
  "config": {
    "method": "formula",
    "championChallenger": {
      "enabled": true,
      "experimentId": "cards-q4-uplift",
      "champion":    { "modelKey": "scorecard-v2",   "weight": 50 },
      "challengers": [
        { "modelKey": "bayesian-v2",         "weight": 30 },
        { "modelKey": "gradient_boosted-v2", "weight": 20 }
      ]
    }
  }
}
```

The weights sum to 100. Each customer's `customerId × experimentId` hash falls into one bucket; the routing is persistent across sessions for that customer until you change the configuration.
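The bucket routing described above can be sketched as follows. The hash function, the `customerId:experimentId` key format, and the function names are assumptions for illustration; the variant label format (`champion-<modelKey>`, `challenger-<modelKey>`) mirrors the examples shown elsewhere on this page. Rejecting weights that do not sum to 100 is also an assumption of this sketch.

```python
import hashlib


def bucket(customer_id: str, experiment_id: str) -> int:
    """Stable integer in [0, 100) derived from customer and experiment."""
    key = f"{customer_id}:{experiment_id}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % 100


def assign_variant(customer_id, experiment_id, champion, challengers):
    """Walk the cumulative weights and return the arm that owns the bucket."""
    arms = [("champion", champion)] + [("challenger", c) for c in challengers]
    total = sum(arm["weight"] for _, arm in arms)
    if total != 100:
        raise ValueError(f"weights must sum to 100, got {total}")
    roll = bucket(customer_id, experiment_id)
    upper = 0
    for role, arm in arms:
        upper += arm["weight"]
        if roll < upper:
            return f"{role}-{arm['modelKey']}"
    raise AssertionError("unreachable when weights sum to 100")


champion = {"modelKey": "scorecard-v2", "weight": 50}
challengers = [
    {"modelKey": "bayesian-v2", "weight": 30},
    {"modelKey": "gradient_boosted-v2", "weight": 20},
]
print(assign_variant("cust-A-001", "cards-q4-uplift", champion, challengers))
```

Including the experiment ID in the hash key means a customer's bucket in one experiment does not predict their bucket in another, which keeps experiments independent of each other.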

## Step 3 — Capture the variant on each decision

The recommend response includes:

```json
{
  "experimentVariant": "challenger-bayesian-v2",
  "controlGroup": false,
  ...
}
```

`controlGroup: true` means this customer was in the holdout — the engine ran the priority-only fallback path. The `decision_traces.experimentAssignment` JSONB persists the variant for later analysis.
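A client that logs its own copy of the assignment might normalize these two fields into a single label, as in the minimal sketch below. The `assignment_label` helper and the sample payloads are hypothetical, and the sketch assumes holdout responses carry no usable `experimentVariant`.

```python
from collections import Counter


def assignment_label(response: dict) -> str:
    """Collapse controlGroup and experimentVariant into one analysis label."""
    if response.get("controlGroup"):
        return "holdout"
    return response.get("experimentVariant") or "unassigned"


# Hand-written sample payloads, trimmed to the two fields discussed above.
responses = [
    {"experimentVariant": "champion-scorecard-v2", "controlGroup": False},
    {"experimentVariant": "challenger-bayesian-v2", "controlGroup": False},
    {"experimentVariant": None, "controlGroup": True},
]

counts = Counter(assignment_label(r) for r in responses)
print(dict(counts))
```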

## Step 4 — Measure uplift

The platform's `/api/v1/experiments/{experimentId}/uplift` endpoint computes z-tested uplift between in-experiment and holdout traffic:

```bash
curl https://playground.kaireonai.com/api/v1/experiments/cards-q4-uplift/uplift \
  -H "X-Requested-With: XMLHttpRequest"
```

The response contains the conversion rate for each variant and for the holdout, along with a confidence interval. The math lives in `platform/src/lib/experimentation/uplift.ts`.
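For intuition about what a z-tested uplift involves, here is a textbook two-proportion z-test in Python. It is a sketch of the general technique, not a port of `uplift.ts`, whose exact formulas may differ; the function name, the 95% critical value, and the input counts are all illustrative assumptions.

```python
import math


def two_proportion_uplift(conv_t, n_t, conv_c, n_c, z_crit=1.96):
    """Absolute uplift of treated over control, with z-test and CI."""
    p_t = conv_t / n_t
    p_c = conv_c / n_c
    diff = p_t - p_c

    # Pooled standard error for the hypothesis test (H0: equal rates).
    pooled = (conv_t + conv_c) / (n_t + n_c)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_t + 1 / n_c))
    z = diff / se_pooled if se_pooled > 0 else 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided

    # Unpooled standard error for the confidence interval on the difference.
    se_diff = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    ci = (diff - z_crit * se_diff, diff + z_crit * se_diff)
    return {"uplift": diff, "z": z, "p_value": p_value, "ci": ci}


# Made-up counts: 540 conversions from 9,000 in-experiment customers
# versus 50 conversions from 1,000 holdout customers.
result = two_proportion_uplift(540, 9000, 50, 1000)
print(result)
```

With a 10% holdout the control group is much smaller than the treated group, so the confidence interval is dominated by the holdout's sample size; a small holdout needs a longer run before the uplift estimate tightens.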

## Gotchas

* **Holdout is tenant-wide.** Setting `holdoutPercentage` affects every flow for that tenant; if you need per-flow holdouts, use the `experiment.holdoutPercent` field on the Experiment resource instead.
* **Variant assignment is persistent.** The same customer always sees the same variant — even after the experiment ends, until you flip `championChallenger.enabled` to false.
* **`autoPromote`** on the Experiment resource (when enabled) automatically promotes the winning challenger to champion after the experiment meets its success criteria. Combine with `four-eyes` approval for governance.

## What the trace will show

```
customerId   | experimentVariant         | controlGroup | finalCount
cust-A-001   | champion-scorecard-v2     | false        | 3
cust-A-002   | challenger-bayesian-v2    | false        | 3
cust-A-003   | (none)                    | true         | 3      ← holdout, priority-only path
```
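Once traces like these accumulate, a quick sanity check is to compare the observed holdout share with the configured `holdoutPercentage`. The sketch below assumes trace rows have been exported as dictionaries with a boolean `controlGroup` key; the helper name, the synthetic rows, and the 3-sigma tolerance are illustrative choices.

```python
import math


def holdout_share_check(rows, configured_pct, z_crit=3.0):
    """Flag a holdout share that drifts too far from the configured value."""
    n = len(rows)
    if n == 0:
        raise ValueError("no trace rows to check")
    observed = sum(1 for r in rows if r["controlGroup"]) / n
    expected = configured_pct / 100
    se = math.sqrt(expected * (1 - expected) / n)
    if se == 0:
        # Configured 0% or 100%: any deviation at all is a mismatch.
        return {"observed": observed, "expected": expected,
                "z": 0.0, "ok": observed == expected}
    z = (observed - expected) / se
    return {"observed": observed, "expected": expected,
            "z": z, "ok": abs(z) <= z_crit}


# Synthetic rows: 100 holdout decisions out of 1,000.
rows = [{"controlGroup": i < 100} for i in range(1000)]
print(holdout_share_check(rows, configured_pct=10))
```

A failed check usually points to a logging or routing problem rather than to the models under test, so it is worth running before reading any uplift numbers.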

## Proof reference

T11 (bulk respond) + T15 (scoring strategy resolution) + the experiment fixture in T1 of the proof bundle cover this end-to-end.
