

Datasets

Dataset         Customers  Offers  Domain
banking-cards   5,000      12      Credit-card propensity & cross-sell
telco-churn     8,000      6       Retention offer routing
retail-loyalty  10,000     20      Loyalty-tier upgrade & coupon
Each dataset ships customers.csv, offers.json, outcomes.csv, splits.json, and meta.json. All data is synthetic; there is no PII. Rows are generated by tools/qa/decisioning-bench/datasets/generate.ts with a hard-coded seed, so every checkout reproduces the same rows.
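The reproducibility guarantee rests on seeding the generator. A minimal sketch of the idea, using mulberry32 as an illustrative PRNG (generate.ts may well use a different one, and the row format here is invented):

```typescript
// Sketch: why a hard-coded seed yields reproducible rows.
// mulberry32 is an illustrative PRNG choice, not necessarily the one
// generate.ts uses.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Two runs with the same seed emit identical "customer" rows.
function sampleCustomers(seed: number, n: number): string[] {
  const rand = mulberry32(seed);
  return Array.from({ length: n }, (_, i) => `cust-${i}-${Math.floor(rand() * 1e6)}`);
}
```

Because the stream is a pure function of the seed, regenerating the datasets on any checkout produces byte-identical CSVs.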

Submission contract

A submission is a Docker image that:
  1. Listens on :8080.
  2. Accepts POST /recommend with body { customerId, channelId?, attributes? } and returns { decisionTraceId, offers: [{ offerId, score, rank }] }.
  3. Accepts POST /respond with body { customerId, outcome, ... }.
  4. Sustains 100 RPS for 5 minutes (the bench harness ramps up to that rate).
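A minimal sketch of a response builder that satisfies the /recommend shape above. The scoring model is a placeholder, and using a UUID for decisionTraceId is an assumption (the contract only requires a trace identifier); wiring this into an HTTP server on :8080 is left out.

```typescript
import { randomUUID } from "node:crypto";

interface ScoredOffer { offerId: string; score: number; rank: number; }
interface RecommendResponse { decisionTraceId: string; offers: ScoredOffer[]; }

// Turn a map of candidate offer scores into the contract's response shape:
// offers sorted by descending score, with 1-based ranks.
function recommend(customerId: string, candidateScores: Record<string, number>): RecommendResponse {
  const offers = Object.entries(candidateScores)
    .sort(([, a], [, b]) => b - a) // highest score first
    .map(([offerId, score], i) => ({ offerId, score, rank: i + 1 }));
  return { decisionTraceId: randomUUID(), offers };
}
```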

Running the harness

cd tools/qa/decisioning-bench
docker run -d --name submission -p 8080:8080 my/submission:latest
node harness/run.mjs --dataset banking-cards --target http://localhost:8080
Results land at results/<dataset>/<utc-timestamp>.json plus a leaderboard CSV.
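Because the harness names result files by UTC timestamp, the newest run in results/<dataset>/ can be found with a plain lexicographic sort. A small sketch, assuming ISO-8601-style filenames (the exact filename format used by the harness is an assumption):

```typescript
// UTC ISO-8601-style timestamps sort lexicographically, so the newest
// results file is simply the maximum .json filename.
function latestResult(filenames: string[]): string | undefined {
  return filenames
    .filter((f) => f.endsWith(".json")) // skip the leaderboard CSV etc.
    .sort()
    .at(-1);
}
```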

Scoring rubric

Dimension                      Weight
Latency p99                    20%
AUC (rank-based)               25%
Fairness gap (worst DI ratio)  15%
Explanation quality            10%
Uplift over random             30%
The weighted composite is normalized to 0-100.
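Assuming each dimension has already been normalized to [0, 1] (the harness's per-dimension normalization is not specified here), the weighted composite can be sketched as:

```typescript
// Rubric weights in percent; they sum to 100, so a weighted sum of
// per-dimension scores in [0, 1] lands directly on the 0-100 scale.
const WEIGHTS = {
  latencyP99: 20,
  auc: 25,
  fairnessGap: 15,
  explanationQuality: 10,
  upliftOverRandom: 30,
} as const;

function composite(normalized: Record<keyof typeof WEIGHTS, number>): number {
  let total = 0;
  for (const [dim, w] of Object.entries(WEIGHTS)) {
    total += w * normalized[dim as keyof typeof WEIGHTS];
  }
  return total;
}
```

A perfect score on every dimension yields 100; an all-zero submission yields 0.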

Honest limits

  • V1 datasets are synthetic. Real-world distribution shift is not modeled — this is a capability check, not a market-fit signal.
  • Latency is measured from the harness on a single host. Multi-pod scale isn’t tested here; that’s k6’s job.
  • The repo currently lives in tools/qa/decisioning-bench/. Splitting to a dedicated kaireonai/decisioning-bench repo + GitHub-Pages leaderboard is operator-driven (needs repo creation + Pages config).