How scoring works — end-to-end

This page is the synthesis of every model component in the platform. The pieces are documented individually (algorithms, maturity ramp, uplift, model lifecycle), but the question operators and evaluators ask most often is: “when a customer event lands, what actually happens between the request and the response?” That’s what’s covered here, in order, with every step linked to its deep-dive.

The 30-second mental model

A POST /api/v1/recommend runs the customer through eight stages. Each stage either narrows the candidate set, scores it, or persists state. The scoring stages are where the models actually drive the decision — everything else is filtering or bookkeeping. The learning loop is the dashed arrow at the bottom: every /respond updates ModelAdaptation posteriors that the next /recommend reads at stage 6.

Stage-by-stage walkthrough

1. Load + enrich the customer

Source: platform/src/app/api/v1/recommend/route.ts → enrichCustomer

The route resolves the customer from customerId and runs the configured enrichment node of the decision flow. Enrichment pulls customer attributes from the Data module’s schema tables (declared via Schemas + loaded via Pipelines), plus any behavioral metrics that have rolling windows (impression count 30d, complaint rate 90d, etc.).

Output: a customer payload with both static attributes and rolling metrics that downstream stages can reference.

Optional flow knob: EnrichNodeConfig.excludeJoinIds[] (added in WS1) lets the flow author exclude expensive joins per decision flow when the data isn’t needed.

2. Build the candidate offer set

For each request the engine starts with every active offer in the tenant (status = "active", not soft-deleted), then narrows the set before scoring. Two optional request filters apply, plus one exemption:
  • channelId query param → only offers with creatives on the requested channel
  • placementId query param → only offers with creatives at the requested placement
  • mandatory offers (per business hierarchy) bypass downstream filtering and always rank
This is the input set that flows into qualification.

3. Qualification rules

Source: lib/qualification-engine.ts

Each qualification rule evaluates per (customer, offer) pair. There are 13 wired ruleType values (segment membership, attribute conditions, propensity thresholds, recency checks, metric conditions, etc.), and rules can be scoped global, category, sub-category, channel, offer, creative, or placement. A candidate that fails ANY qualification rule is dropped.

The qualification result is persisted to decision_traces.qualificationResults[] with {offerId, passed, reason, ruleId} so the Decision Provenance UI can answer “why did Customer X NOT get Offer Y” without leaving the row.
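The "fail any rule, get dropped" behavior can be sketched as follows. This is a minimal illustration, not the code in lib/qualification-engine.ts: the QualificationRule shape, the check() callback, the "all_rules_passed" reason, and the example rules are all hypothetical. Only the {offerId, passed, reason, ruleId} result fields come from the description above, and recording one result per offer (rather than one per rule) is an assumption.

```typescript
// Illustrative sketch only; not the code in lib/qualification-engine.ts.
type Customer = Record<string, string | number | boolean>;

// Field names follow decision_traces.qualificationResults[] as described above.
interface QualificationResult {
  offerId: string;
  passed: boolean;
  reason: string;
  ruleId: string | null;
}

// Hypothetical rule shape: check() returns null on pass, or a failure reason.
interface QualificationRule {
  id: string;
  check: (customer: Customer, offerId: string) => string | null;
}

function qualify(
  customer: Customer,
  offerIds: string[],
  rules: QualificationRule[]
): { survivors: string[]; results: QualificationResult[] } {
  const survivors: string[] = [];
  const results: QualificationResult[] = [];
  for (const offerId of offerIds) {
    let failure: { ruleId: string; reason: string } | null = null;
    for (const rule of rules) {
      const reason = rule.check(customer, offerId);
      if (reason !== null) {
        failure = { ruleId: rule.id, reason: reason };
        break; // one failing rule is enough to drop the candidate
      }
    }
    if (failure === null) {
      survivors.push(offerId);
      results.push({ offerId: offerId, passed: true, reason: "all_rules_passed", ruleId: null });
    } else {
      results.push({ offerId: offerId, passed: false, reason: failure.reason, ruleId: failure.ruleId });
    }
  }
  return { survivors: survivors, results: results };
}

// Hypothetical rules: an age check on every offer, a homeowner check on one offer.
const exampleRules: QualificationRule[] = [
  {
    id: "rule-adult",
    check: (c) => {
      const age = c["age"];
      return typeof age === "number" && age >= 18 ? null : "age_below_18";
    },
  },
  {
    id: "rule-heloc-homeowner",
    check: (c, offerId) => (offerId === "heloc" && c["homeowner"] !== true ? "not_homeowner" : null),
  },
];

const outcome = qualify({ age: 34, homeowner: false }, ["heloc", "card"], exampleRules);
```

In this example the customer passes the age rule for both offers but fails the homeowner rule for "heloc", so only "card" survives, and the failed result carries the rule ID that explains the drop.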

4. Contact policy (suppression)

Source: lib/contact-policy-engine.ts

Contact policies are the always-on layer (made implicit in WS T21, with optional skipContactPolicy opt-out per flow). 13 wired ruleType values cover frequency caps, cooldowns, category-suppression windows, complaint suppression, do_not_contact (DNC, the only mechanism that suppresses across channels), and metric_condition rules.

Each policy is per-candidate; the first blocking match suppresses. Unknown ruleTypes fail closed (block + log), which is a safety guarantee.
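The two behaviors that matter most here, first blocking match wins and unknown rule types fail closed, can be sketched as below. This is an assumption-laden illustration rather than the code in lib/contact-policy-engine.ts: the ContactPolicy, ContactContext, and Verdict shapes are hypothetical, and "frequency_cap" is a made-up ruleType name standing in for the frequency-cap family (only do_not_contact and metric_condition are named in the text). The real engine also logs on the fail-closed path, which this sketch omits.

```typescript
// Illustrative sketch only; not the code in lib/contact-policy-engine.ts.
interface ContactPolicy {
  id: string;
  ruleType: string; // "frequency_cap" below is a hypothetical name
  limit?: number;
}

// Hypothetical per-candidate context.
interface ContactContext {
  impressions30d: number;
  doNotContact: boolean;
}

interface Verdict {
  blocked: boolean;
  policyId: string | null;
  reason: string;
}

function evaluateContactPolicies(policies: ContactPolicy[], ctx: ContactContext): Verdict {
  for (const policy of policies) {
    switch (policy.ruleType) {
      case "do_not_contact":
        if (ctx.doNotContact) {
          return { blocked: true, policyId: policy.id, reason: "do_not_contact" };
        }
        break;
      case "frequency_cap":
        if (policy.limit !== undefined && ctx.impressions30d >= policy.limit) {
          return { blocked: true, policyId: policy.id, reason: "frequency_cap_reached" };
        }
        break;
      default:
        // Fail closed: an unrecognised ruleType blocks instead of being skipped.
        return { blocked: true, policyId: policy.id, reason: "unknown_rule_type:" + policy.ruleType };
    }
  }
  return { blocked: false, policyId: null, reason: "no_blocking_policy" };
}

const quietCustomer: ContactContext = { impressions30d: 2, doNotContact: false };
const allowed = evaluateContactPolicies(
  [{ id: "p1", ruleType: "do_not_contact" }, { id: "p2", ruleType: "frequency_cap", limit: 5 }],
  quietCustomer
);
const failedClosed = evaluateContactPolicies([{ id: "p3", ruleType: "not_a_real_type" }], quietCustomer);
```

Returning from inside the loop is what makes the first blocking policy decisive: later policies are never evaluated for that candidate.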

5. Maturity ramp (Bayesian Confidence-Bound — BCB-MR)

This is the first place models drive a decision.

Source: lib/ml/maturity.ts, called from lib/pipeline-runner.ts → applyMaturityRamp

The maturity ramp gates exposure for offers whose posterior is too wide to rank confidently. Detailed math is in /ai-ml/maturity-ramp. The short version: for each candidate offer, the engine computes the Wilson 95% credible-interval width for the Bernoulli posterior at the most-specific available scope (offer → channel → direction → category → global). If the width is ≤ tenant.settings.maturityWidthThreshold (default 0.20), the offer is mature and gets full exposure. Otherwise:
decayingFloor(n) = baseFloor / √(1 + n / decayHalfLife)
exposureProbability = max(decayingFloor(n), wilsonLower)
A deterministic hash of (customerId, offerId, today) rolls against the exposure probability. If the roll exceeds it, the candidate is excluded from this customer’s decision today.

Why this matters: cold-start offers get controlled exploration; mature offers run at full confidence; offers with strong early evidence aren’t punished by the floor decay. The posterior-width gate (vs. a fixed evidence-count threshold) lets low-volume offers mature once their CI is tight enough and keeps high-volume volatile offers in exploration when their CI stays wide.
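The gate above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the code in lib/ml/maturity.ts: it treats n as the offer's evidence count and successes as its positive outcomes, uses the standard Wilson score interval with z = 1.96, and uses an FNV-1a string hash for the deterministic roll because the text does not say which hash the engine uses. All function names and the MaturityConfig shape are hypothetical; the default values mirror the tenant settings quoted in this page.

```typescript
// Illustrative sketch only; not the code in lib/ml/maturity.ts.
interface MaturityConfig {
  widthThreshold: number; // maturityWidthThreshold
  baseFloor: number;      // maturityRampColdStartFloor
  decayHalfLife: number;  // maturityFloorDecayHalfLife
}

const DEFAULT_MATURITY: MaturityConfig = { widthThreshold: 0.2, baseFloor: 0.5, decayHalfLife: 10 };

// Standard Wilson score interval for a binomial proportion (z = 1.96 for 95%).
function wilsonInterval(successes: number, n: number, z: number = 1.96): { lower: number; upper: number } {
  if (n <= 0) return { lower: 0, upper: 1 }; // no evidence: widest possible interval
  const p = successes / n;
  const z2 = z * z;
  const denom = 1 + z2 / n;
  const center = (p + z2 / (2 * n)) / denom;
  const half = (z * Math.sqrt((p * (1 - p)) / n + z2 / (4 * n * n))) / denom;
  return { lower: Math.max(0, center - half), upper: Math.min(1, center + half) };
}

// decayingFloor(n) = baseFloor / sqrt(1 + n / decayHalfLife)
function decayingFloor(n: number, cfg: MaturityConfig): number {
  return cfg.baseFloor / Math.sqrt(1 + n / cfg.decayHalfLife);
}

function exposureProbability(successes: number, n: number, cfg: MaturityConfig = DEFAULT_MATURITY): number {
  const ci = wilsonInterval(successes, n);
  if (ci.upper - ci.lower <= cfg.widthThreshold) return 1; // mature: full exposure
  return Math.max(decayingFloor(n, cfg), ci.lower);
}

// Assumed hash: FNV-1a over "customerId|offerId|day", scaled into [0, 1).
function deterministicRoll(customerId: string, offerId: string, day: string): number {
  const key = customerId + "|" + offerId + "|" + day;
  let h = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    h ^= key.charCodeAt(i);
    // Multiply by the FNV prime 16777619 (2^24 + 403) modulo 2^32.
    h = ((h << 24) + h * 403) >>> 0;
  }
  return h / 4294967296;
}

// The candidate is excluded when the roll exceeds the exposure probability.
function isExposed(customerId: string, offerId: string, day: string, successes: number, n: number): boolean {
  return deterministicRoll(customerId, offerId, day) <= exposureProbability(successes, n);
}
```

With these defaults, an offer with 50 positives out of 100 has a Wilson width just under 0.20 and is treated as mature, while an offer with 5 out of 10 is still ramping and is exposed at the decayed floor of 0.5 / √2. Because the roll depends only on (customerId, offerId, day), the same customer gets the same answer for the same offer all day.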

6. Scoring — the model-heavy stage

Source: lib/pipeline-runner.ts (PRIE-U branch around line 2140) + lib/scoring/*.ts

Each candidate gets scored. The scoring method is configured per decision flow as one of priority_weighted, propensity, or formula (PRIE / PRIE-U). The decision flow can also reference a specific algorithm model via the Score Node or rely on a default scorer.

6a. Picking which algorithm scores this candidate

The platform supports 10 algorithm types. Each has its own page in /ai-ml/algorithms/*:
| Type | When it fits |
| --- | --- |
| scorecard | Rule-based weights, no training needed. Best for transparency. |
| bayesian | Naive Bayes with online updates; industry-standard Bayesian classifier with per-feature posteriors. |
| logistic_regression | Calibrated probability output. Good baseline. |
| gradient_boosted | High-fidelity tabular. Best raw AUC. Requires retraining. |
| thompson_bandit | Exploration-exploitation per offer; converges to best arm. |
| epsilon_greedy | Simpler bandit, ε% exploration. |
| online_learner | Streaming SGD logistic regression. |
| neural_cf | Collaborative filtering, customer × offer embeddings. |
| external_endpoint | Delegate scoring to a 3rd-party HTTP scorer. |
| onnx_imported | Bring-your-own ONNX model. |

registryStatus controls which models are used live. Only champion is the default scorer for its registry family; shadow scores silently for offline evaluation; challenger participates in experiments. Detailed lifecycle: /ai-ml/model-lifecycle.

6b. The hierarchical propensity read

When the scoring method is propensity or formula, the engine reads ModelAdaptation rows in this priority order:
offer  →  channel  →  direction  →  category  →  global  →  0.5 fallback
Each tier has its own evidence threshold before it’s trusted (offer ≥ 50, channel ≥ 15, direction ≥ 10, category ≥ 20, global ≥ 10). The first tier that meets its threshold wins; if offer-level evidence is sparse but present, the score is blended with the strongest available fallback via Bayesian shrinkage.

The propensitySource field on every scored candidate records which tier fired (offer, offer+blend, channel, direction, category, global, fallback). It is persisted into decision_traces.scoringResults[i].propensitySource so operators can answer “why did this offer rank where it did?”. See /ai-ml/model-lifecycle#scope-hierarchy for full thresholds.
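A rough sketch of the tiered read, using the thresholds quoted above. Several details are assumptions and are marked as such in the code: the AdaptationRow shape (positives, evidence) is hypothetical, "strongest available fallback" is interpreted as the first non-offer tier that meets its threshold, and the blend uses a common shrinkage form, (positives + k · fallback) / (evidence + k) with k = propensitySmoothingWeight, which may not match the engine's exact formula.

```typescript
// Illustrative sketch only; row shape and blend formula are assumptions.
type Scope = "offer" | "channel" | "direction" | "category" | "global";

// Hypothetical row shape; the real ModelAdaptation columns may differ.
interface AdaptationRow {
  positives: number;
  evidence: number;
}

interface PropensityRead {
  propensity: number;
  propensitySource: string;
}

const OFFER_MIN_EVIDENCE = 50;
const FALLBACK_TIERS: { scope: Scope; minEvidence: number }[] = [
  { scope: "channel", minEvidence: 15 },
  { scope: "direction", minEvidence: 10 },
  { scope: "category", minEvidence: 20 },
  { scope: "global", minEvidence: 10 },
];

function readPropensity(
  rows: Partial<Record<Scope, AdaptationRow>>,
  smoothingWeight: number = 10
): PropensityRead {
  // First non-offer tier that meets its own threshold, else the 0.5 constant.
  let fallback: PropensityRead = { propensity: 0.5, propensitySource: "fallback" };
  for (const tier of FALLBACK_TIERS) {
    const row = rows[tier.scope];
    if (row !== undefined && row.evidence >= tier.minEvidence) {
      fallback = { propensity: row.positives / row.evidence, propensitySource: tier.scope };
      break;
    }
  }

  const offer = rows.offer;
  if (offer === undefined || offer.evidence <= 0) return fallback;
  if (offer.evidence >= OFFER_MIN_EVIDENCE) {
    return { propensity: offer.positives / offer.evidence, propensitySource: "offer" };
  }

  // Sparse but present: shrink the offer rate toward the fallback (assumed formula).
  const blended =
    (offer.positives + smoothingWeight * fallback.propensity) / (offer.evidence + smoothingWeight);
  return { propensity: blended, propensitySource: "offer+blend" };
}

const sparseOffer = readPropensity({
  offer: { positives: 5, evidence: 10 },
  channel: { positives: 30, evidence: 100 },
});
```

Here the offer has 10 observations, below its threshold of 50, so its raw 0.5 rate is pulled toward the channel rate of 0.3: (5 + 10 × 0.3) / (10 + 10) = 0.4, reported with propensitySource "offer+blend".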

6c. Bandits write per-offer state

For thompson_bandit and epsilon_greedy, every /respond updates the bandit posterior at the chosen scope:
  • Thompson stores Beta(α, β). Convert → α += 1; dismiss → β += 1.
  • ε-greedy stores (pulls, totalReward). Every respond increments pulls; positive outcomes also increment totalReward.
Both update incrementally per-respond — no batch retrain needed.
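The two update rules are small enough to sketch directly. The state shapes follow the bullets above; the function names, the Outcome type, and the uniform Beta(1, 1) starting prior in the example are assumptions, not confirmed platform defaults.

```typescript
// Illustrative sketch only; names and the starting prior are assumptions.
interface BetaState {
  alpha: number;
  beta: number;
}

interface EpsilonGreedyState {
  pulls: number;
  totalReward: number;
}

type Outcome = "convert" | "dismiss";

// Thompson: convert increments alpha, dismiss increments beta.
function updateThompson(state: BetaState, outcome: Outcome): BetaState {
  return outcome === "convert"
    ? { alpha: state.alpha + 1, beta: state.beta }
    : { alpha: state.alpha, beta: state.beta + 1 };
}

// Epsilon-greedy: every respond counts as a pull; positive outcomes add reward.
function updateEpsilonGreedy(state: EpsilonGreedyState, outcome: Outcome): EpsilonGreedyState {
  return {
    pulls: state.pulls + 1,
    totalReward: state.totalReward + (outcome === "convert" ? 1 : 0),
  };
}

function posteriorMean(state: BetaState): number {
  return state.alpha / (state.alpha + state.beta);
}

function meanReward(state: EpsilonGreedyState): number {
  return state.pulls === 0 ? 0 : state.totalReward / state.pulls;
}

// Replay three responds against both bandits.
const responds: Outcome[] = ["convert", "convert", "dismiss"];
let thompson: BetaState = { alpha: 1, beta: 1 }; // assumed uniform prior
let greedy: EpsilonGreedyState = { pulls: 0, totalReward: 0 };
for (const outcome of responds) {
  thompson = updateThompson(thompson, outcome);
  greedy = updateEpsilonGreedy(greedy, outcome);
}
```

After two converts and one dismiss, the Thompson state is Beta(3, 2) with posterior mean 0.6, and the ε-greedy state is (pulls: 3, totalReward: 2). Each update touches only a pair of counters, which is why no batch retrain is needed.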

7. PRIE-U arbitration — the final ranking score

When the flow’s scoring method is formula, the final per-candidate score is a weighted geometric mean across five dimensions:
score = P^Wp × R^Wr × I^Wi × E^We × max(τ, floor)^Wu
where each dimension comes from a different source:
| Letter | Dimension | Source |
| --- | --- | --- |
| P | Propensity | Stage 6b’s hierarchical adaptation read |
| R | Relevance | computeRelevance(candidate, context): channel match, recency, segment fit |
| I | Impact | Composite of Offer.businessValue / margin / revenueValue |
| E | Emphasis | Offer.priority / 100 (manual business priority) |
| U | Uplift | CATE estimate τ = μ_T − μ_C mapped to max(0.01, 0.5 + τ/2) |

Wp, Wr, Wi, We, Wu are the per-RankingProfile weights (tenant.settings.defaultRankingProfileId, or specified on the flow). Default Wu = 0 keeps the legacy 4-factor PRIE bit-identical (back-compat). When Wu > 0, persuadable offers (τ positive) get a multiplicative boost and sleeping-dog offers (τ negative) get suppressed, which is exactly what makes the uplift signal change the ranking.

For the detailed CATE math, T-learner / X-learner derivations, and the four uplift segments (persuadable / sure_thing / lost_cause / sleeping_dog), see /ai-ml/uplift-modeling.

The PRIE composition draws on the recommender-systems literature where propensity (likelihood of conversion), relevance (channel/context match), business impact, and editorial emphasis are four axes that any multi-objective ranker must combine. The uplift dimension U is what differentiates a causal ranker from a predictive one; see /ai-ml/uplift-modeling for the references.
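The composite can be sketched as a plain product of powers. This is an illustration, not the PRIE-U branch in lib/pipeline-runner.ts: the Dimensions and Weights shapes and function names are hypothetical, the uplift term uses the mapping from the dimension table (max(0.01, 0.5 + τ/2)), and the default weights are the per-RankingProfile defaults listed in the configuration reference on this page. It assumes P, R, I, and E are already normalized to (0, 1].

```typescript
// Illustrative sketch only; not the PRIE-U branch in lib/pipeline-runner.ts.
interface Dimensions {
  propensity: number; // P
  relevance: number;  // R
  impact: number;     // I
  emphasis: number;   // E
  tau: number;        // CATE estimate feeding U
}

interface Weights {
  wp: number;
  wr: number;
  wi: number;
  we: number;
  wu: number;
}

const DEFAULT_WEIGHTS: Weights = { wp: 0.4, wr: 0.2, wi: 0.3, we: 0.1, wu: 0 };

// Uplift mapping from the dimension table: tau in [-1, 1] lands in [0.01, 1].
function upliftFactor(tau: number): number {
  return Math.max(0.01, 0.5 + tau / 2);
}

function prieUScore(d: Dimensions, w: Weights): number {
  return (
    Math.pow(d.propensity, w.wp) *
    Math.pow(d.relevance, w.wr) *
    Math.pow(d.impact, w.wi) *
    Math.pow(d.emphasis, w.we) *
    Math.pow(upliftFactor(d.tau), w.wu)
  );
}

const persuadable: Dimensions = { propensity: 0.5, relevance: 0.5, impact: 0.5, emphasis: 0.5, tau: 0.3 };
const sleepingDog: Dimensions = { propensity: 0.5, relevance: 0.5, impact: 0.5, emphasis: 0.5, tau: -0.3 };

// With Wu = 0 the uplift term is x^0 = 1, so both offers score identically.
const legacyTie = prieUScore(persuadable, DEFAULT_WEIGHTS) === prieUScore(sleepingDog, DEFAULT_WEIGHTS);

// With a hypothetical Wu = 0.2 the persuadable offer outranks the sleeping dog.
const withUplift: Weights = { wp: 0.4, wr: 0.2, wi: 0.3, we: 0.1, wu: 0.2 };
const upliftSeparates = prieUScore(persuadable, withUplift) > prieUScore(sleepingDog, withUplift);
```

Because the four legacy weights sum to 1, setting every dimension to 0.5 yields a score of 0.5 when Wu = 0, which is what makes the product a weighted geometric mean. Note that a multiplicative form means any dimension at 0 zeroes the whole score.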

8. Allocation — Hungarian or greedy

For multi-placement decisions (the group node in a decision flow), the engine has to assign offers to placement slots. Two strategies:
  • Hungarian: globally optimal assignment that maximizes the total score across all (offer, placement) pairs subject to constraints (one offer per slot, no offer repeated, channel coupling). O(n³). Default for premium accounts.
  • Greedy: each placement takes the highest-scoring offer still available, so subsequent placements get the next-best. O(n log n). Used when the latency budget is tight.
After allocation, the channel atomic coupling pass applies: if any placement on a channel with couplingMode: "atomic" is empty (couldn’t find a viable offer), the engine empties the ENTIRE channel — so a half-rendered email never goes out. The flow’s couplingOverride lets you toggle this per-flow.
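The greedy strategy and the atomic coupling pass can be sketched together. This is a simplified illustration, not the engine's allocator: the Placement and ScoredOffer shapes, the eligiblePlacements list (standing in for "has a creative at this placement"), and passing atomic channels as a plain list are all assumptions. The Hungarian strategy is not shown.

```typescript
// Illustrative sketch only; shapes and names are hypothetical.
interface Placement {
  id: string;
  channelId: string;
}

interface ScoredOffer {
  offerId: string;
  score: number;
  eligiblePlacements: string[]; // assumed stand-in for creative availability
}

type Assignment = Record<string, string | null>; // placementId -> offerId or empty

// Greedy: walk placements in order; each takes the best unused eligible offer.
function greedyAllocate(placements: Placement[], offers: ScoredOffer[]): Assignment {
  const ranked = offers.slice().sort((a, b) => b.score - a.score);
  const used: Record<string, boolean> = {};
  const assignment: Assignment = {};
  for (const placement of placements) {
    assignment[placement.id] = null;
    for (const offer of ranked) {
      if (!used[offer.offerId] && offer.eligiblePlacements.indexOf(placement.id) !== -1) {
        assignment[placement.id] = offer.offerId;
        used[offer.offerId] = true; // no offer is repeated
        break;
      }
    }
  }
  return assignment;
}

// Atomic coupling: one empty slot on an atomic channel empties the whole channel.
function applyAtomicCoupling(
  placements: Placement[],
  assignment: Assignment,
  atomicChannelIds: string[]
): Assignment {
  const result: Assignment = {};
  for (const key in assignment) result[key] = assignment[key];
  for (const channelId of atomicChannelIds) {
    const slots = placements.filter((p) => p.channelId === channelId);
    const anyEmpty = slots.some((p) => result[p.id] === null);
    if (anyEmpty) {
      for (const slot of slots) result[slot.id] = null;
    }
  }
  return result;
}

const placements: Placement[] = [
  { id: "email-hero", channelId: "email" },
  { id: "email-footer", channelId: "email" },
  { id: "web-banner", channelId: "web" },
];
const offers: ScoredOffer[] = [
  { offerId: "A", score: 0.9, eligiblePlacements: ["email-hero", "web-banner"] },
  { offerId: "B", score: 0.5, eligiblePlacements: ["web-banner"] },
];

const greedyResult = greedyAllocate(placements, offers);
const coupledResult = applyAtomicCoupling(placements, greedyResult, ["email"]);
```

In this example greedy fills email-hero with A and web-banner with B but leaves email-footer empty, so the coupling pass clears email-hero as well and only the web banner is returned. A globally optimal assignment might have placed A differently, which is the trade-off the Hungarian strategy addresses.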

Bookkeeping — what gets persisted

Before the response returns, the engine writes:
  1. One recommendation-type interaction_history row per returned decision — via the new persistDecisionInteractions helper (Bug #248 fix). This is the audit join key the /respond route uses to bind a {customerId, rank} pair back to the (offerId, creativeId, channelId) that was actually shown.
  2. One impression-type interaction_history row per decision delivered on a channel where impressionMode != "explicit" — for channels we send (email, batch), the impression is auto-recorded. For client-rendered channels (web, mobile push), the impression isn’t recorded until the client calls /api/v1/impressions.
  3. One decision_trace row with the full forensic chain: qualification results, contact policy decisions, scoring results (with propensitySource, upliftTau, upliftMultiplier per candidate), selected offers, ranking weights used, experiment assignment if any, inputsHash, totalLatencyMs. Sampled per tenant.settings.decisionTraceSampleRate.

The learning loop — what /respond does to the next decision

The ModelAdaptation row updated at each scope is what the next /recommend reads in stage 6b. Because adaptations are tiered, a single respond improves scoring at every level the offer participates in:
  • (scope: "offer", scopeId: <offerId>) — directly improves this offer’s per-decision score
  • (scope: "category", scopeId: <categoryId>) — improves baseline for every offer in this category
  • (scope: "channel", scopeId: <channelId>) — improves baseline for every offer on this channel
  • (scope: "direction", scopeId: "inbound" | "outbound") — improves baseline for traffic with this intent
  • (scope: "global", scopeId: "") — improves baseline for everything
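The fan-out above can be sketched as one respond producing five keyed updates. The (scope, scopeId) pairs follow the list; everything else is an assumption for illustration: the RespondContext and Counts shapes, the in-memory store keyed by "scope:scopeId", and the simple counter increment standing in for the real posterior upsert.

```typescript
// Illustrative sketch only; store layout and counter fields are hypothetical.
type AdaptationScope = "offer" | "category" | "channel" | "direction" | "global";

interface RespondContext {
  offerId: string;
  categoryId: string;
  channelId: string;
  direction: "inbound" | "outbound";
}

interface AdaptationKey {
  scope: AdaptationScope;
  scopeId: string;
}

interface Counts {
  positives: number;
  evidence: number;
}

// One respond touches every scope the offer participates in.
function adaptationKeysFor(ctx: RespondContext): AdaptationKey[] {
  return [
    { scope: "offer", scopeId: ctx.offerId },
    { scope: "category", scopeId: ctx.categoryId },
    { scope: "channel", scopeId: ctx.channelId },
    { scope: "direction", scopeId: ctx.direction },
    { scope: "global", scopeId: "" },
  ];
}

function applyRespond(store: Record<string, Counts>, ctx: RespondContext, positive: boolean): void {
  for (const key of adaptationKeysFor(ctx)) {
    const id = key.scope + ":" + key.scopeId;
    const existing: Counts | undefined = store[id];
    const current: Counts = existing === undefined ? { positives: 0, evidence: 0 } : existing;
    store[id] = {
      positives: current.positives + (positive ? 1 : 0),
      evidence: current.evidence + 1,
    };
  }
}

const adaptationStore: Record<string, Counts> = {};
const respondCtx: RespondContext = {
  offerId: "offer-42",
  categoryId: "loans",
  channelId: "email",
  direction: "outbound",
};
applyRespond(adaptationStore, respondCtx, true);
applyRespond(adaptationStore, respondCtx, false);
```

After one positive and one negative respond, each of the five rows holds one positive out of two observations, so the offer's own row and every broader baseline it rolls up into have all moved.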
The Bug #248 attribution precondition guard prevents inflated learning: positive outcomes credited against an offer the customer was never actually shown (e.g. external attribution noise) get blocked from the adaptation upsert with status: "recorded_without_adaptation" plus an attribution_precondition_failed audit row. Model state stays protected.

The model_matured telemetry event fires when an offer’s Wilson CI width crosses the maturity threshold downward for the first time: one event per (model × scope × scopeId) transition. See /ai-ml/maturity-ramp for how this gates exposure on subsequent /recommend calls.

Worked example — the headline finding from the model-architecture round

A real live-test result from /api/v1/algorithm-models/.../uplift?method=t_learner&mode=fitted against the e2e tenant. The same offer (Auto Loan Refi) produces a different CATE in two different score-time contexts:
| Customer context (channel × direction) | τ (CATE) | Segment | What the engine concludes |
| --- | --- | --- | --- |
| direct_mail × inbound | −0.073 | sleeping_dog | Showing this offer suppresses conversion. Hide it. |
| direct_mail × outbound | +0.0036 | uncertain | Neutral effect. Default ranking applies. |

The marginal mode (which used pre-aggregated ModelAdaptation rates) collapsed both to τ = 0 and couldn’t distinguish them. The fitted mode (two separate logistic regressions on treated vs. control subsets of interaction_history, with per-row features for channel one-hot, direction one-hot, time-of-day sin/cos, day-of-week sin/cos) produces context-varying τ, which is exactly what the per-customer CATE literature (Künzel et al. PNAS 2019) calls the heterogeneous treatment effect.

Plug Wu > 0 into the flow’s RankingProfile and the ranking now actively pushes the sleeping_dog down on inbound while leaving it neutral on outbound. The same model, the same offer, two different decisions per channel-direction context. This is what makes Kaireon’s decisioning behave differently from a propensity-only system.
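To put rough numbers on that push, the two τ values from the table can be run through the uplift mapping used by PRIE-U, max(0.01, 0.5 + τ/2). The weight Wu = 0.2 below is a hypothetical choice for illustration; it is not a platform default (the default is 0), and the comparison assumes the other four dimensions are identical across the two contexts.

```typescript
// Worked arithmetic only; Wu = 0.2 is an assumed, non-default weight.
function upliftFactor(tau: number): number {
  return Math.max(0.01, 0.5 + tau / 2);
}

const inboundU = upliftFactor(-0.073);   // 0.5 - 0.0365 = 0.4635
const outboundU = upliftFactor(0.0036);  // 0.5 + 0.0018 = 0.5018

const assumedWu = 0.2;

// Ratio of the inbound score to the outbound score, all else being equal.
const inboundVsOutbound = Math.pow(inboundU / outboundU, assumedWu);

// With the default Wu = 0 the uplift term drops out and the ratio is exactly 1.
const ratioAtDefault = Math.pow(inboundU / outboundU, 0);
```

Under this assumed weight the inbound score comes out roughly 1.6% below the outbound score for the same offer. A larger Wu widens the gap, and at the default Wu = 0 the two contexts rank identically.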

Configuration knobs — quick reference

Every knob the operator can turn that affects the scoring path:

Per-tenant (tenant.settings)

| Setting | Default | Where it lands |
| --- | --- | --- |
| maturityRampMode | "bayesian_ci" | Stage 5: BCB-MR vs. legacy_count |
| maturityWidthThreshold | 0.20 | Stage 5 + telemetry D threshold |
| maturityRampColdStartFloor | 0.50 | Stage 5: baseFloor |
| maturityFloorDecayHalfLife | 10 | Stage 5: floor decay constant |
| modelMaturityThreshold | 100 | Stage 5: legacy_count mode only |
| upliftMethodDefault | "t_learner" | Stage 7: default for /uplift endpoint |
| propensityScoreFloor | 0.05 | Stage 6: minimum propensity component |
| propensitySmoothingWeight | 10 | Stage 6b: Bayesian shrinkage strength |
| defaultRankingProfileId | | Stage 7: Wp/Wr/Wi/We/Wu source |

UI: /settings/models (new this round) exposes the maturity + uplift knobs; the rest live under /settings.

Per-RankingProfile (weights JSONB)

propensityWeight (Wp, default 0.4), relevanceWeight (Wr, default 0.2), impactWeight (Wi, default 0.3), emphasisWeight (We, default 0.1), upliftWeight (Wu, default 0).

Per-DecisionFlow

rankingProfileId (which weights to use), scoringMethod (priority_weighted / propensity / formula), couplingOverride (channel atomic coupling), skipContactPolicy (rare — for synthetic flows that shouldn’t suppress).

Per-AlgorithmModel

status (operational), registryStatus (lifecycle), autoLearn (whether /respond updates state), learnMode, learnSchedule, outcomeWeights.

Kaireon’s decisioning capabilities at a glance

| Capability | Implementation |
| --- | --- |
| Per-(channel × direction) posterior | Per-scope row with (scope, scopeId) as the unit of adaptation |
| Maturity gate | Wilson credible-interval width (principled: width ≤ 0.20 at 95% CI = mature) |
| Per-customer CATE | T-learner and X-learner, per-context uplift estimation |
| τ in ranking | PRIE-U composite: P^Wp × R^Wr × I^Wi × E^We × max(τ, floor)^Wu |
| Historical backfill | Idempotent cron, replayable from decision_traces |
| Maturity telemetry | model_matured audit event with old/new state |
| Per-scope adaptation UI | Adaptations panel with Wilson CI bars per (scope, scopeId) |

See /about for the platform overview, or jump back to Core concepts for the building-block view.