Recommend - KaireonAI

POST /api/v1/recommend runs the tenant’s published decision flow for one customer and returns a ranked list of offers, together with an interactionId and a recommendationId for downstream attribution. The route is the production hot path for next-best-action delivery.

What it does

The route resolves a decision flow for the (tenant, channel, placement) tuple, then hands execution to the decision-flow engine. The engine walks the flow’s nodes (Enrich → Qualify → Score → Rank → Compute) and returns the ranked candidates plus a compact trace summary.

Resolution preference is explicit > routed > auto-selected: an explicit decisionFlowKey (or legacy blueprintKey) wins; if absent, a flow-route lookup keyed by channel and placement runs; if no route matches, the most recently updated published or active flow is picked; if no flows exist, a base flow is lazy-created on the first call.

The route has two synchronous side effects per call. Every returned decision is written to the interaction history as a recommendation row so POST /api/v1/respond can look it up by recommendationId + rank. Decisions whose channel does not require explicit impression tracking are also auto-recorded as impression rows in the same partitioned table. Both writes use raw SQL because the standard batch-insert path emits an ON CONFLICT (id) clause that is invalid against a composite primary key on a partitioned table.

Quick start

curl -X POST https://playground.kaireonai.com/api/v1/recommend \
  -H "Content-Type: application/json" \
  -H "X-API-Key: krn_your_api_key" \
  -H "X-Tenant-Id: 5a9904b9-..." \
  -d '{
    "customerId": "cust_42",
    "channel": "email",
    "placement": "hero_banner",
    "limit": 3,
    "sessionId": "9b1d-4e6c",
    "attributes": { "tier": "gold" },
    "locale": "en-US",
    "currency": "USD"
  }'

Response (single-placement shape, abbreviated):

{
  "interactionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "recommendationId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "customerId": "cust_42",
  "sessionId": "9b1d-4e6c",
  "decisionFlowKey": "main",
  "decisionFlowVersion": 7,
  "experimentVariant": null,
  "controlGroup": false,
  "direction": "inbound",
  "timestamp": "2026-04-30T14:22:01.123Z",
  "channel": "email",
  "placement": "hero_banner",
  "locale": "en-US",
  "currency": "USD",
  "count": 3,
  "decisions": [
    {
      "rank": 1,
      "score": 0.84,
      "offerId": "off_premium_card",
      "offerName": "Premium Travel Card",
      "channelName": "Email",
      "channelType": "email",
      "placementId": "plc_hero",
      "placementName": "Hero Banner",
      "categoryId": "cat_credit",
      "categoryName": "Credit Cards",
      "subCategory": "Travel",
      "mandatory": false,
      "priority": 80,
      "weight": 100,
      "creativeId": "crv_email_a",
      "creativeName": "Email Variant A",
      "templateType": "html",
      "content": "...",
      "properties": {},
      "abTestVariant": null,
      "constraints": {},
      "expiresAt": null,
      "metadata": {},
      "scoreExplanation": {
        "method": "priority_weighted",
        "priority": 80,
        "weight": 100,
        "fitMultiplier": 1.0,
        "finalScore": 0.84
      },
      "personalization": { "personalized_rate": 4.99, "greeting": "Welcome back" },
      "impressionId": "imp_a3f1..."
    }
  ],
  "meta": {
    "totalCandidates": 12,
    "afterQualification": 8,
    "afterSuppression": 8,
    "afterContactPolicy": 6,
    "degradedScoring": false
  }
}

Live sample — captured from playground

The following request and response are captured verbatim from the 2026-05-05 functional test against https://playground.kaireonai.com, showing differential propensity scoring for customer C1 (high credit, short tenure) when the published flow’s score node points at Scorecard A (credit-first weights).

curl -X POST 'https://playground.kaireonai.com/api/v1/recommend' \
  -H 'Content-Type: application/json' \
  -H 'X-API-Key: krn_…' \
  -H 'X-Tenant-Id: b25341df-…' \
  -d '{
    "customerId": "C1",
    "channel": "FT Email",
    "limit": 3,
    "attributes": {
      "credit_score": 780,
      "tenure_months": 3,
      "has_mortgage": false,
      "income_band": "high"
    }
  }'

Response:

{
  "interactionId": "9f12ecaa-…",
  "recommendationId": "9f12ecaa-…",
  "customerId": "C1",
  "channel": "FT Email",
  "decisionFlowKey": "base-nba-flow",
  "decisionFlowVersion": 31,
  "count": 3,
  "decisions": [
    {
      "rank": 1,
      "score": 0.9800,
      "offerId": "1ca9c6eb-…",
      "offerName": "FT Mortgage Refi",
      "channelName": "FT Email",
      "scoreExplanation": "propensity (Scorecard A)",
      "personalization": {}
    }
  ]
}

Switching the published flow to Scorecard B (loyalty-first weights) and repeating the same request returns score: 0.6460 for the same customer — proving the algorithm swap takes effect end-to-end (see functional test report).

How it works

Authentication and quota

Every call resolves a tenant before any work runs. Requests carrying an X-API-Key that starts with krn_ are validated against the database and the tenant id bound to the key is used (the X-Tenant-Id header is ignored on this path to prevent spoofing); all other requests fall back to a tenant id read from the request headers. Missing tenant context returns 401; a tenant id that doesn’t exist in the database returns 403.

After auth, the handler enforces per-window rate limits and, for playground tenants, a lifetime decision quota. Playground tenants are capped at 5,000 lifetime decisions; past that the route returns 429 with error code "PLAYGROUND_QUOTA_EXCEEDED". Non-playground tenants face no decision quota.
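The tenant-resolution rules above can be sketched as a small pure function. This is an illustration only, not the route's actual code: the TenantStore interface and its two methods are hypothetical stand-ins for the database lookups, header names are assumed to be lowercased already, and treating an unrecognized krn_ key as missing tenant context (401) is an assumption the source does not confirm.

```typescript
type TenantResolution =
  | { ok: true; tenantId: string }
  | { ok: false; status: 401 | 403 };

// Hypothetical lookups standing in for the real database calls.
interface TenantStore {
  tenantIdForApiKey(apiKey: string): string | undefined;
  tenantExists(tenantId: string): boolean;
}

function resolveTenant(
  headers: Record<string, string | undefined>, // assumed lowercased header names
  store: TenantStore,
): TenantResolution {
  const apiKey = headers["x-api-key"];
  let tenantId: string | undefined;
  if (apiKey !== undefined && apiKey.indexOf("krn_") === 0) {
    // A krn_ key binds the tenant; X-Tenant-Id is ignored on this path.
    tenantId = store.tenantIdForApiKey(apiKey);
  } else {
    tenantId = headers["x-tenant-id"];
  }
  // Assumption: an unknown krn_ key is handled like missing tenant context.
  if (tenantId === undefined || tenantId === "") return { ok: false, status: 401 };
  if (!store.tenantExists(tenantId)) return { ok: false, status: 403 };
  return { ok: true, tenantId };
}
```

Keeping the function free of I/O makes the spoofing rule easy to check: a request that sends both a valid key and a different X-Tenant-Id resolves to the key's tenant.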

Anonymous-customer derivation

When customerId is missing or set to "anonymous", the route derives a stable surrogate. With a sessionId present, the surrogate is anon-{sessionId} after validating the session id against ^[a-zA-Z0-9_-]+$ and capping at 64 chars. Without a session id, the surrogate is anon-{8-hex} derived from an FNV-1a hash of x-forwarded-for + user-agent. The GET endpoint runs the same derivation.
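A minimal sketch of the derivation described above, using the standard 32-bit FNV-1a constants. Several details are assumptions rather than confirmed behavior: the two header values are concatenated without a separator, the hash runs over UTF-16 code units (identical to byte-wise FNV-1a for ASCII input), an over-long session id is truncated to 64 characters, and a session id that fails validation falls through to the hash-based surrogate. The function names are hypothetical.

```typescript
// 32-bit FNV-1a over the string's UTF-16 code units.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit wrapping multiply
  }
  return hash >>> 0; // back to unsigned
}

const SESSION_ID_PATTERN = /^[a-zA-Z0-9_-]+$/;

function deriveCustomerId(
  customerId: string | undefined,
  sessionId: string | undefined,
  forwardedFor: string,
  userAgent: string,
): string {
  if (customerId !== undefined && customerId !== "" && customerId !== "anonymous") {
    return customerId; // a real id was supplied, nothing to derive
  }
  if (sessionId !== undefined && SESSION_ID_PATTERN.test(sessionId)) {
    return "anon-" + sessionId.slice(0, 64); // assumed: cap by truncation
  }
  // Assumed input layout: forwarded-for immediately followed by user-agent.
  const hex = ("00000000" + fnv1a32(forwardedFor + userAgent).toString(16)).slice(-8);
  return "anon-" + hex;
}
```

Because both branches are deterministic, the same visitor keeps the same surrogate across calls as long as the session id, or the header pair, stays stable.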

Decision-flow resolution

Resolution order:

1. Explicit decisionFlowKey (or legacy blueprintKey) in the body. If the value is not a key, the route attempts a case-insensitive name lookup and rewrites it to a key.
2. Flow-route lookup keyed by (tenantId, channelId | channel, placement), cached for 120s under route:{tenantId}:{channelId}:{placement}.
3. Most recently updated published or active flow.
4. If no flow exists, a base flow is lazy-created on the first call.

If none of these steps produces a key, the handler returns 400 with the message "No published decision flow found".
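The resolution order can be sketched as one function over in-memory data. Everything here is hypothetical scaffolding for illustration: the Flow shape, the routes table keyed by "<channel>:<placement>", and the createBaseFlow callback are not the engine's real types. Two behaviors are assumptions: an explicit value that matches neither a key nor a name is passed through unchanged, and a null result (flows exist but none is published or active) is what leads to the 400.

```typescript
interface Flow {
  key: string;
  name: string;
  status: "draft" | "published" | "active";
  updatedAt: number; // epoch millis
}

interface ResolveInput {
  decisionFlowKey?: string;
  blueprintKey?: string; // legacy alias
  channelId?: string;
  channel?: string;
  placement?: string;
}

type Resolved = { key: string; via: "explicit" | "route" | "latest" | "lazy-base" };

function resolveFlowKey(
  input: ResolveInput,
  flows: Flow[],
  routes: Record<string, string>, // hypothetical: "<channel>:<placement>" -> flow key
  createBaseFlow: () => Flow,     // hypothetical lazy-create hook
): Resolved | null {
  // 1. Explicit key, with case-insensitive name fallback.
  const explicit = input.decisionFlowKey ?? input.blueprintKey;
  if (explicit !== undefined) {
    if (flows.some((f) => f.key === explicit)) return { key: explicit, via: "explicit" };
    const byName = flows.find((f) => f.name.toLowerCase() === explicit.toLowerCase());
    return { key: byName ? byName.key : explicit, via: "explicit" };
  }
  // 2. Flow-route lookup; channelId takes precedence over channel.
  const channel = input.channelId ?? input.channel;
  if (channel !== undefined) {
    const routed = routes[`${channel}:${input.placement ?? ""}`];
    if (routed !== undefined) return { key: routed, via: "route" };
  }
  // 3. Most recently updated published or active flow.
  const live = flows
    .filter((f) => f.status === "published" || f.status === "active")
    .sort((a, b) => b.updatedAt - a.updatedAt);
  if (live.length > 0) return { key: live[0].key, via: "latest" };
  // 4. No flows at all: lazy-create the base flow.
  if (flows.length === 0) return { key: createBaseFlow().key, via: "lazy-base" };
  return null; // caller answers 400
}
```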

Kill switch and control group

The tenant kill switch reads tenant.settings.nbaEnabled and caches the result for 60s under nba-enabled:{tenantId}. When the flag is false, the route bypasses flow execution and returns a fallback priority response (offers sorted by priority descending, with nbaEnabled: false and meta.fallbackMode = "priority_only").

Control-group bucketing runs an FNV-1a hash of control:{customerId}:{YYYY-MM-DD} to deterministically bucket the customer for the day. The percentage is read from tenant.settings.controlGroupPercent (default 2%, cached 60s under control-group-pct:{tenantId}). Control-group decisions keep qualification and contact-policy filtering but get scores randomized via the same hash function so the rank order is independent of the model.
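The daily control-group bucketing can be illustrated as follows. The hash input string and the 2% default come from the description above; mapping the hash to a bucket with modulo 100 is an assumption, and the function names are hypothetical.

```typescript
// 32-bit FNV-1a over the string's UTF-16 code units.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit wrapping multiply
  }
  return hash >>> 0;
}

// YYYY-MM-DD for the UTC day containing `now`.
function utcDay(now: Date): string {
  return now.toISOString().slice(0, 10);
}

function isInControlGroup(customerId: string, now: Date, percent: number = 2): boolean {
  // Assumed bucketing: hash modulo 100, compared against the configured percentage.
  const bucket = fnv1a32(`control:${customerId}:${utcDay(now)}`) % 100;
  return bucket < percent;
}
```

Because the day is part of the hash input, a customer stays in the same bucket for every request within one UTC day and may land in a different bucket the next day.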

Engine execution

The engine returns a result envelope with the following shape:

{
  now: Date,
  results: any[],
  customerId: string,
  decisionFlowId: string,
  decisionFlowVersion: number | null,
  variantName: string | null,
  traceSummary: { totalCandidates, afterQualification, afterSuppression, afterContactPolicy, topScores },
  degradedScoring?: boolean,
  debugTrace?: { ... }   // only when debug is requested
}

The per-decision shape is built by the response node. A scoreExplanation block is set on every decision and is asserted to be present by the integration test suite.

Realtime EXP3-IX bandit

When tenantSettings.aiAnalyzerSettings.ranking.exp3IxEnabled is true and arms are configured, the route samples one arm before flow execution. The picked armIndex and armId thread into the auto-impression’s response JSON and into the top-level response body. When the flag is off or no arms are configured, the response omits banditArmIndex. See EXP3-IX Ranking for arm configuration.

Side effects

The auto-impression block filters decisions whose channel impressionMode !== "explicit" and inserts one impression row per decision via raw SQL. The recommendation-recording block then inserts one recommendation row per decision. Both blocks loop sequentially — N decisions produce up to 2N round-trips against the partitioned interaction-history table. This is intentional: a single batch INSERT … ON CONFLICT (id) DO NOTHING cannot be expressed against a composite primary key on a partitioned table.
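To make the 2N bound concrete, the sketch below plans the writes for a request without touching a database. The Decision and PlannedWrite shapes are hypothetical, and each planned write stands for one raw SQL round-trip; the real handler's row contents are not shown in the source and are not reproduced here.

```typescript
interface Decision {
  offerId: string;
  rank: number;
  impressionMode: string; // "explicit" means the client reports impressions itself
}

interface PlannedWrite {
  kind: "impression" | "recommendation";
  offerId: string;
  rank: number;
}

function planInteractionWrites(decisions: Decision[]): PlannedWrite[] {
  const writes: PlannedWrite[] = [];
  // Impression loop: skip channels that require explicit impression tracking.
  for (const d of decisions) {
    if (d.impressionMode !== "explicit") {
      writes.push({ kind: "impression", offerId: d.offerId, rank: d.rank });
    }
  }
  // Recommendation loop: one row per returned decision, always.
  for (const d of decisions) {
    writes.push({ kind: "recommendation", offerId: d.offerId, rank: d.rank });
  }
  return writes;
}
```

With N decisions the plan holds between N writes (all channels explicit) and 2N writes (no channel explicit), matching the bound stated above.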

Reference

Request body

The POST body is read field-by-field rather than validated against a single Zod schema. Per-field shape requirements are listed below. The batch endpoint at /api/v1/recommend/batch does use a single Zod schema.

customerId

string

default:"anonymous"

Unique customer identifier. When omitted or set to "anonymous", the route derives a stable surrogate from sessionId (preferred) or x-forwarded-for + user-agent.

channel

string

Filter candidates to creatives whose channel matches this channel type or name.

channelId

string

Channel ID (UUID) used for flow-route lookup. When supplied, takes precedence over channel for routing.

placement

string

Filter candidates to creatives bound to this placement.

placements

array

Multi-placement request. Each entry is { placementId: string, limit?: number }. When present, the response is the multi-placement shape and the single-placement fields are omitted.

deduplicate

boolean

default:"false"

Multi-placement only. When true, placements are resolved sequentially and each placement excludes offers already returned by earlier placements.

limit

number

default:"5"

Maximum decisions returned. Clamped to [1, 50].

sessionId

string

Session identifier. Used both for anonymous-customer derivation and for echoing back into the response and the auto-impression’s context. Validated as alphanumeric/-/_, max 64 chars.

context

object

default:"{}"

Free-form real-time context (device, page URL, etc.) merged into the auto-impression’s context JSON.

segments

array

default:"[]"

Customer segment ids passed through to the engine for qualification rules that match against segments.

attributes

object

default:"{}"

Per-request customer attributes. Available to the Compute stage as attributes.<key> variables when evaluating computed-field formulas.

locale

string

Locale code (e.g. en-US). Echoed back in the response; reserved for content selection in future stages.

currency

string

ISO 4217 currency code. Echoed back in the response.

direction

string

default:"inbound"

inbound or outbound. Stored on the recommendation interaction row.

excludeOffers

array

default:"[]"

Offer IDs to exclude from candidates. Legacy alias excludeActions is also accepted.

excludeCreatives

array

default:"[]"

Creative IDs to exclude. Legacy alias excludeTreatments is also accepted.

decisionFlowKey

string

Explicit flow key (or name — a case-insensitive name lookup also resolves). Legacy alias blueprintKey is also accepted.

debug

boolean

default:"false"

When true, the engine attaches a debugTrace block (per-rule pass/fail reasons) to the response.

explain

boolean

default:"false"

When true, the route adds an explanation object to each decision and a rejectedOffers[] block built from the debug trace. Implies debug: true.
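Since the body is read field by field, the documented defaults, the limit clamp, and the legacy aliases can be summarized in one normalization sketch. This is an illustration, not the handler's code: the RecommendBody interface covers only a subset of fields, normalizeBody is a hypothetical name, and giving the current field name precedence over its legacy alias when both are sent is an assumption.

```typescript
interface RecommendBody {
  customerId?: string;
  limit?: number;
  direction?: string;
  excludeOffers?: string[];
  excludeActions?: string[];    // legacy alias of excludeOffers
  excludeCreatives?: string[];
  excludeTreatments?: string[]; // legacy alias of excludeCreatives
  decisionFlowKey?: string;
  blueprintKey?: string;        // legacy alias of decisionFlowKey
  debug?: boolean;
  explain?: boolean;
}

function normalizeBody(body: RecommendBody) {
  const explain = body.explain === true;
  const rawLimit =
    typeof body.limit === "number" && isFinite(body.limit) ? body.limit : 5; // default 5
  return {
    customerId: body.customerId ?? "anonymous",
    limit: Math.min(50, Math.max(1, rawLimit)), // clamped to [1, 50]
    direction: body.direction ?? "inbound",
    excludeOffers: body.excludeOffers ?? body.excludeActions ?? [],
    excludeCreatives: body.excludeCreatives ?? body.excludeTreatments ?? [],
    decisionFlowKey: body.decisionFlowKey ?? body.blueprintKey,
    explain,
    debug: explain || body.debug === true, // explain implies debug
  };
}
```

For example, normalizeBody({ limit: 500, explain: true }) yields limit 50 with both explain and debug set to true.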

Response (single-placement)

This is the shape returned for both the explicit-key path and the auto-resolve path.

{
  "interactionId": "uuid",
  "recommendationId": "uuid",
  "customerId": "cust_42",
  "sessionId": "9b1d-4e6c",
  "decisionFlowKey": "main",
  "decisionFlowVersion": 7,
  "experimentVariant": null,
  "controlGroup": false,
  "direction": "inbound",
  "timestamp": "2026-04-30T14:22:01.123Z",
  "channel": "email",
  "placement": "hero_banner",
  "locale": "en-US",
  "currency": "USD",
  "count": 3,
  "decisions": [ /* see decisions[] below */ ],
  "banditArmIndex": 2,
  "banditArmId": "arm_b",
  "meta": {
    "totalCandidates": 12,
    "afterQualification": 8,
    "afterSuppression": 8,
    "afterContactPolicy": 6,
    "degradedScoring": false,
    "negotiationApply": { "applied": 1, "rejected": 0 }
  }
}

interactionId

string

UUID minted at the start of the request. Echoed in the auto-impression’s response.interactionId and the recommendation row’s response.interactionId.

recommendationId

string

Same value as interactionId. Use either when calling POST /api/v1/respond.

customerId

string

Either the supplied customerId or the derived anonymous surrogate.

sessionId

string | null

Echoed back from the request body.

decisionFlowKey

string

Key of the flow that ran. In the explicit-key path the value reflects the raw request input (after name-to-key translation); in the auto-resolve path it reflects the resolved key.

decisionFlowVersion

number | null

Version number of the flow’s published snapshot, or null when running from a draft configuration.

experimentVariant

string | null

Variant name when the flow has an experiment node and the customer was assigned a variant.

controlGroup

boolean

True when the customer was bucketed into the always-on control group for the current UTC day.

direction

string

Echoed from the request body, default "inbound".

timestamp

string

ISO timestamp from bpResult.now, set when the engine started.

channel

string

Echoed from the channel request field, or "all" when none was supplied.

placement

string

Echoed from the placement request field, or "all" when none was supplied.

locale

string | null

Echoed from the request body.

currency

string | null

Echoed from the request body.

count

number

Number of items in decisions[] after the per-request limit was applied.

decisions

array

Ranked offers. See the per-decision sub-fields below.

rejectedOffers

array

Present only when explain=true and the debug trace contains rejection reasons. Each entry is { offerId, offerName, stage: "eligibility" | "contact_policy", reason }.

banditArmIndex

number

Present only when EXP3-IX is enabled for the tenant AND tenantSettings.aiAnalyzerSettings.ranking has configured arms. The route picks one arm before flow execution and threads its index into the response.

banditArmId

string

Companion to banditArmIndex. Echo this value back into interaction.response when calling /respond so the arm’s log-weight gets updated on outcome.

`decisions[]` per-item shape

rank

number

1-based rank after sorting and any control-group reshuffle.

score

number

Final score from the scorer (or randomized in control group).

offerId

string

offerName

string

channelName

string | null

Joined from creative.channel.name.

channelType

string | null

Joined from creative.channel.channelType.

placementId

string | null

placementName

string | null

Joined from creative.placement.name.

categoryId

string | null

categoryName

string | null

Joined from offer.categoryRef.name with fallbacks.

subCategory

string | null

Joined from offer.subCategoryRef.name with fallbacks.

mandatory

boolean

Mirrored from offer.mandatory.

priority

number

Mirrored from the candidate’s priority (driven by offer.priority).

weight

number

default:"100"

Mirrored from the candidate’s weight.

creativeId

string | null

creativeName

string | null

templateType

string | null

From creative.templateType.

content

any

From creative.content.

properties

object

Per-candidate property bag from the engine.

abTestVariant

string | null

From creative.abTestVariant.

constraints

object

From creative.constraints.

expiresAt

string | null

ISO timestamp from offer.expiresAt.

metadata

object

From offer.metadata.

scoreExplanation

object

Set on every decision; the integration test suite asserts it must be present. Shape: { method, priority, weight, fitMultiplier, finalScore }.

personalization

object

Free-form Record<string, any>. Keys are tenant-defined — they come from the category’s customFields of type computed, plus optional flow-level extras and per-flow overrides. Standard examples include personalized_rate or greeting, but the field set is open.

impressionId

string

Present only when the candidate’s channel uses implicit impression tracking and the auto-impression insert succeeded. Looked up by deduplication id and threaded into the decision.

appliedNegotiation

object

Present only when the realtime apply-mode wire ran and the candidate was accepted. Shape: { sessionId, proposal }.

appliedNegotiationReject

object

Present only when the realtime apply-mode wire rejected the candidate. Shape: { sessionId, reason }.

Response (multi-placement)

Returned when the request body supplies a placements[] array.

{
  "placements": {
    "hero_banner": {
      "offers": [
        {
          "rank": 1,
          "score": 0.84,
          "offerId": "off_a",
          "offerName": "...",
          "creativeId": "crv_x",
          "creativeName": "...",
          "channelName": "Email",
          "categoryName": "Cards",
          "mandatory": false,
          "priority": 80,
          "personalization": {}
        }
      ],
      "count": 1
    }
  },
  "customerId": "cust_42",
  "interactionId": "uuid",
  "recommendationId": "uuid",
  "requestId": "uuid",
  "timestamp": "2026-04-30T14:22:01.123Z"
}

placements

object

Map of placementId → { offers: [...], count: number }. As of the multi-placement fix (#158), each offers[] entry includes the same render-essential creative fields as the single-placement shape — specifically templateType and content — so consumers can render the creative without a follow-up fetch. The fields that remain single-placement-only are weight, scoreExplanation, impressionId, and appliedNegotiation (these come from stages that don’t fire in the multi-placement code path).

customerId

string

interactionId

string

recommendationId

string

Same value as interactionId.

requestId

string

Same value as interactionId and recommendationId. Multi-placement only — kept for legacy callers that key on requestId.

timestamp

string

ISO timestamp set when the response was assembled.

channelCoupling

array

Present only when the multi-placement request crossed at least one channel with couplingMode = "atomic" (or a DecisionFlow.couplingOverride = "atomic"). One entry per channel touched by the request. Shape: [{ channelId, channelName, mode, cascaded, emptyPlacements: [placementId, ...] }]. When cascaded = true, the channel’s other placements were emptied because at least one placement in the same channel had no candidates — surface this in the consumer UI to distinguish “we suppressed this because the sibling was empty” from “this placement just wasn’t configured.” Cross-channel coupling is intentionally NOT supported; different channels are independent attention surfaces.
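On the consumer side, the channelCoupling array can be used to label an empty placement. The sketch below assumes that emptyPlacements lists every placement of a cascaded channel that ended up empty, which the shape above suggests but does not state outright; the function name and the two reason labels are hypothetical.

```typescript
interface ChannelCoupling {
  channelId: string;
  channelName: string;
  mode: string;
  cascaded: boolean;
  emptyPlacements: string[];
}

type EmptyReason = "cascade_suppressed" | "no_candidates";

// Explain why a placement came back with zero offers.
function explainEmptyPlacement(
  placementId: string,
  coupling: ChannelCoupling[] | undefined, // field is absent when no atomic channel was touched
): EmptyReason {
  for (const entry of coupling ?? []) {
    if (entry.cascaded && entry.emptyPlacements.indexOf(placementId) !== -1) {
      return "cascade_suppressed"; // emptied because a sibling placement was empty
    }
  }
  return "no_candidates";
}
```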

How the multi-placement engine runs the flow

When placements: [...] is provided AND every placement resolves to the same decision flow (via resolveFlowRoute → channel+placement → channel-only → tenant default), the recommend route runs executeDecisionFlow ONCE with placementFilters: [<all requested placement ids>]. The match_creatives node keeps candidates whose placementId is in the requested set (plus wildcards when allowWildcard is true). The group node’s allocator (Hungarian or Greedy) sees the full set of slots across all placements in a single cost matrix, so Hungarian enforces per-offer uniqueness across placements within the channel. Each surviving candidate is stamped with its assigned placementId, and the route splits the flat result list back into per-placement buckets in the response. When placements resolve to different flows, the route falls back to per-placement execution (one executeDecisionFlow call per placement, each scoped to a single placement). In that fallback, Hungarian uniqueness only holds within a single placement; the channel-coupling pass still runs at the route boundary across the aggregated placementResults.
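The final step of the single-execution path, splitting the flat result list back into per-placement buckets, can be sketched like this. The PlacedOffer shape is a hypothetical reduction of the real candidate, the input is assumed to arrive already sorted by score, and assigning rank per placement is an assumption about how the route numbers the offers.

```typescript
interface PlacedOffer {
  offerId: string;
  placementId: string; // stamped by the allocator
  score: number;
}

interface PlacementBucket {
  offers: Array<PlacedOffer & { rank: number }>;
  count: number;
}

function splitByPlacement(
  requested: string[],
  results: PlacedOffer[], // assumed sorted by score, descending
): Record<string, PlacementBucket> {
  const out: Record<string, PlacementBucket> = {};
  // Every requested placement gets a bucket, even if it stays empty.
  for (const id of requested) out[id] = { offers: [], count: 0 };
  for (const r of results) {
    const bucket = out[r.placementId];
    if (bucket === undefined) continue; // not one of the requested placements
    bucket.offers.push({ ...r, rank: bucket.offers.length + 1 });
    bucket.count = bucket.offers.length;
  }
  return out;
}
```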

Response (engine-emitted grouped)

Returned when the executed flow’s last node is a Group node that emits placements (V2 grouped response).

{
  "interactionId": "uuid",
  "recommendationId": "uuid",
  "customerId": "cust_42",
  "timestamp": "2026-04-30T14:22:01.123Z",
  "placements": { "hero_banner": { "offers": [], "count": 0 } },
  "meta": {
    "decisionFlowKey": "main",
    "decisionFlowVersion": 7,
    "experimentVariant": null,
    "totalCandidates": 12,
    "degradedScoring": false
  }
}

The engine-emitted grouped response carries a meta block; the request-driven multi-placement response (when the caller supplies placements: [...]) does not.

GET endpoint

GET /api/v1/recommend accepts a subset of the POST body fields as query-string parameters: customerId, channel, placement, limit, decisionFlowKey, explain, debug. The GET response omits sessionId, locale, and currency because GET has no request body to carry them. When more than one published or active flow exists and no decisionFlowKey query parameter is set, GET returns 400 with the message "decisionFlowKey is required. Multiple active flows exist: …"; POST auto-selects in the same situation.
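A small helper for assembling the GET URL from the supported parameters might look like the sketch below. The helper name is hypothetical, and serializing the boolean flags as the strings "true" and "false" is an assumption about what the route parses.

```typescript
interface RecommendGetParams {
  customerId?: string;
  channel?: string;
  placement?: string;
  limit?: number;
  decisionFlowKey?: string;
  explain?: boolean;
  debug?: boolean;
}

function buildRecommendGetUrl(baseUrl: string, params: RecommendGetParams): string {
  // Only the parameters the GET endpoint is documented to accept.
  const order: Array<keyof RecommendGetParams> = [
    "customerId", "channel", "placement", "limit", "decisionFlowKey", "explain", "debug",
  ];
  const pairs: string[] = [];
  for (const key of order) {
    const value = params[key];
    if (value !== undefined) {
      pairs.push(`${key}=${encodeURIComponent(String(value))}`);
    }
  }
  const query = pairs.length > 0 ? `?${pairs.join("&")}` : "";
  return `${baseUrl}/api/v1/recommend${query}`;
}
```

Passing decisionFlowKey explicitly avoids the 400 that GET returns when several published or active flows exist.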

Status codes

Code	When
200	Successful recommendation
400	Missing required body fields, invalid JSON, or unresolvable flow
400	GET only — multiple active flows exist and no `decisionFlowKey` query param
401	Missing tenant context
403	Invalid tenant identifier
429	Rate limit exceeded OR playground 5,000-decision quota exhausted
500	Unexpected server error
504	Request exceeded the 30s timeout

Apart from the 429 responses described below, the route emits three different error envelope shapes:

Standard API-error envelope — used by 400 and 500: { error: { code, message, status, traceId, timestamp } }.
Tenant-error envelope — used by 401 and 403: { title, detail }.
Timeout envelope — { error: { code: "TIMEOUT", message: "Request timed out", status: 504 } }. Omits traceId and timestamp.

The 429 envelope from the playground quota path is { error: { code, message, used, limit } }; the rate-limiter’s 429 envelope is also distinct.
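Because the envelopes differ, a client may want to fold them into one shape before handling errors. The sketch below covers the envelopes listed above; NormalizedError and the helper names are hypothetical, using the tenant envelope's title as the error code is an assumption, and the rate limiter's 429 body, whose shape is not documented here, falls through to the generic fallback unless it happens to nest under error.

```typescript
interface NormalizedError {
  status: number;
  code: string;
  message: string;
  traceId?: string | undefined;
}

function str(v: unknown): string | undefined {
  return typeof v === "string" ? v : undefined;
}

function asRecord(v: unknown): Record<string, unknown> | undefined {
  return typeof v === "object" && v !== null ? (v as Record<string, unknown>) : undefined;
}

function normalizeError(status: number, body: unknown): NormalizedError {
  const root = asRecord(body);
  const err = root ? asRecord(root["error"]) : undefined;
  if (err) {
    // Standard, timeout, and playground-quota envelopes all nest under `error`.
    return {
      status,
      code: str(err["code"]) ?? "UNKNOWN",
      message: str(err["message"]) ?? "",
      traceId: str(err["traceId"]), // absent on the timeout envelope
    };
  }
  if (root && (str(root["title"]) !== undefined || str(root["detail"]) !== undefined)) {
    // Tenant-error envelope (401/403): { title, detail }.
    return {
      status,
      code: str(root["title"]) ?? "TENANT_ERROR",
      message: str(root["detail"]) ?? "",
    };
  }
  return { status, code: "UNKNOWN", message: "" };
}
```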

Required headers

Header	Required	Purpose
`Content-Type`	POST only	`application/json` for POST bodies
`X-API-Key`	Yes (one of the two)	API key (`krn_…`) — also used as the rate-limit identifier
`X-Tenant-Id`	Yes (one of the two)	Direct tenant id; ignored when `X-API-Key` resolves a tenant
`X-Forwarded-For`	No	Fallback input for anonymous-id derivation and for the rate-limit identifier
`User-Agent`	No	Used in the FNV-1a hash for anonymous customers
`x-user-id`	No	Triggers onboarding-step tracking only

Authorization: Bearer … is not a supported authentication mode on this route. The middleware reads the Authorization header only to gate CSRF; tenant resolution only verifies X-API-Key (when prefixed with krn_) and X-Tenant-Id.

Configuration

Environment variables

Variable	Effect
`NODE_ENV=test`	Disables rate limiting in the test runner

The route does not read any other environment variables directly. Tenant-level behavior is configured through tenant.settings and tenantSettings.aiAnalyzerSettings.

Caches

Cache key	TTL	What it caches
`route:{tenantId}:{channelId}:{placement}`	120s	Flow-route resolution by channel + placement
`flowkey:{tenantId}:{flowId}`	120s	Flow id → key lookup
`flow:{tenantId}:{flowKey}`	60s	Compiled decision-flow object
`nba-enabled:{tenantId}`	60s	Tenant kill-switch check
`control-group-pct:{tenantId}`	60s	Control-group percentage
`shap-enabled:{tenantId}`	60s	Whether to compute SHAP in the hot path

Rate limits

Tenant type	Per-window	Window	Lifetime decision quota
Playground (`tenant.isPlayground = true`)	100	60s	5,000
Non-playground	1,000	60s	None

Rate-limit identifier preference: X-API-Key > X-Forwarded-For > "anonymous".
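The identifier preference and the per-tenant limits from the table can be captured in two small helpers. This sketch only selects the identifier and the limits; it does not implement the limiter itself, whose windowing algorithm is not described here. Header names are assumed to be lowercased, empty header values are assumed to count as absent, and the helper names are hypothetical.

```typescript
type HeaderBag = Record<string, string | undefined>;

// Preference: X-API-Key > X-Forwarded-For > "anonymous".
function rateLimitIdentifier(headers: HeaderBag): string {
  return headers["x-api-key"] || headers["x-forwarded-for"] || "anonymous";
}

interface TenantLimits {
  perWindow: number;
  windowSeconds: number;
  lifetimeQuota: number | null; // null means no lifetime decision quota
}

function limitsFor(isPlayground: boolean): TenantLimits {
  return isPlayground
    ? { perWindow: 100, windowSeconds: 60, lifetimeQuota: 5000 }
    : { perWindow: 1000, windowSeconds: 60, lifetimeQuota: null };
}
```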

Request timeout

The POST handler is wrapped in a 30-second request timeout. The GET handler is not wrapped. On timeout the response is 504 { error: { code: "TIMEOUT", message: "Request timed out", status: 504 } }.

Honest limits

The auto-resolve fallthrough response returns a meta block that omits afterSuppression; the explicit-key path includes all five counters. Tracked as a code-side cleanup.
Auto-impression and recommendation writes are up to 2N round-trips per request — one raw insert per decision in the impression loop, plus one per decision in the recommendation loop — because the interaction-history table is partitioned and a batch INSERT cannot use ON CONFLICT (id) against a composite primary key. A request with limit=50 produces up to 100 sequential SQL round-trips.
The POST body is not validated against a single Zod schema. Field validation is per-read in the handler. The batch endpoint at /api/v1/recommend/batch does use a single Zod schema.
Authorization: Bearer … is not a supported auth mode. The middleware reads the Authorization header only to gate CSRF; tenant resolution only verifies X-API-Key and X-Tenant-Id.
The 504 timeout envelope shape diverges from the standard API-error envelope — it omits traceId and timestamp. The 401/403 envelope uses { title, detail } and is also distinct.
Bandit arm-index threading fires only when the tenant has EXP3-IX enabled AND configured banditConfig.arms in tenantSettings.aiAnalyzerSettings.ranking. Without arms it is a structured no-op (no banditArmIndex in the response). See EXP3-IX Ranking for arm configuration.

Related

Respond API — record the outcome of a recommendation.
Decision Flows — the engine that backs this route.
Decisioning Gates — qualification + contact-policy stages.
Ranking Profiles — multi-objective scoring weights.