POST /api/v1/recommend runs the tenant’s published decision flow for one customer and returns a ranked list of offers, each tagged with an interactionId and recommendationId for downstream attribution. The route is the production hot path for next-best-action delivery.
What it does
The handler at src/app/api/v1/recommend/route.ts resolves a DecisionFlow for the (tenant, channel, placement) tuple, then hands execution to executeDecisionFlow in src/lib/decision-flow-engine.ts. The engine walks the flow’s nodes (Enrich → Qualify → Score → Rank → Compute) and returns a DecisionFlowResult declared at src/lib/decision-flow-engine.ts:69-97 that contains the ranked candidates plus a compact traceSummary.
Resolution preference is explicit > routed > auto-selected: an explicit decisionFlowKey (or legacy blueprintKey) wins (route.ts:755-770); if absent, a FlowRoute lookup runs (route.ts:939-945); if no route matches, the most recently updated published or active flow is picked (route.ts:947-955); if no flows exist, ensureBaseFlow lazy-creates a base flow on the first call (route.ts:1278-1284).
The route has two synchronous side effects per call. Every returned decision is written to interaction_history as a recommendation row so POST /api/v1/respond can look it up by recommendationId + rank (route.ts:1067-1086). Decisions whose channel does not require explicit impression tracking are also auto-recorded as impression rows in the same partitioned table (route.ts:999-1038). Both writes use raw SQL because Prisma’s createMany({ skipDuplicates }) emits ON CONFLICT (id) which is invalid against a composite primary key on a partitioned table.
Quick start
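A minimal request sketch follows. Only the path, the header names, and the field names come from this page; the host (https://example.invalid), the API key, and the channel and placement values are placeholders chosen for illustration.

```typescript
// Sketch only: base URL, key, and field values are placeholders, not real endpoints or credentials.
interface RecommendRequest {
  customerId?: string;
  channel?: string;
  placement?: string;
  limit?: number;
}

interface BuiltRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: assembles the URL and fetch options for POST /api/v1/recommend.
function buildRecommendRequest(baseUrl: string, apiKey: string, body: RecommendRequest): BuiltRequest {
  return {
    url: `${baseUrl}/api/v1/recommend`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json", "X-API-Key": apiKey },
      body: JSON.stringify(body),
    },
  };
}

const quickStart = buildRecommendRequest("https://example.invalid", "krn_placeholder", {
  customerId: "cust-123",
  channel: "web",
  placement: "hero",
  limit: 3,
});
// Send with: const res = await fetch(quickStart.url, quickStart.init);
```

The response carries interactionId and recommendationId, which are the values to pass to POST /api/v1/respond when recording the outcome.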
How it works
Authentication and quota
Every call goes through requireTenant at src/lib/tenant.ts:88. Requests carrying an X-API-Key that starts with krn_ are validated against the database and the bound tenantId is used (header X-Tenant-Id is ignored to prevent spoofing); other requests fall back to getTenantId(request). Missing tenant context returns 401 (tenant.ts:117); a tenantId that doesn’t exist in the DB returns 403 (tenant.ts:128).
After auth, the handler enforces rate limits via rateLimit in src/lib/rate-limit-unified.ts and a lifetime decision quota via enforceDecisionQuota in src/lib/licensing/middleware.ts:5. Playground tenants are capped at 5,000 lifetime decisions (src/lib/licensing/meter.ts:78); past that the route returns 429 with code PLAYGROUND_QUOTA_EXCEEDED. Non-playground tenants face no decision quota.
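The credential precedence described above can be sketched as a pure function. This is an illustration, not the code in tenant.ts: pickTenantSource and its return type are hypothetical names, the sketch assumes getTenantId reads only X-Tenant-Id, and it leaves out the database checks that produce the 403.

```typescript
type TenantSource =
  | { kind: "api-key"; apiKey: string }
  | { kind: "tenant-header"; tenantId: string }
  | { kind: "unauthenticated"; status: 401 };

// Header names are passed lowercased, as the Fetch API's Headers object reports them.
function pickTenantSource(headers: Record<string, string | undefined>): TenantSource {
  const apiKey = headers["x-api-key"];
  if (apiKey !== undefined && apiKey.startsWith("krn_")) {
    // X-Tenant-Id is ignored on this branch, mirroring the anti-spoofing rule.
    return { kind: "api-key", apiKey };
  }
  const tenantId = headers["x-tenant-id"];
  if (tenantId) {
    return { kind: "tenant-header", tenantId };
  }
  // No usable tenant context: the route answers 401.
  return { kind: "unauthenticated", status: 401 };
}
```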
Anonymous-customer derivation
When customerId is missing or set to "anonymous", the route derives a stable surrogate. With a sessionId present, the surrogate is anon-{sessionId} after validating the session id against ^[a-zA-Z0-9_-]+$ and capping at 64 chars (route.ts:734-740). Without a session id, the surrogate is anon-{8-hex} derived from an FNV-1a hash of x-forwarded-for + user-agent (route.ts:741-751). The GET handler runs the same derivation at route.ts:504-515.
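The derivation can be sketched as follows. Several details are assumptions rather than facts from route.ts: the hash is taken to be 32-bit FNV-1a (consistent with the 8-hex suffix), the two header values are concatenated with no separator, "capping" is read as truncation to 64 characters, and an invalid session id falls through to the hash path. The function names are hypothetical.

```typescript
// 32-bit FNV-1a. Hashing char codes matches byte-wise FNV-1a for ASCII input.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep unsigned 32-bit
  }
  return hash >>> 0;
}

const SESSION_ID_PATTERN = /^[a-zA-Z0-9_-]+$/;

// Hypothetical helper mirroring the documented surrogate formats.
function deriveAnonymousId(sessionId: string | undefined, forwardedFor: string, userAgent: string): string {
  if (sessionId && SESSION_ID_PATTERN.test(sessionId)) {
    return `anon-${sessionId.slice(0, 64)}`; // assumption: cap by truncation
  }
  const hex = ("00000000" + fnv1a32(forwardedFor + userAgent).toString(16)).slice(-8);
  return `anon-${hex}`;
}
```

Because the hash path depends only on the two headers, clients behind a shared proxy with identical user agents would map to the same surrogate; supplying a sessionId avoids that.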
Decision-flow resolution
Resolution order:
- Explicit decisionFlowKey (or legacy blueprintKey) in the body. If the value is not a key, the route attempts a case-insensitive name lookup and rewrites it to a key (route.ts:755-770).
- FlowRoute lookup keyed by (tenantId, channelId | channel, placement), cached for 120s under route:{tenantId}:{channelId}:{placement} (route.ts:213, 423-486).
- Most recently updated published or active flow (route.ts:947-955).
- If no flow exists, ensureBaseFlow lazy-creates a base flow on the first call (route.ts:1278-1284).
If no flow can be resolved after these steps, the route returns 400 No published decision flow found. (route.ts:1284-1286).
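The precedence can be sketched as a function over injected lookups. The FlowLookups interface, the via labels, and resolveFlowKey are hypothetical names for illustration; the real handler performs these steps inline with database and cache access.

```typescript
// Hypothetical lookups standing in for the database and cache calls in route.ts.
interface FlowLookups {
  nameToKey(name: string): string | undefined; // case-insensitive name lookup
  byRoute(): string | undefined; // FlowRoute match for the request tuple
  mostRecentlyUpdated(): string | undefined; // newest published or active flow
  ensureBaseFlow(): string | undefined; // lazy-created base flow
}

type Resolution =
  | { flowKey: string; via: "explicit" | "routed" | "auto" | "base" }
  | { status: 400; message: string };

function resolveFlowKey(explicit: string | undefined, lookups: FlowLookups): Resolution {
  if (explicit) {
    // An explicit value wins; a matching flow name is rewritten to its key.
    return { flowKey: lookups.nameToKey(explicit) ?? explicit, via: "explicit" };
  }
  const routed = lookups.byRoute();
  if (routed) return { flowKey: routed, via: "routed" };
  const recent = lookups.mostRecentlyUpdated();
  if (recent) return { flowKey: recent, via: "auto" };
  const base = lookups.ensureBaseFlow();
  if (base) return { flowKey: base, via: "base" };
  return { status: 400, message: "No published decision flow found." };
}
```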
Kill switch and control group
isNbaEnabled(tenantId) reads tenant.settings.nbaEnabled and caches the result for 60s under nba-enabled:{tenantId} (route.ts:282-305). When the flag is false, the route bypasses flow execution and returns the fallbackPriorityResponse shape at route.ts:340-393 (offers sorted by priority descending, with nbaEnabled: false and meta.fallbackMode = "priority_only").
isInControlGroup(customerId, controlGroupPercent) runs an FNV-1a hash of control:{customerId}:{YYYY-MM-DD} to deterministically bucket the customer for the day (route.ts:175-186). The percentage is read from tenant.settings.controlGroupPercent (default 2%, cached 60s under control-group-pct:{tenantId} at route.ts:191-211). Control-group decisions keep qualification and contact-policy filtering but get scores randomized via the same hash function so the rank order is independent of the model.
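A sketch of the daily bucketing, under stated assumptions: the hash is 32-bit FNV-1a, the bucket is the hash modulo 100, and the day string is the UTC date. The now parameter is a hypothetical addition for testability and is not part of the documented signature.

```typescript
// 32-bit FNV-1a over char codes (matches byte-wise FNV-1a for ASCII input).
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Deterministic per (customer, UTC day): the same customer keeps one assignment all day.
function isInControlGroup(customerId: string, controlGroupPercent: number, now: Date = new Date()): boolean {
  const day = now.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  const bucket = fnv1a32(`control:${customerId}:${day}`) % 100; // assumption: 100 buckets
  return bucket < controlGroupPercent;
}
```

Including the date in the hash input rotates membership daily, so no customer is held out permanently while the assignment stays stable within a day.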
Engine execution
executeDecisionFlow(decisionFlowKey, ctx) returns a DecisionFlowResult, declared at src/lib/decision-flow-engine.ts:69-97.
Each decision in the result is built at src/lib/pipeline-runner.ts:2282-2316. scoreExplanation is set on every decision (pipeline-runner.ts:2309-2315), and src/__tests__/integration/block7-response-validation.integration.test.ts:227-236 asserts that it is present.
Realtime EXP3-IX bandit
When tenantSettings.aiAnalyzerSettings.arbitration.exp3IxEnabled is true and arms are configured, selectBanditArmForRecommend at src/lib/arbitration/apply-online-tuning.ts:167 samples one arm before flow execution. The picked armIndex and armId thread into the auto-impression’s response JSON (route.ts:1011-1014) and into the top-level response body (route.ts:1254-1256 for the explicit-key path, route.ts:1407-1411 for the auto-resolve path). When the flag is off or no arms are configured, the call returns null and the response omits banditArmIndex. See EXP3-IX Arbitration for arm configuration.
Side effects
The auto-impression block at route.ts:999-1038 filters decisions whose channel impressionMode !== "explicit" and inserts one impression row per decision via prisma.$executeRaw. The recommendation-recording block at route.ts:1067-1086 then inserts one recommendation row per decision. Both blocks loop sequentially, so N decisions produce up to 2N round-trips against the partitioned interaction_history table. This is intentional: a single batch INSERT … ON CONFLICT (id) DO NOTHING cannot be expressed against a composite primary key on a partitioned table.
Reference
Request body
The POST body is read field-by-field at route.ts:714-776 rather than validated against a single Zod schema. Per-field shape requirements live in the destructure and the inline checks below it. The batch endpoint at /api/v1/recommend/batch does use a single Zod schema, BatchRecommendSchema at src/lib/api-validate.ts:782-790.
- Unique customer identifier. When omitted or set to "anonymous", the route derives a stable surrogate from sessionId (preferred) or x-forwarded-for + user-agent (route.ts:732-751).
- Filter candidates to creatives whose channel matches this channel type or name (route.ts:716).
- Channel ID (UUID) used for FlowRoute lookup. When supplied, takes precedence over channel for routing (route.ts:729, 800).
- Filter candidates to creatives bound to this placement (route.ts:717).
- Multi-placement request. Each entry is { placementId: string, limit?: number }. When present, the response is the multi-placement shape and the single-placement fields are omitted (route.ts:728, 791-935).
- Multi-placement only. When true, placements are resolved sequentially and each placement excludes offers already returned by earlier placements (route.ts:795, 865-890).
- Maximum decisions returned. Clamped to [1, 50] (route.ts:778).
- Session identifier. Used both for anonymous-customer derivation and for echoing back into the response and the auto-impression’s context (route.ts:719, 736-740, 1011). Validated as alphanumeric/-/_, max 64 chars.
- Free-form real-time context (device, page URL, etc.) merged into the auto-impression’s context JSON (route.ts:720, 1011).
- Customer segment ids passed through to the engine for qualification rules that match against segments (route.ts:721).
- Per-request customer attributes. Available to the Compute stage as attributes.<key> variables when evaluating computed-field formulas (route.ts:722, decision-flow-engine.ts:331-333).
- Locale code (e.g. en-US). Echoed back in the response; reserved for content selection in future stages (route.ts:723).
- Currency code (e.g. USD). Echoed back in the response (route.ts:724).
- inbound or outbound. Stored on the recommendation interaction row (route.ts:725, 1078).
- Offer IDs to exclude from candidates. Legacy alias excludeActions is also accepted (route.ts:775).
- Creative IDs to exclude. Legacy alias excludeTreatments is also accepted (route.ts:776).
- Explicit flow key (or name; a case-insensitive name lookup runs at route.ts:762-768). Legacy alias blueprintKey is also accepted (route.ts:755).
- When true, the engine attaches a debugTrace block (per-rule pass/fail reasons) to the response (route.ts:773, decision-flow-engine.ts:87-96).
- When true, the route adds an explanation object to each decision and a rejectedOffers[] block built from the debug trace (route.ts:772, 1207-1234). Implies debug: true.
Response (single-placement)
Shape returned at route.ts:1236-1266 (explicit-key path) and route.ts:1389-1420 (auto-resolve path).
- UUID minted at the start of the request (route.ts:779). Echoed in the auto-impression’s response.interactionId and the recommendation row’s response.interactionId.
- Same value as interactionId. Use either when calling POST /api/v1/respond.
- Either the supplied customerId or the derived anonymous surrogate.
- Echoed back from the request body.
- Key of the flow that ran. In the explicit-key path the value reflects the raw request input (after name-to-key translation); in the auto-resolve path it reflects the resolved key.
- Version number of the flow’s publishedVersions snapshot, or null when running from draftConfig (decision-flow-engine.ts:584-597).
- Variant name when the flow has an experiment node and the customer was assigned a variant.
- True when the customer was bucketed into the always-on control group for the current UTC day (route.ts:175-186).
- Echoed from the request body, default "inbound".
- ISO timestamp from bpResult.now, set when the engine started.
- Echoed from the channel request field, or "all" when none was supplied.
- Echoed from the placement request field, or "all" when none was supplied.
- Echoed from the request body.
- Echoed from the request body.
- Number of items in decisions[] after the per-request limit was applied.
- Ranked offers. See the per-decision sub-fields below.
- Present only when explain=true and the debug trace contains rejection reasons (route.ts:1253). Each entry is { offerId, offerName, stage: "eligibility" | "contact_policy", reason }.
- Present only when EXP3-IX is enabled for the tenant AND tenantSettings.aiAnalyzerSettings.arbitration has configured arms. Set by selectBanditArmForRecommend at src/lib/arbitration/apply-online-tuning.ts:167 and threaded into the response at route.ts:1254-1256.
- Companion to banditArmIndex. Echo this value back into interaction.response when calling /respond so the arm’s log-weight gets updated on outcome.
- Trace counters from the engine. See sub-fields below.
  - Number of offers that entered the pipeline (decision-flow-engine.ts:78).
  - Candidates remaining after qualification rules ran (decision-flow-engine.ts:79).
  - Candidates remaining after contact policies ran (decision-flow-engine.ts:81).
  - Candidates remaining after suppression rules ran (decision-flow-engine.ts:80). Present in the explicit-key path response; see Honest limits for the auto-resolve path’s omission.
  - True when at least one scorer threw an error and a fallback score was used (decision-flow-engine.ts:85).
- Present only when decorateDecisionsWithNegotiationApply ran a non-noop pass. Shape: { applied: number, rejected: number } (route.ts:1263, 1417). See Constraints and Negotiation.
- Present only when debug=true or explain=true. Shape declared at decision-flow-engine.ts:87-96.
decisions[] per-item shape
Built at src/lib/pipeline-runner.ts:2282-2316.
- 1-based rank after sorting and any control-group reshuffle.
- Final score from the scorer (or randomized in control group).
- Joined from creative.channel.name.
- Joined from creative.channel.channelType.
- Joined from creative.placement.name.
- Joined from offer.categoryRef.name with fallbacks.
- Joined from offer.subCategoryRef.name with fallbacks.
- Mirrored from offer.mandatory.
- Mirrored from the candidate’s priority (driven by offer.priority).
- Mirrored from the candidate’s weight.
- From creative.templateType.
- From creative.content.
- Per-candidate property bag from the engine.
- From creative.abTestVariant.
- From creative.constraints.
- ISO timestamp from offer.expiresAt.
- From offer.metadata.
- Set on every decision (pipeline-runner.ts:2309-2315); the integration test at src/__tests__/integration/block7-response-validation.integration.test.ts:227-236 asserts it must be present. Shape: { method, priority, weight, fitMultiplier, finalScore }.
- Free-form Record<string, any>. Keys are tenant-defined: they come from the category’s customFields of type computed, plus optional flow-level extras and per-flow overrides. Standard examples include personalized_rate or greeting, but the field set is open. The full evaluation contract lives at src/lib/decision-flow-engine.ts:310-374.
- Present only when the candidate’s channel uses implicit impression tracking and the auto-impression INSERT succeeded. Looked up via deduplicationId and threaded into the decision at route.ts:1043-1051.
- Present only when the realtime apply-mode wire ran and the candidate was accepted (route.ts:1190-1191). Shape: { sessionId, proposal }.
- Present only when the realtime apply-mode wire rejected the candidate (route.ts:1192). Shape: { sessionId, reason }.
Response (multi-placement)
Returned at route.ts:927-934 when the request body supplies a placements[] array.
- Map of placementId → { offers: [...], count: number }. Per-offer keys are a subset of the single-placement shape (no weight, templateType, content, scoreExplanation, impressionId, appliedNegotiation).
- Same value as interactionId.
- Same value as interactionId and recommendationId (route.ts:932). Multi-placement only; kept for legacy callers that key on requestId.
- ISO timestamp set when the response was assembled.
Response (engine-emitted grouped)
Returned at route.ts:1099-1113 when the executed flow’s last node is a Group node that emits placements (V2 grouped response).
This response includes a meta block; the request-driven multi-placement response (when the caller supplies placements: [...]) does not.
GET endpoint
GET /api/v1/recommend accepts a subset of POST body fields as query-string parameters: customerId, channel, placement, limit, decisionFlowKey, explain, debug. The handler is at route.ts:494-677. The GET response omits sessionId, locale, and currency because no request body carries them (route.ts:649-669).
GET returns 400 decisionFlowKey is required. Multiple active flows exist: … when more than one published/active flow exists and no decisionFlowKey query parameter is set (route.ts:565-569). POST auto-selects in the same situation (route.ts:947-955).
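A sketch of building the GET URL from the query parameters listed above. The helper name and the host are hypothetical; passing decisionFlowKey explicitly sidesteps the 400 that GET returns when several active flows exist.

```typescript
interface RecommendQuery {
  customerId?: string;
  channel?: string;
  placement?: string;
  limit?: number;
  decisionFlowKey?: string;
  explain?: boolean;
  debug?: boolean;
}

// Hypothetical helper: serializes only the parameters that were actually supplied.
function buildRecommendGetUrl(baseUrl: string, query: RecommendQuery): string {
  const pairs: string[] = [];
  (Object.keys(query) as (keyof RecommendQuery)[]).forEach((key) => {
    const value = query[key];
    if (value !== undefined) {
      pairs.push(`${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`);
    }
  });
  return `${baseUrl}/api/v1/recommend?${pairs.join("&")}`;
}

const getUrl = buildRecommendGetUrl("https://example.invalid", {
  customerId: "cust-123",
  limit: 3,
  decisionFlowKey: "flow-a",
});
```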
Status codes
| Code | When | Source |
|---|---|---|
| 200 | Successful recommendation | NextResponse.json paths in route.ts |
| 400 | Missing required body fields, invalid JSON, or unresolvable flow | badRequest from src/lib/api-error.ts:55 |
| 400 | GET only — multiple active flows exist and no decisionFlowKey query param | route.ts:565-569 |
| 401 | Missing tenant context | tenant.ts:117 |
| 403 | Invalid tenant identifier | tenant.ts:128 |
| 429 | Rate limit exceeded OR playground 5,000-decision quota exhausted | src/lib/rate-limit-unified.ts + src/lib/licensing/middleware.ts:20-30 |
| 500 | Unexpected server error | serverError from src/lib/api-error.ts:90 |
| 504 | Request exceeded the 30s timeout | src/lib/request-timeout.ts:24-27 |
Error envelopes differ by source:
- apiError envelope, used by 400 and 500: { error: { code, message, status, traceId, timestamp } }.
- tenant.ts envelope, used by 401 and 403: { title, detail } (tenant.ts:102-105, 117, 128).
- 504 envelope: { error: { code: "TIMEOUT", message: "Request timed out", status: 504 } } (request-timeout.ts:24-27). Omits traceId and timestamp.
- 429 quota envelope: { error: { code, message, used, limit } } (licensing/middleware.ts:20-30); the rate-limiter’s 429 envelope is set by rate-limit-unified.ts.
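Because the envelopes differ, a client may want to normalize them before handling. The following is a client-side sketch, not part of the route: NormalizedError and normalizeRecommendError are hypothetical names, and mapping title to code for the tenant.ts envelope is an illustrative choice.

```typescript
interface NormalizedError {
  status: number;
  code: string;
  message: string;
  traceId?: string;
}

function normalizeRecommendError(status: number, body: unknown): NormalizedError {
  const obj = (typeof body === "object" && body !== null ? body : {}) as Record<string, unknown>;
  const err = obj.error;
  if (typeof err === "object" && err !== null) {
    // apiError (400/500), quota 429, and 504 envelopes all nest under `error`.
    const e = err as Record<string, unknown>;
    return {
      status,
      code: typeof e.code === "string" ? e.code : "UNKNOWN",
      message: typeof e.message === "string" ? e.message : "",
      traceId: typeof e.traceId === "string" ? e.traceId : undefined, // absent on 504
    };
  }
  if (typeof obj.title === "string") {
    // tenant.ts envelope (401/403): { title, detail }
    return { status, code: obj.title, message: typeof obj.detail === "string" ? obj.detail : "" };
  }
  return { status, code: "UNKNOWN", message: "" };
}
```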
Required headers
| Header | Required | Read at | Purpose |
|---|---|---|---|
| Content-Type | POST only | Next.js framework | application/json for POST bodies |
| X-API-Key | Yes (one of the two) | tenant.ts:97, rate-limit-unified.ts:68 | API key (krn_…), also used as the rate-limit identifier |
| X-Tenant-Id | Yes (one of the two) | tenant.ts:113 (via getTenantId) | Direct tenant id; ignored when X-API-Key resolves a tenant |
| X-Forwarded-For | No | rate-limit-unified.ts:71, route.ts:506, 742 | Fallback input for anonymous-id derivation and the rate-limit identifier |
| User-Agent | No | route.ts:507, 743 | Used in the FNV-1a hash for anonymous customers |
| x-user-id | No | route.ts:924, 1094, 1305 | Triggers markOnboardingStep for onboarding tracking only |
Authorization: Bearer … is not a supported authentication mode on this route. The middleware reads the Authorization header only to gate CSRF; tenant.ts only verifies X-API-Key (when prefixed with krn_) and X-Tenant-Id.
Configuration
Environment variables
| Variable | Effect |
|---|---|
| NODE_ENV=test | Disables rate limiting in the test runner (rate-limit-unified.ts:101) |
The route reads no environment variables directly, as shown by grep 'process.env' on route.ts returning 0 hits. Tenant-level behavior is configured through tenant.settings and tenantSettings.aiAnalyzerSettings.
Caches
| Cache key | TTL | What it caches |
|---|---|---|
| route:{tenantId}:{channelId}:{placement} | 120s | FlowRoute resolution (route.ts:213, 429) |
| flowkey:{tenantId}:{flowId} | 120s | Flow id → key lookup (route.ts:214, 403) |
| flow:{tenantId}:{flowKey} | 60s | Compiled DecisionFlow object (decision-flow-engine.ts:569, 578) |
| nba-enabled:{tenantId} | 60s | Tenant kill-switch check (route.ts:285, 297) |
| control-group-pct:{tenantId} | 60s | Control-group percentage (route.ts:194, 205) |
| shap-enabled:{tenantId} | 60s | Whether to compute SHAP in the hot path (route.ts:318, 328) |
Rate limits
| Tenant type | Per-window | Window | Lifetime decision quota |
|---|---|---|---|
| Playground (tenant.isPlayground = true) | 100 | 60s | 5,000 (PLAYGROUND_DECISION_LIMIT at licensing/meter.ts:78) |
| Non-playground | 1,000 | 60s | None (licensing/middleware.ts:34) |
The rate-limit identifier is chosen in the order X-API-Key > X-Forwarded-For > "anonymous" (rate-limit-unified.ts:67-77).
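The identifier precedence can be sketched in one function. This is an illustration with a hypothetical name; whether rate-limit-unified.ts normalizes a multi-hop X-Forwarded-For value is not stated here, so the sketch uses the header value as received.

```typescript
// Header names are passed lowercased. The first non-empty candidate wins.
function rateLimitIdentifier(headers: Record<string, string | undefined>): string {
  return headers["x-api-key"] || headers["x-forwarded-for"] || "anonymous";
}
```

One consequence of this order is that all unauthenticated callers without a forwarded address share the single "anonymous" bucket.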
Request timeout
The POST handler is wrapped in withTimeout(handler, 30_000) at route.ts:1429. The GET handler is not wrapped (route.ts:494). On timeout the response is 504 { error: { code: "TIMEOUT", message: "Request timed out", status: 504 } }.
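One common way to build such a wrapper is to race the handler against a timer. The sketch below is a simplified stand-in, not the code in request-timeout.ts: it works on plain values rather than Next.js request and response objects, and the names timeoutBody and TimeoutBody are hypothetical.

```typescript
type TimeoutBody = { error: { code: "TIMEOUT"; message: string; status: 504 } };

// The documented 504 envelope, which carries no traceId or timestamp.
function timeoutBody(): TimeoutBody {
  return { error: { code: "TIMEOUT", message: "Request timed out", status: 504 } };
}

// Wraps an async handler so that it settles with the 504 body after `ms` milliseconds.
function withTimeout<T>(handler: () => Promise<T>, ms: number): () => Promise<T | TimeoutBody> {
  return () => {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<TimeoutBody>((resolve) => {
      timer = setTimeout(() => resolve(timeoutBody()), ms);
    });
    // Whichever settles first wins; the timer is cleared so it cannot fire later.
    return Promise.race([handler(), timeout]).finally(() => {
      if (timer !== undefined) clearTimeout(timer);
    });
  };
}
```

A race of this kind returns the timeout response to the caller but does not cancel work already started by the handler, which may still complete in the background.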
Honest limits
- The auto-resolve fallthrough response at route.ts:1389-1420 returns a meta block that omits afterSuppression (compare with the explicit-key path at route.ts:1257-1264 which includes all five counters). Tracked as a code-side cleanup.
- Auto-impression and recommendation writes are up to 2N round-trips per request (one prisma.$executeRaw per decision in the impression loop at route.ts:1022-1034, plus one per decision in the recommendation loop at route.ts:1069-1086) because interaction_history is partitioned and Prisma’s batch INSERT cannot use ON CONFLICT (id) against a composite primary key. A request with limit=50 produces up to 100 sequential SQL round-trips.
- The POST body is not validated against a single Zod schema. Field validation is per-read in the handler. The batch endpoint at /api/v1/recommend/batch does use a single Zod schema (BatchRecommendSchema at src/lib/api-validate.ts:782-790).
- Authorization: Bearer … is not a supported auth mode. The middleware reads the Authorization header only to gate CSRF; tenant.ts only verifies X-API-Key and X-Tenant-Id.
- The 504 timeout envelope shape diverges from the standard apiError envelope: it omits traceId and timestamp (request-timeout.ts:24-27). The 401/403 envelope from tenant.ts uses { title, detail } and is also distinct from the apiError envelope.
- Bandit arm-index threading fires only when the tenant has both isExp3IxEnabled(...) true AND configured banditConfig.arms in tenantSettings.aiAnalyzerSettings.arbitration. Without arms it is a structured no-op (no banditArmIndex in the response). See EXP3-IX Arbitration for arm configuration.
Related
- Respond API — record the outcome of a recommendation.
- Decision Flows — the engine that backs this route.
- Decisioning Gates — qualification + contact-policy stages.
- Arbitration Profiles — multi-objective scoring weights.