KaireonAI reads ~169 environment variables across the platform API process, the background worker, the MCP server, the license-generator script, and the Python ml-worker. This page is the canonical index: every variable below has been audited against the running code.
What it documents
Each variable below has a name, the subsystem that reads it, a default value (the literal fallback in the code, or `—` when there is none), and a one-line purpose. Variables are grouped by domain so an operator can configure a single subsystem without scrolling the whole page. Subsystem deep-dives live on /self-host/configure/configuration-reference and /self-host/deploy/helm-reference; this page indexes everything in one place and cross-links to the deep-dive where one exists.
Quick start
The minimum to boot the platform locally:
`DATABASE_URL` is the only variable required outside production; the other five become required when `NODE_ENV=production`.
Production checklist (in addition to the six above):
- `REDIS_URL`: required for caching, rate limiting, and the background worker queue.
- `CORS_ALLOWED_ORIGINS`: must be a non-empty, non-wildcard list in production or env validation logs a warning.
- `EVENT_PUBLISHER` + the matching backend group (`KAFKA_*`, `MSK_*`, `EVENTBRIDGE_*`, `KINESIS_*`): defaults to `redis`.
- `INTERACTION_STORE` + the matching backend group (`SCYLLA_*`, `DYNAMODB_*`, `KEYSPACES_*`): defaults to `pg`.
- `SEARCH_INDEX` + `OPENSEARCH_*` when running OpenSearch: defaults to `pg`.
- `OTEL_EXPORTER_OTLP_ENDPOINT`: when shipping traces.
- A SIEM block (`SIEM_BACKEND` + `SIEM_ENDPOINT` + optional `SIEM_API_KEY` / `SIEM_SOURCETYPE` / `SIEM_INDEX`): when shipping audit logs to Splunk, Datadog, or Elastic.
How it works
Env validation runs at process startup (eagerly invoked from the database layer). Two passes:
- Presence: the validator iterates the required-vars list. Production-required keys missing → throw. Development-required keys missing → warn-only.
- Format: the validator checks that `DATABASE_URL` starts with `postgresql://`, `REDIS_URL` starts with `redis://` or `rediss://`, the listen port is in the legal range `[1, 65535]`, `CORS_ALLOWED_ORIGINS` is non-empty and non-wildcard in production, and any optional-boolean vars are literally the string `"true"` or `"false"`.

The Next.js build phase (`NEXT_PHASE === "phase-production-build"`) skips validation entirely so `next build` does not need production secrets.
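
The two validation passes can be sketched roughly as follows. This is a minimal illustration, not the platform's validator: the required-key lists are abbreviated, `PORT` is an assumed name for the listen-port variable, and reporting format problems as non-fatal is an assumption of the sketch.

```typescript
type Env = Record<string, string | undefined>;

interface Issue {
  pass: "presence" | "format";
  key: string;
  fatal: boolean; // true = startup would throw, false = warn-only
}

// Illustrative lists only; the real required-vars list lives in the platform code.
const REQUIRED_EVERYWHERE = ["DATABASE_URL"];
const REQUIRED_IN_PRODUCTION = ["NEXTAUTH_SECRET", "JWT_SIGNING_SECRET", "API_KEY_PEPPER"];
const OPTIONAL_BOOLEANS = ["DEMO_MODE"];

function validateEnv(env: Env): Issue[] {
  // The build phase skips validation so `next build` needs no secrets.
  if (env.NEXT_PHASE === "phase-production-build") return [];

  const isProd = env.NODE_ENV === "production";
  const issues: Issue[] = [];

  // Pass 1: presence. Missing keys are fatal in production, warn-only otherwise.
  const required = isProd
    ? [...REQUIRED_EVERYWHERE, ...REQUIRED_IN_PRODUCTION]
    : REQUIRED_EVERYWHERE;
  for (const key of required) {
    if (!env[key]) issues.push({ pass: "presence", key, fatal: isProd });
  }

  // Pass 2: format. Recorded as non-fatal here (an assumption of this sketch).
  const bad = (key: string) => issues.push({ pass: "format", key, fatal: false });

  const db = env.DATABASE_URL;
  if (db && !db.startsWith("postgresql://")) bad("DATABASE_URL");

  const redis = env.REDIS_URL;
  if (redis && !/^rediss?:\/\//.test(redis)) bad("REDIS_URL");

  const port = env.PORT; // assumed variable name
  if (port !== undefined) {
    const n = Number(port);
    if (!Number.isInteger(n) || n < 1 || n > 65535) bad("PORT");
  }

  if (isProd) {
    const origins = (env.CORS_ALLOWED_ORIGINS ?? "")
      .split(",")
      .map((o) => o.trim())
      .filter(Boolean);
    if (origins.length === 0 || origins.includes("*")) bad("CORS_ALLOWED_ORIGINS");
  }

  for (const key of OPTIONAL_BOOLEANS) {
    const v = env[key];
    if (v !== undefined && v !== "true" && v !== "false") bad(key);
  }
  return issues;
}
```

Returning a list of issues rather than throwing inside the function keeps the sketch easy to test; a caller would throw when any issue is marked fatal and log the rest.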
Subsystem env reads happen lazily at first use:
- The platform’s dependency-injection container reads `EVENT_PUBLISHER`, `INTERACTION_STORE`, and `SEARCH_INDEX` once per process to pick the backend, then reads the backend-group vars to construct the client.
- The Postgres pool reads `PG_POOL_MAX` once at module load.
- Per-request env reads (such as `NEGOTIATION_GLOBAL_KILL` and `ALLOW_UNSIGNED_WEBHOOKS`) hit `process.env` on every call. Operators changing those vars need a process restart only when the read is module-scoped, not per-request.
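
The difference between a once-per-process read and a per-request read can be illustrated with a small sketch. The function names here are hypothetical and the env is passed in as a plain object; only the variable names and defaults come from this page.

```typescript
type Env = Record<string, string | undefined>;

interface BackendSelection {
  eventPublisher: string;
  interactionStore: string;
  searchIndex: string;
}

// Once-per-process: the selection is resolved on first use and then cached,
// so later changes to the env are not observed.
function createBackendSelection(env: Env): () => BackendSelection {
  let cached: BackendSelection | undefined;
  return () => {
    if (!cached) {
      cached = {
        eventPublisher: env.EVENT_PUBLISHER ?? "redis",
        interactionStore: env.INTERACTION_STORE ?? "pg",
        searchIndex: env.SEARCH_INDEX ?? "pg",
      };
    }
    return cached;
  };
}

// Per-request: the value is looked up on every call.
function isNegotiationKilled(env: Env): boolean {
  return env.NEGOTIATION_GLOBAL_KILL === "true";
}
```

In this sketch a change to `EVENT_PUBLISHER` after the first call is ignored until the selector is recreated, which mirrors why module-scoped reads need a process restart.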
NODE_ENV | What changes |
|---|---|
production | Required keys throw on missing. CORS wildcard rejected. CSP unsafe-eval removed. Encryption fallback key disabled. |
development | Required keys warn-only. CSP allows unsafe-eval for React stack traces. Deterministic encryption fallback key allowed. |
test | Same as development plus rate limiter is disabled and RLS_AUTO_ENABLE is force-skipped. |
Reference
Database and migrations
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
DATABASE_URL | — | Database layer; also the cleanup cron job | PostgreSQL connection string. Required in all tiers. |
PG_POOL_MAX | 50 | Database layer | Postgres pool max connection count. |
RLS_AUTO_ENABLE | enabled (skipped only when "false") | Database layer | Auto-enable Postgres Row-Level Security on all tenant tables at boot. Set to "false" to skip; never disable in production. |
Auth and MFA
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
NEXTAUTH_URL | https://playground.kaireonai.com | Auth verify endpoint and outbound email composer | Public canonical URL for NextAuth callbacks and outbound email links. |
NEXTAUTH_SECRET | — | Edge middleware (session cookie verification) | NextAuth.js session-cookie signing secret. Required in production. |
JWT_SIGNING_SECRET | — | OAuth2 token issuer | HMAC key for OAuth2 JWT issue + verify. Required in production. Falls back to NEXTAUTH_SECRET when unset. |
API_KEY_PEPPER | — | OAuth2 client-secret hasher | HMAC pepper for client-secret hashing. Required in production. |
AUTH_URL | http://localhost:3000 | Auto-seed dispatcher | Auth.js base URL used by the auto-seed dispatcher when NEXTAUTH_URL is unset. |
GOOGLE_CLIENT_ID | — | NextAuth provider config | Google OAuth client id (sign-in via Google). |
GOOGLE_CLIENT_SECRET | — | NextAuth provider config | Companion to GOOGLE_CLIENT_ID. |
MFA_ENFORCEMENT_DISABLED | — | Edge middleware | Test-only flag. Disables MFA gate so integration tests can write without TOTP. Never set in production. |
CONNECTOR_ENCRYPTION_KEY | — (required in prod) | Platform settings + encryption layer | AES-256 raw key for encrypting connector secrets at rest. SHA-256 derived to a 32-byte AES key. |
CONNECTOR_ENCRYPTION_KEY_VERSION | 1 | Encryption layer | Versioned-secret rotation pointer for the active key. |
CONNECTOR_ENCRYPTION_KEY_PREVIOUS | — | Encryption layer | Previous key, used to decrypt rows still tagged with the prior version. |
CONNECTOR_ENCRYPTION_KEY_PREVIOUS_VERSION | 0 | Encryption layer | Version number paired with CONNECTOR_ENCRYPTION_KEY_PREVIOUS. |
WEBHOOK_SIGNING_SECRET | — (required in prod) | Journey callback receiver and webhook delivery | HMAC-SHA256 secret for outbound + inbound webhook signature verification. Also gates the delivery webhook endpoint. |
ALLOW_UNSIGNED_WEBHOOKS | — | Webhook delivery endpoint | Dev-only override. When "true" and NODE_ENV !== "production", accepts unsigned webhooks. Hard-blocked in production. |
Network, CORS, and CSP
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
NODE_ENV | — | Edge middleware | Standard Node runtime tier (production / development / test). Drives env-validation strictness, CSP, encryption fallback. |
NEXT_PHASE | — | Env validator | Next.js framework env. phase-production-build skips env-validation so next build does not need secrets. |
NEXT_RUNTIME | — | Instrumentation bootstrap | Next.js framework env. Differentiates Edge runtime vs Node runtime when wiring instrumentation. |
NEXT_PUBLIC_APP_URL | http://localhost:3000 | Studio AI tools | Public app URL exposed to the browser bundle. Used by Studio AI tools for internal fetch base. |
NEXT_PUBLIC_BASE_URL | http://localhost:3000 | Data AI tools | Companion to NEXT_PUBLIC_APP_URL used by Data AI tools. |
CORS_ALLOWED_ORIGINS | — (deny all cross-origin) | Edge middleware | Comma-separated allowlist of cross-origin domains. Wildcard * rejected in production. |
CSP_DISABLED | — | Edge middleware | Set to "true" to skip CSP header (dev only). |
CSP_POLICY | computed default | Edge middleware | Override the default Content Security Policy string. |
INTERNAL_SERVICE_SECRET | — | Edge middleware | Shared secret for service-to-service requests inside the cluster. |
INTERNAL_API_URL | — | Seed executor | Base URL for service-to-service API calls from the worker back into the platform. Falls back to NEXTAUTH_URL, then http://localhost:3000. |
API_KEY | — | Edge middleware | Outbound system API key used by internal jobs that talk to themselves over HTTP. |
HTTP listen port for the platform process. Defaults to Next.js’ `3000` when unset. Validated to be an integer in the legal port range `[1, 65535]`.
Redis, caching, and rate limits
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
REDIS_URL | — (cache disabled) | Env validator and the background worker (default redis://localhost:6379 on the worker side) | Redis/Valkey connection string. Required for the background worker; optional for the platform (cache falls back to in-memory). Format: rediss://default:<password>@<host>:6379 (TLS) or redis://... (plain). For Upstash, use the standard Redis URL form, not the REST URL. |
RATE_LIMIT_TIER | — | Rate-limiter | Per-tenant rate-limit tier override. Used by the unified sliding-window limiter. |
SLOW_API_THRESHOLD_MS | 150 | API instrumentation | Per-route warn threshold in milliseconds. Routes exceeding this log a slow_api warn. |
Upstash REST API base URL (`https://<id>.upstash.io`). Distinct from `REDIS_URL`, which is the standard Redis-protocol connection. Set when you provision an Upstash database; not currently read by platform code (reserved for future REST-API features), but safe to leave configured.
Upstash REST API token. Companion to the REST URL; not currently read by platform code.
Upstash quota awareness: the free tier is 500K Redis commands/month. With `WORKER_INPROCESS=1` and idle queues, five BullMQ workers polling Upstash burn ~1.3M ops/month with zero useful work. See the worker-mode-and-cron-drain runbook for the cost math and the `WORKER_INPROCESS=0` + cron-driven drain pattern that fits inside the free tier.
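
As a rough back-of-the-envelope check on those two figures, the arithmetic below derives the implied idle polling rate and the per-run command budget of a 5-minute drain schedule. It assumes a 30-day month and uses only the numbers quoted above; the runbook's own cost math may be more detailed.

```typescript
const SECONDS_PER_MONTH = 30 * 24 * 60 * 60; // 30-day month (assumption)

const FREE_TIER_COMMANDS = 500_000; // free-tier quota quoted above
const IDLE_BURN_COMMANDS = 1_300_000; // five idle in-process workers, quoted above
const WORKERS = 5;

// Implied idle polling rate behind the ~1.3M figure.
const totalPerSecond = IDLE_BURN_COMMANDS / SECONDS_PER_MONTH;
const perWorkerPerSecond = totalPerSecond / WORKERS;

// How far over quota idle polling lands.
const overQuotaFactor = IDLE_BURN_COMMANDS / FREE_TIER_COMMANDS;

// Cron-driven drain every 5 minutes: number of runs and the average
// command budget each run can spend while staying inside the free tier.
const drainsPerMonth = SECONDS_PER_MONTH / (5 * 60);
const commandBudgetPerDrain = FREE_TIER_COMMANDS / drainsPerMonth;

console.log({
  totalPerSecond,
  perWorkerPerSecond,
  overQuotaFactor,
  drainsPerMonth,
  commandBudgetPerDrain,
});
```

Under these assumptions idle polling works out to roughly one command every ten seconds per worker, which is about 2.6 times the free-tier quota, while a 5-minute drain schedule leaves an average budget of a little under 58 commands per run.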
Event publisher (Kafka / MSK / EventBridge / Kinesis / Redpanda)
EVENT_PUBLISHER selects the backend; only the matching group is read.
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
EVENT_PUBLISHER | redis | Dependency-injection container | Backend selector — redis / kafka / redpanda / msk / eventbridge / kinesis. |
KAFKA_BROKERS | — | Kafka publisher backend | Comma-separated host:port list. |
KAFKA_CLIENT_ID | — | Kafka publisher backend | KafkaJS client id. |
KAFKA_TLS_ENABLED | — | Kafka publisher backend | "true" to enable TLS. |
KAFKA_SASL_MECHANISM | — | Kafka publisher backend | plain / scram-sha-256 / scram-sha-512. |
KAFKA_SASL_USERNAME | — | Kafka publisher backend | SASL username. |
KAFKA_SASL_PASSWORD | — | Kafka publisher backend | SASL password. |
KAFKA_CONSUMER_GROUP_ID | — | Kafka publisher backend | Consumer group id (for inbound consumer wiring). |
MSK_BROKERS | — | MSK publisher backend | AWS MSK bootstrap broker list. |
MSK_REGION | us-east-1 | MSK publisher backend | AWS region. |
MSK_AUTH_MODE | iam_role | MSK publisher backend | iam_role / sasl. |
MSK_ROLE_ARN | — | MSK publisher backend | IAM role ARN when MSK_AUTH_MODE=iam_role. |
MSK_SASL_USERNAME | — | MSK publisher backend | SASL username when MSK_AUTH_MODE=sasl. |
MSK_SASL_PASSWORD | — | MSK publisher backend | SASL password when MSK_AUTH_MODE=sasl. |
MSK_CONSUMER_GROUP_ID | — | MSK publisher backend | Consumer group id. |
MSK_TOPIC_PREFIX | — | MSK publisher backend | Topic name prefix. |
EVENTBRIDGE_AUTH_MODE | iam_role | EventBridge publisher backend | iam_role / access_key. |
EVENTBRIDGE_REGION | us-east-1 | EventBridge publisher backend | AWS region. |
EVENTBRIDGE_ROLE_ARN | — | EventBridge publisher backend | IAM role ARN. |
EVENTBRIDGE_ACCESS_KEY_ID | — | EventBridge publisher backend | Static credentials when auth_mode=access_key. |
EVENTBRIDGE_SECRET_ACCESS_KEY | — | EventBridge publisher backend | Companion to access key id. |
EVENTBRIDGE_BUS_NAME | — | EventBridge publisher backend | Target event bus name. |
EVENTBRIDGE_DETAIL_TYPE_PREFIX | — | EventBridge publisher backend | Prepended to every emitted detail-type. |
KINESIS_AUTH_MODE | iam_role | Kinesis publisher backend | iam_role / access_key. |
KINESIS_REGION | us-east-1 | Kinesis publisher backend | AWS region. |
KINESIS_ROLE_ARN | — | Kinesis publisher backend | IAM role ARN. |
KINESIS_ACCESS_KEY_ID | — | Kinesis publisher backend | Static credentials when auth_mode=access_key. |
KINESIS_SECRET_ACCESS_KEY | — | Kinesis publisher backend | Companion to access key id. |
KINESIS_STREAM_NAME | kaireon-events | Kinesis publisher backend | Target stream. |
KINESIS_PARTITION_KEY | — | Kinesis publisher backend | Field name extracted from event payload as the partition key. |
AI providers
The platform uses generic env keys (provider, model, API key, base URL) rather than vendor-native keys. Vendor-native names like the OpenAI or Anthropic API-key environment variables are not read directly; instead, the configured key is passed into the SDK constructor from the generic env key after a Settings-UI override check. Tenant-level overrides in the Settings UI take precedence over env. The four generic AI env keys are dereferenced via a constant map at runtime. The static drift checker can’t trace this indirection, so the keys are documented below as `<ParamField>` entries (the same is true for several others on this page).
Provider selector. Acceptable values: `google`, `anthropic`, `openai`, `ollama`, `lm_studio`, `bedrock`.
Model id passed to the provider SDK.
API key for the selected provider.
Override base URL: required for Ollama (default `http://127.0.0.1:11434/api`) and LM Studio.
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
ML_WORKER_URL | — | AI analyze endpoint | Python ml-worker base URL (FastAPI service hosting ONNX scoring + LightGBM training). |
ML_WORKER_API_KEY | — | ml-worker HTTP client | Bearer token for ml-worker requests. |
ML_WORKER_TIMEOUT_MS | — | ml-worker HTTP client | Request timeout for ml-worker calls. |
Scoring and ONNX
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
ONNX_BLOB_STORE_URL | — | ONNX BYO model store | ONNX BYO model store URL. Supports file:// and s3:// schemes. Unset = inline storage in the model state JSON. Throws on unrecognized schemes. |
ONNX_INLINE_BYTES_LIMIT | 10485760 (10 MB) | ONNX BYO model store | Threshold above which model bytes are offloaded to the blob store rather than inlined into the model state. |
RETRAIN_EVERY_N | 100 | Respond endpoint | Per-tenant outcome-debounce. Bayesian, Thompson, epsilon-greedy, and online-learner models recompute every Nth outcome. Set to 1 for true real-time. |
ATTRIBUTION_TIMEOUT_MS | 5000 | Respond endpoint | Timeout for the recommendation lookup that backs outcome attribution. |
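
The `RETRAIN_EVERY_N` debounce can be pictured as a per-tenant counter that fires on every Nth outcome. The class below is a hypothetical in-memory sketch of that idea, not the respond endpoint's implementation; where the platform actually keeps the counter is not described here.

```typescript
// Hypothetical in-memory debouncer: one counter per tenant.
class OutcomeDebouncer {
  private counts = new Map<string, number>();
  private everyN: number;

  constructor(everyN: number) {
    this.everyN = everyN;
  }

  // Records one outcome; returns true when this outcome should trigger a recompute.
  record(tenantId: string): boolean {
    const next = (this.counts.get(tenantId) ?? 0) + 1;
    if (next >= this.everyN) {
      this.counts.set(tenantId, 0); // reset after firing
      return true;
    }
    this.counts.set(tenantId, next);
    return false;
  }
}

// With N = 3, the third, sixth, ninth (and so on) outcome per tenant fires.
const exampleDebouncer = new OutcomeDebouncer(3);
const exampleFired = [1, 2, 3, 4].map(() => exampleDebouncer.record("tenant-a"));
console.log(exampleFired);
```

With `everyN` set to 1 every outcome fires, which matches the "true real-time" setting described in the table.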
Interaction store (Postgres / DynamoDB / Keyspaces / ScyllaDB)
INTERACTION_STORE selects the backend; only the matching group is read.
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
INTERACTION_STORE | pg | Dependency-injection container | Backend selector — pg / dynamodb / keyspaces / scylla / cassandra. |
SCYLLA_CONTACT_POINTS | localhost | ScyllaDB interaction-store backend | Comma-separated host list. |
SCYLLA_LOCAL_DATACENTER | datacenter1 | ScyllaDB interaction-store backend | Local datacenter for token-aware routing. |
SCYLLA_KEYSPACE | kaireon | ScyllaDB interaction-store backend | Target keyspace name. |
SCYLLA_USERNAME | — | ScyllaDB interaction-store backend | Auth username. |
SCYLLA_PASSWORD | — | ScyllaDB interaction-store backend | Auth password. |
SCYLLA_TLS_ENABLED | — | ScyllaDB interaction-store backend | "true" to enable TLS. |
SCYLLA_REPLICATION_FACTOR | — | ScyllaDB interaction-store backend | Keyspace replication factor (integer). |
SCYLLA_CONSISTENCY_LEVEL | — | ScyllaDB interaction-store backend | Cassandra/ScyllaDB consistency level (e.g. local-quorum, quorum, one). |
SCYLLA_POOL_SIZE | — | ScyllaDB interaction-store backend | Driver connection pool size. |
SCYLLA_REQUEST_TIMEOUT_MS | — | ScyllaDB interaction-store backend | Per-request timeout. |
DYNAMODB_TABLE_NAME | kaireon-interactions | DynamoDB interaction-store backend | Target DynamoDB table. |
DYNAMODB_AUTH_MODE | iam_role | DynamoDB interaction-store backend | iam_role / access_key. |
DYNAMODB_REGION | us-east-1 | DynamoDB interaction-store backend | AWS region. |
DYNAMODB_ROLE_ARN | — | DynamoDB interaction-store backend | IAM role ARN. |
DYNAMODB_ACCESS_KEY_ID | — | DynamoDB interaction-store backend | Static credentials. |
DYNAMODB_SECRET_ACCESS_KEY | — | DynamoDB interaction-store backend | Companion to access key id. |
KEYSPACES_KEYSPACE | kaireon | AWS Keyspaces interaction-store backend | AWS Keyspaces target keyspace. |
KEYSPACES_USERNAME | — | AWS Keyspaces interaction-store backend | Service-specific credentials username. |
KEYSPACES_PASSWORD | — | AWS Keyspaces interaction-store backend | Service-specific credentials password. |
KEYSPACES_REGION | us-east-1 | AWS Keyspaces interaction-store backend | AWS region. |
KEYSPACES_AUTH_MODE | access_key | AWS Keyspaces interaction-store backend | access_key / iam_role. |
KEYSPACES_REQUEST_TIMEOUT_MS | — | AWS Keyspaces interaction-store backend | Per-request timeout. |
Search index (Postgres / OpenSearch)
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
SEARCH_INDEX | pg | Dependency-injection container | Backend selector — pg / opensearch. |
OPENSEARCH_NODE_URL | https://localhost:9200 | OpenSearch search-index backend | OpenSearch node URL. |
OPENSEARCH_USERNAME | — | OpenSearch search-index backend | Basic auth username. |
OPENSEARCH_PASSWORD | — | OpenSearch search-index backend | Basic auth password. |
OPENSEARCH_TLS_ENABLED | enabled (skip when "false") | OpenSearch search-index backend | TLS toggle. |
OPENSEARCH_TLS_REJECT_UNAUTHORIZED | enabled (skip when "false") | OpenSearch search-index backend | Cert-validation toggle. |
OPENSEARCH_INDEX_PREFIX | kaireon- | OpenSearch search-index backend | Prepended to every index name. |
OPENSEARCH_REQUEST_TIMEOUT_MS | — | OpenSearch search-index backend | Per-request timeout. |
OPENSEARCH_MAX_RETRIES | — | OpenSearch search-index backend | SDK-level retry count. |
OPENSEARCH_AUTH_MODE | basic | OpenSearch search-index backend | basic / iam_role. |
OPENSEARCH_REGION | — | OpenSearch search-index backend | AWS region (for iam_role mode against AWS-managed OpenSearch). |
OPENSEARCH_ROLE_ARN | — | OpenSearch search-index backend | IAM role ARN. |
OPENSEARCH_ACCESS_KEY_ID | — | OpenSearch search-index backend | Static credentials. |
OPENSEARCH_SECRET_ACCESS_KEY | — | OpenSearch search-index backend | Companion to access key id. |
Storage and attachments
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
STORAGE_BACKEND | local | AI-import attachment storage factory | AI-import attachment backend — local or s3. Unrecognized values throw. |
ATTACHMENT_S3_BUCKET | — | AI-import attachment storage factory | S3 bucket for attachments when STORAGE_BACKEND=s3. Required when backend is s3. |
ATTACHMENT_STORAGE_PATH | /var/kaireon/attachments | AI-import attachment storage factory | Local filesystem base path when STORAGE_BACKEND=local. |
AWS_REGION | us-east-1 | Email + S3 SDK clients | Default AWS region for SES + S3 SDK clients. |
Email and outbound
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
SES_FROM_EMAIL | support@kaireonai.com | Email sender | From-address used by all SES-sent transactional emails. |
Governance, approvals, and license
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
APPROVAL_MAX_AGE_HOURS | 168 (7 days) | Approvals-expire cron job | Pending approvals older than this are auto-expired by the cron sweep. Falls back on invalid input rather than failing. |
NEGOTIATION_GLOBAL_KILL | — | Agent-negotiation apply endpoint | Global kill switch for the agent-negotiation apply path. Set to "true" to reject every apply across all tenants. |
`KAIREON_LICENSE_PRIVATE_KEY`: RSA private key (PEM, PKCS8) for signing customer licenses. Read by the license-generator script only, not by the running platform. When unset, the script auto-generates a fresh RSA key pair on each run rather than failing. The matching public key ships embedded in the platform for verification.
Worker, outbox, and seed
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
WORKER_CONCURRENCY | 5 | Background worker | Background-worker concurrency per process. |
WORKER_METRICS_PORT | 9091 | Background worker | Worker Prometheus exporter listen port (only bound when running the dedicated kaireon-worker container). |
WORKER_INPROCESS | 1 (legacy default, may change to 0 in a future release) | Instrumentation bootstrap and background worker | When 1, the API process also runs the background workers continuously — same Redis queues, same handlers, no extra container. Recommended =0 for free-tier Redis (Upstash 500K-ops/month limit): five idle workers polling Upstash with a blocking pop-and-push burn ~1.3M ops/month with zero queued work, blowing the quota. With =0, schedule POST /api/v1/cron/drain-queues every 5 minutes via cron-job.org / EventBridge / GitHub Actions instead — see worker-mode-and-cron-drain runbook. Set =1 only when you have paid Redis or a dedicated kaireon-worker container with WORKER_INPROCESS=0 on the API. |
WORKER_SECRET | worker-internal | Seed executor | Worker→API shared secret (X-Worker-Secret header). The default is a known string — override in production. |
_SEED_FROM_WORKER | — | Seed executor | Internal flag the seed-executor sets while dispatching its internal HTTP call and deletes once the call returns. Currently has no reader — reserved for future API-side reentrancy detection. Operators should not set this. |
OUTBOX_LIVENESS_FILE | /tmp/outbox-publisher.alive | Outbox publisher worker | Liveness file the outbox publisher touches each loop. K8s liveness probe reads its mtime. |
OUTBOX_REAPER_STALENESS_SECONDS | 300 | Outbox-reaper cron job | Outbox rows older than this are reset by the reaper sweep. Falls back on invalid input. |
Outbox polling interval in milliseconds. Read by the outbox publisher worker via the positive-int env helper.
Max time (ms) to wait for in-flight publishes to drain on shutdown. Read by the outbox publisher worker via the positive-int env helper.
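
Several variables on this page fall back to their default on invalid input instead of failing. A positive-int env helper of the kind mentioned above might look like the sketch below; the name `positiveIntEnv` and its exact parsing rules are assumptions, while the defaults used in the example are the documented ones.

```typescript
// Hypothetical helper: returns `fallback` unless `raw` is a plain positive integer.
function positiveIntEnv(raw: string | undefined, fallback: number): number {
  if (raw === undefined) return fallback;
  const trimmed = raw.trim();
  if (!/^\d+$/.test(trimmed)) return fallback; // rejects "", "-5", "1.5", "10s"
  const n = Number(trimmed);
  return Number.isSafeInteger(n) && n > 0 ? n : fallback;
}

// Example with the documented defaults for two cron-related variables.
const sampleEnv: Record<string, string | undefined> = {
  APPROVAL_MAX_AGE_HOURS: "24",
  OUTBOX_REAPER_STALENESS_SECONDS: "five minutes", // invalid, so the default applies
};
const approvalMaxAgeHours = positiveIntEnv(sampleEnv.APPROVAL_MAX_AGE_HOURS, 168);
const reaperStalenessSeconds = positiveIntEnv(sampleEnv.OUTBOX_REAPER_STALENESS_SECONDS, 300);
console.log({ approvalMaxAgeHours, reaperStalenessSeconds });
```

Falling back silently keeps a typo from taking the process down, at the cost of hiding the typo; logging a warning on fallback would be a reasonable addition.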
Cron tier
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
CRON_SECRET | — | All /api/v1/cron/* endpoints | Bearer token gating shared /api/v1/cron/* endpoints (cleanup, dsar-purge, approvals-expire, drain-queues fallback, etc.). One secret covers them all when an internal scheduler is the only caller. For external schedulers (cron-job.org, EventBridge, GitHub Actions), prefer the per-endpoint DRAIN_QUEUES_TOKEN so a leak doesn’t compromise this shared secret. |
CRON_TOKEN | — | Legacy cron-tick endpoint and drain-queues fallback | Bearer token gating the legacy /api/cron/tick endpoint AND accepted as a backwards-compat fallback by drain-queues. Distinct from CRON_SECRET. |
DRAIN_QUEUES_TOKEN | — (falls back to CRON_SECRET then CRON_TOKEN) | Drain-queues endpoint | Narrow token scoped to /api/v1/cron/drain-queues only. Recommended for external schedulers (cron-job.org, EventBridge, GitHub Actions). A leak grants the attacker only this one endpoint instead of the broader /api/v1/cron/* surface. Use a 32-byte random hex value. |
DRAIN_QUEUES_RATE_LIMIT | 12 | Drain-queues endpoint | Per-IP sliding-window rate limit (requests per minute) on the drain endpoint. The default fits cron-job.org’s 5-min cadence + retries with headroom; reduce on aggressive abuse. Returns 429 with Retry-After when exceeded. |
CRON_ALLOWED_IPS | — (off — token-only auth) | Drain-queues endpoint | Comma-separated IP allowlist for the drain endpoint, matched against the left-most X-Forwarded-For entry. Off by default since cron-job.org doesn’t publish a stable IP list (worker IPs added over time, e.g. 91.99.23.109 in 2025-06, 128.140.8.200 in 2023-05). Set when you’ve pinned the current IP set. |
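
The token fallback and the optional IP allowlist for the drain endpoint can be sketched as below. Two assumptions are baked in and may not match the endpoint exactly: the first configured token in the chain is treated as the only expected value, and the token arrives as `Authorization: Bearer <token>`. All function names are hypothetical.

```typescript
import { timingSafeEqual } from "node:crypto";
import { Buffer } from "node:buffer";

type Env = Record<string, string | undefined>;

// Assumed resolution order: DRAIN_QUEUES_TOKEN, then CRON_SECRET, then CRON_TOKEN.
function expectedDrainToken(env: Env): string | undefined {
  return env.DRAIN_QUEUES_TOKEN || env.CRON_SECRET || env.CRON_TOKEN || undefined;
}

// Constant-time comparison so the check does not leak how many bytes matched.
function tokenMatches(presented: string, expected: string): boolean {
  const a = Buffer.from(presented);
  const b = Buffer.from(expected);
  return a.length === b.length && timingSafeEqual(a, b);
}

// Allowlist is off when CRON_ALLOWED_IPS is unset; otherwise the left-most
// X-Forwarded-For entry must be listed.
function ipAllowed(env: Env, xForwardedFor: string | undefined): boolean {
  const allow = (env.CRON_ALLOWED_IPS ?? "")
    .split(",")
    .map((s) => s.trim())
    .filter(Boolean);
  if (allow.length === 0) return true;
  const leftMost = (xForwardedFor ?? "").split(",")[0].trim();
  return allow.includes(leftMost);
}

function authorizeDrain(
  env: Env,
  authorization: string | undefined,
  xForwardedFor: string | undefined,
): boolean {
  const expected = expectedDrainToken(env);
  if (!expected) return false; // no token configured: reject
  const match = /^Bearer (.+)$/.exec(authorization ?? "");
  return match !== null && tokenMatches(match[1], expected) && ipAllowed(env, xForwardedFor);
}
```

Because `X-Forwarded-For` is client-supplied unless a trusted proxy overwrites it, the allowlist in this sketch is only meaningful behind such a proxy.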
Observability — tracing, metrics, logs
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
LOG_LEVEL | info | Structured logger | Log level threshold for the structured logger (silent / error / warn / info / debug). |
OTEL_EXPORTER_OTLP_ENDPOINT | — | Tracing initializer | OTLP traces+metrics exporter endpoint. Tracing disabled when unset. |
The structured error-logging helper mints an `errorId`, sanitizes the error message, attaches an optional meta block, and returns the `errorId` for the caller to surface (HTTP response body, downstream worker telemetry, audit row). Use it for any caught error a SIEM or operator may need to correlate later. Adoption is incremental: the API-route layer and the Outbox publisher worker already mint an `errorId` per failed tick; other workers still emit bare logger error calls and are tracked for migration in the engineering residuals list.
Audit + SIEM
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
SIEM_BACKEND | — | Audit SIEM sink | SIEM target — splunk / datadog / elastic. Unset = no SIEM ship-out. Unknown values warn-and-disable. |
SIEM_ENDPOINT | — | Audit SIEM sink | SIEM ingest endpoint URL. Required when SIEM_BACKEND is set. |
SIEM_API_KEY | — | Audit SIEM sink | Bearer token for SIEM ingest (Splunk HEC token, Datadog API key, etc.). |
SIEM_SOURCETYPE | — | Audit SIEM sink | Splunk-only sourcetype name. |
SIEM_INDEX | — | Audit SIEM sink | Elastic-only index name (Splunk uses sourcetype). |
Provenance, supply chain, and signing
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
GIT_SHA | — | Decision provenance endpoint | Build-time git SHA stamped into Decision Provenance Bundles. SLSA attestation skipped when missing. |
GIT_REPO | — | Decision provenance endpoint | Build-time git repo URL. |
IMAGE_NAME | — | Decision provenance endpoint | Container image name (e.g. ghcr.io/kaireonai/platform). |
IMAGE_DIGEST | — | Decision provenance endpoint | Container image digest (sha256:…). |
SBOM_DIGEST_SHA256 | — | Decision provenance endpoint | Hex digest of the published SBOM. Pinned into the SLSA statement. |
BUILDER_ID | kaireon.platform.runtime | Decision provenance endpoint | SLSA builder.id. |
BUILDER_VERSION | unknown | Decision provenance endpoint | SLSA builder.version. |
BUILD_STARTED_ON | request-time fallback | Decision provenance endpoint | SLSA metadata.buildStartedOn. |
BUILD_FINISHED_ON | request-time fallback | Decision provenance endpoint | SLSA metadata.buildFinishedOn. |
GITHUB_RUN_ID | request id fallback | Decision provenance endpoint | SLSA invocationId. |
COSIGN_BINARY | cosign | Supply-chain Cosign signer | Cosign CLI binary path. Override when not on PATH. |
`COSIGN_KEY`: Cosign-format private key bytes (the contents of `cosign.key`, not a file path). Cosign itself dereferences the bytes via the `env://` URI scheme passed on its CLI; the platform passes the env-var name through. When unset, every `/api/v1/decisions/:id/provenance` response returns `X-Provenance-Signature: unsigned`. Required for production. Install via AWS Secrets Manager (cloud) or local key file (self-host); see the Provenance signing install guide.
Passphrase paired with the cosign key. Required when the key was generated with a passphrase (the default for `cosign generate-key-pair`). Set in env so the cosign subprocess inherits it; the platform itself does not read it.
Webhooks and inbound channels
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
WHATSAPP_APP_SECRET | — | WhatsApp inbound webhook | Meta WhatsApp app secret used to verify the X-Hub-Signature-256 header on inbound. |
WHATSAPP_WEBHOOK_VERIFY_TOKEN | — | WhatsApp inbound webhook | Token returned during the Meta hub.verify_token handshake. |
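
For orientation, the two Meta-side checks these variables feed can be sketched as follows. Meta sends `X-Hub-Signature-256` as `sha256=` followed by the hex HMAC-SHA256 of the raw request body, and the verification handshake is a GET carrying `hub.mode`, `hub.verify_token`, and `hub.challenge`. The function names and return shapes are hypothetical, and this is not the platform's webhook handler.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";
import { Buffer } from "node:buffer";

// Checks the X-Hub-Signature-256 header against the raw body,
// using the value of WHATSAPP_APP_SECRET as the HMAC key.
function verifyHubSignature(
  rawBody: string,
  header: string | undefined,
  appSecret: string,
): boolean {
  if (!header || !header.startsWith("sha256=")) return false;
  const expected = createHmac("sha256", appSecret).update(rawBody, "utf8").digest();
  const presented = Buffer.from(header.slice("sha256=".length), "hex");
  // Length check first: timingSafeEqual throws on unequal lengths.
  return presented.length === expected.length && timingSafeEqual(presented, expected);
}

// Handles the GET handshake: echo hub.challenge when the token matches
// the value of WHATSAPP_WEBHOOK_VERIFY_TOKEN.
function handleVerifyHandshake(
  query: URLSearchParams,
  verifyToken: string,
): { status: number; body: string } {
  const ok =
    query.get("hub.mode") === "subscribe" &&
    query.get("hub.verify_token") === verifyToken;
  return ok
    ? { status: 200, body: query.get("hub.challenge") ?? "" }
    : { status: 403, body: "" };
}
```

The signature must be computed over the exact bytes received, so a handler needs access to the raw body before any JSON parsing.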
Multi-region and tenant routing
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
KAIREON_REGION | us-east-1 | Multi-region tenant router | Region binding for the running process (e.g. us-east-1). Used by the multi-region router to decide whether a tenant request is local or must redirect. |
MULTI_REGION_ENABLED | — | Multi-region tenant router | "true" enables multi-region routing. |
PLATFORM_OWNER_TENANT_ID | — | Platform-settings endpoint | Tenant id treated as the platform owner — gates write access to platform-wide settings. |
SINGLE_TENANT_MODE | — | Tenant resolver | Single-tenant build mode toggle (test-asserted; production semantics depend on caller). |
MCP server (CLI / SDK side)
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
KAIREON_API_URL | http://localhost:3000 | MCP server auth layer | KaireonAI API base URL the MCP tools call. |
KAIREON_API_KEY | — | MCP server auth layer | API key used by the MCP server when calling v1 routes. |
KAIREON_TENANT_ID | default | MCP server auth layer | Tenant id sent in the X-Tenant-Id header by MCP tool calls. |
MCP_ALLOW_WRITES | — | MCP server tool router | Set to "true" to enable write-mode MCP tools. Read-only by default. See /integrations/mcp for the full tool list. |
Flow streaming
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
FLOW_STREAMING_ENABLED | — | Flow streaming feature flag | Feature flag for Flow streaming consumers (Kafka / Kinesis / Pulsar). Phase 6.4 — disabled by default; enable only on self-host with a running broker. |
External AI/transform endpoints (Flow runtime)
These four endpoint names are declared in the registry the `external-model-call` transform consults at runtime. The actual `process.env` lookup happens inside the call helper, dispatched off the registry entry’s `name` field, so the static drift checker can’t trace it through the indirection. Operators must set the corresponding env var before invoking the transform; if unset, the transform throws a missing-endpoint error and the row fails fast.
External geo-resolution model URL.
External language-detection model URL.
External sentiment-scoring URL.
External embedding-vector URL.
Playground and demo
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
PLAYGROUND_MODE | — | Auto-seed dispatcher | "true" triggers Starbucks auto-seed for new tenants on registration. |
DEMO_MODE | — | Tenant status endpoint | Demo-environment toggle. Validated as a boolean by the env validator (must be the literal string "true" or "false"). |
Python ml-worker
The Python FastAPI service for ML training and ONNX scoring reads only two env vars.
| Env Var | Default | Read by | Purpose |
|---|---|---|---|
DATABASE_URL | postgresql://localhost:5432/kaireon | Python ml-worker config | Postgres connection string for the ml-worker process. Distinct from the platform’s DATABASE_URL only in that it can be overridden per-process. |
FastAPI listen port for the Python ml-worker (Python-side; not read by the TypeScript platform).
Test-only
These keys are read only by the test runner. Operators should not set them in production.
Integration-test API key. Read by the decision-flow integration test suite.
Integration-test tenant id. Read by the decision-flow integration test suite.
Base URL used by integration-test HTTP calls. Read by the sample-data end-to-end integration test.
Set by the Vitest test runner. Production code branches on its presence to skip side effects in tests.
Configuration
Loading order
Next.js picks up env files in this order, last-write-wins (per Next.js framework defaults):
- `.env`
- `.env.${NODE_ENV}` (e.g. `.env.production`)
- `.env.local` (gitignored; local overrides; not loaded when `NODE_ENV=test`)
- `.env.${NODE_ENV}.local`
- Process environment (`process.env`)

An `.env.example` template ships with the platform. Copy it to `.env.local` for local dev and to a secret manager (Kubernetes Secret, AWS SSM, etc.) for production.
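
Last-write-wins layering can be illustrated with a small merge function. This is only a model of the precedence rule, not how Next.js loads env files internally, and the layer contents are made-up examples.

```typescript
type Env = Record<string, string | undefined>;

// Layers are listed lowest precedence first; later layers overwrite earlier ones.
function mergeEnvLayers(layers: Env[]): Env {
  const merged: Env = {};
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer)) {
      if (value !== undefined) merged[key] = value;
    }
  }
  return merged;
}

// Example: a committed .env, a local override file, and the process environment.
const dotEnv: Env = { LOG_LEVEL: "info", PG_POOL_MAX: "50" };
const dotEnvLocal: Env = { LOG_LEVEL: "debug" };
const processEnv: Env = { PG_POOL_MAX: "20" };

const effective = mergeEnvLayers([dotEnv, dotEnvLocal, processEnv]);
console.log(effective);
```

In the example, `LOG_LEVEL` comes from the local override and `PG_POOL_MAX` from the process environment, because each appears later in the layer list than `.env`.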
Kubernetes — ConfigMap vs Secret
The shipped Helm chart splits non-secret runtime config from secrets:
- Non-secret runtime config (`NODE_ENV`, `LOG_LEVEL`, `EVENT_PUBLISHER`, `INTERACTION_STORE`, `SEARCH_INDEX`, `RLS_AUTO_ENABLE`, `WORKER_CONCURRENCY`, `OUTBOX_*`, `SLOW_API_THRESHOLD_MS`, `RETRAIN_EVERY_N`, `OPENSEARCH_INDEX_PREFIX`, `OPENSEARCH_TLS_ENABLED`) lives in the chart’s ConfigMap template.
- Secrets (`DATABASE_URL`, `REDIS_URL`, `NEXTAUTH_SECRET`, `JWT_SIGNING_SECRET`, `CONNECTOR_ENCRYPTION_KEY*`, `WEBHOOK_SIGNING_SECRET`, `API_KEY_PEPPER`, `WORKER_SECRET`, `INTERNAL_SERVICE_SECRET`, `CRON_SECRET`, `CRON_TOKEN`, `KAFKA_SASL_PASSWORD`, `MSK_SASL_PASSWORD`, `*_SECRET_ACCESS_KEY`, `*_PASSWORD`, `SIEM_API_KEY`, `WHATSAPP_APP_SECRET`, plus the `KAIREON_LICENSE_PRIVATE_KEY` and `COSIGN_KEY` documented above) live in the chart’s Secret template.

See /self-host/deploy/helm-reference for the full chart values map.
Rotation guidance
| Variable | Rotation strategy |
|---|---|
CONNECTOR_ENCRYPTION_KEY | Set the new key as CONNECTOR_ENCRYPTION_KEY + bump CONNECTOR_ENCRYPTION_KEY_VERSION. Keep the old key as CONNECTOR_ENCRYPTION_KEY_PREVIOUS + matching _PREVIOUS_VERSION until all rows are re-encrypted by a background sweep. |
JWT_SIGNING_SECRET | Rotate inside a maintenance window. All issued tokens become invalid the moment the new secret takes effect. |
API_KEY_PEPPER | Cannot be rotated without invalidating every stored API-key hash. Plan for a hard cutover. |
WEBHOOK_SIGNING_SECRET | Rotate by overlapping: accept either old or new for the rotation window via a transitional verifier. The verifier is currently single-key, so the overlap window is still an open item. |
NEXTAUTH_SECRET | Logs out every active session. Rotate during low traffic. |
Cloud provider keys (AWS_*, *_ACCESS_KEY_ID) | Rotate through the cloud provider’s secret manager; pod restart picks up the new value (no in-process rotation). |
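
The versioned rotation of `CONNECTOR_ENCRYPTION_KEY` amounts to choosing a key by the version tag stored on each row. The sketch below shows only that selection step plus the SHA-256 derivation mentioned in the Auth table; the function names are hypothetical, and the cipher mode and row format are deliberately left out because they are not described on this page.

```typescript
import { createHash } from "node:crypto";
import { Buffer } from "node:buffer";

type Env = Record<string, string | undefined>;

// SHA-256 over the raw key material yields a 32-byte key suitable for AES-256.
function deriveAesKey(rawKey: string): Buffer {
  return createHash("sha256").update(rawKey, "utf8").digest();
}

// Picks the key for a row tagged with `rowVersion`, using the documented
// defaults of 1 (active) and 0 (previous) when the version vars are unset.
function keyForVersion(env: Env, rowVersion: number): Buffer {
  const activeVersion = Number(env.CONNECTOR_ENCRYPTION_KEY_VERSION ?? "1");
  const previousVersion = Number(env.CONNECTOR_ENCRYPTION_KEY_PREVIOUS_VERSION ?? "0");

  if (rowVersion === activeVersion && env.CONNECTOR_ENCRYPTION_KEY) {
    return deriveAesKey(env.CONNECTOR_ENCRYPTION_KEY);
  }
  if (rowVersion === previousVersion && env.CONNECTOR_ENCRYPTION_KEY_PREVIOUS) {
    return deriveAesKey(env.CONNECTOR_ENCRYPTION_KEY_PREVIOUS);
  }
  throw new Error(`no key configured for version ${rowVersion}`);
}

// Mid-rotation example: version 2 is active, version 1 rows still decrypt.
const rotatingEnv: Env = {
  CONNECTOR_ENCRYPTION_KEY: "new-key-material",
  CONNECTOR_ENCRYPTION_KEY_VERSION: "2",
  CONNECTOR_ENCRYPTION_KEY_PREVIOUS: "old-key-material",
  CONNECTOR_ENCRYPTION_KEY_PREVIOUS_VERSION: "1",
};
console.log(keyForVersion(rotatingEnv, 1).length, keyForVersion(rotatingEnv, 2).length);
```

Once the background sweep has re-encrypted every row to the active version, the previous-key pair can be removed without any row hitting the error branch.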
Honest limits
- TS-only vs Python ml-worker: of the documented variables, only `DATABASE_URL` is read by both the TypeScript platform and the Python ml-worker. The ml-worker port (documented above as a `<ParamField>`) is Python-only. The platform talks to the ml-worker via `ML_WORKER_URL` + `ML_WORKER_API_KEY` (HTTP); the two processes do not share env vars beyond `DATABASE_URL`.
- Once-at-boot vs per-request reads: most env vars are read once during module load (DI container, database pool, AI provider config, encryption keys). A handful are read per-request: `NEGOTIATION_GLOBAL_KILL` and `ALLOW_UNSIGNED_WEBHOOKS`. For the rest, a process restart is required after changing the value.
- Deprecated / soft-deprecated: `CRON_TOKEN` is the legacy auth token for `/api/cron/tick`. New cron endpoints under `/api/v1/cron/*` use `CRON_SECRET`; both are kept until the legacy tick is removed.
- Internal-only flags (operators should not set): `_SEED_FROM_WORKER` is set by the seed-executor itself before its internal HTTP call. It currently has no reader; setting it manually would interfere with the reentrancy detection it is reserved for.
- Test-only flags in production: `MFA_ENFORCEMENT_DISABLED`, `ALLOW_UNSIGNED_WEBHOOKS`, plus the four test-only `<ParamField>` entries above (test API key, tenant id, base URL, and the Vitest runner-detection flag) must not appear in any production environment. `ALLOW_UNSIGNED_WEBHOOKS` is hard-blocked when `NODE_ENV=production`; the others are not enforced, so this relies on operator discipline.
- License private key is script-side: read only by the license-generator script, never by the running platform (see the `KAIREON_LICENSE_PRIVATE_KEY` `<ParamField>` above). The platform verifies licenses with the matching public key embedded in source. When unset, the script auto-generates a fresh RSA key pair on each run rather than failing.
- AI provider env keys are tenant-overridable: tenant-level AI settings in the Settings UI take precedence over the four AI-provider env keys documented as `<ParamField>` entries above (provider, model, API key, base URL). The env vars are only the fallback when no DB-level setting is configured.
- Environment variables not yet wired for the AI sidebar metadata: two AI-sidebar reserved keys are declared in the AI-env constant map but currently have no read in the platform. They are reserved keys; `<ParamField>` entries for both follow.
Reserved AI-sidebar feature flag. No reader in the platform yet.
Reserved AI-sidebar rate-limit knob. No reader in the platform yet.
Related
- Configuration Reference: narrative deep-dive on the most-used variables (database, cache, integrations).
- Helm Reference: chart values map and ConfigMap/Secret split.
- Security Hardening: production-only secret-rotation requirements.
- Installation Guide: step-by-step deploy with the required env-var subset.
- Infrastructure Backends: `EVENT_PUBLISHER`, `INTERACTION_STORE`, `SEARCH_INDEX` backend selection.
- ML Worker Deployment: `ML_WORKER_*` env vars.
- MCP Integration: `KAIREON_API_URL`, `KAIREON_API_KEY`, `KAIREON_TENANT_ID`, `MCP_ALLOW_WRITES`.
- EventBridge Setup: `EVENTBRIDGE_*` configuration.