Why self-host
Regulated industries — banking, healthcare, insurance, defense — often require decisioning data to stay inside an owned VPC or physical data center. KaireonAI is Apache 2.0 licensed and ships as a first-class Helm chart. There is no SaaS lock-in; the same code that runs on playground.kaireonai.com runs in your cluster.
This page is the production runbook for:
- Private VPC deployments (AWS / GCP / Azure) with no public ingress from the platform’s side
- Air-gapped clusters (no outbound egress at all)
- BYOK (bring-your-own-key) encryption and secrets
Prerequisites
| Requirement | Version / Flavor |
|---|---|
| Kubernetes | 1.27+ (EKS, GKE, AKS, OpenShift, k3s, Rancher all supported) |
| Helm | 3.11+ |
| PostgreSQL | 14+ (managed RDS / Cloud SQL / self-hosted) |
| Redis | 6.2+ (ElastiCache / Cloud Memorystore / self-hosted) |
| Container registry | Any OCI-compliant (ECR, Artifact Registry, ACR, Harbor, Nexus) |
| Ingress controller | nginx / AWS ALB / Istio / any L7 |
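As a quick pre-flight, the client tool versions can be checked from a workstation (a sketch; the Postgres and Redis CLIs only apply if you self-host those components):

```shell
# Confirm tool and server versions before installing.
helm version          # expect v3.11+
kubectl version       # server should report 1.27+
psql --version        # 14+ if self-hosting Postgres
redis-cli --version   # 6.2+ if self-hosting Redis
```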
Architecture options
Option A — Private VPC, managed add-ons
- Platform (API / worker / ml-worker): Helm chart → your K8s cluster
- Database: managed Postgres (RDS / Cloud SQL / Neon VPC)
- Cache / queue: managed Redis (ElastiCache / Memorystore / Upstash VPC)
- Secrets: AWS Secrets Manager / GCP Secret Manager / Azure Key Vault, mounted via External Secrets Operator
- Outbound traffic: optionally allowed to your LLM provider of choice; otherwise disabled
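As an illustration of the External Secrets Operator pattern in Option A, an `ExternalSecret` pulling a database URL from AWS Secrets Manager might look like this sketch (the store name, secret key, and remote path are all assumptions):

```yaml
# Hypothetical ExternalSecret; adjust names and paths to your environment.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: kaireon-db
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: kaireon-db
  data:
    - secretKey: DATABASE_URL
      remoteRef:
        key: prod/kaireon/database-url
```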
Option B — Fully air-gapped
- All three images mirrored to an internal registry
- Internal Postgres StatefulSet + internal Redis StatefulSet (the chart ships both)
- LLM explanations feature either disabled or pointed at an in-VPC LLM endpoint (vLLM, Ollama, self-hosted Claude via Bedrock PrivateLink, on-prem GPU box, etc.)
- All ML training happens in-cluster via the bundled `kaireon-ml-worker` pod
- Zero outbound egress required
Step-by-step (Option A)
1. Mirror images to your registry
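Steps 1 and 2 together might look like the following sketch (the source registry, image names, and tag are illustrative assumptions; substitute the actual values from the release notes):

```shell
# Illustrative registry paths and tag — substitute your own.
SRC=ghcr.io/kaireonai
DST=registry.internal.example.com/kaireon
TAG=v1.0.0   # hypothetical version

for img in kaireon-api kaireon-worker kaireon-ml-worker; do
  docker pull "$SRC/$img:$TAG"
  # Optional (step 2): verify the signature before it enters your registry.
  # cosign verify "$SRC/$img:$TAG" \
  #   --certificate-identity-regexp '<publisher identity>' \
  #   --certificate-oidc-issuer https://token.actions.githubusercontent.com
  docker tag  "$SRC/$img:$TAG" "$DST/$img:$TAG"
  docker push "$DST/$img:$TAG"
done
```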
2. Verify image signatures (optional but recommended)
   Every image is published with provenance; if you require signed images, verify them with cosign before pushing to your mirror.
3. Create a values override
`values-prod.yaml`:
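The exact keys depend on the chart's values schema; this is a hedged sketch using names consistent with the options mentioned on this page (verify each path against the chart's `values.yaml`):

```yaml
# Sketch only — confirm key names against the chart's values.yaml.
image:
  registry: registry.internal.example.com/kaireon   # your mirrored registry
  tag: v1.0.0                                       # hypothetical version
database:
  mode: external
  sslMode: require
redis:
  mode: external
  tls: true
secrets:
  provider: externalSecrets
ingress:
  enabled: true
  className: nginx
  host: kaireon.internal.example.com
```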
4. Apply the chart
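Assuming the chart is mirrored to your registry as an OCI artifact (the chart path and version are assumptions), the install is a standard `helm upgrade --install`:

```shell
helm upgrade --install kaireon \
  oci://registry.internal.example.com/kaireon/kaireon-platform \
  --version 1.0.0 \
  --namespace kaireon --create-namespace \
  -f values-prod.yaml \
  --wait
```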
5. Run migrations
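How migrations are invoked depends on the chart; one common pattern is exec-ing into the API pod after rollout (the command name here is hypothetical; check the chart docs for the real entrypoint):

```shell
# Hypothetical migration entrypoint — confirm the actual command in the chart docs.
kubectl -n kaireon exec deploy/kaireon-api -- npm run db:migrate
```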
Hardening checklist
Before going live, confirm each item:

- `readOnlyRootFilesystem: true` where workload permits (requires adding an `emptyDir` volume for `/app/.next/cache` on the API pod)
- `runAsNonRoot: true` on every pod (enabled by default in the chart)
- `allowPrivilegeEscalation: false` + all capabilities dropped (enabled by default)
- `topologySpreadConstraints` with `whenUnsatisfiable: DoNotSchedule` across 3+ zones
- `PodDisruptionBudget` preserves N-1 availability during drains (shipped in `templates/pdb.yaml`)
- `NetworkPolicy` blocks all pod-to-pod traffic except allow-listed flows (shipped in `templates/networkpolicy.yaml`; review for your CNI)
- Secrets in External Secrets Operator, not plain `kubernetes.io/Secret` objects (enable `secrets.provider: externalSecrets` in `values.yaml`)
- Database SSL mode `require` or higher
- Redis TLS enabled with `rediss://` and password
- Admission controller enforcing signed images (cosign + Kyverno / OPA Gatekeeper)
- Log sink configured (Fluent Bit → CloudWatch / Loki / Splunk)
- Metrics scrape configured (Prometheus ServiceMonitor — the chart ships Grafana dashboards in `helm/dashboards/`)
- Backup policy on the Postgres instance (PITR ≥ 7 days)
- `CONNECTOR_ENCRYPTION_KEY` is a 32-byte random value, rotated every 90 days, and stored in your secrets backend
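For the last item, 32 random bytes can be generated with openssl (illustrative; store the output in your secrets backend, never in a values file):

```shell
# 32 random bytes, base64-encoded for transport into the secrets backend.
openssl rand -base64 32
```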
Air-gapped (Option B) additions
- Set `database.mode: internal` and `redis.mode: internal` — the chart provisions StatefulSets with local storage
- Set `config.EVENT_PUBLISHER: redis` and `config.INTERACTION_STORE: pg` (no cloud-backed stores)
- Disable `llmExplanationsEnabled` at the tenant level, OR deploy an in-VPC LLM and configure its endpoint via the AI provider settings (Ollama / vLLM / Bedrock PrivateLink)
- Mirror the ml-worker Python dependencies to an internal PyPI proxy (the `Dockerfile` bakes them into the image, so this is only needed if you rebuild)
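Combined, an air-gapped override might start from a fragment like this (key paths follow the names used above; verify against the chart's `values.yaml`):

```yaml
database:
  mode: internal      # chart-provisioned Postgres StatefulSet
redis:
  mode: internal      # chart-provisioned Redis StatefulSet
config:
  EVENT_PUBLISHER: redis
  INTERACTION_STORE: pg
```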
Upgrades
Upgrades roll out as standard Kubernetes rolling updates. The `preStop` lifecycle hook sleeps 15 s so the load balancer has time to stop routing traffic before the pod receives SIGTERM.
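That drain behavior corresponds to a standard pod-spec fragment along these lines (the shipped template may differ in detail):

```yaml
lifecycle:
  preStop:
    exec:
      command: ["sleep", "15"]
```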
Observability
The chart ships Grafana dashboards in `helm/dashboards/`. They cover:
- API overview — request rate, p50/p95/p99 latency, 4xx/5xx split
- Decision engine — recommend latency breakdown by stage (enrich / compute / filter / score / rank)
- Decision performance — offer CTR, arbitration weight drift, experiment uplift
- Model health — AUC trend per model, drift PSI, training sample count
- Worker queues — BullMQ depth per queue, retry count, DLQ depth
- Infrastructure — CPU / memory / disk / network per pod
Provision them via `templates/grafana.yaml` (enabled by default) or by importing the JSON files directly.
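If you run the Prometheus Operator, a minimal `ServiceMonitor` could look like this sketch (the label selector, port name, and metrics path are assumptions; match your release's Service labels):

```yaml
# Hypothetical ServiceMonitor — align labels with your Helm release.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kaireon-api
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: kaireon-api
  endpoints:
    - port: http
      path: /metrics
```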
Support
- GitHub: kaireonai/platform
- Docs: docs.kaireonai.com
- For enterprise support contracts, email support@kaireonai.com.