Why self-host
Regulated industries — banking, healthcare, insurance, defense — often require decisioning data to stay inside an owned VPC or physical data center. KaireonAI is Apache 2.0 licensed and ships as a first-class Helm chart. There is no SaaS lock-in; the same code that runs on playground.kaireonai.com runs in your cluster.
This page is the production runbook for:
- Private VPC deployments (AWS / GCP / Azure) with no public ingress from the platform’s side
- Air-gapped clusters (no outbound egress at all)
- BYOK (bring-your-own-key) encryption and secrets
Prerequisites
| Requirement | Version / Flavor |
|---|---|
| Kubernetes | 1.27+ (EKS, GKE, AKS, OpenShift, k3s, Rancher all supported) |
| Helm | 3.11+ |
| PostgreSQL | 14+ (managed RDS / Cloud SQL / self-hosted) |
| Redis | 6.2+ (ElastiCache / Cloud Memorystore / self-hosted) |
| Container registry | Any OCI-compliant (ECR, Artifact Registry, ACR, Harbor, Nexus) |
| Ingress controller | nginx / AWS ALB / Istio / any L7 |
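As a quick pre-flight, the client tool versions can be checked from a workstation (a sketch; the Postgres and Redis CLIs only apply if you self-host those components):

```shell
# Confirm tool and server versions before installing.
helm version          # expect v3.11+
kubectl version       # server should report 1.27+
psql --version        # 14+ if self-hosting Postgres
redis-cli --version   # 6.2+ if self-hosting Redis
```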
Architecture options
Option A — Private VPC, managed add-ons
- Platform (API / worker / ml-worker): Helm chart → your K8s cluster
- Database: managed Postgres (RDS / Cloud SQL / Neon VPC)
- Cache / queue: managed Redis (ElastiCache / Memorystore / Upstash VPC)
- Secrets: AWS Secrets Manager / GCP Secret Manager / Azure Key Vault, mounted via External Secrets Operator
- Outbound traffic: optionally allowed to your LLM provider of choice; otherwise disabled
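As an illustration of the External Secrets Operator pattern in Option A, an `ExternalSecret` pulling a database URL from AWS Secrets Manager might look like this sketch (the store name, secret key, and remote path are all assumptions):

```yaml
# Hypothetical ExternalSecret; adjust names and paths to your environment.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: kaireon-db
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: kaireon-db
  data:
    - secretKey: DATABASE_URL
      remoteRef:
        key: prod/kaireon/database-url
```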
Option B — Fully air-gapped
- All three images mirrored to an internal registry
- Internal Postgres StatefulSet + internal Redis StatefulSet (the chart ships both)
- LLM explanations feature either disabled or pointed at an in-VPC LLM endpoint (vLLM, Ollama, self-hosted Claude via Bedrock PrivateLink, on-prem GPU box, etc.)
- All ML training happens in-cluster via the bundled `kaireon-ml-worker` pod
- Zero outbound egress required
Step-by-step (Option A)
1. Mirror images to your registry
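Steps 1 and 2 together might look like the following sketch (the source registry, image names, and tag are illustrative assumptions; substitute the actual values from the release notes):

```shell
# Illustrative registry paths and tag — substitute your own.
SRC=ghcr.io/kaireonai
DST=registry.internal.example.com/kaireon
TAG=v1.0.0   # hypothetical version

for img in kaireon-api kaireon-worker kaireon-ml-worker; do
  docker pull "$SRC/$img:$TAG"
  # Optional (step 2): verify the signature before it enters your registry.
  # cosign verify "$SRC/$img:$TAG" \
  #   --certificate-identity-regexp '<publisher identity>' \
  #   --certificate-oidc-issuer https://token.actions.githubusercontent.com
  docker tag  "$SRC/$img:$TAG" "$DST/$img:$TAG"
  docker push "$DST/$img:$TAG"
done
```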
2. Verify image signatures (optional but recommended)
   Every image is published with provenance; if you require signed images, verify them with cosign before pushing to your mirror.
3. Create a values override
`values-prod.yaml`:
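The exact keys depend on the chart's values schema; this is a hedged sketch using names consistent with the options mentioned on this page (verify each path against the chart's `values.yaml`):

```yaml
# Sketch only — confirm key names against the chart's values.yaml.
image:
  registry: registry.internal.example.com/kaireon   # your mirrored registry
  tag: v1.0.0                                       # hypothetical version
database:
  mode: external
  sslMode: require
redis:
  mode: external
  tls: true
secrets:
  provider: externalSecrets
ingress:
  enabled: true
  className: nginx
  host: kaireon.internal.example.com
```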
4. Apply the chart
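Assuming the chart is mirrored to your registry as an OCI artifact (the chart path and version are assumptions), the install is a standard `helm upgrade --install`:

```shell
helm upgrade --install kaireon \
  oci://registry.internal.example.com/kaireon/kaireon-platform \
  --version 1.0.0 \
  --namespace kaireon --create-namespace \
  -f values-prod.yaml \
  --wait
```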
5. Run migrations
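How migrations are invoked depends on the chart; one common pattern is exec-ing into the API pod after rollout (the command name here is hypothetical; check the chart docs for the real entrypoint):

```shell
# Hypothetical migration entrypoint — confirm the actual command in the chart docs.
kubectl -n kaireon exec deploy/kaireon-api -- npm run db:migrate
```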
Hardening checklist
Before going live, confirm each item:

- `readOnlyRootFilesystem: true` where workload permits (requires adding an `emptyDir` volume for `/app/.next/cache` on the API pod)
- `runAsNonRoot: true` on every pod (enabled by default in the chart)
- `allowPrivilegeEscalation: false` + all capabilities dropped (enabled by default)
- `topologySpreadConstraints` with `whenUnsatisfiable: DoNotSchedule` across 3+ zones
- `PodDisruptionBudget` preserves N-1 availability during drains (shipped in `templates/pdb.yaml`)
- `NetworkPolicy` blocks all pod-to-pod traffic except allow-listed flows (shipped in `templates/networkpolicy.yaml`; review for your CNI)
- Secrets in External Secrets Operator, not plain `kubernetes.io/Secret` objects (enable `secrets.provider: externalSecrets` in `values.yaml`)
- Database SSL mode `require` or higher
- Redis TLS enabled with `rediss://` and password
- Admission controller enforcing signed images (cosign + Kyverno / OPA Gatekeeper)
- Log sink configured (Fluent Bit → CloudWatch / Loki / Splunk)
- Metrics scrape configured (Prometheus ServiceMonitor — the chart ships Grafana dashboards in `helm/dashboards/`)
- Backup policy on the Postgres instance (PITR ≥ 7 days)
- `CONNECTOR_ENCRYPTION_KEY` is a 32-byte random value, rotated every 90 days, and stored in your secrets backend
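For the last item, 32 random bytes can be generated with openssl (illustrative; store the output in your secrets backend, never in a values file):

```shell
# 32 random bytes, base64-encoded for transport into the secrets backend.
openssl rand -base64 32
```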
Air-gapped (Option B) additions
- Set `database.mode: internal` and `redis.mode: internal` — the chart provisions StatefulSets with local storage
- Set `config.EVENT_PUBLISHER: redis` and `config.INTERACTION_STORE: pg` (no cloud-backed stores)
- Disable `llmExplanationsEnabled` at the tenant level, OR deploy an in-VPC LLM and configure its endpoint via the AI provider settings (Ollama / vLLM / Bedrock PrivateLink)
- Mirror the ml-worker Python dependencies to an internal PyPI proxy (the `Dockerfile` bakes them into the image, so this is only needed if you rebuild)
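Combined, an air-gapped override might start from a fragment like this (key paths follow the names used above; verify against the chart's `values.yaml`):

```yaml
database:
  mode: internal      # chart-provisioned Postgres StatefulSet
redis:
  mode: internal      # chart-provisioned Redis StatefulSet
config:
  EVENT_PUBLISHER: redis
  INTERACTION_STORE: pg
```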
Upgrades
Upgrades roll out as standard Kubernetes rolling updates. The `preStop` lifecycle hook sleeps 15 s so the load balancer has time to stop routing traffic before the pod receives SIGTERM.
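That drain behavior corresponds to a standard pod-spec fragment along these lines (the shipped template may differ in detail):

```yaml
lifecycle:
  preStop:
    exec:
      command: ["sleep", "15"]
```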
Observability
The chart ships Grafana dashboards in `helm/dashboards/`. They cover:
- API overview — request rate, p50/p95/p99 latency, 4xx/5xx split
- Decision engine — recommend latency breakdown by stage (enrich / compute / filter / score / rank)
- Decision performance — offer CTR, arbitration weight drift, experiment uplift
- Model health — AUC trend per model, drift PSI, training sample count
- Worker queues — BullMQ depth per queue, retry count, DLQ depth
- Infrastructure — CPU / memory / disk / network per pod
Provision them via `templates/grafana.yaml` (enabled by default) or by importing the JSON files directly.
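If you run the Prometheus Operator, a minimal `ServiceMonitor` could look like this sketch (the label selector, port name, and metrics path are assumptions; match your release's Service labels):

```yaml
# Hypothetical ServiceMonitor — align labels with your Helm release.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kaireon-api
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: kaireon-api
  endpoints:
    - port: http
      path: /metrics
```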
Support
- GitHub: kaireonai/platform
- Docs: docs.kaireonai.com
- For enterprise support contracts, email support@kaireonai.com.