Documentation Index

Fetch the complete documentation index at: https://docs.kaireonai.com/llms.txt

Use this file to discover all available pages before exploring further.

Where to Find Logs

Before debugging any issue, know where to look.

Local Development

When running npm run dev, all server logs appear in your terminal’s stdout/stderr. Next.js prints compilation errors, API route logs, and unhandled exceptions directly to the console.

Docker

# Follow logs in real time
docker logs -f kaireon-api

# Show last 200 lines
docker logs --tail 200 kaireon-api

# Filter for errors
docker logs kaireon-api 2>&1 | grep -i error

Production (Structured Logging)

In production, KaireonAI outputs structured JSON logs via Winston. Control the verbosity with the LOG_LEVEL environment variable:
# Options: error, warn, info, http, verbose, debug, silly
LOG_LEVEL=info   # default
Example log line:
{
  "level": "info",
  "message": "Recommend API completed",
  "recommendationId": "abc-123-def",
  "durationMs": 142,
  "offersReturned": 3,
  "timestamp": "2026-03-30T10:15:22.000Z"
}
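Because each log line is a single JSON object, production logs can be sliced with jq. A minimal sketch — the file name app.log and the 500 ms threshold are illustrative assumptions, and the canned lines stand in for real Winston output:

```shell
# Sketch: filter structured JSON logs with jq.
# app.log and the durationMs threshold are illustrative assumptions.
cat > app.log <<'EOF'
{"level":"info","message":"Recommend API completed","durationMs":142}
{"level":"error","message":"Recommend API failed","durationMs":98}
{"level":"info","message":"Recommend API completed","durationMs":731}
EOF

# Show only error-level lines
jq -c 'select(.level == "error")' app.log

# Show slow requests (over 500 ms)
jq -c 'select(.durationMs > 500)' app.log
```

In a real deployment you would pipe `docker logs kaireon-api 2>&1` into jq instead of reading a file.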

Correlating Logs with API Responses

Every API error response includes a recommendationId field. Use it to find the matching server-side log entry:
{
  "error": "Internal Server Error",
  "recommendationId": "abc-123-def"
}
# Search Docker logs by recommendationId
docker logs kaireon-api 2>&1 | grep "abc-123-def"
Every API response also includes an x-request-id header. You can use either the response body recommendationId or the header value to correlate client requests with server logs.
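The correlation step can be scripted end to end. A sketch, assuming the error response body has been saved to a file (error.json is an illustrative name) and that logs are reachable via docker logs as shown above:

```shell
# Sketch: pull recommendationId out of an error response, then grep logs for it.
# error.json is an illustrative file name standing in for a saved API response.
cat > error.json <<'EOF'
{"error":"Internal Server Error","recommendationId":"abc-123-def"}
EOF

rid=$(jq -r '.recommendationId' error.json)
echo "Searching logs for $rid"
# In a real deployment:
#   docker logs kaireon-api 2>&1 | grep "$rid"
```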

Startup Issues

These errors typically occur when starting the development server or deploying for the first time.
Cause: KaireonAI uses Prisma 7, which moved the datasource URL out of schema.prisma and into prisma.config.ts.

Fix: Open prisma/schema.prisma and ensure the datasource block only contains the provider — no url line:
datasource db {
  provider = "postgresql"
}
The connection URL is defined in prisma.config.ts and reads from the DATABASE_URL environment variable.
Cause: The Prisma client has not been generated yet. The generated client lives in platform/generated/prisma/ and is gitignored, so it must be created locally.

Fix:
npx prisma generate
This runs automatically on npm install via the postinstall script, but you may need to run it manually after schema changes.
Cause: The application cannot connect to PostgreSQL.

Fix — checklist:
  1. Verify DATABASE_URL is set in your .env file
  2. Ensure PostgreSQL is running (pg_isready or psql to test)
  3. Confirm the hostname, port, database name, and credentials in the URL are correct
  4. If using Docker, make sure the container is up and the port is mapped
# Test connectivity directly
psql "$DATABASE_URL" -c "SELECT 1"
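Steps 2 and 3 of the checklist can be combined into a quick script that pulls the host and port out of DATABASE_URL and probes them with pg_isready. A sketch, assuming a standard postgresql:// URL (the example URL is illustrative):

```shell
# Sketch: extract host and port from a postgres-style DATABASE_URL.
# The example URL below is illustrative.
DATABASE_URL="postgresql://app:secret@db.example.com:5433/kaireon"

hostport=${DATABASE_URL#*@}   # strip scheme and credentials: db.example.com:5433/kaireon
hostport=${hostport%%/*}      # strip database name: db.example.com:5433
host=${hostport%%:*}
port=${hostport##*:}
if [ "$port" = "$hostport" ]; then
  port=5432                   # URL had no explicit port; use the Postgres default
fi

echo "host=$host port=$port"
# Probe the server (requires the PostgreSQL client tools):
#   pg_isready -h "$host" -p "$port"
```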
Cause: The authentication system requires a secret key for signing session tokens.

Fix: Add NEXTAUTH_SECRET to your .env file:
# Generate a random secret
openssl rand -base64 32
NEXTAUTH_SECRET=your-generated-secret-here
Cause: Another process is already listening on port 3000.

Fix: Either kill the existing process or start on a different port:
# Find and kill the process on port 3000
lsof -ti:3000 | xargs kill -9

# Or start on a different port
PORT=3001 npm run dev
Cause: The AI assistant requires an LLM provider to be configured before it can respond.

Fix: Go to Settings > AI Configuration and configure your preferred provider:
  • Google Gemini — Free tier available, good starting point
  • OpenAI — GPT-4o and GPT-4o-mini
  • Anthropic — Claude models
  • Ollama — Self-hosted open-source models (no API key needed)
Enter your API key, select a model, and click Save. The AI chat will work immediately after configuration.
Cause: The Docker build process ran out of memory. Next.js builds and Prisma generation are memory-intensive.

Fix: Increase Docker Desktop memory allocation to at least 8 GB:
  1. Open Docker Desktop > Settings > Resources
  2. Set Memory to 8.00 GB (or higher)
  3. Click Apply & Restart
  4. Retry the build:
docker build -t kaireon-api .
If you are on a CI server, ensure the build runner has at least 8 GB of RAM available.
Cause: KaireonAI enforces CSRF protection on state-changing requests. Requests from API clients (not the browser UI) must include a specific header.

Fix: Add the X-Requested-With header to all POST, PUT, PATCH, and DELETE requests:
curl -X POST http://localhost:3000/api/v1/offers \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest" \
  -H "X-Tenant-Id: your-tenant-id" \
  -d '{"name": "Summer Sale", ...}'
This header signals that the request is intentional and not a cross-site forgery attempt.
Cause: The dataset loader creates multiple tables and inserts several thousand rows. Failures are typically caused by database connectivity issues or insufficient disk space.

Fix — checklist:
  1. Verify the database connection is working: psql "$DATABASE_URL" -c "SELECT 1"
  2. Check available disk space on the database server (the dataset requires ~50 MB)
  3. Ensure the database user has CREATE TABLE and INSERT privileges
  4. If the load partially completed, try again — the loader uses upserts and is safe to re-run
# Check disk space (Linux/macOS)
df -h
Cause: The most common reason is that no Decision Flow is published. The Recommend API only evaluates published flows.

Fix:
  1. Go to Studio > Decision Flows
  2. Open your flow and click Publish
  3. Verify the flow has active offers assigned to it
  4. Ensure the channel and placement in your API request match the flow configuration
If the flow is published but still returns 0 decisions, see the No Offers Returned checklist below for additional filter stages that may be excluding offers.

API Errors

All KaireonAI API routes return standard HTTP status codes. Here is a reference for the most common error responses.
| Status | Meaning | Common Cause | Fix |
| --- | --- | --- | --- |
| 401 | Unauthorized | Missing X-Tenant-Id header or invalid/expired session | Include the X-Tenant-Id header in every API request. Re-authenticate if the session expired. |
| 403 | Forbidden | User role lacks permission (admin/editor/viewer) | Check the user's role. Write operations require admin or editor. |
| 404 | Not Found | Resource does not exist, or belongs to a different tenant | Verify the resource ID and that X-Tenant-Id matches the owning tenant. |
| 409 | Conflict | Duplicate key violation, or rowVersion mismatch (optimistic locking) | Re-fetch the resource to get the latest rowVersion, then retry the update. |
| 429 | Too Many Requests | Rate limit exceeded | Wait for the duration in the Retry-After header before retrying. See Rate Limiting below. |
| 500 | Internal Server Error | Unhandled exception on the server | Check server logs. If reproducible, file a bug with the request payload. |

Rate Limiting

KaireonAI uses a sliding-window rate limiter. When you receive a 429 response, the Retry-After header tells you how many seconds to wait. The X-RateLimit-Remaining header shows how many requests remain in the current window.
In multi-node deployments, rate limiting uses Redis sorted sets for global accuracy. If Redis is unavailable, the limiter falls back to per-process in-memory tracking — limits may be less precise across nodes.
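A client-side backoff can honor these headers directly. A sketch of the parsing step — the canned header text stands in for headers captured from a real 429 response:

```shell
# Sketch: extract Retry-After from a 429 response's headers.
# headers.txt is canned example data; in practice you would capture it with:
#   curl -s -D headers.txt -o /dev/null http://localhost:3000/api/v1/recommend ...
cat > headers.txt <<'EOF'
HTTP/1.1 429 Too Many Requests
Retry-After: 7
X-RateLimit-Remaining: 0
EOF

retry_after=$(grep -i '^Retry-After:' headers.txt | tr -d '\r' | awk '{print $2}')
echo "Waiting ${retry_after}s before retrying"
# sleep "$retry_after"  # then retry the request
```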

Decision Flow Issues

No Offers Returned

If the Recommend API returns an empty list, work through this checklist:
Each item below is a filter in the decision pipeline. Offers must pass all of them to appear in the response.
  1. Offers are active — Check that the offer’s status is active (not draft or paused)
  2. Schedule window — Verify startDate and endDate encompass the current date
  3. Budget remaining — Check the offer has not exhausted its decision or cost budget
  4. Flow inventory — The offer must be assigned to the Decision Flow being evaluated
  5. Qualification rules passing — Review the qualification rules attached to the offer; test with the exact customer attributes being sent
  6. Contact policies not suppressing — Check that contact policy limits (frequency caps, channel fatigue) are not filtering out the offer for this customer
  7. Channel/placement match — The request’s channel and placement must match what the flow is configured for
Enable Decision Traces in tenant settings to get a step-by-step breakdown of why each offer was included or filtered. See Debugging with Decision Traces below.

Scoring Returns 0

  • Model configured? Verify a scoring model is assigned to the Decision Flow and that it has valid weights
  • Circuit breaker tripped? If the model’s error rate exceeded the threshold, the circuit breaker opens and scoring returns a fallback value. Check the Operations Dashboard for circuit breaker status
  • Model health — Check the model health dashboard for drift or degraded performance

Wrong Offers Returned

  • Flow routing — Ensure the channel and placement in the Recommend request match the intended Decision Flow
  • Qualification rule scopes — Rules scoped to the wrong category or sub-category can inadvertently include/exclude offers
  • Optimization weights — If offers are returned but in an unexpected order, review the portfolio optimization profile weights (revenue, margin, propensity, engagement)

Health Checks

KaireonAI exposes two health endpoints for monitoring and orchestration.

GET /api/health

Returns the overall system health including database connectivity, Redis status, and circuit breaker states.
curl -s http://localhost:3000/api/health | jq
Response:
{
  "status": "ok",
  "database": "ok",
  "uptime": 3621.45,
  "timestamp": "2026-03-16T14:30:00.000Z"
}
| Status | HTTP Code | Meaning |
| --- | --- | --- |
| ok | 200 | All systems healthy |
| degraded | 200 | Database is up, but Redis is down or a circuit breaker is open |
| degraded | 503 | Database is unreachable |
A degraded status with HTTP 200 means the platform can still serve requests with reduced functionality (e.g., no caching, fallback scoring). A 503 means the database is down and the platform cannot process decisions.
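For alerting, that distinction matters: HTTP 503 should page someone, while a degraded body with HTTP 200 is usually a warning. A classification sketch, assuming jq is available — the canned health.json stands in for a real curl call:

```shell
# Sketch: classify an /api/health response body for alerting.
# health.json stands in for: curl -s http://localhost:3000/api/health
cat > health.json <<'EOF'
{"status":"degraded","database":"ok","uptime":3621.45}
EOF

status=$(jq -r '.status' health.json)
case "$status" in
  ok)       echo "healthy" ;;
  degraded) echo "warning: reduced functionality (check Redis / circuit breakers)" ;;
  *)        echo "critical: unexpected status $status" ;;
esac
```

Note that the HTTP status code (not shown in the body) is what separates the two degraded cases, so a real check should also capture it, e.g. with curl's `-w '%{http_code}'`.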

GET /api/ready

A readiness probe suitable for Kubernetes or load balancer health checks. Returns 200 only when both database and cache are connected.
curl -s http://localhost:3000/api/ready | jq
Response:
{
  "status": "ready",
  "checks": {
    "database": "connected",
    "cache": "connected"
  },
  "timestamp": "2026-03-16T14:30:00.000Z"
}
| Status | HTTP Code | Meaning |
| --- | --- | --- |
| ready | 200 | All dependencies healthy |
| degraded | 503 | Database is disconnected |
Use /api/health for monitoring dashboards and alerting. Use /api/ready for Kubernetes readiness probes and load balancer target health checks.
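Deploy scripts commonly block on this endpoint until it returns 200 before routing traffic. A minimal polling sketch — the probe function wraps curl so it can be stubbed, and the URL, attempt count, and interval are illustrative:

```shell
# Sketch: wait until the readiness endpoint returns HTTP 200.
# probe() wraps curl; the URL, attempt count, and 2s interval are illustrative.
probe() {
  curl -s -o /dev/null -w '%{http_code}' http://localhost:3000/api/ready
}

wait_ready() {
  attempts=${1:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if [ "$(probe)" = "200" ]; then
      echo "ready"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "timed out waiting for readiness" >&2
  return 1
}
```

Call `wait_ready 60` after starting the container; a non-zero exit means the database or cache never came up within the window.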

Debugging with Decision Traces

Decision traces provide a forensic log of every step in the decision pipeline — which offers were considered, which filters removed them, and the final scoring/ranking.

Enabling Traces

  1. Go to Settings > Tenant Settings
  2. Enable Decision Tracing
  3. Set a sample rate (e.g., 0.1 for 10% of requests, or 1.0 for all requests during debugging)
Setting the sample rate to 1.0 in production will significantly increase database writes and storage usage. Use a low sample rate (0.01–0.1) for production monitoring, and 1.0 only during active debugging.

Reading a Trace

Each trace contains:
  • Request context — channel, placement, customer ID, attributes sent
  • Candidate set — all offers that entered the pipeline
  • Filter stages — which offers were removed at each stage (qualification, contact policy, budget, schedule) and why
  • Scoring — the raw and weighted scores for each surviving offer
  • Final ranking — the ordered list returned to the caller

Finding Why an Offer Was Filtered

  1. Go to Studio > Decision Flows and open the flow
  2. Click Recent Traces to see the latest decision trace results
  3. Search by customer ID or request ID
  4. Expand the trace and look at each filter stage — the filtered offer will show the stage name and reason (e.g., qualification_rule: min_balance >= 1000 failed)
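The same filter-stage lookup can be done outside the UI with jq if you export a trace. The trace shape below is a hypothetical illustration built from the fields described above, not the platform's exact schema:

```shell
# Sketch: list the filter stages that removed a given offer from a trace.
# trace.json and its field names are hypothetical illustrations.
cat > trace.json <<'EOF'
{
  "filterStages": [
    {"stage": "schedule", "removed": []},
    {"stage": "qualification", "removed": [
      {"offerId": "OFF-42", "reason": "min_balance >= 1000 failed"}
    ]}
  ]
}
EOF

jq -r --arg id "OFF-42" '
  .filterStages[]
  | . as $s
  | .removed[]
  | select(.offerId == $id)
  | "\($s.stage): \(.reason)"
' trace.json
```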

Database Issues

Schema Push Fails

npx prisma db push
If this fails:
  • Connection error — Verify DATABASE_URL in .env (see Startup Issues)
  • Schema conflict — If you changed a column type on an existing table that already holds data, Prisma may refuse to push. Use npx prisma db push --accept-data-loss only if you can afford to lose the data in that column
  • Permission denied — Ensure the database user has DDL privileges (CREATE TABLE, ALTER TABLE)

Migration Status Check

npx prisma migrate status
This shows pending and applied migrations. If migrations are out of sync, you may need to run:
npx prisma migrate deploy    # Apply pending migrations (production)
npx prisma migrate dev       # Create + apply migrations (development)

Connection Pool Exhaustion

Symptoms: requests hang or time out, and logs show P2024: Timed out fetching a new connection from the connection pool. The default pool is configured with a maximum of 50 connections and a 2-second connection timeout.

Fix:
  • Check for long-running queries or uncommitted transactions
  • Increase the pool size via the PG_POOL_MAX environment variable if your database supports more connections
  • Ensure the application is using the Prisma singleton (not creating new clients per request)
PG_POOL_MAX=100

Redis Issues

Rate Limiter Not Working as Expected

If rate limits are inconsistent across nodes, Redis may not be connected. The rate limiter falls back to in-memory mode when Redis is unavailable, which means each process tracks limits independently. Check Redis connectivity:
curl -s http://localhost:3000/api/ready | jq '.checks.cache'
If the response shows "unavailable" or "disconnected", verify your Redis connection configuration.

Stale Cache Data

If you see outdated data after making changes:
  • Clear the Redis cache by restarting Redis or flushing the relevant keys
  • Check that cache invalidation is wired up correctly for the entity you changed
  • As a workaround, restart the application to clear in-memory caches

Redis Connection Errors

Common causes:
  • Redis not running — Start your Redis instance
  • Wrong host/port — Verify REDIS_URL in your .env
  • Max connections exceeded — Check Redis maxclients setting
  • Network/firewall — Ensure the application can reach the Redis host
KaireonAI is designed for graceful degradation. If Redis is unavailable, the platform continues to operate — caching and distributed rate limiting fall back to in-memory alternatives. However, performance and accuracy of rate limits may be reduced.

Debug Mode

When standard logs are not enough, enable debug mode for verbose output across all subsystems.

Verbose Logging

Set LOG_LEVEL=debug to see detailed internal operations — query execution, cache hits/misses, scoring calculations, and middleware processing:
LOG_LEVEL=debug
# Docker
docker run -e LOG_LEVEL=debug kaireon-api

# Local development
LOG_LEVEL=debug npm run dev
Debug logging produces a large volume of output. Do not leave it enabled in production — it will fill your log storage quickly and may impact performance.

Decision Traces

For deep visibility into why the Recommend API returned specific results, enable decision tracing:
  1. Go to Settings > General > Decision Tracing
  2. Toggle tracing on and set the sample rate to 1.0 (100%) for debugging
  3. Make a Recommend API call
  4. Go to Studio > Decision Flows > Recent Traces to inspect the step-by-step pipeline execution
See Debugging with Decision Traces for details on reading trace output.

API Request IDs

Every API response includes an x-request-id header. Use this to correlate a specific client request with its server-side processing:
curl -v http://localhost:3000/api/v1/recommend \
  -H "Content-Type: application/json" \
  -d '{"customerId": "C001", "channel": "web"}' 2>&1 | grep x-request-id

# Output: < x-request-id: req_a1b2c3d4e5
Search your logs for this ID to find every log line related to that request.

Getting Help

If the troubleshooting steps above do not resolve your issue:

GitHub Issues

Search existing issues or file a new one. Include your error message, recommendationId, and steps to reproduce.

Documentation

Full platform documentation including API reference, deployment guides, and tutorials.
When filing an issue, include:
  • Error message — the full error text, not a summary
  • recommendationId / x-request-id — from the API response
  • Steps to reproduce — what you did before the error occurred
  • Environment — Docker or local dev, OS, Node.js version, PostgreSQL version
  • Relevant logs — the surrounding log lines (with LOG_LEVEL=debug if possible)

Operations Dashboard

Monitor pipeline metrics, DLQ, and circuit breakers

Decision Traces

Forensic tracing of decision pipeline execution

Architecture Overview

System architecture and scaling guidance