Overview
The ML Worker is a standalone Python/FastAPI service that provides scikit-learn-based analysis and LightGBM training for KaireonAI’s AI features. It handles computationally intensive tasks — K-Means clustering for segmentation, logistic regression for policy analysis, TF-IDF for content analysis, and LightGBM training for the gradient_boosted model type — that exceed what LLM-based analysis can do accurately.
Training a gradient_boosted model requires the ML Worker. Scoring does not — the trained tree ensemble is serialized to JSON and scored in-process in Node, so the /recommend hot path never calls Python. Only the Train button hits this service.
When to Use the ML Worker
| Scenario | Without ML Worker | With ML Worker |
|---|---|---|
| Auto-Segmentation | LLM percentile-based grouping | K-Means on full dataset with silhouette scoring |
| Policy Recommender | Heuristic pattern recognition | Logistic regression and statistical analysis |
| Content Intelligence | CTR/CVR heuristics | TF-IDF + Random Forest feature importance |
| Dataset size | Works well under 5K rows | Required for 5K+ rows for accurate results |
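The in-process GBM scoring described above (the trained ensemble serialized to JSON and evaluated without calling Python) can be sketched as follows. This is an illustrative Python version of what the platform does in Node, and the node layout (feature index, threshold, child ids, leaf value) is an assumed format, not the actual serialization.

```python
import math

# Hypothetical serialized ensemble: each tree stores nodes in a flat list.
# Internal nodes: {"feature": i, "threshold": t, "left": id, "right": id}
# Leaves: {"value": v}
def score_tree(tree, features):
    """Walk a single decision tree to its leaf value."""
    node = tree["nodes"][tree["root"]]
    while "value" not in node:
        branch = "left" if features[node["feature"]] <= node["threshold"] else "right"
        node = tree["nodes"][node[branch]]
    return node["value"]

def score_ensemble(model, features):
    """Sum the leaf outputs of all trees, then apply a sigmoid
    (assuming a binary objective)."""
    raw = sum(score_tree(tree, features) for tree in model["trees"])
    return 1.0 / (1.0 + math.exp(-raw))

model = {
    "trees": [
        {"root": 0, "nodes": [
            {"feature": 0, "threshold": 0.5, "left": 1, "right": 2},
            {"value": -1.2},
            {"value": 0.8},
        ]},
    ],
}
print(score_ensemble(model, [0.7]))  # sigmoid(0.8) ≈ 0.69
```

Because scoring is just this kind of tree walk plus a link function, the /recommend hot path stays entirely inside the Node process.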
Local Development
Set up environment
Edit .env to set your local database URL:
Configure the platform
Add ML_WORKER_URL to your platform/.env, then restart the Next.js dev server to pick up the change.
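For illustration, a minimal pair of entries; the host, credentials, and database name are placeholders, not the project's actual defaults:

```ini
# ml-worker/.env (the worker connects directly to PostgreSQL)
DATABASE_URL=postgresql://user:password@localhost:5432/kaireon

# platform/.env (point the platform at the running worker)
ML_WORKER_URL=http://localhost:8000
```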
Docker Setup
Standalone Docker
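The run command itself is not reproduced here; a plausible invocation, assuming the ECR image from the Helm values below and the container name used later in Troubleshooting:

```shell
docker run -d \
  --name kaireon-ml-worker \
  -p 8000:8000 \
  -e DATABASE_URL=postgresql://user:password@host.docker.internal:5432/kaireon \
  <YOUR_ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/kaireon-ml:latest
```

Verify it is up with curl http://localhost:8000/health.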
Docker Compose
The platform’s docker-compose.yml includes the ML Worker under the ml profile:
The service uses the same DATABASE_URL as the platform.
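The compose excerpt is not shown above; a sketch of how such a service entry might look (the service key, image tag, and overall layout are assumptions):

```yaml
services:
  ml-worker:
    image: kaireon-ml:latest
    profiles: ["ml"]
    ports:
      - "8000:8000"
    environment:
      DATABASE_URL: ${DATABASE_URL}
```

Start it with docker compose --profile ml up -d; without --profile ml, Compose skips the service entirely.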
Environment Variables
The ML Worker reads:
| Variable | Required | Default | Description |
|---|---|---|---|
| DATABASE_URL | Yes | — | PostgreSQL connection string (same database as the platform) |
| ML_WORKER_PORT | No | 8000 | Port to listen on |
The platform reads:
| Variable | Required | Default | Description |
|---|---|---|---|
| ML_WORKER_URL | No | — | Base URL of the ML Worker. Required for gradient_boosted training and ML-Worker-backed AI features. If unset, GBM training returns MLWorkerUnavailableError. |
| ML_WORKER_API_KEY | No | — | Optional shared secret. If set, the platform sends it as X-Api-Key on every ML Worker request. |
| ML_WORKER_TIMEOUT_MS | No | 300000 | Per-request timeout (ms). GBM training on 50K rows + 100 trees typically completes in under 5 seconds, but allow headroom for pathological inputs. |
Kubernetes (Helm)
The Helm chart includes the ML Worker as an optional component. When mlWorker.enabled=true, the chart automatically:
- Creates a Deployment and Service for the ML Worker
- Injects ML_WORKER_URL into the API pods so the platform auto-connects
- Creates a ServiceAccount for the ML Worker pods
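The enable command is elided above; a sketch, assuming a release named kaireon and a local chart path (both hypothetical):

```shell
helm upgrade --install kaireon ./chart \
  --set mlWorker.enabled=true \
  --set mlWorker.replicas=2
```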
Helm Values
| Value | Default | Description |
|---|---|---|
| mlWorker.enabled | false | Enable ML Worker deployment |
| mlWorker.replicas | 1 | Number of replicas |
| mlWorker.image.repository | <YOUR_ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/kaireon-ml | Container image |
| mlWorker.image.tag | latest | Image tag |
| mlWorker.resources.requests.cpu | 500m | CPU request |
| mlWorker.resources.requests.memory | 1Gi | Memory request |
| mlWorker.resources.limits.cpu | 2000m | CPU limit |
| mlWorker.resources.limits.memory | 4Gi | Memory limit |
Connecting from the Platform
There are two ways to connect the platform to the ML Worker:
1. Environment Variable (Recommended for local dev and Kubernetes)
Set ML_WORKER_URL in the platform’s environment. The Helm chart does this automatically when mlWorker.enabled=true.
2. Settings UI (Runtime configuration)
- Navigate to Settings > Integrations in the KaireonAI UI
- Find the ML Worker section
- Enter the ML Worker URL
- Click Test Connection to verify
- Save the configuration
API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /health | GET | Health check with capabilities list |
| /analyze/policies | POST | Submit policy analysis job |
| /analyze/segments | POST | Submit segmentation job |
| /analyze/content | POST | Submit content analysis job |
| /status/{job_id} | GET | Poll job status and results |
| /train/gbm | POST | Synchronous LightGBM training. Returns the serialized tree ensemble, metrics, and feature importances. Called by the platform when a gradient_boosted model is trained. |
The /analyze/* endpoints are asynchronous: they return a jobId immediately and process in the background. /train/gbm is synchronous and returns the trained model JSON directly.
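The submit-then-poll pattern for the asynchronous endpoints can be sketched with a small helper. Here get_status stands in for an HTTP GET of /status/{job_id}; the status names and the result/error fields are assumptions about the job schema, not the documented contract.

```python
import time

def poll_job(get_status, job_id, interval_s=1.0, timeout_s=300.0):
    """Poll an analysis job until it completes, fails, or times out.

    get_status is any callable that takes a job id and returns a dict
    with at least a "status" key (assumed values: "pending", "running",
    "completed", "failed").
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        job = get_status(job_id)
        if job["status"] == "completed":
            return job["result"]
        if job["status"] == "failed":
            raise RuntimeError(job.get("error", "analysis job failed"))
        time.sleep(interval_s)  # still pending/running; wait and retry
    raise TimeoutError(f"job {job_id} did not finish in {timeout_s}s")

# Exercise the helper with a canned sequence of status responses.
responses = iter([
    {"status": "running"},
    {"status": "completed", "result": {"segments": 4}},
])
print(poll_job(lambda _jid: next(responses), "job-123", interval_s=0.0))
```

Injecting the fetcher keeps the polling logic testable without a live worker; in real use it would wrap an HTTP client call.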
Platform-side health probe
The platform exposes GET /api/v1/ml-worker/health as a pass-through probe so the UI can warn users before attempting GBM training. Response:
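The response body is not reproduced here; a plausible shape, assuming only the available field described below plus a pass-through of the worker's health payload (the nesting and capability names are hypothetical):

```json
{
  "available": true,
  "worker": {
    "status": "ok",
    "capabilities": ["kmeans", "logistic_regression", "tfidf", "lightgbm"]
  }
}
```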
available is true only if ML_WORKER_URL is configured and the worker’s /health endpoint responds with status: ok.
Large Dataset Warning Flow
When a dataset contains 5,000 or more rows, the KaireonAI UI shows a confirmation dialog before starting analysis. The dialog provides:
- Accuracy — ML Worker algorithms (K-Means, logistic regression, TF-IDF) are more accurate than LLM pattern matching on large datasets
- Cost estimate — Approximate token count and cost if the user proceeds with LLM analysis
- Speed — ML Worker processes data locally in seconds vs. LLM round-trip latency
Troubleshooting
ML Worker not detected by the platform
The platform checks for the ML Worker at startup via the
ML_WORKER_URL environment variable, or at runtime via Settings > Integrations. Verify:
- The ML Worker is running and responding: curl http://localhost:8000/health
- The URL is accessible from the platform (same network/cluster)
- The environment variable is set correctly: ML_WORKER_URL=http://localhost:8000
- In Kubernetes, confirm the chart was installed with mlWorker.enabled=true.
Python version mismatch or pip install failures
The ML Worker requires Python 3.11+. Verify your version before installing dependencies. If using a virtual environment, ensure it is activated before installing. On macOS, you may need to install python3.11 explicitly via Homebrew: brew install python@3.11.
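The verification commands are elided above; typical steps look like this (the .venv name is a convention, not something these docs mandate):

```shell
python3 --version        # must report 3.11 or newer
python3 -m venv .venv    # create an isolated environment
source .venv/bin/activate
pip install -r requirements.txt
```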
ModuleNotFoundError: sklearn
The Python package name is scikit-learn, not sklearn. This is a common source of confusion. The requirements.txt uses the correct package name, so running pip install -r requirements.txt avoids this issue.
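If the dependency was installed by hand under the wrong name, installing the correct distribution fixes the import:

```shell
pip install scikit-learn   # the distribution is scikit-learn; the import name is sklearn
```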
Health check fails ('connection refused' to PostgreSQL)
The ML Worker connects directly to PostgreSQL to read schema data. Verify:
- DATABASE_URL is set in the ML Worker environment and matches the platform database
- PostgreSQL is reachable from the ML Worker host/container
- Check logs for connection errors: docker logs kaireon-ml-worker
Out of memory during analysis
K-Means clustering and TF-IDF vectorization load the full dataset into memory. Allocate memory based on dataset size:
- Under 100K rows: 1Gi is sufficient
- 100K-500K rows: 2Gi recommended
- Over 500K rows: 4Gi+ recommended
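The sizing rules above can be encoded as a quick helper; this is a convenience sketch mirroring the list, not a documented API.

```python
def recommended_memory_gib(rows: int) -> int:
    """Suggested ML Worker memory (GiB) for a given dataset size.

    Mirrors the guidance above: under 100K rows -> 1Gi,
    100K-500K -> 2Gi, over 500K -> 4Gi or more (4Gi floor returned).
    """
    if rows < 100_000:
        return 1
    if rows <= 500_000:
        return 2
    return 4

print(recommended_memory_gib(250_000))  # 2
```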
Next Steps
Auto-Segmentation
Use the ML Worker for full-dataset clustering.
Smart Policy Recommender
Enhanced frequency analysis with the ML Worker.
Kubernetes Deployment
Deploy the full stack on Kubernetes.