

modelType: "online_learner" — a perceptron / online logistic regression that updates its weight vector on every observed outcome. No batch retrain phase; the model is always current as of the most recent reward.

When to use

  • You can’t afford a retrain pipeline — limited engineering capacity, fast-changing data.
  • You want a model that adapts to drift — seasonal patterns, campaign changes, news-driven shifts in customer behavior.
  • Numeric features only — the engine version handles numeric inputs natively (categoricals need preprocessing).

Skip it when you need calibrated probabilities (online updates can leave the model uncalibrated between adjustments) or when features are highly non-linear (use gradient_boosted).

The math

# Inference (same as logistic_regression):
z      = bias + Σ_i (weights[fᵢ] × xᵢ)
score  = sigmoid(z)

# After observing outcome:
error  = observed_reward - score
weights[fᵢ] += learningRate × error × xᵢ
bias        += learningRate × error

Here fᵢ is the feature name and xᵢ its numeric value. The learning rate η controls how aggressively each new outcome moves the weights. Common values: 0.01 (slow, stable) to 0.1 (fast, noisy).
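A minimal TypeScript sketch of the same math, assuming a weights map keyed by feature name (as in the fixture below). The names scoreCandidate and onlineUpdate are illustrative, not the engine's actual API:

type Features = Record<string, number>;

interface ModelState {
  weights: Record<string, number>;
  bias: number;
  learningRate: number;
}

const sigmoid = (z: number): number => 1 / (1 + Math.exp(-z));

// Inference: bias plus the weighted sum of feature values, through a sigmoid.
function scoreCandidate(state: ModelState, features: Features): number {
  let z = state.bias;
  for (const [name, value] of Object.entries(features)) {
    z += (state.weights[name] ?? 0) * value; // unseen features contribute 0
  }
  return sigmoid(z);
}

// Update: move each weight toward the observed outcome by η × error × xᵢ.
function onlineUpdate(state: ModelState, features: Features, observedReward: number): void {
  const error = observedReward - scoreCandidate(state, features);
  for (const [name, value] of Object.entries(features)) {
    state.weights[name] = (state.weights[name] ?? 0) + state.learningRate * error * value;
  }
  state.bias += state.learningRate * error;
}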

Fixture config

{
  "modelType": "online_learner",
  "modelState": {
    "weights": { "credit_score": 0.004, "income": 0.000015, "age": 0.005 },
    "bias": -3.2,
    "learningRate": 0.01
  }
}
The proof script verifies this scores 0.8108 for the standard test customer, comparable to the 0.93 from logistic_regression (different weights, same functional form). The lower score reflects the more conservative weights you'd expect from an online learner that hasn't seen as many outcomes as a batch-trained logistic regression.
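To see how a score falls out of those weights, here is the earlier sketch applied to a hypothetical customer. The feature values below are made up for illustration, so the result differs from the proof script's 0.8108:

// Fixture weights from above; the customer values are hypothetical.
const state: ModelState = {
  weights: { credit_score: 0.004, income: 0.000015, age: 0.005 },
  bias: -3.2,
  learningRate: 0.01,
};

const score = scoreCandidate(state, { credit_score: 750, income: 60000, age: 40 });
// z = -3.2 + 0.004×750 + 0.000015×60000 + 0.005×40 = -3.2 + 3.0 + 0.9 + 0.2 = 0.9
// sigmoid(0.9) ≈ 0.711
console.log(score.toFixed(3)); // "0.711"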

Training

Updates happen via POST /api/v1/respond; auto-learn.ts applies the gradient update for each incoming outcome. There is no separate train endpoint. To bootstrap, either seed weights = {} and bias = 0 (cold start, every candidate scores 0.5), or seed with weights from a batch-trained logistic_regression model; the online learner can then refine them as outcomes arrive.
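Both bootstrap paths amount to choosing the initial model state. A sketch using the types from the earlier example; the warm-start numbers are invented for illustration:

// Cold start: empty weights and zero bias, so z = 0 and every candidate scores sigmoid(0) = 0.5.
const coldStart: ModelState = { weights: {}, bias: 0, learningRate: 0.01 };

// Warm start: copy the weight vector from a batch-trained logistic_regression model
// (illustrative values), then let onlineUpdate refine the copy as outcomes arrive.
const batchTrained = { weights: { credit_score: 0.005, income: 0.00002, age: 0.004 }, bias: -3.0 };
const warmStart: ModelState = {
  weights: { ...batchTrained.weights },
  bias: batchTrained.bias,
  learningRate: 0.01,
};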

Score interpretation

Same as logistic_regression — calibrated probability in [0, 1], conditional on the weight vector being stable. During rapid drift, the score is more “current estimate” than “calibrated probability”.

Pitfalls

  • High learning rate → instability — at η = 0.5, a single bad outcome can flip the sign of a weight. Stick to 0.01–0.1.
  • Feature scaling — same issue as logistic_regression. Standardize features before they reach the model; otherwise large-valued features dominate both the score and the update.
  • No regularization — the engine’s online learner has no L2 by default; weights can drift unboundedly on rare features. If you see exploding magnitudes, add a decay step in auto-learn (a sketch follows this list).
  • Lost-update on concurrent writes — two parallel respond calls can race on the same weight vector. The engine serializes writes to ModelAdaptation per (modelId, scopeId); if you customize the update path, preserve that contract.
  • No held-out evaluation — there’s no train/test split; you’re updating against the same stream that’s being scored. Watch for self-confirming loops (the model believes “premium customers respond” → only shows offers to premium → only learns from premium → over-confident on premium). Mix in exploration via shadowModelKeys if you suspect this.
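For the regularization pitfall above, one possible decay step is sketched below; onlineUpdateWithDecay and the decay value are assumptions for illustration, not engine defaults:

// Shrink every weight slightly on each update (L2-style weight decay) so weights
// cannot drift unboundedly, then apply the usual gradient step.
function onlineUpdateWithDecay(
  state: ModelState,
  features: Features,
  observedReward: number,
  decay = 0.0001,
): void {
  const error = observedReward - scoreCandidate(state, features);
  for (const name of Object.keys(state.weights)) {
    state.weights[name] *= 1 - decay;
  }
  for (const [name, value] of Object.entries(features)) {
    state.weights[name] = (state.weights[name] ?? 0) + state.learningRate * error * value;
  }
  state.bias += state.learningRate * error;
}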

Cross-reference