modelType: "bayesian" — a Naive Bayes classifier with Laplace smoothing. Returns a calibrated posterior probability that a customer responds positively to an offer, given a set of binned predictors. The model is small, fast to train, and produces per-feature log-likelihood contributions that explain each score.
When to use
- You have 1k–10k labeled outcomes — enough to estimate per-feature likelihoods, not enough to justify gradient_boosted.
- You need a calibrated probability — the score IS a probability, not a relative ranking. Useful when downstream consumers (e.g. budget pacers) need to multiply by expected value.
- Some predictors are categorical with moderate cardinality — Naive Bayes handles `segment`, `tier`, `state` natively; logistic_regression needs one-hot encoding.
The math
Numeric predictors are discretized first: `binEdges[field]` defines the bucket boundaries. Categorical predictors are used directly.
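A sketch of the posterior the engine computes, assuming standard add-α (Laplace) smoothing over the stored bin counts — the exact normalization inside the engine may differ:

$$
P(\text{respond} \mid x) = \frac{P(\text{respond}) \prod_{f} P\big(b_f(x_f) \mid \text{respond}\big)}{\sum_{c \in \{\text{pos},\,\text{neg}\}} P(c) \prod_{f} P\big(b_f(x_f) \mid c\big)}
\qquad
P(b \mid c) = \frac{\text{count}[f][c][b] + \alpha}{\text{count}[c] + \alpha\,|B_f|}
$$

Here $b_f(x_f)$ is the bucket of $x_f$ under `binEdges[field]`, $\alpha$ is the Laplace constant (1 by default), and $|B_f|$ is the number of buckets for field $f$. The per-feature contribution surfaced in `explanations[]` is $\log P(b \mid \text{pos}) - \log P(b \mid \text{neg})$.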
Fixture config
score = 0.812 for the standard test customer (credit_score=760, income=95000, age=38, segment=Gold), with credit_score contributing the most positive log-likelihood and segment + income reinforcing.
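The fixture values themselves aren't reproduced here; a minimal sketch of what the config shape looks like, with hypothetical bin edges (the real fixture's boundaries and any extra options may differ):

```ts
// Hypothetical config sketch — field names match the fixture customer,
// but the bin edges below are illustrative assumptions, not the fixture's.
const fixtureConfig = {
  modelType: "bayesian",
  binEdges: {
    credit_score: [580, 670, 740, 800],
    income: [40_000, 75_000, 120_000],
    age: [25, 40, 60],
  },
  // Categorical predictors such as segment need no binEdges entry.
};
```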
Training
The engine’s `train.ts` increments `priors[outcome]` and `likelihoods[field][outcome][bin]` for every observed interaction. No SGD, no learning rate — counts go up, smoothed estimates fall out for free.
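A minimal sketch of that counting update in TypeScript — the `Interaction` shape and the `pos`/`neg` labels are assumptions; only the `priors`/`likelihoods` structures come from the description above:

```ts
// Count-based training: no gradients, just increments.
type Outcome = "pos" | "neg";

interface Interaction {
  outcome: Outcome;
  bins: Record<string, string>; // field -> bin label (numeric fields already bucketed)
}

const priors: Record<Outcome, number> = { pos: 0, neg: 0 };
const likelihoods: Record<string, Record<Outcome, Record<string, number>>> = {};

function trainOne({ outcome, bins }: Interaction): void {
  priors[outcome] += 1; // class counts double as (unnormalized) priors
  for (const [field, bin] of Object.entries(bins)) {
    likelihoods[field] ??= { pos: {}, neg: {} };
    likelihoods[field][outcome][bin] = (likelihoods[field][outcome][bin] ?? 0) + 1;
  }
}
```

On this reading, the stored values stay raw counts; the Laplace α is applied only when counts are turned into probabilities at scoring time.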
To retrain: hit POST /api/v1/algorithm-models/<id>/train after a batch of new interactions. Or set the model’s auto-learn cron to incrementally update from each new respond call.
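For example, the batch retrain is just a POST to that endpoint (a sketch — whether the endpoint expects a request body isn't covered in this section; auth headers omitted):

```ts
// Trigger a retrain for one model after ingesting a batch of new interactions.
async function retrain(modelId: string): Promise<void> {
  await fetch(`/api/v1/algorithm-models/${modelId}/train`, { method: "POST" });
}
```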
Score interpretation
- `score` ∈ [0, 1] — posterior probability of positive response.
- `confidence` ∈ [0, 1] — heuristic based on `sqrt(sample_count / 1000)`. A confidence of 1.0 means we’ve seen 1k+ outcomes for this model.
- `explanations[]` — sorted by absolute contribution. Each entry is `{field, contribution: logLik(pos) - logLik(neg)}`. Positive contributions push toward responding; negative ones away.
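Put together, a score result has roughly this shape — the field names come from the description above, while the wrapper interface names are assumptions:

```ts
// Shape implied by the description above; the actual response envelope may differ.
interface Explanation {
  field: string;        // e.g. "credit_score"
  contribution: number; // logLik(pos) - logLik(neg) for this field's bin
}

interface BayesianScore {
  score: number;               // posterior P(positive response), in [0, 1]
  confidence: number;          // ~ sqrt(sample_count / 1000), capped at 1.0
  explanations: Explanation[]; // sorted by |contribution|, descending
}
```

For the fixture customer this would carry `score: 0.812` with `credit_score` at the top of `explanations`.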
Pitfalls
- Strong predictor correlation — Naive Bayes double-counts correlated features (`credit_score` and `income` both signal affluence). Result: overconfident scores at the extremes. If you see scores clustering at 0.01 or 0.99, this is usually why.
- Sparse bins — if a bin has 0 positive outcomes but α=1, smoothing pulls it to a tiny but non-zero rate; see the worked example after this list. Acceptable as long as α is small relative to total samples.
- Imbalanced priors — if the conversion rate is 1% and priors aren’t reset before retrain, the prior overwhelms the per-feature likelihoods. Bayesian needs the priors to track actual class balance.
- Categorical drift — if a new `segment` value appears that wasn’t in training data, it gets the smoothing-only count and essentially falls back to the prior. Retrain when categories change.
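A quick worked example of the sparse-bin case, using the default α = 1 and hypothetical counts (500 positive samples, 8 buckets for the field):

$$
P(\text{bin} \mid \text{pos}) = \frac{0 + 1}{500 + 1 \cdot 8} \approx 0.002
$$

Tiny but non-zero, and it stays negligible as long as the total sample count dwarfs $\alpha\,|B_f|$.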
Cross-reference
- Algorithm Selection Guide — when to pick Bayesian over alternatives.
- SHAP — Bayesian explanations are already per-feature log-likelihood contributions; no separate SHAP pass needed.