modelType: "gradient_boosted" — a sum-of-trees ensemble (gradient boosting machine). Each tree contributes a small additive margin; the final sigmoid converts the cumulative margin to a probability. Often abbreviated “AGB” in operator parlance (Adaptive Gradient Boosted). The accuracy ceiling on tabular data is hard to beat with anything that’s not also a tree ensemble.

When to use

  • You have ≥ 10k labeled outcomes, especially with many feature columns.
  • You suspect non-linear interactions — “income matters more for high-credit-score customers” — that logistic_regression can’t capture without manual feature engineering.
  • You want SHAP-explainable predictions — TreeSHAP runs in polynomial time on tree ensembles and gives exact per-feature attributions.

Skip it when the dataset is small (< 1k outcomes) — trees overfit unless you have enough samples per leaf. Use logistic_regression or bayesian first.

The math

For each tree t = 1..T:
  margin_t = walkTree(root, featureVector)   # leaf value at the path's end

rawMargin = Σ_t margin_t
score     = sigmoid(rawMargin)
Each tree is fit to the pseudo-residuals: the negative gradient of the loss (binary cross-entropy by default) evaluated at the previous ensemble’s predictions, so adding a tree approximates one gradient-descent step in function space. Each leaf stores a single numeric value (its contribution to the margin).
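
A minimal scoring sketch in Python, assuming the recursive node shape shown in the fixture below and LightGBM's convention that values at or below the threshold take the left branch:
import math

def walk_tree(node, feature_vector):
    # Leaf node: its stored value is this tree's additive contribution to the margin.
    if "leaf_value" in node:
        return node["leaf_value"]
    # Internal node: compare the split feature against the threshold.
    # Assumption: values <= threshold go left (LightGBM's default decision type).
    value = feature_vector[node["split_feature"]]
    branch = "left_child" if value <= node["threshold"] else "right_child"
    return walk_tree(node[branch], feature_vector)

def score(model_state, feature_vector):
    # feature_vector is ordered to match model_state["feature_names"].
    raw_margin = sum(walk_tree(tree["tree_structure"], feature_vector)
                     for tree in model_state["trees"])
    return 1.0 / (1.0 + math.exp(-raw_margin)), raw_margin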

Fixture config

A minimal 2-tree ensemble (the proof script uses this exact fixture):
{
  "modelType": "gradient_boosted",
  "modelState": {
    "feature_names": ["credit_score", "income", "age"],
    "trees": [
      {
        "tree_structure": {
          "split_feature": 0,
          "threshold": 740,
          "left_child":  { "leaf_value": -0.8 },
          "right_child": {
            "split_feature": 1,
            "threshold": 75000,
            "left_child":  { "leaf_value": 0.2 },
            "right_child": { "leaf_value": 0.9 }
          }
        }
      },
      {
        "tree_structure": {
          "split_feature": 2,
          "threshold": 30,
          "left_child":  { "leaf_value": -0.3 },
          "right_child": {
            "split_feature": 1,
            "threshold": 60000,
            "left_child":  { "leaf_value": -0.1 },
            "right_child": { "leaf_value": 0.6 }
          }
        }
      }
    ]
  }
}
Produces score=0.8176 with rawMargin=1.50 for the standard test customer. The path-contribution explanation correctly identifies income as the dominant feature (it appears as the second-level split in both trees).
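
As a quick check with the score() sketch from "The math": the standard test customer's exact feature values are defined elsewhere, so the vector below is a hypothetical one that simply takes the right branch at every split and reproduces the figures above:
# model_state = the fixture JSON above; the values here are illustrative, not the real test customer.
features = [780, 85000, 35]      # credit_score, income, age
prob, margin = score(model_state, features)
# margin = 0.9 + 0.6 = 1.5, and sigmoid(1.5) ≈ 0.8176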

Training

The engine doesn’t train GBTs in-process — it’s too expensive for a request-time pipeline. Train offline (Python + LightGBM or XGBoost, or your favorite GBT library) and import the model JSON into modelState.trees. The shape above matches LightGBM’s JSON export format. For production: run training on a schedule, store the latest model artifact, swap modelState via PUT /api/v1/algorithm-models/<id> after a hold-out evaluation.
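
A hedged training-and-import sketch, assuming LightGBM for the offline step and the requests library for the swap; the engine host, auth header, and exact PUT body shape are assumptions rather than documented facts:
import lightgbm as lgb
import requests

# X_train/X_valid, y_train/y_valid: historical feature matrices and 0/1 outcome labels.
train_set = lgb.Dataset(X_train, label=y_train,
                        feature_name=["credit_score", "income", "age"])
valid_set = lgb.Dataset(X_valid, label=y_valid, reference=train_set)

booster = lgb.train(
    {"objective": "binary", "learning_rate": 0.05, "num_leaves": 31},
    train_set,
    num_boost_round=500,
    valid_sets=[valid_set],
    callbacks=[lgb.early_stopping(stopping_rounds=20)],
)

# dump_model() emits tree_info[i]["tree_structure"] in the same shape as the fixture above.
dump = booster.dump_model()
model_state = {
    "feature_names": dump["feature_names"],
    "trees": [{"tree_structure": t["tree_structure"]} for t in dump["tree_info"]],
}

# Swap the live model after a hold-out evaluation; the endpoint path is from this page,
# the request envelope and auth header are assumptions.
resp = requests.put(
    "https://<engine-host>/api/v1/algorithm-models/<id>",
    json={"modelType": "gradient_boosted", "modelState": model_state},
    headers={"Authorization": "Bearer <token>"},
    timeout=30,
)
resp.raise_for_status()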

Score interpretation

  • score ∈ [0, 1] — calibrated probability (well-calibrated when the ensemble has enough trees and isotonic post-calibration was applied during training).
  • rawMargin — the pre-sigmoid log-odds. Operators reading the trace can see how many trees voted positively vs negatively.
  • explanations[] — top contributing features along the chosen path through each tree. Per-feature contributions sum to the rawMargin.
  • shapValues — full TreeSHAP attributions when computeShap: true. More expensive but exact. See SHAP.
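
To spot-check the engine's shapValues offline, TreeSHAP from the shap library can be run against the exported booster from the training sketch above (library usage only; this is not the engine's own implementation):
import shap

# Exact TreeSHAP attributions for the offline booster, for comparison with shapValues.
explainer = shap.TreeExplainer(booster)
attributions = explainer.shap_values(X_valid)
# Some shap versions return [negative_class, positive_class] for binary objectives;
# use the positive-class array in that case.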

Pitfalls

  • Overfitting on small data — trees memorize. If you have < 1k outcomes, the test-set accuracy will be much worse than the training-set accuracy. Use early stopping during offline training.
  • Categorical encoding — GBT libraries handle categoricals natively if told, but the engine’s tree format assumes numeric inputs. Pre-encode categoricals as ordinal (and let the GBT pick split points) or one-hot.
  • Drift over time — tree splits are brittle to feature distribution shifts. Retrain monthly at minimum, weekly if conversion rate or customer mix is moving.
  • Calibration drift after retraining — without isotonic post-calibration, the raw GBT score is a margin, not a probability. Run isotonic on a held-out set to keep sigmoid(margin) calibrated (a minimal sketch follows this list).
  • Large model JSON — a 500-tree ensemble can be 5–50 MB. Watch modelState size; the engine reads it on every score call. Consider downsampling trees or using leaf quantization.
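
A minimal isotonic post-calibration sketch, assuming the LightGBM booster from the training sketch above and scikit-learn; how the fitted mapping is carried into the engine is not covered on this page:
import numpy as np
from sklearn.isotonic import IsotonicRegression

# X_holdout / y_holdout: a held-out set not used for training.
# raw_score=True returns the pre-sigmoid margin from LightGBM.
margins = booster.predict(X_holdout, raw_score=True)
uncalibrated = 1.0 / (1.0 + np.exp(-margins))

# Fit a monotone mapping from uncalibrated scores to observed outcome rates.
iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
iso.fit(uncalibrated, y_holdout)
calibrated = iso.predict(uncalibrated)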

Cross-reference