modelType: "gradient_boosted" — a sum-of-trees ensemble (gradient boosting machine). Each tree contributes a small additive margin; the final sigmoid converts the cumulative margin to a probability. Often abbreviated “AGB” in operator parlance (Adaptive Gradient Boosted). The accuracy ceiling on tabular data is hard to beat with anything that’s not also a tree ensemble.

When to use

  • You have ≥ 10k labeled outcomes, especially with many feature columns.
  • You suspect non-linear interactions — “income matters more for high-credit-score customers” — that logistic_regression can’t capture without manual feature engineering.
  • You want SHAP-explainable predictions — TreeSHAP runs in polynomial time on tree ensembles and gives exact per-feature attributions.

Skip it when the dataset is small (< 1k outcomes) — trees overfit unless you have enough samples per leaf. Use logistic_regression or bayesian first.

The math

For each tree t = 1..T:
  margin_t = walkTree(root, featureVector)   # leaf value at the path's end

rawMargin = Σ_t margin_t
score     = sigmoid(rawMargin)
Each tree is fit to the pseudo-residuals: the negative gradient of the loss (binary cross-entropy by default) evaluated at the previous ensemble’s predictions, so adding a tree approximates one gradient-descent step in function space. Each leaf stores a single numeric value (its contribution to the margin).
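
A minimal scoring sketch in Python, assuming the recursive node shape shown in the fixture below and LightGBM's convention that values at or below the threshold take the left branch:
import math

def walk_tree(node, feature_vector):
    # Leaf node: its stored value is this tree's additive contribution to the margin.
    if "leaf_value" in node:
        return node["leaf_value"]
    # Internal node: compare the split feature against the threshold.
    # Assumption: values <= threshold go left (LightGBM's default decision type).
    value = feature_vector[node["split_feature"]]
    branch = "left_child" if value <= node["threshold"] else "right_child"
    return walk_tree(node[branch], feature_vector)

def score(model_state, feature_vector):
    # feature_vector is ordered to match model_state["feature_names"].
    raw_margin = sum(walk_tree(tree["tree_structure"], feature_vector)
                     for tree in model_state["trees"])
    return 1.0 / (1.0 + math.exp(-raw_margin)), raw_margin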

Fixture config

A minimal 2-tree ensemble (the proof script uses this exact fixture):
{
  "modelType": "gradient_boosted",
  "modelState": {
    "feature_names": ["credit_score", "income", "age"],
    "trees": [
      {
        "tree_structure": {
          "split_feature": 0,
          "threshold": 740,
          "left_child":  { "leaf_value": -0.8 },
          "right_child": {
            "split_feature": 1,
            "threshold": 75000,
            "left_child":  { "leaf_value": 0.2 },
            "right_child": { "leaf_value": 0.9 }
          }
        }
      },
      {
        "tree_structure": {
          "split_feature": 2,
          "threshold": 30,
          "left_child":  { "leaf_value": -0.3 },
          "right_child": {
            "split_feature": 1,
            "threshold": 60000,
            "left_child":  { "leaf_value": -0.1 },
            "right_child": { "leaf_value": 0.6 }
          }
        }
      }
    ]
  }
}
Produces score=0.8176 with rawMargin=1.50 for the standard test customer. The path-contribution explanation correctly identifies income as the dominant feature (it appears as the second-level split in both trees).
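
As a quick check with the score() sketch from "The math": the standard test customer's exact feature values are defined elsewhere, so the vector below is a hypothetical one that simply takes the right branch at every split and reproduces the figures above:
# model_state = the fixture JSON above; the values here are illustrative, not the real test customer.
features = [780, 85000, 35]      # credit_score, income, age
prob, margin = score(model_state, features)
# margin = 0.9 + 0.6 = 1.5, and sigmoid(1.5) ≈ 0.8176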

Training

The engine doesn’t train GBTs in-process — it’s too expensive for a request-time pipeline. Train offline (Python + LightGBM or XGBoost, or your favorite GBT library) and import the model JSON into modelState.trees. The shape above matches LightGBM’s JSON export format. For production: run training on a schedule, store the latest model artifact, swap modelState via PUT /api/v1/algorithm-models/<id> after a hold-out evaluation.
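
A hedged training-and-import sketch, assuming LightGBM for the offline step and the requests library for the swap; the engine host, auth header, and exact PUT body shape are assumptions rather than documented facts:
import lightgbm as lgb
import requests

# X_train/X_valid, y_train/y_valid: historical feature matrices and 0/1 outcome labels.
train_set = lgb.Dataset(X_train, label=y_train,
                        feature_name=["credit_score", "income", "age"])
valid_set = lgb.Dataset(X_valid, label=y_valid, reference=train_set)

booster = lgb.train(
    {"objective": "binary", "learning_rate": 0.05, "num_leaves": 31},
    train_set,
    num_boost_round=500,
    valid_sets=[valid_set],
    callbacks=[lgb.early_stopping(stopping_rounds=20)],
)

# dump_model() emits tree_info[i]["tree_structure"] in the same shape as the fixture above.
dump = booster.dump_model()
model_state = {
    "feature_names": dump["feature_names"],
    "trees": [{"tree_structure": t["tree_structure"]} for t in dump["tree_info"]],
}

# Swap the live model after a hold-out evaluation; the endpoint path is from this page,
# the request envelope and auth header are assumptions.
resp = requests.put(
    "https://<engine-host>/api/v1/algorithm-models/<id>",
    json={"modelType": "gradient_boosted", "modelState": model_state},
    headers={"Authorization": "Bearer <token>"},
    timeout=30,
)
resp.raise_for_status()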

Score interpretation

  • score ∈ [0, 1] — calibrated probability (well-calibrated when the ensemble has enough trees and isotonic post-calibration was applied during training).
  • rawMargin — the pre-sigmoid log-odds. Operators reading the trace can see how many trees voted positively vs negatively.
  • explanations[] — top contributing features along the chosen path through each tree. Per-feature contributions sum to the rawMargin.
  • shapValues — full TreeSHAP attributions when computeShap: true. More expensive but exact. See SHAP.
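
To spot-check the engine's shapValues offline, TreeSHAP from the shap library can be run against the exported booster from the training sketch above (library usage only; this is not the engine's own implementation):
import shap

# Exact TreeSHAP attributions for the offline booster, for comparison with shapValues.
explainer = shap.TreeExplainer(booster)
attributions = explainer.shap_values(X_valid)
# Some shap versions return [negative_class, positive_class] for binary objectives;
# use the positive-class array in that case.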

Pitfalls

  • Overfitting on small data — trees memorize. If you have < 1k outcomes, the test-set accuracy will be much worse than the training-set accuracy. Use early stopping during offline training.
  • Categorical encoding — GBT libraries handle categoricals natively if told, but the engine’s tree format assumes numeric inputs. Pre-encode categoricals as ordinal (and let the GBT pick split points) or one-hot.
  • Drift over time — tree splits are brittle to feature distribution shifts. Retrain monthly at minimum, weekly if conversion rate or customer mix is moving.
  • Calibration drift after retraining — without isotonic post-calibration, the raw GBT score is a margin, not a probability. Run isotonic on a held-out set to keep sigmoid(margin) calibrated (a minimal sketch follows this list).
  • Large model JSON — a 500-tree ensemble can be 5–50 MB. Watch modelState size; the engine reads it on every score call. Consider downsampling trees or using leaf quantization.
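
A minimal isotonic post-calibration sketch, assuming the LightGBM booster from the training sketch above and scikit-learn; how the fitted mapping is carried into the engine is not covered on this page:
import numpy as np
from sklearn.isotonic import IsotonicRegression

# X_holdout / y_holdout: a held-out set not used for training.
# raw_score=True returns the pre-sigmoid margin from LightGBM.
margins = booster.predict(X_holdout, raw_score=True)
uncalibrated = 1.0 / (1.0 + np.exp(-margins))

# Fit a monotone mapping from uncalibrated scores to observed outcome rates.
iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
iso.fit(uncalibrated, y_holdout)
calibrated = iso.predict(uncalibrated)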

Cross-reference