Uplift Modeling
Propensity asks “will this customer convert?”. Uplift asks “will showing this offer cause the conversion, or would they have converted anyway?”. The first ranks sure things (always-takers) at the top, wasting budget on people who’d buy without a touch. The second ranks persuadables at the top, which is exactly what you want.
Kaireon implements two canonical metalearners from Künzel, Sekhon, Bickel & Yu (PNAS 2019): the T-learner and X-learner. Both are exposed via an HTTP endpoint and a small math library that takes pluggable base learners.
The four uplift segments
For each (customer × offer) pair, the CATE τ = E[Y(1) − Y(0) | X = x] and the per-arm conversion rates μ_T = E[Y | X, T=1] and μ_C = E[Y | X, T=0] together classify the customer into one of four segments:
| Segment | When | Decisioning implication |
|---|---|---|
| Persuadable | τ high, μ_C low | Treatment causes the conversion. Rank these first. |
| Sure thing (always-taker) | τ ≈ 0, μ_C high | Would convert anyway. Save the impression. |
| Lost cause (never-taker) | τ ≈ 0, μ_T low | Won’t convert either way. Skip. |
| Sleeping dog (defier) | τ negative | Treatment causes them NOT to convert. Hide the offer. |
A fifth value, uncertain, is returned when a (τ, μ_T, μ_C) triple matches none of the four threshold rules, so the segment field is one of five values, not four.
Pure-propensity scoring confuses persuadables with sure-things — both have high μ_T. Uplift modeling is the only way to tell them apart.
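The segment rules in the table above can be sketched as a small classifier. This is an illustration only: the threshold values, the `SegmentThresholds` shape, the segment string names, and the order in which rules are checked are all assumptions made for the example, not the library's actual configuration.

```typescript
// Hypothetical segment labels; the real field values may be spelled differently.
type Segment = "persuadable" | "sure_thing" | "lost_cause" | "sleeping_dog" | "uncertain";

// Hypothetical threshold shape, invented for this sketch.
interface SegmentThresholds {
  tauHigh: number;  // tau at or above this counts as "high"
  tauZero: number;  // |tau| at or below this counts as "approximately 0"
  rateHigh: number; // conversion rate at or above this counts as "high"
  rateLow: number;  // conversion rate at or below this counts as "low"
}

// Illustrative values only.
const EXAMPLE_THRESHOLDS: SegmentThresholds = {
  tauHigh: 0.05,
  tauZero: 0.01,
  rateHigh: 0.3,
  rateLow: 0.1,
};

function classifySegment(
  tau: number,
  muT: number,
  muC: number,
  t: SegmentThresholds = EXAMPLE_THRESHOLDS,
): Segment {
  const nearZero = Math.abs(tau) <= t.tauZero;
  if (tau < -t.tauZero) return "sleeping_dog";                      // treatment hurts
  if (tau >= t.tauHigh && muC <= t.rateLow) return "persuadable";   // treatment causes conversion
  if (nearZero && muC >= t.rateHigh) return "sure_thing";           // converts anyway
  if (nearZero && muT <= t.rateLow) return "lost_cause";            // converts neither way
  return "uncertain";                                               // no rule matched
}
```

A triple such as τ = 0.03, μ_T = 0.20, μ_C = 0.17 falls between the "high" and "approximately 0" bands under these example thresholds, so it matches no rule and comes back as `uncertain`.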
T-learner
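Fit two regressions on disjoint subsets of interaction_history: μ_T on the treated rows and μ_C on the control rows. The CATE estimate is their difference, τ(x) = μ_T(x) − μ_C(x).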
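A minimal sketch of the T-learner with a pluggable base learner follows. The `Row`, `FitFn`, and `fitTLearner` names are hypothetical and are not taken from uplift-cate.ts. The toy `fitMean` learner ignores features and predicts the training mean; it stands in for a real regression only to keep the example short.

```typescript
type Predictor = (x: number[]) => number;
// A base learner: takes features and outcomes, returns a fitted predictor.
type FitFn = (X: number[][], y: number[]) => Predictor;

interface Row {
  x: number[];      // feature vector
  treated: boolean; // true if the offer was shown
  y: number;        // outcome (1 = converted, 0 = not)
}

// Toy base learner (assumption for the demo): predicts the training mean.
const fitMean: FitFn = (_X, y) => {
  const mean = y.length === 0 ? 0 : y.reduce((a, b) => a + b, 0) / y.length;
  return () => mean;
};

function fitTLearner(rows: Row[], fit: FitFn) {
  // Split into disjoint treated and control subsets.
  const treated = rows.filter((r) => r.treated);
  const control = rows.filter((r) => !r.treated);

  // One regression per arm.
  const muT = fit(treated.map((r) => r.x), treated.map((r) => r.y));
  const muC = fit(control.map((r) => r.x), control.map((r) => r.y));

  // CATE estimate is the difference of the two arm models.
  const tau: Predictor = (x) => muT(x) - muC(x);
  return { muT, muC, tau };
}

// Usage: 3 of 4 treated rows convert, 1 of 4 control rows convert.
const demoRows: Row[] = [
  { x: [1], treated: true, y: 1 },
  { x: [0], treated: true, y: 1 },
  { x: [1], treated: true, y: 0 },
  { x: [0], treated: true, y: 1 },
  { x: [1], treated: false, y: 0 },
  { x: [0], treated: false, y: 1 },
  { x: [1], treated: false, y: 0 },
  { x: [0], treated: false, y: 0 },
];
const tModel = fitTLearner(demoRows, fitMean);
console.log(tModel.tau([1]));
```

Because each arm is fit only on its own rows, a small arm yields a noisy model for that arm, and the noise carries straight into τ.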
X-learner
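The X-learner addresses the T-learner’s weakness on imbalanced arms (one arm much smaller than the other) via a three-stage fit:
Stage 1: fit the same μ_T, μ_C as the T-learner.
Stage 2: compute imputed treatment effects D_T = Y_T − μ_C(X_T) on the treated rows and D_C = μ_T(X_C) − Y_C on the control rows, then fit τ_T(x) on (X_T, D_T) and τ_C(x) on (X_C, D_C).
Stage 3: combine using propensity g(x) = P(T=1 | X=x): τ(x) = g(x) · τ_C(x) + (1 − g(x)) · τ_T(x).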
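The three stages can be sketched as follows, using the formulation from Künzel et al.: imputed effects D_T = Y_T − μ_C(X_T) and D_C = μ_T(X_C) − Y_C, combined as τ(x) = g(x)·τ_C(x) + (1 − g(x))·τ_T(x). The function and type names are hypothetical, and the propensity model is passed in as a plain function rather than fitted here.

```typescript
type Predictor = (x: number[]) => number;
type FitFn = (X: number[][], y: number[]) => Predictor;

interface Row {
  x: number[];
  treated: boolean;
  y: number;
}

// Toy base learner (assumption for the demo): predicts the training mean.
const fitMean: FitFn = (_X, y) => {
  const mean = y.length === 0 ? 0 : y.reduce((a, b) => a + b, 0) / y.length;
  return () => mean;
};

function fitXLearner(rows: Row[], fit: FitFn, propensity: Predictor): Predictor {
  const treated = rows.filter((r) => r.treated);
  const control = rows.filter((r) => !r.treated);
  const XT = treated.map((r) => r.x);
  const XC = control.map((r) => r.x);

  // Stage 1: per-arm outcome models, same as the T-learner.
  const muT = fit(XT, treated.map((r) => r.y));
  const muC = fit(XC, control.map((r) => r.y));

  // Stage 2: imputed treatment effects, then one regression per arm.
  const dT = treated.map((r) => r.y - muC(r.x)); // observed minus predicted control outcome
  const dC = control.map((r) => muT(r.x) - r.y); // predicted treated outcome minus observed
  const tauT = fit(XT, dT);
  const tauC = fit(XC, dC);

  // Stage 3: propensity-weighted combination.
  return (x) => {
    const g = propensity(x);
    return g * tauC(x) + (1 - g) * tauT(x);
  };
}

// Usage: imbalanced arms (2 treated rows, 6 control rows), constant propensity 0.25.
const xDemoRows: Row[] = [
  { x: [1], treated: true, y: 1 },
  { x: [0], treated: true, y: 0 },
  { x: [1], treated: false, y: 0 },
  { x: [0], treated: false, y: 0 },
  { x: [1], treated: false, y: 1 },
  { x: [0], treated: false, y: 0 },
  { x: [1], treated: false, y: 0 },
  { x: [0], treated: false, y: 0 },
];
const xTau = fitXLearner(xDemoRows, fitMean, () => 0.25);
console.log(xTau([1]));
```

With g(x) small (few treated rows), the combination leans on τ_T, which was imputed using μ_C, the model fit on the larger control arm.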
API
| Param | Required | Default | Description |
|---|---|---|---|
| customerId | yes | none | Target customer. |
| method | no | t_learner | One of t_learner, x_learner. |
| offerIds | no | all active | Comma-separated offer IDs to score. |
| mode | no | marginal | marginal (cheap offer-vs-category posteriors) or fitted (real per-row T-/X-learner trained on interaction_history). |
| channelId | no | none | Score-time channel context (only used by mode=fitted). |
| direction | no | inbound | Score-time direction: inbound or outbound (only used by mode=fitted). |
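As an illustration of how the parameters in the table fit together, here is a sketch that builds the request URL. It assumes the parameters travel as query-string values and that the route is served under /api/v1/algorithm-models/{id}/uplift (inferred from the route file path). The `UpliftQuery` type, the `buildUpliftUrl` helper, and the example host are hypothetical.

```typescript
// Hypothetical request shape mirroring the parameter table.
interface UpliftQuery {
  customerId: string;
  method?: "t_learner" | "x_learner";
  offerIds?: string[];
  mode?: "marginal" | "fitted";
  channelId?: string;
  direction?: "inbound" | "outbound";
}

function buildUpliftUrl(baseUrl: string, modelId: string, q: UpliftQuery): string {
  const params = new URLSearchParams({ customerId: q.customerId });
  // Optional parameters are only sent when set, so server defaults apply otherwise.
  if (q.method) params.set("method", q.method);
  if (q.offerIds && q.offerIds.length > 0) params.set("offerIds", q.offerIds.join(","));
  if (q.mode) params.set("mode", q.mode);
  if (q.channelId) params.set("channelId", q.channelId);
  if (q.direction) params.set("direction", q.direction);
  const path = `/api/v1/algorithm-models/${encodeURIComponent(modelId)}/uplift`;
  return `${baseUrl}${path}?${params.toString()}`;
}

// Usage: score two offers with the fitted X-learner.
const upliftUrl = buildUpliftUrl("https://example.test", "m1", {
  customerId: "c42",
  method: "x_learner",
  offerIds: ["o1", "o2"],
  mode: "fitted",
});
console.log(upliftUrl);
```

`URLSearchParams` percent-encodes the comma in offerIds; a server that parses the query string normally sees the comma-separated list again after decoding.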
confidence is a sample-size heuristic: 1 − exp(−min(n_offer, n_category) / 50). At n=50 → 0.63, n=200 → 0.98.
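The heuristic above is a one-liner; a sketch for reference. The function name and the clamp to zero for negative counts are assumptions.

```typescript
// confidence = 1 - exp(-min(n_offer, n_category) / 50)
function upliftConfidence(nOffer: number, nCategory: number, scale: number = 50): number {
  // The smaller evidence count bounds the confidence; negative counts are treated as 0.
  const n = Math.max(0, Math.min(nOffer, nCategory));
  return 1 - Math.exp(-n / scale);
}

console.log(upliftConfidence(50, 1000).toFixed(2));  // "0.63"
console.log(upliftConfidence(200, 200).toFixed(2));  // "0.98"
```

Taking the minimum means a heavily observed category cannot raise confidence for an offer with little evidence of its own.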
Honest scope
The endpoint at /algorithm-models/{id}/uplift uses the platform’s existing ModelAdaptation posteriors as the base learners:
- μ_T(offer) = offer-scope positiveRate
- μ_C(offer) = category-scope positiveRate (control proxy: “what this customer would do if shown a different offer in the same category”)
- g(offer) = offer-evidence / (offer-evidence + category-evidence)

This is the marginal mode (mode=marginal, the default): estimates are per-offer, not per-customer-features. It’s the right starting point because we already have the data.
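A sketch of the marginal quantities described above, computed from two posteriors. The `ScopePosterior` shape and the `evidence` field name are hypothetical (only positiveRate is named in the text), and the 0.5 fallback for zero total evidence is an assumption. τ is computed here as the plain difference μ_T − μ_C.

```typescript
// Hypothetical view of a ModelAdaptation posterior at one scope.
interface ScopePosterior {
  positiveRate: number; // posterior conversion rate at this scope
  evidence: number;     // amount of evidence behind the rate (assumed field name)
}

function marginalUplift(offer: ScopePosterior, category: ScopePosterior) {
  const muT = offer.positiveRate;    // treated arm: this offer
  const muC = category.positiveRate; // control proxy: other offers in the same category
  const total = offer.evidence + category.evidence;
  // Evidence-fraction propensity; 0.5 when there is no evidence at all (assumption).
  const g = total > 0 ? offer.evidence / total : 0.5;
  return { muT, muC, g, tau: muT - muC };
}

// Usage: offer converts at 12% on 30 observations, category at 8% on 90.
const marginal = marginalUplift(
  { positiveRate: 0.12, evidence: 30 },
  { positiveRate: 0.08, evidence: 90 },
);
console.log(marginal);
```

Nothing in this calculation depends on customer features, which is why marginal mode returns the same τ for every customer scored against a given offer.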
Passing ?mode=fitted switches the same endpoint to a real per-row T-learner / X-learner: it fits μ_T / μ_C as separate logistic regressions on the treated vs. control (same-category) subsets of interaction_history — up to MAX_TRAINING_ROWS = 5000 most-recent outcome rows — scored at the request-time context (channelId, direction). For method=x_learner in fitted mode it additionally fits the stage-2 lift regressors and a real propensity model g(x), but only when both arms have ≥ 30 rows; otherwise it falls back to constant stage-2 closures + the evidence-fraction propensity. Fitted results are cached in-process (100-entry LRU, 5-minute TTL, keyed on the latest interaction_history timestamp so new training data invalidates the cache).
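The caching behaviour described above (bounded LRU, TTL, key that changes when new training data arrives) can be sketched as below. This is not the platform's implementation: the class, the key format, and the injectable `now` argument are assumptions chosen to keep the example testable.

```typescript
interface CacheEntry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

class TtlLruCache<V> {
  private entries = new Map<string, CacheEntry<V>>();
  private maxEntries: number;
  private ttlMs: number;

  constructor(maxEntries: number = 100, ttlMs: number = 5 * 60 * 1000) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
  }

  get(key: string, now: number = Date.now()): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= now) {
      this.entries.delete(key); // expired
      return undefined;
    }
    // Re-insert so this key becomes the most recently used (Map keeps insertion order).
    this.entries.delete(key);
    this.entries.set(key, hit);
    return hit.value;
  }

  set(key: string, value: V, now: number = Date.now()): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}

// Hypothetical key: including the latest interaction_history timestamp means
// a new outcome row produces a new key, so stale fits are never reused.
function fittedCacheKey(modelId: string, method: string, latestInteractionAt: string): string {
  return [modelId, method, latestInteractionAt].join("|");
}
```

Putting the latest timestamp in the key avoids explicit invalidation: old entries simply stop being requested and age out through the TTL or LRU eviction.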
To plug entirely custom base learners, call the math library directly — see platform/src/lib/experimentation/uplift-cate.ts. The math is identical; only the model fitting changes.
How it improves ranking
When the uplift weight is non-zero, the final decision score includes an uplift term
weighted by Wu, the uplift weight (default 0 for backward compat). Set it through a
Ranking Profile’s uplift weight key (range 0..1) or the inline Score-node
formula.upliftWeight (range 0..2); both map to the same upliftWeight term. The
profile uplift key is now actually wired into the formula; it was previously
documented but stripped by validation. With Wu > 0, the persuadable segment
(positive τ) dominates the top of the ranking and sleeping dogs (negative τ) drop.
The engine stamps upliftTau and upliftMultiplier on each candidate’s trace. Test
this on a holdout cohort before raising Wu above ~0.1 in production: uplift is a
more aggressive ranker than pure propensity and can suppress evergreen offers if
mis-tuned.
Configuration
Per-tenant in Settings → Models → Uplift:

| Setting | Default | Effect |
|---|---|---|
| upliftMethodDefault | t_learner | Default method used when callers don’t pass method; the only persisted tenant-level uplift setting (GET/PUT /api/v1/tenant-settings). |
classifyConfig argument.
The ranking weight Wu is not a tenant setting — it lives on the scoring
config: set it via a Ranking Profile’s uplift
weight key (range 0..1) or the Score node’s inline formula.upliftWeight
(range 0..2). Both default to 0. Raise carefully.
References
- Künzel, Sekhon, Bickel, Yu (2019). “Metalearners for estimating heterogeneous treatment effects using machine learning.” PNAS 116(10): 4156–4165. The canonical reference for both T-learner and X-learner. pnas.org/doi/10.1073/pnas.1804597116 · arxiv.org/abs/1706.03461
- Wager & Athey (2018). “Estimation and Inference of Heterogeneous Treatment Effects using Random Forests.” Journal of the American Statistical Association 113(523): 1228–1242. arxiv.org/abs/1510.04342. Causal forests: the obvious next step after the metalearners.
- Dudík, Langford, Li (2011). “Doubly Robust Policy Evaluation and Learning.” ICML. arxiv.org/abs/1103.4601. Off-policy evaluator for contextual bandits — pairs naturally with X-learner.
- Radcliffe & Surry (2011). “Real-World Uplift Modelling with Significance-Based Uplift Trees.” Stochastic Solutions white paper. Practitioner-style intro; useful framing for marketing teams.
Code
- Library: platform/src/lib/experimentation/uplift-cate.ts
- HTTP route: platform/src/app/api/v1/algorithm-models/[id]/uplift/route.ts
- Population-level uplift (pre-existing, two-proportion z-test): platform/src/lib/experimentation/uplift.ts