
modelType: "neural_cf" — a neural collaborative-filtering model with separate user and offer embedding tables feeding into a small MLP. The score for a (customer, offer) pair is sigmoid(MLP(userEmbedding ⊕ offerEmbedding)). Trained with binary cross-entropy via SGD.

When to use

  • You have meaningful interaction density — at least 50–100 interactions per high-traffic customer or offer. Below this, embeddings are noise.
  • You suspect collaborative signal matters — “customers similar to Alice respond to similar offers”. Demographic features alone don’t capture this.
  • You want a model that improves with scale — embeddings get sharper as more interactions land.
Skip it when the customer cohort is small (< 10k customers with > 5 interactions each), when offers churn rapidly (each new offer needs an embedding learned from interactions), or when interpretability is critical (KernelSHAP works but is approximate).

The math

userEmb  = userEmbeddings[customerId]         ∈ R^D
itemEmb  = itemEmbeddings[offerId]            ∈ R^D
input    = [userEmb; itemEmb]                 ∈ R^(2D)
hidden   = relu(input · W_hidden + b_hidden)  ∈ R^H
output   = hidden · W_output + b_output       ∈ R
score    = sigmoid(output)                    ∈ [0, 1]
D = embeddingDim (typically 8–32), H = hiddenDim (typically 16–64). Larger dimensions capture more nuance but need more data and compute.
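
For concreteness, here is a minimal TypeScript sketch of that forward pass. The type and function names are illustrative assumptions, not the engine's actual internals, and W_hidden is stored as one row per hidden unit (the transpose of the input · W_hidden layout above):

// Illustrative forward pass matching the equations above (names are assumptions).
type NeuralCFState = {
  userEmbeddings: Record<string, number[]>; // customerId -> R^D
  itemEmbeddings: Record<string, number[]>; // offerId    -> R^D
  wHidden: number[][];                      // H rows of length 2D
  bHidden: number[];                        // R^H
  wOutput: number[];                        // R^H
  bOutput: number;
};

const sigmoid = (x: number): number => 1 / (1 + Math.exp(-x));

function forward(state: NeuralCFState, customerId: string, offerId: string): number {
  const input = [
    ...state.userEmbeddings[customerId],
    ...state.itemEmbeddings[offerId],
  ]; // [userEmb; itemEmb] ∈ R^(2D)
  const hidden = state.wHidden.map((row, j) =>
    Math.max(0, row.reduce((sum, w, i) => sum + w * input[i], state.bHidden[j])) // relu
  );
  const output = state.wOutput.reduce((sum, w, j) => sum + w * hidden[j], state.bOutput);
  return sigmoid(output); // score ∈ [0, 1]
}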

Fixture config

The engine ships an initNeuralCF(embDim, hiddenDim) helper that produces Xavier-initialized weights and empty embedding tables:
import { initNeuralCF } from "@/lib/scoring/neural-cf";

const state = initNeuralCF(4, 8);  // 4-dim embeddings, 8-dim hidden
state.userEmbeddings["cust-42"]    = [0.2, -0.1, 0.5,  0.3];
state.itemEmbeddings["off-travel"] = [0.3,  0.2, 0.4, -0.1];
// ...etc per (customer, offer) seen during training
The proof script with the standard test customer + 3 fixture offers produces (random Xavier init, exact values vary per process):
  • off-travel: ~0.55
  • off-cashback: ~0.59
  • off-nofee: ~0.65
The differential between offers — driven by embedding similarity — is the signal even when the absolute values are random.
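
If a caller only needs the ordering, a tiny sketch of ranking by that differential (scores hard-coded from the sample run above; nothing engine-specific):

const sampleScores = { "off-travel": 0.55, "off-cashback": 0.59, "off-nofee": 0.65 };
const ranked = Object.entries(sampleScores)
  .sort(([, a], [, b]) => b - a)      // highest score first
  .map(([offerId]) => offerId);
// => ["off-nofee", "off-cashback", "off-travel"]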

Training

Mini-batch SGD with binary cross-entropy. The engine’s training routine accepts a list of {customerId, offerId, label} tuples and runs gradient descent over the full network (embeddings + MLP). Initial training: run offline on the full interaction history. Subsequent updates: incremental retraining on rolling windows. Don’t update embeddings on every interaction in production — too slow and too noisy. Batch up.
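
A hedged sketch of the gradient step for a single example, reusing the NeuralCFState shape from the sketch under "The math". All names here are assumptions, and the engine's routine operates over mini-batches; a true mini-batch step would average these gradients before applying them:

// Conceptual single-example SGD step for binary cross-entropy (names assumed).
function sgdStepSketch(
  state: NeuralCFState,
  customerId: string,
  offerId: string,
  label: 0 | 1,
  lr = 0.01
): number {
  const u = state.userEmbeddings[customerId];
  const v = state.itemEmbeddings[offerId];
  const x = [...u, ...v]; // [userEmb; itemEmb] ∈ R^(2D)

  // Forward pass, same computation as the equations in "The math".
  const z = state.wHidden.map((row, j) => row.reduce((s, w, i) => s + w * x[i], state.bHidden[j]));
  const h = z.map((zj) => Math.max(0, zj));
  const logit = state.wOutput.reduce((s, w, j) => s + w * h[j], state.bOutput);
  const p = 1 / (1 + Math.exp(-logit));
  const loss = -(label * Math.log(p + 1e-12) + (1 - label) * Math.log(1 - p + 1e-12));

  // Backward pass: for sigmoid + binary cross-entropy, dLoss/dLogit = p - label.
  const dLogit = p - label;
  const dH = state.wOutput.map((w) => dLogit * w);
  const dZ = dH.map((g, j) => (z[j] > 0 ? g : 0)); // relu gradient
  const dX = x.map((_, i) => state.wHidden.reduce((s, row, j) => s + dZ[j] * row[i], 0));

  // SGD updates: output layer, hidden layer, then both embedding rows.
  state.wOutput = state.wOutput.map((w, j) => w - lr * dLogit * h[j]);
  state.bOutput -= lr * dLogit;
  state.wHidden = state.wHidden.map((row, j) => row.map((w, i) => w - lr * dZ[j] * x[i]));
  state.bHidden = state.bHidden.map((b, j) => b - lr * dZ[j]);
  const D = u.length;
  state.userEmbeddings[customerId] = u.map((e, i) => e - lr * dX[i]);
  state.itemEmbeddings[offerId] = v.map((e, i) => e - lr * dX[D + i]);

  return loss; // per-example binary cross-entropy
}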

Score interpretation

  • score ∈ [0, 1] — the neural network’s prediction of positive response.
  • Not as calibrated as logistic_regression or Bayesian — the sigmoid output drifts from true probability as the MLP saturates.
  • KernelSHAP explanations (shapValues, when computeShap: true) attribute the score to each embedding dimension. Useful for debugging “why did this customer get this score” but not human-readable like Bayesian’s per-field log-likelihoods.
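
A hedged sketch of reading those attributions for debugging. The result shape, a flat shapValues array aligned with the concatenated embedding dimensions, is an assumption for illustration, not the documented return type:

// Hypothetical: surface the embedding dimensions that moved this score the most.
type ScoredOffer = { offerId: string; score: number; shapValues?: number[] };

function topAttributions(result: ScoredOffer, k = 3): { dim: number; shap: number }[] {
  return (result.shapValues ?? [])
    .map((shap, dim) => ({ dim, shap }))
    .sort((a, b) => Math.abs(b.shap) - Math.abs(a.shap))
    .slice(0, k); // dimensions that pushed the score hardest, in either direction
}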

Pitfalls

  • Cold-start for new users or offers — the embedding is initialized to zeros (or Xavier random). Predictions are noise until the model has seen ≥ 5 interactions with that user or offer.
  • Embedding drift — embeddings change every retrain. Two scores from before-and-after a retrain aren’t directly comparable. Pin the training cadence and version your embeddings.
  • Embedding-space collapse — without regularization or a contrastive loss, embeddings can converge to the same vector for all users / all offers. Watch the variance of the embedding tables; if it drops over training epochs, add weight decay.
  • Simpler models win at small scale — if you have < 100k interactions, a logistic_regression with one-hot offer features will likely outperform neural_cf.
  • Inference cost — every score call runs the full forward pass. For a 32-dim embedding and a 64-dim hidden layer, that’s roughly 4k multiply-adds per candidate, so on the order of 400k per request when scoring 100 candidates (worked out after this list). Profile if latency matters.
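
The arithmetic behind that estimate, as a standalone snippet (constants only; nothing here depends on the engine):

// Rough multiply-add count for one forward pass with D = 32, H = 64.
const D = 32, H = 64;                          // embeddingDim, hiddenDim
const inputDim = 2 * D;                        // concatenated user + offer embedding = 64
const hiddenMacs = inputDim * H;               // 64 * 64 = 4096
const outputMacs = H;                          // 64
const perCandidate = hiddenMacs + outputMacs;  // ≈ 4.2k per (customer, offer) pair
const perRequest = 100 * perCandidate;         // ≈ 416k for 100 candidate offers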

Cross-reference