modelType: "neural_cf" — a neural collaborative-filtering model with separate user and offer embedding tables feeding into a small MLP. The score for a (customer, offer) pair is sigmoid(MLP(userEmbedding ⊕ offerEmbedding)). Trained with binary cross-entropy via SGD.
When to use
- You have meaningful interaction density — at least 50–100 interactions per high-traffic customer or offer. Below this, embeddings are noise.
- You suspect collaborative signal matters — “customers similar to Alice respond to similar offers”. Demographic features alone don’t capture this.
- You want a model that improves with scale — embeddings get sharper as more interactions land.
The math
D = embeddingDim (typically 8–32), H = hiddenDim (typically 16–64). Larger dimensions capture more nuance but need more data and compute.
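The scoring formula sigmoid(MLP(userEmbedding ⊕ offerEmbedding)) can be sketched as follows. This is an illustrative one-hidden-layer forward pass, not the engine’s actual implementation; the parameter names (w1, b1, w2, b2) and the ReLU hidden activation are assumptions.

```typescript
const sigmoid = (x: number): number => 1 / (1 + Math.exp(-x));

// Hypothetical forward pass for one (customer, offer) pair.
// userEmb and offerEmb have length D; w1 is H x 2D, w2 has length H.
function scorePair(
  userEmb: number[],
  offerEmb: number[],
  w1: number[][],
  b1: number[],
  w2: number[],
  b2: number,
): number {
  const x = [...userEmb, ...offerEmb]; // concatenation: userEmbedding ⊕ offerEmbedding
  const h = w1.map((row, i) =>
    Math.max(0, row.reduce((s, wij, j) => s + wij * x[j], b1[i])), // ReLU hidden layer
  );
  const logit = w2.reduce((s, wi, i) => s + wi * h[i], b2);
  return sigmoid(logit); // score in (0, 1)
}
```

With all-zero weights the logit is 0 and the score is exactly 0.5, which is a handy sanity check before training.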
Fixture config
The engine ships an initNeuralCF(embDim, hiddenDim) helper that produces Xavier-initialized weights and empty embedding tables:
- off-travel: ~0.55
- off-cashback: ~0.59
- off-nofee: ~0.65
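A minimal sketch of what an initNeuralCF-style helper might produce — empty embedding tables plus Xavier (Glorot) uniform MLP weights. Everything beyond the (embDim, hiddenDim) signature is an assumption about the model’s shape, not the engine’s actual code.

```typescript
// Glorot uniform initialization: draws from U(-limit, limit)
// with limit = sqrt(6 / (fanIn + fanOut)).
function xavierMatrix(rows: number, cols: number): number[][] {
  const limit = Math.sqrt(6 / (rows + cols));
  return Array.from({ length: rows }, () =>
    Array.from({ length: cols }, () => (Math.random() * 2 - 1) * limit),
  );
}

// Hypothetical model container: embedding tables start empty and are
// filled lazily as customers and offers first appear in training data.
function initNeuralCF(embDim: number, hiddenDim: number) {
  return {
    userEmbeddings: new Map<string, number[]>(),
    offerEmbeddings: new Map<string, number[]>(),
    w1: xavierMatrix(hiddenDim, 2 * embDim), // hidden layer over concatenated embeddings
    b1: new Array(hiddenDim).fill(0),
    w2: xavierMatrix(1, hiddenDim)[0],       // output layer
    b2: 0,
  };
}
```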
Training
Mini-batch SGD with binary cross-entropy. The engine’s training routine accepts a list of {customerId, offerId, label} tuples and runs gradient descent over the full network (embeddings + MLP).
Initial training: run offline on the full interaction history. Subsequent updates: incremental retraining on rolling windows. Don’t update embeddings on every interaction in production — too slow and too noisy. Batch up.
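The gradient math behind a mini-batch step is compact: for binary cross-entropy on a sigmoid output, dL/dlogit = p − y. The sketch below shows the loss and one SGD update on the output layer only (full training also backprops into the hidden layer and both embedding tables); all names are illustrative, not the engine’s API.

```typescript
// Binary cross-entropy: L = -(y*log p + (1-y)*log(1-p)), clamped to avoid log(0).
function bceLoss(p: number, y: number): number {
  const eps = 1e-9;
  return -(y * Math.log(p + eps) + (1 - y) * Math.log(1 - p + eps));
}

// One SGD step on the output-layer weights for a single example.
// hidden: hidden activations, p: model score, y: label, lr: learning rate.
function sgdStepOutput(
  w2: number[], b2: number,
  hidden: number[], p: number, y: number, lr: number,
): { w2: number[]; b2: number } {
  const dLogit = p - y; // combined BCE + sigmoid gradient
  return {
    w2: w2.map((w, i) => w - lr * dLogit * hidden[i]),
    b2: b2 - lr * dLogit,
  };
}
```

In a real mini-batch, the per-example gradients are averaged before the update, which is what makes batched retraining cheaper and less noisy than per-interaction updates.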
Score interpretation
- score ∈ [0, 1] — the neural network’s prediction of positive response.
- Not as calibrated as logistic_regression or Bayesian — the sigmoid output drifts from true probability as the MLP saturates.
- KernelSHAP explanations — shapValues (when computeShap: true) attribute the score to each embedding dimension. Useful for debugging “why did this customer get this score”, but not human-readable like Bayesian’s per-field log-likelihoods.
Pitfalls
- Cold-start for new users or offers — the embedding is initialized to zeros (or Xavier random). Predictions are noise until the model has seen ≥ 5 interactions with that user or offer.
- Embedding drift — embeddings change every retrain. Two scores from before-and-after a retrain aren’t directly comparable. Pin the training cadence and version your embeddings.
- Embedding-space collapse — without regularization or a contrastive loss, embeddings can converge to the same vector for all users / all offers. Watch the variance of the embedding tables; if it drops over training epochs, add weight decay.
- Simpler models work better at small scale — if you have < 100k interactions, a logistic_regression with one-hot offer features will likely outperform neural_cf.
- Inference cost — every score call runs the full forward pass. For a 32-dim embedding (64-dim concatenated input) and a 64-dim hidden layer, that’s roughly 64 × 64 ≈ 4k multiply-adds per candidate, or ~400k for 100 candidates per request. Profile if latency matters.
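The collapse check described above — watching the variance of the embedding tables — can be sketched as a per-dimension variance monitor. The function name and Map-based table layout are assumptions for illustration.

```typescript
// Mean per-dimension variance across an embedding table.
// A value trending toward zero over training epochs suggests
// embedding-space collapse (all vectors converging to one point).
function embeddingVariance(table: Map<string, number[]>): number {
  const vecs = [...table.values()];
  if (vecs.length < 2) return 0;
  const dim = vecs[0].length;
  let total = 0;
  for (let d = 0; d < dim; d++) {
    const mean = vecs.reduce((s, v) => s + v[d], 0) / vecs.length;
    total += vecs.reduce((s, v) => s + (v[d] - mean) ** 2, 0) / vecs.length;
  }
  return total / dim;
}
```

Logging this metric once per retrain is cheap; if it drops epoch over epoch, that is the signal to add weight decay.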
Cross-reference
- Algorithm Selection Guide.
- SHAP — KernelSHAP runs on neural_cf when computeShap: true.