The Starbucks dataset is the flagship dataset pack for KaireonAI. It creates a complete decisioning environment modeled after a coffee rewards program with 10 promotional offers across 4 channels, qualification rules, contact policies, adaptive models, experiments, and a customer segment. This guide walks you through loading the dataset, exploring what gets created, running recommendations, recording outcomes, and using adaptive models.

What Gets Created

Entity | Count | Details
Data Schemas + DDL Tables | 3 | starbucks_customers (1000 rows), starbucks_offers (10 rows), starbucks_events (500 rows)
Categories | 3 | Acquisition (BOGO), Retention (Discount), Engagement (Informational)
Channels | 4 | Web (banner), Email (email), Mobile (push), Social (social_post)
Offers | 10 | 4 BOGO, 4 Discount, 2 Informational; priorities 55-85
Creatives | 40 | 10 offers x 4 channels, each with channel-specific content
Qualification Rules | 2 | Min Age 18, Min Income $30k
Contact Policies | 2 | Max 3/day frequency cap, 24hr cooldown
Outcome Types | 5 | impression, click, accept, convert, dismiss
Algorithm Models | 3 | Scorecard, Bayesian (Naive Bayes), Thompson Bandit
Experiment | 1 | Scorecard vs Bayesian (80/20 split)
Decision Flow | 1 | Full pipeline with 7 stages
Segment | 1 | High-Income Members (income > $70k, tenure > 365 days)
Interaction History | 500 | 90-day window of synthetic interactions
Interaction Summaries | 500 | Materialized aggregates for contact policy enforcement

Step-by-Step Walkthrough

1. Load the Starbucks Dataset
curl -X POST "http://localhost:3000/api/v1/seed-dataset/starbucks?force=true" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest"

Expected Response (201 Created)

{
  "success": true,
  "dataset": "starbucks",
  "created": {
    "schemas": 3,
    "categories": 3,
    "subCategories": 3,
    "channels": 4,
    "offers": 10,
    "creatives": 40,
    "qualificationRules": 2,
    "contactPolicies": 2,
    "outcomeTypes": 5,
    "models": 3,
    "experiments": 1,
    "decisionFlows": 1,
    "segments": 1,
    "customerRows": 1000,
    "offerRows": 10,
    "eventRows": 500,
    "interactionHistory": 500,
    "interactionSummaries": 500
  }
}
The ?force=true parameter removes any previously loaded dataset before seeding. Without it, you will get a 409 Conflict if data already exists.
2. Explore the Offers
The dataset creates 10 offers across 3 categories. List them all:
curl "http://localhost:3000/api/v1/actions" \
  -H "X-Requested-With: XMLHttpRequest"

Expected Response — 10 Offers

[
  {
    "id": "uuid-...",
    "name": "Starbucks: Discount — 10 Day Low",
    "status": "active",
    "priority": 85,
    "category": "retention",
    "budget": { "monthlyBudget": 60000, "costPerUnit": 2 },
    "schedule": { "startDate": "2026-01-01", "endDate": "2026-12-31" },
    "creatives": [ "4 creatives, one per channel" ],
    "categoryRef": { "name": "Starbucks: Retention" },
    "subCategoryRef": { "name": "Starbucks: Discount" }
  }
]
Priority (0-100) determines offer ranking. The score formula is (priority / 100) x (creative.weight / 100) x fitMultiplier, so with equal creative weight and fit multiplier, a higher-priority offer scores higher in the pipeline.
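As a worked check of the formula, the Python sketch below plugs in priority 85, creative weight 100, and fit multiplier 1.0, which yields 0.85. The function name priority_weighted_score is hypothetical and not part of the KaireonAI API; only the formula itself comes from this guide.

```python
def priority_weighted_score(priority: float, weight: float, fit_multiplier: float = 1.0) -> float:
    """Hypothetical helper: (priority / 100) x (weight / 100) x fitMultiplier."""
    return (priority / 100) * (weight / 100) * fit_multiplier


# Priority 85, creative weight 100, fit multiplier 1.0
score = priority_weighted_score(85, 100, 1.0)
print(round(score, 4))  # 0.85
```

Halving the creative weight to 50 would halve the score to 0.425, which shows how a creative's weight can offset its offer's priority.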
3. Run a Recommendation
Call the Recommend API with customer attributes to get personalized offers:
Qualified Customer
curl -X POST "http://localhost:3000/api/v1/recommend" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest" \
  -d '{
  "customerId": "cust_001",
  "limit": 5,
  "attributes": {
    "starbucks_customers.age": 30,
    "starbucks_customers.income": 65000
  }
}'
With Debug Trace
curl -X POST "http://localhost:3000/api/v1/recommend?debug=true" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest" \
  -d '{
  "customerId": "cust_002",
  "limit": 3,
  "channel": "email",
  "attributes": {
    "starbucks_customers.age": 25,
    "starbucks_customers.income": 45000
  }
}'
Missing Attributes (Blocked)
curl -X POST "http://localhost:3000/api/v1/recommend?debug=true" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest" \
  -d '{
  "customerId": "cust_003",
  "limit": 5
}'

Response — Qualified Customer

{
  "interactionId": "a1b2c3d4-...",
  "customerId": "cust_001",
  "count": 5,
  "decisions": [
    {
      "creativeId": "creative-uuid-1",
      "creativeName": "Starbucks: Discount — 10 Day Low — Web",
      "offerId": "offer-uuid-1",
      "offerName": "Starbucks: Discount — 10 Day Low",
      "category": "retention",
      "channelType": "Banner",
      "channelName": "Starbucks: Web",
      "content": {
        "headline": "Save on Your Next Visit",
        "subline": "10% off any drink for 10 days",
        "cta": "Claim Offer",
        "imageUrl": "/images/starbucks-discount.jpg"
      },
      "score": 0.85,
      "rank": 1,
      "priority": 85,
      "scoreExplanation": {
        "method": "priority_weighted",
        "priority": 85,
        "weight": 100,
        "fitMultiplier": 1.0,
        "finalScore": 0.85
      }
    }
  ]
}
Attributes use schema-prefixed dot notation as literal keys, not nested objects. The qualification engine looks up context.attributes["starbucks_customers.age"] directly, so the key must match exactly:
// CORRECT
"attributes": { "starbucks_customers.age": 30 }

// WRONG — will not match qualification rules
"attributes": { "age": 30 }
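If your customer records use plain field names, a small client-side helper can add the schema prefix before calling the Recommend API. The sketch below is one way to do that in Python; prefix_attributes is a hypothetical name, not a KaireonAI function, and the payload fields mirror the curl examples in this guide.

```python
from typing import Any, Dict


def prefix_attributes(schema: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: turn {"age": 30} into {"starbucks_customers.age": 30}."""
    return {f"{schema}.{field}": value for field, value in record.items()}


customer = {"age": 30, "income": 65000}
payload = {
    "customerId": "cust_001",
    "limit": 5,
    # Flat, schema-prefixed keys rather than a nested object
    "attributes": prefix_attributes("starbucks_customers", customer),
}
print(payload["attributes"])
```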
4. Record an Outcome
After delivering an offer, record the customer’s response via the Respond API:
Record a Click
curl -X POST "http://localhost:3000/api/v1/respond" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest" \
  -d '{
  "customerId": "cust_001",
  "creativeId": "<creative-uuid-from-decision>",
  "outcome": "click",
  "interactionId": "<interactionId-from-recommend-response>",
  "idempotencyKey": "click-cust001-creative123-20260304",
  "context": {
    "device": "mobile",
    "sessionId": "sess_abc123"
  }
}'
Record a Conversion
curl -X POST "http://localhost:3000/api/v1/respond" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest" \
  -d '{
  "customerId": "cust_001",
  "creativeId": "<creative-uuid>",
  "outcome": "convert",
  "interactionId": "<interactionId>",
  "idempotencyKey": "convert-cust001-order-12345",
  "conversionValue": 47.50,
  "outcomeDetails": {
    "orderId": "order-12345",
    "productSku": "frappuccino-venti"
  }
}'
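Both requests carry an idempotencyKey. This guide does not spell out how the server treats a repeated key, so the Python sketch below only shows one way to keep the key stable across client retries: derive it from the outcome, the customer, and a business reference instead of generating a random value. The function build_respond_payload and the key format are assumptions for illustration.

```python
from typing import Any, Dict, Optional


def build_respond_payload(
    customer_id: str,
    creative_id: str,
    outcome: str,
    interaction_id: str,
    business_ref: str,
    conversion_value: Optional[float] = None,
) -> Dict[str, Any]:
    """Hypothetical helper that assembles a /respond request body."""
    payload: Dict[str, Any] = {
        "customerId": customer_id,
        "creativeId": creative_id,
        "outcome": outcome,
        "interactionId": interaction_id,
        # Assumed key format: same inputs always give the same key
        "idempotencyKey": f"{outcome}-{customer_id}-{business_ref}",
    }
    if conversion_value is not None:
        payload["conversionValue"] = conversion_value
    return payload


first = build_respond_payload(
    "cust_001", "<creative-uuid>", "convert", "<interactionId>", "order-12345", 47.50
)
retry = build_respond_payload(
    "cust_001", "<creative-uuid>", "convert", "<interactionId>", "order-12345", 47.50
)
print(first["idempotencyKey"] == retry["idempotencyKey"])  # True
```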
5. Train an Adaptive Model
Trigger a manual training run for any of the 3 pre-loaded models:
curl -X POST "http://localhost:3000/api/v1/algorithm-models/<model-uuid>/train" \
  -H "X-Requested-With: XMLHttpRequest"

Training Response

{
  "modelId": "model-uuid",
  "modelType": "bayesian",
  "sampleCount": 500,
  "metrics": {
    "accuracy": 0.72,
    "precision": 0.68,
    "recall": 0.76,
    "f1": 0.72,
    "auc": 0.74,
    "trainedAt": "2026-03-04T15:10:00Z"
  },
  "status": "success"
}
6. Check the Experiment
The pre-loaded experiment splits traffic 80/20 between Scorecard (champion) and Bayesian (challenger):
curl "http://localhost:3000/api/v1/experiments" \
  -H "X-Requested-With: XMLHttpRequest"

Key Features Demonstrated

Qualification Rules

Two hard-gate rules filter customers:
Rule | Type | Condition | Priority
Min Age 18 | attribute_condition | starbucks_customers.age >= 18 | 100 (Critical)
Min Income $30k | attribute_condition | starbucks_customers.income >= 30000 | 90 (Critical)
Missing attributes cause the rule to fail closed — the customer is ineligible.
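The fail-closed behaviour can be illustrated with a short Python sketch. It is a simplified model of the two rules listed in this section, not the engine's actual implementation; the RULES structure and the failed_rules function are hypothetical.

```python
from typing import Any, Dict, List, Tuple

# (rule name, attribute key, minimum value), taken from the two rules in this section
RULES: List[Tuple[str, str, float]] = [
    ("Min Age 18", "starbucks_customers.age", 18),
    ("Min Income $30k", "starbucks_customers.income", 30000),
]


def failed_rules(attributes: Dict[str, Any]) -> List[str]:
    """Return the names of rules that block the customer; missing keys fail closed."""
    failed = []
    for name, key, minimum in RULES:
        value = attributes.get(key)
        if value is None or value < minimum:
            failed.append(name)
    return failed


print(failed_rules({"starbucks_customers.age": 30, "starbucks_customers.income": 65000}))  # []
print(failed_rules({}))  # ['Min Age 18', 'Min Income $30k']
# Unprefixed keys are treated as missing, so both rules fail here as well
print(failed_rules({"age": 30, "income": 65000}))  # ['Min Age 18', 'Min Income $30k']
```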

Contact Policies

Policy | Type | Config
Max 3 per Day | frequency_cap | maxPerDay: 3
24hr Cooldown | cooldown | cooldownHours: 24
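To make the two policy types concrete, here is a minimal Python sketch of each check. Several details are assumptions, since this guide does not state them: that "per day" means the calendar day of the request, and which scope (customer, offer, or channel) each policy counts contacts over. The function names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Optional

MAX_PER_DAY = 3      # maxPerDay from the frequency_cap policy
COOLDOWN_HOURS = 24  # cooldownHours from the cooldown policy


def frequency_cap_blocks(contacts: List[datetime], now: datetime) -> bool:
    """True when MAX_PER_DAY contacts already fall on the same calendar day as `now`."""
    today = [t for t in contacts if t.date() == now.date()]
    return len(today) >= MAX_PER_DAY


def cooldown_blocks(last_contact: Optional[datetime], now: datetime) -> bool:
    """True when the previous contact is less than COOLDOWN_HOURS before `now`."""
    if last_contact is None:
        return False
    return now - last_contact < timedelta(hours=COOLDOWN_HOURS)


now = datetime(2026, 3, 4, 15, 0)
contacts = [datetime(2026, 3, 4, 9), datetime(2026, 3, 4, 11), datetime(2026, 3, 4, 13)]
print(frequency_cap_blocks(contacts, now))             # True: 3 contacts today
print(cooldown_blocks(datetime(2026, 3, 3, 14), now))  # False: 25 hours ago
```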

Three Model Types

Model | Type | How It Learns | Best For
Scorecard | scorecard | Rule-based (manual) | Transparent, explainable scoring
Bayesian | bayesian | Auto-incremental, updates every 100 outcomes | Adapts to changing preferences
Thompson Bandit | thompson_bandit | Updates Beta(alpha, beta) per arm on each outcome | Exploration/exploitation for new offers
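The Thompson Bandit's per-outcome update can be sketched in a few lines of Python. This is the textbook Beta-Bernoulli form of Thompson sampling, not KaireonAI's code: the uniform Beta(1, 1) prior, the Arm class, and the choice of which outcomes count as a success are all assumptions.

```python
import random
from dataclasses import dataclass
from typing import Dict


@dataclass
class Arm:
    alpha: float = 1.0  # 1 + observed successes (assumed uniform prior)
    beta: float = 1.0   # 1 + observed failures

    def update(self, success: bool) -> None:
        """Update the Beta posterior after a single outcome."""
        if success:
            self.alpha += 1
        else:
            self.beta += 1

    def sample(self, rng: random.Random) -> float:
        """Draw a plausible success rate from Beta(alpha, beta)."""
        return rng.betavariate(self.alpha, self.beta)


def choose_arm(arms: Dict[str, Arm], rng: random.Random) -> str:
    """Pick the arm whose sampled success rate is highest."""
    return max(arms, key=lambda name: arms[name].sample(rng))


arms = {"offer_a": Arm(), "offer_b": Arm()}
for _ in range(20):
    arms["offer_a"].update(True)   # e.g. a click or conversion
    arms["offer_b"].update(False)  # e.g. a dismiss

rng = random.Random(0)
picks = [choose_arm(arms, rng) for _ in range(200)]
print(picks.count("offer_a"), picks.count("offer_b"))
```

Because each decision samples from the posterior rather than using its mean, an arm with little data still gets chosen occasionally, which is where the exploration for new offers comes from.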

Decision Pipeline

The pipeline processes candidates through 7 stages:
  1. Inventory — Filters by schedule, flattens offers x creatives into candidates
  2. Qualification — Hard-gate rules (age, income, segment). Fail = removed
  3. Contact Policy — Frequency cap, cooldown, mutual exclusion
  4. Consent — Channel-level consent check
  5. Guardrails — Business constraint rules
  6. Scoring — Model-based or priority-weighted
  7. Ranking — Sort by score, apply limit, diversity constraints
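The first and last stages can be illustrated with a compact Python sketch: flattening offers x creatives into candidates, then scoring, sorting, and applying the limit. The middle stages (qualification through guardrails) are omitted, the fit multiplier is taken as 1.0, and the offer names, priorities, and weights are placeholder data rather than the seeded values.

```python
from itertools import product
from typing import Any, Dict, List

# Placeholder inventory: 10 offers with priorities inside the 55-85 range
offers = [{"name": f"offer_{i}", "priority": 55 + i * 3} for i in range(10)]
channels = ["web", "email", "push", "social"]


def build_candidates(offers: List[Dict[str, Any]], channels: List[str]) -> List[Dict[str, Any]]:
    """Inventory: one candidate per (offer, channel creative) pair."""
    return [
        {"offer": o["name"], "channel": c, "priority": o["priority"], "weight": 100}
        for o, c in product(offers, channels)
    ]


def rank(candidates: List[Dict[str, Any]], limit: int) -> List[Dict[str, Any]]:
    """Scoring and Ranking: priority-weighted score, highest first, cut to `limit`."""
    scored = [dict(c, score=(c["priority"] / 100) * (c["weight"] / 100)) for c in candidates]
    scored.sort(key=lambda c: c["score"], reverse=True)
    return scored[:limit]


candidates = build_candidates(offers, channels)
top = rank(candidates, limit=5)
print(len(candidates))  # 40
print([(c["offer"], c["channel"]) for c in top])
```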

Batch Email via Segments

Use the pre-loaded High-Income Members segment for batch email campaigns:
curl -X POST "http://localhost:3000/api/v1/simulate" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest" \
  -d '{
  "segmentId": "<segment-uuid>",
  "sampleSize": 1000,
  "channelFilter": "email"
}'
This runs the decision pipeline for all ~142 segment members and returns per-customer decisions with email content (subject, body, CTA) ready for your ESP.

What to Look For

  • Qualification blocking: Request without attributes returns 0 decisions. The debugTrace shows which rules blocked each offer.
  • Contact policy enforcement: After 3 recommendations in a day, the frequency cap blocks further contact. The 24hr cooldown prevents re-contact too soon.
  • Experiment routing: The same customer always lands in the same experiment arm (deterministic hashing). 80% of customers get Scorecard scoring, 20% get Bayesian.
  • Adaptive learning: After recording outcomes via /respond, Bayesian models auto-update every 100 outcomes. Thompson Bandit updates alpha/beta per outcome immediately.
  • Multi-channel creatives: Each of the 10 offers has 4 creatives (web banner, email, push, social) with channel-appropriate content.
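The experiment routing described above relies on deterministic hashing. The guide does not say which hash function or key format is used, so the Python sketch below uses SHA-256 and an experiment_key string purely as illustrative assumptions; it shows how hashing the customer ID into 100 buckets gives a stable 80/20 split.

```python
import hashlib


def experiment_arm(
    customer_id: str,
    experiment_key: str = "scorecard-vs-bayesian",  # hypothetical experiment identifier
    champion_share: int = 80,
) -> str:
    """Map a customer to an arm using a stable hash bucket in the range 0-99."""
    digest = hashlib.sha256(f"{experiment_key}:{customer_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "scorecard" if bucket < champion_share else "bayesian"


# The same customer always gets the same arm
print(experiment_arm("cust_001") == experiment_arm("cust_001"))  # True

# Across many customers the split approaches 80/20
assignments = [experiment_arm(f"cust_{i:05d}") for i in range(10000)]
share = assignments.count("scorecard") / len(assignments)
print(round(share, 2))
```

Including the experiment key in the hashed string means a customer's bucket in one experiment does not determine their bucket in another.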

API Quick Reference

Action | Method | Endpoint
Load sample data | POST | /api/v1/seed-dataset/starbucks
List offers | GET | /api/v1/actions
List channels | GET | /api/v1/channels
List creatives | GET | /api/v1/treatments
Get recommendations | POST | /api/v1/recommend
Record outcome | POST | /api/v1/respond
Query interaction history | GET | /api/v1/interaction-history
Run batch simulation | POST | /api/v1/simulate
Train model | POST | /api/v1/algorithm-models/{id}/train
Score customer | POST | /api/v1/algorithm-models/{id}/score
Create experiment | POST | /api/v1/experiments