Starbucks Dataset Guide

The Starbucks dataset is the flagship dataset pack for KaireonAI. It creates a complete decisioning environment modeled after a coffee rewards program: 10 promotional offers across 4 channels, plus qualification rules, contact policies, adaptive models, an experiment, and a customer segment. This guide walks through loading the dataset, exploring what gets created, running recommendations, recording outcomes, and training adaptive models.

What Gets Created

Entity	Count	Details
Data Schemas + DDL Tables	3	`starbucks_customers` (100 rows), `starbucks_offers` (10 rows), `starbucks_events` (500 rows)
Categories	3	Acquisition (BOGO), Retention (Discount), Engagement (Informational)
Channels	4	Web (banner), Email (email), Mobile (push), Social (social_post)
Offers	10	4 BOGO, 4 Discount, 2 Informational — priorities 55-85
Creatives	40	10 offers x 4 channels, each with channel-specific content
Qualification Rules	2	Min Age 18, Min Income $30k
Contact Policies	2	Max 3/day frequency cap, 24hr cooldown
Outcome Types	5	impression, click, accept, convert, dismiss
Algorithm Models	3	Scorecard, Bayesian (Naive Bayes), Thompson Bandit
Experiment	1	Scorecard vs Bayesian (80/20 split)
Decision Flow	1	Full pipeline with 7 stages
Segment	1	High-Income Members (income > $70k, tenure > 365 days)
Interaction History	500	90-day window of synthetic interactions
Interaction Summaries	500	Materialized aggregates for contact policy enforcement

Step-by-Step Walkthrough

Load the Starbucks Dataset

curl -X POST "http://localhost:3000/api/v1/seed-dataset/starbucks?force=true" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest"

Expected Response (201 Created)

{
  "success": true,
  "dataset": "starbucks",
  "created": {
    "schemas": 3,
    "categories": 3,
    "subCategories": 3,
    "channels": 4,
    "offers": 10,
    "creatives": 40,
    "qualificationRules": 2,
    "contactPolicies": 2,
    "outcomeTypes": 5,
    "models": 3,
    "experiments": 1,
    "decisionFlows": 1,
    "segments": 1,
    "customerRows": 100,
    "offerRows": 10,
    "eventRows": 500,
    "interactionHistory": 500,
    "interactionSummaries": 500
  }
}

The ?force=true parameter removes any previously loaded dataset before seeding. Without it, you will get a 409 Conflict if data already exists.

Explore the Offers

The dataset creates 10 offers across 3 categories. List them all:

curl "http://localhost:3000/api/v1/actions" \
  -H "X-Requested-With: XMLHttpRequest"

Expected Response — 10 Offers

[
  {
    "id": "uuid-...",
    "name": "Starbucks: Discount — 10 Day Low",
    "status": "active",
    "priority": 85,
    "category": "retention",
    "budget": { "monthlyBudget": 60000, "costPerUnit": 2 },
    "schedule": { "startDate": "2026-01-01", "endDate": "2026-12-31" },
    "creatives": [ "4 creatives, one per channel" ],
    "categoryRef": { "name": "Starbucks: Retention" },
    "subCategoryRef": { "name": "Starbucks: Discount" }
  }
]

Priority (0-100) drives offer ranking under priority-weighted scoring. Score formula: (priority / 100) x (creative.weight / 100) x fitMultiplier. With equal creative weight and fit multiplier, a higher-priority offer scores higher in the pipeline.
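As a quick check of the formula above, here is a minimal Python sketch. The function name `priority_weighted_score` is hypothetical and used only for illustration; the inputs (priority 85, weight 100, fitMultiplier 1.0) are the values shown in the `scoreExplanation` block of the recommend response in this guide.

```python
def priority_weighted_score(priority: float, creative_weight: float,
                            fit_multiplier: float = 1.0) -> float:
    """Score = (priority / 100) x (creative.weight / 100) x fitMultiplier."""
    return (priority / 100) * (creative_weight / 100) * fit_multiplier


# Highest-priority offer in the dataset (85) versus the lowest (55),
# both with full creative weight and a neutral fit multiplier.
top = priority_weighted_score(85, 100, 1.0)
low = priority_weighted_score(55, 100, 1.0)
print(round(top, 2), round(low, 2))
```

Because the three factors are multiplied, halving the creative weight or the fit multiplier halves the score regardless of priority.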

Run a Recommendation

Call the Recommend API with customer attributes to get personalized offers:

Qualified Customer

curl -X POST "http://localhost:3000/api/v1/recommend" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest" \
  -d '{
  "customerId": "cust_001",
  "limit": 5,
  "attributes": {
    "starbucks_customers.age": 30,
    "starbucks_customers.income": 65000
  }
}'

With Debug Trace

curl -X POST "http://localhost:3000/api/v1/recommend?debug=true" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest" \
  -d '{
  "customerId": "cust_002",
  "limit": 3,
  "channel": "email",
  "attributes": {
    "starbucks_customers.age": 25,
    "starbucks_customers.income": 45000
  }
}'

Missing Attributes (Blocked)

curl -X POST "http://localhost:3000/api/v1/recommend?debug=true" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest" \
  -d '{
  "customerId": "cust_003",
  "limit": 5
}'

Response — Qualified Customer

{
  "interactionId": "a1b2c3d4-...",
  "customerId": "cust_001",
  "count": 5,
  "decisions": [
    {
      "creativeId": "creative-uuid-1",
      "creativeName": "Starbucks: Discount — 10 Day Low — Web",
      "offerId": "offer-uuid-1",
      "offerName": "Starbucks: Discount — 10 Day Low",
      "category": "retention",
      "channelType": "Banner",
      "channelName": "Starbucks: Web",
      "content": {
        "headline": "Save on Your Next Visit",
        "subline": "10% off any drink for 10 days",
        "cta": "Claim Offer",
        "imageUrl": "/images/starbucks-discount.jpg"
      },
      "score": 0.85,
      "rank": 1,
      "priority": 85,
      "scoreExplanation": {
        "method": "priority_weighted",
        "priority": 85,
        "weight": 100,
        "fitMultiplier": 1.0,
        "finalScore": 0.85
      }
    }
  ]
}

Attributes use schema-prefixed dot notation as literal keys, not nested objects. The qualification engine looks up context.attributes["starbucks_customers.age"], so the key must match exactly:

// CORRECT
"attributes": { "starbucks_customers.age": 30 }

// WRONG — will not match qualification rules
"attributes": { "age": 30 }
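If your customer data arrives with bare field names, a small client-side helper can add the schema prefix before calling the Recommend API. The sketch below is illustrative only: `prefix_attributes` is a hypothetical helper, not part of KaireonAI.

```python
import json


def prefix_attributes(schema: str, attrs: dict) -> dict:
    """Turn {"age": 30} into {"<schema>.age": 30} (flat keys, no nesting)."""
    return {f"{schema}.{name}": value for name, value in attrs.items()}


# Request body in the shape used by the curl examples in this guide.
payload = {
    "customerId": "cust_001",
    "limit": 5,
    "attributes": prefix_attributes(
        "starbucks_customers", {"age": 30, "income": 65000}
    ),
}
print(json.dumps(payload, indent=2))
```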

Record an Outcome

After delivering an offer, record the customer’s response via the Respond API:

Record a Click

curl -X POST "http://localhost:3000/api/v1/respond" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest" \
  -d '{
  "customerId": "cust_001",
  "creativeId": "<creative-uuid-from-decision>",
  "outcome": "click",
  "interactionId": "<interactionId-from-recommend-response>",
  "idempotencyKey": "click-cust001-creative123-20260304",
  "context": {
    "device": "mobile",
    "sessionId": "sess_abc123"
  }
}'

Record a Conversion

curl -X POST "http://localhost:3000/api/v1/respond" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest" \
  -d '{
  "customerId": "cust_001",
  "creativeId": "<creative-uuid>",
  "outcome": "convert",
  "interactionId": "<interactionId>",
  "idempotencyKey": "convert-cust001-order-12345",
  "conversionValue": 47.50,
  "outcomeDetails": {
    "orderId": "order-12345",
    "productSku": "frappuccino-venti"
  }
}'
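The respond payload reuses `interactionId`, `customerId`, and `creativeId` from the recommend response. The following Python sketch shows one way to assemble it for the top-ranked decision. The function name, the sample response values, and the client-side outcome validation are assumptions for illustration; the five outcome types are the ones listed in What Gets Created.

```python
OUTCOME_TYPES = {"impression", "click", "accept", "convert", "dismiss"}


def build_respond_payload(recommend_response: dict, outcome: str,
                          idempotency_key: str) -> dict:
    """Build a /respond body for the rank-1 decision of a /recommend response."""
    if outcome not in OUTCOME_TYPES:
        raise ValueError(f"unknown outcome type: {outcome!r}")
    # Pick the decision with the lowest rank number (rank 1 is the best).
    top = min(recommend_response["decisions"], key=lambda d: d["rank"])
    return {
        "customerId": recommend_response["customerId"],
        "creativeId": top["creativeId"],
        "outcome": outcome,
        "interactionId": recommend_response["interactionId"],
        "idempotencyKey": idempotency_key,
    }


# Placeholder response; real IDs come from the Recommend API.
sample_response = {
    "interactionId": "interaction-placeholder",
    "customerId": "cust_001",
    "decisions": [
        {"creativeId": "creative-b", "rank": 2},
        {"creativeId": "creative-a", "rank": 1},
    ],
}
payload = build_respond_payload(
    sample_response, "click", "click-cust_001-creative-a-20260304"
)
print(payload)
```

Building the idempotency key from stable identifiers (customer, creative, date or order) means a retried request carries the same key as the original.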

Train an Adaptive Model

Trigger a manual training run for any of the 3 pre-loaded models:

curl -X POST "http://localhost:3000/api/v1/algorithm-models/<model-uuid>/train" \
  -H "X-Requested-With: XMLHttpRequest"

Training Response

{
  "modelId": "model-uuid",
  "modelType": "bayesian",
  "sampleCount": 500,
  "metrics": {
    "accuracy": 0.72,
    "precision": 0.68,
    "recall": 0.76,
    "f1": 0.72,
    "auc": 0.74,
    "trainedAt": "2026-03-04T15:10:00Z"
  },
  "status": "success"
}
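The metrics in the sample response are internally consistent under the standard definition of F1 as the harmonic mean of precision and recall. The short Python check below assumes the platform uses that standard definition, which this guide does not state explicitly.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the sample training response.
f1 = f1_score(0.68, 0.76)
print(round(f1, 2))
```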

Check the Experiment

The pre-loaded experiment splits traffic 80/20 between Scorecard (champion) and Bayesian (challenger):

curl "http://localhost:3000/api/v1/experiments" \
  -H "X-Requested-With: XMLHttpRequest"

Key Features Demonstrated

Qualification Rules

Two hard-gate rules filter customers:

Rule	Type	Condition	Priority
Min Age 18	attribute_condition	`starbucks_customers.age >= 18`	100 (Critical)
Min Income $30k	attribute_condition	`starbucks_customers.income >= 30000`	90 (Critical)

Missing attributes cause the rule to fail closed — the customer is ineligible.
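The fail-closed behavior can be illustrated with a small Python sketch. This is not the engine's implementation: the rule tuples and the `failed_rules` function are hypothetical, and only the attribute keys and thresholds come from the table above.

```python
import operator

# (rule name, attribute key, comparison, threshold) per the table above.
RULES = [
    ("Min Age 18", "starbucks_customers.age", operator.ge, 18),
    ("Min Income $30k", "starbucks_customers.income", operator.ge, 30000),
]


def failed_rules(attributes: dict) -> list:
    """Return the names of rules that fail; a missing attribute counts as a failure."""
    failed = []
    for name, key, compare, threshold in RULES:
        value = attributes.get(key)
        # Fail closed: no value means the rule cannot pass.
        if value is None or not compare(value, threshold):
            failed.append(name)
    return failed


print(failed_rules({"starbucks_customers.age": 30,
                    "starbucks_customers.income": 65000}))
print(failed_rules({}))
print(failed_rules({"age": 30, "income": 65000}))
```

The last call fails both rules because the keys lack the `starbucks_customers.` prefix, which is the same effect as sending no attributes at all.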

Contact Policies

Policy	Type	Config
Max 3 per Day	frequency_cap	`maxPerDay: 3`
24hr Cooldown	cooldown	`cooldownHours: 24`
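To make the two policies concrete, here is a hedged Python sketch of how a frequency cap and a cooldown might be evaluated together. Several details are assumptions not confirmed by this guide: the cap is counted over a rolling 24-hour window across all offers, the cooldown is applied per offer, and `contact_allowed` is a hypothetical function name.

```python
from datetime import datetime, timedelta


def contact_allowed(history, offer_id, now, max_per_day=3, cooldown_hours=24):
    """history is a list of (offer_id, contacted_at) tuples for one customer."""
    # Frequency cap (assumed: rolling 24h window, counted across all offers).
    day_ago = now - timedelta(hours=24)
    recent = [ts for _, ts in history if ts > day_ago]
    if len(recent) >= max_per_day:
        return False
    # Cooldown (assumed: applies to repeat contact for the same offer).
    cooldown_start = now - timedelta(hours=cooldown_hours)
    if any(oid == offer_id and ts > cooldown_start for oid, ts in history):
        return False
    return True


now = datetime(2026, 3, 4, 12, 0)
history = [
    ("offer-a", now - timedelta(hours=2)),
    ("offer-b", now - timedelta(hours=30)),
]
print(contact_allowed(history, "offer-a", now))  # offer-a contacted 2h ago
print(contact_allowed(history, "offer-b", now))  # offer-b contacted 30h ago
```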

Three Model Types

Model	Type	How It Learns	Best For
Scorecard	`scorecard`	Rule-based (manual)	Transparent, explainable scoring
Bayesian	`bayesian`	Auto-incremental, updates every 100 outcomes	Adapts to changing preferences
Thompson Bandit	`thompson_bandit`	Updates Beta(alpha, beta) per arm on each outcome	Exploration/exploitation for new offers
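The Beta(alpha, beta) update in the Thompson Bandit row follows the usual Beta-Bernoulli pattern: a positive outcome increments alpha, a negative one increments beta, and selection samples from each arm's distribution. The Python sketch below illustrates that general technique. The class and function names are hypothetical, and which outcome types count as a success is an assumption left to the caller.

```python
import random


class BetaArm:
    """One arm (for example, one offer) with a Beta(alpha, beta) posterior."""

    def __init__(self, alpha: float = 1.0, beta: float = 1.0):
        self.alpha = alpha
        self.beta = beta

    def update(self, success: bool) -> None:
        # A success raises alpha; a failure raises beta.
        if success:
            self.alpha += 1
        else:
            self.beta += 1

    def sample(self, rng: random.Random) -> float:
        return rng.betavariate(self.alpha, self.beta)


def choose_arm(arms: dict, rng: random.Random) -> str:
    """Draw one sample per arm and return the arm with the highest draw."""
    return max(arms, key=lambda name: arms[name].sample(rng))


rng = random.Random(42)
arms = {"offer-a": BetaArm(), "offer-b": BetaArm()}
arms["offer-a"].update(True)   # e.g. a click on offer-a
arms["offer-b"].update(False)  # e.g. a dismiss on offer-b
print(choose_arm(arms, rng))
```

New arms start at Beta(1, 1), a uniform prior, so they are still sampled often enough to be explored before evidence accumulates.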

Decision Pipeline

The pipeline processes candidates through 7 stages:

1. Inventory: filters by schedule and flattens offers x creatives into candidates
2. Qualification: hard-gate rules (age, income, segment); a failed rule removes the candidate
3. Contact Policy: frequency cap, cooldown, mutual exclusion
4. Consent: channel-level consent check
5. Guardrails: business constraint rules
6. Scoring: model-based or priority-weighted
7. Ranking: sort by score, apply limit and diversity constraints
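The stage ordering above can be sketched as a chain of functions, each taking a candidate list and returning a filtered or annotated one. This is a simplified illustration, not the platform's pipeline code: all function names are hypothetical, only three of the seven stages are stubbed, and the sample offers are invented.

```python
def flatten_inventory(offers):
    """Inventory: one candidate per (offer, creative) pair."""
    return [
        {"offer": o["name"], "priority": o["priority"],
         "channel": c["channel"], "weight": c["weight"]}
        for o in offers for c in o["creatives"]
    ]


def score(candidates):
    """Scoring: priority-weighted, with fitMultiplier assumed to be 1.0."""
    return [{**c, "score": (c["priority"] / 100) * (c["weight"] / 100)}
            for c in candidates]


def rank(candidates, limit=2):
    """Ranking: sort by score descending and apply the limit."""
    return sorted(candidates, key=lambda c: c["score"], reverse=True)[:limit]


def run_pipeline(candidates, stages):
    """Apply stages in order, recording how many candidates survive each."""
    trace = []
    for name, stage in stages:
        candidates = stage(candidates)
        trace.append((name, len(candidates)))
    return candidates, trace


offers = [
    {"name": "A", "priority": 85, "creatives": [
        {"channel": "web", "weight": 100}, {"channel": "email", "weight": 100}]},
    {"name": "B", "priority": 55, "creatives": [
        {"channel": "web", "weight": 100}]},
]
stages = [
    ("qualification", lambda cs: cs),  # stub: everything passes
    ("scoring", score),
    ("ranking", rank),
]
decisions, trace = run_pipeline(flatten_inventory(offers), stages)
print(trace)
```

Recording the surviving count per stage mirrors what a debug trace is useful for: seeing where candidates drop out.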

Batch Email via Segments

Use the pre-loaded High-Income Members segment for batch email campaigns:

curl -X POST "http://localhost:3000/api/v1/simulate" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest" \
  -d '{
  "segmentId": "<segment-uuid>",
  "sampleSize": 100,
  "channelFilter": "email"
}'

This runs the decision pipeline for the sampled segment members (up to sampleSize, 100 in this request) and returns per-customer decisions with email content (subject, body, CTA) ready for your ESP.

What to Look For

Qualification blocking: Request without attributes returns 0 decisions. The debugTrace shows which rules blocked each offer.
Contact policy enforcement: After 3 recommendations in a day, the frequency cap blocks further contact. The 24hr cooldown prevents re-contact too soon.
Experiment routing: The same customer always lands in the same experiment arm (deterministic hashing). 80% of customers get Scorecard scoring, 20% get Bayesian.
Adaptive learning: After recording outcomes via /respond, Bayesian models auto-update every 100 outcomes. Thompson Bandit updates alpha/beta per outcome immediately.
Multi-channel creatives: Each of the 10 offers has 4 creatives (web banner, email, push, social) with channel-appropriate content.
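The deterministic experiment routing noted in the list above can be illustrated with a hash-based bucketing sketch in Python. The exact hash function and key layout used by the platform are not documented here, so SHA-256 over `experiment_id:customer_id` and the name `assign_arm` are assumptions chosen for illustration.

```python
import hashlib


def assign_arm(experiment_id: str, customer_id: str,
               champion_pct: int = 80) -> str:
    """Map a customer to a stable bucket in [0, 100) and pick an arm."""
    key = f"{experiment_id}:{customer_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "scorecard" if bucket < champion_pct else "bayesian"


# The same customer always lands in the same arm.
print(assign_arm("exp-1", "cust_001"), assign_arm("exp-1", "cust_001"))

# Across many customers the split approaches 80/20.
arms = [assign_arm("exp-1", f"cust_{i:05d}") for i in range(10000)]
print(arms.count("scorecard") / len(arms))
```

Including the experiment ID in the hashed key keeps assignments independent across experiments, so a customer in the challenger arm of one test is not automatically in the challenger arm of the next.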

API Quick Reference

Action	Method	Endpoint
Load sample data	POST	`/api/v1/seed-dataset/starbucks`
List offers	GET	`/api/v1/actions`
List channels	GET	`/api/v1/channels`
List creatives	GET	`/api/v1/treatments`
Get recommendations	POST	`/api/v1/recommend`
Record outcome	POST	`/api/v1/respond`
Query interaction history	GET	`/api/v1/interaction-history`
Run batch simulation	POST	`/api/v1/simulate`
Train model	POST	`/api/v1/algorithm-models/{id}/train`
Score customer	POST	`/api/v1/algorithm-models/{id}/score`
Create experiment	POST	`/api/v1/experiments`
