Hillstrom Email Marketing Dataset Guide

The Hillstrom dataset creates a focused email marketing experiment environment modeled after the classic Kevin Hillstrom MineThatData E-Mail Analytics Challenge. It is a 3-arm A/B/C experiment testing whether gender-targeted email (men’s merchandise vs women’s merchandise) outperforms a no-email control group. Source: Kevin Hillstrom MineThatData E-Mail Analytics Challenge (CC0: Public Domain). The original data covers 64,000 customers; this pack seeds 1,000 customer rows and 1,000 result rows.

This is the only dataset pack with a spend column in the results schema, enabling revenue-based uplift analysis in addition to conversion rate analysis.

What Gets Created

Entity Type	Count	Details
Schemas	2	`hillstrom_customers` (8 fields), `hillstrom_results` (5 fields)
Categories	2	Men’s Merchandise, Women’s Merchandise
Channels	1	Email (API delivery)
Offers	2	Men’s Promo (priority 85), Women’s Promo (priority 80)
Creatives	2	One email creative per offer
Qualification Rules	0	None — open eligibility
Models	2	Scorecard + Bayesian
Outcome Types	5	impression, click, accept, convert, dismiss
Experiment	1	Three-Arm Email Test (33/33/34 split)
Customer Rows	1,000	HILL-000000 through HILL-000999
Result Rows	1,000	Segment assignment + visit/conversion/spend

Step-by-Step Walkthrough

Load the Hillstrom Dataset

curl -X POST "http://localhost:3000/api/v1/seed-dataset/hillstrom?force=true" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest"

Expected Response (201 Created)

{
  "success": true,
  "dataset": "hillstrom",
  "created": {
    "schemas": 2,
    "categories": 2,
    "subCategories": 2,
    "channels": 1,
    "offers": 2,
    "creatives": 2,
    "qualificationRules": 0,
    "contactPolicies": 0,
    "outcomeTypes": 5,
    "models": 2,
    "experiments": 1,
    "decisionFlows": 1,
    "segments": 0,
    "customerRows": 1000,
    "resultRows": 1000,
    "interactionHistory": 500,
    "interactionSummaries": 500
  }
}
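To confirm the seed worked, the `created` counts in the response can be compared against the "What Gets Created" table. The following is a minimal sketch, assuming the response has been parsed into a Python dict; the helper name `count_mismatches` is hypothetical and not part of the API. Keys that appear only in the response (such as `subCategories` or `decisionFlows`) are not checked because the table does not list them.

```python
# Expected counts, copied from the "What Gets Created" table.
EXPECTED_COUNTS = {
    "schemas": 2,
    "categories": 2,
    "channels": 1,
    "offers": 2,
    "creatives": 2,
    "qualificationRules": 0,
    "outcomeTypes": 5,
    "models": 2,
    "experiments": 1,
    "customerRows": 1000,
    "resultRows": 1000,
}


def count_mismatches(response: dict) -> dict:
    """Return {key: (expected, actual)} for every count that differs."""
    created = response.get("created", {})
    return {
        key: (expected, created.get(key))
        for key, expected in EXPECTED_COUNTS.items()
        if created.get(key) != expected
    }


# The sample response from this guide, as a Python dict.
sample = {
    "success": True,
    "dataset": "hillstrom",
    "created": {
        "schemas": 2, "categories": 2, "subCategories": 2, "channels": 1,
        "offers": 2, "creatives": 2, "qualificationRules": 0,
        "contactPolicies": 0, "outcomeTypes": 5, "models": 2,
        "experiments": 1, "decisionFlows": 1, "segments": 0,
        "customerRows": 1000, "resultRows": 1000,
        "interactionHistory": 500, "interactionSummaries": 500,
    },
}

print(count_mismatches(sample) or "all counts match")
```

An empty result means every listed entity was created in the expected quantity.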

Verify All Entities

# Check schemas (expect 2: hillstrom_customers, hillstrom_results)
curl "http://localhost:3000/api/v1/schemas" -H "X-Requested-With: XMLHttpRequest"

# Check offers (expect 2: Men's Promo, Women's Promo)
curl "http://localhost:3000/api/v1/actions" -H "X-Requested-With: XMLHttpRequest"

# Check channels (expect 1: Email)
curl "http://localhost:3000/api/v1/channels" -H "X-Requested-With: XMLHttpRequest"

# Check models (expect 2: scorecard + bayesian)
curl "http://localhost:3000/api/v1/algorithm-models" -H "X-Requested-With: XMLHttpRequest"

# Check experiments (expect 1: three-arm test)
curl "http://localhost:3000/api/v1/experiments" -H "X-Requested-With: XMLHttpRequest"

Get Recommendations for a Customer

Men's-Leaning Customer

curl -X POST "http://localhost:3000/api/v1/recommend" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest" \
  -d '{
  "customerId": "HILL-000042",
  "channel": "email",
  "context": {
    "recency_months": 3,
    "history_amount": 450.00,
    "mens_flag": 1,
    "womens_flag": 0,
    "newbie_flag": 0,
    "history_segment": "4) $350 - $500",
    "zip_code": "Urban"
  },
  "limit": 2
}'

Women's-Leaning Customer

curl -X POST "http://localhost:3000/api/v1/recommend" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest" \
  -d '{
  "customerId": "HILL-000088",
  "channel": "email",
  "context": {
    "recency_months": 8,
    "history_amount": 125.50,
    "mens_flag": 0,
    "womens_flag": 1,
    "newbie_flag": 1,
    "history_segment": "2) $100 - $200",
    "zip_code": "Suburban"
  },
  "limit": 1
}'

Response — Men's-Leaning Customer

{
  "customerId": "HILL-000042",
  "recommendations": [
    {
      "rank": 1,
      "offerId": "<mens-offer-id>",
      "offerName": "Hillstrom: Men's Promo",
      "score": 0.95,
      "creativeId": "<mens-creative-id>",
      "channelId": "<email-channel-id>",
      "content": {
        "subject": "New arrivals in men's merchandise!",
        "body": "Check out our latest men's collection — handpicked deals just for you.",
        "cta": "Shop Now",
        "preheader": "Men's merchandise picks"
      }
    },
    {
      "rank": 2,
      "offerId": "<womens-offer-id>",
      "offerName": "Hillstrom: Women's Promo",
      "score": 0.72
    }
  ],
  "modelUsed": "hillstrom-scorecard"
}

Record Outcomes

Record a Visit (Click)

curl -X POST "http://localhost:3000/api/v1/respond" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest" \
  -d '{
  "customerId": "HILL-000042",
  "decisionId": "<trace-uuid-from-recommend>",
  "offerId": "<mens-offer-id>",
  "outcomeType": "click",
  "channel": "email",
  "context": {
    "source": "email_click",
    "visitedAt": "2026-03-04T14:30:00.000Z"
  }
}'

Record a Conversion with Spend

curl -X POST "http://localhost:3000/api/v1/respond" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest" \
  -d '{
  "customerId": "HILL-000042",
  "decisionId": "<trace-uuid-from-recommend>",
  "offerId": "<mens-offer-id>",
  "outcomeType": "convert",
  "channel": "email",
  "conversionValue": 147.50,
  "context": {
    "source": "purchase",
    "spend": 147.50
  }
}'

Check Experiment Uplift

curl "http://localhost:3000/api/v1/experiments/<experiment-id>" \
  -H "X-Requested-With: XMLHttpRequest"

Experiment Results

{
  "id": "<experiment-id>",
  "name": "Hillstrom: Three-Arm Email Test",
  "status": "active",
  "metrics": {
    "champion": {
      "model": "hillstrom-scorecard",
      "impressions": 165,
      "conversions": 25,
      "conversionRate": 0.1515,
      "avgSpend": 98.50
    },
    "challengers": [
      {
        "model": "hillstrom-bayesian",
        "impressions": 165,
        "conversions": 28,
        "conversionRate": 0.1697,
        "avgSpend": 104.20,
        "uplift": {
          "absoluteLift": 0.0182,
          "relativeLift": 0.12,
          "zScore": 1.45,
          "pValue": 0.074,
          "significant": false
        }
      }
    ]
  }
}
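The guide does not state how the platform derives `zScore` and `pValue`. As a way to sanity-check the `uplift` block, the sketch below applies a standard pooled two-proportion z-test to the impression and conversion counts; the function name and the choice of a one-sided test are assumptions. On the sample counts it gives z ≈ 0.45, so the `zScore` and `pValue` in the sample payload are best read as illustrative placeholders rather than values derived from those counts.

```python
import math


def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Pooled two-proportion z-test of arm B against arm A.

    Returns (z, one-sided p-value for the hypothesis that B converts better).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Upper-tail probability of the standard normal distribution, P(Z > z).
    p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_one_sided


# Champion: 25 conversions / 165 impressions; challenger: 28 / 165.
z, p = two_proportion_z(25, 165, 28, 165)
print(f"z = {z:.2f}, one-sided p = {p:.3f}")
```

Either way the conclusion matches `"significant": false`: with 165 impressions per arm, a 1.8-point difference in conversion rate is well within noise.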

Retrain the Winning Model

curl -X POST "http://localhost:3000/api/v1/algorithm-models/<model-id>/train" \
  -H "Content-Type: application/json" \
  -H "X-Requested-With: XMLHttpRequest" \
  -d '{
  "trainingConfig": {
    "sampleSize": 1000,
    "validationSplit": 0.2,
    "targetField": "conversion",
    "features": ["recency_months", "history_amount", "mens_flag", "womens_flag", "newbie_flag"]
  }
}'

Customer Schema

The hillstrom_customers schema captures purchase history and demographic signals:

Field	Type	Description
`customer_id`	varchar(20)	Unique identifier (HILL-000000 format)
`recency_months`	integer	Months since last purchase (1-12)
`history_segment`	varchar(30)	Spend tier: “$0 - $100” through “$1,000+” (values carry a numeric prefix, e.g. “4) $350 - $500”)
`history_amount`	numeric(10,2)	Total historical purchase amount
`mens_flag`	integer	1 = purchased men’s merchandise
`womens_flag`	integer	1 = purchased women’s merchandise
`zip_code`	varchar(20)	Urban / Suburban / Rural
`newbie_flag`	integer	1 = new customer in the past 12 months

Gender flag distribution: ~40% have only mens_flag=1, ~40% have only womens_flag=1, and ~20% have neither flag set. The two flags are never both 1 for the same customer.
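The flag invariant and the approximate 40/40/20 split can be checked on any export of the customer rows. This is a small sketch, assuming the rows are available as a list of dicts keyed by the schema field names; how the rows are exported is not covered here, and both helper names are hypothetical.

```python
from collections import Counter


def flag_group(row: dict) -> str:
    """Classify a customer row by its gender-merchandise flags."""
    mens, womens = row["mens_flag"], row["womens_flag"]
    if mens == 1 and womens == 1:
        # The guide states this combination never occurs in the seeded data.
        raise ValueError(f"{row['customer_id']}: both flags are set")
    if mens == 1:
        return "mens_only"
    if womens == 1:
        return "womens_only"
    return "neither"


def flag_shares(rows: list[dict]) -> dict:
    """Share of rows in each flag group."""
    counts = Counter(flag_group(row) for row in rows)
    return {g: counts[g] / len(rows) for g in ("mens_only", "womens_only", "neither")}


# Hypothetical five-row sample, just to show the call shape.
rows = [
    {"customer_id": "HILL-000000", "mens_flag": 1, "womens_flag": 0},
    {"customer_id": "HILL-000001", "mens_flag": 1, "womens_flag": 0},
    {"customer_id": "HILL-000002", "mens_flag": 0, "womens_flag": 1},
    {"customer_id": "HILL-000003", "mens_flag": 0, "womens_flag": 1},
    {"customer_id": "HILL-000004", "mens_flag": 0, "womens_flag": 0},
]
print(flag_shares(rows))
```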

Results Schema

The hillstrom_results schema records the 3-arm experiment outcomes:

Field	Type	Description
`customer_id`	varchar(20)	Links to hillstrom_customers
`segment`	varchar(30)	“Mens E-Mail”, “Womens E-Mail”, or “No E-Mail”
`visit`	integer	1 = visited site within 2 weeks
`conversion`	integer	1 = purchased within 2 weeks
`spend`	numeric(10,2)	Total spend amount (0.00 if no conversion)

Expected Outcome Rates

Segment	Share	Visit Rate	Conversion Rate
Mens E-Mail	~33%	15%	5%
Womens E-Mail	~33%	15%	5%
No E-Mail (control)	~34%	10%	2%
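To compare a seeded dataset against these expected rates, the result rows can be aggregated per segment. The sketch below assumes the `hillstrom_results` rows are available as a list of dicts keyed by the schema field names; the function name and the sample rows are hypothetical. It also reports spend per customer, which is the revenue-based counterpart of the conversion rate.

```python
from collections import defaultdict


def segment_rates(rows: list[dict]) -> dict:
    """Visit rate, conversion rate and spend per customer for each segment."""
    totals = defaultdict(lambda: {"n": 0, "visit": 0, "conversion": 0, "spend": 0.0})
    for row in rows:
        t = totals[row["segment"]]
        t["n"] += 1
        t["visit"] += row["visit"]
        t["conversion"] += row["conversion"]
        t["spend"] += float(row["spend"])
    return {
        segment: {
            "visit_rate": t["visit"] / t["n"],
            "conversion_rate": t["conversion"] / t["n"],
            "spend_per_customer": t["spend"] / t["n"],
        }
        for segment, t in totals.items()
    }


# Hypothetical rows, not taken from the seeded data.
rows = [
    {"segment": "Mens E-Mail", "visit": 1, "conversion": 1, "spend": 40.0},
    {"segment": "Mens E-Mail", "visit": 1, "conversion": 0, "spend": 0.0},
    {"segment": "Mens E-Mail", "visit": 0, "conversion": 0, "spend": 0.0},
    {"segment": "Mens E-Mail", "visit": 0, "conversion": 0, "spend": 0.0},
    {"segment": "No E-Mail", "visit": 0, "conversion": 0, "spend": 0.0},
    {"segment": "No E-Mail", "visit": 1, "conversion": 0, "spend": 0.0},
]
for segment, stats in segment_rates(rows).items():
    print(segment, stats)
```

With roughly 330 rows per segment, observed rates will scatter around the expected values by a few percentage points.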

Scorecard Model

The scorecard uses three deterministic rules:

Rule	Field	Condition	Points	Rationale
r1	`recency_months`	≤ 6	+15	Recent buyers are more responsive
r2	`history_amount`	≥ 200	+20	Higher spenders convert more
r3	`newbie_flag`	= 0	+10	Established customers have brand affinity

Base score: 50 | Max: 100 | Normalization: min-max over the 0-100 scale, so score = points / 100 (the highest attainable total is 50 + 45 = 95)

Score Examples

Customer Profile	Recency	History $	Newbie	Points	Score
High-value recent buyer	3 (+15)	$450 (+20)	0 (+10)	50+45	0.95
New customer, low spend	2 (+15)	$75 (0)	1 (0)	50+15	0.65
Lapsed customer	11 (0)	$320 (+20)	0 (+10)	50+30	0.80
Worst case	12 (0)	$50 (0)	1 (0)	50+0	0.50
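The three rules and the score examples can be reproduced with a few lines of code. This is a sketch of the scoring arithmetic as described in this guide, not the platform's implementation; the function name is hypothetical, and dividing by 100 is inferred from the examples (95 points maps to 0.95).

```python
BASE_SCORE = 50


def scorecard_score(recency_months: int, history_amount: float, newbie_flag: int) -> float:
    """Apply rules r1-r3 to the base score and scale the total to 0-1."""
    points = BASE_SCORE
    if recency_months <= 6:    # r1: recent buyers are more responsive
        points += 15
    if history_amount >= 200:  # r2: higher spenders convert more
        points += 20
    if newbie_flag == 0:       # r3: established customers have brand affinity
        points += 10
    return points / 100


# The four profiles from the Score Examples table.
print(scorecard_score(3, 450, 0))   # high-value recent buyer
print(scorecard_score(2, 75, 1))    # new customer, low spend
print(scorecard_score(11, 320, 0))  # lapsed customer
print(scorecard_score(12, 50, 1))   # worst case
```

None of the rules reads `mens_flag` or `womens_flag`, so two customers who differ only in gender flags receive the same scorecard score.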

Bayesian Model

A Naive Bayes classifier with 5 predictors including gender flags:

Feature	Importance	Type	Role
`recency_months`	0.25	Continuous (binned)	Purchase recency signal
`history_amount`	0.30	Continuous (binned)	Strongest predictor — spending power
`mens_flag`	0.15	Binary	Gender merchandise preference
`womens_flag`	0.15	Binary	Gender merchandise preference
`newbie_flag`	0.15	Binary	Customer tenure signal

The Bayesian model incorporates mens_flag and womens_flag directly, enabling gender-aware propensity estimation. The scorecard does not use gender — this is the key difference tested by the experiment.
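To make the contrast with the scorecard concrete, here is a minimal Naive Bayes sketch over the same five predictors, written in plain Python with Laplace smoothing. It is an illustration only: the bin edges, the class name, and the toy training rows are assumptions, and the platform's own binning and training procedure are not described in this guide.

```python
import math
from collections import Counter, defaultdict


def bin_features(c: dict) -> tuple:
    """Discretise the five predictors (bin edges are illustrative)."""
    return (
        "recent" if c["recency_months"] <= 6 else "lapsed",
        "high" if c["history_amount"] >= 200 else "low",
        c["mens_flag"],
        c["womens_flag"],
        c["newbie_flag"],
    )


class NaiveBayes:
    """Categorical Naive Bayes with Laplace smoothing."""

    def __init__(self, alpha: float = 1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        self.value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
        self.values = defaultdict(set)            # feature index -> distinct values seen

    def fit(self, rows, labels):
        for row, label in zip(rows, labels):
            self.class_counts[label] += 1
            for i, value in enumerate(row):
                self.value_counts[(i, label)][value] += 1
                self.values[i].add(value)
        return self

    def predict_proba(self, row) -> dict:
        total = sum(self.class_counts.values())
        log_scores = {}
        for label, n_label in self.class_counts.items():
            log_p = math.log(n_label / total)  # class prior
            for i, value in enumerate(row):
                count = self.value_counts[(i, label)][value]
                k = len(self.values[i])
                log_p += math.log((count + self.alpha) / (n_label + self.alpha * k))
            log_scores[label] = log_p
        # Normalise in log space to avoid underflow.
        top = max(log_scores.values())
        weights = {label: math.exp(s - top) for label, s in log_scores.items()}
        norm = sum(weights.values())
        return {label: w / norm for label, w in weights.items()}


# Toy training data (hypothetical): label 1 = converted, 0 = did not convert.
train = [
    ({"recency_months": 2, "history_amount": 400, "mens_flag": 1, "womens_flag": 0, "newbie_flag": 0}, 1),
    ({"recency_months": 3, "history_amount": 300, "mens_flag": 1, "womens_flag": 0, "newbie_flag": 0}, 1),
    ({"recency_months": 10, "history_amount": 50, "mens_flag": 0, "womens_flag": 1, "newbie_flag": 1}, 0),
    ({"recency_months": 11, "history_amount": 80, "mens_flag": 0, "womens_flag": 0, "newbie_flag": 1}, 0),
]
model = NaiveBayes().fit([bin_features(c) for c, _ in train], [y for _, y in train])
customer = {"recency_months": 3, "history_amount": 450, "mens_flag": 1, "womens_flag": 0, "newbie_flag": 0}
print(model.predict_proba(bin_features(customer)))
```

Because the gender flags enter the likelihood directly, two customers with identical recency, spend, and tenure can receive different propensities, which the scorecard cannot do.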

Three-Arm Experiment

Arm	Model	Traffic %	Purpose
Champion	hillstrom-scorecard	33%	Scorecard baseline
Challenger 1	hillstrom-bayesian	33%	Test Bayesian classifier
Challenger 2 (Control)	hillstrom-scorecard	34%	Control arm (reuses scorecard)
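A 33/33/34 split is commonly implemented by hashing the customer ID into 100 buckets, so that a given customer always lands in the same arm. The sketch below shows that idea; it is an assumption about how such a split could work, not a description of how the platform assigns traffic, and the arm keys and salt are hypothetical.

```python
import hashlib
from collections import Counter

# (arm key, model, traffic share in percent), from the table above.
ARMS = [
    ("champion", "hillstrom-scorecard", 33),
    ("challenger-1", "hillstrom-bayesian", 33),
    ("challenger-2-control", "hillstrom-scorecard", 34),
]


def assign_arm(customer_id: str, salt: str = "hillstrom-three-arm") -> str:
    """Deterministically map a customer to an arm using a hash bucket in 0..99."""
    digest = hashlib.sha256(f"{salt}:{customer_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    upper = 0
    for name, _model, share in ARMS:
        upper += share
        if bucket < upper:
            return name
    raise AssertionError("traffic shares must sum to 100")


# Distribution over the 1,000 seeded customer IDs.
ids = [f"HILL-{i:06d}" for i in range(1000)]
print(Counter(assign_arm(cid) for cid in ids))
```

Changing the salt reshuffles customers across arms, which is useful when a new experiment should not inherit the previous assignment.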

Expected Email vs No-Email Uplift

Metric	Email Groups (combined)	No E-Mail (control)	Absolute Uplift	Relative Uplift
Visit Rate	15%	10%	+5 pp	+50%
Conversion Rate	5%	2%	+3 pp	+150%
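The two uplift columns come from simple arithmetic on the treated and control rates: absolute uplift is the difference in percentage points, and relative uplift is that difference divided by the control rate. A short sketch (the function name is hypothetical):

```python
def uplift(treated_rate: float, control_rate: float) -> tuple[float, float]:
    """Return (absolute uplift in percentage points, relative uplift in percent)."""
    diff = treated_rate - control_rate
    absolute_pp = diff * 100
    relative_pct = diff / control_rate * 100
    # Round to hide floating-point noise such as 4.999999999999999.
    return round(absolute_pp, 2), round(relative_pct, 2)


print(uplift(0.15, 0.10))  # visit rate: email groups vs control
print(uplift(0.05, 0.02))  # conversion rate: email groups vs control
```

The relative figure is large for conversion because the control base rate is small: a 3-point gain on a 2% base is a 150% increase.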

What to Look For

Single channel: Email-only simplifies the analysis — no channel arbitration or multi-channel routing.
Gender-segmented offers: The men’s promo (priority 85) outranks the women’s promo (priority 80). A customer with mens_flag=1 gets the men’s email first.
Spend tracking: The spend column in hillstrom_results enables revenue-based analysis, not just conversion rate. Use conversionValue in the Respond API to track this.
3-arm A/B/C: Unlike binary treatment/control, this tests whether men’s email, women’s email, or no email performs best.
7 history tiers: From “$0 - $100” to “$1,000+”, giving fine-grained customer segmentation by purchase amount.
Scorecard vs Bayesian: The scorecard is interpretable but ignores gender. The Bayesian model uses gender flags, making it better at matching email content to customer preference.

Key Differences from Starbucks

Aspect	Hillstrom	Starbucks
Channels	1 (email only)	4 (web, mobile, email, social)
Offers	2	10
Qualification rules	0	2
Contact policies	0	2
Experiment arms	3 (A/B/C with no-email control)	2 (80/20 champion/challenger)
Spend tracking	In results schema	Via conversionValue only
Gender segmentation	Core feature	Not present
History tiers	7 tiers	Not present
