The Hillstrom dataset creates a focused email marketing experiment environment modeled after the classic Kevin Hillstrom MineThatData E-Mail Analytics Challenge. It is a 3-arm A/B/C experiment testing whether gender-targeted email (men’s merchandise vs women’s merchandise) outperforms a no-email control group.
Source: Kevin Hillstrom MineThatData E-Mail Analytics Challenge (CC0: Public Domain). Original data: 64,000 customers.
This is the only dataset pack with a spend column in the results schema, enabling revenue-based uplift analysis in addition to conversion rate analysis.
What Gets Created
| Entity Type | Count | Details |
|---|---|---|
| Schemas | 2 | hillstrom_customers (8 fields), hillstrom_results (5 fields) |
| Categories | 2 | Men’s Merchandise, Women’s Merchandise |
| Channels | 1 | Email (api delivery) |
| Offers | 2 | Men’s Promo (priority 85), Women’s Promo (priority 80) |
| Creatives | 2 | One email creative per offer |
| Qualification Rules | 0 | None — open eligibility |
| Models | 2 | Scorecard + Bayesian |
| Outcome Types | 5 | impression, click, accept, convert, dismiss |
| Experiment | 1 | Three-Arm Email Test (33/33/34 split) |
| Customer Rows | 1,000 | HILL-000000 through HILL-000999 |
| Result Rows | 1,000 | Segment assignment + visit/conversion/spend |
Step-by-Step Walkthrough
Load the Hillstrom Dataset
curl -X POST "http://localhost:3000/api/v1/seed-dataset/hillstrom?force=true" \
-H "Content-Type: application/json" \
-H "X-Requested-With: XMLHttpRequest"
Expected Response (201 Created)
{
"success": true,
"dataset": "hillstrom",
"created": {
"schemas": 2,
"categories": 2,
"subCategories": 2,
"channels": 1,
"offers": 2,
"creatives": 2,
"qualificationRules": 0,
"contactPolicies": 0,
"outcomeTypes": 5,
"models": 2,
"experiments": 1,
"decisionFlows": 1,
"segments": 0,
"customerRows": 1000,
"resultRows": 1000,
"interactionHistory": 500,
"interactionSummaries": 500
}
}
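The counts in this response can be compared with the "What Gets Created" table in code. The following is a minimal Python sketch, assuming the response body has already been parsed into a dict. `EXPECTED_COUNTS` and `count_mismatches` are hypothetical names introduced here, and only the keys that appear in the table are checked.

```python
# Expected values copied from the "What Gets Created" table.
EXPECTED_COUNTS = {
    "schemas": 2,
    "categories": 2,
    "channels": 1,
    "offers": 2,
    "creatives": 2,
    "qualificationRules": 0,
    "models": 2,
    "outcomeTypes": 5,
    "experiments": 1,
    "customerRows": 1000,
    "resultRows": 1000,
}


def count_mismatches(response: dict) -> dict:
    """Return {key: (expected, actual)} for every count that differs."""
    created = response.get("created", {})
    return {
        key: (expected, created.get(key))
        for key, expected in EXPECTED_COUNTS.items()
        if created.get(key) != expected
    }


if __name__ == "__main__":
    # Hypothetical response with one deliberately wrong count.
    sample = {"success": True, "dataset": "hillstrom",
              "created": dict(EXPECTED_COUNTS, customerRows=999)}
    print(count_mismatches(sample))  # {'customerRows': (1000, 999)}
```

An empty result means every listed count matched.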
Verify the Load
# Check schemas (expect 2: hillstrom_customers, hillstrom_results)
curl "http://localhost:3000/api/v1/schemas" -H "X-Requested-With: XMLHttpRequest"
# Check offers (expect 2: Men's Promo, Women's Promo)
curl "http://localhost:3000/api/v1/actions" -H "X-Requested-With: XMLHttpRequest"
# Check channels (expect 1: Email)
curl "http://localhost:3000/api/v1/channels" -H "X-Requested-With: XMLHttpRequest"
# Check models (expect 2: scorecard + bayesian)
curl "http://localhost:3000/api/v1/algorithm-models" -H "X-Requested-With: XMLHttpRequest"
# Check experiments (expect 1: three-arm test)
curl "http://localhost:3000/api/v1/experiments" -H "X-Requested-With: XMLHttpRequest"
Get Recommendations for a Customer
curl -X POST "http://localhost:3000/api/v1/recommend" \
-H "Content-Type: application/json" \
-H "X-Requested-With: XMLHttpRequest" \
-d '{
"customerId": "HILL-000042",
"channel": "email",
"context": {
"recency_months": 3,
"history_amount": 450.00,
"mens_flag": 1,
"womens_flag": 0,
"newbie_flag": 0,
"history_segment": "4) $350 - $500",
"zip_code": "Urban"
},
"limit": 2
}'
curl -X POST "http://localhost:3000/api/v1/recommend" \
-H "Content-Type: application/json" \
-H "X-Requested-With: XMLHttpRequest" \
-d '{
"customerId": "HILL-000088",
"channel": "email",
"context": {
"recency_months": 8,
"history_amount": 125.50,
"mens_flag": 0,
"womens_flag": 1,
"newbie_flag": 1,
"history_segment": "2) $100 - $200",
"zip_code": "Suburban"
},
"limit": 1
}'
Response — Men's-Leaning Customer
{
"customerId": "HILL-000042",
"recommendations": [
{
"rank": 1,
"offerId": "<mens-offer-id>",
"offerName": "Hillstrom: Men's Promo",
"score": 0.95,
"creativeId": "<mens-creative-id>",
"channelId": "<email-channel-id>",
"content": {
"subject": "New arrivals in men's merchandise!",
"body": "Check out our latest men's collection — handpicked deals just for you.",
"cta": "Shop Now",
"preheader": "Men's merchandise picks"
}
},
{
"rank": 2,
"offerId": "<womens-offer-id>",
"offerName": "Hillstrom: Women's Promo",
"score": 0.72
}
],
"modelUsed": "hillstrom-scorecard"
}
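The request and response shapes above can also be handled from a script. The sketch below uses only the Python standard library. It builds the same POST body as the curl examples from a customer row and picks the rank-1 entry out of a parsed response. `build_recommend_request` and `top_recommendation` are hypothetical helper names, and the customer row is assumed to be a dict keyed by the `hillstrom_customers` field names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000/api/v1"  # same host as the curl examples

# Context keys used in the request examples above.
CONTEXT_FIELDS = (
    "recency_months", "history_amount", "mens_flag", "womens_flag",
    "newbie_flag", "history_segment", "zip_code",
)


def build_recommend_request(customer: dict, limit: int = 2) -> urllib.request.Request:
    """Build (but do not send) a POST /recommend request for one customer row."""
    payload = {
        "customerId": customer["customer_id"],
        "channel": "email",
        "context": {name: customer[name] for name in CONTEXT_FIELDS},
        "limit": limit,
    }
    return urllib.request.Request(
        f"{BASE_URL}/recommend",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Requested-With": "XMLHttpRequest",
        },
        method="POST",
    )


def top_recommendation(response: dict):
    """Return the rank-1 entry of a recommend response, or None if it is empty."""
    recs = response.get("recommendations", [])
    return min(recs, key=lambda rec: rec["rank"], default=None)
```

Passing the request to `urllib.request.urlopen` and decoding the body with `json.load` would send it to a running instance; the sketch stops short of that so it can be read without a server.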
Record a Click
curl -X POST "http://localhost:3000/api/v1/respond" \
-H "Content-Type: application/json" \
-H "X-Requested-With: XMLHttpRequest" \
-d '{
"customerId": "HILL-000042",
"decisionId": "<trace-uuid-from-recommend>",
"offerId": "<mens-offer-id>",
"outcomeType": "click",
"channel": "email",
"context": {
"source": "email_click",
"visitedAt": "2026-03-04T14:30:00.000Z"
}
}'
Record a Conversion with Spend
curl -X POST "http://localhost:3000/api/v1/respond" \
-H "Content-Type: application/json" \
-H "X-Requested-With: XMLHttpRequest" \
-d '{
"customerId": "HILL-000042",
"decisionId": "<trace-uuid-from-recommend>",
"offerId": "<mens-offer-id>",
"outcomeType": "convert",
"channel": "email",
"conversionValue": 147.50,
"context": {
"source": "purchase",
"spend": 147.50
}
}'
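The two respond calls differ mainly in `outcomeType` and in whether a spend amount is attached. As a rough sketch, a single Python helper can produce both bodies. `build_respond_payload` is a hypothetical name, the set of outcome types is taken from the "What Gets Created" table, and attaching spend only to `convert` outcomes is an assumption made for this example, not a documented API rule.

```python
# Outcome types listed in the "What Gets Created" table.
OUTCOME_TYPES = {"impression", "click", "accept", "convert", "dismiss"}


def build_respond_payload(customer_id, decision_id, offer_id, outcome_type, spend=None):
    """Build a /respond request body for the email channel."""
    if outcome_type not in OUTCOME_TYPES:
        raise ValueError(f"unknown outcome type: {outcome_type}")
    payload = {
        "customerId": customer_id,
        "decisionId": decision_id,
        "offerId": offer_id,
        "outcomeType": outcome_type,
        "channel": "email",
    }
    # Assumption: spend is only meaningful for conversions, mirroring the
    # conversion example above (conversionValue plus context.spend).
    if outcome_type == "convert" and spend is not None:
        payload["conversionValue"] = spend
        payload["context"] = {"source": "purchase", "spend": spend}
    return payload
```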
Fetch the Experiment
curl "http://localhost:3000/api/v1/experiments/<experiment-id>" \
-H "X-Requested-With: XMLHttpRequest"
Experiment Results
{
"id": "<experiment-id>",
"name": "Hillstrom: Three-Arm Email Test",
"status": "active",
"metrics": {
"champion": {
"model": "hillstrom-scorecard",
"impressions": 165,
"conversions": 25,
"conversionRate": 0.1515,
"avgSpend": 98.50
},
"challengers": [
{
"model": "hillstrom-bayesian",
"impressions": 165,
"conversions": 28,
"conversionRate": 0.1697,
"avgSpend": 104.20,
"uplift": {
"absoluteLift": 0.0182,
"relativeLift": 0.12,
"zScore": 1.45,
"pValue": 0.074,
"significant": false
}
}
]
}
}
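The `absoluteLift` and `relativeLift` values in this sample follow directly from the impression and conversion counts. The short Python sketch below reproduces them; `lift` is a hypothetical helper name. The `zScore` and `pValue` fields are not recomputed, because this section does not state which statistical test produces them.

```python
def lift(champion: dict, challenger: dict) -> dict:
    """Absolute and relative conversion-rate lift of challenger over champion."""
    base = champion["conversions"] / champion["impressions"]
    test = challenger["conversions"] / challenger["impressions"]
    if base == 0:
        raise ValueError("relative lift is undefined when the champion rate is 0")
    absolute = test - base
    return {
        "absoluteLift": round(absolute, 4),
        "relativeLift": round(absolute / base, 2),
    }


champion = {"impressions": 165, "conversions": 25}
challenger = {"impressions": 165, "conversions": 28}
print(lift(champion, challenger))
# {'absoluteLift': 0.0182, 'relativeLift': 0.12}
```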
Retrain the Winning Model
curl -X POST "http://localhost:3000/api/v1/algorithm-models/<model-id>/train" \
-H "Content-Type: application/json" \
-H "X-Requested-With: XMLHttpRequest" \
-d '{
"trainingConfig": {
"sampleSize": 1000,
"validationSplit": 0.2,
"targetField": "conversion",
"features": ["recency_months", "history_amount", "mens_flag", "womens_flag", "newbie_flag"]
}
}'
Customer Schema
The hillstrom_customers schema captures purchase history and demographic signals:
| Field | Type | Description |
|---|---|---|
| customer_id | varchar(20) | Unique identifier (HILL-000000 format) |
| recency_months | integer | Months since last purchase (1-12) |
| history_segment | varchar(30) | Spend tier: “$0 - $100” through “$1,000+” |
| history_amount | numeric(10,2) | Total historical purchase amount |
| mens_flag | integer | 1 = purchased men’s merchandise |
| womens_flag | integer | 1 = purchased women’s merchandise |
| zip_code | varchar(20) | Urban / Suburban / Rural |
| newbie_flag | integer | 1 = new customer (acquired in the past 12 months) |
Gender flag distribution: ~40% of customers have only mens_flag=1, ~40% have only womens_flag=1, and ~20% have both flags set to 0. The two flags are never both 1.
Results Schema
The hillstrom_results schema records the 3-arm experiment outcomes:
| Field | Type | Description |
|---|---|---|
| customer_id | varchar(20) | Links to hillstrom_customers |
| segment | varchar(30) | “Mens E-Mail”, “Womens E-Mail”, or “No E-Mail” |
| visit | integer | 1 = visited site within 2 weeks |
| conversion | integer | 1 = purchased within 2 weeks |
| spend | numeric(10,2) | Total spend amount (0.00 if no conversion) |
Expected Outcome Rates
| Segment | Share | Visit Rate | Conversion Rate |
|---|---|---|---|
| Mens E-Mail | ~33% | 15% | 5% |
| Womens E-Mail | ~33% | 15% | 5% |
| No E-Mail (control) | ~34% | 10% | 2% |
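To compare the seeded rows against these expected rates, the `hillstrom_results` rows can be grouped by `segment`. The following Python sketch assumes the rows are available as dicts keyed by the schema's field names; how they are exported is not covered here. `segment_summary` is a hypothetical helper name. It also reports spend per customer, which the `spend` column makes possible.

```python
from collections import defaultdict


def segment_summary(rows):
    """Aggregate visit rate, conversion rate, and spend per customer by segment."""
    totals = defaultdict(lambda: {"n": 0, "visits": 0, "conversions": 0, "spend": 0.0})
    for row in rows:
        t = totals[row["segment"]]
        t["n"] += 1
        t["visits"] += row["visit"]
        t["conversions"] += row["conversion"]
        t["spend"] += float(row["spend"])
    return {
        segment: {
            "visit_rate": t["visits"] / t["n"],
            "conversion_rate": t["conversions"] / t["n"],
            "spend_per_customer": t["spend"] / t["n"],
        }
        for segment, t in totals.items()
    }
```

With 1,000 rows, each segment holds only about 330 customers, so observed rates can differ noticeably from the expected values in the table.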
Scorecard Model
The scorecard uses three deterministic rules:
| Rule | Field | Condition | Points | Rationale |
|---|---|---|---|---|
| r1 | recency_months | ≤ 6 | +15 | Recent buyers are more responsive |
| r2 | history_amount | ≥ 200 | +20 | Higher spenders convert more |
| r3 | newbie_flag | = 0 | +10 | Established customers have brand affinity |
Base score: 50 | Max: 100 | Normalization: min-max. The three rules add at most 45 points, so the highest reachable score is 0.95.
Score Examples
| Customer Profile | Recency | History $ | Newbie | Points | Score |
|---|---|---|---|---|---|
| High-value recent buyer | 3 (+15) | $450 (+20) | 0 (+10) | 50+45 | 0.95 |
| New customer, low spend | 2 (+15) | $75 (0) | 1 (0) | 50+15 | 0.65 |
| Lapsed customer | 11 (0) | $320 (+20) | 0 (+10) | 50+30 | 0.80 |
| Worst case | 12 (0) | $50 (0) | 1 (0) | 50+0 | 0.50 |
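The rules and examples above can be reproduced with a few lines of Python. This is a sketch, not the platform's implementation. `scorecard_score` is a hypothetical name, and dividing the point total by the stated maximum of 100 is an assumption inferred from the example scores in the table.

```python
def scorecard_score(recency_months: int, history_amount: float, newbie_flag: int) -> float:
    """Apply rules r1-r3 to a base of 50 points and scale to the 0-1 range."""
    points = 50                   # base score
    if recency_months <= 6:       # r1: recent buyer
        points += 15
    if history_amount >= 200:     # r2: higher spender
        points += 20
    if newbie_flag == 0:          # r3: established customer
        points += 10
    return points / 100           # assumed scaling: points over the max of 100


print(scorecard_score(3, 450, 0))   # 0.95  (high-value recent buyer)
print(scorecard_score(2, 75, 1))    # 0.65  (new customer, low spend)
print(scorecard_score(11, 320, 0))  # 0.8   (lapsed customer)
print(scorecard_score(12, 50, 1))   # 0.5   (worst case)
```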
Bayesian Model
A Naive Bayes classifier with 5 predictors including gender flags:
| Feature | Importance | Type | Role |
|---|---|---|---|
| recency_months | 0.25 | Continuous (binned) | Purchase recency signal |
| history_amount | 0.30 | Continuous (binned) | Strongest predictor — spending power |
| mens_flag | 0.15 | Binary | Gender merchandise preference |
| womens_flag | 0.15 | Binary | Gender merchandise preference |
| newbie_flag | 0.15 | Binary | Customer tenure signal |
The Bayesian model incorporates mens_flag and womens_flag directly, enabling gender-aware propensity estimation. The scorecard does not use gender — this is the key difference tested by the experiment.
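To make the Naive Bayes idea concrete, here is a small, self-contained Python sketch of a categorical Naive Bayes classifier with add-one smoothing. It illustrates the general technique only and is not the platform's model. The function names are hypothetical, and continuous fields such as recency_months and history_amount would need to be binned first, with bin edges that this section does not specify.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, features, target="conversion"):
    """Count class priors and per-feature value frequencies from labelled rows."""
    class_counts = Counter(row[target] for row in rows)
    value_counts = defaultdict(Counter)   # (feature, label) -> Counter of values
    values_seen = defaultdict(set)        # feature -> distinct values in training
    for row in rows:
        for feature in features:
            value_counts[(feature, row[target])][row[feature]] += 1
            values_seen[feature].add(row[feature])
    return {"features": list(features), "class_counts": class_counts,
            "value_counts": value_counts, "values_seen": values_seen}


def predict_proba(model, row, positive=1):
    """Estimate P(target == positive | row) using add-one (Laplace) smoothing."""
    total = sum(model["class_counts"].values())
    joint = {}
    for label, n_label in model["class_counts"].items():
        p = n_label / total                      # class prior
        for feature in model["features"]:
            n_values = len(model["values_seen"][feature])
            seen = model["value_counts"][(feature, label)][row[feature]]
            p *= (seen + 1) / (n_label + n_values)  # smoothed likelihood
        joint[label] = p
    return joint.get(positive, 0.0) / sum(joint.values())
```

Because mens_flag and womens_flag enter as ordinary features, the estimated propensity can differ between otherwise identical customers, which the scorecard cannot do.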
Three-Arm Experiment
| Arm | Model | Traffic % | Purpose |
|---|---|---|---|
| Champion | hillstrom-scorecard | 33% | Scorecard baseline |
| Challenger 1 | hillstrom-bayesian | 33% | Test Bayesian classifier |
| Challenger 2 (Control) | hillstrom-scorecard | 34% | Control arm (reuses scorecard) |
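One common way to implement a fixed 33/33/34 split is to hash each customer ID into one of 100 buckets. The Python sketch below shows that approach under stated assumptions: the platform's actual assignment method is not described in this section, and the arm labels, the `salt` value, and `assign_arm` are all hypothetical.

```python
import hashlib

# (arm label, traffic percent) taken from the table above.
ARMS = [("champion", 33), ("challenger_1", 33), ("challenger_2_control", 34)]


def assign_arm(customer_id: str, salt: str = "hillstrom-three-arm") -> str:
    """Map a customer ID to an arm by hashing it into one of 100 buckets."""
    digest = hashlib.sha256(f"{salt}:{customer_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100    # integer in 0..99
    upper = 0
    for arm, percent in ARMS:
        upper += percent              # cumulative boundary: 33, 66, 100
        if bucket < upper:
            return arm
    raise AssertionError("traffic percentages must sum to 100")
```

Hash-based assignment is deterministic, so the same customer always lands in the same arm without storing the assignment.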
Expected Email vs No-Email Uplift
| Metric | Email Groups (combined) | No E-Mail (control) | Absolute Uplift | Relative Uplift |
|---|---|---|---|---|
| Visit Rate | 15% | 10% | +5 pp | +50% |
| Conversion Rate | 5% | 2% | +3 pp | +150% |
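The uplift columns follow from the two rate columns. As a worked check, with r_email and r_control used here as shorthand for the combined email rate and the control rate:

```latex
\begin{align*}
\text{absolute uplift} &= r_{\text{email}} - r_{\text{control}} \\
\text{relative uplift} &= \frac{r_{\text{email}} - r_{\text{control}}}{r_{\text{control}}} \\[4pt]
\text{visit:}\quad & 15\% - 10\% = 5\ \text{pp}, \qquad \frac{5}{10} = 50\% \\
\text{conversion:}\quad & 5\% - 2\% = 3\ \text{pp}, \qquad \frac{3}{2} = 150\%
\end{align*}
```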
What to Look For
- Single channel: Email-only simplifies the analysis — no channel arbitration or multi-channel routing.
- Gender-segmented offers: The men’s promo (priority 85) outranks the women’s promo (priority 80). A customer with mens_flag=1 gets the men’s email first.
- Spend tracking: The spend column in hillstrom_results enables revenue-based analysis, not just conversion rate. Use conversionValue in the Respond API to track this.
- 3-arm A/B/C: Unlike binary treatment/control, this tests whether men’s email, women’s email, or no email performs best.
- 7 history tiers: From “$0 - $100” to “$1,000+”, providing rich customer segmentation by purchase amount.
- Scorecard vs Bayesian: The scorecard is interpretable but ignores gender. The Bayesian model uses gender flags, making it better at matching email content to customer preference.
Key Differences from Starbucks
| Aspect | Hillstrom | Starbucks |
|---|---|---|
| Channels | 1 (email only) | 4 (web, mobile, email, social) |
| Offers | 2 | 10 |
| Qualification rules | 0 | 2 |
| Contact policies | 0 | 2 |
| Experiment arms | 3 (A/B/C with no-email control) | 2 (80/20 champion/challenger) |
| Spend tracking | In results schema | Via conversionValue only |
| Gender segmentation | Core feature | Not present |
| History tiers | 7 tiers | Not present |