Seed Datasets API - KaireonAI

The Seed Datasets API provides pre-built dataset packs that populate the platform with realistic sample data. Each pack includes schemas, categories, offers, channels, creatives, qualification rules, contact policies, algorithm models, decision flows, segments, and synthetic customer/interaction data.

See the Sample Data guide for a walkthrough of using seed datasets.

Base path

/api/v1/seed-dataset

List available datasets

GET /api/v1/seed-dataset

Returns all registered dataset packs with their metadata and current load status.

Response `200`

{
  "datasets": [
    {
      "key": "starbucks",
      "name": "Starbucks Offers",
      "description": "Full NBA pipeline with Starbucks loyalty offers — multi-channel delivery, Thompson Bandit, segments.",
      "source": "kaggle",
      "csvFiles": [],
      "testingFocus": "Multi-channel loyalty offers with frequency capping",
      "schemaCount": 3,
      "offerCount": 10,
      "modelCount": 3,
      "channelCount": 6,
      "categoryCount": 3,
      "creativeCount": 60,
      "loaded": false
    }
  ],
  "currentlyLoaded": null
}

Field reference

Field	Type	Description
`key`	string	Unique dataset identifier used in load/delete URLs.
`name`	string	Human-readable dataset name.
`description`	string	Short description of the dataset.
`source`	string	Data source type (e.g., `"kaggle"` or `"synthetic"`).
`csvFiles`	array	List of CSV file paths included in the dataset pack.
`testingFocus`	string	What this dataset is best suited for testing.
`schemaCount`	integer	Number of data schemas in the pack.
`offerCount`	integer	Number of offers in the pack.
`modelCount`	integer	Number of algorithm models in the pack.
`channelCount`	integer	Number of channels in the pack.
`categoryCount`	integer	Number of categories in the pack.
`creativeCount`	integer	Number of creatives in the pack.
`loaded`	boolean	Whether this dataset is currently loaded for the tenant.
`currentlyLoaded`	string \| null	Key of the currently loaded dataset, or `null` if none (top-level field).
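
As an illustration of how a client might consume this response, the following Python sketch summarizes the available packs and reports which one is loaded. The helper name `summarize_datasets` and the constant `SAMPLE` are hypothetical; the payload is a trimmed copy of the example response above.

```python
import json

# Trimmed copy of the example response shown above.
SAMPLE = json.loads("""
{
  "datasets": [
    {"key": "starbucks", "name": "Starbucks Offers", "offerCount": 10,
     "channelCount": 6, "loaded": false}
  ],
  "currentlyLoaded": null
}
""")


def summarize_datasets(payload: dict) -> dict:
    """Return the loaded key plus a key -> name map of available packs."""
    datasets = payload.get("datasets", [])
    return {
        # Top-level field: key of the loaded pack, or None.
        "loaded": payload.get("currentlyLoaded"),
        "available": {d["key"]: d["name"] for d in datasets},
        # Cross-check using each pack's own `loaded` flag.
        "flagged_loaded": [d["key"] for d in datasets if d.get("loaded")],
    }


if __name__ == "__main__":
    print(summarize_datasets(SAMPLE))
```

Comparing `loaded` with `flagged_loaded` is one way to notice a response in which the top-level field and the per-pack flags disagree.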

Load a dataset

POST /api/v1/seed-dataset/{key}

Loads a dataset pack into the platform. Creates all entities in correct foreign-key dependency order: schemas, categories, channels, offers, creatives, rules, models, decision flows, segments, synthetic data rows, and interaction history.

Path parameters

Parameter	Required	Type	Description
`key`	Yes	string	Dataset key (e.g., `"starbucks"`).

Query parameters

Parameter	Required	Type	Description
`force`	No	string	Set to `"true"` to replace a currently loaded dataset.

Response `201`

{
  "message": "Starbucks Rewards loaded successfully",
  "counts": {
    "schemas": 2,
    "categories": 3,
    "subCategories": 6,
    "channels": 4,
    "offers": 12,
    "creatives": 24,
    "qualificationRules": 5,
    "contactPolicies": 3,
    "outcomeTypes": 10,
    "models": 2,
    "experiments": 1,
    "decisionFlows": 1,
    "segments": 1,
    "customers_rows": 1000,
    "transactions_rows": 1000,
    "interactions": 500,
    "interactionSummaries": 500,
    "segmentCustomers": 450
  },
  "datasetKey": "starbucks"
}

Error codes

Code	Reason
`404`	Dataset key not found in registry.
`409`	Another dataset is already loaded (use `?force=true` to replace).
`409`	Same dataset is already loaded.
`429`	Rate limited (5 requests per 60 seconds).
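
To stay under the documented limit of 5 requests per 60 seconds, a client could throttle itself before sending. The sketch below uses a sliding window, which is an assumption: this page does not say how the server measures its window. The class name `SlidingWindowThrottle` is hypothetical.

```python
import time
from collections import deque


class SlidingWindowThrottle:
    """Client-side guard: at most `limit` calls per `window` seconds."""

    def __init__(self, limit: int = 5, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock          # injectable so the logic can be tested
        self._stamps = deque()      # send times of recent requests

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.limit:
            return 0.0
        return self.window - (now - self._stamps[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self._stamps.append(self.clock())
```

A caller would sleep for `wait_time()` seconds, send the request, and then call `record()`. A `429` can still occur if other clients share the same limit, so the response code should be checked regardless.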

Response `409` (dataset conflict)

{
  "status": 409,
  "currentlyLoaded": "banking",
  "requestedLoad": "starbucks",
  "message": "Banking NBA is currently loaded. Add ?force=true to remove it and load Starbucks Rewards."
}
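
The load flow, including the `force` retry, can be sketched as a small decision function. This is a minimal illustration rather than an official client: `load_url` and `next_action` are hypothetical names, the host is taken from the curl example in the Upload CSV data section, and the check for the "same dataset already loaded" case assumes that response also carries `currentlyLoaded`, which this page does not confirm.

```python
from urllib.parse import quote, urlencode

# Host taken from the curl example on this page.
BASE = "https://playground.kaireonai.com/api/v1/seed-dataset"


def load_url(key: str, force: bool = False) -> str:
    """Build the POST URL for loading a dataset, optionally with ?force=true."""
    url = f"{BASE}/{quote(key, safe='')}"
    return f"{url}?{urlencode({'force': 'true'})}" if force else url


def next_action(key: str, status: int, body: dict, allow_replace: bool = False) -> str:
    """Decide what to do after a POST to load_url(key)."""
    if status == 201:
        return "done"
    if status == 409:
        # Assumption: both 409 variants include `currentlyLoaded`.
        if body.get("currentlyLoaded") == key:
            return "already-loaded"
        return "retry-with-force" if allow_replace else "abort"
    if status == 429:
        return "wait"
    return "error"
```

Keeping `allow_replace` off by default means a script never removes a loaded dataset unless the caller opts in explicitly.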

Remove a dataset

DELETE /api/v1/seed-dataset/{key}

Removes all entities belonging to a dataset pack in reverse foreign-key dependency order. Drops associated PostgreSQL tables and segment views.

Path parameters

Parameter	Required	Type	Description
`key`	Yes	string	Dataset key to remove.

Response `200`

{
  "message": "Starbucks Rewards removed successfully",
  "counts": {
    "interactions": 500,
    "interactionSummaries": 500,
    "creatives": 24,
    "offers": 12,
    "experiments": 1,
    "decisionFlows": 1,
    "channels": 4,
    "categories": 3,
    "qualificationRules": 5,
    "contactPolicies": 3,
    "models": 2,
    "runs": 0,
    "segments": 1,
    "schemas": 2
  }
}
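
One possible sanity check after a removal is to compare the `counts` object from the earlier load response with the one returned here, for the keys the two share. The sketch below is illustrative only: `count_mismatches` is a hypothetical helper, and this page does not promise that the counts match (for example, interactions recorded after the load could raise the removed total).

```python
# Counts copied from the example load (201) and remove (200) responses on this page.
LOAD_COUNTS = {
    "schemas": 2, "categories": 3, "channels": 4, "offers": 12,
    "creatives": 24, "models": 2, "segments": 1, "interactions": 500,
}
REMOVE_COUNTS = {
    "interactions": 500, "creatives": 24, "offers": 12, "channels": 4,
    "categories": 3, "models": 2, "runs": 0, "segments": 1, "schemas": 2,
}


def count_mismatches(loaded: dict, removed: dict) -> dict:
    """Map each shared key whose counts differ to (loaded, removed)."""
    shared = loaded.keys() & removed.keys()
    return {
        key: (loaded[key], removed[key])
        for key in sorted(shared)
        if loaded[key] != removed[key]
    }


if __name__ == "__main__":
    print(count_mismatches(LOAD_COUNTS, REMOVE_COUNTS))
```

Keys that appear in only one response, such as `runs` or `customers_rows`, are ignored rather than reported as mismatches.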

Error codes

Code	Reason
`404`	Dataset key not found in registry.

Upload CSV data

POST /api/v1/seed-dataset/{key}/upload

Uploads a CSV file to replace the data in a specific schema table belonging to a loaded dataset. Existing rows in the target table are truncated before the new data is inserted. The dataset's `mapCsvRow()` function transforms each CSV row into the correct schema format.

Path parameters

Parameter	Required	Type	Description
`key`	Yes	string	Dataset key (e.g., `"starbucks"`).

Request body (multipart/form-data)

Field	Required	Type	Description
`file`	Yes	File	CSV file to upload. Maximum size: 50 MB.
`schema`	Yes	string	Target schema name (must be part of the dataset pack).

Example

curl -X POST https://playground.kaireonai.com/api/v1/seed-dataset/starbucks/upload \
  -H "X-Tenant-Id: my-tenant" \
  -H "X-User-Role: admin" \
  -F "file=@customers.csv" \
  -F "schema=starbucks_customers"

Response `200`

{
  "message": "Uploaded 5000 rows to starbucks_customers",
  "rowsInserted": 5000,
  "schemaName": "starbucks_customers",
  "datasetKey": "starbucks"
}

Error codes

Code	Reason
`400`	Missing `file` or `schema` form field.
`400`	Schema name not part of the dataset pack.
`400`	Schema not loaded in the database (load the dataset first).
`400`	CSV parse error.
`400`	No valid rows after mapping.
`400`	Dataset does not support CSV upload (no `mapCsvRow` function).
`404`	Dataset key not found in registry.
`413`	File exceeds 50 MB size limit.
`429`	Rate limited (5 requests per 60 seconds).
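
For callers who prefer Python over curl, the same upload can be sketched with the standard library alone. This is an approximation under stated assumptions: `build_multipart` and `build_upload_request` are hypothetical helpers, the header names come from the curl example above, and the 50 MB limit is interpreted as 50 * 1024 * 1024 bytes, which this page does not specify.

```python
import urllib.request
import uuid

MAX_BYTES = 50 * 1024 * 1024  # assumption: "50 MB" means 50 * 1024 * 1024 bytes


def build_multipart(schema: str, filename: str, data: bytes):
    """Return (body, content_type) carrying the `schema` and `file` form fields."""
    if len(data) > MAX_BYTES:
        raise ValueError("file is larger than 50 MB; the API documents a 413 for this")
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="schema"\r\n\r\n'
        f"{schema}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: text/csv\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def build_upload_request(url, schema, filename, data, tenant_id, role="admin"):
    """Build (but do not send) the POST request for the upload endpoint."""
    body, content_type = build_multipart(schema, filename, data)
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": content_type,
            "X-Tenant-Id": tenant_id,   # header names as in the curl example
            "X-User-Role": role,
        },
    )
```

Passing the returned request to `urllib.request.urlopen` would send it. Checking the size locally avoids transferring a file that the server would reject with `413`.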

Poll seed progress

GET /api/v1/seed-dataset/{key}/status

Reads the current seeding status for an in-flight or completed Load a dataset call. The route at src/app/api/v1/seed-dataset/[key]/status/route.ts:14-45 reads the PlatformSetting row keyed by (tenantId, "seed", "seed_progress") and parses its JSON value. When no row exists, the route returns the idle baseline, so the UI does not need to handle a missing-row case.

Path parameters

Parameter	Required	Type	Description
`key`	Yes	string	Dataset key (e.g., `"starbucks"`). Currently informational: the row key is per-tenant, not per-dataset.

Response — In flight

{
  "status": "running",
  "step": "Loading interaction history",
  "progress": 65
}

Response — Idle (no seed in progress, or row missing)

Returned at route.ts:29-33 and route.ts:39-43.

{
  "status": "idle",
  "step": "",
  "progress": 0
}

Field reference

Field	Type	Description
`status`	string	Operator-defined state from the seed orchestrator. Common values: `"idle"`, `"running"`, `"completed"`, `"failed"`. The route does not validate this set; whatever the orchestrator wrote into PlatformSetting.value is returned as-is.
`step`	string	Human-readable label for the current step. Empty string when idle.
`progress`	number	Integer 0-100 reflecting per-step progress. Always 0 when idle.
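
Because the route returns whatever the orchestrator wrote, a client may want to normalize the payload before using it. The following sketch shows one defensive approach; `normalize_progress` and `KNOWN_STATUSES` are hypothetical names, and clamping `progress` to 0-100 is a client-side choice rather than documented server behavior.

```python
# The common status values listed in this section.
KNOWN_STATUSES = {"idle", "running", "completed", "failed"}


def normalize_progress(payload: dict) -> dict:
    """Coerce a status payload into predictable types and ranges."""
    status = str(payload.get("status", "idle"))
    step = str(payload.get("step", ""))
    try:
        progress = int(payload.get("progress", 0))
    except (TypeError, ValueError):
        progress = 0  # unparseable value: fall back to the idle baseline
    progress = max(0, min(100, progress))  # clamp to the documented 0-100 range
    return {
        "status": status,
        "known": status in KNOWN_STATUSES,  # False for unexpected states
        "step": step,
        "progress": progress,
    }
```

The `known` flag lets a UI show an unexpected state verbatim instead of failing on it.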

Status codes

Code	When	Source
200	Returns the progress object (or idle baseline)	`route.ts:37, 42`
401 / 403	Caller fails authentication or role check	`route.ts:18`

Roles

Roles accepted by this endpoint: `admin`, `editor`, `viewer`.

Polling cadence is up to the caller. The seed orchestrator updates the row at every step boundary, so sub-second polls will mostly see no change between updates. A 1-2 second interval is sufficient for the UI progress bar.
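
A polling loop at roughly this interval might look like the sketch below. It is a minimal illustration under assumptions: `poll_until_done` is a hypothetical helper, `fetch` stands for any caller-supplied function that returns the parsed status JSON, and treating only `"completed"` and `"failed"` as terminal is a design choice, since this page does not say which state the row holds once a seed finishes.

```python
import time


def poll_until_done(fetch, interval=1.5, timeout=300.0,
                    terminal=("completed", "failed"),
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch() every `interval` seconds until a terminal status appears.

    Raises TimeoutError if no terminal status is seen within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        state = fetch()
        if state.get("status") in terminal:
            return state
        if clock() >= deadline:
            raise TimeoutError("seed did not reach a terminal status in time")
        sleep(interval)


if __name__ == "__main__":
    # Simulated responses stand in for real HTTP calls.
    fake = iter([
        {"status": "running", "step": "Loading interaction history", "progress": 65},
        {"status": "completed", "step": "", "progress": 100},
    ])
    print(poll_until_done(lambda: next(fake), interval=0.01))
```

Injecting `fetch`, `sleep`, and `clock` keeps the loop independent of any particular HTTP library and makes it testable without a network.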

Role requirements

Method	Minimum role
GET	`admin`
POST	`admin`
DELETE	`admin`

Loading a dataset creates real PostgreSQL tables populated with synthetic rows. Use it in a production environment only for testing.
