Overview

Auto-Segmentation analyzes your schema data to discover natural customer segments. Instead of manually defining segments based on assumptions, you select a schema and a set of fields, and the AI identifies clusters of customers with similar characteristics. Navigate to AI > Auto-Segmentation in the sidebar.

How It Works

1. Select a schema. Choose the schema table that contains the customer data you want to segment (e.g., customers, credit_profiles). The schema must have been created in the Data module and contain rows.
2. Pick fields. Select the fields to include in the segmentation analysis. Choose numeric and categorical fields that are likely to differentiate customer groups — for example, age, income, total_spend, region, account_type. Select 3 to 8 fields for the best results: too few fields produce trivial segments; too many dilute the signal and slow analysis.
3. Run discovery. Click Discover Segments. The AI analyzes the data and identifies clusters of customers with similar characteristics.
4. Review segment cards. The results appear as segment cards, each representing a discovered cluster. Review the cards to understand what makes each segment distinct.

Dual-Tier Analysis

| Tier | When Used | Method |
| --- | --- | --- |
| LLM | Default, or when the ML Worker is unavailable | Samples up to 1,000 rows and uses the LLM to identify patterns and propose segment definitions |
| ML Worker | When connected and the dataset exceeds 5,000 rows | Runs K-Means clustering on the full dataset with silhouette scoring to determine the optimal number of clusters |
The ML Worker tier is more accurate because it processes the entire dataset and uses statistical validation (silhouette score) to find the natural number of clusters rather than guessing.
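
The silhouette-based selection the ML Worker performs can be illustrated with a toy, dependency-free sketch. The helper names (`kmeans_1d`, `silhouette`) and the 1-D spend data are illustrative only, not the product's implementation:

```python
def kmeans_1d(values, k, iters=20):
    """Toy 1-D K-Means with deterministic, spread-out initial centroids."""
    vs = sorted(values)
    centroids = [vs[i * (len(vs) - 1) // (k - 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest centroid, then recompute centroids.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        for c in range(k):
            members = [v for v, l in zip(values, labels) if l == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels

def silhouette(values, labels):
    """Mean silhouette score: (b - a) / max(a, b) averaged over all points."""
    total = 0.0
    for i, v in enumerate(values):
        same = [abs(v - w) for j, (w, l) in enumerate(zip(values, labels))
                if l == labels[i] and j != i]
        a = sum(same) / len(same) if same else 0.0
        others = set(labels) - {labels[i]}
        # b = smallest mean distance to any other cluster
        b = min(sum(abs(v - w) for w, l in zip(values, labels) if l == c)
                / sum(1 for l in labels if l == c) for c in others)
        total += (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return total / len(values)

# Two well-separated spend groups: silhouette should favor k=2.
spend = [100, 110, 120, 130, 900, 910, 920, 930]
scores = {k: silhouette(spend, kmeans_1d(spend, k)) for k in (2, 3, 4)}
best_k = max(scores, key=scores.get)  # → 2 for this data
```

Trying each k in a range and keeping the silhouette-maximizing value is what "statistical validation to find the natural number of clusters" means in practice.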

Interpreting Segment Cards

Each segment card displays:
  • Segment name — An AI-generated descriptive name (e.g., “High-Value Loyalists”, “Price-Sensitive New Users”)
  • Size — Number of customers in the segment and percentage of total
  • Key characteristics — The defining attributes of the segment, showing how its averages compare to the population
  • Distinguishing features — Which fields most differentiate this segment from others
  • Suggested actions — AI-generated recommendations for how to target this segment
Example card:
High-Value Loyalists — 2,340 customers (18%)
  • Average spend: $1,250 (vs. $480 population avg)
  • Average tenure: 4.2 years (vs. 1.8 population avg)
  • Primary channel: Email (72%)
  • Suggested: Premium offers, loyalty rewards, lower contact frequency

Applying Segments

When you click Apply on a segment:
  1. A recommendation is created in the AI Insights dashboard
  2. The recommendation includes the segment definition (field conditions)
  3. Applying from Insights creates a draft qualification rule that identifies customers matching the segment criteria
You can use these qualification rules in decision flows to target specific segments with tailored offers.
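
As a sketch of how a segment definition's field conditions might translate into a matching rule — the `segment` dict shape and the `qualifies` helper here are hypothetical, since the product's actual rule format is internal:

```python
# Hypothetical segment definition: a list of field conditions, as a
# recommendation might encode them (illustrative shape, not the real schema).
segment = {
    "name": "High-Value Loyalists",
    "conditions": [
        {"field": "total_spend", "op": ">=", "value": 1000},
        {"field": "tenure_years", "op": ">=", "value": 3},
    ],
}

OPS = {
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    "==": lambda a, b: a == b,
}

def qualifies(customer, segment):
    """True when the customer satisfies every condition (AND semantics)."""
    return all(OPS[c["op"]](customer[c["field"]], c["value"])
               for c in segment["conditions"])

loyalist = {"total_spend": 1250, "tenure_years": 4.2}
newcomer = {"total_spend": 480, "tenure_years": 1.8}
```

A decision flow would evaluate such a rule per customer to route them to segment-specific offers.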

Field Selection Tips

  • Numeric fields work best — Income, age, spend, tenure, and score fields produce clearer clusters
  • Limit categorical fields — High-cardinality categoricals (e.g., zip code) add noise. Prefer broad categories like region or account type
  • Exclude IDs — Do not include customer_id, email, or other unique identifiers as segmentation fields
  • Include behavioral data — If you have interaction summaries (total_clicks, avg_response_rate), include them for behaviorally meaningful segments
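
The tips above can be expressed as a rough programmatic filter. The `pick_segmentation_fields` helper and its cardinality threshold are illustrative heuristics, not the product's selection logic:

```python
def pick_segmentation_fields(rows, max_cardinality=20):
    """Heuristic field filter: drop ID-like fields (every value unique)
    and high-cardinality categoricals; keep numerics and broad categories."""
    keep = []
    for field in rows[0].keys():
        values = [r[field] for r in rows]
        distinct = len(set(values))
        if distinct == len(rows):
            continue  # looks like a unique identifier -> exclude
        if isinstance(values[0], (int, float)):
            keep.append(field)  # numeric fields work best
        elif distinct <= max_cardinality:
            keep.append(field)  # broad categorical (e.g., region)
    return keep

rows = [
    {"customer_id": 1, "income": 52000, "region": "West", "email": "a@x.com"},
    {"customer_id": 2, "income": 61000, "region": "East", "email": "b@x.com"},
    {"customer_id": 3, "income": 52000, "region": "West", "email": "c@x.com"},
]
fields = pick_segmentation_fields(rows)  # income and region survive
```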

Advanced Parameters

Each segmentation run can be fine-tuned using the Advanced Parameters panel on the segmentation page. Expand the panel to adjust:
| Parameter | Default | Description |
| --- | --- | --- |
| Min Clusters | 2 | Fewest groups to split customers into |
| Max Clusters | 8 | Most groups customers can be split into |
| Algorithm | kmeans | Clustering method (kmeans or dbscan) |
| Included Features | All | Which attributes to consider (null = all) |
Per-run overrides do not change your saved tenant configuration. To change the organization-wide defaults, go to AI Configuration.
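
A per-run override might be assembled like this. The `build_run_config` helper and parameter keys are hypothetical, mirroring the table above rather than a documented API:

```python
# Tenant defaults mirroring the Advanced Parameters table (illustrative keys).
defaults = {
    "min_clusters": 2,
    "max_clusters": 8,
    "algorithm": "kmeans",
    "included_features": None,  # null = consider all attributes
}

def build_run_config(**overrides):
    """Merge per-run overrides onto defaults without mutating the defaults."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise ValueError(f"unknown parameters: {unknown}")
    return {**defaults, **overrides}

run = build_run_config(algorithm="dbscan",
                       included_features=["income", "total_spend"])
```

Note that `defaults` is left untouched, matching the rule that per-run overrides never change the saved tenant configuration.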

Large Dataset Warning

When the selected schema contains 5,000 or more rows, a confirmation dialog appears before analysis begins. The dialog shows:
  • Accuracy comparison — ML Worker uses K-Means with silhouette scoring on the full dataset vs. LLM sampling
  • Estimated cost — Token count and approximate cost if proceeding with LLM
  • Speed comparison — ML Worker processes locally in seconds vs. LLM round-trip
You can choose Use ML Worker (recommended for large datasets) or Proceed with LLM (uses sampled data).
For datasets over 5,000 rows, the ML Worker is strongly recommended. LLM-based analysis samples up to 1,000 rows, which may miss important patterns. See ML Worker Setup for deployment instructions.
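
The LLM tier's 1,000-row cap amounts to a uniform sample, which is why large datasets risk missing patterns. A minimal sketch (the `sample_rows` helper is illustrative; the product's actual sampling strategy isn't documented):

```python
import random

def sample_rows(rows, cap=1000, seed=0):
    """Return at most `cap` rows, sampled uniformly when the dataset
    exceeds the cap; small datasets pass through untouched."""
    if len(rows) <= cap:
        return list(rows)
    return random.Random(seed).sample(rows, cap)

sampled = sample_rows(list(range(6000)))  # a 6,000-row dataset gets capped
```

A small cluster (say, 2% of 6,000 rows) lands only ~20 times in such a sample, so the ML Worker's full-dataset pass is the safer choice at this scale.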

Next Steps