Overview
Auto-Segmentation analyzes your schema data to discover natural customer segments. Instead of manually defining segments based on assumptions, you select a schema and the AI identifies clusters of customers with similar characteristics. Navigate to AI > Segments in the sidebar to view the segment insights dashboard, or trigger analysis programmatically via POST /api/v1/ai/analyze/segments with a schemaId in the request body.
The Segments dashboard page (/ai/segments) automatically runs a health check and extracts segment-related findings. It shows segment data health status and overall segmentation readiness, refreshing every 5 minutes.
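As a rough illustration of the programmatic trigger described above, the sketch below builds (but does not send) the POST request. Only the path /api/v1/ai/analyze/segments and the schemaId body field come from this page; the host, the bearer-token header, and the example schema ID are placeholders assumed for illustration.

```python
import json
import urllib.request

# Assumption: placeholder host; replace with your own deployment URL.
BASE_URL = "https://your-instance.example.com"


def build_segment_request(schema_id: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a segment-analysis request for one schema."""
    body = json.dumps({"schemaId": schema_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/ai/analyze/segments",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth is hypothetical, not documented here.
            "Authorization": f"Bearer {api_token}",
        },
    )


request = build_segment_request("schema_123", "YOUR_TOKEN")
# Send with urllib.request.urlopen(request) once the host and token are real.
```

Building the request separately from sending it makes it easy to inspect the payload before any call is made.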
How It Works
1. Choose the schema table that contains the customer data you want to segment (e.g., customers, credit_profiles). The schema must have been created in the Data module and contain rows. The schemaId is required for segment analysis.
2. Select the fields to include in the segmentation analysis. Choose numeric and categorical fields that are likely to differentiate customer groups, for example age, income, total_spend, region, account_type. Fields are classified as either numeric (integer, float, decimal, number, bigint) or categorical (everything else). This classification determines which statistics are computed for the LLM prompt. Select 3 to 8 fields for the best results. Too few fields produce trivial segments; too many dilute the signal and slow analysis.
3. Click Discover Segments. The AI analyzes the field statistics and identifies clusters of customers with similar characteristics.
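The field classification and the statistics it drives can be sketched as follows. The numeric type list and the statistic names (min/max/avg, distinct/topValues) come from this page; the function names, the top_n cutoff, and the handling of missing values are assumptions for illustration, not the product's actual implementation.

```python
from collections import Counter

# Type names listed on this page as numeric; everything else is categorical.
NUMERIC_TYPES = {"integer", "float", "decimal", "number", "bigint"}


def classify_field(data_type: str) -> str:
    """Return 'numeric' or 'categorical' for a schema field type."""
    return "numeric" if data_type in NUMERIC_TYPES else "categorical"


def summarize_field(data_type: str, values: list, top_n: int = 3) -> dict:
    """Compute the per-field statistics that would feed the LLM prompt."""
    present = [v for v in values if v is not None]  # assumption: nulls are skipped
    if classify_field(data_type) == "numeric":
        if not present:
            return {"kind": "numeric", "min": None, "max": None, "avg": None}
        return {
            "kind": "numeric",
            "min": min(present),
            "max": max(present),
            "avg": sum(present) / len(present),
        }
    counts = Counter(present)
    return {
        "kind": "categorical",
        "distinct": len(counts),
        # Most frequent values first; top_n is a hypothetical cutoff.
        "topValues": [value for value, _ in counts.most_common(top_n)],
    }
```

For instance, summarize_field("integer", [10, 20, 30]) yields a min of 10, a max of 30, and an avg of 20.0.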
Dual-Tier Analysis
| Tier | When Used | Method |
|---|---|---|
| LLM | Default, or when ML Worker is unavailable | Builds a field statistics summary (min/max/avg for numeric, distinct/topValues for categorical) and uses generateObject() with the configured LLM to propose segment definitions |
| ML Worker | When connected and dataset exceeds 5,000 rows | Submits an async job to the Python ML Worker which runs K-Means clustering on the full dataset with silhouette scoring to determine the optimal number of clusters |
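The routing in the table can be expressed as a small decision function. This is a sketch only: the function and constant names are hypothetical, and it reads "exceeds 5,000 rows" as strictly greater than 5,000.

```python
ML_WORKER_ROW_THRESHOLD = 5_000  # row count from the table above


def choose_tier(row_count: int, ml_worker_connected: bool) -> str:
    """Pick the analysis tier for a dataset of the given size."""
    if ml_worker_connected and row_count > ML_WORKER_ROW_THRESHOLD:
        return "ml_worker"
    # Default tier, also used whenever the ML Worker is unavailable.
    return "llm"
```

Under these assumptions, a 10,000-row schema goes to the ML Worker only when the worker is connected; otherwise it stays on the LLM tier.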
Heuristic Fallback
If the LLM call fails, the system falls back to a deterministic heuristic segmentation:
- Finds the numeric field with the widest range (max - min)
- Splits into three segments at the 33rd and 67th percentiles: Low, Mid, and High
- Each segment gets filter rules, characteristics, and suggested marketing use cases
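The fallback steps above can be sketched as below. The widest-range selection and the 33rd/67th percentile cut points come from this page; the nearest-rank percentile method, the boundary operators (lte/gt), and the function names are assumptions, and characteristics and suggested uses are omitted for brevity.

```python
import math


def widest_numeric_field(rows: list, numeric_fields: list) -> str:
    """Return the numeric field whose (max - min) range is largest."""
    def value_range(field: str) -> float:
        values = [row[field] for row in rows if row.get(field) is not None]
        return max(values) - min(values) if values else 0
    return max(numeric_fields, key=value_range)


def percentile(sorted_values: list, pct: float):
    """Nearest-rank percentile (assumed method; the real one is not documented)."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def heuristic_segments(rows: list, numeric_fields: list) -> list:
    """Split rows into Low / Mid / High on the widest-range numeric field."""
    field = widest_numeric_field(rows, numeric_fields)
    values = sorted(row[field] for row in rows if row.get(field) is not None)
    low_cut, high_cut = percentile(values, 33), percentile(values, 67)
    return [
        {"name": "Low", "filterRules": [
            {"field": field, "operator": "lte", "value": low_cut}]},
        {"name": "Mid", "filterRules": [
            {"field": field, "operator": "gt", "value": low_cut},
            {"field": field, "operator": "lte", "value": high_cut}]},
        {"name": "High", "filterRules": [
            {"field": field, "operator": "gt", "value": high_cut}]},
    ]
```

Because the procedure uses no randomness, the same input rows always produce the same three segments.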
Segment Card Format
Each segment card (a SegmentResult object) displays:
| Field | Type | Description |
|---|---|---|
| name | string | AI-generated descriptive name (e.g., “High-Value Urban Professionals”) |
| description | string | 1-2 sentence description of who belongs to this segment |
| size | number | Estimated number of customers in the segment |
| percentage | number | Percentage of total customers (0-100) |
| filterRules | array | Machine-readable filter rules with field, operator (eq, gt, lt, gte, lte, in, contains), and value |
| characteristics | array | Key distinguishing features, each with feature (field name), value (description), and importance (0-1 score) |
| suggestedUse | string | AI-generated recommendation for how to target this segment |
High-Value Loyalists — 2,340 customers (18%)
- Average spend: 480 population avg)
- Average tenure: 4.2 years (vs. 1.8 population avg)
- Primary channel: Email (72%)
- Suggested: Premium offers, loyalty rewards, lower contact frequency
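To show how the machine-readable filterRules might be evaluated against a customer row, here is a minimal sketch. The operator names come from the table above; combining rules with AND, treating a missing field as a non-match, and the exact semantics of in and contains are assumptions rather than documented behavior.

```python
import operator

# Map each documented operator name to a comparison of (actual, expected).
OPERATORS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "lt": operator.lt,
    "gte": operator.ge,
    "lte": operator.le,
    # Assumption: "in" means the row value is one of the listed values.
    "in": lambda actual, expected: actual in expected,
    # Assumption: "contains" means the rule value occurs inside the row value.
    "contains": lambda actual, expected: expected in actual,
}


def matches_rule(row: dict, rule: dict) -> bool:
    """Check one filter rule against a customer row."""
    actual = row.get(rule["field"])
    if actual is None:
        return False  # assumption: missing values never match
    return bool(OPERATORS[rule["operator"]](actual, rule["value"]))


def matches_segment(row: dict, filter_rules: list) -> bool:
    """Assumption: a row belongs to a segment only if every rule matches."""
    return all(matches_rule(row, rule) for rule in filter_rules)
```

With rules such as total_spend gte 1000 and region in ["north", "west"], a row with total_spend of 1500 and region "west" matches, while a row with total_spend of 500 does not.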
Applying Segments
When you click Apply on a segment:
- A recommendation is created in the AI Insights Dashboard
- The recommendation includes the segment definition (filter rules)
- Applying from Insights creates a draft qualification rule that identifies customers matching the segment criteria
Field Selection Tips
- Numeric fields work best — Income, age, spend, tenure, and score fields produce clearer clusters
- Limit categorical fields — High-cardinality categoricals (e.g., zip code) add noise. Prefer broad categories like region or account type
- Exclude IDs — Do not include customer_id, email, or other unique identifiers as segmentation fields
- Include behavioral data — If you have interaction summaries (total_clicks, avg_response_rate), include them for behaviorally meaningful segments
Advanced Parameters
Each segmentation run can be fine-tuned using the Advanced Parameters panel on the segmentation page. Expand the panel to adjust:
| Parameter | Min | Max | Default | Description |
|---|---|---|---|---|
| Min Clusters | 2 | 20 | 2 | Fewest groups to split customers into |
| Max Clusters | 2 | 50 | 8 | Most groups customers can be split into |
| Algorithm | — | — | kmeans | Clustering method: kmeans (centroid-based, fast) or dbscan (density-based, handles outliers) |
| Included Features | — | — | null (all) | Which attributes to consider. Null = all numeric features |
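The ranges and defaults in the table can be checked with a small validator like the one below. The bounds, defaults, and algorithm names come from the table; the snake_case attribute names and the rule that Min Clusters must not exceed Max Clusters are assumptions added for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SegmentationParams:
    """Hypothetical container mirroring the Advanced Parameters panel."""
    min_clusters: int = 2
    max_clusters: int = 8
    algorithm: str = "kmeans"
    included_features: Optional[List[str]] = None  # None = all numeric features


def validate_params(params: SegmentationParams) -> List[str]:
    """Return a list of human-readable problems; empty means valid."""
    errors = []
    if not 2 <= params.min_clusters <= 20:
        errors.append("Min Clusters must be between 2 and 20")
    if not 2 <= params.max_clusters <= 50:
        errors.append("Max Clusters must be between 2 and 50")
    # Assumption: this cross-field check is not stated in the table.
    if params.min_clusters > params.max_clusters:
        errors.append("Min Clusters must not exceed Max Clusters")
    if params.algorithm not in {"kmeans", "dbscan"}:
        errors.append("Algorithm must be kmeans or dbscan")
    return errors
```

Returning every problem at once, rather than raising on the first, lets a caller surface all issues in a single pass.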
Large Dataset Warning
When the selected schema contains 5,000 or more rows, a confirmation dialog appears before analysis begins. The dialog shows:
- Accuracy comparison: ML Worker uses K-Means with silhouette scoring on the full dataset vs. LLM analysis of field statistics
- Estimated cost: Token count and approximate cost if proceeding with LLM
- Speed comparison: ML Worker processes locally in seconds vs. LLM round-trip
Next Steps
- AI Configuration: Configure default segmentation parameters.
- AI Insights Dashboard: View and apply segmentation recommendations.
- Smart Policy Recommender: Optimize contact frequency policies.