/data/pipelines/* page, describe what you want (“add a daily CSV
ingestion of /tmp/orders.csv into starbucks.orders”), and the assistant
produces a typed IR patch. The panel renders the patch as a colored
diff with three actions: Apply, Show full IR, Reject (reason).
How the AI stays honest
Three validation layers, all enforced server-side:
- Constrained JSON generation: the provider’s output is bound by a Zod schema via Vercel AI SDK generateObject. Invalid shapes are rejected at the wire.
- Patch application: the AI emits an RFC 6902 JSON Patch, not a full IR. Our applier rejects paths that do not exist in the current IR.
- Validate-then-regenerate: the patched IR is fed through the Phase 1 parsePipelineIR validator. If structural checks fail, the server re-prompts the AI with the exact error and retries (up to 3 times). You only ever see IR that the runtime can execute.
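The second and third layers can be sketched roughly as follows. This is a minimal illustration, not the server's code: every type and function name is hypothetical, the AI call and the parsePipelineIR check are passed in as plain synchronous callbacks, and "up to 3 times" is assumed to mean one initial attempt plus at most three retries. Treating a rejected path as a retryable error is also an assumption.

```typescript
// All names are hypothetical; the real IR, applier, and validator are not shown here.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type PatchOp =
  | { op: "add" | "replace"; path: string; value: Json }
  | { op: "remove"; path: string };

// Split an RFC 6901 JSON Pointer into unescaped segments.
function segmentsOf(path: string): string[] {
  if (path === "") return [];
  if (!path.startsWith("/")) throw new Error(`invalid JSON Pointer: ${path}`);
  return path.slice(1).split("/").map((s) => s.replace(/~1/g, "/").replace(/~0/g, "~"));
}

// True when every segment resolves to an existing member of the document.
function pathExists(doc: Json, segments: string[]): boolean {
  let node: Json = doc;
  for (const seg of segments) {
    if (Array.isArray(node)) {
      if (!/^(0|[1-9]\d*)$/.test(seg) || Number(seg) >= node.length) return false;
      node = node[Number(seg)] as Json;
    } else if (node !== null && typeof node === "object") {
      if (!Object.prototype.hasOwnProperty.call(node, seg)) return false;
      node = node[seg] as Json;
    } else {
      return false;
    }
  }
  return true;
}

// Layer 2 (simplified): "replace" and "remove" need an existing target, "add"
// needs an existing parent. Each op is checked against the unpatched IR only;
// a full applier would re-check after applying every operation.
function findBadPath(ir: Json, patch: PatchOp[]): string | null {
  for (const op of patch) {
    const segs = segmentsOf(op.path);
    const mustExist = op.op === "add" ? segs.slice(0, -1) : segs;
    if (!pathExists(ir, mustExist)) return op.path;
  }
  return null;
}

type Outcome =
  | { ok: true; patch: PatchOp[]; retries: number }
  | { ok: false; lastError: string; retries: number };

// Layer 3: feed the exact error back to the generator until the proposal
// passes or the retry budget is spent.
function proposeWithRetries(
  ir: Json,
  generate: (previousError: string | null) => PatchOp[], // stands in for the AI call
  validate: (ir: Json, patch: PatchOp[]) => string | null, // stands in for apply + parsePipelineIR
  maxRetries = 3,
): Outcome {
  let error: string | null = null;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const patch = generate(error);
    const bad = findBadPath(ir, patch);
    error = bad !== null ? `path does not exist: ${bad}` : validate(ir, patch);
    if (error === null) return { ok: true, patch, retries: attempt };
  }
  return { ok: false, lastError: error ?? "unknown", retries: maxRetries };
}
```

Passing the previous error into `generate` is what lets the re-prompt carry the exact failure. A production version would presumably be asynchronous and would apply the patch before validating.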
Every proposal is recorded in AuditLog with entityType = 'pipeline_ai_proposal': full patch,
rationale, retry count, and token count. Rejections also capture your reason text.
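As a rough picture of what such an audit entry might hold, here is a hypothetical TypeScript shape. Only the entityType value and the list of captured fields come from the text above. The property names and the builder function are assumptions made for illustration.

```typescript
// Hypothetical shape; the real AuditLog schema is not shown in this document.
interface PipelineAiProposalAudit {
  entityType: "pipeline_ai_proposal";
  patch: unknown[]; // the full JSON Patch as proposed
  rationale: string;
  retryCount: number;
  tokenCount: number;
  rejectionReason?: string; // present only when the proposal was rejected
}

function buildAuditEntry(
  patch: unknown[],
  rationale: string,
  retryCount: number,
  tokenCount: number,
  rejectionReason?: string,
): PipelineAiProposalAudit {
  const entry: PipelineAiProposalAudit = {
    entityType: "pipeline_ai_proposal",
    patch,
    rationale,
    retryCount,
    tokenCount,
  };
  if (rejectionReason !== undefined) entry.rejectionReason = rejectionReason;
  return entry;
}
```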
Prerequisites
- Tenant setting flowIrEnabled must be on.
- AI provider configured in platform settings. Any provider that supports structured output works: Anthropic, OpenAI, Google, AWS Bedrock. If the configured provider lacks constrained-JSON support, requests fail with a 400 and provider_unsupported_structured.
Rate limits
- 100 proposals per tenant per hour. Each proposal may use up to 3 internal retries against your provider. Your provider’s per-request rate limits still apply.
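The per-tenant budget could be enforced in several ways, and this page does not say which one the server uses. The sketch below assumes a simple fixed one-hour window per tenant. The class name, method name, and in-memory map are all hypothetical.

```typescript
const LIMIT = 100; // proposals per tenant per window, from the rule above
const WINDOW_MS = 60 * 60 * 1000; // one hour

// Hypothetical fixed-window limiter; the real server may use a sliding window.
class HourlyLimiter {
  private windows = new Map<string, { start: number; count: number }>();

  // Returns 0 when the proposal is allowed, otherwise the number of seconds
  // until the tenant's current window resets.
  tryAcquire(tenantId: string, nowMs: number): number {
    const w = this.windows.get(tenantId);
    if (w === undefined || nowMs - w.start >= WINDOW_MS) {
      this.windows.set(tenantId, { start: nowMs, count: 1 });
      return 0;
    }
    if (w.count < LIMIT) {
      w.count += 1;
      return 0;
    }
    return Math.ceil((w.start + WINDOW_MS - nowMs) / 1000);
  }
}
```

Internal retries are not counted here because the limit is stated per proposal. A single proposal can still result in several calls to your provider.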
Phase 1 scope
The AI is explicitly warned to stay inside the runtime’s current capabilities:
- Sources: only local_fs has a runtime executor in Phase 1; the other 6 connector kinds (s3, gcs, azure_blob, sftp, ftp, http_pull) validate but are deferred to Phase 3.
- Target load modes: append, truncate, and upsert are implemented; blue_green, incremental_watermark, and cdc_mirror are deferred to Phase 4 and will throw at runtime if proposed.
- Validate node: dataset-level rowCount is enforced; other dataset checks (freshness, fkIntegrity, cardinality, duplicateKey) are recorded but not enforced yet.
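If you want to flag out-of-scope proposals on the client before applying them, a check along these lines might help. The capability lists are copied from the scope above. The ProposalSummary shape and the function name are hypothetical, since the real IR layout is not shown here.

```typescript
// Lists taken from the Phase 1 scope; everything else is illustrative.
const DEFERRED_SOURCES = new Set(["s3", "gcs", "azure_blob", "sftp", "ftp", "http_pull"]);
const DEFERRED_LOAD_MODES = new Set(["blue_green", "incremental_watermark", "cdc_mirror"]);
const RECORDED_ONLY_CHECKS = new Set(["freshness", "fkIntegrity", "cardinality", "duplicateKey"]);

// Hypothetical flattened view of a proposed pipeline.
interface ProposalSummary {
  sourceKind: string;
  loadMode: string;
  datasetChecks: string[];
}

// Collect one warning per feature that validates but will not behave as
// expected at runtime in Phase 1.
function phase1Warnings(p: ProposalSummary): string[] {
  const warnings: string[] = [];
  if (DEFERRED_SOURCES.has(p.sourceKind)) {
    warnings.push(`source ${p.sourceKind} validates but has no executor until Phase 3`);
  }
  if (DEFERRED_LOAD_MODES.has(p.loadMode)) {
    warnings.push(`load mode ${p.loadMode} is deferred to Phase 4 and throws at runtime`);
  }
  for (const check of p.datasetChecks) {
    if (RECORDED_ONLY_CHECKS.has(check)) {
      warnings.push(`dataset check ${check} is recorded but not enforced`);
    }
  }
  return warnings;
}
```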
API
POST /api/v1/ai/pipeline-chat
Request body:
| Status | Body error | Meaning |
|---|---|---|
| 400 | provider_unsupported_structured | Tenant’s AI provider lacks constrained JSON |
| 403 | flow_ir_disabled | Enable flowIrEnabled in tenant settings |
| 429 | rate_limited | 100/h budget exhausted; check Retry-After header |
| 503 | provider_unavailable | AI provider down |
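A client might map these responses to next steps roughly as shown below. The status codes and error strings come from the table. The action names and both helper functions are hypothetical, and because the format of the Retry-After value is not specified here, the parser accepts both forms allowed by HTTP (delay in seconds or an HTTP date).

```typescript
// Illustrative action names; only the status/error pairs come from the table.
type Action = "configure_provider" | "enable_flow_ir" | "wait_and_retry" | "retry_later" | "unknown";

function actionFor(status: number, error: string): Action {
  if (status === 400 && error === "provider_unsupported_structured") return "configure_provider";
  if (status === 403 && error === "flow_ir_disabled") return "enable_flow_ir";
  if (status === 429 && error === "rate_limited") return "wait_and_retry";
  if (status === 503 && error === "provider_unavailable") return "retry_later";
  return "unknown";
}

// Retry-After may be delay-seconds or an HTTP-date (RFC 9110). Returns the
// wait in milliseconds, or null when the header is missing or unparseable.
function retryDelayMs(retryAfter: string | null, nowMs: number): number | null {
  if (retryAfter === null) return null;
  const value = retryAfter.trim();
  if (/^\d+$/.test(value)) return Number(value) * 1000;
  const at = Date.parse(value);
  return Number.isNaN(at) ? null : Math.max(0, at - nowMs);
}
```

For a 429, you would pass the response's Retry-After header value and the current time to `retryDelayMs`, then wait that long before sending the next proposal.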
POST /api/v1/ai/pipeline-chat/feedback
Captures rejection reasons:
Related
- Pipeline IR: the underlying typed document the AI edits.
- Pipelines API: how to apply an IR via POST /api/v1/pipelines with irVersion: "1.0".