KaireonAI Flow is the data + decisioning fabric for the platform. It comprises a typed Pipeline IR, an in-process batch interpreter, an AI authoring layer, an MCP server for external agents, atomic file ingestion, six target load modes, hook execution, dataset and row-level validators, YAML connectors with an AI generator, row-level lineage, and configurable UI pages.

Phase summary

| Phase | What’s live |
| --- | --- |
| 1 | Pipeline IR (8 node kinds), `parsePipelineIR`, versioned IR repo, batch interpreter, 8 executors, legacy adapter, IR-native `POST /pipelines` + run dispatch |
| 2a | AI Pipeline Mode — validate-then-regenerate Coordinator (≤3 retries), tenant-configured provider, RFC 6902 IrDiffView, 100/h rate limit |
| 2b | MCP flow-server — 11 tools + 3 cross-cutting IR endpoints (`GET`/`POST /pipelines/:id/ir`, `GET …/versions`); `isMcpReadOnly()` write-gated |
| 3 | File ingestion — date-template patterns with IANA time zones, 4 ordering modes, wait policy + `onMissAction`, atomic stage→archive→failure, csv/json/jsonl |
| 4 | Load modes — append/truncate/upsert + `blue_green` (rename triple) + `incremental_watermark`; `cdc_mirror` is a Phase 6 stub. Hooks (`sql`/`refresh`/`webhook`/`custom_function`). 5 dataset validators + row-level rules + DLQ auto-create |
| 5 | YAML connectors — 4 auth × 4 pagination × 9 categories. HTTP runtime with template substitution, SSRF guard, and rate limiting. Plugin SDK. `POST /api/v1/connectors/yaml` |
| 5b | AI connector generator — paste a docs URL or hint → draft YAML via the tenant’s configured provider |
| 6 | Observability core — `_kaireon_lineage` JSONB on every target write, run-metrics summarizer |
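The two-phase check that `parsePipelineIR` performs — a shape pass followed by structural passes for reference integrity and acyclicity — can be sketched as below. The node/edge shape and the function name are illustrative assumptions, not the actual IR schema:

```typescript
// Illustrative sketch only: the real IR schema and parsePipelineIR differ.
// Pass 1 (shape) is assumed done; here we sketch the structural passes:
// every edge must reference a declared node, and the graph must be acyclic.
interface IrNode { id: string; kind: string }
interface IrEdge { from: string; to: string }
interface PipelineIr { nodes: IrNode[]; edges: IrEdge[] }

function parsePipelineIrSketch(ir: PipelineIr): string[] {
  const errors: string[] = [];
  const ids = new Set(ir.nodes.map((n) => n.id));

  // Reference integrity: edges must point at declared nodes.
  for (const e of ir.edges) {
    if (!ids.has(e.from) || !ids.has(e.to)) {
      errors.push(`edge ${e.from}->${e.to} references an unknown node`);
    }
  }

  // Acyclicity via Kahn's algorithm (topological sort).
  const indegree = new Map<string, number>(ir.nodes.map((n) => [n.id, 0]));
  for (const e of ir.edges) indegree.set(e.to, (indegree.get(e.to) ?? 0) + 1);
  const queue = [...indegree].filter(([, d]) => d === 0).map(([id]) => id);
  let visited = 0;
  while (queue.length) {
    const id = queue.shift()!;
    visited++;
    for (const e of ir.edges.filter((edge) => edge.from === id)) {
      const d = (indegree.get(e.to) ?? 0) - 1;
      indegree.set(e.to, d);
      if (d === 0) queue.push(e.to);
    }
  }
  if (visited < ir.nodes.length) errors.push("pipeline graph contains a cycle");
  return errors;
}
```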

UI surfaces (configure everything from the sidebar)

| Page | Path |
| --- | --- |
| Flow Pipelines | `/data/flow-pipelines` — IR list + JSON editor + version history + Run-now |
| YAML Connectors | `/data/yaml-connectors` — YAML editor + AI generator + Register |
| Flow Runs | `/data/flow-runs` — run history with status + row counts |
| AI Pipeline Mode | right-side AI panel on `/data/pipelines/*` routes |

Validation contract (defense in depth)

Every IR write goes through three checks server-side:
  1. HTTP body validation — Zod on each route
  2. parsePipelineIR — Phase 1 two-phase validator (Zod + structural acyclic + ref-integrity)
  3. AuditLog — every AI proposal, IR save, and connector registration is recorded
Plus runtime safety:
  • All outbound webhooks + AI generator docs fetch + YAML HTTP runtime use validateAndResolve SSRF guard
  • All SQL identifiers go through IDENT regex + safeIdent
  • Hook SQL: forbidden-leading-verb check (DROP/DELETE/UPDATE/TRUNCATE/INSERT/MERGE/COPY/GRANT/REVOKE/ALTER)
  • Transform SQL: sanitizeExpression whitelist
  • All MCP write tools respect the isMcpReadOnly() gate; setting MCP_ALLOW_WRITES=true unlocks them
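The hook-SQL guard above can be sketched as a check on the statement’s leading verb. The function name and comment-stripping details are illustrative assumptions; only the verb list comes from the docs:

```typescript
// Illustrative sketch of the forbidden-leading-verb check for hook SQL.
// The real guard in KaireonAI Flow may normalize input differently.
const FORBIDDEN_LEADING_VERBS = [
  "DROP", "DELETE", "UPDATE", "TRUNCATE", "INSERT",
  "MERGE", "COPY", "GRANT", "REVOKE", "ALTER",
];

function assertHookSqlAllowed(sql: string): void {
  // Strip leading whitespace and SQL comments before inspecting the verb.
  let s = sql.trimStart();
  const comment = /^(--[^\n]*\n|\/\*[\s\S]*?\*\/)\s*/;
  while (comment.test(s)) s = s.replace(comment, "");
  const firstWord = s.split(/[\s(;]/, 1)[0]?.toUpperCase() ?? "";
  if (FORBIDDEN_LEADING_VERBS.includes(firstWord)) {
    throw new Error(`hook SQL may not start with ${firstWord}`);
  }
}
```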

Lineage on every row

Every Phase 4 target write augments rows with a `_kaireon_lineage` JSONB column:

```json
{
  "runId": "<uuid>",
  "pipelineId": "<id>",
  "sourceNodeId": "<upstream-node>"
}
```
Idempotent `ALTER TABLE ... ADD COLUMN IF NOT EXISTS` DDL runs before every load. The future "Errors UI" and the MCP `inspectFlowError` tool both query against this column.
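A minimal sketch of how a loader might prepare that column and stamp each row. The helper names and the simplified identifier quoting are assumptions for illustration; the real code routes identifiers through the IDENT regex + `safeIdent`:

```typescript
// Illustrative sketch: build the idempotent DDL and the per-row lineage
// payload matching the JSON shape documented above.
interface LineageStamp {
  runId: string;
  pipelineId: string;
  sourceNodeId: string;
}

// Runs before every load; safe to repeat thanks to IF NOT EXISTS.
function lineageDdl(table: string): string {
  return `ALTER TABLE "${table}" ADD COLUMN IF NOT EXISTS _kaireon_lineage JSONB`;
}

// Attach the lineage stamp to a row without mutating the original.
function stampRow(
  row: Record<string, unknown>,
  lineage: LineageStamp,
): Record<string, unknown> {
  return { ...row, _kaireon_lineage: lineage };
}
```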

Honest residuals

Each of these fails with a clear runtime error that names the phase where it will land, rather than a silent stub:
| Residual | Gate |
| --- | --- |
| Real source materialization (rows actually populating PG temp tables) | Documented in Phase 4 spec; tests stub Prisma |
| `cdc_mirror` runtime | Throws "cdc_mirror requires streaming runtime — Phase 6 (Flink CDC / Debezium integration)" |
| Cloud connector executors (S3/GCS/Azure/SFTP/FTP/HTTP-pull) | Source executor throws "Phase 1-3 supports only local_fs source" |
| Parquet/Avro/ORC/TSV/XML formats | Throws "format X not yet supported (deferred to a later phase)" |
| `custom_function` hook | Throws "custom_function hook requires the plugin SDK — Phase 5" |
| Cron + EventBridge scheduling | Inline-on-run today; documented as an ops phase |
| Marketplace UI publishing | UX phase |
| Filesystem auto-discovery of plugins | Plugins register explicitly; auto-discovery is a follow-up |
| Bulk migration of 25 HTTP-shaped connectors to YAML | Each gets its own follow-up PR |
When the missing infrastructure lands, no UI or schema work is needed — the configurable surface is already there.

Test + typecheck health

  • 260 / 260 lib/flow tests passing
  • Typecheck clean across lib/flow, components/flow, flow API routes
  • ~50 atomic commits across the 8 phases + UI + audit-closure passes
  • Six Mintlify pages cover every user-visible capability

Reading order

  1. Pipeline IR — the typed contract
  2. AI Pipeline Authoring — NL → IR
  3. MCP Flow Server — external-agent surface
  4. File Ingestion — source-side semantics
  5. Loading Modes & Validation — target-side semantics
  6. YAML Connectors — declarative HTTP/REST connectors