Phase summary
| Phase | What’s live |
|---|---|
| 1 | Pipeline IR (8 node kinds), parsePipelineIR, versioned IR repo, batch interpreter, 8 executors, legacy adapter, IR-native POST /pipelines + run dispatch |
| 2a | AI Pipeline Mode — validate-then-regenerate Coordinator (≤3 retries), tenant-configured provider, RFC 6902 IrDiffView, 100/h rate limit |
| 2b | MCP flow-server — 11 tools + 3 cross-cutting IR endpoints (GET/POST /pipelines/:id/ir, GET …/versions); isMcpReadOnly() write-gated |
| 3 | File ingestion — date-template patterns w/ IANA tz, 4 ordering modes, wait policy + onMissAction, atomic stage→archive→failure, csv/json/jsonl |
| 4 | Load modes — append/truncate/upsert + blue_green (rename triple) + incremental_watermark; cdc_mirror Phase-6 stub. Hooks (sql/refresh/webhook/custom_function). 5 dataset validators + row-level rules + DLQ auto-create |
| 5 | YAML connectors — 4 auth × 4 pagination × 9 categories. HTTP runtime with template substitution + SSRF + rate-limit. Plugin SDK. POST /api/v1/connectors/yaml |
| 5b | AI connector generator — paste docs URL or hint → draft YAML via tenant’s configured provider |
| 6 | Observability core — _kaireon_lineage JSONB on every target write, run-metrics summarizer |
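The validate-then-regenerate loop from Phase 2a can be sketched roughly as follows. This is a minimal illustration, not the actual Coordinator: `coordinate`, `Validation`, and `CoordinatorResult` are hypothetical names, the provider call is kept synchronous for brevity (a real one would be async), and "≤3 retries" is assumed to mean one initial attempt plus up to three regenerations.

```typescript
// Hypothetical shapes: none of these names come from the codebase.
type Validation = { ok: true } | { ok: false; errors: string[] };

interface CoordinatorResult<T> {
  candidate: T | null;  // the first valid candidate, or null if every attempt failed
  attempts: number;     // total generate() calls made
  lastErrors: string[]; // validator errors from the final failed attempt
}

// Assumption: "≤3 retries" = 1 initial attempt + up to 3 regenerations.
const MAX_RETRIES = 3;

function coordinate<T>(
  generate: (feedback: string[]) => T, // a real provider call would be async
  validate: (candidate: T) => Validation,
): CoordinatorResult<T> {
  let feedback: string[] = [];
  let attempts = 0;
  while (attempts <= MAX_RETRIES) {
    const candidate = generate(feedback);
    attempts += 1;
    const result = validate(candidate);
    if (result.ok) {
      return { candidate, attempts, lastErrors: [] };
    }
    // Feed the validator's errors into the next generation.
    feedback = result.errors;
  }
  return { candidate: null, attempts, lastErrors: feedback };
}
```

In this sketch the validator's errors are the only feedback passed back to the generator; whether the real Coordinator also passes the previous draft is not stated in this overview.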
UI surfaces (configure everything from the sidebar)
| Page | Path |
|---|---|
| Flow Pipelines | /data/flow-pipelines — IR list + JSON editor + version history + Run-now |
| YAML Connectors | /data/yaml-connectors — YAML editor + AI generator + Register |
| Flow Runs | /data/flow-runs — run history with status + row counts |
| AI Pipeline Mode | (right-side AI panel on /data/pipelines/* routes) |
Validation contract (defense in depth)
Every IR write goes through three checks server-side:
- HTTP body validation: Zod on each route
- parsePipelineIR: Phase 1 two-phase validator (Zod + structural acyclic + ref-integrity)
- AuditLog: every AI proposal + IR save + connector register
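The structural half of the two-phase validator (ref-integrity plus acyclicity) might look something like the sketch below. It assumes a simplified, hypothetical node shape with only `id` and `inputs`; the real Pipeline IR and the Zod schema phase are not modeled, and `checkStructure` is an illustrative name rather than the actual function.

```typescript
// Hypothetical IR node shape; the real Pipeline IR has 8 node kinds not modeled here.
interface IrNode {
  id: string;
  inputs: string[]; // ids of upstream nodes
}

function checkStructure(nodes: IrNode[]): string[] {
  const errors: string[] = [];
  const ids = new Set<string>();
  const unique: IrNode[] = [];
  for (const n of nodes) {
    if (ids.has(n.id)) {
      errors.push(`duplicate node id: ${n.id}`);
    } else {
      ids.add(n.id);
      unique.push(n);
    }
  }

  // Ref-integrity: every input must name an existing node.
  for (const n of unique) {
    for (const ref of n.inputs) {
      if (!ids.has(ref)) errors.push(`node ${n.id} references unknown node ${ref}`);
    }
  }

  // Acyclicity via Kahn's algorithm: repeatedly remove nodes with no
  // unprocessed inputs; anything left over sits on or behind a cycle.
  const indegree = new Map<string, number>();
  const downstream = new Map<string, string[]>();
  for (const n of unique) {
    const known = n.inputs.filter((ref) => ids.has(ref));
    indegree.set(n.id, known.length);
    for (const ref of known) {
      const list = downstream.get(ref) ?? [];
      list.push(n.id);
      downstream.set(ref, list);
    }
  }
  const queue = unique.filter((n) => indegree.get(n.id) === 0).map((n) => n.id);
  let processed = 0;
  while (queue.length > 0) {
    const id = queue.shift() as string;
    processed += 1;
    for (const next of downstream.get(id) ?? []) {
      const remaining = (indegree.get(next) ?? 0) - 1;
      indegree.set(next, remaining);
      if (remaining === 0) queue.push(next);
    }
  }
  if (processed < unique.length) errors.push("pipeline graph contains a cycle");
  return errors;
}
```

Returning a list of errors instead of throwing on the first one is a design choice of this sketch; it lets a caller such as an AI regeneration loop see every structural problem at once.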
Further guards apply across the stack:
- All outbound webhooks + AI generator docs fetch + YAML HTTP runtime use the validateAndResolveSSRF guard
- All SQL identifiers go through the IDENT regex + safeIdent
- Hook SQL: forbidden-leading-verb check (DROP/DELETE/UPDATE/TRUNCATE/INSERT/MERGE/COPY/GRANT/REVOKE/ALTER)
- Transform SQL: sanitizeExpression whitelist
- All MCP write tools respect the isMcpReadOnly() gate; MCP_ALLOW_WRITES=true unlocks writes
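Two of the SQL guards listed above, the identifier check and the forbidden-leading-verb check, could be approximated as below. The actual IDENT pattern and the safeIdent signature are not shown in this overview, so `IDENT_SKETCH`, `safeIdentSketch`, and `assertHookSqlAllowed` are hypothetical stand-ins; only the verb list is taken from the text.

```typescript
// Assumption: the real IDENT pattern is not shown in the source.
// This is a common conservative choice (letters, digits, underscore).
const IDENT_SKETCH = /^[A-Za-z_][A-Za-z0-9_]*$/;

// Hypothetical stand-in for safeIdent: reject anything that is not a
// plain identifier, then double-quote it for use in generated SQL.
function safeIdentSketch(name: string): string {
  if (!IDENT_SKETCH.test(name)) {
    throw new Error(`unsafe SQL identifier: ${JSON.stringify(name)}`);
  }
  return `"${name}"`;
}

// Verb list copied from the validation contract above.
const FORBIDDEN_LEADING_VERBS = [
  "DROP", "DELETE", "UPDATE", "TRUNCATE", "INSERT",
  "MERGE", "COPY", "GRANT", "REVOKE", "ALTER",
];

// Reject hook SQL whose first keyword is on the forbidden list.
function assertHookSqlAllowed(sql: string): void {
  const match = /^[A-Za-z]+/.exec(sql.trim());
  const verb = match ? match[0].toUpperCase() : "";
  if (FORBIDDEN_LEADING_VERBS.indexOf(verb) !== -1) {
    throw new Error(`hook SQL may not start with ${verb}`);
  }
}
```

This sketch inspects only the first keyword, so a leading SQL comment or a second statement after a semicolon would slip past it; a production guard would need to handle those cases as well.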
Lineage on every row
Every Phase 4 target write augments rows with a _kaireon_lineage JSONB column.
ALTER TABLE ... ADD COLUMN IF NOT EXISTS runs before every load. The future “Errors UI” and the MCP inspectFlowError tool both query this column.
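As a rough illustration of the lineage step, the sketch below builds the idempotent column statement and attaches a lineage object to each row. The fields inside the lineage object (`runId`, `pipelineId`, `loadedAt`) are assumptions made for the example; this overview does not specify the real payload, and the function names are hypothetical.

```typescript
// Assumption: the real lineage payload is not specified in this overview.
interface LineageSketch {
  runId: string;
  pipelineId: string;
  loadedAt: string; // ISO-8601 timestamp
}

// Builds the idempotent statement that runs before every load.
// `quotedTable` is expected to be an already-validated, quoted identifier.
function ensureLineageColumnSql(quotedTable: string): string {
  return `ALTER TABLE ${quotedTable} ADD COLUMN IF NOT EXISTS _kaireon_lineage JSONB`;
}

// Returns new row objects with the lineage attached; input rows are not mutated.
function withLineage<T extends Record<string, unknown>>(
  rows: T[],
  lineage: LineageSketch,
): Array<T & { _kaireon_lineage: LineageSketch }> {
  return rows.map((row) => ({ ...row, _kaireon_lineage: { ...lineage } }));
}
```

Because the statement uses ADD COLUMN IF NOT EXISTS, running it before every load is safe on tables that already carry the column.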
Honest residuals
Each of these has a clear runtime error pointing at when it’ll be real, rather than a silent stub:
| Residual | Gate |
|---|---|
| Real source materialization (rows actually populating PG temp tables) | Documented in Phase 4 spec; tests stub Prisma |
| cdc_mirror runtime | Throws “cdc_mirror requires streaming runtime — Phase 6 (Flink CDC / Debezium integration)” |
| Cloud connector executors (S3/GCS/Azure/SFTP/FTP/HTTP-pull) | Source executor throws “Phase 1-3 supports only local_fs source” |
| Parquet/Avro/ORC/TSV/XML formats | Throws “format X not yet supported (deferred to a later phase)” |
| custom_function hook | Throws “custom_function hook requires the plugin SDK — Phase 5” |
| Cron + EventBridge scheduling | Inline-on-run today; documented as ops phase |
| Marketplace UI publishing | UX phase |
| Filesystem auto-discovery of plugins | Plugins register explicitly; auto-discovery is a follow-up |
| Bulk migration of 25 HTTP-shaped connectors to YAML | Each its own follow-up PR |
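The "clear runtime error rather than a silent stub" pattern in the table above can be sketched for the file-format case. The supported list (csv/json/jsonl) and the error text come from this page; `assertSupportedFormat` and the type names are hypothetical.

```typescript
// Formats listed as live in Phase 3.
const SUPPORTED_FORMATS = ["csv", "json", "jsonl"] as const;
type SupportedFormat = (typeof SUPPORTED_FORMATS)[number];

// Hypothetical helper: fail loudly on anything not yet implemented
// instead of silently falling through to a no-op.
function assertSupportedFormat(format: string): SupportedFormat {
  for (const supported of SUPPORTED_FORMATS) {
    if (supported === format) return supported;
  }
  throw new Error(`format ${format} not yet supported (deferred to a later phase)`);
}
```

Narrowing the return type to `SupportedFormat` means downstream code can switch over the three live formats without a default branch for unimplemented ones.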
Test + typecheck health
- 260 / 260 lib/flow tests passing
- Typecheck clean across lib/flow, components/flow, flow API routes
- ~50 atomic commits across the 8 phases + UI + audit-closure passes
- Six Mintlify pages cover every user-visible capability
Reading order
- Pipeline IR — the typed contract
- AI Pipeline Authoring — NL → IR
- MCP Flow Server — external-agent surface
- File Ingestion — source-side semantics
- Loading Modes & Validation — target-side semantics
- YAML Connectors — declarative HTTP/REST connectors