Streaming is gated behind the FLOW_STREAMING_ENABLED env flag —
the playground deployment runs with this flag off (we do not provision
a Kafka/Kinesis/Pulsar broker). Self-hosters with their own broker enable
it; everyone else gets a clear “streaming disabled” error from any
streaming source kind or cdc_mirror target.
What’s gated by FLOW_STREAMING_ENABLED
| Capability | When env unset (playground default) | When env = "true" (self-hosted) |
|---|
cdc_mirror target load mode | Throws “streaming runtime disabled” | Falls through to broker-config error (kaireon-worker is the future home) |
kafka_consumer IR source kind | Throws same gated message at executor dispatch | Self-hoster wires kafkajs and supplies brokerList in connector config |
kinesis_consumer IR source kind | Same | Self-hoster wires @aws-sdk/client-kinesis consumer + stream ARN |
pulsar_consumer IR source kind | Same | Self-hoster wires pulsar-client + service URL |
Checkpoint persistence (lands the moment any broker connects)
The Phase 6.4 scaffold ships the helper at lib/flow/runtime/streaming/checkpoint.ts:
import { writeCheckpoint, readCheckpoints } from "@/lib/flow/runtime/streaming/checkpoint";
// After every successful batch
await writeCheckpoint({
prisma, runId, tenantId,
checkpoint: { pipelineId, partition: 0, offset: 1234 },
});
// On consumer restart
const checkpoints = await readCheckpoints({ prisma, runId, tenantId });
Stored in pipeline_runs.metadata.streamingCheckpoints[]. At-least-once
delivery semantics — write checkpoint before the next batch’s side
effects. Any consumer (Kafka / Kinesis / Pulsar) plugs into this same
helper so the persistence layer is broker-agnostic.
Self-hoster setup
When you’re ready to enable streaming on a self-hosted deployment:
- Set
FLOW_STREAMING_ENABLED=true in the API + worker env.
- Wire your broker connector — install
kafkajs (or
@aws-sdk/client-kinesis / pulsar-client) and implement the
broker-specific consumer in kaireon-worker.
- Restart the worker process — the existing
kaireon-worker ECR repo
is unwired in playground (no App Runner service); self-hosters point
their orchestrator at it.
- Each consumer run reads
streamingCheckpoints on startup, seeks the
broker to the saved offset, processes, and writeCheckpoints after
each successful batch.
Honest residuals carried into Phase 6.6 hardening
- Per-partition lag metrics on the per-node metrics dashboard
- Backpressure semantics beyond at-least-once
- UI hide for streaming node kinds when
FLOW_STREAMING_ENABLED=false
— the Visual canvas + Schedule tab don’t yet hide them with a
“self-host required” CTA
Spec reference
specs/2026-04-24-flow-design.md §6 Runtime Engine, streaming
subsection. Phase 6.4 SPEC at
.planning/phases/06.4-streaming-runtime-cdc-mirror/06.4-SPEC.md.