Documentation Index
Fetch the complete documentation index at: https://docs.kaireonai.com/llms.txt
Use this file to discover all available pages before exploring further.
Status as of 2026-05-03 (PM). The import endpoint, runtime, integrity
check, audit trail, multi-input + out-of-band blob support, AND scoring
dispatch have all shipped. The pipeline runner branches on
modelType === "onnx_imported" and awaits the async ONNX scoring
path, failing soft to a score of 0.5 with a degraded-explanation marker
when onnxruntime-node is not installed in the deployment image.
Install the dep explicitly when a tenant enables ONNX BYO:
npm install onnxruntime-node@^1.20
Import endpoint
POST /api/v1/models/import — multipart form with these parts:
| Part | Required | Meaning |
|---|---|---|
| file | yes | ONNX bytes (≤100 MB). |
| name | yes | Display name (the algorithm model name). |
| family | yes | Must be "onnx_imported" in V1. |
| featureNames | yes | JSON-encoded string[] matching the model’s input shape. |
| featureDefaults | no | JSON-encoded Record<string, number> for missing-input handling. |
On a successful import, the model bytes are stored base64-encoded in the
modelState.onnxBytesBase64 slot, along with a sha256 digest at bytesHashSha256
so the runner can verify integrity at scoring time.
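The integrity check amounts to recomputing the digest over the stored bytes and comparing it to bytesHashSha256. A minimal sketch using node:crypto; the function name is illustrative:

```typescript
import { createHash } from "node:crypto";

// Recompute the sha256 of the decoded model bytes and compare it to the
// bytesHashSha256 recorded at import time. Function name is illustrative.
export function verifyModelBytes(
  onnxBytesBase64: string,
  bytesHashSha256: string,
): boolean {
  const bytes = Buffer.from(onnxBytesBase64, "base64");
  const digest = createHash("sha256").update(bytes).digest("hex");
  return digest === bytesHashSha256;
}
```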
Scoring
The ONNX scoring runner exposes a per-instance score function that loads the ONNX session lazily through Node’s dynamic-require shim. The runtime dep is not in the platform’s package.json by default;
install it explicitly when a tenant enables ONNX BYO:
npm install onnxruntime-node@^1.20
The pipeline runner branches on modelType === "onnx_imported" at both per-candidate scoring sites
(the formula PRIE path and the non-formula model fallback) and
awaits the async path instead of the sync default scorer.
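The dispatch described above can be sketched as follows; the scorer signatures and candidate type are assumptions for illustration, not the platform’s actual interfaces:

```typescript
// Sketch of the per-candidate dispatch: ONNX-imported models take the
// async scoring path, everything else the sync default scorer.
// Candidate shape and scorer signatures are illustrative.
type Candidate = { features: Record<string, number> };

export async function scoreCandidate(
  modelType: string,
  candidate: Candidate,
  scoreOnnx: (c: Candidate) => Promise<number>,
  scoreDefault: (c: Candidate) => number,
): Promise<number> {
  if (modelType === "onnx_imported") {
    // Await the async ONNX scoring path.
    return await scoreOnnx(candidate);
  }
  return scoreDefault(candidate);
}
```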
When the runtime dep is missing, the ONNX scorer fails soft to a
score of 0.5 and adds a degraded-explanation marker
({ engineType: "onnx_imported", degraded: true, reason: "onnx_runtime_missing" })
to the trace, so the operator can detect the missing-install state via
audit-log analytics. A malformed-model-state error still bubbles up:
that is a configuration bug that should surface in the import audit.
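A sketch of the fail-soft guard. The runtime loader is injected here purely so the missing-dep case is easy to exercise; the real runner uses the dynamic-require shim, and all names and types below are illustrative:

```typescript
// Illustrative result shape matching the degraded-explanation marker
// described above; not the platform's real types.
type ScoreResult = {
  score: number;
  explanation: { engineType: string; degraded?: boolean; reason?: string };
};

// Fail-soft: if loading onnxruntime-node throws (module not installed),
// return the neutral 0.5 score with the degraded marker instead of
// failing the request. Errors thrown after a successful load (e.g. a
// malformed model state) still bubble.
export async function scoreWithOnnx(
  loadRuntime: () => Promise<(features: number[]) => Promise<number>>,
  features: number[],
): Promise<ScoreResult> {
  let run: (features: number[]) => Promise<number>;
  try {
    run = await loadRuntime();
  } catch {
    return {
      score: 0.5,
      explanation: {
        engineType: "onnx_imported",
        degraded: true,
        reason: "onnx_runtime_missing",
      },
    };
  }
  const score = await run(features);
  return { score, explanation: { engineType: "onnx_imported" } };
}
```

Note that only the runtime load is guarded; a scoring-time error past that point still propagates, matching the bubble-up behavior described above.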
Honest limits
- Multi-input + out-of-band blob store supported (was single-input only in earlier releases). Float32 + int64 + bool tensor dtypes only — mixed-precision graphs and string tensors are rejected at load.
- 100 MB cap per file. Larger payloads land in the out-of-band blob store automatically.
- In-memory session cache; cache eviction relies on process restart. Keep V1 deployments to <50 ONNX models per tenant.
- CPU-only inference. No GPU support.
- The onnxruntime-node dep is opt-in. When absent, scoring degrades to a 0.5 score plus a degraded-explanation marker rather than throwing; the fail-soft path keeps /recommend responsive for tenants that don’t use ONNX BYO.
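The dtype restriction in the first bullet amounts to an allow-list check at load time. A sketch, with illustrative names:

```typescript
// Only float32, int64, and bool tensor dtypes are accepted in V1;
// anything else (string tensors, mixed-precision graphs with e.g.
// float16 inputs) is rejected at load. Names are illustrative.
const ALLOWED_DTYPES = new Set(["float32", "int64", "bool"]);

export function rejectUnsupportedDtypes(inputDtypes: string[]): void {
  for (const dtype of inputDtypes) {
    if (!ALLOWED_DTYPES.has(dtype)) {
      throw new Error(`unsupported tensor dtype: ${dtype}`);
    }
  }
}
```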
Roadmap
- Streaming-input models (multi-call session reuse) for transformer use cases.
- GPU runtime via the onnxruntime-node GPU build.
Audit trail
Every import writes an audit log row with action: "model_import_onnx", entityType: "algorithm_model",
and changes: { family, bytesHashSha256, size, featureCount }.
DSAR exports cite the import audit row and the model’s
bytesHashSha256, and (when the runtime is installed at scoring time) tie
each scoring decision back to the trace’s scoringResults[].modelType: "onnx_imported" entry.
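The audit row described above can be typed and constructed along these lines; the interface is inferred from the fields listed in this section, not the platform’s actual schema, and the builder name is illustrative:

```typescript
import { createHash } from "node:crypto";

// Shape of the import audit row as described: action, entityType, and a
// changes payload with family, bytesHashSha256, size, featureCount.
// Inferred from the doc, not the real schema.
interface OnnxImportAuditRow {
  action: "model_import_onnx";
  entityType: "algorithm_model";
  changes: {
    family: "onnx_imported";
    bytesHashSha256: string;
    size: number;
    featureCount: number;
  };
}

export function buildImportAuditRow(
  bytes: Uint8Array,
  featureNames: string[],
): OnnxImportAuditRow {
  return {
    action: "model_import_onnx",
    entityType: "algorithm_model",
    changes: {
      family: "onnx_imported",
      // Same digest the runner later checks at scoring time.
      bytesHashSha256: createHash("sha256").update(bytes).digest("hex"),
      size: bytes.length,
      featureCount: featureNames.length,
    },
  };
}
```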