modelType: "external_endpoint" — bypass in-engine scoring entirely and POST the candidate set to an HTTP service you operate. KaireonAI calls your URL with a batch of candidates and the customer’s attributes; your service returns a score per candidate; the engine stamps those scores onto the candidates and continues the pipeline.

When to use

  • You already have a production model in another stack (Python/PyTorch, TensorFlow Serving, in-house C++) — don’t rewrite, integrate.
  • You need real-time features the engine doesn’t have — joining against a third-party API, a fraud-score lookup, or a feature store at request time.
  • You want centralized model governance — your ML platform team owns the model lifecycle; KaireonAI just consumes scores.
Skip it when request-time latency matters more than model freshness — each external call adds 20–200ms of network round-trip. In-engine algorithms run in 5–50µs. Use external only when the model can’t run in-engine.

How the engine calls you

POST {endpoint URL}
Authorization: Bearer {auth.token from model config}
X-Tenant-Id: {tenantId}
Content-Type: application/json

{
  "customerId": "cust-42",
  "attributes": { "tier": "Gold", "credit_score": 760, ... },
  "candidates": [
    { "id": "off-travel",   "name": "Travel Card",   "attributes": { ... } },
    { "id": "off-cashback", "name": "Cashback Card", "attributes": { ... } }
  ],
  "requestId": "uuid",
  "decisionFlowId": "uuid"
}
Your service must return within timeoutMs (configurable, default 3000ms):
{
  "scores": [
    { "offerId": "off-travel",   "score": 0.42 },
    { "offerId": "off-cashback", "score": 0.71 }
  ]
}
The engine then matches each returned entry to its candidate by offerId and sets candidate.score = score × fitMultiplier.
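A minimal sketch of a conforming service handler, assuming the request and response shapes above. The function names and the placeholder scoring logic (a tier-match bonus on a hypothetical `target_tier` attribute) are illustrative, not part of the KaireonAI contract; a real service would invoke your actual model:

```python
def score_candidate(customer_attrs, candidate):
    """Placeholder model: small bonus when the candidate targets the
    customer's tier. Substitute your real model inference here."""
    tier = customer_attrs.get("tier", "")
    bonus = 0.2 if candidate["attributes"].get("target_tier") == tier else 0.0
    return min(1.0, 0.5 + bonus)

def handle_score_request(payload):
    """Build the response the engine expects:
    one {offerId, score} entry per candidate received."""
    attrs = payload["attributes"]
    return {
        "scores": [
            {"offerId": c["id"], "score": score_candidate(attrs, c)}
            for c in payload["candidates"]
        ]
    }
```

Wrap `handle_score_request` in whatever HTTP framework you already run; the engine only cares about the JSON in and out.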

Fixture config

{
  "modelType": "external_endpoint",
  "config": {
    "endpoint": "https://your-ml-service.example.com/score",
    "auth": {
      "type": "bearer",
      "token": "<secret-ref-or-literal>"
    },
    "timeoutMs": 3000,
    "responseMapping": {
      "scoresPath": "$.scores",
      "offerIdField": "offerId",
      "scoreField": "score",
      "fallbackScore": 0.5
    }
  }
}
responseMapping lets you adapt to a non-standard service shape — scoresPath is JSONPath, offerIdField / scoreField are field names in each entry. fallbackScore is the value the engine uses when your service errors out, times out, or returns malformed data. Default 0.5 keeps candidates alive but neutral.

Training

Out of scope for KaireonAI. You train and version the model in your stack; the engine only ever calls the URL.

Score interpretation

Whatever your service returns. The engine doesn’t transform or clamp the score beyond multiplying by fitMultiplier and (if PRIE is enabled) applying the geometric mean. Make sure your service returns [0, 1] if PRIE will multiply it.
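If your model emits unbounded outputs (logits, margins), squash them server-side before responding. A hypothetical sketch, assuming a sigmoid is an acceptable transform for your model:

```python
import math

def to_unit_interval(raw_logit):
    """Squash an unbounded model output into [0, 1] so downstream
    multiplication (fitMultiplier, PRIE geometric mean) stays well-behaved."""
    return 1.0 / (1.0 + math.exp(-raw_logit))
```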

Pitfalls

  • Latency — external calls dominate request budget. Build your service to respond in < 100ms p99. Cache aggressively. Pre-compute features at quiet times.
  • Timeout fallback — when your service is slow or down, the engine uses fallbackScore for every candidate (so every offer ties). Alert on fallbackScore rate; if it’s ever > 1%, your model is effectively offline in production.
  • Authentication — auth.token should be loaded from your secrets manager, not the model config JSON. Use the secrets-resolver pattern for any production deployment.
  • Network reliability — TCP timeouts at TLS handshake, DNS flakiness, network partitions. Configure retries on the calling side (engine has a 1-attempt default; raise via retries: 2 in config) or accept the fallback.
  • Schema drift — if you rename fields in your service response, the responseMapping path stops resolving. Pin the schema; version-check on every response.
  • Async via scoreOfferSetExternal — the engine batches multiple candidates in one HTTP call. Your service must return scores for ALL passed candidates, even on partial failure (use fallbackScore per candidate, not for the whole call).
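One way to honor the scores-for-all-candidates contract on partial failure, sketched in Python; `score_all` and `model_fn` are hypothetical names, and the fallback value mirrors the fixture’s fallbackScore of 0.5:

```python
def score_all(candidates, model_fn, fallback=0.5):
    """Score every candidate in the batch; on a per-candidate failure,
    substitute the fallback score instead of failing the whole call."""
    scores = []
    for c in candidates:
        try:
            s = model_fn(c)
        except Exception:
            s = fallback  # partial failure: this candidate only
        scores.append({"offerId": c["id"], "score": s})
    return scores
```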

Cross-reference