- Fairness — five standard audit-recognized bias metrics evaluated over any slice of decision outcomes.
- Drift — Population Stability Index + two-sample Kolmogorov–Smirnov tests over feature distributions, with a multi-feature rollup verdict (none / monitor / alert).
Fairness evaluation
POST /api/v1/fairness/evaluate
Two input modes: inline, where CI / regression workflows supply samples directly, and one that takes a `groupLookup` map.
Response
- `demographicParityGap` — max − min positive-decision rate.
- `disparateImpactRatio` — min / max. Below 0.80 triggers the four-fifths-rule flag when every group has ≥ 30 samples.
- `equalOpportunityGap` — TPR gap across groups (requires ground-truth `label` on each sample).
- `equalizedOddsGap` — max(TPR gap, FPR gap).
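To make the metric definitions concrete, here is a minimal Python sketch of how they could be computed from a list of samples. It is an illustration under stated assumptions, not the service's implementation: the sample fields `group` and `decision`, the `fourFifthsFlag` key, and the function names are hypothetical; only `label` and the four metric names come from the response described above.

```python
from collections import defaultdict

def _gap(values):
    """Return max - min, or None when there is nothing to compare."""
    return max(values) - min(values) if values else None

def fairness_metrics(samples, min_group_size=30):
    """Compute the gap metrics for a non-empty list of sample dicts.

    Assumed (hypothetical) sample shape:
    {"group": str, "decision": bool, "label": bool or absent}.
    """
    # Per-group tallies: sample count, positive decisions, confusion counts.
    stats = defaultdict(lambda: {"n": 0, "pos": 0, "tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for s in samples:
        g = stats[s["group"]]
        decision = bool(s["decision"])
        g["n"] += 1
        g["pos"] += decision
        label = s.get("label")
        if label is not None:  # ground truth is optional
            key = ("tp" if decision else "fn") if label else ("fp" if decision else "tn")
            g[key] += 1

    rates = [g["pos"] / g["n"] for g in stats.values()]
    # TPR/FPR are only defined for groups that have labelled positives/negatives.
    tprs = [g["tp"] / (g["tp"] + g["fn"]) for g in stats.values() if g["tp"] + g["fn"]]
    fprs = [g["fp"] / (g["fp"] + g["tn"]) for g in stats.values() if g["fp"] + g["tn"]]

    ratio = min(rates) / max(rates) if max(rates) > 0 else None
    tpr_gap, fpr_gap = _gap(tprs), _gap(fprs)
    big_enough = all(g["n"] >= min_group_size for g in stats.values())
    return {
        "demographicParityGap": max(rates) - min(rates),
        "disparateImpactRatio": ratio,
        # Hypothetical key name for the four-fifths-rule flag.
        "fourFifthsFlag": ratio is not None and ratio < 0.80 and big_enough,
        "equalOpportunityGap": tpr_gap,
        "equalizedOddsGap": None if tpr_gap is None or fpr_gap is None
                            else max(tpr_gap, fpr_gap),
    }
```

In this sketch the label-dependent metrics come back as `None` when no sample carries a `label`. How the real endpoint reports that case is not specified here.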
Drift detection
POST /api/v1/models/:id/drift
Submit two feature-value snapshots (reference + current) and the endpoint returns PSI + KS per feature plus an aggregate severity.
PSI thresholds (Siddiqi 2005, Basel-recognized)
| PSI | Severity | Action |
|---|---|---|
| < 0.1 | none | no action |
| 0.1 – 0.25 | monitor | watch next week |
| ≥ 0.25 | alert | investigate / retrain |
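As a rough illustration of how a PSI value maps onto the thresholds in the table, the sketch below uses the standard definition PSI = Σ (cᵢ − rᵢ) · ln(cᵢ / rᵢ), where rᵢ and cᵢ are the reference and current proportions in bin i. The binning strategy is an assumption: ten equal-width bins over the reference range, with out-of-range values clipped into the edge bins and a small `eps` floor for empty bins. The endpoint's actual binning is not documented here, and the function names are hypothetical.

```python
import math

def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid zero width for a constant feature

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = math.floor((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clip into edge bins
        # Floor empty bins at eps so the logarithm stays finite.
        return [max(c / len(values), eps) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))

def psi_severity(value):
    """Map a PSI value to the severity levels in the threshold table."""
    if value >= 0.25:
        return "alert"
    if value >= 0.1:
        return "monitor"
    return "none"
```

Equal-width bins keep the example short. Quantile bins derived from the reference snapshot are a common alternative and produce different PSI values on skewed features.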
KS significance
Two-sided p-value via the Kolmogorov distribution Q(λ) with regime switching (alternating series for λ ≥ 0.3, Jacobi-theta alternative for smaller λ). `significant: true` when p < 0.05.
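The regime switch can be sketched as follows. For λ ≥ 0.3 the code sums the alternating series Q(λ) = 2 Σ (−1)^(j−1) exp(−2j²λ²); for smaller λ it uses the equivalent theta-function form Q(λ) = 1 − (√(2π)/λ) Σ exp(−(2j−1)²π² / (8λ²)), which converges quickly where the alternating series does not. This is a minimal sketch under assumptions: λ is taken as √(nm/(n+m)) · D with no small-sample correction, and the function names and result keys `d` and `pValue` are hypothetical.

```python
import math

def ks_statistic(a, b):
    """Two-sample KS statistic D: the largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both CDFs past every value equal to x so ties are handled together.
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def kolmogorov_q(lam, switch=0.3, max_terms=200, tol=1e-12):
    """Survival function of the Kolmogorov distribution with regime switching."""
    if lam <= 0:
        return 1.0
    total = 0.0
    if lam >= switch:
        # Alternating series, accurate for moderate and large lambda.
        for j in range(1, max_terms + 1):
            term = math.exp(-2.0 * j * j * lam * lam)
            total += term if j % 2 else -term
            if term < tol:
                break
        q = 2.0 * total
    else:
        # Theta-function form, accurate for small lambda.
        for j in range(1, max_terms + 1):
            term = math.exp(-((2 * j - 1) ** 2) * math.pi ** 2 / (8.0 * lam * lam))
            total += term
            if term < tol:
                break
        q = 1.0 - math.sqrt(2.0 * math.pi) / lam * total
    return min(1.0, max(0.0, q))

def ks_test(reference, current, alpha=0.05):
    n, m = len(reference), len(current)
    d = ks_statistic(reference, current)
    lam = math.sqrt(n * m / (n + m)) * d  # asymptotic scaling, no correction term
    p = kolmogorov_q(lam)
    return {"d": d, "pValue": p, "significant": p < alpha}
```

Both branches compute the same function, so the switch point affects only numerical behaviour, not the result.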
Overall severity
`alert` fires if any feature has PSI ≥ 0.25 OR a significant KS result with D > 0.1. `monitor` fires if no feature alerts but at least one has PSI ≥ 0.1. Otherwise the verdict is `none`.
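The rollup rule reduces to two passes over the per-feature results. The sketch below assumes each feature result is a dict with hypothetical keys `psi`, `ksD`, and `ksSignificant`; the real response field names may differ.

```python
def overall_severity(features):
    """Roll per-feature drift results up into none / monitor / alert."""
    # Any single feature can trigger an alert on its own.
    if any(f["psi"] >= 0.25 or (f["ksSignificant"] and f["ksD"] > 0.1)
           for f in features):
        return "alert"
    # No alert: fall back to monitor if any feature reached the lower PSI band.
    if any(f["psi"] >= 0.1 for f in features):
        return "monitor"
    return "none"
```

Note that a significant KS result with D ≤ 0.1 does not escalate by itself under this rule; the D floor filters out shifts that are statistically detectable but small.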
Rate limits + audit
- `POST /fairness/evaluate` — 20/min/tenant, audit-logged as `action=fairness_evaluate`.
- `POST /models/:id/drift` — 30/min/tenant, audit-logged as `action=drift_evaluate`.