SLO & Health Endpoints
Operational endpoints for health checking, readiness probing, and Prometheus metric scraping.
Overview
ZERG exposes three standard operational endpoints: a deep health check (/api/v1/health), a shallow load-balancer probe (/api/v1/ready), and Prometheus-formatted metrics (/api/v1/metrics). All three are exempt from authentication so that automated infrastructure can call them. A fourth endpoint, /api/v1/slo, reports current SLO compliance.
Authentication
The health, readiness, and metrics endpoints do not require authentication. They are exempted through the auth_exempt_paths configuration setting.
Endpoints
Health Check
GET /api/v1/health
Deep health check that validates core system components.
{
"status": "ok",
"version": "0.408.0",
"uptime_seconds": 7200,
"components": {
"mnesia": { "status": "ok", "tables": 35 },
"zmq": { "status": "ok", "sockets": 3 },
"providers": { "total": 5, "healthy": 5 },
"workers": { "active": 12, "idle": 4 }
}
}
Returns HTTP 503 if any critical component is degraded.
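A monitoring script may want to know which component caused a failure, not just that the check returned 503. The following is a minimal sketch in Python. It assumes that a degraded component reports a "status" other than "ok", that a provider shortfall shows up as "healthy" being lower than "total", and that the 503 response carries the same JSON shape as the 200 response. None of these is confirmed by the reference above, and degraded_components is a hypothetical helper name.

```python
import json
from typing import List


def degraded_components(http_status: int, body: str) -> List[str]:
    """Return the names of components that look unhealthy in a health response.

    Assumption: the body has the same JSON shape as the sample response,
    including when the server answers with HTTP 503.
    """
    components = json.loads(body).get("components", {})
    degraded = []

    for name, info in components.items():
        # Only some components carry a "status" field in the sample response.
        status = info.get("status")
        if status is not None and status != "ok":
            degraded.append(name)

    # Providers report counts instead of a status string (assumed semantics).
    providers = components.get("providers", {})
    if providers.get("healthy", 0) < providers.get("total", 0):
        if "providers" not in degraded:
            degraded.append("providers")

    # A 503 with no identifiable culprit is still a failure.
    if http_status == 503 and not degraded:
        degraded.append("unknown")

    return degraded
```

Called with the status code and body of a GET /api/v1/health response, the function returns an empty list when every component looks healthy.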
Readiness Probe
GET /api/v1/ready
Shallow load-balancer probe. Returns HTTP 200 when the server is accepting connections.
OK
Used by nginx, Docker health checks, and deployment scripts. Does not validate downstream components.
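A deployment script typically polls the readiness probe until the server starts accepting connections. The sketch below shows one way to do that with the Python standard library. The host and port are taken from the curl examples in this document. The retry count, delay, and the function names probe_ready and wait_until_ready are illustrative assumptions.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

# Host and port as used in the curl examples; adjust for your deployment.
READY_URL = "http://127.0.0.1:11434/api/v1/ready"


def probe_ready(url: str = READY_URL, timeout: float = 2.0) -> bool:
    """Return True if the readiness endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response.
        return False


def wait_until_ready(
    probe: Callable[[], bool] = probe_ready,
    attempts: int = 30,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call the probe up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Because the probe is shallow, a True result only means the server is accepting connections. A script that also needs downstream components to be healthy would follow up with a call to /api/v1/health.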
Metrics
GET /api/v1/metrics
Prometheus-formatted metrics endpoint.
# HELP zerg_http_requests_total Total HTTP requests
# TYPE zerg_http_requests_total counter
zerg_http_requests_total{method="GET",path="/api/v1/health"} 42
# HELP zerg_provider_latency_seconds Provider request latency
# TYPE zerg_provider_latency_seconds histogram
zerg_provider_latency_seconds_bucket{provider="anthropic",le="0.1"} 15
zerg_provider_latency_seconds_bucket{provider="anthropic",le="0.5"} 28
zerg_provider_latency_seconds_bucket{provider="anthropic",le="+Inf"} 35
zerg_provider_latency_seconds_sum{provider="anthropic"} 8.2
zerg_provider_latency_seconds_count{provider="anthropic"} 35
# HELP zerg_circuit_breaker_state Circuit breaker state
# TYPE zerg_circuit_breaker_state gauge
zerg_circuit_breaker_state{provider="anthropic"} 0
Available Metrics:
| Metric | Type | Labels | Description |
|---|---|---|---|
| zerg_http_requests_total | counter | method, path, status | HTTP request count |
| zerg_provider_latency_seconds | histogram | provider | Provider call latency |
| zerg_circuit_breaker_state | gauge | provider | 0=closed, 1=half-open, 2=open |
| zerg_active_workers | gauge | none | Currently active ZMQ workers |
| zerg_memory_bytes | gauge | none | ETS + process memory |
| zerg_event_total | counter | type | EventBus event count |
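For ad-hoc scripts that read the metrics output without a full Prometheus stack, the text format can be parsed line by line. The sketch below is a simplified parser written for this illustration. It skips comment lines, ignores optional timestamps, and does not unescape label values. It also decodes zerg_circuit_breaker_state using the 0/1/2 mapping from the table above. The names parse_metrics and breaker_states are hypothetical. For anything beyond a quick check, a maintained parser such as the one shipped with the prometheus_client package is the safer choice.

```python
import re
from typing import Dict, List, Tuple

Sample = Tuple[str, Dict[str, str], float]

# metric_name{labels} value [timestamp]
SAMPLE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

# Mapping taken from the zerg_circuit_breaker_state row of the metrics table.
BREAKER_STATES = {0: "closed", 1: "half-open", 2: "open"}


def parse_metrics(text: str) -> List[Sample]:
    """Parse Prometheus text exposition into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a HELP/TYPE comment
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # not a sample line this simplified parser understands
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def breaker_states(samples: List[Sample]) -> Dict[str, str]:
    """Map each provider to its circuit breaker state name."""
    return {
        labels.get("provider", ""): BREAKER_STATES.get(int(value), "unknown")
        for name, labels, value in samples
        if name == "zerg_circuit_breaker_state"
    }
```

Passing the body of a GET /api/v1/metrics response to parse_metrics and the result to breaker_states gives a per-provider view of which breakers are closed, half-open, or open.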
SLO Endpoint:
GET /api/v1/slo
Returns current SLO compliance status:
{
"latency_p99": { "target": 5000, "current": 3200, "compliant": true },
"uptime_7d": { "target": 99.9, "current": 99.95, "compliant": true },
"error_rate": { "target": 1.0, "current": 0.3, "compliant": true }
}
Examples:
curl http://127.0.0.1:11434/api/v1/health
curl http://127.0.0.1:11434/api/v1/ready
curl http://127.0.0.1:11434/api/v1/metrics
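The SLO response documented above can also be consumed from a script, for example to gate a deployment. The sketch below is a minimal Python illustration that relies only on the "compliant" flag of each objective, because the units of "target" and "current" are not stated in this reference. The helper name slo_violations is hypothetical, and treating a missing flag as a violation is an assumption.

```python
import json
from typing import List


def slo_violations(body: str) -> List[str]:
    """Return the names of objectives whose "compliant" flag is not true.

    Assumption: the body matches the documented /api/v1/slo shape, a JSON
    object mapping each objective name to target/current/compliant.
    """
    report = json.loads(body)
    return sorted(
        name
        for name, entry in report.items()
        if not entry.get("compliant", False)
    )
```

An empty list means every reported objective is currently met. A deployment script could stop and report the returned names otherwise.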