HTTP API Reference
The Dory Orchestrator exposes a single HTTP server on port 8080 serving the edge API, health/readiness probes, and Prometheus metrics. Edge pods reach it in-cluster at dory-orchestrator-metrics.dory-system.svc.cluster.local:8080.
See Configuration for MetricsPort, Metrics for the /metrics payload, Edge Failover for how the heartbeat drives health detection, and Deployment for the Services that front this port.
Edge API
POST /api/v1/edge/heartbeat
Edge processors call this periodically to report liveness and receive a directive.
- Non-POST requests → 405 Method Not Allowed.
- Request body limit: 64 KB.
- Invalid JSON → 400 Bad Request.
Request body:
{
"processor_id": "string",
"node_id": "string",
"role": "string",
"epoch": 0,
"timestamp": 0.0,
"status": "string",
"last_state_sync": 0.0,
"state_size_bytes": 0,
"uptime_sec": 0.0
}
epoch, last_state_sync, and state_size_bytes are pointers (nullable / optional).
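As an illustration of the request shape above, here is a minimal client-side sketch that serializes a heartbeat body and leaves out the three optional fields when they are unset. The helper name `build_heartbeat` is hypothetical. The sketch also assumes that `timestamp` is Unix seconds (like `orchestrator_time`), that "64 KB" means 64 × 1024 bytes, and that omitting an optional field is accepted in the same way as sending `null`.

```python
import json
import time
from typing import Optional


def build_heartbeat(
    processor_id: str,
    node_id: str,
    role: str,
    status: str,
    uptime_sec: float,
    epoch: Optional[int] = None,
    last_state_sync: Optional[float] = None,
    state_size_bytes: Optional[int] = None,
) -> bytes:
    """Serialize a heartbeat body, leaving out optional fields that are unset."""
    payload = {
        "processor_id": processor_id,
        "node_id": node_id,
        "role": role,
        "timestamp": time.time(),  # assumption: Unix seconds as a float
        "status": status,
        "uptime_sec": uptime_sec,
    }
    optional = {
        "epoch": epoch,
        "last_state_sync": last_state_sync,
        "state_size_bytes": state_size_bytes,
    }
    # Only include the nullable fields that actually carry a value.
    payload.update({k: v for k, v in optional.items() if v is not None})

    body = json.dumps(payload).encode("utf-8")
    if len(body) > 64 * 1024:  # assumption: the limit is 64 * 1024 bytes
        raise ValueError("heartbeat body exceeds the 64 KB request limit")
    return body
```

The returned bytes can be sent with any HTTP client as the body of a `POST` to `/api/v1/edge/heartbeat`.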
Behavior:
- If processor_id is set:
  - Update edge_nodes.last_heartbeat_at = NOW() for the node linked to that processor.
  - Update processors.last_health_check_at and processors.health_status (healthy when status == "healthy").
- Directive: defaults to "continue". If processor_id is set and GetProcessorConfigByID fails (that is, the processor is not found or its status is terminated/failed), the directive is "shutdown".
Note
Only continue and shutdown are ever returned. There is no demote or promote directive.
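The directive rule described above can be sketched as a small pure function. This is an illustrative model only, not the orchestrator's code: `choose_directive`, `ProcessorLookupError`, and the `lookup` callable are hypothetical stand-ins, with `lookup` playing the role of `GetProcessorConfigByID`.

```python
from typing import Callable, Optional


class ProcessorLookupError(Exception):
    """Hypothetical stand-in for a failed GetProcessorConfigByID call."""


def choose_directive(
    processor_id: Optional[str],
    lookup: Callable[[str], object],
) -> str:
    """Return "continue" unless the lookup for a given processor fails."""
    if not processor_id:
        # No processor_id in the heartbeat: nothing to look up.
        return "continue"
    try:
        lookup(processor_id)
    except ProcessorLookupError:
        # Not found, or status is terminated/failed.
        return "shutdown"
    return "continue"
```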
Response — always 200, application/json:
orchestrator_time is Unix seconds (float). directive is "continue" or "shutdown".
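On the edge side, a processor might decode the response along these lines. This is a hedged sketch: `parse_heartbeat_response` is a hypothetical helper, and it reads only the two fields named above, without assuming anything about other fields the response may carry.

```python
import json
from typing import Tuple

KNOWN_DIRECTIVES = ("continue", "shutdown")


def parse_heartbeat_response(raw: bytes) -> Tuple[float, str]:
    """Return (orchestrator_time, directive) from a heartbeat response body."""
    doc = json.loads(raw)
    directive = doc["directive"]
    if directive not in KNOWN_DIRECTIVES:
        # Only "continue" and "shutdown" are documented; reject anything else.
        raise ValueError(f"unexpected directive: {directive!r}")
    return float(doc["orchestrator_time"]), directive
```

A caller would typically begin a graceful stop when the directive is `"shutdown"` and otherwise schedule the next heartbeat.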
POST /api/v1/edge/nodes
Register an edge node.
Request body:
- name is required → 400 if missing.
- organization_id must be a valid UUID → 400 otherwise.
Calls RegisterEdgeNode (INSERT into edge_nodes with status='online').
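The two validation rules above could be modelled roughly as follows. `validate_registration` is a hypothetical helper, not the server's implementation. It treats a missing `organization_id` as invalid, which is an assumption. Python's `uuid.UUID` parser also accepts some forms (braces, a `urn:uuid:` prefix, hex without hyphens) that the server may reject.

```python
import uuid
from typing import Optional


def validate_registration(body: dict) -> Optional[str]:
    """Return an error message for a 400 response, or None if the body passes."""
    if not body.get("name"):
        return "name is required"
    try:
        # str() lets a missing or non-string value fail as a malformed UUID.
        uuid.UUID(str(body.get("organization_id", "")))
    except ValueError:
        return "organization_id must be a valid UUID"
    return None
```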
Response — 201 Created:
{
"id": "uuid",
"organization_id": "uuid",
"name": "string",
"status": "online",
"failover_enabled": true,
"failover_target_node_id": null,
"last_heartbeat_at": null
}
POST /api/v1/edge/nodes/decommission
Decommission an edge node and stop its active apps.
Request body:
- node_name is required → 400 if missing.
Sets edge_node_apps.status='stopped' for active apps and edge_nodes.status='decommissioned'.
- Node not found → 500.
Response — 200:
Observability & health
GET /metrics
Prometheus exposition via promhttp. See Metrics.
GET /healthz
Runs all health checkers. Overall status is unhealthy if any checker is unhealthy, degraded if any is degraded, otherwise healthy.
- healthy or degraded → 200.
- unhealthy → 503.
Checkers: database (Ping), kubernetes (GetServerVersion), watcher (GetHealth), health-poller.
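The aggregation rule for /healthz can be expressed compactly. The sketch below is a model of the documented behavior, not the orchestrator's code: `aggregate_health` is a hypothetical name, and each checker result is reduced to a plain status string.

```python
from typing import Dict, Tuple


def aggregate_health(components: Dict[str, str]) -> Tuple[str, int]:
    """Map per-checker statuses to (overall status, HTTP status code)."""
    statuses = set(components.values())
    if "unhealthy" in statuses:
        # Any unhealthy checker makes the whole response unhealthy.
        return "unhealthy", 503
    if "degraded" in statuses:
        # Degraded is still served with 200.
        return "degraded", 200
    return "healthy", 200
```

For example, a degraded watcher with every other checker healthy yields `("degraded", 200)`, so a probe that looks only at the status code still passes.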
Response body:
{
"status": "healthy",
"timestamp": "...",
"version": "...",
"components": {
"database": { "status": "...", "message": "...", "last_check": "..." }
}
}
GET /readyz
Readiness probe.
- If the orchestrator has not been marked ready (MarkReady is set after setup completes) → 503 with startup unhealthy.
- Otherwise runs readiness checkers (database, kubernetes); Ready only if all are healthy → 200, else 503.
Response body:
GET /livez
Liveness probe — always 200.
Tip
The Deployment uses /livez for liveness and /readyz for readiness. See Deployment.