System Architecture

This page maps the Dory platform end to end: the components, how the Orchestrator's control loop turns desired state into running pods, how state moves between pods, and how the platform recovers from node drains and edge outages. Deeper detail lives in the SDK and Orchestrator sections, which this page links to.

Components & responsibilities

Component	Role
Orchestrator (Go, `v0.1.0`)	Control plane. Single HTTP server on `:8080`. Reads desired processors from PostgreSQL and reconciles Kubernetes pods.
PostgreSQL config DB	Source of truth for desired state: `processors` joined to `processor_templates` and `processor_template_versions` (with `runtime_config_template`).
Kubernetes API	Where the Orchestrator creates, observes, and deletes processor pods.
Processor pods	SDK-based workloads, labeled `managed-by=dory-orchestrator`, running on managed (cloud) or edge nodes.
Karpenter	Provisions app nodes on demand when no existing node has room. NodePool `dory-app-pool`, EC2NodeClass `dory-app-nodeclass`.
RabbitMQ	Output bus. Processors publish a versioned envelope to the topic exchange `dory.output`.
Subscribers	Consume the envelope and match on its major version. Documented in the Subscriber SDK guide.

The Orchestrator HTTP server exposes:

Endpoint	Purpose
`GET /metrics`	Prometheus metrics
`GET /healthz`, `/readyz`, `/livez`	Orchestrator health, readiness, and liveness checks
`POST /api/v1/edge/heartbeat`	Edge pod heartbeats (returns directive `continue` or `shutdown`)
`POST /api/v1/edge/nodes`	Register an edge node
`POST /api/v1/edge/nodes/decommission`	Decommission an edge node

Architecture diagram

flowchart LR
    DB[(PostgreSQL<br/>config DB)] --> ORCH[Orchestrator<br/>Go v0.1.0 :8080]
    ORCH -->|reconcile pods| K8S[Kubernetes API]
    ORCH -.->|provision_node| KARP[Karpenter<br/>dory-app-pool]
    KARP -->|new app node| MN

    subgraph Managed cloud nodes
      MN[Managed pods<br/>nodeSelector workload-type=application]
    end
    subgraph Edge nodes
      EN[Edge pods<br/>nodeSelector node-type=edge]
    end

    K8S --> MN
    K8S --> EN
    EN -->|POST /api/v1/edge/heartbeat| ORCH
    ORCH <-->|GET/POST /state<br/>Bearer DORY_STATE_TOKEN| MN
    ORCH <-->|GET/POST /state| EN

    MN -->|envelope| MQ[(RabbitMQ<br/>exchange dory.output)]
    EN -->|envelope| MQ
    MQ --> SUB[Subscribers]

The control loop

The Orchestrator runs a config-watcher-driven reconcile loop.

A config-watcher polls the PostgreSQL DB on a fixed interval (default 30s). Reconcile fires every interval even when nothing changed in the DB, so the scheduler can also act on consolidation opportunities.
Desired state is keyed by the processor-id label. One processor ⇒ one pod.
Pods are immutable: any change to a processor's config is applied as delete + recreate, not an in-place update.
Each DB row (processors → processor_templates → processor_template_versions.runtime_config_template) maps to one pod spec: image {image_uri}@{digest}, env vars, resources, health probes (readiness GET /ready, liveness GET /health), preStop hook (GET /prestop), labels, and node placement.
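
The reconcile rules above can be sketched as a diff between desired and observed pods keyed by processor ID. This is a minimal illustration, not the Orchestrator's actual code: the `Desired`, `Observed`, and `Plan` types and the `SpecHash` field are hypothetical, and comparing a hash of the rendered spec is only one assumed way to detect a config change.

```go
package main

import (
	"fmt"
	"sort"
)

// Desired is a hypothetical view of one processor row after template rendering.
type Desired struct {
	ProcessorID string
	SpecHash    string // assumption: a hash of the rendered pod spec
}

// Observed is a hypothetical view of one running pod.
type Observed struct {
	ProcessorID string
	SpecHash    string
}

// Plan lists the processor IDs whose pods should be created or deleted.
type Plan struct {
	Create []string
	Delete []string
}

// Diff computes the actions for one reconcile pass.
// A changed spec lands in both Delete and Create because pods are immutable.
func Diff(desired []Desired, observed []Observed) Plan {
	running := make(map[string]string, len(observed))
	for _, o := range observed {
		running[o.ProcessorID] = o.SpecHash
	}
	var p Plan
	wanted := make(map[string]bool, len(desired))
	for _, d := range desired {
		wanted[d.ProcessorID] = true
		hash, ok := running[d.ProcessorID]
		switch {
		case !ok:
			p.Create = append(p.Create, d.ProcessorID)
		case hash != d.SpecHash:
			p.Delete = append(p.Delete, d.ProcessorID)
			p.Create = append(p.Create, d.ProcessorID)
		}
	}
	// Pods with no matching desired row are removed.
	for _, o := range observed {
		if !wanted[o.ProcessorID] {
			p.Delete = append(p.Delete, o.ProcessorID)
		}
	}
	sort.Strings(p.Create)
	sort.Strings(p.Delete)
	return p
}

func main() {
	plan := Diff(
		[]Desired{{"a", "h1"}, {"b", "h2"}, {"c", "h3"}},
		[]Observed{{"a", "h1"}, {"b", "old"}, {"d", "h4"}},
	)
	fmt.Println("create:", plan.Create, "delete:", plan.Delete)
}
```

In this example processor `b` is both deleted and recreated because its spec changed, `c` is created because it has no pod yet, and `d` is deleted because no desired row references it.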

Node placement rules:

Pod type	Placement
Managed	`nodeSelector{workload-type: application}`
Edge	toleration `edge-node=true:NoSchedule` + `nodeSelector{node-type: edge}`
All pods	node affinity `node-role NotIn [system]`; ServiceAccount `dory-processor`; ImagePullSecret `ecr-registry-secret`
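
As a rough illustration of the table above, the placement fields could be derived from the pod type as follows. The `Placement` and `Toleration` types are simplified, hypothetical stand-ins; a real implementation would presumably populate the Kubernetes `PodSpec` types instead.

```go
package main

import "fmt"

// Toleration is a simplified stand-in for the Kubernetes toleration type.
type Toleration struct {
	Key, Value, Effect string
}

// Placement is a hypothetical, simplified view of a pod spec's placement fields.
type Placement struct {
	NodeSelector     map[string]string
	Tolerations      []Toleration
	ExcludeNodeRoles []string // node affinity: node-role NotIn these values
	ServiceAccount   string
	ImagePullSecret  string
}

// PlacementFor returns the placement for a managed pod (edge=false)
// or an edge pod (edge=true), following the table above.
func PlacementFor(edge bool) Placement {
	p := Placement{
		NodeSelector:     map[string]string{"workload-type": "application"},
		ExcludeNodeRoles: []string{"system"},
		ServiceAccount:   "dory-processor",
		ImagePullSecret:  "ecr-registry-secret",
	}
	if edge {
		p.NodeSelector = map[string]string{"node-type": "edge"}
		p.Tolerations = []Toleration{{Key: "edge-node", Value: "true", Effect: "NoSchedule"}}
	}
	return p
}

func main() {
	fmt.Printf("managed: %+v\n", PlacementFor(false))
	fmt.Printf("edge:    %+v\n", PlacementFor(true))
}
```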

The scheduler bin-packs first-fit, trying the most-utilized healthy node first and keeping a 10% resource buffer. If no node fits, it emits a provision_node decision, which creates a Pending pod; Karpenter then satisfies that pod by provisioning an app node. See Orchestrator architecture for scheduler and reconciler internals.
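
The sketch below shows one way to read that placement rule. It makes several assumptions that the text does not confirm: it tracks a single resource (CPU millicores), it interprets the 10% buffer as capacity kept free on every node, and the `Node` type and `Pick` function are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// Node is a hypothetical, single-resource view of a cluster node (CPU millicores).
type Node struct {
	Name     string
	Healthy  bool
	Capacity int64
	Used     int64
}

const bufferPercent = 10 // assumption: share of capacity kept free on every node

// usable returns the capacity left after reserving the buffer.
func usable(n Node) int64 { return n.Capacity * (100 - bufferPercent) / 100 }

// Pick returns the chosen node name, or ok=false when nothing fits,
// which corresponds to a provision_node decision.
func Pick(nodes []Node, request int64) (name string, ok bool) {
	candidates := make([]Node, 0, len(nodes))
	for _, n := range nodes {
		if n.Healthy && n.Capacity > 0 {
			candidates = append(candidates, n)
		}
	}
	// Most-utilized first: compare Used/Capacity ratios by cross-multiplication.
	sort.SliceStable(candidates, func(i, j int) bool {
		a, b := candidates[i], candidates[j]
		return a.Used*b.Capacity > b.Used*a.Capacity
	})
	// First fit: take the first candidate with room under the buffer.
	for _, n := range candidates {
		if n.Used+request <= usable(n) {
			return n.Name, true
		}
	}
	return "", false
}

func main() {
	nodes := []Node{
		{Name: "n1", Healthy: true, Capacity: 4000, Used: 3500},
		{Name: "n2", Healthy: true, Capacity: 4000, Used: 2000},
		{Name: "n3", Healthy: false, Capacity: 4000, Used: 0},
	}
	fmt.Println(Pick(nodes, 500))  // n1 is too full, so n2 is chosen
	fmt.Println(Pick(nodes, 2000)) // nothing fits: provision a node
}
```

Trying the fullest node first tends to fill existing nodes before spreading load, which leaves lightly used nodes available for consolidation.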

State transfer

When a pod moves, the Orchestrator transfers its state directly between pods over HTTP.

sequenceDiagram
    participant O as Orchestrator
    participant Old as Old pod :8080
    participant New as New pod :8080
    O->>Old: GET /state  (Bearer DORY_STATE_TOKEN)
    Old-->>O: state body
    O->>New: POST /state  (Bearer DORY_STATE_TOKEN)
    New-->>O: 200 OK

Property	Value
Capture	`GET http://<podIP>:8080/state`
Restore	`POST http://<podIP>:8080/state`
Auth	`Authorization: Bearer <DORY_STATE_TOKEN>`
HTTP timeout	30s
Max state size	10 MB
Retries	exponential backoff (1s base ×2, cap 30s)
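
The retry schedule in the table (1s base, doubling, capped at 30s) works out as in the sketch below. The function name is hypothetical, and the sketch assumes no jitter and no retry limit because the table states neither.

```go
package main

import (
	"fmt"
	"time"
)

const (
	baseDelay = 1 * time.Second
	maxDelay  = 30 * time.Second
)

// Backoff returns the delay before retry number attempt (0-based):
// the base delay doubled once per prior attempt, capped at maxDelay.
func Backoff(attempt int) time.Duration {
	d := baseDelay
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= maxDelay {
			return maxDelay
		}
	}
	return d
}

func main() {
	for attempt := 0; attempt < 7; attempt++ {
		fmt.Println(attempt, Backoff(attempt))
	}
}
```

The delays are 1s, 2s, 4s, 8s, and 16s, after which every further retry waits the 30s cap.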

Note

On the SDK side, capture and restore must finish within 25s and stay under 8 MB, keeping a buffer beneath the orchestrator's 30s / 10 MB limits. The SDK persists state to its configured backend (ConfigMap / S3 / PVC / Local). See Core Concepts & Glossary.
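
The capture and restore calls can be sketched with the standard library as shown below. This is an illustration under stated assumptions rather than the Orchestrator's implementation: the function names are hypothetical, the token is a placeholder, retries are omitted, and the two pods are replaced by in-process test servers so the example runs on its own.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

const (
	maxStateBytes = 10 << 20 // 10 MB orchestrator-side limit
	httpTimeout   = 30 * time.Second
)

// do sends one authenticated /state request and returns the response body,
// rejecting non-200 responses and bodies larger than maxStateBytes.
func do(c *http.Client, method, baseURL, token string, body []byte) ([]byte, error) {
	req, err := http.NewRequest(method, baseURL+"/state", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	resp, err := c.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("%s /state: unexpected status %s", method, resp.Status)
	}
	// Read one byte past the limit so an oversized body is detected, not truncated.
	data, err := io.ReadAll(io.LimitReader(resp.Body, maxStateBytes+1))
	if err != nil {
		return nil, err
	}
	if len(data) > maxStateBytes {
		return nil, fmt.Errorf("state exceeds %d bytes", maxStateBytes)
	}
	return data, nil
}

// Transfer captures state from the old pod and restores it into the new pod.
func Transfer(c *http.Client, oldURL, newURL, token string) error {
	state, err := do(c, http.MethodGet, oldURL, token, nil)
	if err != nil {
		return err
	}
	_, err = do(c, http.MethodPost, newURL, token, state)
	return err
}

// Demo runs Transfer against two fake pods and returns what the new pod received.
func Demo() (string, error) {
	const token = "example-token" // placeholder, not a real credential
	received := make(chan []byte, 1)
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer "+token {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		if r.Method == http.MethodGet {
			io.WriteString(w, `{"offset":42}`)
			return
		}
		body, _ := io.ReadAll(r.Body)
		received <- body
	})
	oldPod, newPod := httptest.NewServer(handler), httptest.NewServer(handler)
	defer oldPod.Close()
	defer newPod.Close()

	client := &http.Client{Timeout: httpTimeout}
	if err := Transfer(client, oldPod.URL, newPod.URL, token); err != nil {
		return "", err
	}
	return string(<-received), nil
}

func main() {
	restored, err := Demo()
	fmt.Println(restored, err)
}
```

Against real pods, the base URL would be `http://<podIP>:8080`, and the call would sit inside the retry loop described in the table.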

Zero-downtime migration on node drain

When the Orchestrator runs with --enable-monitor and an application node is cordoned:

The Orchestrator captures state from each pod on the draining node.
It creates a replacement pod named <app>-drain-<ts>, either on a healthy node or with an empty NodeName so that Karpenter provisions a node for it (in which case the readiness wait is extended to 5m).
It restores state into the replacement.
Only then does it let kubectl drain evict the old pod.
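
The ordering of these steps is what prevents state loss, and it can be sketched as follows. The `Cluster` interface, its method names, and the `fake` recorder are hypothetical. Using a Unix timestamp for `<ts>` and leaving the old pod in place when restore fails are assumptions, not documented behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// Cluster is a hypothetical interface over the operations the steps above need.
type Cluster interface {
	CaptureState(pod string) ([]byte, error)
	CreateReplacement(app string, ts int64) (pod string, err error)
	RestoreState(pod string, state []byte) error
	AllowEviction(pod string) error
}

// MigrateOnDrain moves one pod off a draining node. The old pod becomes
// evictable only after the replacement holds its state.
func MigrateOnDrain(c Cluster, app, oldPod string, ts int64) error {
	state, err := c.CaptureState(oldPod)
	if err != nil {
		return fmt.Errorf("capture: %w", err)
	}
	newPod, err := c.CreateReplacement(app, ts)
	if err != nil {
		return fmt.Errorf("create: %w", err)
	}
	if err := c.RestoreState(newPod, state); err != nil {
		return fmt.Errorf("restore: %w", err) // assumption: old pod is left running
	}
	return c.AllowEviction(oldPod)
}

// fake records calls so the ordering can be inspected.
type fake struct {
	calls       []string
	failRestore bool
}

func (f *fake) CaptureState(pod string) ([]byte, error) {
	f.calls = append(f.calls, "capture "+pod)
	return []byte("state"), nil
}

func (f *fake) CreateReplacement(app string, ts int64) (string, error) {
	name := fmt.Sprintf("%s-drain-%d", app, ts)
	f.calls = append(f.calls, "create "+name)
	return name, nil
}

func (f *fake) RestoreState(pod string, _ []byte) error {
	f.calls = append(f.calls, "restore "+pod)
	if f.failRestore {
		return errors.New("restore failed")
	}
	return nil
}

func (f *fake) AllowEviction(pod string) error {
	f.calls = append(f.calls, "evict "+pod)
	return nil
}

func main() {
	f := &fake{}
	err := MigrateOnDrain(f, "geo", "geo-abc", 1750161600)
	fmt.Println(f.calls, err)
}
```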

Tip

A sentinel ConfigMap dory-controller-ref is attached to processor pods as an owner reference, making bare pods drain-eligible without --force.

Edge ↔ cloud failover

Edge pods send heartbeats to POST /api/v1/edge/heartbeat; the response carries a directive of continue or shutdown.

The Orchestrator marks an edge node failed when either:

Kubernetes reports the node NotReady for longer than the 30s grace period, or
Its heartbeat recorded in the DB is more than 60s old.
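
Expressed as code, the two conditions form a simple predicate. The function and parameter names are hypothetical; the thresholds come from the list above, and both comparisons are strict so that a node sitting exactly at a threshold is not yet failed.

```go
package main

import (
	"fmt"
	"time"
)

const (
	notReadyGrace  = 30 * time.Second
	heartbeatStale = 60 * time.Second
)

// Failed reports whether an edge node meets either failure condition.
// notReadyFor is how long Kubernetes has reported the node NotReady
// (zero while the node is Ready); heartbeatAge is the time since the
// last heartbeat recorded in the DB.
func Failed(notReadyFor, heartbeatAge time.Duration) bool {
	return notReadyFor > notReadyGrace || heartbeatAge > heartbeatStale
}

func main() {
	fmt.Println(Failed(0, 10*time.Second))              // healthy
	fmt.Println(Failed(45*time.Second, 10*time.Second)) // NotReady too long
	fmt.Println(Failed(0, 90*time.Second))              // heartbeat stale
}
```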

On failover, the app is recreated as a managed (cloud) pod with:

Label / env	Value
`workload-location`	`edge`
`migrated-from-edge`	`true`
`original-edge-node`	`<node>`
env `DORY_MIGRATED_FROM_EDGE`	`true`
env `DORY_STATE_RESTORE_PATH`	`<state_storage_path>`

The SDK restores state from its ConfigMap-backed store. Failback recreates the edge pod once the edge node returns.

flowchart LR
    EP[Edge pod] -- heartbeat --> ORCH[Orchestrator]
    ORCH -- NotReady >30s OR<br/>heartbeat stale >60s --> FAIL{Edge node failed?}
    FAIL -- yes --> MP[Managed pod<br/>migrated-from-edge=true]
    MP -- edge node returns --> FB[Failback recreates edge pod]

See Orchestrator architecture for the failover/failback state machine and fencing details.

The output path

Processors publish results to RabbitMQ.

Exchange: topic exchange dory.output.
Envelope (JSON): { schema_version: "0.1", message_id: <uuid4>, timestamp: <ISO8601 UTC>, payload: { ... } }.
Routing key: <processor_id>.<event_type>.<geohash-segments>.

{
  "schema_version": "0.1",
  "message_id": "3f2b8c1e-9d4a-4e7b-8a6c-5d1f0e2b7c9a",
  "timestamp": "2026-06-17T12:00:00Z",
  "payload": {}
}

Subscribers bind to dory.output with routing-key patterns and match on the envelope's major version. The Processor SDK constructs and publishes the envelope; see Processor SDK getting started.
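
For orientation, the envelope and routing key can be built with the Go standard library as sketched below. This is not the Processor SDK's API: every type and function name here is hypothetical, the example processor ID, event type, and geohash segments are made up, and publishing to RabbitMQ is left out because it needs a client library.

```go
package main

import (
	"crypto/rand"
	"encoding/json"
	"fmt"
	"strings"
	"time"
)

// Envelope mirrors the JSON fields listed above.
type Envelope struct {
	SchemaVersion string          `json:"schema_version"`
	MessageID     string          `json:"message_id"`
	Timestamp     string          `json:"timestamp"`
	Payload       json.RawMessage `json:"payload"`
}

// newUUID4 returns a random (version 4) UUID string.
func newUUID4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // version 4
	b[8] = (b[8] & 0x3f) | 0x80 // RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// NewEnvelope wraps payload in a schema version 0.1 envelope stamped in UTC.
func NewEnvelope(payload json.RawMessage, now time.Time) (Envelope, error) {
	id, err := newUUID4()
	if err != nil {
		return Envelope{}, err
	}
	return Envelope{
		SchemaVersion: "0.1",
		MessageID:     id,
		Timestamp:     now.UTC().Format(time.RFC3339),
		Payload:       payload,
	}, nil
}

// RoutingKey joins the parts into <processor_id>.<event_type>.<geohash-segments>.
// How a geohash is split into segments is not specified here, so the caller
// passes the segments already split.
func RoutingKey(processorID, eventType string, geohashSegments ...string) string {
	return strings.Join(append([]string{processorID, eventType}, geohashSegments...), ".")
}

// MajorVersion returns the part of schema_version before the first dot,
// which is what a subscriber would compare against the major version it supports.
func MajorVersion(schemaVersion string) string {
	major, _, _ := strings.Cut(schemaVersion, ".")
	return major
}

func main() {
	env, err := NewEnvelope(json.RawMessage(`{}`), time.Now())
	if err != nil {
		panic(err)
	}
	body, err := json.Marshal(env)
	if err != nil {
		panic(err)
	}
	fmt.Println(RoutingKey("proc-1", "detection", "9q", "8y"))
	fmt.Println(string(body))
	fmt.Println("major version:", MajorVersion(env.SchemaVersion))
}
```

On a topic exchange, a subscriber binding such as `proc-1.detection.#` would match every geohash under that processor and event type, while `*` matches exactly one dot-separated segment.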