System Architecture

This page maps the Dory platform end to end: the components, how the Orchestrator's control loop turns desired state into running pods, how state moves between pods, and how the platform recovers from node drains and edge outages. Deep detail lives in the SDK and Orchestrator sections, which this page links to.

Components & responsibilities

| Component | Role |
| --- | --- |
| Orchestrator (Go, v0.1.0) | Control plane. Single HTTP server on :8080. Reads desired processors from PostgreSQL and reconciles Kubernetes pods. |
| PostgreSQL config DB | Source of truth for desired state: processors joined to processor_templates and processor_template_versions (with runtime_config_template). |
| Kubernetes API | Where the Orchestrator creates, observes, and deletes processor pods. |
| Processor pods | SDK-based workloads, labeled managed-by=dory-orchestrator, running on managed (cloud) or edge nodes. |
| Karpenter | Provisions app nodes on demand when no existing node has room. NodePool dory-app-pool, EC2NodeClass dory-app-nodeclass. |
| RabbitMQ | Output bus. Processors publish a versioned envelope to the topic exchange dory.output. |
| Subscribers | Consume the envelope and match on its major version. Documented in the Subscriber SDK guide. |

The Orchestrator HTTP server exposes:

| Endpoint | Purpose |
| --- | --- |
| GET /metrics | Prometheus metrics |
| GET /healthz, /readyz, /livez | Orchestrator health, readiness, and liveness |
| POST /api/v1/edge/heartbeat | Edge pod heartbeats (returns directive continue or shutdown) |
| POST /api/v1/edge/nodes | Register an edge node |
| POST /api/v1/edge/nodes/decommission | Decommission an edge node |

Architecture diagram

flowchart LR
    DB[(PostgreSQL<br/>config DB)] --> ORCH[Orchestrator<br/>Go v0.1.0 :8080]
    ORCH -->|reconcile pods| K8S[Kubernetes API]
    ORCH -.->|provision_node| KARP[Karpenter<br/>dory-app-pool]
    KARP -->|new app node| MN

    subgraph Managed cloud nodes
      MN[Managed pods<br/>nodeSelector workload-type=application]
    end
    subgraph Edge nodes
      EN[Edge pods<br/>nodeSelector node-type=edge]
    end

    K8S --> MN
    K8S --> EN
    EN -->|POST /api/v1/edge/heartbeat| ORCH
    ORCH <-->|GET/POST /state<br/>Bearer DORY_STATE_TOKEN| MN
    ORCH <-->|GET/POST /state| EN

    MN -->|envelope| MQ[(RabbitMQ<br/>exchange dory.output)]
    EN -->|envelope| MQ
    MQ --> SUB[Subscribers]

The control loop

The Orchestrator runs a config-watcher-driven reconcile loop.

  • A config-watcher polls the PostgreSQL DB on a fixed interval (default 30s). Reconcile fires every interval even when nothing changed in the DB, so the scheduler can also act on consolidation opportunities.
  • Desired state is keyed by the processor-id label. One processor ⇒ one pod.
  • Pods are immutable: any change to a processor's config is applied as delete + recreate, not an in-place update.
  • A DB row (processors → processor_templates → processor_template_versions.runtime_config_template) maps to a pod spec: image {image_uri}@{digest}, env vars, resources, health probes (readiness GET /ready, liveness GET /health), preStop hook (GET /prestop), labels, and node placement.
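The rules above can be sketched as a small planning function. This is a minimal illustration, not the Orchestrator's actual reconciler (which is written in Go): `PodSpec` and `plan_reconcile` are hypothetical names, and the spec is trimmed to two fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodSpec:
    """Hypothetical, trimmed-down pod spec; the real one carries more fields."""
    image: str        # "{image_uri}@{digest}"
    env: tuple = ()   # (name, value) pairs


def plan_reconcile(desired: dict, actual: dict) -> list:
    """Return (action, processor_id) steps that move `actual` towards `desired`.

    Both dicts map a processor-id label value to a PodSpec, so each
    processor corresponds to at most one pod.
    """
    steps = []
    # Pods whose processor no longer exists in the DB are removed.
    for pid in sorted(actual.keys() - desired.keys()):
        steps.append(("delete", pid))
    for pid in sorted(desired):
        if pid not in actual:
            steps.append(("create", pid))
        elif actual[pid] != desired[pid]:
            # Pods are immutable: a changed spec means delete + recreate.
            steps.append(("delete", pid))
            steps.append(("create", pid))
    return steps


desired = {"a": PodSpec("repo/img@sha256:1"), "b": PodSpec("repo/img@sha256:2")}
actual = {"b": PodSpec("repo/img@sha256:0"), "c": PodSpec("repo/img@sha256:3")}
print(plan_reconcile(desired, actual))
```

Running the plan on every interval, even with an unchanged DB, is cheap in this model: an up-to-date cluster simply yields an empty step list.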

Node placement rules:

| Pod type | Placement |
| --- | --- |
| Managed | nodeSelector{workload-type: application} |
| Edge | toleration edge-node=true:NoSchedule + nodeSelector{node-type: edge} |
| All pods | node affinity node-role NotIn [system]; ServiceAccount dory-processor; ImagePullSecret ecr-registry-secret |
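As an illustration, the placement rules could translate into Kubernetes pod spec fields roughly as follows. The `placement` helper is hypothetical, and expressing the node affinity as a required (rather than preferred) rule is an assumption; the field names themselves are standard Kubernetes pod spec fields.

```python
def placement(pod_type: str) -> dict:
    """Build the placement-related part of a pod spec for 'managed' or 'edge'."""
    spec = {
        "serviceAccountName": "dory-processor",
        "imagePullSecrets": [{"name": "ecr-registry-secret"}],
        "affinity": {
            "nodeAffinity": {
                # Assumption: a hard requirement; the source only says "NotIn [system]".
                "requiredDuringSchedulingIgnoredDuringExecution": {
                    "nodeSelectorTerms": [{
                        "matchExpressions": [{
                            "key": "node-role",
                            "operator": "NotIn",
                            "values": ["system"],
                        }]
                    }]
                }
            }
        },
    }
    if pod_type == "managed":
        spec["nodeSelector"] = {"workload-type": "application"}
    elif pod_type == "edge":
        spec["nodeSelector"] = {"node-type": "edge"}
        # edge-node=true:NoSchedule written out as a toleration object.
        spec["tolerations"] = [{
            "key": "edge-node",
            "operator": "Equal",
            "value": "true",
            "effect": "NoSchedule",
        }]
    else:
        raise ValueError(f"unknown pod type: {pod_type!r}")
    return spec
```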

The scheduler bin-packs with first-fit on the most-utilized healthy node (10% resource buffer). If nothing fits, it emits a provision_node decision that creates a Pending pod, which Karpenter then satisfies by provisioning an app node. See Orchestrator architecture for scheduler and reconciler internals.
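One way to read the scheduling rule is sketched below. It assumes the 10% buffer means a node may be filled to at most 90% of its capacity, and that utilization is the larger of the CPU and memory ratios; neither detail is confirmed by this page, and `Node` and `schedule` are hypothetical names.

```python
from dataclasses import dataclass

BUFFER = 0.10  # fraction of each node's capacity kept free (assumed meaning)


@dataclass
class Node:
    name: str
    cpu_capacity: int   # millicores
    mem_capacity: int   # MiB
    cpu_used: int = 0
    mem_used: int = 0
    healthy: bool = True

    def utilization(self) -> float:
        # Assumption: the busier of the two resources defines utilization.
        return max(self.cpu_used / self.cpu_capacity,
                   self.mem_used / self.mem_capacity)

    def fits(self, cpu: int, mem: int) -> bool:
        return (self.cpu_used + cpu <= self.cpu_capacity * (1 - BUFFER)
                and self.mem_used + mem <= self.mem_capacity * (1 - BUFFER))


def schedule(nodes: list, cpu: int, mem: int) -> tuple:
    """First fit over healthy nodes, most-utilized first."""
    candidates = sorted((n for n in nodes if n.healthy),
                        key=lambda n: n.utilization(), reverse=True)
    for node in candidates:
        if node.fits(cpu, mem):
            return ("place", node.name)
    # Nothing fits: the pod is left Pending so Karpenter adds a node.
    return ("provision_node", None)
```

Trying the fullest node first tends to pack pods tightly, which leaves lightly used nodes free to be consolidated away.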

State transfer

When a pod moves, the Orchestrator transfers its state directly between pods over HTTP.

sequenceDiagram
    participant O as Orchestrator
    participant Old as Old pod :8080
    participant New as New pod :8080
    O->>Old: GET /state  (Bearer DORY_STATE_TOKEN)
    Old-->>O: state body
    O->>New: POST /state  (Bearer DORY_STATE_TOKEN)
    New-->>O: 200 OK

| Property | Value |
| --- | --- |
| Capture | GET http://<podIP>:8080/state |
| Restore | POST http://<podIP>:8080/state |
| Auth | Authorization: Bearer <DORY_STATE_TOKEN> |
| HTTP timeout | 30s |
| Max state size | 10 MB |
| Retries | exponential backoff (1s base ×2, cap 30s) |
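To make the retry schedule concrete, the delays implied by "1s base ×2, cap 30s" can be computed as below. The number of attempts is not stated on this page, so `attempts` is a free parameter here, and `backoff_delays` is a hypothetical helper.

```python
def backoff_delays(attempts: int, base: float = 1.0,
                   factor: float = 2.0, cap: float = 30.0) -> list:
    """Delay in seconds before each retry: base * factor**i, capped at `cap`."""
    return [min(base * factor ** i, cap) for i in range(attempts)]


print(backoff_delays(7))  # → [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```

The cap takes effect from the sixth retry onward, since the uncapped sixth delay would be 32s.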

Note

On the SDK side, capture and restore must finish within 25s and stay under 8 MB, keeping a buffer beneath the orchestrator's 30s / 10 MB limits. The SDK persists state to its configured backend (ConfigMap / S3 / PVC / Local). See Core Concepts & Glossary.

Zero-downtime migration on node drain

When the Orchestrator runs with --enable-monitor and an application node is cordoned:

  1. The Orchestrator captures state from each pod on the draining node.
  2. It creates a replacement pod named <app>-drain-<ts>, either on a healthy node or with an empty NodeName so that Karpenter provisions one (with an extended 5m readiness wait).
  3. It restores state into the replacement.
  4. Only then does it let kubectl drain evict the old pod.
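The ordering guarantee in these steps, restore before evict, can be sketched with injected callbacks. Everything here is illustrative: `migrate_for_drain` and its callback parameters are hypothetical, and treating `<ts>` as Unix seconds is an assumption.

```python
import time
from typing import Callable, Iterable, Optional


def replacement_name(app: str, ts: Optional[int] = None) -> str:
    """<app>-drain-<ts>; Unix seconds are assumed for <ts>."""
    ts = int(time.time()) if ts is None else ts
    return f"{app}-drain-{ts}"


def migrate_for_drain(apps: Iterable[str],
                      capture: Callable[[str], bytes],
                      create: Callable[[str], None],
                      restore: Callable[[str, bytes], None],
                      allow_evict: Callable[[str], None],
                      ts: Optional[int] = None) -> None:
    """Move each app off a draining node without losing its state."""
    for app in apps:
        state = capture(app)              # 1. capture from the old pod
        new_pod = replacement_name(app, ts)
        create(new_pod)                   # 2. create the replacement
        restore(new_pod, state)           # 3. restore into the replacement
        allow_evict(app)                  # 4. only now may the old pod go
```

If `restore` raises, `allow_evict` is never reached for that app, so the old pod keeps serving.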

Tip

A sentinel ConfigMap dory-controller-ref is attached to processor pods as an owner reference, making bare pods drain-eligible without --force.

Edge ↔ cloud failover

Edge pods send heartbeats to POST /api/v1/edge/heartbeat; the response carries a directive of continue or shutdown.

The Orchestrator marks an edge node failed when either:

  • The node is Kubernetes NotReady for more than a 30s grace period, or
  • Its DB heartbeat is stale for more than 60s.
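The two failure conditions combine into a simple predicate. This is a sketch under the thresholds stated above; `edge_node_failed` and its parameters are hypothetical names, not the Orchestrator's API.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

NOT_READY_GRACE = timedelta(seconds=30)
HEARTBEAT_STALE_AFTER = timedelta(seconds=60)


def edge_node_failed(now: datetime,
                     not_ready_since: Optional[datetime],
                     last_heartbeat: datetime) -> bool:
    """True when either failure condition holds.

    `not_ready_since` is None while Kubernetes reports the node Ready.
    """
    not_ready_too_long = (not_ready_since is not None
                          and now - not_ready_since > NOT_READY_GRACE)
    heartbeat_stale = now - last_heartbeat > HEARTBEAT_STALE_AFTER
    return not_ready_too_long or heartbeat_stale


now = datetime(2026, 6, 17, 12, 0, 0, tzinfo=timezone.utc)
print(edge_node_failed(now, None, now - timedelta(seconds=61)))  # → True
```

Because the conditions are OR-ed, a node that Kubernetes still reports as Ready is nonetheless failed over once its heartbeats stop arriving.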

On failover, the app is recreated as a managed (cloud) pod with:

| Label / env | Value |
| --- | --- |
| workload-location | edge |
| migrated-from-edge | true |
| original-edge-node | <node> |
| env DORY_MIGRATED_FROM_EDGE | true |
| env DORY_STATE_RESTORE_PATH | <state_storage_path> |

The SDK restores state from its ConfigMap-backed store. Failback recreates the edge pod once the edge node returns.

flowchart LR
    EP[Edge pod] -- heartbeat --> ORCH[Orchestrator]
    ORCH -- NotReady >30s OR<br/>heartbeat stale >60s --> FAIL{Edge node failed?}
    FAIL -- yes --> MP[Managed pod<br/>migrated-from-edge=true]
    MP -- edge node returns --> FB[Failback recreates edge pod]

See Orchestrator architecture for the failover/failback state machine and fencing details.

The output path

Processors publish results to RabbitMQ.

  • Exchange: topic exchange dory.output.
  • Envelope (JSON): { schema_version: "0.1", message_id: <uuid4>, timestamp: <ISO8601 UTC>, payload: { ... } }.
  • Routing key: <processor_id>.<event_type>.<geohash-segments>.

{
  "schema_version": "0.1",
  "message_id": "f81d4fae-7dec-41d0-a765-00a0c91e6bf6",
  "timestamp": "2026-06-17T12:00:00Z",
  "payload": {}
}

Subscribers bind to dory.output with routing-key patterns and match on the envelope's major version. The Processor SDK constructs and publishes the envelope; see Processor SDK getting started.
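For orientation, the envelope, routing key, and major-version check could look like the sketch below. It is not the Processor or Subscriber SDK API: `build_envelope`, `routing_key`, and `accepts` are hypothetical helpers, and passing geohash segments as a list of strings is an assumption about how they are split.

```python
import uuid
from datetime import datetime, timezone

SCHEMA_VERSION = "0.1"


def build_envelope(payload: dict) -> dict:
    """Wrap a payload in the versioned envelope described above."""
    return {
        "schema_version": SCHEMA_VERSION,
        "message_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "payload": payload,
    }


def routing_key(processor_id: str, event_type: str, geohash_segments: list) -> str:
    """<processor_id>.<event_type>.<geohash-segments>, dot-separated."""
    return ".".join([processor_id, event_type, *geohash_segments])


def accepts(envelope: dict, supported_major: int) -> bool:
    """Subscriber-side check: compare only the major part of schema_version."""
    major = int(str(envelope["schema_version"]).split(".")[0])
    return major == supported_major


env = build_envelope({"count": 3})
print(routing_key("proc-1", "detection", ["9q", "8y"]))  # → proc-1.detection.9q.8y
print(accepts(env, 0))  # → True
```

Since dory.output is a topic exchange, a subscriber can bind with standard AMQP wildcards, for example `proc-1.detection.#` to receive every geohash under one processor and event type.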