Greentic deployment · control plane

The "Server Next" plan, in plain words

Greentic deploys digital workers. Until now, the record of what is deployed where lived as files on one operator's machine. "Server Next" turns that record into a real, shared, safe control plane: a server that many operators and automations talk to over the network. The latest work crosses the most important line: the server is no longer just a place to store deployment state but a surface you deploy and reconcile through. Here's why that matters, what already works, and what's left.

Why this is necessary

The short version: a file on a laptop is fine for a demo, not for an enterprise.

The analogy. Today's local store is like a spreadsheet saved on one person's laptop. It works until a second person needs to edit it, until someone overwrites a change by accident, until you need to know who changed what and when, or until the file gets corrupted and there's no undo. The "Server Next" plan replaces that laptop spreadsheet with a shared database that has logins, an edit history, locking, and a restore button — and then lets you run deployments through it, not just record them.
[Figure. BEFORE (local only): one operator with deployment state as files on one machine; no sharing, no undo. AFTER (shared store server): Operator A and an automation reach the store server over HTTP to deploy and reconcile; the store server is the one source of truth, with locking, audit, retry-safety, access control, and restore.]
From "state lives on one machine" to "state lives in one server everyone safely shares — and acts through".

What a real control plane has to guarantee

Each of these is a way the laptop-spreadsheet model fails — and a thing the server must get right.

① One source of truth

Reachable over the network, not a file only one person can see.

② No silent clobbering

Two operators editing at once can't overwrite each other — writes are checked against a version tag (optimistic locking).

③ No accidental double-apply

If a network glitch makes a client retry "deploy", it must not deploy twice — the server replays the first answer instead.

④ A permanent record

Every change — and every rejected attempt — is written to an audit log for compliance.

⑤ Access control

Who may change what, decided per person + environment + action, and locked shut by default.

⑥ Recover & detect corruption

Snapshots you can restore, and integrity checks that catch silent data rot before it spreads.
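Guarantee ② can be illustrated with a small in-memory sketch. It is not the server's code: the Store class, the integer version tag, and the "env/prod" key are all hypothetical, chosen only to show how a write that carries a stale version tag gets rejected instead of silently overwriting someone else's change.

```python
from dataclasses import dataclass


class VersionConflict(Exception):
    """Raised when a write was based on a stale version tag."""


@dataclass
class Record:
    value: dict
    version: int  # hypothetical version tag, bumped on every successful write


class Store:
    def __init__(self):
        self._records = {}

    def read(self, key):
        rec = self._records.get(key)
        # A missing record reads as version 0 so the first write is conditional too.
        return (dict(rec.value), rec.version) if rec else ({}, 0)

    def write(self, key, value, expected_version):
        current = self._records.get(key)
        current_version = current.version if current else 0
        if expected_version != current_version:
            raise VersionConflict(
                f"expected version {expected_version}, found {current_version}"
            )
        self._records[key] = Record(dict(value), current_version + 1)
        return current_version + 1


store = Store()
_, v0 = store.read("env/prod")  # both operators read the same starting version

# Operator A writes first and wins.
v1 = store.write("env/prod", {"revision": "r1"}, expected_version=v0)

# Operator B still holds the old tag, so the write is refused.
try:
    store.write("env/prod", {"revision": "r2"}, expected_version=v0)
    conflict = False
except VersionConflict:
    conflict = True
```

In this sketch operator B has to re-read, see operator A's change, and decide again, which is the point of optimistic locking: nobody waits on a lock, but nobody overwrites blind either.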

The roadmap: five phases (S1–S5)

The plan sequenced the work so each phase builds on the one before it. Below is the plan and where each phase actually stands today.

Status legend: Shipped · Mostly shipped (a piece remains) · Future / not started
S1

Pick the storage backend

Re-sequenced

The plan said "finish the Postgres backend first." In practice the team shipped a simpler single-writer SQLite server first — it works today and ships sooner — and parked Postgres as a prototype to revisit when scale or high-availability demand it. A deliberate change of order, not a skipped step.

S2

The HTTP store server

Shipped

A real server (greentic-operator-store-server) now exposes the deployment contract over HTTP. Version-tag locking (guarantee ②) and retry-safe replay (③) are built in. --store-url works end-to-end: an operator can apply an environment to a remote server, and even dry-run it first to confirm it would converge.
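Retry-safe replay is easiest to see in miniature. The sketch below assumes the client sends a unique key with each logical request; the ReplayingServer class, the key format, and the response shape are hypothetical and only illustrate the idea of replaying the first answer.

```python
class ReplayingServer:
    """Toy server that remembers the first response for each idempotency key."""

    def __init__(self):
        self._replay_log = {}  # idempotency key -> response already returned
        self.deploy_count = 0  # how many deploys actually ran

    def deploy(self, idempotency_key, bundle):
        # A retry with a known key gets the stored answer; nothing runs again.
        if idempotency_key in self._replay_log:
            return self._replay_log[idempotency_key]

        self.deploy_count += 1
        response = {
            "status": "deployed",
            "bundle": bundle,
            "deploy_number": self.deploy_count,
        }
        self._replay_log[idempotency_key] = response
        return response


server = ReplayingServer()
first = server.deploy("key-123", "bundle-a")
retry = server.deploy("key-123", "bundle-a")  # e.g. the client timed out and resent
```

Here the retry returns exactly what the first call returned, and the deploy counter stays at one.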

S3

Access control (RBAC)

Static tokens shipped

Decisions are made per actor + environment + action, fail-closed, and rejected attempts are still audited (guarantees ④ & ⑤). Today it uses static bearer tokens with coarse roles (admin / operator / read-only) — and the same gate now guards the remote deploy and reconcile actions below, with read-only tokens denied. The future piece is stronger identity — OIDC or mutual-TLS — which can replace this module without changing anything downstream.
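A fail-closed decision over actor + environment + action might look like the sketch below. The token table, the environment scoping, and the exact permissions per role are assumptions made for illustration; the source only states that the roles are admin / operator / read-only and that read-only tokens are denied deploy and reconcile.

```python
# Hypothetical static tokens: token -> (actor, role, environments it may touch).
TOKENS = {
    "tok-admin": ("alice", "admin", {"staging", "prod"}),
    "tok-op": ("bob", "operator", {"staging"}),
    "tok-ro": ("carol", "read-only", {"staging", "prod"}),
}

# Hypothetical mapping of coarse roles to allowed actions.
ROLE_ACTIONS = {
    "admin": {"read", "deploy", "reconcile", "rotate"},
    "operator": {"read", "deploy", "reconcile"},
    "read-only": {"read"},
}

audit_log = []


def authorize(token, environment, action):
    """Allow only when every check passes; anything unknown means deny."""
    actor, role, environments = TOKENS.get(token, ("unknown", None, set()))
    allowed = (
        role is not None
        and environment in environments
        and action in ROLE_ACTIONS.get(role, set())
    )
    # Denials are recorded as well as approvals.
    audit_log.append(
        {"actor": actor, "environment": environment, "action": action, "allowed": allowed}
    )
    return allowed


results = [
    authorize("tok-op", "staging", "deploy"),  # allowed
    authorize("tok-op", "prod", "deploy"),     # denied: environment not granted
    authorize("tok-ro", "staging", "deploy"),  # denied: read-only role
    authorize("bogus", "staging", "read"),     # denied: unknown token
]
```

The default path is denial: an unknown token, an unlisted environment, or an unlisted action all fall through to "not allowed", and each attempt still lands in the audit list.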

S4

The remaining deploy actions

Shipped

The "stage a new revision → warm it up → drain → archive" lifecycle is wired through the server, with a clean split between the pure decision logic and the database write. These were the actions left unimplemented when the plan was written — and they're the foundation the remote control surface (next section) now drives.
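The split between pure decision logic and the database write can be sketched as follows. The state names and the transition table are hypothetical, derived only from the lifecycle verbs above, and a plain dict stands in for the database.

```python
# Hypothetical transition table: (current state, action) -> next state.
TRANSITIONS = {
    ("staged", "warm"): "warm",
    ("warm", "drain"): "draining",
    ("draining", "archive"): "archived",
}


def decide(state, action):
    """Pure decision: returns the next state or raises. Touches no storage."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action} a revision in state {state!r}") from None


def apply(db, revision_id, action):
    """Thin write layer: ask decide(), then persist the result."""
    next_state = decide(db[revision_id], action)
    db[revision_id] = next_state  # the only side effect
    return next_state


db = {"r1": "staged"}  # a dict stands in for the database
states = [apply(db, "r1", action) for action in ("warm", "drain", "archive")]

# Skipping steps is rejected by the pure function, before any write happens.
try:
    decide("staged", "archive")
    rejected = False
except ValueError:
    rejected = True
```

Keeping decide() free of storage means the rules can be tested exhaustively without a database, while apply() stays small enough to audit by eye.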

S5

Operational hardening

Mostly shipped

This is the gate for "production-ready", and most of it is done: the retry-log is bounded, the audit log has opt-in retention that records what it trimmed, corruption detection verifies hashes on load, and backup & restore exists with tests. Remaining: a full backup→restore acceptance drill and high-availability (automatic failover to a standby).
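Verifying hashes on load can be shown with a short sketch. SHA-256 over canonical JSON is an assumption here, as are the function names and the record shape; the source only says that corruption detection verifies hashes on load.

```python
import hashlib
import json


class CorruptRecord(Exception):
    """Raised when stored bytes no longer match their recorded hash."""


def content_hash(payload):
    # Canonical JSON (sorted keys, fixed separators) so equal data hashes equally.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def save(payload):
    return {"payload": payload, "sha256": content_hash(payload)}


def load(stored):
    if content_hash(stored["payload"]) != stored["sha256"]:
        raise CorruptRecord("stored hash does not match payload")
    return stored["payload"]


stored = save({"env": "prod", "revision": "r1"})
clean = dict(load(stored))  # loads fine while the data is intact

stored["payload"]["revision"] = "r9"  # simulate silent data rot
try:
    load(stored)
    detected = False
except CorruptRecord:
    detected = True
```

The check runs at read time, so a damaged record is caught the moment something tries to use it rather than after it has been copied onward.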

What came next: the server became a control surface

With state safely stored, the recent work let operators act through the server, not just read and write records. Three remote actions now go through the same safety envelope as every other write — authorize → idempotency (no double-apply) → version check → durable audit — so a deploy run from a remote machine is as safe and as traceable as a local one.
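The order of the envelope matters, so here is a toy version of it. Everything in the sketch is hypothetical (the request fields, the state dict, the outcome names), and it assumes that a replayed request is not audited a second time, which the source does not specify.

```python
def handle(state, request):
    """Run one write through authorize -> idempotency -> version check -> audit."""

    def audit(outcome):
        state["audit"].append(
            {"actor": request["actor"], "action": request["action"], "outcome": outcome}
        )
        return {"outcome": outcome, "version": state["version"]}

    # 1. Authorize: read-only callers are refused, and the refusal is recorded.
    if request["role"] == "read-only":
        return audit("denied")

    # 2. Idempotency: a repeated key replays the stored answer.
    key = request["idempotency_key"]
    if key in state["replay"]:
        return state["replay"][key]

    # 3. Version check: the caller must have seen the current version.
    if request["expected_version"] != state["version"]:
        return audit("conflict")

    # 4. Apply, then record the outcome and remember it for retries.
    state["version"] += 1
    response = audit("applied")
    state["replay"][key] = response
    return response


state = {"version": 0, "replay": {}, "audit": []}
req = {
    "actor": "bob",
    "role": "operator",
    "action": "deploy",
    "idempotency_key": "k1",
    "expected_version": 0,
}

first = handle(state, req)
retry = handle(state, req)  # same key: replayed, not re-applied
denied = handle(state, {**req, "role": "read-only", "idempotency_key": "k2"})
stale = handle(state, {**req, "idempotency_key": "k3"})  # still expects version 0
```

Only the first call changes anything: the retry is replayed, the read-only attempt is denied, and the stale request hits a conflict, with the two rejections recorded alongside the applied write.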

Remote deploy

Shipped

op deploy --store-url runs the whole rollout against the remote server — add a bundle, stage a new revision, warm it, shift traffic — and it's retry-safe, so a flaky connection can't deploy twice.

Server-mediated reconcile

Shipped

env reconcile --store-url triggers the real cluster apply from remotely-stored state. The server authorizes and audits the reconcile and pins it to a version check before the operator touches the cluster — so the act of reconciling is itself recorded and access-controlled.

Webhook-secret rotation

Shipped

The operator provisions the new secret value in its own vault and hands the server only the new reference; the server records it and bumps the generation. The server still never holds the secret value; it only records the reference the caller asserts.
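As a rough sketch of the rotation record, assuming the server keeps a reference string plus a generation counter: the WebhookSecretRef type and the "vault://" reference format below are hypothetical, and the secret value itself never appears.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WebhookSecretRef:
    reference: str   # hypothetical pointer into the operator's vault, never the value
    generation: int  # bumped on every rotation


def rotate(current, new_reference):
    """Record the new reference the caller asserts and bump the generation."""
    return WebhookSecretRef(reference=new_reference, generation=current.generation + 1)


before = WebhookSecretRef(reference="vault://webhooks/v1", generation=1)
after = rotate(before, "vault://webhooks/v2")
```

The record is immutable in this sketch, so a rotation produces a new generation rather than editing the old one in place.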

Why this is the important line. Before, the server was a database of what's deployed — you could read and edit the record, but the actual deploy still happened off to the side. Now it's a control plane you deploy through: the deploy, the cluster reconcile, and the secret rotation each pass the access check and land in the audit log. That's the difference between "we wrote down what we did" and "the system did it, on the record."

What's already shipped

A lot of this was "finishing", not greenfield — the wire contract and the client existed before the server did.

Foundations (pre-existing)

  • The wire contract — version tags, retry-replay records, access-request and corruption-hash types.
  • The HTTP client and --store-url routing on the operator CLI.
  • A Postgres prototype backend (parked).

Built out since

  • The SQLite store server — full schema, every read/write verb.
  • Optimistic locking (no silent overwrites) + retry-safe replay.
  • Audit log + opt-in retention with a watermark.
  • Static-token RBAC, fail-closed, denials audited.
  • Revision lifecycle: stage · warm · drain · archive.
  • Backup & restore + integrity / corruption checks.
  • Remote deploy, reconcile & rotate — driving deploys through the server, each authorized + audited.

What's left

Item | What it adds | Status
Backup→restore drill | A single end-to-end acceptance test proving that taking a backup and restoring it reproduces the exact live state ("active generation"). The mechanism exists and is unit-tested; this is the production sign-off proof. | Next
Reconcile completion report-back | Today the audit records that a reconcile was authorized; a second server call (after the cluster apply) will also record whether it succeeded or failed, so the durable log reflects the real outcome, not just the go-ahead. | Next
Identity beyond static tokens | OIDC or mutual-TLS so actors are real verified identities, not shared bearer tokens. Designed-for: it can swap in without touching downstream code. | Future
High availability | A standby server that takes over automatically without losing any committed change. Needs a design decision (shared storage vs. replication) first. | Future
Postgres backend | Finish the parked Postgres path for when one SQLite writer is no longer enough. Reopen when scale/HA forces it. | Parked

Where it stands, in one line

The control plane is working end-to-end — and you no longer just store state in it, you deploy and reconcile through it: a shared server with locking, retry-safety, audit, access control, the full deploy lifecycle, recoverable backups, and remote deploy / reconcile / rotate that each pass the same authorize-and-audit envelope. The remaining work is the production sign-off drill, a reconcile completion report-back, and the enterprise-grade extras — verified identity and automatic failover.

Greentic deployment · "Server Next". Status reflects the greentic-operator-store-server and remote-operability work through the latest develop. Phases S2 and S4 shipped; S3 shipped at the static-token level; S5 mostly shipped (drill + HA remain); S1 re-sequenced to SQLite-first with Postgres parked. Since the first cut, remote operability landed: remote deploy (#396), server-mediated reconcile (#397 / #398), and remote webhook-ref rotation (#399), all merged. Updated 2026-06-29.