OctaFlow

Project Rapido · v2 · northcheck

Two planes, one node

A concrete, high-level picture of how the system is deployed and stays interconnected for GEB: the two agent planes on one node, the seam between them, the outside interfaces, and how skills and measurement plug in. It is the base for the implementation.

What changed from v1

v1 modelled the two platforms as an A/B pair joined by a platform-abstraction layer, both interchangeable per company. The operating manual reframes them as a stack, not alternatives: Paperclip governs, Fusion executes, joined by a shared standard — the seam. v2 pins the concrete deployment: Paperclip and Fusion on one cloud VM hosted on northcheck, skills consumed from the seam (synced from Penelope, not duplicated), external interfaces via webhooks, operator access through the dashboards, and a Langfuse measurement rail. The A/B data already gathered stays as evidence; it is no longer the architecture.

2 agent planes · 1 northcheck VM · 1 shared seam

From an A/B pair to a governance → execution stack

01 / 08 · Overview

The operating model

Two planes, joined by a seam

Paperclip governs, Fusion executes. The seam between them is a handover with a gate on each side — not a live socket. The approved Paperclip spec becomes Fusion's PROMPT.md; nothing crosses until governance passes it.

Paperclip · governance: Company, goals, org chart, budgets, and the CEO / approval gate. Decides what gets built and who builds it.

Fusion · execution: the PROMPT.md spec, git worktrees, per-step review, pre/post-merge gates, delivery. Decides how it gets built and ships it.

The seam: a handover with a gate on each side. Paperclip writes the approved spec; Fusion executes and writes the work product back; Paperclip records it.

Correction: Fusion is built on Pi; Paperclip is an interchangeable governance layer sharing the Agent Companies standard, not a foundation Fusion sits on. Build on the shared standard.
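The gate on each side of the seam can be pictured as a small state check. The sketch below is a minimal illustration, not Paperclip's or Fusion's actual API: the `Spec` type, the `SpecState` values, `GateError`, and both function names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class SpecState(Enum):
    DRAFT = "draft"
    APPROVED = "approved"      # passed the governance gate
    DELIVERED = "delivered"    # work product accepted back


@dataclass
class Spec:
    title: str
    body: str
    state: SpecState = SpecState.DRAFT


class GateError(Exception):
    """Raised when something tries to cross the seam without passing its gate."""


def to_prompt_md(spec: Spec) -> str:
    """Governance-side gate: only an approved spec becomes PROMPT.md content."""
    if spec.state is not SpecState.APPROVED:
        raise GateError(f"spec {spec.title!r} is {spec.state.value}, not approved")
    return f"# {spec.title}\n\n{spec.body}\n"


def accept_work_product(spec: Spec, review_passed: bool) -> Spec:
    """Execution-side gate: the work product returns only after review passes."""
    if not review_passed:
        raise GateError(f"work product for {spec.title!r} failed review")
    return Spec(spec.title, spec.body, SpecState.DELIVERED)
```

Raising on a failed gate, rather than returning a flag, keeps an unapproved spec from crossing by accident when a caller forgets to check the result.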

The intent it serves

An internal, self-improving agentic system running GEB and OctaFlow through supervised agents and a shared skill ecosystem. The aim: cut production cost while preserving quality, provable internally first.

Every boundary in the chain has a gate

02 / 08 · Operating model

Deployment topology · a single cloud VM on northcheck

Runtime & the seam, inside the VM

Both planes run on one cloud VM hosted on the northcheck platform (internal-first). Paperclip and the execution plane never call each other over a socket — they exchange an approved spec and a work product through the shared repo (the seam). Skills live in the seam and both planes read them. Fusion exports OpenTelemetry to Langfuse, which returns a quality signal to the governance gate.

Fig 1 — runtime and the seam inside the northcheck VM

Why one node

The two apps share the repo and filesystem, so the seam is local — no network coupling between them. Fusion can later move to a dedicated execution node / build farm on northcheck without changing the seam contract.
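Because the seam is a set of files in the shared repo, its contract can be written as paths relative to one root. The sketch below assumes a hypothetical `tasks/<task_id>/` layout and a hypothetical `WORK_PRODUCT.md` file name; only `PROMPT.md` comes from this proposal. If Fusion moves to another node, only `root` would change.

```python
import os
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Seam:
    """File-based seam contract; the layout under `root` is an assumption."""
    root: Path

    def prompt_path(self, task_id: str) -> Path:
        return self.root / "tasks" / task_id / "PROMPT.md"

    def product_path(self, task_id: str) -> Path:
        # Hypothetical file name for the returned work product.
        return self.root / "tasks" / task_id / "WORK_PRODUCT.md"

    def write_atomic(self, path: Path, text: str) -> None:
        """Write to a temp file, then rename, so a reader never sees a partial file."""
        path.parent.mkdir(parents=True, exist_ok=True)
        fd, tmp = tempfile.mkstemp(dir=path.parent)
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        os.replace(tmp, path)  # atomic when source and target share a filesystem
```

The temp-file-then-rename write matters here because the two planes never talk over a socket: the file appearing is the only signal the other side gets.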

03 / 08 · Deployment & seam

External interfaces & access

Webhooks out, skills in, dashboards for the team

The node touches the outside through four interfaces only: GEB Shopify (webhook in / out), the Penelope repo (skill-sync pull), model providers (OpenRouter / HF), and the operator dashboards, reached over a Tailscale tunnel.

Fig 2 — external interfaces and operator access
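For the inbound Shopify webhook, the gateway would normally check the signature before anything reaches Paperclip. Shopify signs each webhook with HMAC-SHA256 over the raw request body and sends the base64 digest in the `X-Shopify-Hmac-Sha256` header. The function name below is hypothetical, and how the gateway obtains the raw body and the shared secret is left to the runbook.

```python
import base64
import hashlib
import hmac


def verify_shopify_webhook(raw_body: bytes, header_hmac: str, secret: str) -> bool:
    """Return True if the base64 HMAC-SHA256 of the raw body matches the header."""
    digest = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).digest()
    expected = base64.b64encode(digest)
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(expected, header_hmac.encode("utf-8"))
```

The check has to run on the raw bytes as received; re-serialising parsed JSON can change whitespace or key order and break the match.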

Operator access — the dashboards as the human interface

The Paperclip dashboard (:3100) and the Fusion board (:4040) are the human interface for manual task creation, approvals, and monitoring. They are exposed to Aman, Rajan and the devs over a Tailscale tunnel (internal-first, no public surface); a Cloudflare Tunnel + Access can add a public URL later. Multi-user access uses Paperclip's members / roles / invites / permissions. Never expose the dashboard or API without auth.
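One way to keep "no public surface" honest is a preflight check on the addresses the dashboards bind to. The sketch below assumes the bind addresses are known to the deploy script (the `binds` mapping and both function names are hypothetical). It relies on Tailscale assigning IPv4 addresses from the 100.64.0.0/10 range. It complements auth and does not replace it.

```python
import ipaddress

TAILSCALE_V4 = ipaddress.ip_network("100.64.0.0/10")  # range Tailscale assigns from
DASHBOARDS = {"paperclip": 3100, "fusion": 4040}      # ports from this proposal


def is_internal_bind(address: str) -> bool:
    """True for loopback or a Tailscale IPv4 address; False for anything else."""
    ip = ipaddress.ip_address(address)
    return ip.is_loopback or (ip.version == 4 and ip in TAILSCALE_V4)


def check_binds(binds: dict[str, str]) -> list[str]:
    """Return the names of dashboards bound to a non-internal address."""
    return [name for name in DASHBOARDS if not is_internal_bind(binds[name])]
```

A wildcard bind such as `0.0.0.0` is flagged, since it would listen on every interface of the VM, including any public one.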

04 / 08 · External interfaces

How the data flows · GEB anchor

The seam crossing, end to end

A shopper's questionnaire fires a pipeline that collects context, retrieves and analyses evidence, applies rules, scores risk, and produces a decision package for human sign-off before delivery.

Trigger → govern → seam → execute → measure → return → egress

Trigger. A shopper submits the GEB Shopify questionnaire → a webhook hits the node's gateway.
Govern. Paperclip creates the issue (goal ancestry, budget); the CEO / approval gate approves the spec.
Cross the seam. The approved spec becomes a PROMPT.md in the shared repo — via the shared package, not the experimental live plugin.
Execute. Fusion runs it as one Stepwise workflow in a git worktree; each step is plan → execute → review; the pre/post-merge gates carry the GEB gate criteria.
Measure. Fusion exports OpenTelemetry to Langfuse; each step's output is scored against the stored evals; the quality signal feeds the governance gate.
Return. The merged result returns to Paperclip as an audited work product.
Egress. On approval, the action goes back to Shopify by webhook; the run (inputs, steps, output, cost, quality) is recorded.
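The steps above form a fixed order, and the run record carries the same fields the egress step names (inputs, steps, output, cost, quality). A minimal sketch of that record, with a guard that refuses out-of-order stages, could look like this; the `Run` type and its field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

STAGES = ("trigger", "govern", "seam", "execute", "measure", "return", "egress")


@dataclass
class Run:
    inputs: dict
    completed: list = field(default_factory=list)  # stages finished so far
    output: str = ""
    cost_usd: float = 0.0
    quality: Optional[float] = None

    def advance(self, stage: str) -> None:
        """Record a stage, but only if it is the next one in the chain."""
        done = len(self.completed)
        expected = STAGES[done] if done < len(STAGES) else None
        if stage != expected:
            raise ValueError(f"expected {expected!r}, got {stage!r}")
        self.completed.append(stage)
```

With this guard, a run cannot reach `execute` without having passed `govern` and `seam`, which is the property the gates are meant to provide.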

Read the run as one thing

Reading the run as one thing means asking three questions: did the specification survive the crossing, did each gate fire, and did the measured quality hold? Humans can also create and approve work directly in the Paperclip dashboard, a second entry path alongside the webhook.
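The three checks can be folded into a single reading of a run. The sketch below makes two assumptions that this proposal does not state: that "survived" means PROMPT.md is byte-identical to the approved spec, and that quality is a single score compared to a threshold. All names are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class RunReading:
    spec_survived: bool
    gates_fired: bool
    quality_held: bool

    @property
    def ok(self) -> bool:
        return self.spec_survived and self.gates_fired and self.quality_held


def sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def read_run(approved_spec: str, prompt_md: str, gates: dict,
             score: float, threshold: float) -> RunReading:
    """Answer the three questions for one run."""
    return RunReading(
        spec_survived=sha256(approved_spec) == sha256(prompt_md),
        gates_fired=bool(gates) and all(gates.values()),  # no gates counts as a failure
        quality_held=score >= threshold,
    )
```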

05 / 08 · The seam crossing

The rails

Skills from Penelope, quality from Langfuse

Skills live once in the seam and are read by both planes, with no per-repo hand-copy. Measurement, which neither Paperclip nor Fusion does on its own, is added by Langfuse over OpenTelemetry.

Skills — consumed from the seam

A skill-sync plugin pulls skills from the Penelope repo (source of truth) into the seam: scheduled + on-demand, pinned to a ref (skills.lock), a read-only mirror.
Every synced skill is scanned with agentshield before it goes live (a shared package is a shared supply-chain surface).
Interim mechanism while there is no export path beyond manual copying; the lightweight form of a companies.sh catalogue, swappable later.
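The pin-then-scan rule above can be sketched as a filter: a skill goes live only if its content matches the digest pinned in skills.lock and the scanner accepts it. The lock format shown here (JSON with a `ref` and a name-to-SHA-256 map) is an assumption, and the scanner is passed in as a plain callable because agentshield's real interface is not described in this proposal.

```python
import hashlib
import json
from typing import Callable


def digest(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


def activate_skills(lock_json: str, skills: dict[str, bytes],
                    scan: Callable[[str, bytes], bool]) -> list[str]:
    """Return the skill names that match the lock and pass the scan."""
    # Hypothetical lock shape: {"ref": "<git ref>", "skills": {"<name>": "<sha256>"}}
    lock = json.loads(lock_json)
    live = []
    for name, content in sorted(skills.items()):
        if lock["skills"].get(name) != digest(content):
            continue  # not pinned, or drifted from the pinned ref
        if not scan(name, content):
            continue  # stand-in for the agentshield verdict
        live.append(name)
    return live
```

Checking the digest before scanning means a file that drifted from the pinned ref is never treated as live, even if the scanner would have passed it.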

Measurement — Langfuse over OpenTelemetry

Fusion already exports OpenTelemetry; Langfuse ingests at /api/public/otel — no custom bridge.
The gate criteria applied by hand (rubric, business rules, red-flag list) become stored, versioned evals; every run is scored automatically.
Instrument against OpenTelemetry so the tool stays swappable (LangSmith / Braintrust / Phoenix all read it).
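Pointing an OTLP exporter at Langfuse usually comes down to the standard OpenTelemetry environment variables. The sketch below builds them for the `/api/public/otel` path named above, with Langfuse's Basic auth formed from the project's public and secret keys. Whether Fusion's exporter reads these variables is an assumption, as are the host and the helper name; some OpenTelemetry SDKs also expect the header value to be URL-encoded.

```python
import base64


def langfuse_otlp_env(host: str, public_key: str, secret_key: str) -> dict[str, str]:
    """Build OTLP exporter settings for a Langfuse instance at `host`."""
    raw = f"{public_key}:{secret_key}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {
        "OTEL_EXPORTER_OTLP_ENDPOINT": host.rstrip("/") + "/api/public/otel",
        "OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf",  # OTLP over HTTP, not gRPC
        "OTEL_EXPORTER_OTLP_HEADERS": f"Authorization=Basic {token}",
    }
```

Keeping the configuration at the OpenTelemetry level is what makes the backend swappable: changing tools would mean changing these values, not the instrumentation.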

Skills sync in; execution traces flow out to evals

06 / 08 · Skills & measurement

How to implement it — high level

The order the pieces come together

Provision one cloud VM on northcheck (Node / Docker, internal network); run Paperclip (:3100) + Fusion (:4040).
Make the seam repo the shared Agent Companies package (COMPANY / AGENTS / PROJECT / TASK / SKILL.md); fn init for Fusion, provider OAuth complete.
Stand up the skill-sync plugin (Penelope → seam skills/, scheduled + on-demand, agentshield-gated).
Run the stack: govern in Paperclip → cross the seam as PROMPT.md → execute in Fusion as one Stepwise workflow with the gate criteria in pre/post-merge steps → return to Paperclip.
Wire the GEB Shopify webhooks (in: questionnaire → Paperclip; out: approved action → Shopify).
Route models per layer through OpenRouter (strong for governance/review, cheap for execution); meter cost.
Stand up Langfuse; point Fusion's OTel export at it; turn the gate criteria into stored evals.
Expose the dashboards over Tailscale; set members/roles for Aman / Rajan / devs.
Run the single seam test end-to-end (GEB anchor) and read it as one thing.
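The per-layer model routing in the steps above (strong for governance and review, cheap for execution, with cost metered) can be sketched as a lookup table plus a running total. The model names and prices below are placeholders, not real OpenRouter identifiers or rates, and `CostMeter` is hypothetical.

```python
from dataclasses import dataclass, field

# Placeholder model ids; substitute the identifiers actually chosen.
ROUTES = {
    "governance": "strong-model",
    "review": "strong-model",
    "execution": "cheap-model",
}
# Made-up prices in USD per 1,000 tokens, for illustration only.
PRICE_PER_1K_TOKENS = {"strong-model": 0.01, "cheap-model": 0.001}


@dataclass
class CostMeter:
    spent: dict = field(default_factory=dict)  # layer -> USD spent so far

    def record(self, layer: str, tokens: int) -> float:
        """Add the cost of one call on the given layer and return it."""
        model = ROUTES[layer]
        cost = tokens / 1000 * PRICE_PER_1K_TOKENS[model]
        self.spent[layer] = self.spent.get(layer, 0.0) + cost
        return cost

    def total(self) -> float:
        return sum(self.spent.values())
```

Metering per layer rather than per model keeps the numbers comparable if the model behind a layer is later swapped.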

What is needed

One cloud VM on northcheck.
Paperclip + Fusion installed and authenticated.
The seam repo as a companies.sh package.
Read access (deploy key) to Penelope's skills repo + agentshield.
OpenRouter (+ HF) keys.
Shopify webhook/API access.
Langfuse (docker compose) + Fusion OTel export.
Tailscale (or Cloudflare Tunnel + Access) for the dashboards.
Paperclip member accounts/roles.
Human approvers for the gates.

07 / 08 · Implementation

Open items

What is still to settle

Three items the proposal leaves to the runbook and to running conditions.

The skill-sync plugin is the interim mechanism while there is no export path beyond manual copying. A real companies.sh catalogue / registry is the target; same shape, swappable.

The exact wiring of Shopify → Paperclip ingress and the Fusion trigger is fixed in the runbook, not in this architecture proposal.

Whether Fusion stays on the same VM or moves to a dedicated execution node / build farm on northcheck as scale grows — the seam contract does not change either way.

08 / 08 · Open items