High-level architecture proposal
This document is a first, high-level vision of how the whole system stays interconnected: the two companies, the two automation platforms, the shared skill ecosystem, and the automations that drive them. It is meant to be the base for the future implementation of the architecture.
It is solution-shaping but deliberately high-level: it names the systems, the layers between them, and how information flows, but it does not specify code, schemas, or a final platform choice. It is provisional and founded on the intent; it will be reconciled with Rajan's revised vision and the detailed architecture.
Status: Draft · first vision · provisional
Owner: Federico (first large-scale architecture proposal)
Founded on: requirement-intent.md
What this architecture realises
This architecture exists to realise the one-paragraph intent from the brief, verbatim:
Build an internal, self-improving agentic system that runs the day-to-day production and operations of two companies — Global Economic Bridge (GEB) and Octaflow — through supervised AI agents directed from a UI and a shared skill-file ecosystem, in order to cut production cost sharply (use cheaper models wherever quality allows) and free budget for growth, while preserving output quality through deliberate, human-guided model selection and oversight. It must be provable internally first and usable immediately, not a long-horizon research effort.
Every design choice below traces back to this paragraph: two companies on one foundation, supervised agents, a shared skill ecosystem, cost cut and measured, quality preserved, internal-first.
Stages of the pipeline
The pipeline is the same for both companies. Each stage below describes what happens, walked through with the GEB anchor example: a shopper submits a questionnaire on the GEB website.
- Trigger / event. A shopper submits a questionnaire on the GEB website (Shopify storefront). An event-driven, IFTTT-style webhook fires — the start of an automated loop.
- Ingress + platform routing. The event reaches the automation platform currently assigned to GEB (Paperclip by default) via a webhook/routine. The platform-abstraction layer translates the event into one standard task/run on whichever platform the company is assigned to — so the company, not the platform, owns the event.
- Orchestration. The platform creates a task/issue and wakes the responsible GEB agent (the company's org of agents under human supervision).
- Skill resolution. The agent pulls the relevant skill file from the shared Skills Catalogue (e.g. an order-questionnaire-response skill). Skills are platform-agnostic, so the same skill runs on either platform.
- Model selection. The tiered router picks the cheapest model that clears the task's quality bar; the model call is routed through OpenRouter, where token cost is metered.
- Execution. The agent runs the skill across the relevant product(s) of the ~193 on the site — drafting the response, updating the product, or preparing the purchase comms.
- Human-in-the-loop. Higher-risk output (outbound buyer communication) passes a human review gate before it can be released.
- Action / egress. On approval, the action is written back to the target system (the Shopify product / the reply to the shopper), and the run is recorded — inputs, steps, output, cost, status.
- Feedback / self-improvement. Human feedback on the run can revise the skill file in the catalogue; the revision is re-tested and promoted via a gate, so the system improves from real use.
- Cost measurement. Cost is rolled up from OpenRouter and surfaced as the headline metric, proving the saving that frees marketing budget.
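The model-selection and cost-measurement stages above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tier names, prices, quality scores, and the function names pick_model, run_cost and rollup are hypothetical, and no real OpenRouter call is made. It only shows the rule "cheapest model that clears the task's quality bar" and a per-company cost roll-up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                  # hypothetical label, not a real model id
    cost_per_1k_tokens: float  # illustrative price, not a real quote
    quality_score: float       # illustrative 0..1 score from internal evaluation


# Illustrative tiers only; real values would come from evaluation and metering.
TIERS = [
    ModelTier("small", 0.10, 0.70),
    ModelTier("medium", 0.50, 0.85),
    ModelTier("large", 3.00, 0.95),
]


def pick_model(quality_bar: float, tiers=TIERS) -> ModelTier:
    """Return the cheapest tier whose quality score clears the bar."""
    eligible = [t for t in tiers if t.quality_score >= quality_bar]
    if not eligible:
        raise ValueError(f"no tier clears quality bar {quality_bar}")
    return min(eligible, key=lambda t: t.cost_per_1k_tokens)


def run_cost(tier: ModelTier, tokens: int) -> float:
    """Cost of one run, given the tokens it consumed."""
    return tier.cost_per_1k_tokens * tokens / 1000


def rollup(runs) -> dict:
    """Sum cost per company from (company, tier, tokens) run records."""
    totals = {}
    for company, tier, tokens in runs:
        totals[company] = totals.get(company, 0.0) + run_cost(tier, tokens)
    return totals
```

In this sketch a task with a quality bar of 0.8 would be routed to the "medium" tier, because "small" does not clear the bar and "large" costs more. How the quality scores are produced is left open here, as it is in the rest of this document.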
These ten stages run identically if GEB is moved to Fusion; only the adapter in stage 2 (ingress + platform routing) changes. The company, the skills, the model routing, the review and the feedback loop are unchanged. That invariance is exactly what makes a fair A/B comparison possible.
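One way to picture that invariance is a single task contract with one adapter per platform. The sketch below is hypothetical throughout: the Task fields, the PlatformAdapter protocol, and the adapter bodies are assumptions, and the adapters return placeholder dictionaries rather than calling any real Paperclip or Fusion API.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class Task:
    """Hypothetical standard task/run contract; field names are assumptions."""
    company: str
    skill: str
    payload: dict = field(default_factory=dict)


class PlatformAdapter(Protocol):
    def submit(self, task: Task) -> dict: ...


class PaperclipAdapter:
    def submit(self, task: Task) -> dict:
        # Placeholder: a real adapter would hand the task to Paperclip here.
        return {"platform": "paperclip", "task": task}


class FusionAdapter:
    def submit(self, task: Task) -> dict:
        # Placeholder: a real adapter would hand the task to Fusion here.
        return {"platform": "fusion", "task": task}


def to_task(event: dict) -> Task:
    """Translate a raw trigger event into the one standard task."""
    return Task(
        company=event["company"],
        skill=event["skill"],
        payload=event.get("data", {}),
    )


def dispatch(event: dict, adapter: PlatformAdapter) -> dict:
    """Only the adapter differs between platforms; the task does not."""
    return adapter.submit(to_task(event))
```

Dispatching the same questionnaire event through PaperclipAdapter and FusionAdapter yields the same Task in both results. Only the "platform" value differs, which is the property the A/B comparison relies on.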
How everything connects — and swaps
The diagram models the five named systems — GEB, Octaflow, Fusion, Paperclip, and the Skills Catalogue — plus the abstraction layer that makes the two platforms interchangeable, and the shared rails.
How the A/B interchangeability is modelled: both companies reach both platforms through the platform-abstraction layer. The default bindings are solid (GEB → Paperclip, Octaflow → Fusion); the alternate bindings are dashed (GEB → Fusion, Octaflow → Paperclip). The dashed edges are the reversible A/B swap — a configuration change, not a rebuild.
Reading it: trigger sources feed the companies; companies submit work through the abstraction layer; the layer routes each company to its assigned platform (solid = default, dashed = A/B alternate); both platforms pull from the one Skills Catalogue, route models through OpenRouter, and send high-risk output to the human review gate; approved actions flow back to the source systems and feedback revises the catalogued skills.
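The solid and dashed bindings can be thought of as plain configuration. The sketch below assumes a simple company-to-platform mapping; the names DEFAULT_BINDINGS, PLATFORMS, platform_for and swap are hypothetical and not part of either platform.

```python
# Default (solid) bindings from the text above; the dict shape is an assumption.
DEFAULT_BINDINGS = {"GEB": "paperclip", "Octaflow": "fusion"}
PLATFORMS = {"paperclip", "fusion"}


def platform_for(bindings: dict, company: str) -> str:
    """Look up the platform a company is currently assigned to."""
    return bindings[company]


def swap(bindings: dict, company: str, platform: str) -> dict:
    """Return new bindings with one company moved; the input is left untouched."""
    if company not in bindings:
        raise KeyError(f"unknown company: {company}")
    if platform not in PLATFORMS:
        raise ValueError(f"unknown platform: {platform}")
    return {**bindings, company: platform}


# The dashed A/B edge: move GEB to Fusion, then reverse the change.
ab_bindings = swap(DEFAULT_BINDINGS, "GEB", "fusion")
restored = swap(ab_bindings, "GEB", "paperclip")
```

Because swap returns a new mapping instead of mutating the old one, the default bindings stay available for the reverse move. This is one simple way to keep the A/B change a configuration change rather than a rebuild.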
The order the pieces come together
This is a descriptive, very high-level sequence: the order in which the pieces come together, not a build spec.
- Stand up both platforms. Deploy Paperclip and Fusion, self-hosted and internal-only.
- Build the Skills Catalogue. Ingest existing skill files/libraries and Penelope's skills; keep every skill platform-agnostic so it runs on either platform.
- Define the abstraction layer. One task/run contract plus a Paperclip adapter and a Fusion adapter, so a company can run on either platform.
- Wire the default bindings. GEB → Paperclip, Octaflow → Fusion, as reversible configuration (not hard-coding).
- Connect the triggers and connectors. GEB Shopify webhooks (the questionnaire), Lark, and the content channels — the IFTTT-style event sources.
- Wire model routing. Route all model calls through OpenRouter with the tiered cost/quality selection and per-run cost metering.
- Add governance. The human-review gate for high-risk output, and the feedback → skill-revision loop with a promotion gate.
- Run the anchor case end-to-end. Execute the GEB shopper automation internally and measure cost.
- Run the A/B. Move a company to the alternate platform via the abstraction layer; compare cost, quality, and speed; record the result.
- Decide on evidence. Choose the platform per company from the A/B, and iterate.
What is needed