Mockup 01 · Option 2 · Piyush-style (two-box, `claude -c -p` per turn)

The platonic ideal realized as three machines talking HTTPS + SSH. Derived from the pre-container Cortex design Piyush Jha shipped in Jan 2026, adapted for per-chat isolation and the P4 consent-loop. No tmux; no long-lived REPLs; no shared-host bot-workspace coupling.

TL;DR architectural difference

Bot lives on machine A. Per-chat workspaces live on machine B (separate VPS, Linux users, 0700 home dirs). Knowledge graph lives on machine C. One ssh → claude -c -p invocation per turn. A Bot Gateway crash leaves Claude's session memory untouched, because that memory lives on B.

UX is the same as the platonic ideal. This page focuses on the mechanical architecture that delivers it. For the user-facing scenes (attention, consent loop, manifest dialog, reactions), see Mockup 00.

1. The three machines

[Diagram: machines A, B, and C. Legend: machine · service · per-chat user · out-of-band path · HTTPS / in-channel · SSH]

2. What lives where

Component	Machine	Process / store	Cost of failure
Channel adapter + Bot Gateway	A	Fastify service, systemd	Inbound queue stalls; no data loss (Linq retries).
SQLite canonical log	A (local disk)	`/var/lib/picortex/picortex.sqlite`	Fatal. Daily `litestream` → S3.
Consent broker (pause state)	A	In-memory + SQLite rehydration row	Recovered from SQLite on restart; pending group waits survive.
Per-chat Linux user + home dir	B	`/srv/picortex/chats/<chat>`	Only that chat's context/memory lost if B is wiped.
Claude session memory	B	`~chat-X/.claude/`	Per-chat. Each chat resurrects with fresh memory; transcript rebuild possible from A's log.
noos graph	C	Neo4j (existing Lightsail deploy)	Bot degrades gracefully: "I can't reach my knowledge right now."

3. Concrete consent-loop turn (wire view)

A · inbound webhook. Linq POSTs message.received for "is jacob free tuesday?" to https://picortex.globalbr.ai/api/linq/inbound. HMAC verified. Row inserted in messages on A's SQLite.
A · attention gate passes (@mention detected).
A · Turn dispatcher checks the chat's manifest row: calendar not in scope.
A · consent broker activates. Inserts DisclosureEvent (approval_mode: pending, ttl_expires_at: +10min). Sends group ack via Linq sendMessage: "Let me check with him — one sec." Typing indicator on.
A · out-of-band DM to Jacob via Linq sendMessage with full context + proposed response + approve/deny keys.
A · Jacob replies y. Broker matches to the pending DisclosureEvent, marks approval_mode: approve-exact, expands turn scope to include calendar:2026-04-28T19:00/21:00.
A → B · SSH. Turn dispatcher opens SSH to machine B:
ssh -i /etc/picortex/keys/b_admin.pem picortex@B sudo -u chat-a1b2 -H claude --session-id chat-a1b2 -p "<system+context+prompt>"
B · Claude runs. Reads ~/.claude/sessions/chat-a1b2/ for prior turns on this chat. Calls optional noos MCP tool to check calendar. stdout = proposed reply.
A · outbound. Bot Gateway receives stdout over SSH, writes to SQLite, sends final reply to group via Linq sendMessage. Closes DisclosureEvent with final_reply + grant_ttl: single-use.
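
The first step above says the inbound webhook is HMAC-verified. A minimal sketch of such a check follows, assuming HMAC-SHA256 over the raw request body with a hex-encoded signature. The real algorithm, encoding, and header name are defined by Linq and are not stated in this mockup; `verifyWebhookSignature` and `signForTest` are hypothetical helper names.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto'

// Hypothetical helper. HMAC-SHA256 + hex encoding is an assumption made for
// illustration; the actual scheme is whatever Linq documents for its webhooks.
export function verifyWebhookSignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac('sha256', secret).update(rawBody, 'utf8').digest()
  const received = Buffer.from(signatureHex, 'hex')
  // timingSafeEqual throws when lengths differ, so reject mismatched lengths first.
  if (received.length !== expected.length) return false
  return timingSafeEqual(received, expected)
}

// Produces a signature in the same assumed format, useful for local tests.
export function signForTest(rawBody: string, secret: string): string {
  return createHmac('sha256', secret).update(rawBody, 'utf8').digest('hex')
}
```

The check has to run over the raw bytes of the request body, before any JSON parsing, since re-serialized JSON may not match what was signed.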

4. Code sketch (the Bot Gateway's executor)

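// MachineA/src/executor.ts: per-turn dispatcher (Option 2)
import { ulid } from 'ulid'
import { SSHClient } from './ssh.js'
// The helpers and types below are used by this sketch but defined elsewhere;
// these module paths are assumed for illustration.
import { buildSystemPrompt } from './manifest.js'
import { formatTurnInput, recordAndSend, replyFailure } from './turn.js'
import type { Chat, Message, Scope } from './types.js'

const ssh = new SSHClient() // assumed: host and key for machine B come from config

export async function executeTurn(chat: Chat, message: Message, scope: Scope) {
  const turnId = ulid()
  const ctxSystem = buildSystemPrompt(chat, scope)   // manifest-filtered
  const prompt = formatTurnInput(message)

  // One SSH exec per turn. No tmux, no sentinels.
  // execAsUser is expected to quote each argv element for the remote shell.
  const { stdout, code } = await ssh.execAsUser(
    chat.unix_user,
    // The system-prompt flag name is assumed; confirm it against the installed Claude CLI.
    ['claude', '--session-id', chat.id, '-p', '--dangerously-skip-permissions', '--system-prompt', ctxSystem, prompt],
    { env: { PICORTEX_TURN_ID: turnId }, timeout_ms: 120_000 },
  )

  // Non-zero exit: send a failure reply instead of recording the turn.
  if (code !== 0) return replyFailure(stdout)
  return recordAndSend(chat, stdout, { turnId })
}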
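
One detail the executor depends on: ssh joins the remote argv into a single string that the login shell on B parses again, so a prompt containing quotes or `$(...)` must be quoted before it crosses the wire. The following is a minimal sketch of that quoting, assuming a POSIX shell on machine B. `shellQuote` and `buildRemoteCommand` are hypothetical names, and the command shape simply mirrors the `sudo -u ... -H claude --session-id ... -p` line from the wire view.

```typescript
// POSIX single-quote escaping: wrap the argument in '...' and rewrite every
// embedded single quote as '\'' (close quote, escaped quote, reopen quote).
export function shellQuote(arg: string): string {
  return `'${arg.replace(/'/g, `'\\''`)}'`
}

// Hypothetical helper: builds the single command string handed to ssh.
// Every element is quoted, so user-controlled text cannot be re-interpreted
// by the remote shell.
export function buildRemoteCommand(unixUser: string, sessionId: string, prompt: string): string {
  const argv = ['sudo', '-u', unixUser, '-H', 'claude', '--session-id', sessionId, '-p', prompt]
  return argv.map(shellQuote).join(' ')
}
```

With this in place, a message such as `hi $(id)` reaches Claude as literal text rather than being expanded by the shell on B.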

Consent broker — pause/resume

// On pause: write DisclosureEvent row, send DM, return to event loop.
// On Jacob's reply: match by (jacob_dm_chat_id, in-flight-row), re-invoke executeTurn with expanded scope.
// On timeout: clear row, send group "I need to check on that offline", log approval_mode="timeout".
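
The three comments above describe a small state machine. It can be sketched as a pure transition function, under some loudly stated assumptions: the `deny` mode and the `n` reply key are not named in this document (only `pending`, `approve-exact`, and `timeout` are), timestamps are epoch milliseconds, and `resolveDisclosure` is a hypothetical name.

```typescript
type ApprovalMode = 'pending' | 'approve-exact' | 'deny' | 'timeout'

interface DisclosureEvent {
  id: string
  dmChatId: string        // the DM chat the approval request was sent to
  requestedScope: string  // e.g. 'calendar:2026-04-28T19:00/21:00'
  approvalMode: ApprovalMode
  ttlExpiresAt: number    // epoch milliseconds (assumed representation)
}

// Pure transition: returns the event's next state given an optional DM reply.
// Side effects (SQLite write, group message, re-invoking executeTurn) stay
// with the caller, keyed off the returned approvalMode.
export function resolveDisclosure(ev: DisclosureEvent, reply: string | null, now: number): DisclosureEvent {
  if (ev.approvalMode !== 'pending') return ev                    // already resolved
  if (now >= ev.ttlExpiresAt) return { ...ev, approvalMode: 'timeout' }
  if (reply === null) return ev                                    // still waiting
  const key = reply.trim().toLowerCase()
  if (key === 'y') return { ...ev, approvalMode: 'approve-exact' }
  if (key === 'n') return { ...ev, approvalMode: 'deny' }          // assumed deny key
  return ev                                                        // unrecognized reply: keep waiting
}
```

Keeping the transition pure means the same function can serve both the live reply path and the replay of pending rows after a restart.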

5. Strengths & costs, head-to-head

Strengths

Privacy split matches the threat model. A Claude turn that gets prompt-injected into "cat ~/.ssh/id_rsa" runs on B; the DM-approval state and Linq webhook secret live on A. Two independent blast radii.
Crash safety. A can restart mid-turn without losing Claude's memory (it's on B). B can be rebooted per-chat-user without affecting the bot.
No tmux fragility. No sentinel protocol. -p + stdout is the entire reply-capture contract.
Linear cost. ~$10/mo machine A + ~$15/mo machine B + free machine C (shared with noos).
Matches existing deploy patterns. Machine B = a stripped-down jcortex clone. Machine A = same systemd+Caddy pattern as voice-assistant.

Costs / risks

Two machines to maintain. Provisioning, keys, firewall rules, upgrades — doubled.
SSH setup burden. ed25519 key distribution, authorized_keys hygiene, monitored sudoers drop-in. Manageable but not zero.
Latency floor bumped by SSH. ~80-200 ms round-trip overhead per turn vs a single-box architecture. Warm reply still within budget (<15 s).
B must stay online for turns to complete. If B is down, turns queue on A and are retried once B returns; Linq's inbound retries cover messages that arrive in the meantime.
Claude CLI behavior dependency. --session-id contract + ~/.claude/ layout are upstream-owned. Drift = work.

6. Five failure modes & what happens

Failure	User-visible	Recovery
A restart mid-turn	Brief typing indicator gap; bot posts "back online, here's that answer" if turn was in consent-loop pause	Rehydrate paused DisclosureEvents from SQLite on boot
B down	Bot: "I'm briefly unable to think — one moment" in DM; groups get nothing until B returns	Retry turn once B is up; Linq retries inbound for up to 24 h
C (noos) down	Bot: "I can't reach my knowledge graph right now — try again?"	Attention gate still works; non-graph questions still answered
SSH key rotation mid-flight	Single turn fails loudly; reply is "I had a glitch, try again"	Key reloaded from Vault/env on next turn
Claude CLI OOMs on B	Turn errors out; reply "Jacob's bot had a glitch"	cgroup memory limit per chat user prevents cross-chat impact
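
The first row's recovery step (rehydrating paused DisclosureEvents on boot) can be sketched as a partition over the rows read back from SQLite. This is an illustration only: the row shape, the epoch-millisecond `ttl_expires_at`, and the name `partitionOnBoot` are assumptions, and the SQLite read itself is left out.

```typescript
// Assumed shape of a DisclosureEvent row as loaded from A's SQLite log.
interface DisclosureRow {
  id: string
  approval_mode: string
  ttl_expires_at: number // epoch milliseconds (assumed representation)
}

// Splits stored rows into waits that should be re-armed with their remaining
// TTL and waits that expired while the Bot Gateway was down.
export function partitionOnBoot(rows: DisclosureRow[], now: number) {
  const resume: { id: string; remainingMs: number }[] = []
  const expired: string[] = []
  for (const row of rows) {
    if (row.approval_mode !== 'pending') continue // resolved rows need no action
    const remainingMs = row.ttl_expires_at - now
    if (remainingMs > 0) resume.push({ id: row.id, remainingMs })
    else expired.push(row.id)
  }
  return { resume, expired }
}
```

Rows in `resume` would get a fresh timer for `remainingMs`; rows in `expired` would take the timeout path described in the consent broker notes.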

7. Why this might be the answer

Codex's session review flagged S3 (single shared tmux) and the tmux sentinel protocol as the weakest parts of the original plan. Piyush already ran the spike we were being asked to run — claude -c -p per turn, via SSH, with session memory in the workspace's own ~/.claude/. It worked. The machines are cheap, the isolation story matches the threat model, and the consent-loop broker pattern composes naturally on top.

8. What this doesn't solve

Does not protect against a compromised noos (machine C) leaking data. That's a separate trust boundary.
Does not eliminate shared-kernel risk on B. If the D2 isolation report finds Linux-user perms insufficient for the real threat model, B's per-chat users get wrapped in bubblewrap or Landlock — composes naturally, no re-architecture.
Does not give you an ephemeral per-chat machine (Option 3). If that becomes the real privacy requirement, B's per-chat users are replaced with Fly Machines per chat.

Compare with the platonic ideal (Mockup 00) or the alternative Option 4 stateless design (Mockup 02).
