Spec 002 — tmux session spawning

Status: Draft
Related: PRD FR-9..FR-11, ADR-0003

Goal

Each chat owns a tmux session running a Claude Code process in its chat-user home dir. Sessions persist across backend restarts, are visible from the web terminal, and have clean reply-capture semantics.

Session naming

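picortex:<chat_id>. Caution: tmux uses the colon as the session:window separator in targets and, as far as we can tell, does not keep it in session names (older releases reject such names; newer ones replace ":" and "." with "_"). Verify this against the deployed tmux version; picortex-<chat_id> is a safe fallback. The name uses the full chat_id (not hex) so it is recognizable when Jacob lists sessions manually.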
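Because tmux targets take the form session:window.pane, a candidate name is worth checking before it reaches tmux. A minimal sketch follows. session_name_ok is a hypothetical helper, not part of the existing scripts, and the rule it encodes (non-empty, no ":" or ".") reflects tmux's behaviour as we understand it, so confirm it against the deployed version.

```sh
# Hypothetical helper: succeed only if NAME is non-empty and contains
# neither ':' nor '.', the characters tmux reserves for target syntax.
session_name_ok() {
  case "$1" in
    ''|*:*|*.*) return 1 ;;
    *) return 0 ;;
  esac
}

# Example: report whether two candidate names pass the check.
for name in "picortex:12345" "picortex-12345"; do
  if session_name_ok "$name"; then
    echo "$name: ok"
  else
    echo "$name: contains a character tmux reserves for targets"
  fi
done
```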

Creation

Triggered on first inbound message, after chat-user provisioning (see Spec 001):

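# Ensure the log directory exists (harmless if Spec 001 provisioning already created it).
sudo -u "chat-$HEX" -H sh -c 'mkdir -p "$HOME/.picortex"'
sudo -u "chat-$HEX" -H tmux new-session -d -s "picortex:$CHAT_ID" -x 120 -y 40
# Single quotes: $HOME must expand in the chat user's shell, not the backend's.
sudo -u "chat-$HEX" -H tmux pipe-pane -t "picortex:$CHAT_ID" -o \
  'cat >> "$HOME/.picortex/session.log"'
sudo -u "chat-$HEX" -H tmux send-keys -t "picortex:$CHAT_ID" \
  "cd ~ && claude --dangerously-skip-permissions" Enter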

(--dangerously-skip-permissions is tentative — see PRD Q1. Likely OK because the chat user is already sandboxed to $HOME and can't escalate.)
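Since sessions are meant to survive backend restarts, creation probably needs to be idempotent. One way to sketch that is below. ensure_session is a hypothetical wrapper, and the session name is passed in as a parameter rather than hard-coded. The "=" prefix asks tmux for an exact session-name match.

```sh
# Hypothetical wrapper: create the tmux session only if it does not exist yet.
# Arguments: HEX (chat-user suffix), SESSION (tmux session name).
# Prints "exists" or "created"; returns non-zero if creation fails.
ensure_session() {
  hex=$1
  session=$2
  if sudo -u "chat-$hex" -H tmux has-session -t "=$session" 2>/dev/null; then
    echo "exists"
    return 0
  fi
  sudo -u "chat-$hex" -H tmux new-session -d -s "$session" -x 120 -y 40 || return 1
  echo "created"
}

# Usage (not executed here): ensure_session "$HEX" "$SESSION_NAME"
```

The pipe-pane and send-keys steps would then run only on the "created" path, so a restart does not launch a second Claude Code process in an existing session.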

Message dispatch

Per inbound message that passes attention gating:

1. Emit start sentinel:  tmux send-keys "<<PICORTEX-TURN-$TURN_ID-START>>"  Enter
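2. Emit user text:       tmux send-keys -l -- <escaped-payload>, then tmux send-keys Enter (-l sends the text literally, so a payload such as "Enter" or "C-c" is not read as a key name)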
3. Emit end sentinel:    tmux send-keys "<<PICORTEX-TURN-$TURN_ID-END>>"  Enter
4. Tail session.log waiting for the end sentinel to appear.
5. Extract bytes between start and end sentinels; strip ANSI; that's the reply.
6. POST to Linq (or return to channel abstraction).
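Steps 4 and 5 can be sketched as a pair of filters over session.log. This is an illustration under assumptions, not the final implementation: strip_ansi and extract_reply are hypothetical names, only CSI escape sequences are stripped, carriage returns from the terminal stream are dropped, and any line containing a sentinel is treated as a boundary. Whatever sits between the boundaries is returned, including any echoed input.

```sh
# Remove carriage returns and ANSI CSI sequences (colors, cursor moves) from stdin.
strip_ansi() {
  esc=$(printf '\033')
  tr -d '\r' | sed "s/${esc}\[[0-9;?]*[A-Za-z]//g"
}

# Print the lines strictly between the start and end sentinels for TURN_ID.
# Arguments: LOG (path to session.log), TURN_ID.
extract_reply() {
  log=$1
  turn_id=$2
  start="<<PICORTEX-TURN-${turn_id}-START>>"
  end="<<PICORTEX-TURN-${turn_id}-END>>"
  strip_ansi < "$log" | awk -v s="$start" -v e="$end" '
    index($0, e) { inside = 0 }   # end marker closes the region
    inside       { print }        # emit lines inside the region
    index($0, s) { inside = 1 }   # start marker opens it (marker line itself is skipped)
  '
}
```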

The sentinel protocol is robust against Claude Code's streaming output as long as the model doesn't emit a sentinel verbatim. Because $TURN_ID is a fresh UUID for each turn, the sentinels are unique and the chance of an accidental match is effectively zero.
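A sketch of how the per-turn markers might be generated. new_turn_id and sentinel are hypothetical helper names. The sketch assumes uuidgen is installed and falls back to Linux's /proc/sys/kernel/random/uuid otherwise.

```sh
# Generate a fresh UUID for the turn (assumes uuidgen or a Linux /proc).
new_turn_id() {
  if command -v uuidgen >/dev/null 2>&1; then
    uuidgen
  else
    cat /proc/sys/kernel/random/uuid
  fi
}

# Build a sentinel string. Arguments: TURN_ID, KIND (START or END).
sentinel() {
  printf '<<PICORTEX-TURN-%s-%s>>' "$1" "$2"
}

# Usage (not executed here):
#   TURN_ID=$(new_turn_id)
#   sentinel "$TURN_ID" START
```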

Fallback: if the end sentinel doesn't appear within 120 s, interrupt Claude Code with Ctrl-C (tmux send-keys C-c) and reply with an apology + request-id.
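The wait-with-timeout could be a simple polling loop over session.log. The sketch below assumes a one-second poll interval, which the spec does not fix. wait_for_end is a hypothetical name.

```sh
# Poll LOG until it contains the end sentinel for TURN_ID or TIMEOUT seconds pass.
# Arguments: LOG, TURN_ID, TIMEOUT (seconds, default 120).
# Returns 0 if the sentinel was seen, non-zero otherwise.
wait_for_end() {
  log=$1
  turn_id=$2
  timeout=${3:-120}
  end="<<PICORTEX-TURN-${turn_id}-END>>"
  waited=0
  while [ "$waited" -lt "$timeout" ]; do
    if grep -qF -- "$end" "$log" 2>/dev/null; then
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  # One last check so a sentinel written during the final sleep is not missed.
  grep -qF -- "$end" "$log" 2>/dev/null
}

# Usage (not executed here):
#   wait_for_end "$LOG" "$TURN_ID" 120 || tmux send-keys -t "$SESSION_NAME" C-c
```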

Lifecycle

Active: at least one message in the last 7 days.
Idle: tmux session exists but no message in 7+ days. On next inbound, send a "good morning" to the existing session and keep going.
Hibernated: after 30 days idle, archive the home dir and userdel -r. Next inbound triggers re-provisioning (cold path).

The cron job scripts/cron/lifecycle.sh runs hourly. All state changes are written to the events table.
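The state rules above reduce to a threshold check on days since the last message. A minimal sketch follows. days_since and lifecycle_state are hypothetical helpers, not the contents of lifecycle.sh, and the sketch assumes "30 days idle" means 30 days since the last message, which the text leaves open.

```sh
# Whole days between LAST (epoch seconds) and NOW (epoch seconds, default: current time).
days_since() {
  last=$1
  now=${2:-$(date +%s)}
  echo $(( (now - last) / 86400 ))
}

# Map days since the last message to a lifecycle state.
# Assumption: the 30-day hibernation clock starts at the last message.
lifecycle_state() {
  days=$1
  if [ "$days" -ge 30 ]; then
    echo hibernated
  elif [ "$days" -ge 7 ]; then
    echo idle
  else
    echo active
  fi
}

# Usage (not executed here): lifecycle_state "$(days_since "$LAST_MESSAGE_EPOCH")"
```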

Warm pool (stretch goal for S6)

Keep N=3 pre-provisioned "unclaimed" chat users with tmux sessions and Claude Code booted. On first inbound for a new chat, rename + assign instead of provisioning from scratch.

Worth it only if cold-start P95 exceeds NFR-1. Skip in v0.1 unless observed.

Web terminal integration

See Spec 003. The WS terminal bridge runs tmux attach -t picortex:$CHAT_ID inside sudo -u chat-$HEX.

Testing

Unit: sentinel protocol (mock session.log contents, i.e. the pipe-pane output).
Integration: real tmux; send-keys a message; assert reply extraction matches.
Stress: 50 concurrent chats; latency budget.
Recovery: kill backend during a turn; restart; assert the session survives and the in-flight turn logs a recovery event.

Open questions

OQ1: What if Claude Code crashes inside tmux? (Respawn, log an event, reply "I had a glitch, try again".)
OQ2: What if the user sends a message while the previous turn is still running? (Queue. Only one turn at a time per chat.)
OQ3: Do we ever want a long-lived streaming reply (partial messages to Linq as it generates)? Probably post-v0.1.