orcha docs
orcha runs on the machine where your coding agents live. It watches your tmux panes and types into them on your command, so you can drive every session from one chat on your phone or any browser.
You need two things on that machine: a tmux server already running your
agents (Claude Code, Codex, and friends, each in a pane), and a brain
CLI (the default is Anthropic's claude). Both are covered below.
Install
Linux or macOS, x86_64 or arm64, no root. tmux runs on both; on macOS,
brew install tmux.
curl -fsSL https://orcha.cc | sh
The script detects your OS and architecture, downloads the matching release
binary, verifies its checksum, and installs to ~/.local/bin. Inspect it
first if you like:
curl -fsSL https://orcha.cc/install.sh | less
Or build from source:
go install github.com/spitis/orcha/cmd/orcha@latest
To update, re-run the install command. To remove the binary,
rm ~/.local/bin/orcha (your ~/.config/orcha and
~/.local/state/orcha stay until you delete them too).
Run it
With tmux running your agents and a brain CLI installed (see Brains), on that machine:
orcha up
orcha up embeds a relay, auto-detects your Tailscale or LAN address,
opens a 10-minute pairing window, and prints a QR code and a link. Scan the QR
from your phone (on the same tailnet or Wi-Fi), then approve the device in the
terminal. (Headless and systemd runs auto-accept whoever holds the one-time
secret.) Each device you pair is remembered; pair more the same way.
Voice input needs a secure context, since browsers only hand over the microphone
over HTTPS. When a Tailscale HTTPS cert is available, orcha up also
serves HTTPS on the next port and prints an https:// link. Open that
one for voice.
Configuration
orcha reads ~/.config/orcha/config.toml
($XDG_CONFIG_HOME/orcha/config.toml if set). It is created with
commented defaults the first time you run orcha up, so the file
documents itself. When you switch backends, set command and
model to match: the defaults below only change for keys you actually
set. The sections:
[brain]
backend = "agent-cli" # agent-cli | codex-cli | gemini-cli | antigravity | openai-compat
command = "claude" # the CLI to drive: claude / codex / gemini / agy
model = "haiku" # blank uses the backend's own default
# openai-compat (ollama or any OpenAI-compatible API):
base_url = "http://127.0.0.1:11434/v1"
# api_key = "" # or set ORCHA_API_KEY
# system_prompt = """ # override the built-in orchestrator persona (multi-line)
# You are a helpful fleet manager...
# """
[safety]
default_policy = "confirm-destructive" # read-only | confirm-destructive | confirm-all | full
# protected_sessions = ["prod-experiment"]
[tts]
command = "" # server-side voice: reads reply text on stdin, writes WAV to stdout
# voice = "af_heart" # default voice preset (passed as ORCHA_VOICE)
# voices = ["af_heart", "af_bella", "am_michael", "bf_emma"] # shown in settings UI
[remote]
relay = "" # wss://relay.example.com; also used by `orcha up` for dual-relay
Brains
The brain is the model that answers your questions and decides what to type. It is pluggable and uses a tool you already have:
- agent-cli (default): your installed
claudeCLI (Claude Code). A persistent, streaming process; fastest, and the recommended default. Blankmodelmeanshaiku. - codex-cli: your installed
codexCLI (OpenAI Codex CLI). - gemini-cli / antigravity: Gemini via Google's CLI
(
command = "gemini"or"agy"). Google is moving individual and AI Pro/Ultra users from gemini-cli to the Antigravity CLI on 2026-06-18; use theantigravitybackend there. - openai-compat: a local model via
ollama, or any OpenAI-compatible
endpoint. Set
base_urland amodel(required), plusapi_keyif needed. The endpoint and model must support OpenAI-style tool calls.
Connect from anywhere
Device-to-host traffic is end-to-end encrypted (see Security); the relay just passes it through. There are a few ways to run that relay:
- Your own, over a tailnet (recommended).
orcha upembeds a relay and your devices reach it across your Tailscale network, where the transport is additionally WireGuard-encrypted. Nothing leaves machines you control. - A relay you host elsewhere. To reach orcha when you are off your tailnet,
run the
orcha-relaybinary (go install github.com/spitis/orcha/cmd/orcha-relay@latest, or from the release archive) on a VPS or any host with a public address, pointremote.relayat it, then runorcha serveon your machine. The relay never holds keys or plaintext; serve it over HTTPS/WSS. - Both at once. When
remote.relayis set,orcha upconnects to both the embedded local relay and the remote relay. Your phone pairs through whichever path is available: local on the tailnet, remote when away. Both pairing URLs are printed at startup.
Safety
Every byte typed into a pane passes a destructive-input classifier, the pane's
policy, and an append-only audit log that fails closed: if the action cannot be
logged, it is not sent. The log lives at
~/.local/state/orcha/audit.jsonl.
The policy below governs input from your devices. The brain's own
typing always stops for your confirmation first (except under read-only,
which denies it). orcha never lets the model act unattended.
read-only: observe only; never type.confirm-destructive(default): your input goes through, but anything the classifier flags as destructive or irreversible (rm -rf,git push, publishes,docker push, remotersync) stops to ask.confirm-all: ask before every input.full: your input goes through without a prompt (still audited).
protected_sessions are tmux session names orcha must never type into: a hard deny, not a prompt. Observation still works. Use it for a mid-experiment or production session you want visible but untouchable.
Prose typed into agent panes carries a [user via orcha] or
[<model> via orcha] prefix, so anyone reading the pane can tell
what came from orcha.
Voice
Voice input uses the browser's speech recognition and needs a secure context
(HTTPS), which orcha up provides on a tailnet (see
Run it). Spoken transcripts are shown before any gated action runs.
Spoken replies default to your phone or browser's built-in speech. For far more
natural replies, set [tts].command to a wrapper that reads text on stdin
and writes WAV to stdout (kokoro gives the best quality at CPU-realtime; piper is the
fastest), and use voice / voices to pick presets.
Commands
orcha up: embed a relay, serve, and open a pairing window. The quickstart path.orcha serve: connect out to the relay inremote.relayand serve paired devices (for a relay you host).orcha pair: open a 10-minute pairing window for one device and print its QR and link. Run it withorcha servealready running, or to add a device.orcha chat: talk to the brain from this terminal, no device needed (handy for testing your config).orcha status: print the current fleet (sessions, panes, activity).orcha version: print the version.
Always-on (systemd)
To keep orcha running across logouts on Linux, install the user service:
mkdir -p ~/.config/systemd/user
curl -fsSL https://raw.githubusercontent.com/spitis/orcha/main/packaging/systemd/orcha.service \
-o ~/.config/systemd/user/orcha.service
systemctl --user daemon-reload
systemctl --user enable --now orcha
loginctl enable-linger "$USER" # keep running after logout
journalctl --user -u orcha -f # retrieve the pairing QR / link and logs
Security
orcha types into your terminals on command from a phone. That is remote code execution as a feature, so the design is explicit about what is protected and what is trusted.
Guarantees
- End-to-end encryption. Device-to-host traffic is NaCl box (X25519 + XSalsa20-Poly1305), fresh random nonce per frame. The relay forwards ciphertext only; it never holds keys or plaintext.
- Pairing secrets never touch the server. They travel in the URL fragment (browsers do not send fragments to the server), are one-time use, and expire in 10 minutes.
- Verification code. During pairing, both the terminal and the phone display a 6-digit code derived from both public keys. Verify they match before approving — this prevents an attacker who photographs the QR from silently substituting their own device.
- Gated input. Every keystroke into any pane passes a destructive-input classifier, per-pane policies, and an append-only audit log that fails closed: if the audit entry cannot be written, the action does not execute. Control characters (newlines, tabs) are rejected in text input to prevent classifier bypass.
- Content Security Policy. The web client is served with
script-src 'self'andframe-ancestors 'none'to block XSS and clickjacking. Vendored crypto (nacl.min.js) carries a subresource integrity hash. - No listening ports on your machine. The host dials out; the
only listener is the relay (yours, embedded in
orcha up, or one you chose).
Known limitations
- No forward secrecy. Static NaCl keys mean a future private-key compromise decrypts recorded relay traffic. A Noise handshake with ephemeral session keys is planned.
- LAN without TLS is trust-the-network. Without Tailscale HTTPS
certs, the web client is served over plaintext. An active attacker on the same
WiFi could swap the client code. With TLS this is eliminated;
orcha upbinds plaintext to localhost-only when TLS is available. - No device revocation UI (yet). A lost phone retains access
until you manually edit
~/.config/orcha/identity.json. Aorcha devices revokecommand is planned. - Headless mode auto-accepts pairing. Without a terminal
(e.g.
orcha serveunder systemd), the 128-bit time-limited secret is the only guard. Interactiveorcha upalways shows the verification code and prompts y/N. - Relay is trusted for code delivery. The web client is served
by the relay, so a compromised relay could serve a backdoored client (same model
as all web-delivered E2E apps). Mitigations: run your own relay
(
orcha upembeds one), or use HTTPS/WSS to a relay you trust.
The full threat model is in SECURITY.md in the repository.