orcha docs

orcha runs on the machine where your coding agents live. It watches your tmux panes and types into them on your command, so you can drive every session from one chat on your phone or any browser.

You need two things on that machine: a tmux server already running your agents (Claude Code, Codex, and friends, each in a pane), and a brain CLI (the default is Anthropic's claude). Both are covered below.

Install

Linux or macOS, x86_64 or arm64, no root. tmux runs on both; on macOS, brew install tmux.

curl -fsSL https://orcha.cc | sh

The script detects your OS and architecture, downloads the matching release binary, verifies its checksum, and installs to ~/.local/bin. Inspect it first if you like:

curl -fsSL https://orcha.cc/install.sh | less

Or build from source:

go install github.com/spitis/orcha/cmd/orcha@latest

To update, re-run the install command. To remove the binary, rm ~/.local/bin/orcha (your ~/.config/orcha and ~/.local/state/orcha stay until you delete them too).

Run it

With tmux running your agents and a brain CLI installed (see Brains), on that machine:

orcha up

orcha up embeds a relay, auto-detects your Tailscale or LAN address, opens a 10-minute pairing window, and prints a QR code and a link. Scan the QR from your phone (on the same tailnet or Wi-Fi), then approve the device in the terminal. (Headless and systemd runs auto-accept whoever holds the one-time secret.) Each device you pair is remembered; pair more the same way.

Voice input needs a secure context, since browsers only hand over the microphone over HTTPS. When a Tailscale HTTPS cert is available, orcha up also serves HTTPS on the next port and prints an https:// link. Open that one for voice.

Configuration

orcha reads ~/.config/orcha/config.toml ($XDG_CONFIG_HOME/orcha/config.toml if set). It is created with commented defaults the first time you run orcha up, so the file documents itself. When you switch backends, set command and model to match: the defaults below only change for keys you actually set. The sections:

[brain]
backend = "agent-cli"   # agent-cli | codex-cli | gemini-cli | antigravity | openai-compat
command = "claude"      # the CLI to drive: claude / codex / gemini / agy
model   = "haiku"       # blank uses the backend's own default
# openai-compat (ollama or any OpenAI-compatible API):
base_url = "http://127.0.0.1:11434/v1"
# api_key = ""          # or set ORCHA_API_KEY
# system_prompt = """   # override the built-in orchestrator persona (multi-line)
# You are a helpful fleet manager...
# """

[safety]
default_policy = "confirm-destructive"   # read-only | confirm-destructive | confirm-all | full
# protected_sessions = ["prod-experiment"]

[tts]
command = ""            # server-side voice: reads reply text on stdin, writes WAV to stdout
# voice  = "af_heart"   # default voice preset (passed as ORCHA_VOICE)
# voices = ["af_heart", "af_bella", "am_michael", "bf_emma"]  # shown in settings UI

[remote]
relay = ""              # wss://relay.example.com; also used by `orcha up` for dual-relay

Brains

The brain is the model that answers your questions and decides what to type. It is pluggable and uses a tool you already have:

  • agent-cli (default): your installed claude CLI (Claude Code). A persistent, streaming process; fastest, and the recommended default. Blank model means haiku.
  • codex-cli: your installed codex CLI (OpenAI Codex CLI).
  • gemini-cli / antigravity: Gemini via Google's CLI (command = "gemini" or "agy"). Google is moving individual and AI Pro/Ultra users from gemini-cli to the Antigravity CLI on 2026-06-18; use the antigravity backend there.
  • openai-compat: a local model via ollama, or any OpenAI-compatible endpoint. Set base_url and a model (required), plus api_key if needed. The endpoint and model must support OpenAI-style tool calls.

Connect from anywhere

Device-to-host traffic is end-to-end encrypted (see Security); the relay just passes it through. There are a few ways to run that relay:

  • Your own, over a tailnet (recommended). orcha up embeds a relay and your devices reach it across your Tailscale network, where the transport is additionally WireGuard-encrypted. Nothing leaves machines you control.
  • A relay you host elsewhere. To reach orcha when you are off your tailnet, run the orcha-relay binary (go install github.com/spitis/orcha/cmd/orcha-relay@latest, or from the release archive) on a VPS or any host with a public address, point remote.relay at it, then run orcha serve on your machine. The relay never holds keys or plaintext; serve it over HTTPS/WSS.
  • Both at once. When remote.relay is set, orcha up connects to both the embedded local relay and the remote relay. Your phone pairs through whichever path is available: local on the tailnet, remote when away. Both pairing URLs are printed at startup.

Safety

Every byte typed into a pane passes a destructive-input classifier, the pane's policy, and an append-only audit log that fails closed: if the action cannot be logged, it is not sent. The log lives at ~/.local/state/orcha/audit.jsonl.

The policy below governs input from your devices. The brain's own typing always stops for your confirmation first (except under read-only, which denies it). orcha never lets the model act unattended.

  • read-only: observe only; never type.
  • confirm-destructive (default): your input goes through, but anything the classifier flags as destructive or irreversible (rm -rf, git push, publishes, docker push, remote rsync) stops to ask.
  • confirm-all: ask before every input.
  • full: your input goes through without a prompt (still audited).

protected_sessions are tmux session names orcha must never type into: a hard deny, not a prompt. Observation still works. Use it for a mid-experiment or production session you want visible but untouchable.

Prose typed into agent panes carries a [user via orcha] or [<model> via orcha] prefix, so anyone reading the pane can tell what came from orcha.

Voice

Voice input uses the browser's speech recognition and needs a secure context (HTTPS), which orcha up provides on a tailnet (see Run it). Spoken transcripts are shown before any gated action runs.

Spoken replies default to your phone or browser's built-in speech. For far more natural replies, set [tts].command to a wrapper that reads text on stdin and writes WAV to stdout (kokoro gives the best quality at CPU-realtime; piper is the fastest), and use voice / voices to pick presets.

Commands

  • orcha up: embed a relay, serve, and open a pairing window. The quickstart path.
  • orcha serve: connect out to the relay in remote.relay and serve paired devices (for a relay you host).
  • orcha pair: open a 10-minute pairing window for one device and print its QR and link. Run it with orcha serve already running, or to add a device.
  • orcha chat: talk to the brain from this terminal, no device needed (handy for testing your config).
  • orcha status: print the current fleet (sessions, panes, activity).
  • orcha version: print the version.

Always-on (systemd)

To keep orcha running across logouts on Linux, install the user service:

mkdir -p ~/.config/systemd/user
curl -fsSL https://raw.githubusercontent.com/spitis/orcha/main/packaging/systemd/orcha.service \
  -o ~/.config/systemd/user/orcha.service
systemctl --user daemon-reload
systemctl --user enable --now orcha
loginctl enable-linger "$USER"      # keep running after logout
journalctl --user -u orcha -f       # retrieve the pairing QR / link and logs

Security

orcha types into your terminals on command from a phone. That is remote code execution as a feature, so the design is explicit about what is protected and what is trusted.

Guarantees

  • End-to-end encryption. Device-to-host traffic is NaCl box (X25519 + XSalsa20-Poly1305), fresh random nonce per frame. The relay forwards ciphertext only; it never holds keys or plaintext.
  • Pairing secrets never touch the server. They travel in the URL fragment (browsers do not send fragments to the server), are one-time use, and expire in 10 minutes.
  • Verification code. During pairing, both the terminal and the phone display a 6-digit code derived from both public keys. Verify they match before approving — this prevents an attacker who photographs the QR from silently substituting their own device.
  • Gated input. Every keystroke into any pane passes a destructive-input classifier, per-pane policies, and an append-only audit log that fails closed: if the audit entry cannot be written, the action does not execute. Control characters (newlines, tabs) are rejected in text input to prevent classifier bypass.
  • Content Security Policy. The web client is served with script-src 'self' and frame-ancestors 'none' to block XSS and clickjacking. Vendored crypto (nacl.min.js) carries a subresource integrity hash.
  • No listening ports on your machine. The host dials out; the only listener is the relay (yours, embedded in orcha up, or one you chose).

Known limitations

  • No forward secrecy. Static NaCl keys mean a future private-key compromise decrypts recorded relay traffic. A Noise handshake with ephemeral session keys is planned.
  • LAN without TLS is trust-the-network. Without Tailscale HTTPS certs, the web client is served over plaintext. An active attacker on the same WiFi could swap the client code. With TLS this is eliminated; orcha up binds plaintext to localhost-only when TLS is available.
  • No device revocation UI (yet). A lost phone retains access until you manually edit ~/.config/orcha/identity.json. A orcha devices revoke command is planned.
  • Headless mode auto-accepts pairing. Without a terminal (e.g. orcha serve under systemd), the 128-bit time-limited secret is the only guard. Interactive orcha up always shows the verification code and prompts y/N.
  • Relay is trusted for code delivery. The web client is served by the relay, so a compromised relay could serve a backdoored client (same model as all web-delivered E2E apps). Mitigations: run your own relay (orcha up embeds one), or use HTTPS/WSS to a relay you trust.

The full threat model is in SECURITY.md in the repository.