design(t-paliad-151) amend: port 22022 bypass + Phase A.0 results

Phase A.0 revealed Tailscale SSH on mRiver intercepts :22 from tailnet
peers and bypasses OpenSSH's authorized_keys entirely (banner
"SSH-2.0-Tailscale", auth method "none", command= restriction never
fires). The fix is port 22022 via a systemd ssh.socket drop-in:
Tailscale SSH only intercepts :22, so :22022 hits real OpenSSH where
the design's command=/from= shim restriction works as specified.

Updated:
- §3 locked decisions: row 5 added (port 22022, m's call 23:35)
- §4.5 new subsection: Tailscale SSH bypass via socket drop-in
  + records the "Address already in use" first-attempt failure as a
  "don't retry without cleaning sshd_config Port directives first"
  lesson
- §5.2/5.3: ssh-keyscan now uses -p 22022; known_hosts is host:port
  keyed for non-22 ports
- §6.1/6.2/6.3: SSHPort field on RemotePaliadinService config, -p
  flag in callShim, PALIADIN_REMOTE_PORT env (default 22022)
- §7 phasing: A.0 completion checked off step-by-step with concrete
  fingerprints; A.5/A.6/A.7 split out as m-driven
- §8 security: Tailscale-SSH-on-:22 risk explicitly tabled with
  port-22022 mitigation
- §10 deliverables: mRiver host-setup artifacts noted
- §12 new Phase A.0 completion summary with the three secrets m
  needs to register in Dokploy

Phase A.0 verified end-to-end:
- ssh -p 22022 paliad-prod-key m@mriver health → ok
- run-turn UUID base64msg → 3.4 s including a real Claude response
- from="100.99.98.201" correctly rejects connections from mRiver
  itself

mRiver host state in place (not in repo): authorized_keys with
restrictions, /home/m/.local/bin/paliadin-shim, ssh.socket drop-in.
Three secrets staged at ~/.paliad-staging/ on mRiver for m to copy
into Dokploy: paliad-prod-key (PALIADIN_SSH_PRIVATE_KEY),
known_hosts (PALIADIN_KNOWN_HOSTS), and the three plain env vars.

Refs m/paliad#12
m
2026-05-07 23:37:26 +02:00
parent 024841129f
commit f952fb85c3


@@ -50,9 +50,12 @@ m made four design-shaping calls via the inventor's `AskUserQuestion` pass. They
 | 2 | SSH-to-mRiver protocol granularity | **Server-side `paliadin-shim` (one RPC per turn)** |
 | 3 | Routing trigger | **Env var `PALIADIN_REMOTE_HOST` + interface split** |
 | 4 | SSH private key storage | **Dokploy secret env var `PALIADIN_SSH_PRIVATE_KEY`** |
+| 5 | SSH port to bypass Tailscale SSH | **Port 22022 via `ssh.socket` drop-in (Phase A finding, 23:30)** |
 Decision (1) was *not* the inventor's recommendation — host mode has known interaction risk with traefik (§4.2). m is overriding the recommendation; this design accepts the call and codifies a Phase A test step that gates the rollout on traefik still working under host mode. If Phase A blows up, the fallback is to revisit (1) in a follow-up issue, not to silently swap to a sidecar.
+Decision (5) emerged during Phase A: Tailscale SSH on mRiver was found to intercept `:22` from tailnet peers and bypass OpenSSH's `authorized_keys` entirely (banner says "Tailscale", auth method "none"). The `command=` shim restriction therefore never fires on the standard port. Adding port 22022 via a `systemd ssh.socket` drop-in routes paliad's connections to real OpenSSH where the restriction works. m's interactive `tailscale ssh m@mriver` on `:22` stays untouched. See §4.5 for the implementation.
 ---
 ## 4. Sub-design A — Container Tailscale shape
@@ -73,9 +76,10 @@ services:
 ...
 # NEW Paliadin remote-routing knobs
 - PALIADIN_REMOTE_HOST=${PALIADIN_REMOTE_HOST} # 100.99.98.203
+- PALIADIN_REMOTE_PORT=${PALIADIN_REMOTE_PORT} # 22022 (bypasses Tailscale SSH, see §4.5)
 - PALIADIN_REMOTE_USER=${PALIADIN_REMOTE_USER} # m
 - PALIADIN_SSH_PRIVATE_KEY=${PALIADIN_SSH_PRIVATE_KEY}
-- PALIADIN_KNOWN_HOSTS=${PALIADIN_KNOWN_HOSTS} # one-line ssh-keyscan output
+- PALIADIN_KNOWN_HOSTS=${PALIADIN_KNOWN_HOSTS} # one-line ssh-keyscan -p 22022 output
 restart: unless-stopped
``` ```
@@ -115,6 +119,31 @@ Image-size delta: alpine `openssh-client` is ~1.1 MB compressed — negligible.
 - No tailscaled process inside the container.
 - No new sidecar container.
+### 4.5 Bypassing Tailscale SSH via port 22022 (Phase A discovery)
+**Phase A revealed** that Tailscale SSH on mRiver intercepts `:22` from tailnet peers before OpenSSH sees the connection. The SSH banner reads `SSH-2.0-Tailscale`, the verbose log shows `Authenticated using "none"`, and the `authorized_keys command=` directive is therefore inert. mRiver's `tailscale status --json` confirms the `https://tailscale.com/cap/ssh` capability is enabled.
+The fix: a separate listening port for the paliad route, where Tailscale SSH does not intercept and real OpenSSH handles auth.
+mRiver uses systemd socket activation for sshd (`/usr/lib/systemd/system/ssh.socket` binds `:22`). Setting `Port 22022` in `sshd_config` is **ignored** under socket activation — listen ports come from the socket unit, not sshd's own config. The correct change is a drop-in:
+```ini
+# /etc/systemd/system/ssh.socket.d/paliad.conf
+[Socket]
+ListenStream=0.0.0.0:22022
+ListenStream=[::]:22022
+```
+Followed by `systemctl daemon-reload && systemctl restart ssh.socket`. Both `:22` (still routed through Tailscale SSH for m's interactive use) and `:22022` (real OpenSSH) end up listening. The same sshd binary handles both — same host key, same `authorized_keys`, same sshd_config. The only difference is *which port* a peer dials.
+A failed first attempt (2026-05-07 23:07) added the drop-in while a stale `Port 22022` directive in `sshd_config.d/99-paliad-test.conf` was still bound — the resulting `Address already in use` took `ssh.socket` down for ~30 s until reverted. Lesson: clean any prior `Port` directives out of `sshd_config.d/*.conf` before retrying the socket drop-in.
+Phase A end-to-end test (2026-05-07 23:31) succeeded with port 22022:
+- `ssh -p 22022 -i paliad-prod-key m@100.99.98.203 health` → `ok`
+- `run-turn <uuid> <base64-msg>` → 3.4 s round-trip including a Claude-Code response
+- `from="100.99.98.201"` correctly rejected a connection sourced from mRiver itself (`Permission denied (publickey,password)`)
 ---
 ## 5. Sub-design B — SSH identity, restricted shim, host-key pinning
@@ -167,8 +196,8 @@ Each restriction matters:
 `StrictHostKeyChecking=accept-new` is too loose for a long-lived production identity (one-time MITM during first connect substitutes a different key forever). Instead:
-- During Phase A, run `ssh-keyscan -t ed25519 100.99.98.203` on mLake.
+- During Phase A, run `ssh-keyscan -p 22022 -t ed25519 100.99.98.203` on mLake.
-- Capture the single output line.
+- Capture the single output line. The host-key portion is identical to the `:22` entry — same sshd, same keys — but the `[100.99.98.203]:22022` prefix matters because OpenSSH's `known_hosts` is `host:port`-keyed for non-22 ports.
 - Store as Dokploy secret `PALIADIN_KNOWN_HOSTS`.
 - At container startup, write to `/tmp/paliadin-known_hosts` chmod 644.
 - Pass to OpenSSH via `-o UserKnownHostsFile=/tmp/paliadin-known_hosts -o StrictHostKeyChecking=yes`.
@@ -318,6 +347,7 @@ type RemotePaliadinService struct {
     db *sqlx.DB
     users *UserService
     sshHost string // 100.99.98.203
+    sshPort int // 22022 — bypasses Tailscale SSH on :22 (see §4.5)
     sshUser string // m
     sshKeyPath string // /tmp/paliadin-id_ed25519-<rand>
     knownHosts string // /tmp/paliadin-known_hosts
@@ -345,13 +375,15 @@ case remoteHost != "":
     if keyPath == "" { log.Fatalf("paliadin: PALIADIN_REMOTE_HOST set but no PALIADIN_SSH_PRIVATE_KEY") }
     knownHosts, err := loadPaliadinKnownHosts()
     if err != nil { log.Fatalf("paliadin: load known_hosts: %v", err) }
+    port, _ := strconv.Atoi(cmpOr(os.Getenv("PALIADIN_REMOTE_PORT"), "22022"))
     paliadin = services.NewRemotePaliadinService(db, userSvc, services.RemotePaliadinConfig{
         SSHHost: remoteHost,
+        SSHPort: port,
         SSHUser: cmpOr(os.Getenv("PALIADIN_REMOTE_USER"), "m"),
         SSHKeyPath: keyPath,
         KnownHostsPath: knownHosts,
     })
-    log.Printf("paliadin: remote mode → ssh %s@%s", "m", remoteHost)
+    log.Printf("paliadin: remote mode → ssh %s@%s:%d", "m", remoteHost, port)
 case localTmuxAvailable():
     paliadin = services.NewLocalPaliadinService(db, userSvc, "", "")
     log.Printf("paliadin: local tmux mode")
@@ -370,7 +402,10 @@ default:
```go ```go
 func (s *RemotePaliadinService) callShim(ctx context.Context, args ...string) ([]byte, error) {
     sshArgs := []string{
+        "-F", "/dev/null", // ignore /etc/ssh/ssh_config + ~/.ssh/config
         "-i", s.sshKeyPath,
+        "-p", strconv.Itoa(s.sshPort), // 22022 — bypasses Tailscale SSH on :22
+        "-o", "IdentitiesOnly=yes", // don't fall back to other keys
         "-o", "UserKnownHostsFile=" + s.knownHostsPath,
         "-o", "StrictHostKeyChecking=yes",
         "-o", "BatchMode=yes",
@@ -500,25 +535,49 @@ Verdict: skip ControlMaster in v1. If turn latency over Tailscale is measured >3
 Goal: validate the round-trip end-to-end on a deployed paliad, before touching the image.
-Steps:
-1. **Generate keypair** on mRiver: `ssh-keygen -t ed25519 -N "" -C "paliad-prod" -f /tmp/paliad-prod-key`.
-2. **Install shim** at `/home/m/.local/bin/paliadin-shim` (script from §5.4), `chmod 755`.
-3. **Write authorized_keys** with the public key + restrictions from §5.2.
-4. **Capture mRiver host key**: `ssh-keyscan -t ed25519 100.99.98.203 > /tmp/paliad-known_hosts` from mLake.
-5. **Confirm host networking trade-off (§4.2):** flip the running paliad-prod compose to `network_mode: host` on a temporary branch; redeploy via Dokploy; verify `https://paliad.de/` still serves correctly via traefik. **Gate**: if traefik 502s, abort Phase A and revisit decision 1 in a follow-up issue.
-6. **Smoke-test SSH from inside the container**:
-```
-docker exec -it paliad-prod sh
-apk add --no-cache openssh-client # one-shot, before Dockerfile change
-ssh -i /tmp/key -o UserKnownHostsFile=/tmp/known_hosts m@100.99.98.203 health
-# expected: "ok"
-ssh … run-turn $(uuidgen) "$(printf 'Hallo Paliadin' | base64 -w0)"
-# expected: response body, then ".../uuid.txt" cleaned up
-```
-7. **Wire env vars manually** via Dokploy UI for one deploy; confirm `/paliadin` works end-to-end against mRiver.
-If Phase A passes: codify into Phase B. If it fails on (5), the design rolls back to a sidecar in a new issue (decision 1 follow-up). If it fails elsewhere, fix the shim or the SSH config; the architecture is fine.
+**Phase A.0 (DONE 2026-05-07 23:31):** SSH+shim end-to-end on the tailnet.
+1. **Generate keypair** on mRiver: `ssh-keygen -t ed25519 -N "" -C "paliad-prod" -f ~/.paliad-staging/paliad-prod-key`. Fingerprint `SHA256:5uV8v872F/IhJycjjq0crFue/emAYfw71N9bxTvkl9c`.
+2. **Commit shim** to `scripts/paliadin-shim` and **install** at `/home/m/.local/bin/paliadin-shim`, `chmod 755`.
+3. **Write authorized_keys** with public key + `command=`/`from="100.99.98.201"`/no-pty/no-port-forwarding/no-agent-forwarding/no-X11-forwarding/no-user-rc restrictions (§5.2).
+4. **Add port 22022 socket drop-in** at `/etc/systemd/system/ssh.socket.d/paliad.conf`, `systemctl daemon-reload && systemctl restart ssh.socket`. Both `:22` (Tailscale SSH for m) and `:22022` (real OpenSSH for paliad) listening (§4.5).
+5. **Capture mRiver:22022 host key**: `ssh-keyscan -p 22022 -t ed25519 100.99.98.203 > ~/.paliad-staging/known_hosts` from mLake. Fingerprint `SHA256:HPoUzy60Cb8yLERIBQcB2mHihNST3NaTODx5Ypd1XpA`.
+6. **Smoke-test from mLake** (without paliad container, just raw ssh from mLake's host shell):
+```
+ssh -F /dev/null -i /tmp/paliad-prod-key -o UserKnownHostsFile=/tmp/paliad-known_hosts \
+    -o StrictHostKeyChecking=yes -o IdentitiesOnly=yes -o BatchMode=yes \
+    -p 22022 m@100.99.98.203 health
+ok
+ssh … run-turn $(uuidgen) "$(printf 'Sag …' | base64 -w0)"
+→ "test ok" (3.4 s round-trip including a real Claude response)
+```
+7. ✅ **from= rejection verified**: the same key from mRiver itself (`100.99.98.203`) → `Permission denied (publickey,password)` as expected.
+**Phase A.5 (PENDING m's hands):** validate `network_mode: host` + traefik routing on prod paliad.de.
+- Branch the live `docker-compose.yml` on a temp branch.
+- Add `network_mode: host` to the `web` service; remove `expose: ["8080"]`.
+- Push to trigger a Dokploy redeploy.
+- `curl --connect-timeout 5 -sSI https://paliad.de/` — expect 200 (or login redirect), NOT 502.
+- If 502: revert the temp branch (`git revert HEAD && git push`); revisit decision 1 in a follow-up issue.
+- If 200: keep the host-mode change; ready for Phase B.
+This is **m's call to execute** — it briefly touches prod paliad.de. Inventor/coder should not flip prod compose without explicit go-ahead. Rollback is one revert + redeploy.
+**Phase A.6 (after A.5 passes):** smoke-test SSH from inside the paliad-prod container itself (the real container, not just the mLake host shell):
+```
+docker exec -it <paliad-container> sh
+apk add --no-cache openssh-client # one-shot, before Dockerfile change
+ssh -F /dev/null -i /tmp/paliad-prod-key -o UserKnownHostsFile=/tmp/paliad-known_hosts \
+    -o StrictHostKeyChecking=yes -o IdentitiesOnly=yes -o BatchMode=yes \
+    -p 22022 m@100.99.98.203 health
+# expected: "ok"
+```
+This proves the container's host-mode networking actually delivers a tailnet connect.
+**Phase A.7:** wire env vars manually via Dokploy UI for one deploy; confirm `/paliadin` chat works against mRiver from paliad.de.
+If A.5 fails: the design rolls back to a sidecar in a new issue (decision 1 follow-up). The SSH path (A.0) and traefik path (A.5) are independent — A.0 is already proven; only A.5+ is at risk.
 ### Phase B — bake into Dockerfile + Dokploy secrets
@@ -540,8 +599,9 @@ If Phase A passes: codify into Phase B. If it fails on (5), the design rolls bac
 | Risk | Mitigation |
 |---|---|
-| Stolen private key → arbitrary SSH on mRiver | `command=` shim restriction + `from="100.99.98.201"` + ed25519 key + private key only in Dokploy secret store (encrypted at rest) |
+| Stolen private key → arbitrary SSH on mRiver | `command=` shim restriction + `from="100.99.98.201"` + ed25519 key + private key only in Dokploy secret store (encrypted at rest); paliad route uses port 22022 where real OpenSSH enforces all of the above |
-| Stolen private key → tailnet-wide SSH from non-mLake host | `from="100.99.98.201"` clause |
+| Stolen private key → tailnet-wide SSH from non-mLake host | `from="100.99.98.201"` clause (verified: rejected from mRiver itself in Phase A.0) |
+| Tailscale SSH on `:22` bypasses `authorized_keys` | The paliad-prod key's `command=` restriction is not enforced on `:22`. Mitigation: paliad always dials `:22022`, which is real OpenSSH. m's interactive `tailscale ssh m@mriver` on `:22` continues to be governed by Tailscale ACLs, separate from paliad's identity. |
 | Container compromise → key extraction | Key written to tmpfile chmod 600, only root inside container can read; alpine container has no shell-on-error trampolines |
 | Host-key MITM during connect | Pinned `known_hosts`; `StrictHostKeyChecking=yes` |
 | Shim argument injection (e.g. via `run-turn $(rm -rf /)`) | Shim parses positional args from `$SSH_ORIGINAL_COMMAND` via `read -r -a`; never passes args to a subshell `eval`; turn_id validated by UUID regex; message body always base64-decoded into a single shell variable, never re-evaluated |
@@ -567,15 +627,16 @@ These were called out in the issue but the design intentionally does not solve t
 When this design is approved and the coder shift starts, the work splits roughly into:
 - `Dockerfile` — `+openssh-client`.
-- `docker-compose.yml` — `network_mode: host`, four new env entries.
+- `docker-compose.yml` — `network_mode: host`, five new env entries (`PALIADIN_REMOTE_HOST`, `PALIADIN_REMOTE_PORT`, `PALIADIN_REMOTE_USER`, `PALIADIN_SSH_PRIVATE_KEY`, `PALIADIN_KNOWN_HOSTS`).
 - `internal/services/paliadin.go` — extract `Paliadin` interface; rename existing to `LocalPaliadinService`; pull DB-only methods (`ListRecentTurns`, `Stats`, `IsOwner`) into a shared embedded `paliadinDB` so both implementations get them for free.
-- `internal/services/paliadin_remote.go` — new file: `RemotePaliadinService`, `RemotePaliadinConfig`, `callShim`, `healthGate`, `ensureBootstrapped`, `classifySSHError`, `ErrMRiverUnreachable`.
+- `internal/services/paliadin_remote.go` — new file: `RemotePaliadinService`, `RemotePaliadinConfig` (with `SSHPort`), `callShim`, `healthGate`, `ensureBootstrapped`, `classifySSHError`, `ErrMRiverUnreachable`.
 - `internal/services/paliadin_remote_test.go` — unit tests with a mocked `callShim`.
-- `cmd/server/main.go` — env-var-based wiring (§6.2), `loadPaliadinSSHKey`, `loadPaliadinKnownHosts`.
+- `cmd/server/main.go` — env-var-based wiring (§6.2), `loadPaliadinSSHKey`, `loadPaliadinKnownHosts`, `PALIADIN_REMOTE_PORT` parse with default `22022`.
 - `frontend/src/client/paliadin.ts` — one `case` in `friendlyErrorMessage` for `mriver_unreachable`.
 - `frontend/src/i18n.ts` — two new keys (`paliadin.error.mriver_unreachable.de` / `.en`).
-- `scripts/paliadin-shim` — server-side script (§5.4); copied to mRiver during Phase A, not part of any container.
+- `scripts/paliadin-shim` — server-side script (§5.4); already shipped + installed on mRiver during Phase A.0, not part of any container. Repo location chosen so the security-relevant script is version-controlled.
 - `docs/project-status.md` — note Phase 0.5 (PoC) → Phase 0.6 (Tailscale-SSH prod route).
+- **mRiver host setup (one-time, already done in Phase A.0):** `/etc/systemd/system/ssh.socket.d/paliad.conf` (port 22022 listen drop-in); `~/.ssh/authorized_keys` (paliad-prod public key with restrictions); `/home/m/.local/bin/paliadin-shim` (executable). These are NOT in the repo because they live on m's laptop; `docs/project-status.md` should reference them.
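The `cmd/server/main.go` item mentions parsing `PALIADIN_REMOTE_PORT` with a default of `22022`. The §6.2 snippet discards the `strconv.Atoi` error, so a malformed value would silently turn into port 0. One possible stricter variant is sketched below; `parseRemotePort` and `defaultRemotePort` are hypothetical names, and failing at startup is an assumption about the desired behaviour rather than something the design specifies.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultRemotePort mirrors the default named in the design (22022).
const defaultRemotePort = 22022

// parseRemotePort turns the raw env value into a TCP port.
// An empty value selects the default; anything that is not a number
// in 1..65535 is reported as an error instead of being coerced.
func parseRemotePort(raw string) (int, error) {
	if raw == "" {
		return defaultRemotePort, nil
	}
	port, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("PALIADIN_REMOTE_PORT %q is not a number: %w", raw, err)
	}
	if port < 1 || port > 65535 {
		return 0, fmt.Errorf("PALIADIN_REMOTE_PORT %d is outside 1-65535", port)
	}
	return port, nil
}

func main() {
	port, err := parseRemotePort(os.Getenv("PALIADIN_REMOTE_PORT"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "paliadin:", err)
		os.Exit(1)
	}
	fmt.Println("paliadin remote port:", port)
}
```

Rejecting a bad value at startup keeps a typo in the Dokploy UI from surfacing later as an opaque SSH connection failure.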
 No DB migrations needed — `paliad.paliadin_turns` schema already covers everything (`error_code` field already accepts free-form codes including `mriver_unreachable`).
@@ -583,10 +644,34 @@ No DB migrations needed — `paliad.paliadin_turns` schema already covers everyt
 ## 11. Open questions for review
-- **Q (m):** Phase A test step 5 expects traefik to keep working under host-mode. If a quick search confirms Dokploy's traefik can't route to host-network services without manual `loadbalancer.server.url` config, we should know before Phase A. Worth a 5-minute Dokploy doc check before merging Phase B.
+- **Q (m), still open:** Phase A.5 (traefik+host-mode on prod paliad.de) is not yet executed. m drives this; rollback is one revert. Dokploy doc check before flipping is recommended but not blocking.
-- **Q (m):** Should the `paliadin-shim` script live in this repo (`scripts/paliadin-shim`) and be version-pinned, or is it a one-off that lives only on mRiver? Repo location lets us audit changes; mRiver-only keeps deploy footprint smaller.
+- **Q (m), resolved 2026-05-07 23:50:** shim location → repo (`scripts/paliadin-shim`, committed in `0248411`). Version-controlled and auditable.
-- **Q (m):** `ANTHROPIC_API_KEY` env var reservation in compose comments — keep the comment line for production-v1, or strip it now that this design supersedes that path for the foreseeable future?
+- **Q (m), still open:** `ANTHROPIC_API_KEY` env var reservation in compose comments — keep for production-v1, or strip now? Not blocking either phase; defer.
 ---
-**Inventor stops here.** No code shipped this shift. Awaiting m's go/no-go on the design before the coder shift starts.
+## 12. Phase A.0 completion summary (2026-05-07 23:50)
+**Coder shift (noether) executed Phase A.0 in full:**
+1. ✅ shim committed at `scripts/paliadin-shim` (commit `0248411`, repo-version-controlled)
+2. ✅ shim installed at `/home/m/.local/bin/paliadin-shim` on mRiver
+3. ✅ ed25519 keypair `paliad-prod` generated, public-key fingerprint `SHA256:5uV8v872F/IhJycjjq0crFue/emAYfw71N9bxTvkl9c`, private key staged at `~/.paliad-staging/paliad-prod-key` on mRiver (mode 600)
+4. ✅ `~/.ssh/authorized_keys` written with `command=`/`from=`/no-pty/no-port-forwarding/no-agent-forwarding/no-X11-forwarding/no-user-rc restrictions
+5. ✅ `ssh.socket` drop-in installed at `/etc/systemd/system/ssh.socket.d/paliad.conf`; both `:22` and `:22022` listening
+6. ✅ host key for `:22022` captured at `~/.paliad-staging/known_hosts` (fingerprint `SHA256:HPoUzy60Cb8yLERIBQcB2mHihNST3NaTODx5Ypd1XpA`)
+7. ✅ end-to-end SSH+shim+Claude run-turn validated from mLake → mRiver:22022 (3.4 s round-trip)
+8. ✅ `from="100.99.98.201"` rejection verified
+**Three secrets ready for Dokploy registration** (m to copy from `~/.paliad-staging/` on mRiver):
+- `PALIADIN_SSH_PRIVATE_KEY` ← `cat ~/.paliad-staging/paliad-prod-key`
+- `PALIADIN_KNOWN_HOSTS` ← `cat ~/.paliad-staging/known_hosts`
+- `PALIADIN_REMOTE_HOST=100.99.98.203`, `PALIADIN_REMOTE_PORT=22022`, `PALIADIN_REMOTE_USER=m`
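Because `known_hosts` entries for non-22 ports are keyed as `[host]:port` (§5.3), a `PALIADIN_KNOWN_HOSTS` value captured without `-p 22022` would not match under `StrictHostKeyChecking=yes`. The sketch below shows one way to sanity-check the staged line before registering it. `knownHostsName` and `lineMatches` are hypothetical helpers; they handle only plain, unhashed entries of the kind `ssh-keyscan` prints by default, and the key material in the example is a placeholder.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// knownHostsName returns the host field OpenSSH expects in known_hosts:
// the bare host for port 22, "[host]:port" for any other port.
func knownHostsName(host string, port int) string {
	if port == 22 {
		return host
	}
	return "[" + host + "]:" + strconv.Itoa(port)
}

// lineMatches reports whether an unhashed known_hosts line
// ("names keytype base64key") lists the given host and port.
// Hashed names and marker lines are not handled in this sketch.
func lineMatches(line, host string, port int) bool {
	fields := strings.Fields(line)
	if len(fields) < 3 {
		return false
	}
	want := knownHostsName(host, port)
	for _, name := range strings.Split(fields[0], ",") {
		if name == want {
			return true
		}
	}
	return false
}

func main() {
	// Placeholder entry; a real one comes from ssh-keyscan -p 22022.
	line := "[100.99.98.203]:22022 ssh-ed25519 PLACEHOLDERKEY"
	fmt.Println(knownHostsName("100.99.98.203", 22022))
	fmt.Println(lineMatches(line, "100.99.98.203", 22022))
}
```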
+**Phase A.5 (traefik+host-mode test) and Phase A.6/A.7 (in-container SSH smoke + paliad/paliadin end-to-end) await m's hands** — they touch prod paliad.de.
+**Phase B (Dockerfile + Go interface split + Dokploy secrets) is unblocked from a code perspective** — but should not merge until Phase A.5 confirms the host-mode networking trade-off is acceptable.
+---
+**Inventor design + coder Phase A.0 complete.** Awaiting m for Phase A.5 traefik validation before the coder writes the Go interface split.