# Contably OS → oxi Cutover Spec
Date: 2026-04-27. Status: Draft. Owner: Pierre Schurmann.

This draft proposes retiring the legacy PSOS / contably-os engine in favor of oxi as the production autonomous coding engine for the Contably codebase.
## 0. TL;DR
The legacy engine ("Contably OS", "PSOS v3", the `psos-core` package, deployed at `/opt/contably-os/` on the Contabo VPS) ships software changes to the Contably product autonomously: roadmap → planner → 20 parallel `claude -p` workers in worktrees → critic → merge. It works, but its design is single-tenant (Contably-only), the codebase carries hardcoded paths and project strings, and the current production deployment is sitting on a stack of CRITICAL/HIGH bugs documented in `docs/psos-engine-root-cause-2026-04-23.md` (liveness check broken, killswitch bypass in three modules, scope gate single-repo, `auto_merge` state-machine confusion).
oxi is a clean reimplementation of the same engine, with three structural improvements:
- Adapter pattern — every project-specific value (repo path, budget, dispatch host, branch prefixes, GitHub repo, critic policy) lives in a ~70-line adapter class, not in core.
- Atomic state transitions by construction — `last_progress_at` is stamped in the same SQL transaction as every status update; the legacy engine's CRITICAL liveness bug is structurally unrepresentable in oxi.
- 940+ tests against fake-claude + fake-GitHub — CI never touches real services; refactors do not require a production engine to validate.
oxi already has every module the legacy engine ships (heartbeat, ship_recovery, pr_watcher, auto_merge, critic, budget, deadman, oauth_watch, pr_overlap, branch_janitor, deep_fix, handoff, ingest_roadmap, seed_from_roadmap, dashboard, kill, tail_dispatch, cto_verdict, notification). The cutover is not a rebuild from scratch — it is "wire a Contably adapter, switch the systemd units to call `oxi v3 tick` instead of `psos v3 tick`, freeze the legacy engine."
Estimated effort: 5–7 days of focused work (including 1.5 days of Phase 0.5 prerequisite PRs against oxi-core) plus a 14-day parallel-run observation window before the legacy engine is decommissioned.
## 1. The legacy engine — what it does today

### 1.1 Components
| Service (systemd) | Schedule | Role |
|---|---|---|
| `contably-os-v3.service`/`.timer` | every 30s | plan-one tick: dispatch one ready task |
| `contably-os-v3-dashboard.service`/`.timer` | every 30s | render DASHBOARD.md |
| `contably-os-v3-dashboard-server.service` | persistent | serve DASHBOARD.html on :8765 over Tailscale |
| `contably-os-v3-deadman.service`/`.timer` | every 30 min | detect stale sessions |
| `contably-os-v3-pr-watcher.service`/`.timer` | every 10 min | auto-mark merged PRs as done |
| `contably-os-v3-ship-recovery.service`/`.timer` | every 5 min | recover abandoned ship-step sessions |
| `contably-os-v3-seed.service`/`.timer` | every 30 min | replenish task queue from roadmap fronts |
Source-of-truth code is in `Contably/contably-os` (the source of truth flipped on 2026-04-25 — the repo was unarchived, and the previous canonical `escotilha/psos` was retired). Deployed as the `contably-os` Python package, editable-installed from `/opt/psos/contably-os-repo/` into the venv at `/opt/psos/venv/`. Currently deployed: `contably-os 0.4.9` (verified 2026-04-27 during the Phase 0 drift audit; tree clean, in sync with `origin/main` at `39abd81`). The previously-attempted `psos-core 0.5.0` rewrite is broken — it expects a `psos_*` table prefix but the live DB uses `contably_os_*`. Pin to `contably-os 0.4.x`.
### 1.2 Per-task lifecycle
- Intake. A row lands in the `psos_task` table with a goal and acceptance criteria. Either a human writes it or `seed_from_roadmap` creates one from a roadmap front.
- Planner (Claude Opus 4.7) reads the task, lists files-to-touch, lists tests-to-pass, identifies dependencies, scores risk. The plan is the contract.
- Dispatch. VPS `dispatch.py` picks a host from `dispatch_pool`, SSHs into the Mac Mini, fires `claude -p /orchestrate` in a fresh git worktree on a session-tagged branch (`feat/sa-<task>-auto`).
- Worker loop. Read plan → write code → run tests → commit → open PR.
- Critic (Claude Opus 4.7) reads the PR diff, checks plan-conformance, security, scope creep. Approve or reject.
- Auto-merge if the critic approved AND the PR is mergeable AND no sensitive paths AND the budget is under cap.
- Ship recovery picks up workers that wrote code but exited before committing.
- Consolidator (Claude Haiku 4.5) writes lessons from the cycle into the memory store.
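As an illustration only (the status names and transition table here are assumptions for the sketch, not the engine's actual schema), the lifecycle reads as a small state machine:

```python
from enum import Enum

class Status(Enum):
    PENDING = "pending"        # intake row created
    PLANNED = "planned"        # planner wrote the contract
    DISPATCHED = "dispatched"  # worker running in a worktree
    PR_OPEN = "pr_open"        # PR opened, awaiting critic/merge
    MERGED = "merged"
    ABANDONED = "abandoned"

# Hypothetical transition table mirroring the bullets above; the real
# engines encode this in their tick logic, not a literal dict.
ALLOWED = {
    Status.PENDING: {Status.PLANNED},
    Status.PLANNED: {Status.DISPATCHED},
    Status.DISPATCHED: {Status.PR_OPEN, Status.ABANDONED},
    Status.PR_OPEN: {Status.MERGED, Status.ABANDONED},
}

def advance(current: Status, nxt: Status) -> Status:
    # Reject any transition the table does not list, e.g. merged -> pending.
    if nxt not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {nxt}")
    return nxt
```

The point of writing it this way is that every status write is forced through one choke point, which is where oxi also stamps liveness.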
Roles per the non-technical overview:
| Component | Model | Cadence |
|---|---|---|
| Planner | Opus 4.7 | every task |
| Worker | Opus 4.7 | up to 20 in parallel |
| Critic | Opus 4.7 | every PR |
| Consolidator | Haiku 4.5 | overnight / between cycles |
| Ship recovery | Opus 4.7 | triggered by alerts |
### 1.3 Topology
- VPS (Contabo, `100.77.51.51`): runs all systemd services, holds the SQLite ledger (`/opt/contably-os/db/contably-os.db`), the killswitch file (`/opt/contably-os/KILLSWITCH`), and `.env` (GH_TOKEN, MINI_SSH_HOST, CLAUDE_API_KEY).
- Mac Mini (Tailscale `100.66.244.112`): runs the dispatched `claude -p` worker sessions in git worktrees under `/Volumes/AI/Code/<project>-<session-tag>`. The default main clone for Contably is `/Volumes/AI/Code/contably`.
- Dashboard: served on `http://100.77.51.51:8765` (Tailscale only). Auto-refresh 60s, three tabs (in-flight / completed / activity).
- GitHub: `Contably/contably` (product code), `escotilha/psos` (engine source); branches follow `feat/<session-tag>-<topic>` per `concurrent-sessions.md`.
### 1.4 Known critical/high bugs (live as of 2026-04-23)

From `psos-engine-root-cause-2026-04-23.md`:
- CRITICAL — broken liveness. `last_progress_at` is never written. `tracker.bump_progress()` exists but has zero callers. The orphan-sweep grace period (2h) measures from `created_at` instead, so every dispatched task older than 2h gets reaped regardless of whether the worker is alive. This cascades into a 4.5-minute dispatch→reap→re-dispatch thrash when combined with `ship_recovery` re-dispatching on stable task keys.
- CRITICAL — multi-repo unsafe. `worktree_provision.DEFAULT_MAIN_CLONE = "/Volumes/AI/Code/contably"` is hardcoded. Any task routed to a different repo would land in the wrong worktree.
- HIGH — killswitch bypass. `ship_recovery.py`, `auto_merge.py`, and `seed_from_roadmap.py` have zero references to `kill.is_set()`. The killswitch only halts `heartbeat.plan_one`. With the `KILLSWITCH` file present, the engine still dispatches (via ship_recovery), still rejects PRs (via auto_merge), still seeds (via seed_from_roadmap).
- HIGH — scope gate is single-repo. `_CONTABLY_SCOPE_ALLOW` encodes only the Contably monorepo layout. PSOS self-maintenance tasks accumulate `contably_scope_violation` events every tick forever.
- HIGH — `auto_merge` state-machine confusion. PR `state=MERGED` is treated as a rejection. This inflates the "rejected" counter, masks real rejections, and may feed back into re-poll logic.
- HIGH — host routing inconsistency. `plan_one` provisions the worktree on `dispatch_hostname` but fires SSH at `hostname` (the original fallback param). Currently identical; will diverge once the `air` host activates.
- HIGH — observability noise. `killswitch_check` emits on every tick instead of on state transitions. The ledger fills with duplicates.
There are also documented patterns for dirty-tree contamination (PR #647 swept 4,607 LOC of stray host files) and engine stub-merge staged-work loss (15 of 22 worktrees on 2026-04-27 had work staged but not committed because the wrapper merged an empty stub branch on top). See `~/.claude-setup/memory/auto/semantic/pattern_dirty_tree_contamination_in_workers.md` and `pattern_engine_stub_merge_staged_loss.md`.
These are the conditions a cutover should make structurally impossible, not patch around.
## 2. oxi — what it already is

oxi is at `/Volumes/AI/Code/oxi`, currently `0.1.0b1` on PyPI (`pip install --pre oxi-core`). It is the legacy engine, redesigned with three constraints:
- No project-specific strings in core. A CI-enforced forbidden-string list at `scripts/lint-for-leaks.sh` blocks any reference to "contably", project-specific paths, env var names, etc. from landing in `oxi-core/`.
- Adapter Protocol owns project state. `oxi_core.adapter` declares a 10-method `AdapterProtocol`; every fork writes its own ~70-line class. The `oxi init` wizard scaffolds it from 8 prompts.
- Fake the world in tests. 940+ tests use `tests/fake_claude.py` (impersonates `claude -p` stream-json) and `tests/fakes.py::FakeGitHubClient` (impersonates `gh`). CI never touches Anthropic or GitHub.
### 2.1 Module map
```
oxi-core/src/oxi_core/
├── adapter.py             # 10-method Protocol every fork implements
├── cli.py                 # oxi, oxi init, oxi status, oxi v3 tick ...
├── db.py                  # SQLite schema + append-only migrations
├── planner.py             # roadmap → tasks
├── prompts.py             # templated planner/dispatch/critic prompts
├── wizard.py              # oxi init
└── v3/
    ├── dispatch.py            # state-machine driver
    ├── dispatch_invoke.py     # claude -p subprocess wrapper
    ├── dispatch_pool.py       # host selection
    ├── heartbeat.py           # reaper for stalled tasks
    ├── ship_recovery.py       # rescue uncommitted work
    ├── pr_watcher.py          # reconcile DB with GitHub PR state
    ├── pr_overlap.py          # file-level conflict gate
    ├── auto_merge.py          # critic-gated merge
    ├── critic.py              # CriticBackend + ClaudeCriticBackend
    ├── judge.py               # secondary verdict / cto_verdict pipeline
    ├── budget.py              # daily-cap enforcement
    ├── deadman.py             # silence detector
    ├── oauth_watch.py         # credential-expiry monitor
    ├── cto_verdict.py         # structured /cto report parser
    ├── notification.py        # pluggable notification backends
    ├── brief.py               # daily recap markdown
    ├── dashboard.py           # localhost HTTP dashboard
    ├── engine_state.py        # killswitch + plan_tier
    ├── kill.py                # killswitch file handling
    ├── worktree_provision.py  # git worktree lifecycle
    ├── github_client.py       # GitHubClient protocol + gh CLI impl
    ├── tail_dispatch.py       # live-tail stream-json logs
    ├── ingest_roadmap.py      # roadmap → fronts table
    ├── seed_from_roadmap.py   # auto-replenish queue
    ├── branch_janitor.py      # delete merged remote branches
    ├── ci_issue_filer.py      # surface stuck-red autonomous PRs
    ├── handoff.py             # context-compaction recovery
    ├── deep_fix.py            # multi-attempt fix loop
    ├── auto_recover.py        # session recovery
    ├── auto_observe.py        # observation window machinery
    └── engine_health.py       # liveness/health metrics
```
Every module the legacy engine relies on is present and tested.
### 2.2 What's structurally different from the legacy engine
| Concern | Legacy | oxi |
|---|---|---|
| `last_progress_at` | optional, defined but never written | stamped in the same UPDATE as every status transition |
| Reaper grace clock | falls through to `created_at` when `last_progress_at` is empty | reads `last_progress_at` only; never trusts `created_at` |
| Killswitch | guards `plan_one` only; bypassed by 3 modules | unified `EngineState` checked at every entrypoint |
| Repo / worktree path | `DEFAULT_MAIN_CLONE` hardcoded | comes from `adapter.paths()` per task |
| Scope gate | single hardcoded allowlist | `adapter.scope_policy()` returns per-repo allowlists |
| Project strings in core | many (paths, env vars, table names, URLs) | zero, CI-enforced |
| Tests | live against real services | fake-claude + fake-GitHub, 940+ tests |
| Critic backend | hardcoded to Claude | `CriticBackend` Protocol; pluggable |
| GitHub client | hardcoded `gh` shell | `GitHubClient` Protocol; pluggable |
| Notification | hardcoded Slack | `NotificationBackend` Protocol; pluggable |
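The `last_progress_at` row is the load-bearing one. The invariant can be sketched in plain sqlite3 (simplified one-table schema for illustration, not oxi's actual DDL): because the liveness stamp and the status live in the same UPDATE, they cannot diverge.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE task (
        id INTEGER PRIMARY KEY,
        status TEXT NOT NULL,
        last_progress_at TEXT NOT NULL DEFAULT (datetime('now'))
    )
""")
conn.execute("INSERT INTO task (status) VALUES ('pending')")

def set_status(conn: sqlite3.Connection, task_id: int, status: str) -> None:
    # One UPDATE, one transaction: the status and the liveness stamp
    # move together, so "status changed but stamp never written"
    # (the legacy CRITICAL bug) is unrepresentable.
    with conn:
        conn.execute(
            "UPDATE task SET status = ?, last_progress_at = datetime('now') "
            "WHERE id = ?",
            (status, task_id),
        )

set_status(conn, 1, "dispatched")
row = conn.execute(
    "SELECT status, last_progress_at FROM task WHERE id = 1"
).fetchone()
```

With this shape, a reaper that reads only `last_progress_at` always sees a real timestamp for every dispatched row.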
## 3. The Contably adapter

The cutover's central deliverable is a single `oxi-adapter-contably` package that implements the 10-method Adapter Protocol with Contably's specific values. Skeleton:
```python
# adapters/contably/src/oxi_adapter_contably/adapter.py
from pathlib import Path

from oxi_core.adapter import (
    Adapter, BudgetCaps, DispatchPolicy, DispatchHost,
    Naming, Paths, ScopePolicy, PromoteRecipe,
)


class ContablyAdapter(Adapter):
    def github_repo(self) -> str:
        return "Contably/contably"

    def branch_prefixes(self) -> tuple[str, ...]:
        return ("feat/", "fix/", "chore/", "docs/", "test/", "refactor/")

    def paths(self) -> Paths:
        return Paths(
            main_clone=Path("/Volumes/AI/Code/contably"),
            worktrees_root=Path("/Volumes/AI/Code"),
            db_path=Path("/opt/oxi/db/contably.db"),
            killswitch_path=Path("/opt/oxi/KILLSWITCH"),
            log_dir=Path("/opt/oxi/logs"),
            roadmap_path=Path("docs/roadmap.md"),
        )

    def budget(self) -> BudgetCaps:
        return BudgetCaps(
            daily_soft_warn_usd=500,
            daily_hard_cap_usd=1000,
            per_task_cap_usd=8.00,
        )

    def dispatch(self) -> DispatchPolicy:
        # Single-host: VPS engine SSHs into the Mac Mini for every worker.
        # `air` deferred to post-cutover (see §8.4).
        return DispatchPolicy(
            max_concurrent=20,
            hosts=[
                DispatchHost(name="mini", ssh_alias="mini", concurrency=20),
            ],
            session_tag_pool=("sa", "sb", "sc", "sd", "se", "sf"),
        )

    def scope_policy(self) -> ScopePolicy:
        return ScopePolicy(
            allow_prefixes=(
                "apps/", "packages/", ".github/", "docs/",
                "tasks/", "alembic/", "scripts/", "infra/", "manual/",
            ),
            block_prefixes=(".env", "secrets/"),
            sensitive_for_critic=("alembic/versions/", "infra/contably-os/"),
        )

    def naming(self) -> Naming:
        return Naming(
            engine_brand="oxi",
            project_brand="Contably",
            dashboard_title="Contably Engine",
        )

    def auto_merge(self) -> DispatchPolicy:
        # opt in explicitly per Contably's risk profile
        return DispatchPolicy(auto_merge=True, sensitive_block=True)

    def promote_recipe(self) -> PromoteRecipe:
        return PromoteRecipe(
            staging_workflow="staging-deploy.yml",
            production_workflow="production-promote.yml",
            health_check_url="https://api.contably.ai/health",
            window=("03:00", "05:00"),
            timezone="America/Sao_Paulo",
        )

    def notification(self):
        from .notification import SlackNotification
        return SlackNotification(channel="#engine")
```
Where the legacy engine has the `_CONTABLY_SCOPE_ALLOW` literal in `dispatch.py`, the adapter returns it. Where the legacy engine has `DEFAULT_MAIN_CLONE = "/Volumes/AI/Code/contably"` hardcoded in `worktree_provision.py`, the adapter returns it. The cutover surface is one file plus its tests.
The `oxi init` wizard already produces ~80% of this from 8 prompts; the remaining work is the project-specific scope policy, dispatch hosts, and promote recipe.
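To see how little logic the adapter carries, the scope values it returns could feed a gate as small as a prefix check. This is a hypothetical helper (per the Phase 1 notes, oxi-core does not yet run a scope gate); the allow/block tuples mirror the skeleton above.

```python
def path_allowed(path: str,
                 allow_prefixes: tuple[str, ...],
                 block_prefixes: tuple[str, ...]) -> bool:
    # Blocks win over allows; anything not explicitly allowed is out of scope.
    if any(path.startswith(p) for p in block_prefixes):
        return False
    return any(path.startswith(p) for p in allow_prefixes)

# Values copied from the ContablyAdapter skeleton above.
ALLOW = ("apps/", "packages/", ".github/", "docs/",
         "tasks/", "alembic/", "scripts/", "infra/", "manual/")
BLOCK = (".env", "secrets/")
```

Prefix semantics matter here: `.env` as a block prefix also catches `.env.production`, and a file outside every allow prefix is rejected by default.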
## 4. Cutover plan — 5 phases

### Phase 0: Pre-flight (half day)

Step-by-step commands for these four items are in `docs/runbooks/cutover-phase-0.md` (PR #205). High level:
- [x] Audit `Contably/contably-os@main` for any drift since `0.4.9` (the deployed version). Done 2026-04-27 — repo clean, in sync with `origin/main` at `39abd81`, no drift.
- [x] Snapshot `/opt/contably-os/db/contably-os.db` as `legacy-ledger-<YYYY-MM-DD>.db`. Keep the snapshot for at least 90 days. Done 2026-04-27 → `~/cutover-snapshots/legacy-ledger-2026-04-27.db` (15 MB, 38,080 ledger events, task counts: 253 merged / 27 abandoned / 1 running).
- [x] Set the legacy `KILLSWITCH` file. Already set 2026-04-27 02:38 UTC ("halted by contably-os v3 halt"). Engine confirmed quiescent — only `dashboard_rendered` events firing.
- [x] Verify `oxi-core` (currently `0.1.0b2`) plus the Contably adapter (`oxi-adapter-contably`, PR #203 branch) install cleanly on the VPS. Done 2026-04-27 — `oxi status` reports correct identity (`contably`), repo (`Contably/contably`), plan tier (`max_20x`), budget caps ($500 / $1000).
### Phase 0.5: Prerequisite PRs against oxi-core (1.5 days)

These land in `oxi-core` before the Contably adapter is built, because they change the engine surface the adapter consumes:
- [x] `MemoryBackend` Protocol (~1 day, see §8.1). New `oxi_core.memory` module, default file-backed implementation reading `~/.claude-setup/memory/auto/`, `adapter.memory()` method added to the Protocol with a default of `NullMemoryBackend`. Planner reads via `adapter.memory().read_relevant(task)` before each plan. Tests against a fake backend. Done 2026-04-26 — PR #202 (`def80d2`); ships `NullMemoryBackend` + `FileMemoryBackend` + a `get_memory_backend()` accessor with adapter fallback.
- [x] Ship-recovery dirty-tree filter (~0.5 day, see §8.3). Add `ship_recovery_dirty_tree_policy` to `DispatchPolicy`; default `"files_touched_only"`. Tested with a stray-host-file fixture. Done 2026-04-26 — PR #201 (`14be113`); three policies (permissive / strict / files_touched_only), default `files_touched_only`, emits `ship_recovery_skipped_dirty_tree` events. Extended 2026-04-27 — PR #210 (`d982cb3`) added Case B (committed-but-unpushed branch push).
Both ship as regular PRs through oxi's own dogfood loop (oxi shipping its own code change, with critic). This confirms the engine is healthy enough to extend itself before it is asked to take over Contably.
### Phase 1: Build the Contably adapter (1 day)

Lives at `adapters/contably/` (in-tree, not a separate repo) — `oxi-adapter-contably 0.1.0b1`, depends on `oxi-core>=0.1.0b3`.
- [x] Adapter scaffolded — PR #203 (`1a25614`). In-tree at `adapters/contably/` rather than a sibling `../oxi-adapter-contably/` repo, registered through the `oxi.adapters` setuptools entry point.
- [x] All 8 wizard-equivalent values filled in: `github_repo`, `roadmap_location`, `branch_prefixes`, `budget` (BudgetCaps $500/$1000/$8/$2), `dispatch_hosts` (single `mini`, max_concurrent 2 — PR #208), `paths` (`.oxi/` under `repo_root`), `plan_tier=max_20x`, `policy` (`auto_merge=True`, `ship_recovery_dirty_tree_policy="files_touched_only"`).
- [x] `promote_recipe()` wired — staging/prod workflow names, 06:00 UTC cron (03:00 BRT), 2h release lock. Dormant per §8.2 until post-cutover.
- [x] Tests at `adapters/contably/tests/test_contably_adapter.py` — 22 tests, all pass under `PYTHONPATH=oxi-core/src:adapters/contably/src python3 -m pytest`. Cover Protocol conformance, budget caps, dispatch single-host, branch prefixes, promote recipe, plan tier, auto_merge, plus a forbidden-string lint (`psos`, `nuvini`, `sourcerank`, Tailscale IPs).
- [x] Adapter loads from a fresh interpreter: `python3 -c "from oxi_adapter_contably import ContablyAdapter; ContablyAdapter().policy()"` returns `DispatchPolicy(... ship_recovery_dirty_tree_policy='files_touched_only')`. The `oxi status` smoke test from Phase 0 confirms the VPS-side load (identity `contably`, repo `Contably/contably`).
Gaps surfaced by this audit (separate from the cutover; capture as follow-up tickets, not Phase 1 blockers):
- The §3 spec example shows `scope_policy()` and `notification()` as adapter methods, but the actual `oxi_core.adapter.AdapterProtocol` does not define either — it has 10 methods (`naming`, `paths`, `budget`, `github_repo`, `roadmap_location`, `branch_prefixes`, `dispatch_hosts`, `promote_recipe`, `plan_tier`, `policy`) plus the optional `memory()` from §8.1. The Contably adapter therefore omits both. Implications:
- No `scope_policy()` means no per-adapter scope gate. The legacy engine's `_CONTABLY_SCOPE_ALLOW` (with `infra/` + `manual/` from PR #32) is not yet expressible. `policy()` returns a `DispatchPolicy` without scope fields. This regresses §1.4 bug #4 ("scope gate is single-repo") only nominally — oxi-core does not run a scope gate at all today, so there's no hardcoded scope leaking into core, but there's also no enforcement. Action: scope this as a separate `oxi-core` PR before the Phase 4 unkill — `Adapter.scope_policy()` returning a `ScopePolicy(allow_prefixes, block_prefixes, sensitive_for_critic)`, default `ScopePolicy()` (allow-all) for back-compat, consumed by `dispatch.py` and `auto_merge.py`.
- No adapter `notification()` method. `oxi_core.v3.notification.NotificationBackend` exists with `LoggingBackend` + a stderr backend; modules (e.g. `deadman`) take a `notifier:` arg directly. A Slack notifier and an `Adapter.notification()` accessor are still to-do. Today the engine logs operator events to stdout/stderr — workable for the parallel window, not for a final cutover.
- ~~No `memory()` method on the Contably adapter.~~ Resolved 2026-04-27 — the adapter now returns `FileMemoryBackend(root=~/.claude-setup/memory/auto)`. It reads from the shared `auto/` tree (where `mem-search` already indexes lessons) and writes episodic entries into `auto/episodic/`; identifier prefixes keep cross-fork files distinguishable. A future "Contably-scoped episodic dir" would require a backend change (the shipped backend reads from `root/` and writes to `root/episodic/`, which conflicts with `root=auto/episodic/contably/`); not blocking the cutover.
### Phase 2: Migrate the ledger (1 day)

The legacy engine's SQLite schema and oxi's are close cousins but not identical (`psos_task` vs `task`, `psos_event` vs `event`, slightly different column sets). Two options:
Option A — fresh start (recommended). oxi starts with an empty DB and re-seeds from the same `roadmap.md` the legacy engine reads. Historical task records stay in the snapshot from Phase 0. Pros: no migration code to write or test. Cons: loses the in-flight task table — any task currently dispatched or planned in the legacy engine has to be re-discovered from the roadmap.
Option B — translation script. Write a one-shot `tools/migrate-from-psos.py` that reads the legacy DB and translates `psos_task` → `task`, `psos_event` → `event`, deduping by stable task key. Pros: preserves history, smooth handover. Cons: an extra ~half-day of work, plus a test pass against a copy of production data.
#### Decision: Option A (Pierre, 2026-04-27)
The choice is A. Rationale:
- The roadmap is already the source of truth. Every long-lived legacy task originated from `docs/contably-product-roadmap-2026-Q2.md` (or its predecessor) and was re-seeded by `seed_from_roadmap` whenever a slot opened. Re-seeding from the same roadmap on day 1 of the parallel window reconstructs the queue; nothing strategic is lost.
- The Phase 0 snapshot of `legacy-ledger-2026-04-27.db` (15 MB, 38,080 events, 253 merged / 27 abandoned / 1 running) preserves history. Forensics, cost reconciliation, and audit replays read the snapshot directly. oxi has no need to ingest legacy events — its own ledger starts fresh.
- The 27 abandoned tasks are evidence against migration, not for it. A task the legacy engine couldn't finish in 14+ days, after multiple dispatch retries, is exactly the kind of input oxi's planner should re-evaluate from scratch — likely with a smaller scope, a different file set, or marked obsolete. Importing them as `pending` in oxi's `task` table just hands the new engine the legacy engine's stuck queue. The single `running` task at snapshot time is by now cold; the legacy killswitch was set 2026-04-27 02:38 UTC and the engine has been quiescent since.
- Schema drift is a real cost. `psos_task` → `task` looks shallow but hides differences in `files_touched` JSON shape, `last_progress_at` semantics (oxi's invariant: always populated; legacy's bug: never populated), critic-result fields, and event-payload schemas. A migration script that gets any of those wrong silently corrupts the new engine's reaper math, which is the exact bug the cutover exists to eliminate (§1.4 #1 — broken liveness). Writing translation code for legacy schemas reintroduces the conditions for the bug.
- Effort delta is decisive. Option A is zero engineering work — `oxi v3 init` creates an empty DB at `paths().db_path` (`/Volumes/AI/Code/contably/.oxi/oxi.db` on the Mini per the adapter; `/opt/oxi/db/contably.db` on the VPS once the Phase 3 deploy points there). Option B is half a day to write `tools/migrate-from-psos.py`, plus a careful test pass against a copy of the snapshot, plus a second pass during Phase 4 if the production legacy DB has drifted between the snapshot and cutover. The benefit (a preserved in-flight queue) is precisely what point 3 says we don't want.
What we explicitly accept by choosing A:
- The 1 `running` task at snapshot time is dropped. Per the killswitch state, it's not actually running anywhere — it's a dead row.
- Cost dashboards starting day 1 of oxi's parallel run will show only oxi's spend; comparison against legacy's last-week numbers happens by reading the snapshot, not the live ledger.
- Cumulative `brief` metrics start at zero. The 14-day parallel window's daily comparison against `legacy-ledger-2026-04-27.db` covers any "did oxi regress" question.
What stays preserved (independent of A vs B):
- The Phase 0 snapshot at `~/cutover-snapshots/legacy-ledger-2026-04-27.db` is kept ≥ 90 days (per §6 risk row 3).
- The roadmap markdown at `docs/contably-product-roadmap-2026-Q2.md` is unchanged — oxi reads it via `adapter.roadmap_location()`.
- All historical merged PRs on `Contably/contably` remain on GitHub; nothing in either ledger is the source of truth for shipped code.
Action items that follow from A:
- [x] Decide A vs B → A, Pierre, 2026-04-27.
- [ ] Set up oxi's DB at the path the adapter declares (`/Volumes/AI/Code/contably/.oxi/oxi.db` on the Mini for local-dev runs; `/opt/oxi/db/contably.db` on the VPS for production). Phase 3's `oxi v3 init` does this.
- [ ] Confirm the Phase 0 snapshot is replicated off the VPS (S3 or the macOS Time Machine target) before Phase 3 begins — single-machine snapshots fail the 90-day retention test. Open: needs a one-line `restic` or `aws s3 cp` step; track in the Phase 3 deploy checklist.
Option B is not pursued. If a later need surfaces (e.g., "we need to reconcile a specific past task's cost against oxi's records"), the snapshot remains queryable as a read-only sqlite file. There's no version of the cutover plan where Option B becomes the right call after this point.
### Phase 3: Deploy oxi to the VPS, parallel-mode (1 day)
- [ ] Provision `/opt/oxi/` on the VPS (same Contabo box, `100.77.51.51`). oxi runs alongside the legacy engine, not on a different host — they share the GitHub remote and the dispatch host (Mac Mini), so colocating reduces moving parts.
- [ ] `pip install --pre oxi-core` into a fresh venv at `/opt/oxi/venv/`.
- [ ] `pip install -e /opt/oxi/oxi-adapter-contably` (deploy the adapter).
- [ ] Copy the `.env` keys (GH_TOKEN, MINI_SSH_HOST, CLAUDE_API_KEY) from `/opt/contably-os/.env` to `/opt/oxi/.env`. The keys are identical.
- [ ] Install the oxi systemd units from `infra/oxi/systemd/` (units authored separately — see PR #204). Two units, not eight: in oxi's current architecture every reaper / pr-watcher / auto-merge / ship-recovery pass runs inside `oxi v3 tick`, so the deployment surface collapses to:
| Service | Schedule | Calls |
|---|---|---|
| `oxi-tick.service`/`.timer` | every 30s | `oxi v3 tick --real-claude --times 1` |
| `oxi-dashboard.service` | persistent | `oxi dashboard --port 8766 --bind ${OXI_DASHBOARD_BIND}` |
The dashboard binds to `:8766` (one port above the legacy engine's `:8765`) so both can run during the parallel window. After cutover (Phase 5) the port flips to `:8765` so the muscle-memory Tailscale URL keeps working.
The full first-time install runbook lives at `infra/oxi/README.md` (also in PR #204): create the `oxi` user, clone the repo to `/opt/oxi/repo`, build the venv, install `oxi-core` + the project adapter, write `/opt/oxi/.env` from the `env.example` template, set the killswitch ON, enable both units, then lift the killswitch when ready.
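For orientation, a sketch of what the tick unit pair could look like; the directives and paths here are assumptions derived from this section's layout, not the actual contents of PR #204:

```ini
# /etc/systemd/system/oxi-tick.service (sketch, not the PR #204 original)
[Unit]
Description=oxi v3 tick (one engine pass)

[Service]
Type=oneshot
User=oxi
WorkingDirectory=/opt/oxi
EnvironmentFile=/opt/oxi/.env
ExecStart=/opt/oxi/venv/bin/oxi v3 tick --real-claude --times 1

# /etc/systemd/system/oxi-tick.timer (sketch)
[Unit]
Description=run oxi v3 tick every 30s

[Timer]
OnBootSec=30
OnUnitActiveSec=30

[Install]
WantedBy=timers.target
```

`Type=oneshot` plus a timer mirrors the legacy engine's tick pattern; the difference is that one tick now covers every pass, so there is a single timer to enable, disable, or kill.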
- [ ] Set the oxi killswitch (`/opt/oxi/KILLSWITCH`) before enabling timers — the first start should be quiet.
- [ ] `systemctl enable --now oxi-tick.timer oxi-dashboard`.
- [ ] Confirm `curl -sI http://${OXI_DASHBOARD_BIND}:8766/` returns 200.
### Phase 4: Parallel-run observation window (14 days)

The legacy engine's killswitch is set (Phase 0), so it dispatches nothing new. oxi runs against the same roadmap, dispatches into the same Mac Mini, and opens PRs against the same `Contably/contably` repo. Both dashboards stay live (`:8765` legacy, `:8766` oxi).
- [ ] Day 1–3: unkill oxi (`oxi v3 unkill`). Daily, check the brief (`oxi brief --hours 24`), the dashboard, and the `task` table. Compare cost-per-merged-PR against the legacy engine's last-week numbers.
- [ ] Day 4–7: raise the daily hard cap incrementally (start at $200, lift to $500, then $1000) as confidence grows. Watch for any regression in:
  - Liveness (no `last_progress_at=""` rows under `dispatched`).
  - Killswitch (set it, confirm all 8 services freeze except `pr-watcher`).
  - Multi-repo safety (try a `contably-os` self-maintenance task; confirm the scope policy + worktree path route correctly).
  - Auto-merge state machine (a manually-merged PR should produce a benign `auto_merge_noop_already_merged` event, not a `rejected`).
- [ ] Day 8–14: if oxi has merged ≥30 PRs without operator intervention beyond decision gates, promote it to the only engine. If anything wobbles, re-set its killswitch, unkill the legacy engine, and decide whether to rebuild or roll back.
The 14-day number matches what the `origin-feature-gap-2026-04-24.md` doc treats as oxi's "observation window" elsewhere in the project. This is the same window — just against the Contably codebase instead of oxi's own.
### Phase 5: Decommission the legacy engine (1 day)
After 14 successful days of parallel running:
- [ ] `systemctl disable --now contably-os-v3-*.timer` (all 7 timers).
- [ ] `systemctl disable --now contably-os-v3-dashboard-server`.
- [ ] Move `/opt/contably-os/` → `/opt/contably-os-archive-2026-MM-DD/`. Don't delete — keep the DB and logs for forensics.
- [ ] Archive the `escotilha/psos` repo on GitHub (set to read-only, add a redirect README pointing to oxi). Don't delete.
- [ ] Update `Contably/contably/infra/README.md` to point at `infra/oxi/` instead of `infra/contably-os/`.
- [ ] Move oxi's dashboard from `:8766` back to `:8765` (so the muscle-memory Tailscale URL keeps working).
- [ ] Update `~/.claude-setup/memory/auto/MEMORY.md` to mark the PSOS-specific patterns as historical.
## 5. What does NOT get migrated

These exist in the legacy engine but should not cross into oxi or its Contably adapter — per `origin-feature-gap-2026-04-24.md` and the forbidden-string lint:
- Project-named tables/files in code. `psos_task`, `psos_event`, the `psos-core` package name. oxi uses `task`, `event`, package `oxi_core`.
- Hardcoded env vars referencing the project. oxi reads from the standard `~/.claude/.credentials.json` path or via the adapter.
- `registry.py` (legacy session/pattern registry). Replaced by the adapter model.
- `council.py` (project-specific deliberation skill invocation). Out of scope for the engine; if needed, rebuild as an adapter-level skill.
- Verbatim copies of any legacy file. All re-implementations study behavior, then write fresh code. The forbidden-string lint at `/Volumes/AI/Code/oxi/scripts/lint-for-leaks.sh` enforces this in CI.
- The episodic+semantic memory machinery (`hooks.py` + `consolidator.py` from the legacy engine). Listed as "Phase later" in `origin-feature-gap-2026-04-24.md`. A re-design is needed; the rsync-to-central-host model assumes the legacy topology. Until oxi has its own memory backend behind a Protocol, the consolidator role remains a manual `~/.claude-setup/skills/memory-consolidation/` invocation triggered nightly via `/loop`.
## 6. Risks and mitigations
| Risk | Mitigation |
|---|---|
| oxi has a latent bug not surfaced by the fake-claude harness | 14-day parallel-run window with the legacy engine's killswitch set; killswitch oxi at the first regression and unkill the legacy engine |
| Cost regression (oxi spends more per merged PR than legacy) | Daily `oxi brief --hours 24` comparison; the adapter's `daily_hard_cap_usd` defaults to $200, raised incrementally to $1000 |
| In-flight tasks lost during ledger migration | The Phase 0 snapshot is preserved 90 days; Option A (fresh start) explicitly accepts this since tasks dispatched >14 days ago are stale anyway |
| Multi-repo collision (oxi worktree path vs legacy worktree path on the Mac Mini) | Both engines use the same `/Volumes/AI/Code/<project>-<session-tag>` convention; distinct session-tag pools per engine prevent overlap. Add an `oxi-` prefix to oxi's session tags during the parallel window to make ownership unambiguous |
| Dashboard URL muscle memory points at `:8765` (legacy) | oxi runs on `:8766` during the parallel window; on cutover, oxi switches to `:8765` and the legacy dashboard server is already disabled |
| Killswitch mistake (set the wrong one) | `oxi v3 kill --reason "..."` writes the file with a reason; `oxi status` shows which engine is killed; a brief reminder card lives in the runbook |
| PSOS bug carries into the Contably adapter | The adapter is ~70 lines of pure data, no logic. The engine's bugs were in core; the adapter can't reproduce them |
## 7. Acceptance criteria
The cutover is complete when:
- [ ] Legacy `contably-os-v3-*` services are disabled and the directory archived.
- [ ] Oxi has merged ≥50 PRs against `Contably/contably` autonomously, with critic approvals, in ≤14 days.
- [ ] Zero `last_progress_at=""` rows ever appear in oxi's `task` table for `dispatched` status (structurally guaranteed by oxi, but verify in production).
- [ ] Killswitch file freezes ALL of: dispatch, ship-recovery, seed, auto-merge; confirmed by toggling and observing zero new ledger events except from `pr-watcher`.
- [ ] A `contably-os` self-maintenance task (or any non-Contably-monorepo task) routes to the right worktree and the right scope policy without code changes.
- [ ] Dashboard at `http://100.77.51.51:8765/` is oxi's, auto-refreshes every 60 s, with three tabs working.
- [ ] `oxi brief --hours 24` produces a clean markdown daily recap and posts to Slack via the `NotificationBackend`.
- [ ] `infra/contably-os/` in the Contably repo is replaced by `infra/oxi/` (systemd units, README, runbook).
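The liveness criterion above can be spot-checked against the production ledger. A minimal sketch, assuming oxi's ledger is SQLite with a `task` table carrying the `status` and `last_progress_at` columns named in the criteria (the exact schema is an assumption):

```python
import sqlite3

def count_liveness_violations(conn: sqlite3.Connection) -> int:
    """Count dispatched tasks with a missing progress stamp.

    Under oxi's atomic-transaction design this must always return 0.
    """
    row = conn.execute(
        "SELECT COUNT(*) FROM task "
        "WHERE status = 'dispatched' "
        "AND (last_progress_at IS NULL OR last_progress_at = '')"
    ).fetchone()
    return row[0]

# Self-contained demo against an in-memory DB with an illustrative schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task (id INTEGER PRIMARY KEY, status TEXT, last_progress_at TEXT)"
)
conn.execute(
    "INSERT INTO task (status, last_progress_at) VALUES ('dispatched', '2026-04-27T10:00:00Z')"
)
conn.execute("INSERT INTO task (status, last_progress_at) VALUES ('done', '')")
print(count_liveness_violations(conn))  # an empty stamp on a 'done' row is not a violation
```

In production the same query would run against oxi's ledger file rather than an in-memory database.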
8. Resolved scope decisions (Pierre, 2026-04-27)¶
8.1 Memory backend — scope a PR before Phase 4¶
A `MemoryBackend` Protocol PR lands against oxi-core before the parallel-run window opens, so oxi can read what the legacy consolidator wrote and start producing its own episodic entries on day 1. Sketch:

- `oxi_core.memory.MemoryBackend` Protocol with `read_relevant(task) -> list[MemoryEntry]` and `write_episodic(task, entry) -> None`.
- Default implementation reads from `~/.claude-setup/memory/auto/` (the path Pierre's existing `mem-search` index already covers).
- The adapter exposes `memory()` returning a configured backend; the Contably adapter points at `~/.claude-setup/memory/auto/episodic/contably/`.
- The planner reads via `memory().read_relevant(task)` before each plan (mirrors the legacy `consolidator → planner` loop without the rsync-to-central-host coupling).
- Nightly consolidation stays manual (Pierre's existing `/loopover` `memory-consolidation` skill) until day 30 of oxi production.
This is its own PR before Phase 1 starts. Not bundled with the adapter work because it touches oxi-core and needs its own test pass.
8.2 Promote recipe — post-cutover¶
`daily_promote` does not block the cutover. Staging→production promotion stays on its current path (manual `/deploy-conta-production` invocation, or whatever cron the legacy engine has) until oxi has a green 14-day window and Pierre signs off on automated production promotes. The adapter's `promote_recipe()` is wired (so it doesn't need a code change later), but the engine-side `daily_promote.run()` is left dormant via `engine_state.promote_enabled = False` for the cutover and the parallel window.
Tracked as a Phase 6 item: after acceptance criteria pass, flip the flag and observe for another 7 days before considering the promote loop production.
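The "wired but dormant" arrangement can be sketched as a flag gate. `engine_state.promote_enabled` is named in this spec; the `EngineState` shape and return values are illustrative assumptions:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EngineState:
    # Stays False through the cutover and the parallel window;
    # flipped only as a Phase 6 item after acceptance criteria pass.
    promote_enabled: bool = False

def maybe_promote(state: EngineState, promote_recipe: Callable[[], None]) -> str:
    """Run the adapter-wired promote recipe only when the engine flag allows it."""
    if not state.promote_enabled:
        return "promote_skipped_flag_off"  # adapter wiring intact, engine dormant
    promote_recipe()
    return "promote_ran"

print(maybe_promote(EngineState(), lambda: None))  # → promote_skipped_flag_off
```

Flipping the flag requires no adapter change, which is the point of wiring `promote_recipe()` now.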
8.3 Ship-recovery dirty-tree guard — verified, must fix before parallel window¶
Verification result: oxi's ship_recovery.py has the F-01 risk built into its core flow.
`oxi-core/src/oxi_core/v3/ship_recovery.py:97-105` checks `git status --porcelain` to find candidate worktrees (non-empty output = recovery needed), then at line 184 runs `git add -A` and commits everything. No filter on untracked files, no allowlist of expected paths, no diff against the worker's plan.
This is the inverse invariant from the legacy git-worker-push wrapper. The legacy wrapper refuses to push from a dirty tree (treats dirty as F-01 risk); oxi's recovery embraces the dirty tree (treats dirty as "the worker forgot to commit, sweep it all in"). Both patterns can ship correct code in the happy path. Both are vulnerable to F-01 when the worktree has stray host files that don't belong to the task, and on the Mac Mini, where worktrees live next to manuals, backups, and other `Code/` directories, that's the realistic state.
Fix before the parallel window opens: modify `_recover_one` in `ship_recovery.py` to filter `git add -A` through the planned files set:

- Read the task's `files_touched` from the plan JSON (already in the DB schema).
- Run `git status --porcelain` and split the results into tracked-modified vs untracked.
- Only `git add` paths that are either tracked-modified OR untracked-and-in-`files_touched`.
- If untracked-and-NOT-in-`files_touched` paths exist, emit a `ship_recovery_skipped_dirty_tree` event with the suspicious paths and bail. Do NOT commit.
- Add a config knob on `DispatchPolicy`: `ship_recovery_dirty_tree_policy: "strict" | "files_touched_only" | "permissive"`. Default `"files_touched_only"` (the new behavior). `"strict"` matches the legacy wrapper (refuse). `"permissive"` matches today's behavior (sweep all) and is kept only for forks that explicitly want it.
Tests: extend `oxi-core/tests/v3/test_ship_recovery.py` with a fixture that drops a stray `random_host_file.txt` in the worktree alongside legitimate task changes; assert that under `"files_touched_only"` the recovery commits only the legitimate diff and emits the skip event for the stray file.
Estimated: half a day. Lands as a regular PR through oxi's own dogfood loop before Phase 0 of this cutover begins.
8.4 Hosts — Mac Mini only, dispatched from VPS¶
Confirmed topology for the cutover:
- Engine (oxi `v3 tick`, all timers, dashboard, killswitch, ledger): runs on the Contabo VPS at `100.77.51.51`, in `/opt/oxi/`. Same physical box as the legacy engine.
- Worker dispatch: the VPS SSHs into the Mac Mini (Tailscale `100.66.244.112`) and fires `claude -p` there. Worktrees live on the Mini under `/Volumes/AI/Code/<project>-<session-tag>`. The Mini has the Claude Code OAuth credentials, the local repo clones, and the `claude` binary; the VPS does not.
- Single dispatch host. The Contably adapter's `dispatch()` returns one `DispatchHost(name="mini", ssh_alias="mini", concurrency=20)`. The `air` host is dropped from the cutover. If concurrency needs to grow beyond what the Mini can sustain (RAM-probed by oxi's `compute_probe.py` per worker), revisit by adding `air` as a second host after the 14-day window, not before.
- Why not `air` on day 1: the legacy engine has `air` listed in its `pool.conf` but never actually dispatched to it (active window 20:05). Adding `air` simultaneously with the cutover doubles the failure surface for no proven benefit.
Update Phase 1 of this spec accordingly: the adapter scaffold should list `mini` only, with `concurrency=20`.
9. Sources merged into this spec¶
| Source | What it contributed |
|---|---|
| `Contably/contably/infra/contably-os/README.md` | systemd service inventory, deploy procedure, `.env` keys, dashboard topology |
| `Contably/contably/docs/contably-os-for-non-technical-readers-2026-04-23.md` | per-task lifecycle, Planner/Worker/Critic/Consolidator role table, "what it is and is not" framing |
| `Contably/contably/docs/psos-engine-root-cause-2026-04-23.md` | the seven critical/high bugs in §1.4, the structural requirements for §2.2 |
| `Contably/contably/docs/handoff-2026-04-22.md` | engine state at handoff, source-of-truth confirmation (`escotilha/psos`, not `Contably/contably-os`), version history (0.4.6 last good, 0.5.0 rolled back), cutover decisions made in PRs #25–#33 |
| `Contably/contably/docs/research/psos-discovery-pipeline-2026-04-23.md` | discovery / source-scanning architecture (T2-118, T2-119); out of scope for the cutover but captured for the open question on adjacent pipelines |
| `oxi/README.md` + `oxi/docs/architecture.md` + `oxi/docs/origin-feature-gap-2026-04-24.md` | what oxi already is, what it consciously does NOT port from the legacy engine, the adapter Protocol, the test discipline |
| `~/.claude-setup/memory/auto/semantic/pattern_dirty_tree_contamination_in_workers.md` + `pattern_engine_stub_merge_staged_loss.md` | known failure patterns the cutover should make structurally impossible |
No source from escotilha/psos (the legacy engine's source code) was read directly; this spec works from documentation alone, consistent with oxi's "no verbatim port" rule.