Codex Sandbox Silent Drop

When running codex exec -s workspace-write inside a Conductor workspace, the codex CLI sandbox can silently drop file patches: its own log reports success, but the changes never appear in the parent-visible worktree. Tonight 2 of 5 fix tabs (P0-2 backfill_outcomes and the original P1-3+P1-5 ops fixes) lost their patches this way. The codex log showed clean diffs and “implemented” reports; git status showed nothing changed. This is a known Conductor/codex CLI failure mode.

Core claim

Never trust a codex exec tab’s success report alone. Always verify the expected file changes appear in git status --short BEFORE committing. If they don’t, the patch was silently dropped - re-fire with an explicit verification step.

Mechanism (best understanding)

  • Codex CLI runs inside a sandbox managed by the Conductor app
  • Sandbox sees worktree at one path: /Users/codysmith/conductor/workspaces/cortanaroi/cortanaroi-mk2/...
  • Parent shell sees the same worktree at a different path: /Users/codysmith/conductor/workspaces/cortanaroi-mk2/belo-horizonte
  • Most patches propagate from sandbox view to parent view correctly
  • Some patches do not. The failure mode is intermittent and content-dependent (Python files seem more affected than markdown)
  • The sandbox CAN’T commit (fatal: index.lock Operation not permitted) because the .git/worktrees/ path is outside its writable roots. This is the visible part. The invisible part is patches that don’t even land on the parent-visible filesystem.
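
One way to probe the two-path view described above is to write a sentinel file through one path and look for it through the other. The following is a minimal sketch, not a confirmed diagnostic: the function name and sentinel file name are hypothetical, and in practice the write would have to happen inside the sandbox while the check runs in the parent shell. Because the drops are intermittent, a passing probe does not prove that later patches will land.

```python
import uuid
from pathlib import Path


def write_is_visible(write_root: str, read_root: str) -> bool:
    """Write a sentinel under write_root and report whether it is visible under read_root.

    Hypothetical helper: write_root stands in for the sandbox's view of the
    worktree, read_root for the parent shell's view of the same worktree.
    """
    name = f".codex-probe-{uuid.uuid4().hex}"  # unique name avoids clobbering real files
    sentinel = Path(write_root) / name
    sentinel.write_text("probe\n")
    try:
        return (Path(read_root) / name).is_file()
    finally:
        sentinel.unlink(missing_ok=True)  # always clean up the probe file
```

If this returns False for two paths that are supposed to be the same worktree, writes through the first path are not reaching the second.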

How to detect

After every codex exec finishes:

git status --short                     # are the expected files listed?
git diff --stat -- <expected files>    # are diffs non-empty? (new untracked files show up only in git status)
grep -l '<expected change>' <files>    # does the change actually appear?

If git status doesn’t list the file the codex log claims to have edited, the patch was silently dropped. Re-fire.
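
The checks above can be scripted so they run the same way after every tab. This is a rough sketch under stated assumptions: the function names are hypothetical, the expected paths are given relative to the repository root, and the script is run from the parent shell's view of the worktree (not from inside the sandbox). It covers the git status and grep checks; the diff --stat check is left out for brevity.

```python
import subprocess
from pathlib import Path


def parse_status_paths(status_short: str) -> set[str]:
    """Extract the changed paths from `git status --short` output."""
    paths = set()
    for line in status_short.splitlines():
        if len(line) < 4:
            continue
        entry = line[3:]  # skip the two status letters and the separating space
        if " -> " in entry:  # renames are printed as "old -> new"
            entry = entry.split(" -> ", 1)[1]
        paths.add(entry.strip('"'))  # git quotes paths with unusual characters
    return paths


def dropped_files(status_short: str, expected: list[str]) -> list[str]:
    """Return the expected files that git does NOT report as changed."""
    changed = parse_status_paths(status_short)
    dropped = []
    for path in expected:
        # A new untracked directory is listed as "dir/", which covers files inside it.
        listed = path in changed or any(
            entry.endswith("/") and path.startswith(entry) for entry in changed
        )
        if not listed:
            dropped.append(path)
    return dropped


def verify_patch(expected: list[str], needle: str, repo: str = ".") -> list[str]:
    """Run the status and grep checks in `repo`; return a list of problems found."""
    status = subprocess.run(
        ["git", "status", "--short"], cwd=repo,
        capture_output=True, text=True, check=True,
    ).stdout
    problems = [f"not in git status: {p}" for p in dropped_files(status, expected)]
    for p in expected:
        file = Path(repo) / p
        if not file.is_file() or needle not in file.read_text(errors="replace"):
            problems.append(f"claimed change not found: {p}")
    return problems
```

An empty list from verify_patch means every expected file is listed by git and contains the claimed change; anything else is a candidate silent drop and a reason to re-fire.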

How to mitigate

  1. Verify before committing. Mandatory grep-for-claimed-change step in every codex workflow.
  2. Tighter prompts on retry. When re-firing, include explicit git diff verification commands in the prompt and ask codex to “report only after diffs are visible on disk.”
  3. Belt-and-suspenders for P0 fixes. For anything affecting live trading or ML labels, require BOTH the codex success report AND a non-empty git diff --stat on the expected files before trusting the fix.
  4. Don’t daisy-chain trust. Earlier tonight I committed Tab 1’s work and then assumed the system-reminder mentions of Tab 5’s plist edits meant Tab 5 had worked. It hadn’t. Always verify directly, not via reminders.
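
Mitigations 2 and 3 can be sketched as two small helpers. Both names are hypothetical and the prompt wording is only an illustration built from the phrasing above, not a tested or recommended codex prompt. The gate takes the text output of git diff --stat as a string, so it makes no assumption about how that output was captured.

```python
def build_retry_prompt(task: str, expected_files: list[str]) -> str:
    """Append explicit on-disk verification steps to a re-fired task prompt."""
    files = " ".join(expected_files)
    return (
        f"{task}\n\n"
        "After editing, run these commands and include their output:\n"
        "  git status --short\n"
        f"  git diff --stat -- {files}\n"
        "Report only after diffs are visible on disk."
    )


def p0_safe_to_commit(report_says_success: bool, diff_stat: str) -> bool:
    """P0 gate: require BOTH a success report and a non-empty diff stat."""
    return report_says_success and bool(diff_stat.strip())
```

Note that git diff --stat does not list untracked files, so a fix that only creates a new file would fail this gate until the file is staged; for that case, check git status as well.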

When this concept applies

Every codex exec invocation in a Conductor workspace where the output should be code edits on Python/shell/plist files. If you’re running codex outside Conductor (e.g. plain CLI on a vanilla repo with full filesystem access), this failure mode shouldn’t occur - but verify anyway.

When this concept breaks

If Conductor + codex CLI ship a fix that makes sandbox writes reliable, this concept becomes obsolete. As of 2026-05-05 with codex-cli 0.125.0 (research preview), it’s still active.

Follow-ups:

  • File a Conductor issue (Task #80)
  • Workflow change: bake a git status verification into every codex tab follow-up

Timeline

2026-05-05 | observed - 5 codex tabs fired tonight for adversarial review fixes. 3 of 5 persisted patches cleanly (Tabs 1, 3, 4). 2 of 5 silently dropped patches (Tabs 2 and 5) despite reporting success. Discovered by accident on Tab 5 when grepping for the claimed plist fix; cascading audit revealed Tab 2 also failed. Re-fired both successfully with tighter “verify diff visible” prompts. Filed this concept so we never trust a codex report unverified again.