Codex Sandbox Silent Drop
When running codex exec -s workspace-write inside a Conductor workspace, the codex CLI sandbox can silently DROP file patches on disk despite reporting success in its own log output. Tonight 2 of 5 fix tabs (P0-2 backfill_outcomes and the original P1-3+P1-5 ops fixes) lost their patches this way. The codex log showed clean diffs and “implemented” reports; git status showed nothing changed. This is a known Conductor/codex CLI failure mode.
Core claim
Never trust a codex exec tab’s success report alone. Always verify
the expected file changes appear in git status --short BEFORE
committing. If they don’t, the patch was silently dropped - re-fire
with an explicit verification step.
Mechanism (best understanding)
- Codex CLI runs inside a sandbox managed by the Conductor app
- Sandbox sees worktree at one path: /Users/codysmith/conductor/workspaces/cortanaroi/cortanaroi-mk2/...
- Parent shell sees the same worktree at a different path: /Users/codysmith/conductor/workspaces/cortanaroi-mk2/belo-horizonte
- Most patches propagate from sandbox view to parent view correctly
- Some patches do not - failure mode is intermittent and content-dependent (Python files seem more affected than markdown)
- The sandbox CAN’T commit (fatal: index.lock Operation not permitted) because the .git/worktrees/ path is outside its writable roots. This is the visible part. The invisible part is patches that don’t even land on the parent-visible filesystem.
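One way to check the two-path theory is to print how each side resolves the worktree and compare the results. The sketch below is an assumption, not something from the original investigation: worktree_fingerprint is a hypothetical helper name, and it uses only standard git rev-parse options.

```shell
#!/usr/bin/env bash
# worktree_fingerprint (hypothetical helper): print how the current
# process resolves the worktree, so two views can be compared by eye.
worktree_fingerprint() {
  echo "cwd=$(pwd -P)"                                # physical working directory
  echo "toplevel=$(git rev-parse --show-toplevel)"    # worktree root as git sees it
  echo "gitdir=$(git rev-parse --absolute-git-dir)"   # where index.lock would be created
}
```

Run it in the parent shell, then have the codex tab run the same three commands; if the toplevel or gitdir lines differ, the two sides are looking at the worktree through different paths. Whether a difference actually explains the dropped patches is still unconfirmed.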
How to detect
After every codex exec finishes:
git status --short # are the expected files listed?
git diff --stat <expected files> # are diffs non-empty?
grep -l '<expected change>' <files> # does the change actually appear?
If git status doesn’t list the file the codex log claims to have edited, the patch was silently dropped. Re-fire.
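The three checks above can be folded into one helper so the verification is a single command after each tab. This is a minimal sketch: verify_patch is a hypothetical name, and the pattern argument is assumed to be a fixed string taken from the change codex claimed to make.

```shell
#!/usr/bin/env bash
# verify_patch PATTERN FILE... (hypothetical helper)
# Returns 0 only if every FILE is listed by `git status --short`
# and contains PATTERN as a fixed string.
verify_patch() {
  local pattern=$1; shift
  local f rc=0
  for f in "$@"; do
    if [ -z "$(git status --short -- "$f")" ]; then
      # Nothing changed on the parent-visible filesystem.
      echo "DROPPED: $f not listed by git status" >&2
      rc=1
    elif ! grep -qF -- "$pattern" "$f"; then
      # File changed, but not with the edit the log claimed.
      echo "MISSING: '$pattern' not found in $f" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

Usage, with placeholder arguments: verify_patch '<expected change>' <files> || echo "re-fire this tab". A non-zero exit means at least one claimed edit is not on disk.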
How to mitigate
- Verify before committing. Mandatory grep-for-claimed-change step in every codex workflow.
- Tighter prompts on retry. When re-firing, include explicit git diff verification commands in the prompt and ask codex to “report only after diffs are visible on disk.”
- Belt-and-suspenders for P0 fixes. Anything affecting live trading or ML labels: require BOTH the codex success report AND a non-empty git diff --stat on expected files before trusting.
- Don’t daisy-chain trust. Earlier tonight I committed Tab 1’s work and then assumed the system-reminder mentions of Tab 5’s plist edits meant Tab 5 worked. It hadn’t. Always verify directly, not via reminders.
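The "verify before committing" rule can be enforced rather than remembered by putting the check in front of the commit itself. The following is a sketch under assumptions: safe_commit is a hypothetical wrapper, not part of git or codex, and it only checks that each expected file has a change on disk, not that the change is the right one.

```shell
#!/usr/bin/env bash
# safe_commit MESSAGE FILE... (hypothetical wrapper)
# Refuses to commit unless every expected FILE shows a change on disk.
safe_commit() {
  local msg=$1; shift
  local f
  for f in "$@"; do
    if [ -z "$(git status --short -- "$f")" ]; then
      echo "refusing to commit: no change on disk for $f" >&2
      return 1
    fi
  done
  # Stage and commit only the files that were explicitly verified.
  git add -- "$@" && git commit -q -m "$msg" -- "$@"
}
```

Because the file list is explicit, a tab whose patch was dropped blocks the commit instead of being carried along by another tab's changes.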
When this concept applies
Every codex exec invocation in a Conductor workspace where the
output should be code edits on Python/shell/plist files. If you’re
running codex outside Conductor (e.g. plain CLI on a vanilla repo
with full filesystem access), this failure mode shouldn’t occur -
but verify anyway.
When this concept breaks
If Conductor + codex CLI ship a fix that makes sandbox writes reliable, this concept becomes obsolete. As of 2026-05-05 with codex-cli 0.125.0 (research preview), it’s still active.
Related
- File a Conductor issue (Task #80)
- Workflow change: bake git status verify into every codex tab follow-up
See Also
- launchd Calendar Catch-up - another “silent failure” gotcha worth documenting
- ML Training Label Grounding
Timeline
2026-05-05 | observed - 5 codex tabs fired tonight for adversarial review fixes. 3 of 5 persisted patches cleanly (Tabs 1, 3, 4). 2 of 5 silently dropped patches (Tabs 2 and 5) despite reporting success. Discovered by accident on Tab 5 when grepping for the claimed plist fix; cascading audit revealed Tab 2 also failed. Re-fired both successfully with tighter “verify diff visible” prompts. Filed this concept so we never trust a codex report unverified again.