MK3 architecture spine - the Setup Hunter

This is the canonical MK3 design doc. Every other 2026-05-1x MK3 brain page (lead-lag study, five corpus rules, trailing validation) is a component of this spine, not a peer. Read this first.

1. The inversion

MK2 is a signal firehose. It scores continuously and emits a trade whenever the composite crosses a threshold. Result: ~20-35 trades/day, the majority regime-blind entries into chop. 2026-05-15 live: ~35 trades, IBKR-real -$21,215. The corpus ceiling is ~22% TP-hit; no bucket exceeds ~45%. The firehose cannot be tuned into profitability because its defect is structural: it has no concept of “nothing good is happening - stay flat.”

MK3 is a Setup Hunter. It inverts the posture from reactive-emit to patient-hunt. Default state is flat. It watches every signal stream (composite, IMPULSE, Market Tide, dealer/gamma, regime, price structure) and acts only when a named, pre-validated confluence - a “setup” - is present. Most of its job, by trade count, is saying no.

The unit of work changes: signal → setup.

  • A signal is “score crossed 58.” Context-free, always-on, lagging.
  • A setup is “conditions {A,B,C} co-occur, in regime R, anchored on the leading indicator L, in a configuration that - validated out-of-sample - precedes a clean move.” Named. Falsifiable. Rare.

2. Reconciliation with the standing mandates (do not skip this)

CLAUDE.md holds two permanent north stars: “80% win rate” and “early and right.” The corpus has empirically falsified the literal “80% TP-hit on +10% within 30min” target (22% ceiling, project_impulse_corpus_may14). That does not discard the mandates - it tells us the firehose was the wrong vehicle for them. The Setup Hunter is how you actually pursue both:

  • “80% win rate” → reliability principle. Reinterpreted honestly: the goal is high-conviction trades that are not coin flips, not a literal 80% on an unreachable TP geometry. Selectivity is the only lever the corpus supports for raising reliability - fewer trades, each a validated setup, beats spraying.
  • “Early and right” → the lead-anchor requirement. Every setup MUST be anchored on a signal empirically shown to lead (the lead-lag study is what determines which - Tide is the user’s hypothesis, IMPULSE the live observation, neither yet confirmed). “Early” is meaningless until the lead-lag study says what actually leads. The Setup Hunter operationalizes “early and right” by only building setups on confirmed-leading anchors.

Net: the Setup Hunter serves the spirit of both mandates better than the firehose ever could. The falsified part is the number 80%@10%TP; the preserved part is reliability + leading-signal discipline. The operational north star becomes: 5-10 high-conviction winners/day via selectivity - a hard cap, not a quota to fill.

3. What a “setup” is (the data structure)

Every setup is a record, not a vibe:

  • name - e.g. hiro_strike_stack_trend
  • hypothesis - one falsifiable sentence (“when hiro AND strike_stack fire together in a TREND_UP regime, the suggested option reaches +10% within N min more than X% of the time”)
  • anchor - the leading signal it is built on (must be a lead-lag-confirmed leader; no setup may anchor on a lagging signal)
  • conditions - the confluence (signal states + thresholds)
  • regime_filter - the regime(s) it is valid in (and explicitly invalid in)
  • invalidators - states that veto the setup even when its conditions are met (e.g. Tide-divergence against bias, if lead-lag confirms Tide leads)
  • position_policy - size, scale-out, trail (inherits the corpus-validated trailing config; sizing scaled by conviction)
  • validation_status - UNVALIDATED | CORPUS_ONLY | OOS_PASSED | SHADOW | LIVE (see §5)
  • kill_criterion - the metric + threshold that demotes it (mirrors the trailing A/B kill-criterion, #107)
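
A minimal sketch of this record as a Python dataclass, to make the field list concrete. The field encodings (conditions as a signal-to-threshold map, position_policy as a config key), the KillCriterion shape, the register helper, and every example value are assumptions for illustration, not the shipped schema. The anchor in the example is a placeholder: no leader is confirmed yet.

```python
from dataclasses import dataclass
from enum import Enum


class ValidationStatus(Enum):
    UNVALIDATED = "UNVALIDATED"
    CORPUS_ONLY = "CORPUS_ONLY"
    OOS_PASSED = "OOS_PASSED"
    SHADOW = "SHADOW"
    LIVE = "LIVE"


@dataclass(frozen=True)
class KillCriterion:
    metric: str          # hypothetical metric name, e.g. "rolling_ev"
    threshold: float     # demote when the metric drops below this value
    window_trades: int   # evaluation window in trades (assumed unit)


@dataclass(frozen=True)
class Setup:
    name: str
    hypothesis: str
    anchor: str
    conditions: dict          # signal name -> minimum value (assumed encoding)
    regime_filter: frozenset  # regimes the setup is valid in
    invalidators: tuple       # named veto conditions
    position_policy: str      # key into a policy config (assumed)
    kill_criterion: KillCriterion
    validation_status: ValidationStatus = ValidationStatus.UNVALIDATED

    def may_emit_orders(self) -> bool:
        # Only LIVE setups place orders; SHADOW emits telemetry only.
        return self.validation_status is ValidationStatus.LIVE


def register(setup: Setup, confirmed_leaders: set) -> Setup:
    """Reject any setup whose anchor is not a lead-lag-confirmed leader."""
    if setup.anchor not in confirmed_leaders:
        raise ValueError(
            f"{setup.name}: anchor {setup.anchor!r} is not a confirmed leader"
        )
    return setup


candidate = Setup(
    name="hiro_strike_stack_trend",
    hypothesis="hiro AND strike_stack in TREND_UP precede +10% within N min "
               "more than X% of the time",
    anchor="IMPULSE",  # placeholder only: no anchor is confirmed yet
    conditions={"hiro": 1.0, "strike_stack": 1.0},  # placeholder thresholds
    regime_filter=frozenset({"TREND_UP"}),
    invalidators=("tide_divergence_against_bias",),
    position_policy="corpus_validated_trailing",
    kill_criterion=KillCriterion("rolling_ev", 0.0, 30),
)
```

With an empty confirmed_leaders set, register(candidate, set()) raises, which is the intended behavior until the lead-lag study names a leader.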

4. Components hang off the spine

These are no longer scattered pages - they are roles in the Setup Hunter:

Prior finding / page | Role in the Setup Hunter
Lead-lag study (2026-05-15-mk3-regime-design-leadlag-and-churn) | Defines the ingredient layer: which signals are allowed to anchor a setup (only confirmed leaders)
Regime detector (#85; HMM-calls-everything-RANGE is the cautionary tale) | A setup filter: a setup is valid only in its regime; a useless regime classifier = useless filter, so the regime model must be richer than today’s HMM
hiro+strike_stack (project_impulse_corpus_may14, #101) | The first candidate setup - the only thing in the corpus that beat baseline. n=16, needs n≥30 + OOS before promotion
Re-entry/churn (#509-511) | The debounce rule: one setup occurrence = one position; no re-fire on the same signal cluster
Post-loss cooldown, hold-time, time-of-day, MAE dist (2026-05-14-five-corpus-rules-for-mk3) | Setup filters / sizing inputs (e.g. no new setup within 30min of a loss; midday setups need higher conviction)
Tide-divergence (2026-05-15-mk3-regime-design-leadlag-and-churn) | A setup invalidator - pending lead-lag confirming Tide leads
Trailing + scale-out (2026-05-14-trailing-stop-corpus-validation) | The default position_policy (already corpus-validated, MK2-shipped, carries forward)
“5-10 winners/day” | The trade budget: a hard daily cap + default-flat, NOT a quota
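
Three of the roles above (trade budget, debounce, post-loss cooldown) are pure gating logic and can be sketched without any framework. This is an illustration under assumptions: the TradeBudget class, the cluster_id key, the default cap of 10, and the per-day reset of seen clusters are all hypothetical. Only the 30-minute cooldown and the "cap, not quota" semantics come from the text.

```python
from datetime import datetime, timedelta


class TradeBudget:
    """Default-flat gate: every check can only say no, never force a trade."""

    def __init__(self, daily_cap=10, cooldown=timedelta(minutes=30)):
        self.daily_cap = daily_cap      # hard cap, not a quota to fill
        self.cooldown = cooldown        # no new setup within 30min of a loss
        self._day = None
        self._entries_today = 0
        self._seen_clusters = set()
        self._last_loss_at = None

    def _roll_day(self, now):
        # Assumption: the cap and the debounce memory reset each calendar day.
        if self._day != now.date():
            self._day = now.date()
            self._entries_today = 0
            self._seen_clusters.clear()

    def allow(self, cluster_id, now):
        """Return (allowed, reason). cluster_id names one setup occurrence."""
        self._roll_day(now)
        if self._entries_today >= self.daily_cap:
            return False, "daily_cap"
        if cluster_id in self._seen_clusters:
            return False, "debounce"    # one occurrence = one position
        if (self._last_loss_at is not None
                and now - self._last_loss_at < self.cooldown):
            return False, "post_loss_cooldown"
        return True, "ok"

    def record_entry(self, cluster_id, now):
        self._roll_day(now)
        self._entries_today += 1
        self._seen_clusters.add(cluster_id)

    def record_loss(self, now):
        self._last_loss_at = now


budget = TradeBudget(daily_cap=2)
t0 = datetime(2026, 5, 18, 10, 0)
first = budget.allow("cluster-1", t0)
budget.record_entry("cluster-1", t0)
refire = budget.allow("cluster-1", t0 + timedelta(minutes=1))
```

Here refire is rejected with reason "debounce". Nothing in the class ever returns "must trade", which is the cap-not-quota property.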

5. The validation pipeline (non-negotiable - this is what stops curve-fitting)

A setup may not go live by inspection. It graduates through gates, each with a kill criterion, mirroring the discipline applied to the trailing A/B (#107):

  1. HYPOTHESIS - written as one falsifiable sentence. No code yet.
  2. CORPUS_ONLY - conditional-EV test on the live corpus (now clean: #103 counterfactuals + #109 canonical P&L). Must clear a pre-set EV/hit bar. This is necessary but NOT sufficient - it can overfit 5 weeks of one regime.
  3. OOS_PASSED - out-of-sample on the 1-year UW bundle (regime-spread backtest, mk3-build-week plan). Must hold across regimes, not just the corpus window. A setup that fails OOS is dead, no appeal. This gate is the entire defense against curve-fitting.
  4. SHADOW - runs live in MK3 emitting “would-fire” telemetry only, no orders, for a defined window. Compared to actuals.
  5. LIVE - emits orders. Retains its kill_criterion; demotes automatically if the metric goes negative over the defined window (same pattern as the trailing kill-criterion).

Failure mode named explicitly: “setup hunter” degenerates into curve-fitting-with-a-nicer-name if setups are fit to the recent corpus and shipped. The OOS_PASSED gate on the 1Y bundle is the only thing that prevents this. It is not optional. A setup that only has CORPUS_ONLY is a hypothesis, not a setup.
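
The gate sequence can be sketched as a small state machine. This is a sketch under assumptions: the class and function names are hypothetical, HYPOTHESIS is treated as the stage §3 calls UNVALIDATED, and demotion from LIVE lands in SHADOW (the text says only that a setup "demotes automatically", not where to). The n≥30 floor comes from the hiro+strike_stack row in §4; the EV bar is a caller-supplied placeholder.

```python
from enum import IntEnum


class Stage(IntEnum):
    HYPOTHESIS = 0
    CORPUS_ONLY = 1
    OOS_PASSED = 2
    SHADOW = 3
    LIVE = 4


class DeadSetup(Exception):
    """Raised when a setup that failed OOS is pushed through a gate again."""


def corpus_gate(n_trades, ev_per_trade, ev_bar, min_n=30):
    # Sample size is checked first: a thin sample fails whatever its EV.
    return n_trades >= min_n and ev_per_trade > ev_bar


class SetupLifecycle:
    def __init__(self, name):
        self.name = name
        self.stage = Stage.HYPOTHESIS
        self.dead = False

    def advance(self, gate_passed):
        """Attempt the next gate; stages can never be skipped."""
        if self.dead:
            raise DeadSetup(self.name)
        if self.stage is Stage.LIVE:
            return self.stage
        target = Stage(self.stage + 1)
        if gate_passed:
            self.stage = target
        elif target is Stage.OOS_PASSED:
            self.dead = True  # failing OOS is terminal: dead, no appeal
        return self.stage

    def demote(self):
        """Kill criterion tripped while LIVE: stop emitting orders."""
        if self.stage is Stage.LIVE:
            self.stage = Stage.SHADOW  # assumed demotion target
        return self.stage


lifecycle = SetupLifecycle("hiro_strike_stack_trend")
# n=16 is below the n>=30 floor, so the corpus gate fails regardless of EV.
passed_corpus = corpus_gate(n_trades=16, ev_per_trade=1.0, ev_bar=0.0)
lifecycle.advance(passed_corpus)
```

Because advance moves one stage at a time, there is no code path from CORPUS_ONLY to LIVE that skips the OOS gate.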

6. Nautilus conformance (MK3 framework rule)

Per CLAUDE.md, MK3 trading-code design must map to Nautilus primitives. The Setup Hunter maps cleanly - it is idiomatic, not a fight:

  • Strategy / Actor (nautilus-actors, nautilus-architecture): the Setup Hunter is a Strategy (or an Actor feeding a thin Strategy) that subscribes to multiple data + indicator streams and emits orders ONLY on confluence. “Default flat / abstain” is the natural Nautilus posture - a Strategy that simply does not call submit_order is a first-class, zero-friction state. The firehose’s missing “do nothing” is free here.
  • Custom data (nautilus-custom-data): IMPULSE conviction, Market Tide decomposition, regime label, dealer/gamma - each a registered custom data type published on the message bus. Setups subscribe to the ones they need.
  • Message bus (nautilus-architecture): confluence detection is a natural multi-subscription pattern - the Setup Hunter consumes N signal topics and evaluates setup conditions on each relevant tick.
  • Cache / positions (nautilus-cache, nautilus-accounting): position + P&L derived from OrderFilled events, by construction. This structurally eliminates the entire week’s bug class - the outcomes/partial_exits trap, the order_id-collision contamination, the reconciler wedges (#106), the canonical-column workaround (#108/#109). In MK3 there is no separate engine counter to drift; Portfolio IS the fills. The MK2 P&L-integrity firefighting does not port - it evaporates.
  • Backtesting (nautilus-backtesting): the OOS validation gate (§5 step 3) runs on Nautilus’s backtest engine against the 1Y bundle - same strategy code, historical data, no reimplementation. The validation pipeline is native, not bolted on.
  • Order types / position policy (nautilus-orders): the trailing + scale-out position_policy maps to Nautilus order contingencies; the MK2 hand-rolled trail (and its wedge bug) is replaced by framework order management.
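
The multi-subscription confluence pattern from the message-bus bullet can be illustrated framework-free. This is deliberately NOT Nautilus API: ConfluenceEvaluator, SetupSpec, and the on_signal / on_regime handlers are hypothetical stand-ins for whatever handlers the real Strategy or Actor registers, and the signal-to-threshold encoding is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SetupSpec:
    name: str
    conditions: dict    # topic -> minimum value (assumed encoding)
    regimes: frozenset  # regimes the setup is valid in


class ConfluenceEvaluator:
    """Keeps the latest value per topic; fires only on full confluence."""

    def __init__(self, specs):
        self.specs = list(specs)
        self.latest = {}    # topic -> most recent value seen
        self.regime = None  # most recent regime label

    def on_regime(self, label):
        self.regime = label

    def on_signal(self, topic, value):
        """Store the update, then return the names of setups that now hold.

        An empty list is the default-flat outcome: nothing to submit.
        Only setups that subscribe to this topic are re-evaluated.
        """
        self.latest[topic] = value
        return [
            spec.name
            for spec in self.specs
            if topic in spec.conditions and self._holds(spec)
        ]

    def _holds(self, spec):
        if self.regime not in spec.regimes:
            return False
        return all(
            t in self.latest and self.latest[t] >= floor
            for t, floor in spec.conditions.items()
        )


spec = SetupSpec(
    name="hiro_strike_stack_trend",
    conditions={"hiro": 1.0, "strike_stack": 1.0},  # placeholder thresholds
    regimes=frozenset({"TREND_UP"}),
)
evaluator = ConfluenceEvaluator([spec])
evaluator.on_regime("RANGE")
evaluator.on_signal("hiro", 1.0)
in_wrong_regime = evaluator.on_signal("strike_stack", 1.0)
evaluator.on_regime("TREND_UP")
in_valid_regime = evaluator.on_signal("hiro", 1.0)
```

The sketch ignores signal staleness; a real evaluator would also need to expire old values so a stale reading cannot complete a confluence.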

Pin policy still applies (feedback_nautilus_pin_policy): exact-pin nautilus_trader==1.x.y per milestone; 1.x.y is not semver.
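
The "Portfolio IS the fills" claim in the Cache / positions bullet amounts to saying that position and realized P&L are a pure fold over fill events, with no second counter to drift. The sketch below shows that fold in plain Python. It is not the Nautilus implementation: the Fill record, the long-only restriction, average-cost accounting, and the 100x contract multiplier are assumptions made for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fill:
    instrument: str
    side: str     # "BUY" or "SELL"
    qty: int
    price: float


def fold_fills(fills, multiplier=100):
    """Return {instrument: (net_qty, avg_cost, realized_pnl)} from fills alone."""
    book = {}
    for f in fills:
        if f.qty <= 0:
            raise ValueError("fill quantity must be positive")
        qty, avg, pnl = book.get(f.instrument, (0, 0.0, 0.0))
        if f.side == "BUY":
            # Average-cost basis across all open contracts.
            avg = (avg * qty + f.price * f.qty) / (qty + f.qty)
            qty += f.qty
        elif f.side == "SELL":
            if f.qty > qty:
                # Long-only sketch: a sell with no matching long is an error,
                # never a silently opened short.
                raise ValueError("sell exceeds long position")
            pnl += (f.price - avg) * f.qty * multiplier
            qty -= f.qty
            if qty == 0:
                avg = 0.0
        else:
            raise ValueError(f"unknown side {f.side!r}")
        book[f.instrument] = (qty, avg, pnl)
    return book


# Hypothetical scale-out: buy 4, sell 2 higher, sell the last 2 a bit lower.
fills = [
    Fill("EXAMPLE_CALL", "BUY", 4, 2.00),
    Fill("EXAMPLE_CALL", "SELL", 2, 2.50),
    Fill("EXAMPLE_CALL", "SELL", 2, 2.25),
]
book = fold_fills(fills)
```

Replaying the same fills always yields the same book, so there is no separate state to reconcile against.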

7. The honest worst case (the user must hear this)

The Setup Hunter is the right architecture. It does not guarantee an edge exists. The corpus is sobering: 22% TP-hit ceiling, IMPULSE ≈ baseline except one thin compound (n=16), composite barely predicts.

The intellectually honest possible outcome: very few setups survive OOS validation. The Setup Hunter might conclude the system should trade 2-3 times some days, zero on others. That is a success, not a failure. A system that trades 3x/day at genuine edge, or stays flat when there is none, strictly dominates a firehose that trades 35x into chop and loses $21k. The discipline to not trade includes the discipline to accept the data may say “almost never.” If the spike’s honest finding is “no robust setups exist in this signal stack,” that is a real, valuable, money-saving answer - and it redirects effort to new signal acquisition rather than re-tuning a dead one.

The Setup Hunter’s value is not that it finds winners. It is that it only trades validated edge and provably refuses everything else - including, if the data demands it, refusing almost everything.

8. What the MK3 spike must deliver (acceptance)

  1. Lead-lag study result: which signal(s) lead, by how much, regime-stable? (Determines legal setup anchors.)
  2. A richer regime classifier than the HMM (which uselessly labels ~97% “RANGE”). (Determines setup filters.)
  3. hiro+strike_stack run through the full §5 pipeline → OOS verdict.
  4. The Setup Hunter Strategy skeleton on Nautilus: subscribes to custom signal data, default-flat, hard daily cap, emits only on a registered validated setup, position state from fills.
  5. ≥1 setup at OOS_PASSED - or the honest finding that none clear OOS, with the redirect that implies.

9. Open questions the spike resolves

  • Does ANY signal lead price reliably and regime-stably? (Unknown. The user’s prior is Tide; live observation suggests IMPULSE; neither confirmed.)
  • Minimum confluence for a setup - 2 conditions? 3? (Empirical.)
  • Right daily cap - is “5-10” itself right, or does OOS say 2-3? (Empirical.)
  • Does the regime model need to be HMM-class, or simpler/different given the HMM’s RANGE-collapse failure? (Open.)

10. Put-side asymmetry - hard requirement (added 2026-05-18 from live exhibit)

Corpus, realized: CALL/BULL (n=217, 62%, +26/trade) vs PUT/BEAR (n=151, 57%, -102/trade). The put book has lost more than the call book has ever made. Full forensic: 2026-05-18-put-side-structural-loss-and-bypass-impulse.
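
The arithmetic behind that claim, assuming the per-trade figures are mean realized P&L in dollars (the source gives no unit) and that each side's total is simply n times its mean:

```python
# Corpus figures quoted above; the dollar unit is an assumption.
call_n, call_avg = 217, 26
put_n, put_avg = 151, -102

call_total = call_n * call_avg      # 217 * 26
put_total = put_n * put_avg         # 151 * -102
combined = call_total + put_total

print(f"call book: {call_total:+,}")
print(f"put book:  {put_total:+,}")
print(f"combined:  {combined:+,}")
```

Under those assumptions the call book sums to +5,642, the put book to -15,402, and the two together to -9,760: the put-side loss is roughly 2.7x the call-side gain.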

The SPY Hunter MUST treat puts and calls as different instruments:

  1. Payoff-geometry-first, not win rate. A setup qualifies only if its expected structure is asymmetric in our favor. The +4.7%-flush / -29%-ride pattern (worst on puts) must be structurally impossible.
  2. No exhaustion entries. No entry in the direction of a completed leg at a band/range extreme. (An exhaustion entry is what produced the 2026-05-18 -$6k loss on the 739P.)
  3. No impulse bypass override - every entry respects the meta gate. The 739P had meta 0.27; bypass=True took it anyway.
  4. Directional-flow agreement required - never buy puts into a call-heavier option stack.
  5. Degraded data → stand down. “slow tape fallback / insufficient flow / stack $0.0M” must force no-trade, not low-conviction pass-through that still sizes a position.
  6. Short side default-flat until OOS-proven. Given the corpus, the honest MK3 stance is: do not trade puts until a specific bearish setup passes OOS validation. Possibly a call-biased SPY Hunter v1. This is a feature of the selectivity thesis, not a gap.
  7. Broker-truth positions. 2026-05-18 ran ~23 min with two untracked broker positions while the dashboard showed flat. MK3 position + P&L derive from Nautilus OrderFilled events. An orphan must be structurally impossible, not alert-only.
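
Rules 3 through 6 are mechanical vetoes and can be sketched as one function. This is an illustration under assumptions: EntryContext, its field names, and the reason strings are hypothetical, and the meta floor is a required argument because the source does not state the gate's threshold (0.5 below is a placeholder). There is deliberately no bypass parameter, so rule 3 holds by construction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntryContext:
    side: str                # "CALL" or "PUT"
    meta_score: float        # meta gate score for this entry
    call_flow: float         # call-side weight of the option stack (assumed)
    put_flow: float          # put-side weight of the option stack (assumed)
    data_degraded: bool      # slow tape / insufficient flow / stack $0.0M
    setup_oos_passed: bool   # has this specific setup passed OOS?


def entry_veto(ctx, meta_floor):
    """Return the first veto reason, or None if the entry may proceed."""
    if ctx.data_degraded:
        return "degraded_data"        # rule 5: stand down, do not size down
    if ctx.meta_score < meta_floor:
        return "meta_gate"            # rule 3: no override exists
    if ctx.side == "PUT" and not ctx.setup_oos_passed:
        return "put_not_oos_proven"   # rule 6: short side default-flat
    if ctx.side == "PUT" and ctx.call_flow > ctx.put_flow:
        return "flow_disagreement"    # rule 4: no puts into call-heavy stack
    return None


# Meta score 0.27 is the figure quoted for the 739P; other fields are made up.
like_739p = EntryContext(
    side="PUT", meta_score=0.27, call_flow=1.0, put_flow=1.0,
    data_degraded=False, setup_oos_passed=False,
)
verdict = entry_veto(like_739p, meta_floor=0.5)  # 0.5 is a placeholder floor
```

With any floor above 0.27 the verdict is "meta_gate". Rules 1, 2, and 7 are structural and are not modeled here.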