How-To - Run a Backtest

Nautilus’s /how_to/run_rust_backtest/ page (HTTP 200 on 2026-05-07) is a runnable, dependency-pinned worked example for both backtest API levels in Rust: low-level BacktestEngine (build engine → add venue → add instruments → add data → add strategy → run) and high-level BacktestNode (write ParquetDataCatalog → assemble BacktestRunConfig from BacktestVenueConfig + BacktestDataConfig + BacktestEngineConfig → node.build() → fetch engine → add strategy → node.run()). The parallel /how_to/run_python_backtest/ URL 404s as of 2026-05-07 - the Python backtest “how-to” is not at the URL the spike Step 0 expected. The Python backtest path lives instead in (a) Nautilus Backtesting (concept page, deeper than a how-to), (b) the Tutorials index (“Quickstart” and “Backtest (High-Level)”), and (c) the EMA cross examples in examples/backtest/. Python and Rust APIs are structurally identical - same engine class, same config object names, same run loop - so this page documents the API surface as a single shape and gives Cortana a Python cortana_backtest.py template that mirrors the Rust how-to exactly. Filed during pre-spike concept mastery sweep batch 6.

This page complements nautilus-backtesting.md (concept-level deep-dive on BacktestEngine vs BacktestNode, ThreeTierFillModel recommendation, ts_init carryover #7), nautilus-data.md (ParquetDataCatalog schema, Bar/QuoteTick/TradeTick), nautilus-custom-data.md (@customdataclass for Cortana UWFlowAlert / ScoreUpdate), nautilus-databento.md (Step 0.5 OPRA ingest pipeline that feeds the catalog this page reads from), nautilus-instruments.md (the IBKR↔Databento InstrumentId re-stamp during ingest), nautilus-reports.md (post-run generate_*_report() calls + Brier/AUC custom statistic), nautilus-visualization.md (Plotly tearsheet), nautilus-rust.md (v1-Cython-only IBKR support, why Cortana stays Python), nautilus-howto-write-strategy.md (parallel - what to put inside add_strategy).

URL status (Spike Step 0)

URL | HTTP | Notes
https://nautilustrader.io/docs/latest/how_to/run_rust_backtest/ | 200 | Source of the worked example below.
https://nautilustrader.io/docs/latest/how_to/run_python_backtest/ | 404 | Does not exist as of 2026-05-07. The Python equivalent path is the Tutorials index + examples/backtest/.

Spike Step 0 implication: the spike plan’s reference to “the Python how-to” needs to be retargeted. Use either (a) the Rust how-to as the canonical structural reference and translate to Python (1:1 mapping - the API names match), (b) the Tutorials → Backtest (High-Level) page, or (c) examples/backtest/databento_quotes_market_maker.py / examples/backtest/example_01_quickstart.py as concrete templates.

Core claim

A Nautilus backtest is assembly + run + extract. Assembly comes in two flavors:

  1. Low-level (BacktestEngine) - imperative: instantiate engine, add_venue, add_instrument, add_data, add_strategy, run. Use when data fits in RAM, when you want to iterate strategies/parameters fast against the same dataset (call reset() between runs), and for the spike Step 6 (replay decisions.db).
  2. High-level (BacktestNode) - declarative: build one or more BacktestRunConfig objects, hand them to BacktestNode(configs=...), call node.build() then node.run(). Each run gets a fresh engine. Use for the Step 0.5 catalog path (ParquetDataCatalog populated by the Databento pull) and the M2 parallel MK2/MK3 harness.

Run is one method call. Extract is engine.trader.generate_*_report() plus create_tearsheet(engine, output_path=...). Backtest and live share the same Trader / Cache / MessageBus / Strategy surface - only the clock and the venue differ.

Section 1 - BacktestEngineConfig and BacktestRunConfig assembly

BacktestEngineConfig (engine-level defaults)

Engine-level configuration. Wraps the Cache, MessageBus, RiskEngine, ExecutionEngine, DataEngine, Logging, and Catalog options. Pass the default if you don’t need to override:

from nautilus_trader.config import BacktestEngineConfig
 
engine_config = BacktestEngineConfig(
    trader_id="CORTANA-MK3-001",
    log_level="INFO",
    # cache, msgbus, risk_engine, exec_engine, data_engine all default
)

Common overrides Cortana cares about:

  • trader_id - must be unique per run; informs client_order_id generation. The spike pattern: f"CORTANA-MK3-{date}-{strategy}".
  • cache.tick_capacity / bar_capacity - bounded ring buffers in the cache. Default 10_000 each is fine for one trading day.
  • risk_engine.max_order_submit - pre-trade rate limit. Cortana’s natural cadence (≤30 trades/day) is far below default; leave alone.
  • logging.log_level_file - write a file log per run for audit.
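
A minimal sketch of the trader_id pattern from the first bullet. The helper name make_trader_id, the compact YYYYMMDD date format, and the upper-cased strategy label are assumptions for illustration; only the CORTANA-MK3-{date}-{strategy} shape comes from this page.

```python
from datetime import date


def make_trader_id(run_date: date, strategy: str) -> str:
    """Build a per-run trader id shaped like CORTANA-MK3-{date}-{strategy}."""
    # Assumption: compact YYYYMMDD date and an upper-cased label without spaces.
    tag = strategy.strip().upper().replace(" ", "")
    return f"CORTANA-MK3-{run_date:%Y%m%d}-{tag}"


trader_id = make_trader_id(date(2026, 5, 6), "mk3v1")
print(trader_id)  # CORTANA-MK3-20260506-MK3V1
```

The resulting string can be passed as BacktestEngineConfig(trader_id=...) so that each run gets its own id.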

BacktestRunConfig (one run = one engine)

Used only with BacktestNode. Each BacktestRunConfig becomes a fresh BacktestEngine at node.build() time. The components:

from nautilus_trader.config import (
    BacktestRunConfig,
    BacktestVenueConfig,
    BacktestDataConfig,
    BacktestEngineConfig,
    ImportableStrategyConfig,
)
 
run_config = BacktestRunConfig(
    engine=BacktestEngineConfig(trader_id="CORTANA-MK3-V1"),
    venues=[venue_config],          # one or more
    data=[data_config_quotes,        # one per (data_cls, instrument_id)
          data_config_trades],
    strategies=[importable_strategy], # ImportableStrategyConfig list
    actors=[],                        # ImportableActorConfig list (optional)
    chunk_size=None,                  # None = full-load; int = stream chunks
    dispose_on_completion=True,       # free engine after run
    start=None,                       # ISO 8601 string or None for full range
    end=None,
)

ImportableStrategyConfig carries a fully-qualified class path plus a config dict; BacktestNode instantiates the strategy from this when the engine is built.

from nautilus_trader.config import ImportableStrategyConfig
 
importable_strategy = ImportableStrategyConfig(
    strategy_path="cortana.strategies.mk3:CortanaStrategy",
    config_path="cortana.strategies.mk3:CortanaStrategyConfig",
    config={
        "instrument_id": "SPY.ARCA",
        "score_threshold": 65,
        "max_position_size": 1,
        "tp_pct": 10,
        "sl_pct": 50,
    },
)

Section 2 - BacktestVenueConfig (the simulated venue)

Mirrors the Rust how-to’s SimulatedVenueConfig 1:1. Required fields in Cortana shape:

venue_config = BacktestVenueConfig(
    name="SMART",                      # IBKR SMART routing - matches live
    oms_type="NETTING",                # Cortana V1: single-position-per-instrument
    account_type="MARGIN",             # CASH for paper-equity-only
    base_currency="USD",
    starting_balances=["100_000 USD"], # paper-account balance
    book_type="L1_MBP",                # Cortana uses quotes/bars, not L2/L3
    fill_model=None,                   # see Section 5
    latency_model=None,                # default zero latency
    bar_execution=True,                # bars trigger matching
    bar_adaptive_high_low_ordering=True, # ~75-85% accuracy for H/L within bar
    trade_execution=True,              # trade ticks trigger matching
    queue_position=False,              # OFF for L1 + Cortana single contracts
    liquidity_consumption=False,       # OFF - we're tiny relative to depth
    price_protection_points=0,         # CME-style filter; OPRA-irrelevant
    reject_stop_orders=False,          # PM emits STOP_LIMIT for SL fallback
    support_gtd_orders=False,          # 0DTE - DAY only
    support_contingent_orders=True,    # for OCO TP+SL bracket
)

Notes on Cortana-specific defaults:

  • name="SMART" aligns with IBKR’s routing exchange so the same InstrumentId (e.g. SPY260509C00727000.SMART) flows through both backtest and live without a re-stamp.
  • oms_type="NETTING" matches IBKR’s account semantics. HEDGING doesn’t use snapshots (per nautilus-reports.md) - Cortana stays on NETTING so positions reports include reopen history.
  • bar_adaptive_high_low_ordering=True is critical when TP and SL fall inside the same bar (common at 1-minute Cortana cadence). The doc cites ~75-85% accuracy for the heuristic.
  • support_contingent_orders=True is required for the dual-TP defense-in-depth (LMT TP + STOP_MARKET SL contingent OCO).
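
To make the bar_adaptive_high_low_ordering note concrete, here is a toy illustration of the heuristic as the Nautilus docs describe it: when the open is nearer the high, the bar is walked Open → High → Low → Close, otherwise Open → Low → High → Close. This is not the engine's code. The function names, the tie rule, and the sample prices (a 5.00 entry with TP at +10% and SL at -50%) are assumptions.

```python
from __future__ import annotations


def adaptive_bar_path(open_: float, high: float, low: float, close: float) -> list[float]:
    """Order the four bar prices the way the adaptive heuristic is described."""
    # If the open sits nearer the high, assume the high printed first.
    # Assumption: an exact tie falls back to low-first in this sketch.
    if abs(high - open_) < abs(open_ - low):
        return [open_, high, low, close]
    return [open_, low, high, close]


def first_exit_long(path: list[float], tp: float, sl: float) -> str | None:
    """Return which exit a long position reaches first along the path, if any."""
    for price in path[1:]:  # skip the open; the entry is assumed already filled
        if price >= tp:
            return "TP"
        if price <= sl:
            return "SL"
    return None


# Both bars contain the TP (5.50) and the SL (2.50); only the open differs.
near_high = adaptive_bar_path(5.00, 5.60, 2.40, 5.10)
near_low = adaptive_bar_path(3.00, 5.60, 2.40, 5.10)
print(first_exit_long(near_high, tp=5.50, sl=2.50))  # TP
print(first_exit_long(near_low, tp=5.50, sl=2.50))   # SL
```

The same bar range resolves to opposite exits depending on where the open sits, which is why the flag matters when TP and SL share a bar.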

Section 3 - Instrument provider for backtest

Pre-loaded from catalog (the canonical path)

When data was written via DatabentoDataLoader (Step 0.5), the catalog already contains OptionContract (or Equity) instruments. BacktestNode auto-loads them at node.build() time:

from nautilus_trader.persistence.catalog import ParquetDataCatalog
 
catalog = ParquetDataCatalog.from_env()  # NAUTILUS_PATH
print(f"Catalog has {len(catalog.instruments())} instruments")
# Cortana spike Step 0.5: SPY equity + ~600 SPY 0DTE options

The BacktestDataConfig.instrument_ids field optionally restricts which instruments stream during the run. Leaving it None streams all matching instruments - fine for spike Step 6 but for M2 production, list explicitly so the run doesn’t accidentally pull a different chain.

Hand-built (low-level path, when no catalog)

For Step 6 replay of decisions.db, where there’s no Databento catalog, instantiate the SPY equity instrument directly:

from decimal import Decimal  # needed for the margin/fee fields below
from nautilus_trader.model.instruments import Equity
from nautilus_trader.model.identifiers import InstrumentId, Symbol, Venue
from nautilus_trader.model.objects import Price, Quantity, Currency
from nautilus_trader.model.enums import AssetClass
 
spy = Equity(
    instrument_id=InstrumentId(Symbol("SPY"), Venue("ARCA")),
    raw_symbol=Symbol("SPY"),
    asset_class=AssetClass.EQUITY,
    currency=Currency.from_str("USD"),
    price_precision=2,
    price_increment=Price.from_str("0.01"),
    multiplier=Quantity.from_int(1),
    lot_size=Quantity.from_int(1),
    isin=None,
    margin_init=Decimal("0"),
    margin_maint=Decimal("0"),
    maker_fee=Decimal("0"),
    taker_fee=Decimal("0"),
    ts_event=0,
    ts_init=0,
)
engine.add_instrument(spy)

Section 4 - Data loading: catalog + DataLoader + DataWrangler

Three loading patterns, from most processed to most raw:

(a) ParquetDataCatalog query (the production path)

from nautilus_trader.persistence.catalog import ParquetDataCatalog
from nautilus_trader.model import QuoteTick, TradeTick
 
catalog = ParquetDataCatalog.from_env()
quotes = catalog.query(
    data_cls=QuoteTick,
    instrument_ids=["SPY.ARCA"],
    start="2026-05-06T13:30:00Z",
    end="2026-05-06T20:00:00Z",
)
# returns sorted list - ready to feed engine.add_data

(b) DatabentoDataLoader.from_dbn_file (Step 0.5 ingest)

Convert raw DBN to Nautilus types once, write catalog, query forever. Per nautilus-databento.md. Always step-1 = DEFINITION, step-2 = market data. as_legacy_cython=False is faster + supports PyO3-only types (IMBALANCE, STATISTICS).

(c) Hand-built via DataWrangler (Step 6 replay path)

For Cortana’s Step 6 (1-hour spike target: replay decisions.db → Nautilus → backtest), wrap each scoring event as a custom Data subclass and feed the engine directly:

import sqlite3
from nautilus_trader.model.custom import customdataclass
from nautilus_trader.core import Data
from nautilus_trader.model import InstrumentId
 
@customdataclass
class UWFlowAlert(Data):
    instrument_id: InstrumentId = InstrumentId.from_str("SPY.ARCA")
    composite_score: float = 0.0
    bias: str = ""
    conviction: str = ""
    flow_premium: float = 0.0
    spy_price_at_score: float = 0.0
 
def load_decisions(db_path: str) -> list[UWFlowAlert]:
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT ts_event, composite_score, bias, conviction, "
        "flow_premium, spy_price_at_score "
        "FROM scoring_events WHERE date(ts_event/1e9, 'unixepoch') = ? "
        "ORDER BY ts_event",
        ("2026-05-06",),
    ).fetchall()
    return [
        UWFlowAlert(
            ts_event=int(r[0]),       # ns since epoch (Cortana stores ms*1e6)
            ts_init=int(r[0]),
            composite_score=float(r[1]),
            bias=str(r[2]),
            conviction=str(r[3]),
            flow_premium=float(r[4] or 0.0),
            spy_price_at_score=float(r[5]),
        )
        for r in rows
    ]

Pass sort=False per call, then one engine.sort_data() at the end - avoids the O(N²) re-sort trap from nautilus-backtesting.md.
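
The sort-once pattern can be shown with a toy stand-in. ToyEngine below is a hypothetical class that only mimics the add_data(sort=...) / sort_data() shape named above; it is not the Nautilus engine, and its sort counter exists only to make the saved work visible.

```python
from operator import attrgetter
from typing import NamedTuple


class Event(NamedTuple):
    ts_init: int
    label: str


class ToyEngine:
    """Hypothetical stand-in for the add_data / sort_data pattern."""

    def __init__(self) -> None:
        self.data: list[Event] = []
        self.sort_calls = 0

    def add_data(self, batch: list[Event], sort: bool = True) -> None:
        self.data.extend(batch)
        if sort:
            self.sort_data()  # re-sorts everything loaded so far

    def sort_data(self) -> None:
        self.data.sort(key=attrgetter("ts_init"))  # list.sort is stable
        self.sort_calls += 1


batches = [
    [Event(3, "c"), Event(1, "a")],
    [Event(2, "b")],
    [Event(5, "e"), Event(4, "d")],
]

eager = ToyEngine()
for batch in batches:
    eager.add_data(batch)  # one full re-sort per batch

deferred = ToyEngine()
for batch in batches:
    deferred.add_data(batch, sort=False)
deferred.sort_data()  # a single sort at the end

print(eager.sort_calls, deferred.sort_calls)  # 3 1
print([e.label for e in deferred.data])       # ['a', 'b', 'c', 'd', 'e']
```

Both engines end with identical ordering; the deferred one simply sorts once instead of once per batch.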

Section 5 - Fill model + commission model selection

Per nautilus-backtesting.md recommendation, ThreeTierFillModel (50/30/20 across three ticks) is the spike default - pessimistic enough to surface MK3-vs-MK2 differences that aren’t artifacts of optimistic fills, while not so conservative that nothing fills.

from nautilus_trader.backtest.models import ThreeTierFillModel, FillModel
 
fill_model = ThreeTierFillModel(
    prob_fill_on_limit=0.5,   # middle of queue - Cortana isn't a market maker
    prob_slippage=0.10,       # 10% of fills slip one tick (L1 path)
    random_seed=42,           # reproducibility - same-process only (per docs)
)
 
# OR: simpler base model for first-pass spike validation
fill_model = FillModel(
    prob_fill_on_limit=1.0,   # always fills when touched
    prob_slippage=0.0,        # no slippage
    random_seed=42,
)

Fee model

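For options on IBKR, the per-contract commission is roughly $0.65/contract (no exchange/regulatory fee passthrough modeled). FixedFeeModel charges a flat amount per order, so it matches the per-contract rate only while Cortana trades one contract per order (max_position_size=1).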

from nautilus_trader.model.objects import Money, Currency
from nautilus_trader.backtest.models import FixedFeeModel
 
fee_model = FixedFeeModel(
    commission=Money(0.65, Currency.from_str("USD")),
    charge_commission_once=True,  # charge once per order, not on every partial fill
)
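
As a rough sense check on what the $0.65 figure does to results, the sketch below works out round-trip commission and its drag relative to premium paid. The 100x contract multiplier and the $1.50 premium are assumptions for illustration, not values from the spike data.

```python
from decimal import Decimal

PER_CONTRACT = Decimal("0.65")  # per-contract commission quoted above
MULTIPLIER = 100                # assumption: standard US equity option multiplier


def round_trip_commission(contracts: int) -> Decimal:
    """Commission for opening and closing the same number of contracts."""
    return PER_CONTRACT * contracts * 2


def commission_drag_pct(premium: Decimal, contracts: int = 1) -> Decimal:
    """Round-trip commission as a percentage of premium paid at entry."""
    notional = premium * MULTIPLIER * contracts
    pct = round_trip_commission(contracts) / notional * 100
    return pct.quantize(Decimal("0.01"))


print(round_trip_commission(1))              # 1.30
print(commission_drag_pct(Decimal("1.50")))  # 0.87
```

Under these assumptions a single-contract round trip costs $1.30, or about 0.87 percentage points of a 10% take-profit target.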

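Wire into the venue config. BacktestVenueConfig is a serializable config object, so depending on the installed version it may expect ImportableFillModelConfig / ImportableFeeModelConfig wrappers rather than model instances; verify before relying on this shape: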

venue_config = BacktestVenueConfig(
    ...,
    fill_model=fill_model,
    fee_model=fee_model,
)

For the low-level path, pass on engine.add_venue(...):

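from nautilus_trader.model.enums import AccountType, BookType, OmsType
from nautilus_trader.model.identifiers import Venue
from nautilus_trader.model.objects import Currency, Money
 
engine.add_venue(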
    venue=Venue("SMART"),
    oms_type=OmsType.NETTING,
    account_type=AccountType.MARGIN,
    base_currency=Currency.from_str("USD"),
    starting_balances=[Money(100_000, Currency.from_str("USD"))],
    book_type=BookType.L1_MBP,
    fill_model=fill_model,
    fee_model=fee_model,
    bar_adaptive_high_low_ordering=True,
)

Section 6 - Strategy registration

Two paths, one for each API level.

Low-level: direct instantiation

strategy = CortanaStrategy(config=CortanaStrategyConfig(
    instrument_id=InstrumentId.from_str("SPY.ARCA"),
    score_threshold=65,
    max_position_size=1,
    tp_pct=10,
    sl_pct=50,
))
engine.add_strategy(strategy)

After engine.reset(), add_strategy must be called again - strategies are removed across resets (per nautilus-backtesting.md).

High-level: ImportableStrategyConfig

Already shown in Section 1. The point: the BacktestNode instantiates the strategy from the importable spec at build() time, so the run config can be persisted to JSON/YAML and re-run later without code in the loop.

Section 7 - Run loop

Low-level

engine.run()                       # full single-shot
 
# OR streaming
for batch in data_batches:
    engine.add_data(batch)
    engine.run(streaming=True)
    engine.clear_data()
engine.end()                       # flush deferred timers
 
# OR parameter sweep
for params in param_grid:
    engine.reset()                 # zero accounts/orders/strats
    engine.add_strategy(make_strategy(params))
    engine.run()
    save_results(params, engine.trader.generate_positions_report())

High-level

node = BacktestNode(configs=run_configs)
node.build()
results = node.run()       # list[BacktestResult] - one per RunConfig

BacktestResult carries summary stats; the underlying engine is accessible via node.get_engine(run_config_id) for full report extraction before dispose_on_completion triggers.

Section 8 - Result extraction

After engine.run() (low-level) or node.run() (high-level), reports come from the trader’s helper methods (per nautilus-reports.md):

trader = engine.trader  # or node.get_engine(run_id).trader
 
orders_df    = trader.generate_orders_report()
fills_df     = trader.generate_fills_report()
order_fills  = trader.generate_order_fills_report()
positions_df = trader.generate_positions_report()
account_df   = trader.generate_account_report(Venue("SMART"))
 
# Performance stats
analyzer = engine.portfolio.analyzer
stats_general = analyzer.get_performance_stats_general()
print(f"Win Rate: {stats_general.get('Win Rate'):.2%}")

Always use the Trader helper, not ReportProvider directly - the Trader path auto-includes position_snapshots for NETTING OMS, which the direct path does not (silent PnL undercount).

Section 9 - Tearsheet generation

from nautilus_trader.analysis import create_tearsheet
 
create_tearsheet(
    engine,
    output_path=f"runs/{run_id}/tearsheet.html",
    title=f"Cortana MK3 - {date}",
    benchmark=None,           # SPY equity returns optional overlay
)

Install: uv pip install "nautilus_trader[visualization]". Output is a self-contained HTML file with equity curve, drawdown, monthly heatmap, returns distribution, stats table. Brier and AUC are not in the built-in stats - see nautilus-reports.md for the post-hoc Parquet pipeline pattern.

Section 10 - Multi-run / parallel patterns

Two configs in one node (sequential, one process)

mk2_config = BacktestRunConfig(...)  # MK2-equivalent strategy
mk3_config = BacktestRunConfig(...)  # MK3 candidate strategy
 
node = BacktestNode(configs=[mk2_config, mk3_config])
node.build()
results = node.run()
# results[0] is MK2; results[1] is MK3

Two processes (true parallel - what M2 actually wants)

Spawn two Python processes via multiprocessing.Pool or concurrent.futures.ProcessPoolExecutor, each running one BacktestNode against the same ParquetDataCatalog. The catalog is read-only at run time so concurrent reads are safe. Results land in disjoint runs/{run_id}/ directories; M2 diff harness reads both.

Parameter sweep (low-level, one process, fastest)

results = []
for params in param_grid:
    engine.reset()
    engine.add_strategy(CortanaStrategy(config=make_config(params)))
    engine.run()
    results.append({
        "params": params,
        "win_rate": engine.portfolio.analyzer
                    .get_performance_stats_general()
                    .get("Win Rate"),
        "pnl": engine.portfolio.analyzer
               .get_performance_stats_pnls()
               .get("PnL (total)"),
    })

Use this for Cortana score-threshold / TP/SL grid search post-spike.
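
The sweep loops on this page iterate over a param_grid that is never defined. One way to build it, assuming a plain list of dicts is what make_config consumes, is itertools.product over per-parameter value lists. The helper name build_param_grid and every value other than 65 / 10 / 50 are hypothetical.

```python
from itertools import product


def build_param_grid(axes: dict[str, list]) -> list[dict]:
    """Expand per-parameter value lists into one dict per combination."""
    names = list(axes)
    return [
        dict(zip(names, combo))
        for combo in product(*(axes[name] for name in names))
    ]


param_grid = build_param_grid({
    "score_threshold": [60, 65, 70],
    "tp_pct": [10, 15],
    "sl_pct": [40, 50],
})

print(len(param_grid))  # 12
print(param_grid[0])    # {'score_threshold': 60, 'tp_pct': 10, 'sl_pct': 40}
```

The grid grows multiplicatively, so three axes of this size already mean twelve reset-and-run cycles.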

Cortana-applicable example: complete cortana_backtest.py

End-to-end script that reads from a ParquetDataCatalog (or stubs in-memory for Step 6), runs CortanaStrategy, dumps reports to Parquet, and writes a tearsheet. Drop-in for spike Step 6 and M2.

"""cortana_backtest.py - Nautilus backtest harness for Cortana MK3.
 
Two modes:
  --mode catalog   Use ~/cortana-data/catalog (Databento Step 0.5 path)
  --mode decisions Use data/decisions.db (Step 6 replay path)
 
Outputs:
  runs/{run_id}/orders.parquet
  runs/{run_id}/fills.parquet
  runs/{run_id}/positions.parquet
  runs/{run_id}/tearsheet.html
  runs/{run_id}/stats.json
"""
 
from __future__ import annotations
 
import argparse
import json
import sqlite3
from datetime import datetime, timezone
from decimal import Decimal
from pathlib import Path
 
import pandas as pd
 
from nautilus_trader.analysis import create_tearsheet
from nautilus_trader.backtest.engine import BacktestEngine
from nautilus_trader.backtest.models import FillModel, FixedFeeModel
from nautilus_trader.backtest.node import BacktestNode
from nautilus_trader.config import (
    BacktestDataConfig,
    BacktestEngineConfig,
    BacktestRunConfig,
    BacktestVenueConfig,
    ImportableStrategyConfig,
)
from nautilus_trader.core import Data
from nautilus_trader.model import QuoteTick, TradeTick
from nautilus_trader.model.custom import customdataclass
from nautilus_trader.model.enums import AccountType, BookType, OmsType
from nautilus_trader.model.identifiers import InstrumentId, Symbol, Venue
from nautilus_trader.model.objects import Currency, Money, Price, Quantity
from nautilus_trader.persistence.catalog import ParquetDataCatalog
 
 
# ---------------------------------------------------------------------------
# Custom data - Cortana scoring events
# ---------------------------------------------------------------------------
 
@customdataclass
class UWFlowAlert(Data):
    instrument_id: InstrumentId = InstrumentId.from_str("SPY.ARCA")
    composite_score: float = 0.0
    bias: str = ""
    conviction: str = ""
    flow_premium: float = 0.0
    spy_price_at_score: float = 0.0
 
 
def load_decisions(db_path: Path, date_str: str) -> list[UWFlowAlert]:
    """Step 6 path: replay decisions.db scoring events."""
    conn = sqlite3.connect(str(db_path))
    rows = conn.execute(
        "SELECT ts_event, composite_score, bias, conviction, "
        "       flow_premium, spy_price_at_score "
        "FROM scoring_events "
        "WHERE date(ts_event/1e9, 'unixepoch') = ? "
        "ORDER BY ts_event",
        (date_str,),
    ).fetchall()
    return [
        UWFlowAlert(
            ts_event=int(r[0]),
            ts_init=int(r[0]),
            composite_score=float(r[1]),
            bias=str(r[2]),
            conviction=str(r[3]),
            flow_premium=float(r[4] or 0.0),
            spy_price_at_score=float(r[5]),
        )
        for r in rows
    ]
 
 
# ---------------------------------------------------------------------------
# Mode A: low-level engine driving decisions.db replay (Step 6, ~1 hr target)
# ---------------------------------------------------------------------------
 
def run_decisions_mode(date_str: str, run_id: str) -> BacktestEngine:
    engine = BacktestEngine(
        config=BacktestEngineConfig(
            trader_id=f"CORTANA-MK3-{run_id}",
            log_level="INFO",
        ),
    )
 
    spy = build_spy_equity()
    engine.add_instrument(spy)
 
    engine.add_venue(
        venue=Venue("SMART"),
        oms_type=OmsType.NETTING,
        account_type=AccountType.MARGIN,
        base_currency=Currency.from_str("USD"),
        starting_balances=[Money(100_000, Currency.from_str("USD"))],
        book_type=BookType.L1_MBP,
        fill_model=FillModel(
            prob_fill_on_limit=1.0,
            prob_slippage=0.0,
            random_seed=42,
        ),
        fee_model=FixedFeeModel(
            commission=Money(0.65, Currency.from_str("USD")),
            charge_commission_once=True,
        ),
        bar_adaptive_high_low_ordering=True,
    )
 
    db_path = Path("~/conductor/workspaces/cortanaroi-mk2/"
                   "belo-horizonte/data/decisions.db").expanduser()
    alerts = load_decisions(db_path, date_str)
    engine.add_data(alerts, sort=False)
    engine.sort_data()
 
    from cortana.strategies.mk3 import CortanaStrategy, CortanaStrategyConfig
    engine.add_strategy(CortanaStrategy(config=CortanaStrategyConfig(
        instrument_id=spy.id,
        score_threshold=65,
        max_position_size=1,
        tp_pct=10,
        sl_pct=50,
    )))
 
    engine.run()
    return engine
 
 
# ---------------------------------------------------------------------------
# Mode B: high-level node driving ParquetDataCatalog (Step 0.5 + M2 path)
# ---------------------------------------------------------------------------
 
def run_catalog_mode(date_str: str, run_id: str):
    catalog_path = str(Path("~/cortana-data/catalog").expanduser())  # same path as the module docstring
 
    venue_config = BacktestVenueConfig(
        name="SMART",
        oms_type="NETTING",
        account_type="MARGIN",
        base_currency="USD",
        starting_balances=["100_000 USD"],
        book_type="L1_MBP",
        bar_adaptive_high_low_ordering=True,
    )
 
    quotes_config = BacktestDataConfig(
        catalog_path=catalog_path,
        data_cls=QuoteTick,
        instrument_ids=["SPY.ARCA"],
        start_time=f"{date_str}T13:30:00Z",
        end_time=f"{date_str}T20:00:00Z",
    )
    trades_config = BacktestDataConfig(
        catalog_path=catalog_path,
        data_cls=TradeTick,
        instrument_ids=None,           # all SPY OPRA contracts
        start_time=f"{date_str}T13:30:00Z",
        end_time=f"{date_str}T20:00:00Z",
    )
 
    strategy_config = ImportableStrategyConfig(
        strategy_path="cortana.strategies.mk3:CortanaStrategy",
        config_path="cortana.strategies.mk3:CortanaStrategyConfig",
        config={
            "instrument_id": "SPY.ARCA",
            "score_threshold": 65,
            "max_position_size": 1,
            "tp_pct": 10,
            "sl_pct": 50,
        },
    )
 
    run_config = BacktestRunConfig(
        engine=BacktestEngineConfig(trader_id=f"CORTANA-MK3-{run_id}"),
        venues=[venue_config],
        data=[quotes_config, trades_config],
        strategies=[strategy_config],
        chunk_size=10_000,         # streaming
        dispose_on_completion=False,  # we want the engine for reports
    )
 
    node = BacktestNode(configs=[run_config])
    node.build()
    node.run()
    return node.get_engine(run_config.id)
 
 
# ---------------------------------------------------------------------------
# Result extraction + tearsheet
# ---------------------------------------------------------------------------
 
def dump_results(engine: BacktestEngine, run_id: str) -> None:
    out = Path(f"runs/{run_id}")
    out.mkdir(parents=True, exist_ok=True)
 
    trader = engine.trader
 
    trader.generate_orders_report().to_parquet(out / "orders.parquet")
    trader.generate_fills_report().to_parquet(out / "fills.parquet")
    trader.generate_positions_report().to_parquet(out / "positions.parquet")
 
    analyzer = engine.portfolio.analyzer
    stats = {
        **analyzer.get_performance_stats_general(),
        **analyzer.get_performance_stats_returns(),
        **analyzer.get_performance_stats_pnls(),
    }
    (out / "stats.json").write_text(
        json.dumps({k: str(v) for k, v in stats.items()}, indent=2)
    )
 
    create_tearsheet(
        engine,
        output_path=str(out / "tearsheet.html"),
        title=f"Cortana MK3 - {run_id}",
    )
 
    print(f"Wrote runs/{run_id}/ - Win Rate: {stats.get('Win Rate')}")
 
 
def build_spy_equity():
    from nautilus_trader.model.enums import AssetClass
    from nautilus_trader.model.instruments import Equity
    return Equity(
        instrument_id=InstrumentId(Symbol("SPY"), Venue("ARCA")),
        raw_symbol=Symbol("SPY"),
        asset_class=AssetClass.EQUITY,
        currency=Currency.from_str("USD"),
        price_precision=2,
        price_increment=Price.from_str("0.01"),
        multiplier=Quantity.from_int(1),
        lot_size=Quantity.from_int(1),
        isin=None,
        margin_init=Decimal("0"),
        margin_maint=Decimal("0"),
        maker_fee=Decimal("0"),
        taker_fee=Decimal("0"),
        ts_event=0,
        ts_init=0,
    )
 
 
def main() -> None:
    p = argparse.ArgumentParser()
    p.add_argument("--mode", choices=["catalog", "decisions"], required=True)
    p.add_argument("--date", required=True, help="YYYY-MM-DD")
    args = p.parse_args()
 
    run_id = f"{args.date}-{args.mode}-{datetime.now(timezone.utc):%H%M%S}"
 
    if args.mode == "catalog":
        engine = run_catalog_mode(args.date, run_id)
    else:
        engine = run_decisions_mode(args.date, run_id)
 
    dump_results(engine, run_id)
 
 
if __name__ == "__main__":
    main()

Run:

# Step 6 (decisions.db replay, ~1 hour spike target)
python cortana_backtest.py --mode decisions --date 2026-05-06
 
# Step 0.5 + M2 (Databento catalog path)
python cortana_backtest.py --mode catalog --date 2026-05-06

Cortana MK3 implications

Spike Step 6 path: decisions.db → DataLoader → custom UWFlowAlert events → BacktestEngine

Use Mode A (--mode decisions: low-level BacktestEngine + UWFlowAlert custom data). Concrete shape per nautilus-backtesting.md:

  1. Load today’s 15 scoring events from SQLite.
  2. Wrap each as UWFlowAlert(@customdataclass) with ts_init = scoring event timestamp (ns). Multiply Cortana’s millisecond timestamps by 1e6 before assignment.
  3. engine.add_data(alerts, sort=False); one sort_data() after all loads.
  4. engine.add_venue(venue=Venue("SMART"), book_type=BookType.L1_MBP, ...) with simple FillModel (no slippage) for first pass.
  5. engine.add_strategy(MK2EquivalentStrategy()); run(); capture decisions. engine.reset(). Re-add MK3CandidateStrategy(); run(); capture decisions. Diff.
  6. Pass criterion: ≥60% decision parity with MK2 actuals.
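
Step 2's millisecond-to-nanosecond conversion is worth doing in integer arithmetic. A minimal sketch, assuming the stored value really is epoch milliseconds (if the column is already in nanoseconds, as the ts_event/1e9 in the SQL above implies, no conversion is needed). The helper name ms_to_ns is hypothetical.

```python
NS_PER_MS = 1_000_000


def ms_to_ns(ts_ms: int) -> int:
    """Convert epoch milliseconds to epoch nanoseconds without using floats."""
    return int(ts_ms) * NS_PER_MS


ts_ms = 1_778_074_200_123  # 2026-05-06T13:30:00.123Z in epoch milliseconds

exact = ms_to_ns(ts_ms)
via_float = int(ts_ms * 1e6)  # float multiply: the product exceeds 2**53

print(exact)               # 1778074200123000000
print(via_float == exact)  # False
```

Multiplying by the float 1e6 rounds to the nearest representable double at this magnitude, so the float path can shift ts_init by a few hundred nanoseconds and manufacture or hide timestamp ties.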

Step 0.5 path: ParquetDataCatalog (OPRA Trades + MBP-1) → BacktestEngine

Use Mode B (--mode catalog: high-level BacktestNode + BacktestRunConfig). Per nautilus-databento.md the catalog is already populated by Saturday morning: SPY OPRA DEFINITION + TRADES + MBP-1 for one date. BacktestDataConfig with data_cls=QuoteTick and data_cls=TradeTick covers both. chunk_size enables streaming - for OPRA’s volume, set chunk_size=10_000 and let the node manage memory.

M2 parallel MK2/MK3 harness

Two BacktestRunConfig objects pointing at the same ParquetDataCatalog, different ImportableStrategyConfig. Either:

  • Same node, sequential: BacktestNode(configs=[mk2, mk3]).run(). Runs back-to-back in one process. Simpler.
  • Two processes, parallel: multiprocessing.Pool(2) running two BacktestNode(configs=[one_config]) calls. True parallel; each reads the catalog independently. Faster wall clock for daily M2 rollup. Requires the catalog be on local SSD (not network FS) for read concurrency.

Decision-diff harness: read runs/{mk2_id}/orders.parquet and runs/{mk3_id}/orders.parquet, inner-join on (ts_init, instrument_id), compute disagreement rate. M2 success: <5%/day.
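
A stdlib-only sketch of the disagreement-rate step, assuming each run's orders have already been reduced to (ts_init, instrument_id) keys. The metric definition here (keys present in only one run, divided by keys present in either) is an assumption; the page does not define it. The matched keys are the inner join named above, and the unmatched ones are what the rate counts.

```python
from typing import Iterable

Key = tuple[int, str]  # (ts_init, instrument_id)


def disagreement_rate(mk2_keys: Iterable[Key], mk3_keys: Iterable[Key]) -> float:
    """Share of decision keys that appear in only one of the two runs."""
    mk2, mk3 = set(mk2_keys), set(mk3_keys)
    union = mk2 | mk3
    if not union:
        return 0.0  # no decisions on either side counts as full agreement
    return len(mk2 ^ mk3) / len(union)


# Hypothetical keys: three shared decisions, one unique to each run.
mk2_orders = [(1, "SPY.ARCA"), (2, "SPY.ARCA"), (3, "SPY.ARCA"), (4, "SPY.ARCA")]
mk3_orders = [(1, "SPY.ARCA"), (2, "SPY.ARCA"), (3, "SPY.ARCA"), (5, "SPY.ARCA")]

rate = disagreement_rate(mk2_orders, mk3_orders)
print(rate)         # 0.4
print(rate < 0.05)  # False
```

In the real harness the keys would come from the two orders.parquet files; exact ts_init equality is a strict matching rule, so a tolerance window may be needed if the two strategies act a few nanoseconds apart.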

InstrumentId re-stamp during ingest (per nautilus-instruments.md)

Databento returns equity instruments with the listing-MIC venue (SPY.ARCX or similar). IBKR returns SPY.ARCA. Cortana’s strategy hardcodes SPY.ARCA. Solution: during the DatabentoDataLoader → catalog.write_data step in Step 0.5, re-stamp the InstrumentId.venue to ARCA before the write so backtest and live see identical strings. Verify the loader supports this on Saturday; if not, file a translation layer in the spike handoff.

For options: IBKR uses SPY260509C00727000.SMART, Databento likely uses the per-exchange MIC (.OPRA, .XCBO, etc.). Same pattern - re-stamp venue to .SMART during catalog write so cache.instrument(InstrumentId.from_str("SPY260509C00727000.SMART")) resolves both in backtest and live without strategy-side branching.
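
At the string level the re-stamp is just a swap of the suffix after the last dot. The sketch below shows that mapping only; restamp_venue is a hypothetical helper, and a real ingest step would still have to rebuild the Nautilus instrument and data objects with the new InstrumentId, which is the open question for Step 0.5.

```python
def restamp_venue(instrument_id: str, new_venue: str) -> str:
    """Replace the venue suffix (text after the last '.') of an id string."""
    symbol, sep, _old_venue = instrument_id.rpartition(".")
    if not sep or not symbol:
        raise ValueError(f"not a SYMBOL.VENUE id: {instrument_id!r}")
    return f"{symbol}.{new_venue}"


print(restamp_venue("SPY.ARCX", "ARCA"))                  # SPY.ARCA
print(restamp_venue("SPY260509C00727000.OPRA", "SMART"))  # SPY260509C00727000.SMART
```

Splitting on the last dot keeps any dots inside the symbol part intact.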

ts_init nanosecond-tie carryover (#7) - does this doc resolve it?

No. The Rust how-to (the only how-to that loads, per the URL status check) does not address ts_init nanosecond-tie ordering. It just calls engine.add_data(quotes, None, true, true) (where the booleans are auto_sort and validate) and moves on. The semantics of two events at the same ts_init are not specified at the how-to level. Per nautilus-backtesting.md:

  • Doc says data is sorted “into monotonic order based on ts_init.”
  • time_bars_build_delay (microseconds) addresses the bar-vs-tick edge case at bar-close timestamps.
  • The settle loop guarantees commands at T finish before T+1.
  • But what happens when two UWFlowAlert events share a nanosecond ts_init? Python’s sorted() is stable so insertion order wins, but the doc doesn’t make this an explicit guarantee. The Rust path may behave differently (Rust’s stable-sort guarantee is explicit: slice::sort_by_key is stable).

Mitigation in cortana_backtest.py above: the SQL ORDER BY ts_event provides stable insertion order; sort=False then one sort_data() preserves it through the engine’s stable sort. For nanosecond collisions, pre-sort by (ts_event, source_priority, event_id) at load time. Treat #7 as still open; file a follow-up (“Document Nautilus tie-breaking semantics for ts_init collisions across Data subtypes”).
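
The pre-sort mitigation can be sketched with a composite sort key. RawEvent, its source_priority and event_id fields, and the sample values are hypothetical; the point is only that a (ts_event, source_priority, event_id) key makes the order of same-nanosecond events explicit before they reach the engine.

```python
from typing import NamedTuple


class RawEvent(NamedTuple):
    ts_event: int         # ns since epoch
    source_priority: int  # assumption: lower value is processed first
    event_id: int         # assumption: monotonically increasing row id
    label: str


def presort(events: list[RawEvent]) -> list[RawEvent]:
    """Order events deterministically, breaking timestamp ties explicitly."""
    return sorted(events, key=lambda e: (e.ts_event, e.source_priority, e.event_id))


events = [
    RawEvent(100, 1, 7, "score-late"),
    RawEvent(100, 0, 9, "flow"),
    RawEvent(100, 1, 3, "score-early"),
    RawEvent(90, 5, 1, "earlier"),
]

ordered = [e.label for e in presort(events)]
print(ordered)  # ['earlier', 'flow', 'score-early', 'score-late']
```

Because the key is total over the three fields, the result no longer depends on insertion order, so it holds even if a downstream sort keyed only on ts_init is stable.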

Open questions

  1. Python how-to URL. Does it exist at a different path? Search the Tutorials index for “Backtest (High-Level)” / “Backtest (Low-Level)” - those may be the Python equivalents. (Action: add to spike Step 0 follow-up - verify on Saturday.)
  2. BacktestDataConfig.data_cls=UWFlowAlert (custom data). Does the high-level path support custom-data classes through the catalog? nautilus-custom-data.md says yes via @customdataclass + register_serializable_type, but the how-to doesn’t show it. Action: smoke test on Saturday.
  3. BacktestNode.get_engine lifecycle. Does the engine stay accessible after node.run() if dispose_on_completion=False? The how-to uses get_engine_mut before run(); verify the read-only post-run path works for report extraction.
  4. Tearsheet input. Does create_tearsheet(engine, ...) accept the engine returned by node.get_engine(...), or only the low-level BacktestEngine handle? Tested on nautilus-visualization.md; not stated in the how-to.
  5. InstrumentId re-stamp during DatabentoDataLoader. Is there an explicit kwarg, or do we need a post-load mutate-then-rewrite step? Action: Saturday spike Step 0.5.

See Also

  • Nautilus Backtesting - concept-level deep-dive on BacktestEngine vs BacktestNode, fill model family, ts_init carryover #7, deterministic replay guarantees.
  • Nautilus Rust - IBKR-only-on-v1-Cython context for why Cortana stays Python end-to-end; PyO3 boundary cost.
  • Nautilus Data - ParquetDataCatalog, QuoteTick, TradeTick, Bar, instrument lifecycle.
  • Nautilus Custom Data - @customdataclass for UWFlowAlert / ScoreUpdate / MetaProb in the spike Step 6 replay.
  • Nautilus Databento - Step 0.5 OPRA ingest pipeline that populates the ParquetDataCatalog this page reads.
  • Nautilus Instruments - IBKR vs Databento InstrumentId mismatch; re-stamp during ingest pattern.
  • Nautilus Reports - generate_orders_report() etc., Brier/AUC custom-statistic pattern, Trader-helper-vs-direct snapshot caveat.
  • Nautilus Visualization - Plotly tearsheet contents, install, customization.
  • Nautilus How-To: Write a Strategy - parallel; what goes inside add_strategy (the Cortana strategy body itself).
  • 2026-05-09 Nautilus Spike Plan: ~/conductor/workspaces/cortanaroi-mk2/belo-horizonte/plans/2026-05-09-nautilus-spike.md (Step 0 how-to verification; Step 0.5 Databento ingest; Step 6 decisions.db replay).

Timeline

2026-05-07 | Cody - Filed during pre-spike concept mastery sweep batch 6 (how-tos).