Nautilus Data Testing Spec
The developer_guide/spec_data_testing/ page defines a rigorous test matrix every Nautilus DataClient adapter must pass to be considered baseline-data-compliant. The matrix is implemented as the DataTester actor - shipped in both Python (nautilus_trader.test_kit.strategies.tester_data) and Rust (nautilus_testkit::testers) - and is configured by a single DataTesterConfig whose flags toggle entire test groups on or off. Tests are organized into 9 groups (Instruments, Order book, Quotes, Trades, Bars, Derivatives data, Instrument status, Option greeks, Lifecycle) holding 26 numbered test cards (TC-D01 through TC-D72, with spaced numbering for future inserts), and are ordered from least-derived to most-derived data (instruments → book → quotes/trades → bars → derivatives). An adapter that passes Groups 1–4 is considered baseline data compliant. For Cortana MK3, this is the contract the UW WebSocket adapter must satisfy - week-1 post-spike work - plus a Cortana-specific superset for sub-second alert payloads, sweep/block flags, premium ordering, and millisecond-to-nanosecond timestamp conversion. This page is the canonical Nautilus-data-testing-spec reference for the Saturday 2026-05-09 spike and its week-1 follow-on.
This page complements
Nautilus Adapters (the five-component contract
the UW adapter implements), Nautilus Data Model
(the built-in Data types every test card asserts on), and
Nautilus Custom Data (the @customdataclass
mechanics that UWFlowAlert rides). It is a sibling to
Nautilus Adapter Authoring Guide and
Nautilus Testing Guide, both filed in the
same batch-7 sweep - those cover general adapter and test patterns;
this page is the specific data-side acceptance matrix.
Core claim
The DataTester actor + DataTesterConfig is the single contract a
Nautilus data adapter must pass. Pass the matching subset of TC-Dxx
cards for the data types your adapter supports, or you are not
production grade. Adapter-specific behavior (custom channels,
throttling, snapshot semantics) is additionally documented in the
adapter’s own guide - never inside this matrix. The matrix tests the
Nautilus contract; the adapter guide tests the venue idiosyncrasies.
How the matrix is run - DataTester mechanics
The DataTester is itself a Nautilus actor - registered into a
LiveNode like any other actor. It exposes lifecycle hooks
(on_start, on_stop, the standard on_* data callbacks per
nautilus-actors.md), and on start it issues
the subscriptions and requests that its config flags enable. The
adapter under test responds; the tester observes; pass/fail is
determined by what arrives at the on_* callbacks within a bounded
window.
Python node setup (verbatim from doc)
Legacy examples still use nautilus_trader.live.node.TradingNode,
but new Rust-backed PyO3 adapters should prefer
nautilus_trader.live.LiveNode. Use LiveNode.builder(...) when
you need to register adapter client factories before the node is
built:
from nautilus_trader.common import Environment
from nautilus_trader.live import LiveDataEngineConfig, LiveNode
from nautilus_trader.model import TraderId
node = (
    LiveNode.builder("TESTER-001", TraderId("TESTER-001"), Environment.SANDBOX)
    .with_data_engine_config(
        LiveDataEngineConfig(time_bars_build_with_no_updates=False)
    )
    .add_data_client(None, adapter_data_client_factory, data_client_config)
    .build()
)
node.add_actor_from_config(importable_actor_config)
# Register remaining components, then start or run
Rust node setup
Reference: crates/adapters/{adapter}/examples/node_data_tester.rs:
use nautilus_testkit::testers::{DataTester, DataTesterConfig};
let tester_config = DataTesterConfig::new(client_id, vec![instrument_id])
    .with_subscribe_quotes(true);
let tester = DataTester::new(tester_config);
node.add_actor(tester)?;
node.run().await?;
Prerequisites for any data test
- Target instrument available and loadable via the instrument provider.
- API credentials set via environment variables ({VENUE}_API_KEY, {VENUE}_API_SECRET) when the venue requires authentication for the data being tested.
- Demo/testnet mode (e.g. is_demo=True): use credentials created for that environment. Demo and production keys are typically separate and not interchangeable; using the wrong credentials produces authentication errors (e.g. HTTP 401).
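The credential prerequisite can be checked before the node is built. The following is a minimal sketch, not part of Nautilus: `resolve_credentials` is a hypothetical helper, and the only thing taken from the spec is the `{VENUE}_API_KEY` / `{VENUE}_API_SECRET` naming pattern.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_credentials(
    venue: str, env: Optional[Mapping[str, str]] = None
) -> Tuple[str, str]:
    """Return (api_key, api_secret) for a venue, failing early if one is unset.

    Hypothetical helper: only the {VENUE}_API_KEY / {VENUE}_API_SECRET naming
    comes from the spec; everything else is an illustration.
    """
    env = os.environ if env is None else env
    prefix = venue.upper()
    names = (f"{prefix}_API_KEY", f"{prefix}_API_SECRET")
    # Treat empty strings the same as unset variables.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return env[names[0]], env[names[1]]
```

Failing here, before any connection attempt, keeps a missing or misnamed variable from surfacing later as an opaque HTTP 401.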
The required test categories - verbatim list
The doc enumerates 9 groups. Reproduced here verbatim by group number, name, and the test-card range that lives in each:
- Group 1: Instruments (TC-D01, TC-D02, TC-D03)
- Group 2: Order book (TC-D10, TC-D11, TC-D12, TC-D13, TC-D14, TC-D15)
- Group 3: Quotes (TC-D20, TC-D21)
- Group 4: Trades (TC-D30, TC-D31)
- Group 5: Bars (TC-D40, TC-D41)
- Group 6: Derivatives data (TC-D50, TC-D51, TC-D52, TC-D53)
- Group 7: Instrument status (TC-D60, TC-D61)
- Group 8: Option greeks (TC-D62, TC-D63)
- Group 9: Lifecycle (TC-D70, TC-D71, TC-D72)
Baseline data compliance = Groups 1–4 pass. Groups 5–8 apply when the adapter supports the corresponding data type. Group 9 is universal lifecycle hygiene.
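The baseline rule can be expressed as a small check over per-card outcomes. This is a sketch under stated assumptions: the outcome strings ("pass", "fail", "skip") and the rule that a group passes when no card failed and at least one passed are illustrative choices, not something the spec defines. Only the card-to-group layout is taken from the list above.

```python
from typing import Dict, List

# Card layout copied from the group list above.
GROUP_CARDS: Dict[int, List[str]] = {
    1: ["TC-D01", "TC-D02", "TC-D03"],
    2: ["TC-D10", "TC-D11", "TC-D12", "TC-D13", "TC-D14", "TC-D15"],
    3: ["TC-D20", "TC-D21"],
    4: ["TC-D30", "TC-D31"],
    5: ["TC-D40", "TC-D41"],
    6: ["TC-D50", "TC-D51", "TC-D52", "TC-D53"],
    7: ["TC-D60", "TC-D61"],
    8: ["TC-D62", "TC-D63"],
    9: ["TC-D70", "TC-D71", "TC-D72"],
}
BASELINE_GROUPS = (1, 2, 3, 4)


def group_passes(group: int, results: Dict[str, str]) -> bool:
    """Assumed rule: no card failed and at least one card passed.

    Cards absent from `results` are counted as failures.
    """
    outcomes = [results.get(card, "fail") for card in GROUP_CARDS[group]]
    return "fail" not in outcomes and "pass" in outcomes


def is_baseline_compliant(results: Dict[str, str]) -> bool:
    """Baseline data compliance = Groups 1-4 pass."""
    return all(group_passes(group, results) for group in BASELINE_GROUPS)
```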
Group-by-group test cards
Below: each card’s pass criteria and the DataTesterConfig flags that
enable it. Use this as the implementation checklist when writing
tests/adapters/<venue>/test_data_*.py.
Group 1: Instruments
| TC | Name | Pass criteria |
|---|---|---|
| TC-D01 | Request instruments | At least one instrument received; each has valid symbol, price precision, and size increment. Skip when: Never. |
| TC-D02 | Subscribe instrument | Instrument received with correct instrument_id, valid fields. Skip when: Adapter does not support instrument subscriptions. |
| TC-D03 | Load specific instrument | Instrument loaded with correct ID, price precision, size increment, and trading rules. Skip when: Never. Verify via self.cache.instrument(instrument_id). |
Flags: request_instruments, subscribe_instrument,
instrument_ids=[...].
Group 2: Order book
| TC | Name | Pass criteria |
|---|---|---|
| TC-D10 | Subscribe book deltas | Deltas received with valid instrument ID; at least one delta contains bid/ask updates. |
| TC-D11 | Subscribe book at interval | Book snapshots received with bid/ask levels; updates arrive at approximately the configured interval. |
| TC-D12 | Subscribe book depth | Depth snapshots received with up to 10 bid/ask levels; prices correctly ordered. Rust: Not yet supported (TODO in Rust DataTester). |
| TC-D13 | Request book snapshot | Snapshot contains bid/ask levels with valid prices and sizes. |
| TC-D14 | Managed book from deltas | Local book builds correctly from deltas; bid levels descend, ask levels ascend; book not empty after initial snapshot. |
| TC-D15 | Request historical book deltas | Deltas received with valid timestamps and book actions. Rust: Not yet supported. |
Flags: subscribe_book_deltas, subscribe_book_depth,
subscribe_book_at_interval, request_book_snapshot,
request_book_deltas, manage_book, book_type (default
L2_MBP), book_depth, book_interval_ms (default 1000),
book_levels_to_print (default 10), use_pyo3_book.
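The TC-D14 ordering criterion (bid levels descend, ask levels ascend, book not empty) is easy to state as a predicate. The sketch below works on plain price lists rather than a Nautilus order book object; `book_sides_ordered` is a hypothetical name, and the strict inequalities assume one aggregated level per price.

```python
from decimal import Decimal
from typing import Sequence


def book_sides_ordered(bids: Sequence[Decimal], asks: Sequence[Decimal]) -> bool:
    """Check TC-D14-style ordering on price levels listed best-first.

    Assumes one level per price, so ordering is strict.
    """
    if not bids or not asks:
        return False  # book must not be empty after the initial snapshot
    bids_descend = all(upper > lower for upper, lower in zip(bids, bids[1:]))
    asks_ascend = all(lower < upper for lower, upper in zip(asks, asks[1:]))
    return bids_descend and asks_ascend
```

In a real test the price lists would be read from the managed book the tester builds from deltas.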
Group 3: Quotes
| TC | Name | Pass criteria |
|---|---|---|
| TC-D20 | Subscribe quotes | At least one QuoteTick received with valid bid/ask prices and sizes; bid < ask. Skip when: Never. |
| TC-D21 | Request historical quotes | Quotes received with valid timestamps, bid/ask prices and sizes. |
Flags: subscribe_quotes, request_quotes,
requests_start_delta (default 1 hour).
Group 4: Trades
| TC | Name | Pass criteria |
|---|---|---|
| TC-D30 | Subscribe trades | At least one TradeTick received with valid price, size, and aggressor side. Skip when: Never. |
| TC-D31 | Request historical trades | Trades received with valid timestamps, prices, sizes, and trade IDs. |
Flags: subscribe_trades, request_trades.
Group 5: Bars
| TC | Name | Pass criteria |
|---|---|---|
| TC-D40 | Subscribe bars | At least one Bar received with valid OHLCV; high >= low, high >= open, high >= close. |
| TC-D41 | Request historical bars | Bars received with valid OHLCV and ascending timestamps. |
Flags: subscribe_bars, request_bars,
bar_types=[BarType.from_str("...")].
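The Group 5 pass criteria reduce to two checks: the three OHLC inequalities named in TC-D40 and the timestamp ordering in TC-D41. A minimal sketch follows, using a hypothetical `BarRow` tuple in place of the Nautilus `Bar` type and assuming "ascending" means strictly increasing `ts_event`.

```python
from typing import NamedTuple, Sequence


class BarRow(NamedTuple):
    """Hypothetical stand-in for a bar; not the Nautilus Bar type."""

    ts_event: int  # nanoseconds
    open: float
    high: float
    low: float
    close: float
    volume: float


def bar_is_valid(bar: BarRow) -> bool:
    """TC-D40 inequalities: high >= low, high >= open, high >= close."""
    return bar.high >= bar.low and bar.high >= bar.open and bar.high >= bar.close


def timestamps_ascending(bars: Sequence[BarRow]) -> bool:
    """TC-D41 ordering, assumed here to be strictly increasing ts_event."""
    return all(a.ts_event < b.ts_event for a, b in zip(bars, bars[1:]))
```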
Group 6: Derivatives data
| TC | Name | Pass criteria |
|---|---|---|
| TC-D50 | Subscribe mark prices | At least one MarkPriceUpdate received with valid instrument ID and mark price. |
| TC-D51 | Subscribe index prices | At least one IndexPriceUpdate received with valid instrument ID and index price. |
| TC-D52 | Subscribe funding rates | At least one FundingRateUpdate received with valid instrument ID and rate. Skip when: Not a perpetual. |
| TC-D53 | Request historical funding rates | Funding rates received with valid timestamps and rate values. Default 7-day lookback. |
Flags: subscribe_mark_prices, subscribe_index_prices,
subscribe_funding_rates, request_funding_rates.
Group 7: Instrument status
| TC | Name | Pass criteria |
|---|---|---|
| TC-D60 | Subscribe instrument status | Status events received with valid MarketStatusAction (e.g. Trading). Status events may only fire on state changes; during normal trading hours a Trading status may be received on subscribe. |
| TC-D61 | Subscribe instrument close | Close event received with valid close price and close type. Typically fires at end-of-session for traditional markets; may not fire for 24/7 crypto venues unless the adapter synthesizes a daily close. |
Flags: subscribe_instrument_status, subscribe_instrument_close.
Group 8: Option greeks
| TC | Name | Pass criteria |
|---|---|---|
| TC-D62 | Subscribe option greeks | Greeks received with valid delta, gamma, vega, theta values. rho may be zero when the venue does not provide it (Bybit, OKX). underlying_price and open_interest may be None depending on the venue channel. Some venues (Bybit, Deribit) subscribe per instrument; OKX subscribes per instrument family. |
| TC-D63 | Subscribe option chain | Chain snapshot contains greeks for instruments matching the series. Not yet configurable via DataTesterConfig - requires manual actor setup with subscribe_option_chain and an OptionSeriesId. ATM-relative strike ranges require a forward price bootstrap before subscriptions begin. |
Flags: subscribe_option_greeks. Chain subscription is managed
by the DataEngine, which creates per-instrument quote and greeks
subscriptions internally.
Group 9: Lifecycle
| TC | Name | Pass criteria |
|---|---|---|
| TC-D70 | Unsubscribe on stop | Clean unsubscribe; no errors in logs; no data events after stop. Triggered with can_unsubscribe=True (default). |
| TC-D71 | Custom subscribe params | Subscription established with custom parameters applied; data flows. The subscribe_params dict is opaque to the DataTester and passed through to the adapter - consult the adapter’s guide for supported keys. |
| TC-D72 | Custom request params | Request fulfilled with custom parameters applied; historical data received. The request_params dict is opaque and passed through. |
Flags: can_unsubscribe (default True), subscribe_params,
request_params.
DataTesterConfig reference (verbatim)
This is the load-bearing configuration table - the surface area
your test fixtures parameterize. Defaults are for the Python config.
Note: Rust DataTesterConfig::new sets manage_book=true,
while Python defaults it to False.
| Parameter | Type | Default | Affects groups |
|---|---|---|---|
| instrument_ids | list[InstrumentId] | required | All |
| client_id | ClientId? | None | All |
| bar_types | list[BarType]? | None | 5 |
| subscribe_book_deltas | bool | False | 2 |
| subscribe_book_depth | bool | False | 2 |
| subscribe_book_at_interval | bool | False | 2 |
| subscribe_quotes | bool | False | 3 |
| subscribe_trades | bool | False | 4 |
| subscribe_mark_prices | bool | False | 6 |
| subscribe_index_prices | bool | False | 6 |
| subscribe_funding_rates | bool | False | 6 |
| subscribe_bars | bool | False | 5 |
| subscribe_instrument | bool | False | 1 |
| subscribe_instrument_status | bool | False | 7 |
| subscribe_instrument_close | bool | False | 7 |
| subscribe_option_greeks | bool | False | 8 |
| subscribe_params | dict? | None | 9 |
| can_unsubscribe | bool | True | 9 |
| request_instruments | bool | False | 1 |
| request_book_snapshot | bool | False | 2 |
| request_book_deltas | bool | False | 2 |
| request_quotes | bool | False | 3 |
| request_trades | bool | False | 4 |
| request_bars | bool | False | 5 |
| request_funding_rates | bool | False | 6 |
| request_params | dict? | None | 9 |
| requests_start_delta | Timedelta? | 1 hour | 3, 4, 5 |
| book_type | BookType | L2_MBP | 2 |
| book_depth | PositiveInt? | None | 2 |
| book_interval_ms | PositiveInt | 1000 | 2 |
| book_levels_to_print | PositiveInt | 10 | 2 |
| manage_book | bool | False (Py) / True (Rust) | 2 |
| use_pyo3_book | bool | False | 2 |
| log_data | bool | True | All |
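When parameterizing fixtures it helps to know which groups a given flag set actually exercises. The sketch below encodes only the boolean subscribe_*/request_* rows of the table as a plain dict; `FLAG_GROUP` and `enabled_groups` are hypothetical names, and the config is modeled as a dict rather than a real DataTesterConfig.

```python
from typing import Dict, Mapping, Set

# Boolean subscribe_/request_ flags and the group each one affects (from the table).
FLAG_GROUP: Dict[str, int] = {
    "subscribe_instrument": 1,
    "request_instruments": 1,
    "subscribe_book_deltas": 2,
    "subscribe_book_depth": 2,
    "subscribe_book_at_interval": 2,
    "request_book_snapshot": 2,
    "request_book_deltas": 2,
    "subscribe_quotes": 3,
    "request_quotes": 3,
    "subscribe_trades": 4,
    "request_trades": 4,
    "subscribe_bars": 5,
    "request_bars": 5,
    "subscribe_mark_prices": 6,
    "subscribe_index_prices": 6,
    "subscribe_funding_rates": 6,
    "request_funding_rates": 6,
    "subscribe_instrument_status": 7,
    "subscribe_instrument_close": 7,
    "subscribe_option_greeks": 8,
}


def enabled_groups(config: Mapping[str, object]) -> Set[int]:
    """Return the test groups switched on by the truthy flags in `config`."""
    return {group for flag, group in FLAG_GROUP.items() if config.get(flag)}
```

For example, a config with only `subscribe_quotes` and `request_trades` set touches Groups 3 and 4, which is not enough on its own to claim baseline compliance.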
Test fixture patterns
The doc itself is sparse on fixture patterns; the canon comes from
inspection of crates/adapters/<adapter>/examples/node_data_tester.rs
and nautilus_trader/adapters/<adapter>/examples/. The recurring
shape:
- Per-adapter examples/node_data_tester.rs that constructs the live node, registers the adapter factory, instantiates a DataTester with a venue-appropriate config, runs the node for a bounded window, and asserts. This is the “live integration” smoke test. It needs real credentials and a real venue connection.
- Per-adapter Python equivalent in examples/live/<venue>/<venue>_data_tester.py for the dual-language matrix.
- Per-adapter unit tests under tests/adapters/<venue>/ that stub the venue at the WebSocket / REST layer (see “Mock venue patterns” below) and run the DataTester against the stub. These are the CI-runnable tests.
- Golden-file fixtures under tests/adapters/<venue>/fixtures/ containing recorded venue payloads (one file per stream type) so the parser can be regression-tested without a live connection.
The split: live integration tests prove the adapter talks to the venue; unit + golden tests prove the adapter parses correctly.
Mock venue patterns
Two flavors:
A. Async WebSocket stub (preferred for streaming venues)
Spin up a local websockets server in a fixture; have it replay a
JSONL file of recorded venue frames at controlled intervals. The
adapter under test connects to ws://localhost:<port> instead of
the real venue host. Pattern:
@pytest_asyncio.fixture
async def fake_uw_ws_server(unused_tcp_port):
    # JSONL holds one JSON frame per line, so parse it line by line.
    lines = FIXTURE_DIR.joinpath("flow_alerts.jsonl").read_text().splitlines()
    frames = [json.loads(line) for line in lines if line.strip()]
    async def handler(ws):
        # Wait for the client's first message (the subscribe), then replay.
        await ws.recv()
        for frame in frames:
            await ws.send(json.dumps(frame))
    server = await websockets.serve(handler, "127.0.0.1", unused_tcp_port)
    yield f"ws://127.0.0.1:{unused_tcp_port}"
    server.close()
    await server.wait_closed()
B. HTTPX MockTransport (for REST endpoints)
Mock the httpx.AsyncClient at the transport layer; respond with
fixture JSON keyed by URL path. No network. Fast.
def fake_uw_http_transport():
    routes = {
        "/api/option-contract/SPY/2026-05-09": json.loads(
            FIXTURE_DIR.joinpath("chain_snapshot.json").read_text()
        ),
    }
    def handler(request):
        return httpx.Response(200, json=routes[request.url.path])
    return httpx.MockTransport(handler)
Golden-file expectations
Every adapter ships a corpus of recorded venue frames and the expected Nautilus events the adapter must emit. The test rule:
record-once → fixture/<stream>.jsonl (raw venue bytes)
parse-many → expected/<stream>.json (Nautilus events as dicts)
The test does:
- Load fixture/<stream>.jsonl (raw venue frames).
- Run each frame through the adapter’s parser.
- Serialize emitted Nautilus events via to_dict().
- assert emitted == json.loads(expected/<stream>.json).
When the venue’s wire format changes, diff the expected file in
review - that’s the audit trail. When the parser changes
intentionally, regenerate via a --update-goldens pytest flag
(common pattern: pytest tests/adapters/uw/ --update-goldens).
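The compare-or-regenerate step can be kept in one helper. This is a minimal stdlib-only sketch: `check_or_update_golden` is a hypothetical name, and the `update` argument is assumed to be fed from a custom `--update-goldens` option registered through pytest's `pytest_addoption` hook in conftest.py (that wiring is not shown).

```python
import json
from pathlib import Path
from typing import Any, List


def check_or_update_golden(path: Path, emitted: List[Any], update: bool) -> None:
    """Compare emitted events against a golden file, or rewrite the file.

    `update` is assumed to come from a --update-goldens test option.
    """
    if update:
        path.parent.mkdir(parents=True, exist_ok=True)
        # Stable key order and indentation keep review diffs small.
        path.write_text(json.dumps(emitted, indent=2, sort_keys=True) + "\n")
        return
    expected = json.loads(path.read_text())
    assert emitted == expected, f"golden mismatch for {path.name}"
```

Writing with `sort_keys=True` means a regenerated file differs from the old one only where the emitted events actually changed, which is what makes the diff usable as an audit trail.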
Cortana MK3 implications - the UW tests/adapters/uw/ plan
This is the load-bearing section. The UW adapter is the largest piece of new code MK3 requires (~870 LOC Python v0 per nautilus-adapters.md). The test surface is ~300-400 LOC and is the gate that decides whether v0 ships or gets recycled.
How the matrix applies to UW
UW is data-only and emits two custom data types
(UWFlowAlert, OptionChainSnapshot) - neither maps to a built-in
Nautilus type. So the standard matrix is partially applicable:
| Group | Applicability to UW v0 |
|---|---|
| 1. Instruments | Skip / no-op. UW does not define instruments - the UWInstrumentProvider is near-no-op (delegates to IBKR cache). TC-D01/D02/D03 trivially pass with empty results. |
| 2. Order book | Skip. UW does not provide book data. |
| 3. Quotes | Skip. UW does not provide quotes (IBKR adapter handles SPY top-of-book). |
| 4. Trades | Skip. UW does not provide trade ticks. |
| 5. Bars | Skip. UW does not provide bars. |
| 6. Derivatives data | Skip. UW does not provide mark / index / funding. |
| 7. Instrument status | Skip. Not provided. |
| 8. Option greeks | Maybe TC-D62. UW does provide Greeks per option contract. If we wire UW Greeks through OptionGreeks (built-in) instead of as a @customdataclass, this card applies. Decision deferred - see the spike’s carryover #8 (UW-as-Greeks-source). |
| 9. Lifecycle | Required: TC-D70, TC-D71. Unsubscribe-on-stop is a hard contract; subscribe_params is the surface where we pass {"underlying": "SPY"} / {"min_premium": 50000}. |
Net: the standard matrix only directly mandates Group 9 for UW v0. The bulk of UW’s test surface is Cortana-specific - custom-data correctness, sub-second flow alert parsing, sweep/block flag handling, ms→ns timestamp conversion. The matrix is necessary but insufficient for production-grade.
File-by-file plan for tests/adapters/uw/
tests/adapters/uw/
├── conftest.py # fixtures: fake_uw_ws, fake_uw_http,
│ # data_tester_node, sample_alerts
├── fixtures/
│ ├── flow_alerts.jsonl # 200+ recorded UW frames (sweeps,
│ │ # blocks, splits, malformed)
│ ├── chain_snapshot.json # one full UW REST chain pull
│ ├── reconnect_drop.jsonl # frames + a forced WS close
│ └── auth_failure.json # 401 response payload
├── expected/
│ ├── flow_alerts_emitted.json # UWFlowAlert events as dicts
│ └── chain_snapshot_emitted.json # OptionChainSnapshot event
├── test_http_client.py # auth, retries, rate-limit headers,
│ # 4xx/5xx classification
├── test_ws_client.py # connect, auth, subscribe message
│ # shape, reconnect with backoff,
│ # heartbeat handling
├── test_instrument_provider.py # near-no-op assertions; load_all_async
│ # returns []; find() delegates to cache
├── test_data_client.py # the DataTester matrix (TC-D70, TC-D71)
│ # plus Cortana-specific assertions
├── test_parsers.py # _parse_flow_alert against
│ # fixtures/flow_alerts.jsonl with
│ # golden-file diff
├── test_factory.py # UWLiveDataClientFactory builds a
│ # client from UWConfig; env-var
│ # credential resolution; bad config
│ # raises early
└── test_integration_replay.py # replay 1 trading day of recorded
# UW frames + cross-check emitted
# alert count vs MK2 scoring_events
What each test file must verify
test_http_client.py - REST surface
- Auth header inserted on every request (Authorization: Bearer <token>).
- Retry policy: 5xx + 429 → exponential backoff up to 3 retries; 4xx (except 429) → fail fast.
- Rate-limit hint headers honored if present (Retry-After, X-RateLimit-Reset).
- httpx.MockTransport stub serves all fixture cases.
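The retry policy in this list can be pinned down as two pure functions, which makes it testable without any HTTP client. This is a sketch under assumptions: the function names, the 0.5 s base delay, and the 60 s cap are illustrative, and only the numeric-seconds form of Retry-After is handled (the header may also carry an HTTP date).

```python
from typing import Optional


def should_retry(status: int, attempt: int, max_retries: int = 3) -> bool:
    """5xx and 429 are retryable; other 4xx fail fast.

    `attempt` is the number of retries already made (0 for the first failure).
    """
    retryable = status == 429 or 500 <= status <= 599
    return retryable and attempt < max_retries


def retry_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 0.5,  # assumed base delay in seconds
    cap: float = 60.0,  # assumed ceiling for the exponential schedule
) -> float:
    """Seconds to wait before the next retry."""
    if retry_after is not None:
        try:
            # Honor the server's hint when it is a plain number of seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not handled in this sketch
    return min(cap, base * 2 ** attempt)
```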
test_ws_client.py - WS surface
- Connect with token; receive ack; emit _handle_data only after ack.
- Reconnect on disconnect: backoff schedule, max attempts, active channels re-subscribed after reconnect (the framework keeps the set; the adapter replays it).
- Heartbeat / pong handling; if heartbeat fails for N consecutive intervals, force reconnect.
- Auth failure → raise UWAuthError, do NOT retry indefinitely.
- Malformed JSON → log warn, drop frame, continue (per the developer-guide rule: never panic, hang, or leak).
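The malformed-JSON rule is the easiest of these to isolate. A minimal sketch, assuming frames arrive as text or bytes and that a valid UW frame is always a JSON object; `decode_frame` and the logger name are hypothetical.

```python
import json
import logging
from typing import Optional, Union

log = logging.getLogger("uw.ws")  # hypothetical logger name


def decode_frame(raw: Union[str, bytes]) -> Optional[dict]:
    """Decode one WS frame; return None (after a warning) when it is unusable."""
    try:
        frame = json.loads(raw)
    except ValueError as exc:  # covers JSONDecodeError and bad UTF-8 in bytes
        log.warning("dropping malformed frame: %s", exc)
        return None
    if not isinstance(frame, dict):
        # Assumption: every valid frame is a JSON object.
        log.warning("dropping non-object frame of type %s", type(frame).__name__)
        return None
    return frame
```

The receive loop then skips `None` results and keeps reading, so a single bad frame never stops the stream.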
test_instrument_provider.py - UW-specific no-op
- load_all_async() returns [] (or a stub list of InstrumentId proxies if we choose that path).
- find(InstrumentId.from_str("SPY.ARCA")) delegates to the shared cache; returns the IBKR-loaded instrument when present, None otherwise.
test_data_client.py - the DataTester matrix subset
- TC-D70 (Lifecycle: unsubscribe on stop): start the tester with subscribe_params={"channel": "flow-alerts:SPY"}, can_unsubscribe=True, feed a few alerts, stop the actor, assert (a) no errors logged, (b) no UWFlowAlert events arrive at on_data after stop, (c) the WS unsubscribe frame was sent to the fake server.
- TC-D71 (Custom subscribe params): subscribe with {"underlying": "SPY", "min_premium": 50000}, assert the WS client sent the correctly-formed UW subscription frame and the fake server replays only matching alerts.
- TC-D72 (Custom request params): request OptionChainSnapshot for ticker=SPY expiry=2026-05-09; assert the REST URL is /api/option-contract/SPY/2026-05-09 and the parsed snapshot arrives at on_historical_data.
test_parsers.py - golden-file diffs
- Load fixtures/flow_alerts.jsonl, run each frame through _parse_flow_alert, serialize to dict, compare against expected/flow_alerts_emitted.json.
- Cover every UW quirk discovered in MK2: GH #54 strike-format (some payloads use "strike": "100.5", some "strike": 100.5), GH #59 timestamp-unit (UW gives ms; we multiply by 1_000_000 to get ns).
- Negative cases: missing required field → log warn + drop, no event emitted.
test_factory.py - config + factory
- UWLiveDataClientFactory()(UWConfig(...)) returns a configured UWLiveMarketDataClient.
- Env-var credential resolution: UW_API_TOKEN env var is read when UWConfig.api_token=None.
- Bad config → raise at build time, not at first use.
test_integration_replay.py - end-to-end smoke
- Spin up the full live node with the fake UW WS + the IBKR adapter in paper mode; replay one trading day of UW frames; assert len(emitted_UWFlowAlert) matches len(expected_alerts) to within ±0.5%.
- Cross-check: emitted alert count for SPY should equal the count in decisions.db scoring_events rows for the same day. (This is the definitive “we ported the parser correctly” assertion.)
Mocking the UW WebSocket - concrete sketch
# tests/adapters/uw/conftest.py
import asyncio, json, pathlib
import pytest_asyncio
import websockets
FIXTURE_DIR = pathlib.Path(__file__).parent / "fixtures"
@pytest_asyncio.fixture
async def fake_uw_ws(unused_tcp_port):
    frames = [
        json.loads(line)
        for line in FIXTURE_DIR.joinpath("flow_alerts.jsonl").read_text().splitlines()
        if line.strip()
    ]
    received_subs = []
    async def handler(ws):
        # Auth handshake
        auth = json.loads(await ws.recv())
        assert auth["msg_type"] == "auth"
        await ws.send(json.dumps({"msg_type": "auth_ack", "ok": True}))
        # Subscriptions
        try:
            sub_task = asyncio.create_task(_collect_subs(ws, received_subs))
            for frame in frames:
                await ws.send(json.dumps(frame))
                await asyncio.sleep(0.001)
            await asyncio.sleep(0.1)
            sub_task.cancel()
        except websockets.ConnectionClosed:
            pass
    async def _collect_subs(ws, sink):
        async for msg in ws:
            sink.append(json.loads(msg))
    server = await websockets.serve(handler, "127.0.0.1", unused_tcp_port)
    yield {
        "url": f"ws://127.0.0.1:{unused_tcp_port}",
        "received_subs": received_subs,
    }
    server.close()
    await server.wait_closed()
Test data fixture layout
tests/adapters/uw/fixtures/
├── flow_alerts.jsonl # one frame per line, 200+ frames
│ # representing morning open, midday
│ # chop, EOD power hour
├── chain_snapshot.json # one full UW REST response for SPY
├── reconnect_drop.jsonl # 50 frames followed by sentinel
│ # "DROP" line that triggers WS close
├── auth_failure.json # 401 response payload
├── malformed_strike.jsonl # frames with strike-as-string vs
│ # strike-as-float (GH #54)
├── ms_timestamp.jsonl # frames with ts_ms vs ts_ns (GH #59)
└── sweep_block_split.jsonl # frames covering all aggressor +
# is_sweep / is_block combinations
Recording protocol: capture from the live MK2 UW client during
a normal trading session via cortanaroi/data/uw_ws.py debug
logging. Sanitize any account-level IDs. Commit alongside the
adapter source. Update golden-file outputs by running
pytest tests/adapters/uw/test_parsers.py --update-goldens.
Golden-file approach for UWFlowAlert
The @customdataclass-decorated UWFlowAlert auto-generates
to_dict(). The golden file is the canonical
list[dict] representation. Test:
def test_parse_flow_alerts_matches_golden(parser):
    raw = [json.loads(line) for line in FLOW_ALERTS_JSONL.read_text().splitlines()]
    emitted = [parser._parse_flow_alert(frame).to_dict() for frame in raw]
    expected = json.loads(EXPECTED_FLOW_ALERTS_JSON.read_text())
    assert emitted == expected
When UW changes their wire format (rare but documented in MK2’s
GH issue history), the golden file diff is the audit trail. Any
change to expected/flow_alerts_emitted.json requires a PR
description that explains the UW-side wire change.
Cortana-specific tests the spec does NOT mandate but we MUST add
The matrix tests the Nautilus contract - not the Cortana contract. These additions are non-negotiable for the UW adapter:
- Sub-second alert ordering preserved. Replay frames with monotonically increasing ts_event_ms; assert emitted ts_event (ns) is monotonically non-decreasing. Replay protection assumes deterministic ordering - see nautilus-data.md replay-determinism contract.
- ms→ns timestamp conversion correctness. Assert parsed.ts_event == frame["ts_ms"] * 1_000_000. Use the integer literal: not the float 1e6 (nanosecond epoch values exceed float64's exact-integer range) and not a chained * 1000 * 1000 - one exact integer multiply. Skipping the conversion skews timestamps by 6 orders of magnitude, and an off-by-1000 slip by 3; either breaks backtest replay (per project_v1_latency_april16.md context - Cortana is millisecond-sensitive on signal dispatch).
- Sweep / block / split flag round-trip. For each of the 8 combinations of (is_sweep, is_block, is_split), assert the emitted UWFlowAlert carries the same booleans.
- Premium ordering invariants. Assert premium_usd == strike_price * size_contracts * 100 to within rounding (UW’s premium is computed; verify our parser doesn’t drop precision).
- Aggressor side normalization. UW emits "BUY", "SELL", "NA", "buy", "sell", null. Adapter must canonicalize to "BUY" / "SELL" / "UNKNOWN". Test all 6 inputs.
- Underlying-price freshness check. If frame["underlying_price"] is older than 5 seconds (i.e. frame["underlying_ts_ms"] < frame["ts_ms"] - 5000), emit a warn log but still emit the alert. Reason: Cortana’s scoring engine has a hard fail-fast on stale spy_price (per spike plan §7 carryover). Adapter logs the staleness; engine decides what to do.
- Backpressure: queue depth caps. If the engine is slow to drain, the adapter must not buffer indefinitely. Set a bounded asyncio.Queue(maxsize=1000); on overflow, drop oldest, log warn, increment a metric. Test by replaying 2000 frames with a slow consumer.
- Rate-limit respect on REST. UW’s REST tier is ~120 requests/min. Mock a 429 response with Retry-After: 30 → adapter waits ≥30s before retry. Test that we never burst above the documented tier.
- Reconnect re-subscribes all active channels. Replay reconnect_drop.jsonl (which forces a WS close mid-stream); assert the adapter reconnects, replays the original subscribe frames against the fake server, and resumes emitting events.
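Several of the items above are small pure functions, which makes them cheap to unit-test in isolation. The following sketch covers four of them: ms→ns conversion, aggressor normalization, the staleness predicate, and drop-oldest backpressure. All function names are hypothetical; the field names `ts_ms` and `underlying_ts_ms` are taken from the list above.

```python
import asyncio
from typing import Any, Mapping, Optional

NS_PER_MS = 1_000_000  # integer literal, so the product stays an exact int


def ms_to_ns(ts_ms: int) -> int:
    """Convert a millisecond epoch timestamp to nanoseconds without floats."""
    if isinstance(ts_ms, bool) or not isinstance(ts_ms, int):
        raise TypeError(f"ts_ms must be int, got {type(ts_ms).__name__}")
    return ts_ms * NS_PER_MS


def normalize_aggressor(raw: Optional[str]) -> str:
    """Map the observed spellings onto BUY / SELL / UNKNOWN."""
    if raw is None:
        return "UNKNOWN"
    side = raw.strip().upper()
    return side if side in ("BUY", "SELL") else "UNKNOWN"


def underlying_is_stale(frame: Mapping[str, Any], max_age_ms: int = 5000) -> bool:
    """True when the underlying price is older than max_age_ms at alert time."""
    return frame["underlying_ts_ms"] < frame["ts_ms"] - max_age_ms


def put_drop_oldest(queue: asyncio.Queue, item: Any) -> bool:
    """Enqueue item; if the queue is full, discard the oldest entry first.

    Returns True when something was dropped so the caller can log a warning
    and bump a metric.
    """
    dropped = False
    if queue.full():
        queue.get_nowait()
        dropped = True
    queue.put_nowait(item)
    return dropped
```

Rejecting non-int input in `ms_to_ns` is a deliberate choice in this sketch: a float that slips in from a JSON payload would otherwise produce a float timestamp and defeat the exact-multiply requirement.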
The single most likely UW-specific test the spec doesn’t mandate
TC-Cortana-01: Heartbeat-loss masquerading as liveness. The matrix doesn’t cover this because most venues use TCP-level keep-alive. UW (and any vendor whose API is “WebSocket-over-HTTP-upgrade behind a CDN”) can drop the connection silently - the TCP socket stays open, no FIN sent, but no frames arrive. If the adapter only checks “is socket open?” it will sit happily “connected” while actual data is gone.
The test: in the fake WS server, after sending 50 frames, stop sending without closing. The adapter must (a) detect via heartbeat-timeout (e.g. no frame in 30s + no pong response in 10s), (b) force-close, (c) reconnect, (d) resume. Without this, Cortana goes dark in the middle of a trading session and we don’t know - exactly the failure mode the 2026-05-06 power-outage postmortem flagged for the IBKR side, ported here for UW.
This is the test that, if it fails, the trader loses real money silently. It is more important than any single TC-Dxx in the official matrix for a vendor like UW.
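The detection half of TC-Cortana-01 comes down to an idle timer around the receive call. The sketch below is a simplification under stated assumptions: it uses a single idle timeout instead of the two-stage "no frame in 30s + no pong in 10s" rule, and `recv` is any zero-argument coroutine function (an `asyncio.Queue.get` in a test, the WebSocket receive in the adapter). `read_until_silent` is a hypothetical name.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def read_until_silent(
    recv: Callable[[], Awaitable[Any]], idle_timeout: float
) -> int:
    """Consume frames until none arrives within idle_timeout seconds.

    Returns the number of frames seen before the silence. The caller is
    expected to force-close and reconnect when this returns; a socket that
    is still open says nothing about whether data is flowing.
    """
    seen = 0
    while True:
        try:
            await asyncio.wait_for(recv(), timeout=idle_timeout)
        except asyncio.TimeoutError:
            return seen
        seen += 1
```

In the fake-server test, the server sends 50 frames and then goes quiet without closing; the assertion is that this function returns 50 and that the adapter reconnects afterwards rather than waiting forever.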
When this concept applies
- Designing the test plan for the UW WebSocket adapter (week-1 post-spike).
- Reviewing whether a third-party Nautilus adapter is “production grade” before depending on it.
- Adding a new data type to an existing adapter (file the new TC-Dxx-style test alongside).
- Auditing existing adapter tests against the canonical matrix to find gaps.
When it does not apply
- Execution-side testing → see the Execution Testing Spec (sibling page, developer_guide/spec_execution_testing/, filed separately as nautilus-dev-spec-execution-testing.md in this batch).
- General Python unit-test patterns (pytest fixtures, async testing) → nautilus-dev-testing.md.
- General adapter authoring (skeleton, factory, config) → nautilus-dev-adapters.md.
- Custom-data type ergonomics → nautilus-custom-data.md.
Anti-patterns to avoid
- Skipping Group 9 because “it’s just lifecycle.” Lifecycle bugs cause silent data loss in production. TC-D70 (clean unsubscribe) is the gate that proves the adapter doesn’t leak callbacks after stop.
- Relying on the live integration test in CI. The node_data_tester.rs example is for human validation against a real venue with real credentials - it is not the CI gate. CI runs the unit + golden-file tests against mocked transports.
- Treating subscribe_params / request_params as untyped forever. They’re opaque to DataTester but not opaque to the adapter’s own guide - document every supported key.
- Hand-writing fixture frames. Record from the live venue during real conditions (open, midday, close). Synthetic fixtures miss the edge cases that bite in production.
- Asserting on event count alone. Always assert on event content via golden-file diff. Counts can match while payloads silently corrupt.
- Forgetting to update goldens after intentional parser changes. Add a --update-goldens pytest flag + a CI check that goldens haven’t drifted unintentionally.
See Also
- Nautilus Adapter Authoring Guide - Sibling page (parallel filing): how to build the adapter this page tests.
- Nautilus Testing Guide - Sibling page (parallel filing): general Nautilus testing patterns.
- Nautilus Adapters - The five-component adapter contract; the UW v0 component map.
- Nautilus Data Model - Built-in Data types every TC-Dxx asserts on; ts_event / ts_init contract.
- Nautilus Custom Data - @customdataclass mechanics for UWFlowAlert and OptionChainSnapshot.
- Nautilus Databento Integration - Closest reference adapter (data-only, equity-options universe); how its tests are structured.
- Nautilus Interactive Brokers Integration - The execution-side counterpart in Cortana MK3 (UW data + IBKR exec).
- Nautilus Developer Guide - Phase 1-7 adapter implementation sequence; Phase 7 is testing and references this matrix.
- Spike plan: ~/conductor/workspaces/cortanaroi-mk2/belo-horizonte/plans/2026-05-09-nautilus-spike.md - UW adapter is named as week-1 post-spike work; this matrix is the contract for that work.
- Source: https://nautilustrader.io/docs/latest/developer_guide/spec_data_testing/
Timeline
2026-05-07 | Cody - Filed during pre-spike concept mastery sweep batch 7 (developer guide).