Nautilus Custom Data

Nautilus’s /concepts/custom_data/ page describes the framework’s internal custom-data architecture: a single PyO3 CustomData wrapper, a runtime DataRegistry that resolves type handlers by type_name, and an Arrow C FFI bridge that lets pure-Python custom types persist as Parquet without hardcoded schemas. There are five extension points a Cortana developer can choose from (Data subclass, @customdataclass decorator, @customdataclass_pyo3 decorator, publish_signal, and a custom DataClient), and they trade off ergonomics, performance, replay determinism, and persistence support.

This page is the canonical decision tree for picking the right extension point per Cortana custom-data class (UWFlowAlert, ScoreUpdate, MetaProb, EmaDecayValue, RegimeChange) and the answer to the open question: @customdataclass for any structured event you want to log, replay, or query; publish_signal only for ephemeral primitive notifications you’d be willing to lose. Filed during the pre-spike concept mastery sweep for the 2026-05-09 NautilusTrader spike.

This page complements nautilus-data.md (which covers the broader data model, built-in types, ts_event/ts_init, and the catalog write path). Where that page sketches custom data as one of many topics, this page goes deep on the custom-data mechanics specifically: how registration works, where the PyO3 boundary sits, the JSON envelope, the Arrow C FFI bridge, and which extension point you should reach for when.

Core claim

There is one custom-data system at runtime - the PyO3 CustomData wrapper plus the DataRegistry that resolves type-handlers by type_name - and five authoring surfaces on top of it. The decision rule is:

Need persistence + replay + structure? → @customdataclass (Python) or @customdataclass_pyo3 (Rust-backed).
Need a primitive ephemeral notification (one float/int/str/bool)? → publish_signal.
Need to ingest bytes from an external source? → custom DataClient that emits one of the above types via _handle_data(...).
Need full control over schema/serialization (the option-Greeks pattern)? → manual Data subclass with explicit register_serializable_type + register_arrow.
Just want to publish a string topic and don’t care about ordering or replay? → raw self.msgbus.publish(topic, message) (anti-pattern for anything Cortana cares about).

Everything else is a special case of one of these five.
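The decision rule above can be read as a small function. The sketch below is illustrative only: pick_extension_point and its flag names are hypothetical and not part of NautilusTrader; it just encodes the order in which the questions are asked.

```python
def pick_extension_point(
    *,
    structured_and_replayable: bool,
    single_primitive: bool = False,
    needs_custom_schema: bool = False,
    python_encode_is_bottleneck: bool = False,
    from_external_bytes: bool = False,
) -> str:
    """Hypothetical helper mirroring the decision rule on this page."""
    if structured_and_replayable:
        if needs_custom_schema:
            surface = "manual Data subclass"
        elif python_encode_is_bottleneck:
            surface = "@customdataclass_pyo3"
        else:
            surface = "@customdataclass"
        # A DataClient is only the transport; it still emits one of the types above.
        if from_external_bytes:
            return f"custom DataClient emitting {surface}"
        return surface
    if single_primitive:
        return "publish_signal"
    # Nothing structured, nothing to replay: only the raw string-topic path is left.
    return "raw msgbus.publish (anti-pattern)"


uw_flow_alert = pick_extension_point(
    structured_and_replayable=True, from_external_bytes=True
)
score_update = pick_extension_point(structured_and_replayable=True)
heartbeat = pick_extension_point(
    structured_and_replayable=False, single_primitive=True
)
```

Under these assumptions the UW alert resolves to a DataClient that emits a @customdataclass, the score event to a plain @customdataclass, and a dashboard heartbeat to publish_signal.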

Extension points - the complete catalog

1. `@customdataclass` decorator (Python, runtime-registered)

The 90% case for Cortana. The decorator takes a pure-Python class with type-annotated fields, auto-generates the to_dict/from_dict/to_bytes/from_bytes/schema helpers, and registers the class with the DataRegistry so the catalog Arrow encoder/decoder works; instances then ride the bus through publish_data → on_data. Internally these instances live as CustomData(data_type, PythonCustomDataWrapper(self)) - the wrapper caches ts_event, ts_init, type_name and delegates JSON / Arrow operations back to Python under the GIL.

from nautilus_trader.model.custom import customdataclass
from nautilus_trader.core import Data
from nautilus_trader.model import InstrumentId
 
@customdataclass
class ScoreUpdate(Data):
    instrument_id: InstrumentId = InstrumentId.from_str("SPY.ARCA")
    composite_score: float = 0.0
    bias: str = ""
    conviction: str = ""
    features_json: str = ""  # JSON-encoded feature vector, as read by the later examples

Pros: zero boilerplate, replay-deterministic via ts_event/ts_init, catalog-persistable, bus-routable, immutable post-publish.

Cons: Arrow encode/decode goes through GIL + Arrow C FFI per batch (measurable but small at Cortana volumes); 78-feature flat-vector schema is awkward (use a JSON-string column or split into top-N columns).

2. `@customdataclass_pyo3` decorator (Rust-backed payload)

Same authoring shape as @customdataclass but the underlying wrapper is a native same-binary Rust payload registered via register_custom_data_class(MyType). Registration precedence (per the custom_data doc) resolves the native Rust path first, falling back to the Python wrapper only if no native handler exists. JSON and Arrow go through native Rust handlers - no GIL hop, no FFI bridge. This is the fast path.

When to reach for it: only if profiling shows the Python encode path is a bottleneck. For Cortana’s ~1-10 events/sec scoring rate, the pure-Python @customdataclass is fine.

3. `publish_signal` - ephemeral primitive notifications

self.publish_signal("score_alert", 67.5, ts_event)
self.subscribe_signal("score_alert")
 
def on_signal(self, signal):
    print("Signal", signal)

Signals carry a single primitive (str/float/int/bool/bytes) plus a name and a ts_event. They flow on the bus but do not auto-register with the catalog - they’re not persisted, not replay-deterministic across runs unless you build your own sink, and not structured.

Use only for: debug pings, dashboard heartbeats, ephemeral cooldown flags. Never use for: anything you want to query, replay, or audit.

4. Custom `DataClient` - bytes-in, normalized-Data-out

A LiveDataClient / LiveMarketDataClient (Rust core + PyO3 binding + Python factory, per nautilus-developer-guide.md) that opens the WebSocket / REST connection, decodes wire bytes into a Data subclass (usually a @customdataclass instance), and pushes through _handle_data(event). The DataEngine then runs the cache-then-publish sequence (per nautilus-architecture.md).

This is where UWFlowAlert originates in MK3 - the UW WebSocket adapter is not an Actor (it is bytes-in), but every UWFlowAlert it emits is authored as a @customdataclass.

5. Manual `Data` subclass with full schema control

The option-Greeks pattern (sketched in nautilus-data.md): subclass Data, hand-write to_dict/from_dict/to_bytes/from_bytes/schema, explicitly call register_serializable_type(...) for bus serialization and register_arrow(...) for catalog persistence. Equivalent to what @customdataclass generates, but you control the Arrow schema exactly (useful for nested types, pa.Map, large binary blobs).

Use only when @customdataclass’s flat-scalar schema doesn’t fit your data shape.

Anti-pattern: raw `self.msgbus.publish(topic, message)`

Bypasses ts_event/ts_init ordering, no catalog persistence, typo-prone topic strings. The nautilus-concepts.md doc warns explicitly: “you must track topic names manually (typos could result in missed messages).” Use only for one-off internal plumbing where ordering doesn’t matter.

DataRegistry - the runtime resolver

crates/model/src/data/registry.rs is the central resolver. Singletons (OnceLock-initialized DashMap) keyed by type_name:

JSON deserializers keyed by type_name.
Arrow schemas, encoders, and decoders keyed by type_name.
Python extractors that convert a Python object into Arc<dyn CustomDataTrait>.
Rust extractor factories that produce Python extractors for same-binary types.

Concurrent registration is safe: registration uses atomic DashMap::entry() so concurrent register_* and ensure_* calls do not race. The registry is what makes “no hardcoded schemas in the binary” possible - the catalog write path looks up the encoder by type_name extracted from DataType, not by static dispatch.
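To make the "atomic insert-if-absent" idea concrete, here is a minimal Python sketch. ToyRegistry is a hypothetical stand-in, not the Rust DataRegistry: a lock plus dict.setdefault plays the role of DashMap::entry(). Whether the real registry keeps the first or the last registration is not stated in the source; first-wins is an assumption made here.

```python
import threading
from typing import Callable, Dict, List


class ToyRegistry:
    """Toy stand-in for a type_name-keyed handler registry (not the real one)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._decoders: Dict[str, Callable[[dict], object]] = {}

    def ensure_decoder(
        self, type_name: str, decoder: Callable[[dict], object]
    ) -> Callable[[dict], object]:
        # Insert-if-absent as a single step: the first caller's handler is stored,
        # every later caller gets that same handler back.
        with self._lock:
            return self._decoders.setdefault(type_name, decoder)


registry = ToyRegistry()
results: List[Callable[[dict], object]] = []


def worker(i: int) -> None:
    # Each thread tries to register its own decoder under the same type_name.
    results.append(registry.ensure_decoder("ScoreUpdate", lambda row, i=i: (i, row)))


threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All eight threads end up holding the same decoder object.
distinct_handlers = {id(fn) for fn in results}
```

The point of the sketch is only that racing registrations converge on one handler per type_name, which is the property the catalog write path relies on when it looks up an encoder dynamically.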

CustomData - the PyO3 wrapper

The outer CustomData wrapper is the common container that crosses the FFI boundary. Constructor: CustomData(data_type, data) where DataType is first, payload second.

It contains:

A DataType.
An inner custom payload implementing CustomDataTrait (wrapped in Arc<dyn CustomDataTrait>).
Timestamps (ts_event, ts_init) delegated to the inner CustomDataTrait and exposed as properties on the wrapper.

Python-side semantics: __eq__ and __repr__ are implemented (equality uses the Rust PartialEq logic). Instances are intentionally unhashable so equality stays consistent with payload comparison.

Two backends for the inner payload:

PythonCustomDataWrapper - used for pure-Python custom data. Stores reference to the Python object, caches ts_event/ts_init/type_name, calls Python methods for JSON / Arrow ops under the GIL.
Native same-binary Rust payload - concrete Rust type, downcastable directly from Arc<dyn CustomDataTrait>. No Python callback path.

User code only sees CustomData - the same API regardless of backend.

DataType - routing identity vs storage path

DataType(type_name, metadata=None, identifier=None).

The critical rule from the doc:

“Equality, hashing, and topic routing are derived from type_name and metadata only. Two DataType values with the same type name and metadata but different identifiers compare equal and publish to the same message bus topic. The identifier affects only the storage path under data/custom/<type_name>/<identifier...>.”

In short: the routing topic is derived from (type_name, metadata); the catalog path additionally includes the identifier. This means a Cortana subscriber for DataType(UWFlowAlert, metadata={"underlying": "SPY"}) will receive every alert published with that type name and metadata, regardless of whether the publisher tagged a specific identifier such as expiry=2026-05-08.
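The rule quoted above can be modelled with a plain dataclass. This is a sketch under stated assumptions: ToyDataType is hypothetical, the topic string format is invented for illustration, and identifier normalization is left out. Only the equality/hash behaviour and the data/custom/<type_name>/<identifier...> path shape come from the quoted rule.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass(frozen=True)
class ToyDataType:
    """Hypothetical model of the routing-vs-storage split (not the Nautilus class)."""

    type_name: str
    metadata: Tuple[Tuple[str, str], ...] = ()
    # compare=False keeps identifier out of both __eq__ and __hash__.
    identifier: Optional[str] = field(default=None, compare=False)

    @property
    def topic(self) -> str:
        # Illustrative topic format only; built from type_name and metadata.
        meta = ".".join(f"{k}={v}" for k, v in self.metadata)
        return f"data.{self.type_name}.{meta}" if meta else f"data.{self.type_name}"

    @property
    def storage_path(self) -> str:
        # The identifier only shows up here, in the storage path.
        base = f"data/custom/{self.type_name}"
        return f"{base}/{self.identifier}" if self.identifier else base


subscriber_type = ToyDataType("UWFlowAlert", (("underlying", "SPY"),))
publisher_type = ToyDataType(
    "UWFlowAlert", (("underlying", "SPY"),), identifier="expiry=2026-05-08"
)
```

The two values compare equal and share a topic, so the subscriber sees the publisher's alerts, while the stored files land under different paths.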

CustomData JSON envelope

When serialized to JSON (e.g. to_json_bytes/from_json_bytes, SQL cache, Redis), CustomData uses a single canonical envelope so deserialization does not depend on user payload field names:

type - the custom type name (from CustomDataTrait::type_name).
data_type - object with type_name, metadata, optional identifier.
payload - the inner payload only (result of CustomDataTrait::to_json parsed as a value). Registered deserializers receive only this value in from_json, so user structs can use any field names (including value) without conflicting with wrapper metadata.

This is what makes a Redis bus hop or restart-replay safe even when your custom-data class has a field named metadata or data_type.
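A minimal sketch of why the envelope avoids field-name collisions, using only the three envelope keys listed above. The wrap/unwrap helpers and the "Tricky" type are hypothetical; the real serialization lives in the Rust CustomData implementation.

```python
import json
from typing import Callable, Dict, Tuple


def wrap(type_name: str, data_type: dict, payload: dict) -> bytes:
    # Wrapper metadata and user payload live under separate top-level keys.
    envelope = {"type": type_name, "data_type": data_type, "payload": payload}
    return json.dumps(envelope).encode("utf-8")


def unwrap(raw: bytes, deserializers: Dict[str, Callable[[dict], dict]]) -> Tuple[dict, dict]:
    envelope = json.loads(raw)
    from_json = deserializers[envelope["type"]]  # resolved by type name
    # The registered deserializer only ever sees the inner payload.
    return envelope["data_type"], from_json(envelope["payload"])


# A payload whose own field names collide with the wrapper's vocabulary.
payload = {"metadata": "user-owned", "data_type": "also user-owned", "value": 1.5}
data_type = {"type_name": "Tricky", "metadata": {"underlying": "SPY"}}

raw = wrap("Tricky", data_type, payload)
restored_type, restored_payload = unwrap(raw, {"Tricky": dict})
```

Because the payload is nested one level down, its metadata and data_type fields never overwrite the wrapper's own keys.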

Serialization paths - which extension point supports what

Extension point	Bus pub/sub	Cache (in-memory)	Cache DB (Redis/Postgres)	Parquet catalog	Arrow schema
`@customdataclass`	yes	yes	yes (via JSON envelope)	yes (Python encode via Arrow C FFI)	auto-generated flat
`@customdataclass_pyo3`	yes	yes	yes	yes (native Rust encode)	native
`publish_signal`	yes	yes (transient)	no (not auto-registered)	no	n/a
Custom `DataClient` (emitting custom types)	yes	yes	yes	yes	inherits from emitted type
Manual `Data` subclass + `register_serializable_type` + `register_arrow`	yes	yes	yes	yes	hand-written
Raw `msgbus.publish(topic, ...)`	yes	no	no	no	n/a

The asymmetry of publish_signal is the load-bearing distinction: a signal is alive for the duration of the bus dispatch, then gone. A @customdataclass event is alive forever in the catalog, replayable into any future backtest, queryable by DataType, and survives a Redis bus restart.

Subscription patterns

From Strategies

from nautilus_trader.core import Data
from nautilus_trader.model.data import DataType
from nautilus_trader.model.identifiers import ClientId
 
self.subscribe_data(
    data_type=DataType(ScoreUpdate, metadata={"underlying": "SPY"}),
    client_id=ClientId("SCORING_ACTOR"),
)
 
def on_data(self, data: Data) -> None:
    if isinstance(data, ScoreUpdate):
        self._handle_score(data)

on_data is the catch-all handler for custom data - type-check inside the handler. Strategies inherit this from Actor.

From Actors

Identical signature. Actors typically subscribe in on_start():

class MetaGateActor(Actor):
    def on_start(self) -> None:
        self.subscribe_data(
            data_type=DataType(ScoreUpdate, metadata={"underlying": "SPY"}),
        )
 
    def on_data(self, data: Data) -> None:
        if isinstance(data, ScoreUpdate):
            meta_prob = self._meta_model.predict(data.features_json)
            self.publish_data(
                DataType(MetaProb, metadata={"underlying": "SPY"}),
                MetaProb(
                    instrument_id=data.instrument_id,
                    meta_prob=meta_prob,
                    ts_event=data.ts_event,
                    ts_init=self.clock.timestamp_ns(),
                ),
            )

Pattern: Actor subscribes to upstream custom data, computes a derived event, publishes downstream custom data. Strategy subscribes to the final gate event. This is the chain nautilus-actors.md recommends for the spike - one ScoringActor → one MetaGateActor → one CortanaStrategy.

Signals (different handler)

self.subscribe_signal("score_alert")
 
def on_signal(self, signal):
    # signal wraps the published primitive (assumed to be exposed as signal.value)
    ...

Signals route to on_signal, NOT on_data. Don’t conflate the two.

Replay semantics - does custom data round-trip cleanly?

Yes, with caveats. The catalog write/read flow is symmetric:

Write path

ParquetDataCatalog receives a CustomData value.
Extracts type_name, metadata, identifier from DataType.
Looks up Arrow encoder in DataRegistry.
Encodes values to a RecordBatch.
Appends a data_type column containing the persisted DataType.
Attaches type_name and metadata to Arrow schema.
Writes batch to Parquet under data/custom/<type_name>/<identifier...>.

Identifiers are normalized before becoming path segments.

Read path

Catalog reads matching Parquet files.
Extracts type_name from schema metadata.
Asks DataRegistry for the registered decoder.
Decodes RecordBatch into Vec<Data>.
Reconstructs CustomData with the original DataType.
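The symmetry of the two paths can be sketched with dictionaries standing in for the registry and the Parquet files. Everything here is a toy: write_custom, read_custom, and the row format are hypothetical, and no Arrow encoding takes place. The sketch only shows that both directions resolve their handler by type_name, and that a read depends on the reader having registered the type.

```python
from typing import Callable, Dict, List

# Two lookup tables play the role of the registry; a dict plays the catalog.
encoders: Dict[str, Callable[[tuple], dict]] = {}
decoders: Dict[str, Callable[[dict], tuple]] = {}
catalog: Dict[str, dict] = {}


def write_custom(type_name: str, identifier: str, values: List[tuple]) -> str:
    encode = encoders[type_name]                      # encoder looked up by type_name
    path = f"data/custom/{type_name}/{identifier}"    # identifier only affects the path
    catalog[path] = {
        "schema_metadata": {"type_name": type_name},  # type_name travels with the file
        "rows": [encode(v) for v in values],
    }
    return path


def read_custom(path: str) -> List[tuple]:
    stored = catalog[path]
    type_name = stored["schema_metadata"]["type_name"]
    if type_name not in decoders:
        raise LookupError(f"{type_name!r} is not registered in this process")
    return [decoders[type_name](row) for row in stored["rows"]]


encoders["ScoreUpdate"] = lambda v: {"ts_init": v[0], "composite_score": v[1]}
path = write_custom("ScoreUpdate", "SPY", [(1, 67.5), (2, 70.0)])

try:
    read_custom(path)  # the reader has not registered the type yet
    read_failed = False
except LookupError:
    read_failed = True

decoders["ScoreUpdate"] = lambda row: (row["ts_init"], row["composite_score"])
restored = read_custom(path)
```

In the toy, the first read fails and the second succeeds once the decoder is registered, which is the behaviour a shared types module is meant to guarantee in every process.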

Caveat 1 - registration must match. The decoder is looked up by type_name at read time, so the consuming process must have registered the same custom-data class (@customdataclass re-import, or @customdataclass_pyo3 registration) before the catalog query runs. If the type isn’t registered, the read fails. This is why custom-data class definitions belong in a shared module imported by both the writer and the reader (the strategy, the actor, the backtest harness, the brain-logger).

Caveat 2 - feather-to-parquet conversion. When converting a Feather stream to Parquet (e.g. after a backtest), the custom-data branch decodes batches and re-writes them via write_custom_data_batch so custom data written through the Feather writer is correctly converted to Parquet. The conversion is automatic but requires the type to be registered in the converting process.

Caveat 3 - pure-Python encode goes through Arrow C FFI. For @customdataclass (not @customdataclass_pyo3), encode/decode hops the GIL once per batch, exporting/importing via RecordBatch._export_to_c / RecordBatch._import_from_c. Per-batch overhead is small but real.

Caveat 4 - schema evolution is not handled by the framework. If you add a field to a @customdataclass, old Parquet files do not auto-migrate. Plan for this in MK3: version the type name (ScoreUpdateV1, ScoreUpdateV2) or write a migration path.
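One way the versioned-type-name approach from Caveat 4 could look, sketched with plain dataclasses rather than @customdataclass so it runs without Nautilus. The field set and the migrate_v1_to_v2 helper are hypothetical; the choice of default for the new field is an assumption each migration has to make explicitly.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ScoreUpdateV1:
    composite_score: float
    bias: str


@dataclass(frozen=True)
class ScoreUpdateV2:
    composite_score: float
    bias: str
    conviction: str  # field added in V2


def migrate_v1_to_v2(old: ScoreUpdateV1, default_conviction: str = "") -> ScoreUpdateV2:
    # Old rows carry no conviction, so the migration has to choose a default.
    return ScoreUpdateV2(**asdict(old), conviction=default_conviction)


old_row = ScoreUpdateV1(composite_score=67.5, bias="BULL")
new_row = migrate_v1_to_v2(old_row)
```

Keeping V1 registered alongside V2 means old files stay readable under their original type name while new writes use the new one.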

ts_event vs ts_init - the contract

Every Data subclass - built-in or custom - carries two UNIX-nanosecond timestamps:

ts_event - when the event actually occurred at the source (exchange, vendor WebSocket, scoring computation moment).
ts_init - when Nautilus initialized the internal object representing that event.

Practical rules:

latency = ts_init - ts_event estimates total system latency. The source and local clocks aren’t synchronized, so the value is an estimate and is not guaranteed to be positive.
For internally-generated custom data that has no upstream source event, ts_event and ts_init may legitimately be equal - both are self.clock.timestamp_ns() at the moment of computation. When an upstream event triggered the computation (e.g., the UW alert behind a ScoreUpdate), propagate that event’s ts_event instead.
For UW-sourced data, ts_event should be the UW WebSocket payload’s vendor timestamp converted to nanoseconds (UW gives milliseconds, so multiply by 1_000_000 using integer arithmetic); ts_init is self.clock.timestamp_ns() at the moment the DataClient deserializes the frame.
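A short sketch of the timestamp arithmetic in the rules above. The vendor timestamp and the 2.5 ms decode delay are made-up values; the helper names are hypothetical. It also shows why the ms → ns conversion is better done with integers: at current epoch magnitudes the nanosecond value exceeds what a 64-bit float can represent exactly.

```python
MS_TO_NS = 1_000_000


def ms_to_ns(ts_ms: int) -> int:
    # Integer multiplication keeps every nanosecond digit exact.
    return int(ts_ms) * MS_TO_NS


def latency_ns(ts_event: int, ts_init: int) -> int:
    # May be negative when the vendor clock runs ahead of the local clock.
    return ts_init - ts_event


vendor_ms = 1_700_000_000_123          # hypothetical UW payload timestamp (ms)
ts_event = ms_to_ns(vendor_ms)
ts_init = ts_event + 2_500_000         # pretend the frame was decoded 2.5 ms later

observed_latency = latency_ns(ts_event, ts_init)
skewed_latency = latency_ns(ts_event, ts_event - 1)  # local clock 1 ns behind

# Going through a float silently rounds the value at this magnitude.
float_ns = int(vendor_ms * 1e6)
float_is_exact = float_ns == ts_event
```

Here float_is_exact comes out False, so a float-based conversion would shift ts_event by a few dozen nanoseconds and could reorder events that sit close together.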

Backtest replay ordering

From nautilus-data.md: backtest data is ordered by ts_init using a stable sort. This is the property MK3 needs for decisions.db replay. Live mode processes data as it arrives.

Open question carryover #7 - nanosecond-tie ordering

The spike plan’s open question #7 asks: does Nautilus’s stable-sort handle multiple events at the same ts_init nanosecond deterministically?

The custom_data doc does NOT explicitly address this. What we know:

The architecture is single-threaded dispatch on the kernel core (nautilus-architecture.md) → event order on the bus = order they were placed on the bus.
Backtest sort is documented as stable sort by ts_init (nautilus-data.md).
Stable sort preserves insertion order within tie groups - meaning ordering among ties depends on the order data was fed into the BacktestEngine.
BacktestDataConfig reads from the catalog in time order, but Parquet files don’t guarantee within-file row ordering across writes unless you explicitly sort on write.

Implication for MK3: ties are deterministic per-run if the input ordering is deterministic, but cross-run determinism depends on whether the catalog read produces identical row order every time. For decisions.db replay where multiple ScoreUpdate events legitimately share a millisecond (and therefore a ts_init after the ms→ns conversion), this could drift if the Parquet writer doesn’t preserve order.

Spike action item (Step 6): verify experimentally - write 100 custom-data events with identical ts_init, query the catalog, confirm the read order matches the write order across multiple runs. If it doesn’t, the migration path is to mint synthetic monotonic ts_init during decisions.db → Parquet ingest (add an artificial nanosecond offset per event within a millisecond bucket).
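The tie behaviour and the proposed fallback can be sketched in a few lines. This uses Python's built-in sort (which is stable) as a stand-in for the backtest sort; it says nothing about how Nautilus implements its own. mint_monotonic_ts_init is a hypothetical helper and assumes incoming ts_init values are millisecond-aligned with fewer than 1,000,000 events per millisecond, so the added offsets cannot collide with a neighbouring bucket.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Event = Tuple[int, str]  # (ts_init in ns, label)


def stable_replay_order(events: List[Event]) -> List[Event]:
    # Stable sort: events with equal ts_init keep their input order.
    return sorted(events, key=lambda e: e[0])


def mint_monotonic_ts_init(events: List[Event]) -> List[Event]:
    """Add a per-bucket nanosecond offset so no two events share a ts_init."""
    seen: Dict[int, int] = defaultdict(int)
    out: List[Event] = []
    for ts_init, label in events:
        out.append((ts_init + seen[ts_init], label))
        seen[ts_init] += 1
    return out


events = [(2_000_000, "c"), (1_000_000, "a"), (1_000_000, "b")]

replayed = stable_replay_order(events)
reversed_input = stable_replay_order(list(reversed(events)))
disambiguated = stable_replay_order(mint_monotonic_ts_init(events))
```

With tied timestamps the replay order follows the input order ("a" before "b", or "b" before "a" if the input is reversed), which is exactly the cross-run risk described above. After minting offsets at ingest, the order is carried by the timestamps themselves.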

Code-reading task: the actual sort lives in crates/data/src/engine/mod.rs (or similar). Confirm during the spike.

Cortana MK3 implications - extension-point mapping per type

The 9-page sibling sweep established that five custom-data classes will live in MK3. Mapping each to the right extension point:

`UWFlowAlert` - UW WebSocket flow alert

Extension point: @customdataclass, emitted by a custom UW LiveDataClient.

Why: structured event (strike, side, size, premium, sweep/block flags), needs to be persisted for replay (decisions.db is replaying these alerts indirectly via score updates today), needs to round-trip through the catalog so backtest replays the same UW event sequence.

Why NOT publish_signal: signal can carry one primitive. UW alerts have ~10 fields. Hard requirement → structured.

Why NOT @customdataclass_pyo3: Python is fine at UW’s ~1-10 alerts/sec rate. Reach for the Rust-backed version only if profiling shows Arrow encode is a bottleneck during catalog writes - not likely at Cortana volume.

`ScoreUpdate` - composite scoring engine output

Extension point: @customdataclass, published by a ScoringActor.

Why: the load-bearing audit event (78 features + composite + bias + conviction). Needs catalog persistence for postmortem replay; needs strict ts_event/ts_init ordering for backtest determinism. A signal would lose all the structure.

Schema awkwardness: the 78-feature vector is the hard part. Three options (per nautilus-data.md):

(a) Serialize as JSON string in one column. Easy. Loses query-ability via DataFusion where= predicates.
(b) 78 separate Arrow columns. Clean. Rigid - adding a feature requires a schema migration.
(c) Arrow Map type. Flexible. PyArrow backend only.

Recommendation for spike: (a) JSON string column. If we need DataFusion filtering on individual features post-spike, migrate to (b) with a versioned class name.
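A sketch of option (a), assuming the feature vector is a flat name → float mapping. The helper names and the feature names are hypothetical. Sorting keys is a choice made here so that the same features always serialize to the same string, which keeps stored rows byte-comparable across runs.

```python
import json
from typing import Dict


def encode_features(features: Dict[str, float]) -> str:
    # sort_keys + compact separators give one canonical string per feature set.
    return json.dumps(features, sort_keys=True, separators=(",", ":"))


def decode_features(features_json: str) -> Dict[str, float]:
    return json.loads(features_json)


features = {"flow_ema": 2.0, "call_put_ratio": 1.0}  # hypothetical feature names
features_json = encode_features(features)
restored = decode_features(features_json)
```

The resulting string is what would go into a single str field such as features_json; the cost, as noted above, is that individual features are no longer filterable as columns.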

`MetaProb` - secondary classifier output

Extension point: field on ScoreUpdate (default), OR @customdataclass published by a MetaGateActor that subscribes to ScoreUpdate if the cadence diverges.

Why separate event vs field on ScoreUpdate? Open question from nautilus-strategies.md. Two factors decide:

Cadence. If meta-prob runs synchronously inside the scoring actor, fold it as a field on ScoreUpdate. If it runs on a separate timer (e.g., heavy ML inference batched every 500ms), separate MetaProb event.
Subscriber graph. If the strategy is the only consumer and always pairs ScoreUpdate + meta_prob 1:1, fold it. If the dashboard / logger / experiment harness wants meta-prob without carrying the 78-feature vector, separate it.

Recommendation for spike: start as a field on ScoreUpdate (one event, simpler). Split out only if the meta-model adds latency and we want to decouple cadence.

`EmaDecayValue` - EMA-decay flow value

Extension point: field on ScoreUpdate (default), OR @customdataclass event if it updates independently of score recomputation.

Why: EMA decay is derived state, recomputed on every flow event. If it’s recomputed in lock-step with scoring, fold the field. If a separate Actor updates it on an independent timer, separate event.

Recommendation for spike: field on ScoreUpdate. Promote to EmaDecayValue(Data) only if the cadence diverges.

`RegimeChange` - chop ↔ trend ↔ power-hour reclassification

Extension point: @customdataclass, published by a RegimeDetector Actor.

Why a separate event: regime changes are infrequent (a few per day), asynchronous to scoring (driven by their own rolling-window detector), and broadly subscribed (strategy, position-manager, dashboard, logger all want to know). Folding into ScoreUpdate would force every score event to re-publish a regime field most subscribers don’t need. Separate event = clean subscriber graph.

Why NOT publish_signal: regime is not just a string label - it’s the regime + the confidence + the prior regime + the trigger classification. Signal can carry one of those, not the bundle.

Summary table

Cortana type	Extension point	Why
`UWFlowAlert`	`@customdataclass` (via custom DataClient)	Structured UW WebSocket payload, needs catalog persistence
`ScoreUpdate`	`@customdataclass`	Audit-trail event, 78 features, bus + catalog + replay
`MetaProb`	Field on `ScoreUpdate` (default) OR `@customdataclass` if cadence diverges	Synchronous meta-prob = field; async batched = separate event
`EmaDecayValue`	Field on `ScoreUpdate` (default) OR `@customdataclass` if independent	Lock-step with score = field; independent timer = separate
`RegimeChange`	`@customdataclass`	Infrequent, asynchronous, broadly subscribed

Worked example - UWFlowAlert lifecycle

End-to-end the way it’ll work in MK3:

# In cortana/mk3/data/uw_types.py - shared module imported everywhere
from nautilus_trader.model.custom import customdataclass
from nautilus_trader.core import Data
from nautilus_trader.model import InstrumentId
 
@customdataclass
class UWFlowAlert(Data):
    instrument_id: InstrumentId = InstrumentId.from_str("SPY.ARCA")
    strike: float = 0.0
    expiry: str = ""
    option_side: str = ""           # "CALL" | "PUT"
    aggressor_side: str = ""        # "BUY" | "SELL" | "MIXED"
    premium_usd: float = 0.0
    size_contracts: int = 0
    is_sweep: bool = False
    is_block: bool = False
    flow_score: float = 0.0
    underlying_price: float = 0.0
    raw_id: str = ""
 
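# In cortana/mk3/adapters/uw/data_client.py
from nautilus_trader.live.data_client import LiveMarketDataClient
from nautilus_trader.model import InstrumentId
from cortana.mk3.data.uw_types import UWFlowAlert
class UWLiveDataClient(LiveMarketDataClient):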
    async def _on_message(self, frame: dict) -> None:
        alert = UWFlowAlert(
            instrument_id=InstrumentId.from_str("SPY.ARCA"),
            strike=frame["strike"],
            expiry=frame["expiry"],
            option_side=frame["side"],
            aggressor_side=frame["aggressor"],
            premium_usd=frame["premium"],
            size_contracts=frame["size"],
            is_sweep=frame["is_sweep"],
            is_block=frame["is_block"],
            flow_score=frame["score"],
            underlying_price=frame["underlying"],
            raw_id=frame["alert_id"],
            ts_event=int(frame["ts_ms"]) * 1_000_000,  # ms → ns
            ts_init=self._clock.timestamp_ns(),
        )
        # _handle_data routes through DataEngine → Cache → MessageBus
        self._handle_data(alert)
 
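# In cortana/mk3/actors/scoring_actor.py
import json
from nautilus_trader.common.actor import Actor
from nautilus_trader.core import Data
from nautilus_trader.model.data import DataType
# ScoreUpdate is assumed to live in the same shared types module as UWFlowAlert
from cortana.mk3.data.uw_types import ScoreUpdate, UWFlowAlert
class ScoringActor(Actor):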
    def on_start(self) -> None:
        self.subscribe_data(
            data_type=DataType(UWFlowAlert, metadata={"underlying": "SPY"}),
        )
 
    def on_data(self, data: Data) -> None:
        if isinstance(data, UWFlowAlert):
            self._update_flow_state(data)
            score = self._compute_composite()
            self.publish_data(
                DataType(ScoreUpdate, metadata={"underlying": "SPY"}),
                ScoreUpdate(
                    instrument_id=data.instrument_id,
                    composite_score=score.value,
                    bias=score.bias,
                    conviction=score.conviction,
                    features_json=json.dumps(score.features),
                    ts_event=data.ts_event,            # propagate source event time
                    ts_init=self.clock.timestamp_ns(), # init = now
                ),
            )
 
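# In cortana/mk3/strategies/cortana_strategy.py
from nautilus_trader.core import Data
from nautilus_trader.model.data import DataType
from nautilus_trader.trading.strategy import Strategy
# ScoreUpdate is assumed to live in the shared types module
from cortana.mk3.data.uw_types import ScoreUpdate
class CortanaStrategy(Strategy):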
    def on_start(self) -> None:
        self.subscribe_data(
            data_type=DataType(ScoreUpdate, metadata={"underlying": "SPY"}),
        )
 
    def on_data(self, data: Data) -> None:
        if isinstance(data, ScoreUpdate):
            if data.composite_score >= 65 and data.bias == "BULL":
                self._submit_bull_call(data)

The same UWFlowAlert and ScoreUpdate classes are imported by:

the live UWLiveDataClient,
the live ScoringActor,
the live CortanaStrategy,
the backtest harness,
the catalog migration script (decisions.db → Parquet),
the brain-logger Actor,
any out-of-process dashboard subscriber.

Two classes, seven consumers. The DataRegistry registration happens once per process at module import time.

Anti-patterns to avoid

Using publish_signal for anything you’d want to query later. Signals are not catalog-persisted. If the spike’s success criteria include “replay 2026-04-16 chop-day cluster”, a signal-based implementation cannot be replayed.
Skipping the shared types module. If UWFlowAlert is defined in the producer’s module and re-defined in the consumer’s, the DataRegistry registers two distinct type_name values and the bus topic mismatch silently drops every message. Define once, import everywhere.
Mutating a published CustomData instance. nautilus-architecture.md is explicit: messages are immutable post-publish. Derive new local state if you need a different shape.
Putting the 78-feature vector in 78 separate cache writes. One ScoreUpdate per score event, not 78 partial updates. Because backtest ordering is a stable sort by ts_init, 78 simultaneous events would depend on insertion order to break ties (see open question #7).
Using @customdataclass_pyo3 as premature optimization. Reach for it only after profiling shows pure-Python @customdataclass is the bottleneck. At Cortana’s event rate (roughly 1-10 events per second), it almost certainly isn’t.
Forgetting ts_event for internally-generated events. Even a ScoreUpdate computed inside the runtime needs a sensible ts_event - propagate the source event’s ts_event (the UW alert that triggered the score), don’t just set both to “now”. Otherwise replay-ordered backtests lose the cause-effect timing.
Custom topic strings via msgbus.publish for Cortana events. Use publish_data with a DataType. The string-topic path is typo-prone and bypasses replay determinism.
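The "skipping the shared types module" item has a plain-Python half that is easy to demonstrate without Nautilus: two classes with the same name and fields are still different classes. The sketch below uses a hypothetical factory to simulate a producer module and a consumer module each defining their own UWFlowAlert; it does not exercise the DataRegistry itself.

```python
from dataclasses import dataclass


def define_alert_class():
    # Simulates one module's own copy of the class definition.
    @dataclass(frozen=True)
    class UWFlowAlert:
        strike: float

    return UWFlowAlert


ProducerAlert = define_alert_class()   # "defined in the producer's module"
ConsumerAlert = define_alert_class()   # "re-defined in the consumer's module"

alert = ProducerAlert(strike=500.0)

same_name = ProducerAlert.__name__ == ConsumerAlert.__name__
same_class = ProducerAlert is ConsumerAlert
consumer_recognizes_it = isinstance(alert, ConsumerAlert)
```

The names match but the isinstance check fails, so an on_data handler written against the consumer's copy would skip every alert the producer emits. Importing one definition from a shared module removes the problem.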

When this concept applies

Defining any new structured event type for Cortana MK3 (scoring, gating, regime, premium-flow, position telemetry, exit decisions).
Authoring the UW custom adapter - every UW-sourced object becomes a custom-data class.
Migrating decisions.db rows to Parquet - each row class becomes a custom-data class with a registered Arrow schema.
Backtest replay of historical Cortana decisions.
Any out-of-band subscriber (dashboard, brain-logger, Telegram notifier) that consumes Cortana events.

When this concept does NOT apply

Built-in market data (QuoteTick, TradeTick, Bar, OrderBookDelta, OptionGreeks). Those have hardcoded Rust schemas and don’t go through the DataRegistry custom path.
Order/Position/Account events (OrderFilled, PositionOpened, AccountState). Those are Events on the MessageBus event topic, not custom data. See nautilus-events.md for the distinction - Events flow with their own dispatch ladder (on_order_filled → on_order_event → on_event); custom Data flows to on_data.
Component lifecycle transitions (STARTING / RUNNING / DEGRADED). Those are FSM transitions on the Component, not bus events. Wrap them in a custom EngineHealth event if you want them in the audit trail.
Logging messages. The logging subsystem has its own MPSC channel; don’t conflate.
Cython @customdataclass. The custom_data doc explicitly separates this - the Cython decorator is a different system from the PyO3 CustomData architecture this page describes. If you find a Nautilus example using Cython @customdataclass, it’s older; the PyO3 path is the current architecture.

Timeline

2026-05-07 | Cody - Filed during pre-spike concept mastery sweep batch 2.
