Nautilus Custom Data
Nautilus’s /concepts/custom_data/ page describes the framework’s internal custom-data architecture: a single PyO3 CustomData wrapper, a runtime DataRegistry that resolves type handlers by type_name, and an Arrow C FFI bridge that lets pure-Python custom types persist as Parquet without hardcoded schemas. There are five extension points a Cortana developer can choose from (Data subclass, @customdataclass decorator, @customdataclass_pyo3 decorator, publish_signal, and a custom DataClient), and they trade off ergonomics, performance, replay determinism, and persistence support. This page is the canonical decision tree for picking the right extension point per Cortana custom-data class (UWFlowAlert, ScoreUpdate, MetaProb, EmaDecayValue, RegimeChange) and the answer to the open question: @customdataclass for any structured event you want to log, replay, or query; publish_signal only for ephemeral primitive notifications you’d be willing to lose. Filed during the pre-spike concept mastery sweep for the 2026-05-09 NautilusTrader spike.
This page complements nautilus-data.md (which covers the broader data
model, built-in types, ts_event/ts_init, and the catalog write path). Where
that page sketches custom data as one of many topics, this page goes deep on
the custom-data mechanics specifically: how registration works, where the
PyO3 boundary sits, the JSON envelope, the Arrow C FFI bridge, and which
extension point you should reach for when.
Core claim
There is one custom-data system at runtime - the PyO3 CustomData
wrapper plus the DataRegistry that resolves type-handlers by type_name -
and five authoring surfaces on top of it. The decision rule is:
- Need persistence + replay + structure? → @customdataclass (Python) or @customdataclass_pyo3 (Rust-backed).
- Need a primitive ephemeral notification (one float/int/str/bool)? → publish_signal.
- Need to ingest bytes from an external source? → custom DataClient that emits one of the above types via _handle_data(...).
- Need full control over schema/serialization (the option-Greeks pattern)? → manual Data subclass with explicit register_serializable_type + register_arrow.
- Just want to publish a string topic and don’t care about ordering or replay? → raw self.msgbus.publish(topic, message) (anti-pattern for anything Cortana cares about).
Everything else is a special case of one of these five.
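The decision rule above can be written down as a small function. This is a minimal sketch for illustration only: EventNeeds and choose_extension_point are hypothetical names invented here, not part of NautilusTrader, and the flags simply restate the questions in the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventNeeds:
    """Hypothetical checklist describing one Cortana event type."""

    structured: bool = False         # more than one field
    persist_or_replay: bool = False  # must land in the catalog and replay
    external_bytes: bool = False     # originates from a socket or REST feed
    custom_schema: bool = False      # nested types, maps, large blobs
    hot_path: bool = False           # profiling shows the Python encode path is slow


def choose_extension_point(needs: EventNeeds) -> str:
    """Return the authoring surface suggested by the decision rule."""
    if needs.custom_schema:
        payload = "manual Data subclass"
    elif needs.structured or needs.persist_or_replay or needs.external_bytes:
        payload = "@customdataclass_pyo3" if needs.hot_path else "@customdataclass"
    else:
        # A single primitive that we are willing to lose.
        return "publish_signal"
    if needs.external_bytes:
        return f"custom DataClient emitting {payload}"
    return payload


uw_flow_alert = EventNeeds(structured=True, persist_or_replay=True, external_bytes=True)
print(choose_extension_point(uw_flow_alert))
```

The raw msgbus.publish path is deliberately absent from the return values, since the page treats it as an anti-pattern rather than a choice.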
Extension points - the complete catalog
1. @customdataclass decorator (Python, runtime-registered)
The 90% case for Cortana. You write a pure-Python class with type-annotated
fields; the decorator auto-generates the
to_dict/from_dict/to_bytes/from_bytes/schema helpers and registers the class
with the DataRegistry so the catalog Arrow encoder/decoder works. Instances
ride the bus through publish_data → on_data. Internally these instances live
as CustomData(data_type, PythonCustomDataWrapper(self)) - the wrapper caches ts_event, ts_init,
type_name and delegates JSON / Arrow operations back to Python under
the GIL.
    from nautilus_trader.core import Data
    from nautilus_trader.model import InstrumentId
    from nautilus_trader.model.custom import customdataclass


    @customdataclass
    class ScoreUpdate(Data):
        instrument_id: InstrumentId = InstrumentId.from_str("SPY.ARCA")
        composite_score: float = 0.0
        bias: str = ""
        conviction: str = ""
        # 78-feature vector as a JSON string; later examples on this page read and write it.
        features_json: str = ""

Pros: zero boilerplate, replay-deterministic via ts_event/ts_init,
catalog-persistable, bus-routable, immutable post-publish.
Cons: Arrow encode/decode goes through GIL + Arrow C FFI per batch (measurable but small at Cortana volumes); 78-feature flat-vector schema is awkward (use a JSON-string column or split into top-N columns).
2. @customdataclass_pyo3 decorator (Rust-backed payload)
Same authoring shape as @customdataclass but the underlying wrapper is a
native same-binary Rust payload registered via
register_custom_data_class(MyType). Registration precedence (per the
custom_data doc) resolves the native Rust path first, falling back to
the Python wrapper only if no native handler exists. JSON and Arrow go
through native Rust handlers - no GIL hop, no FFI bridge. This is the
fast path.
When to reach for it: only if profiling shows the Python encode path is a
bottleneck. For Cortana’s ~1-10 events/sec scoring rate, the pure-Python
@customdataclass is fine.
3. publish_signal - ephemeral primitive notifications
    self.publish_signal("score_alert", 67.5, ts_event)
    self.subscribe_signal("score_alert")

    def on_signal(self, signal):
        print("Signal", signal)

Signals carry a single primitive (str/float/int/bool/bytes) plus a
name and a ts_event. They flow on the bus but do not auto-register
with the catalog - they’re not persisted, not replay-deterministic across
runs unless you build your own sink, and not structured.
Use only for: debug pings, dashboard heartbeats, ephemeral cooldown flags.
Never use for: anything you want to query, replay, or audit.
4. Custom DataClient - bytes-in, normalized-Data-out
A LiveDataClient / LiveMarketDataClient (Rust core + PyO3 binding +
Python factory, per nautilus-developer-guide.md) that opens the
WebSocket / REST connection, decodes wire bytes into a Data subclass
(usually a @customdataclass instance), and pushes through
_handle_data(event). The DataEngine then runs the cache-then-publish
sequence (per nautilus-architecture.md).
This is where UWFlowAlert originates in MK3 - the UW WebSocket adapter
is not an Actor (it is bytes-in), but every UWFlowAlert it emits is
authored as a @customdataclass.
5. Manual Data subclass with full schema control
The option-Greeks pattern (sketched in nautilus-data.md): subclass
Data, hand-write to_dict/from_dict/to_bytes/from_bytes/schema,
explicitly call register_serializable_type(...) for bus serialization
and register_arrow(...) for catalog persistence. Equivalent to what
@customdataclass generates, but you control the Arrow schema exactly
(useful for nested types, pa.Map, large binary blobs).
Use only when @customdataclass’s flat-scalar schema doesn’t fit your
data shape.
Anti-pattern: raw self.msgbus.publish(topic, message)
Bypasses ts_event/ts_init ordering, no catalog persistence, typo-prone
topic strings. The nautilus-concepts.md doc warns explicitly: “you must
track topic names manually (typos could result in missed messages).” Use
only for one-off internal plumbing where ordering doesn’t matter.
DataRegistry - the runtime resolver
crates/model/src/data/registry.rs is the central resolver. Singletons
(OnceLock-initialized DashMap) keyed by type_name:
- JSON deserializers keyed by type_name.
- Arrow schemas, encoders, and decoders keyed by type_name.
- Python extractors that convert a Python object into Arc<dyn CustomDataTrait>.
- Rust extractor factories that produce Python extractors for same-binary types.
Concurrent registration is safe: registration uses atomic
DashMap::entry() so concurrent register_* and ensure_* calls do not
race. The registry is what makes “no hardcoded schemas in the binary”
possible - the catalog write path looks up the encoder by type_name
extracted from DataType, not by static dispatch.
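To make the lookup-by-name idea concrete, here is a pure-Python analogue of a type_name-keyed registry. It is a sketch under stated assumptions: ToyRegistry, ensure_decoder, and decode are hypothetical names, a lock-guarded dict stands in for the Rust DashMap, and the first-registration-wins policy is a choice made for the sketch, not a statement about how the real registry resolves duplicates.

```python
import threading
from typing import Callable, Dict

Decoder = Callable[[dict], object]


class ToyRegistry:
    """Illustrative stand-in for a type_name-keyed handler registry."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._decoders: Dict[str, Decoder] = {}

    def ensure_decoder(self, type_name: str, decoder: Decoder) -> Decoder:
        # Insert-if-absent under a lock: the first registration wins and
        # later callers get the already-registered handler back.
        with self._lock:
            return self._decoders.setdefault(type_name, decoder)

    def decode(self, type_name: str, payload: dict) -> object:
        # Lookup is dynamic, by name, so an unregistered type fails at read time.
        try:
            decoder = self._decoders[type_name]
        except KeyError:
            raise LookupError(f"no decoder registered for {type_name!r}") from None
        return decoder(payload)


def decode_score(payload: dict) -> tuple:
    return ("ScoreUpdate", payload["composite_score"])


registry = ToyRegistry()
registry.ensure_decoder("ScoreUpdate", decode_score)
print(registry.decode("ScoreUpdate", {"composite_score": 67.5}))
```

The failure mode in decode is the same one the replay caveats describe: a reader process that never imported the type has nothing to look up.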
CustomData - the PyO3 wrapper
The outer CustomData wrapper is the common container that crosses the
FFI boundary. Constructor: CustomData(data_type, data) where DataType
is first, payload second.
It contains:
- A DataType.
- An inner custom payload implementing CustomDataTrait (wrapped in Arc<dyn CustomDataTrait>).
- Timestamps (ts_event, ts_init) delegated to the inner CustomDataTrait and exposed as properties on the wrapper.
Python-side semantics: __eq__ and __repr__ are implemented (equality
uses the Rust PartialEq logic). Instances are intentionally
unhashable so equality stays consistent with payload comparison.
Two backends for the inner payload:
- PythonCustomDataWrapper - used for pure-Python custom data. Stores a reference to the Python object, caches ts_event/ts_init/type_name, calls Python methods for JSON / Arrow ops under the GIL.
- Native same-binary Rust payload - concrete Rust type, downcastable directly from Arc<dyn CustomDataTrait>. No Python callback path.
User code only sees CustomData - the same API regardless of backend.
DataType - routing identity vs storage path
DataType(type_name, metadata=None, identifier=None).
The critical rule from the doc:
“Equality, hashing, and topic routing are derived from type_name and metadata only. Two DataType values with the same type name and metadata but different identifiers compare equal and publish to the same message bus topic. The identifier affects only the storage path under data/custom/<type_name>/<identifier...>.”
So: routing topic = (type_name, metadata). Catalog path = adds
identifier. This means a Cortana subscriber for
UWFlowAlert(metadata={"underlying": "SPY"}) will receive every alert
regardless of whether the publisher tagged a specific identifier like
expiry=2026-05-08.
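The split between routing identity and storage path can be modelled with a plain dataclass. This is a sketch, not the Nautilus DataType: ToyDataType is a hypothetical name, the topic string format is invented for illustration, and no identifier normalization is applied.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass(frozen=True)
class ToyDataType:
    type_name: str
    metadata: Tuple[Tuple[str, str], ...] = ()
    # compare=False keeps identifier out of the generated __eq__ and __hash__.
    identifier: Optional[str] = field(default=None, compare=False)

    @property
    def topic(self) -> str:
        # Hypothetical topic format; only type_name and metadata feed into it.
        meta = ".".join(f"{key}={value}" for key, value in self.metadata)
        return f"data.{self.type_name}" + (f".{meta}" if meta else "")

    @property
    def storage_path(self) -> str:
        # The identifier only extends the storage path.
        base = f"data/custom/{self.type_name}"
        return f"{base}/{self.identifier}" if self.identifier else base


spy = (("underlying", "SPY"),)
subscriber = ToyDataType("UWFlowAlert", spy)
publisher = ToyDataType("UWFlowAlert", spy, identifier="expiry=2026-05-08")

print(subscriber == publisher, subscriber.topic == publisher.topic)
print(publisher.storage_path)
```

The two values compare equal and share a topic, yet the publisher's data lands under its own directory, which is the behaviour the quoted rule describes.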
CustomData JSON envelope
When serialized to JSON (e.g. to_json_bytes/from_json_bytes, SQL cache,
Redis), CustomData uses a single canonical envelope so deserialization
does not depend on user payload field names:
- type - the custom type name (from CustomDataTrait::type_name).
- data_type - object with type_name, metadata, optional identifier.
- payload - the inner payload only (result of CustomDataTrait::to_json parsed as a value). Registered deserializers receive only this value in from_json, so user structs can use any field names (including value) without conflicting with wrapper metadata.
This is what makes a Redis bus hop or restart-replay safe even when your
custom-data class has a field named metadata or data_type.
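A minimal sketch of the envelope shape, using only the three keys named above. wrap_envelope and unwrap_envelope are hypothetical helpers written for this page; the real serialization lives in the Rust core and may differ in detail (key order, optional fields, encoding).

```python
import json
from typing import Any, Dict, Optional, Tuple


def wrap_envelope(
    type_name: str,
    payload: Dict[str, Any],
    metadata: Optional[Dict[str, str]] = None,
    identifier: Optional[str] = None,
) -> bytes:
    """Nest the user payload under its own key so field names cannot clash."""
    data_type: Dict[str, Any] = {"type_name": type_name, "metadata": metadata or {}}
    if identifier is not None:
        data_type["identifier"] = identifier
    envelope = {"type": type_name, "data_type": data_type, "payload": payload}
    return json.dumps(envelope, sort_keys=True).encode("utf-8")


def unwrap_envelope(raw: bytes) -> Tuple[str, Dict[str, Any], Dict[str, Any]]:
    """Return (type, data_type, payload); a deserializer would only see payload."""
    envelope = json.loads(raw)
    return envelope["type"], envelope["data_type"], envelope["payload"]


# A payload whose field names collide with wrapper-level names on purpose.
payload = {"metadata": "user-owned", "data_type": "also user-owned", "value": 1.0}
raw = wrap_envelope("EmaDecayValue", payload, metadata={"underlying": "SPY"})
type_name, data_type, restored = unwrap_envelope(raw)
print(restored == payload)
```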
Serialization paths - which extension point supports what
| Extension point | Bus pub/sub | Cache (in-memory) | Cache DB (Redis/Postgres) | Parquet catalog | Arrow schema |
|---|---|---|---|---|---|
| @customdataclass | yes | yes | yes (via JSON envelope) | yes (Python encode via Arrow C FFI) | auto-generated flat |
| @customdataclass_pyo3 | yes | yes | yes | yes (native Rust encode) | native |
| publish_signal | yes | yes (transient) | no (not auto-registered) | no | n/a |
| Custom DataClient (emitting custom types) | yes | yes | yes | yes | inherits from emitted type |
| Manual Data subclass + register_serializable_type + register_arrow | yes | yes | yes | yes | hand-written |
| Raw msgbus.publish(topic, ...) | yes | no | no | no | n/a |
The asymmetry of publish_signal is the load-bearing distinction: a signal
is alive for the duration of the bus dispatch, then gone. A
@customdataclass event is alive forever in the catalog, replayable into
any future backtest, queryable by DataType, and survives a Redis bus
restart.
Subscription patterns
From Strategies
    from nautilus_trader.model.data import DataType
    from nautilus_trader.model.identifiers import ClientId

    # Inside a Strategy method such as on_start():
    self.subscribe_data(
        data_type=DataType(ScoreUpdate, metadata={"underlying": "SPY"}),
        client_id=ClientId("SCORING_ACTOR"),
    )

    def on_data(self, data: Data) -> None:
        if isinstance(data, ScoreUpdate):
            self._handle_score(data)

on_data is the catch-all handler for custom data - type-check inside the
handler. Strategies inherit this from Actor.
From Actors
Identical signature. Actors typically subscribe in on_start():
    class MetaGateActor(Actor):
        def on_start(self) -> None:
            self.subscribe_data(
                data_type=DataType(ScoreUpdate, metadata={"underlying": "SPY"}),
            )

        def on_data(self, data: Data) -> None:
            if isinstance(data, ScoreUpdate):
                meta_prob = self._meta_model.predict(data.features_json)
                self.publish_data(
                    DataType(MetaProb, metadata={"underlying": "SPY"}),
                    MetaProb(
                        instrument_id=data.instrument_id,
                        meta_prob=meta_prob,
                        ts_event=data.ts_event,
                        ts_init=self.clock.timestamp_ns(),
                    ),
                )

Pattern: Actor subscribes to upstream custom data, computes a derived
event, publishes downstream custom data. Strategy subscribes to the final
gate event. This is the chain nautilus-actors.md recommends for the
spike - one ScoringActor → one MetaGateActor → one CortanaStrategy.
Signals (different handler)
    self.subscribe_signal("score_alert")

    def on_signal(self, signal):
        # signal carries a single primitive value
        ...

Signals route to on_signal, NOT on_data. Don’t conflate the two.
Replay semantics - does custom data round-trip cleanly?
Yes, with caveats. The catalog write/read flow is symmetric:
Write path
- ParquetDataCatalog receives a CustomData value.
- Extracts type_name, metadata, identifier from DataType.
- Looks up Arrow encoder in DataRegistry.
- Encodes values to a RecordBatch.
- Appends a data_type column containing the persisted DataType.
- Attaches type_name and metadata to Arrow schema.
- Writes batch to Parquet under data/custom/<type_name>/<identifier...>.
Identifiers are normalized before becoming path segments.
Read path
- Catalog reads matching Parquet files.
- Extracts type_name from schema metadata.
- Asks DataRegistry for the registered decoder.
- Decodes RecordBatch into Vec<Data>.
- Reconstructs CustomData with the original DataType.
Caveat 1 - registration must match. The decoder is looked up by
type_name at read time, so the consuming process must have registered
the same custom-data class (@customdataclass re-import, or
@customdataclass_pyo3 registration) before the catalog query runs.
If the type isn’t registered, the read fails. This is why custom-data
class definitions belong in a shared module imported by both the writer
and the reader (the strategy, the actor, the backtest harness, the
brain-logger).
Caveat 2 - feather-to-parquet conversion. When converting a Feather
stream to Parquet (e.g. after a backtest), the custom-data branch
decodes batches and re-writes them via write_custom_data_batch so
custom data written through the Feather writer is correctly converted
to Parquet. The conversion is automatic but requires the type to be
registered in the converting process.
Caveat 3 - pure-Python encode goes through Arrow C FFI. For
@customdataclass (not @customdataclass_pyo3), encode/decode hops the
GIL once per batch, exporting/importing via RecordBatch._export_to_c /
RecordBatch._import_from_c. Per-batch overhead is small but real.
Caveat 4 - schema evolution is not handled by the framework. If you
add a field to a @customdataclass, old Parquet files do not
auto-migrate. Plan for this in MK3: version the type name (ScoreUpdateV1,
ScoreUpdateV2) or write a migration path.
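One way to picture the versioned-type-name approach is a table of upgrade steps keyed by the old name. This is a sketch only: UPGRADES, upgrade_v1_to_v2, and upgrade_row are hypothetical, rows are modelled as plain dicts rather than Arrow batches, and the choice of features_json as the field added in V2 is an assumption made for the example.

```python
from typing import Any, Callable, Dict, Tuple

Row = Dict[str, Any]


def upgrade_v1_to_v2(row: Row) -> Row:
    """Assumed change for illustration: V2 adds a features_json column."""
    upgraded = dict(row)  # never mutate the row that was read from disk
    upgraded.setdefault("features_json", "{}")
    return upgraded


# Old type name -> (new type name, function that upgrades one row).
UPGRADES: Dict[str, Tuple[str, Callable[[Row], Row]]] = {
    "ScoreUpdateV1": ("ScoreUpdateV2", upgrade_v1_to_v2),
}


def upgrade_row(type_name: str, row: Row) -> Tuple[str, Row]:
    """Apply upgrade steps until the type name has no newer version."""
    while type_name in UPGRADES:
        next_name, step = UPGRADES[type_name]
        row = step(row)
        type_name = next_name
    return type_name, row


print(upgrade_row("ScoreUpdateV1", {"composite_score": 67.5}))
```

Because each version keeps its own type name, old Parquet files stay readable by the old class while the migration script rewrites them at its own pace.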
ts_event vs ts_init - the contract
Every Data subclass - built-in or custom - carries two UNIX-nanosecond
timestamps:
- ts_event - when the event actually occurred at the source (exchange, vendor WebSocket, scoring computation moment).
- ts_init - when Nautilus initialized the internal object representing that event.
Practical rules:
- latency = ts_init - ts_event gives total system latency. Clocks aren’t synchronized, so it’s an estimate and is not guaranteed to be positive.
- For internally-generated custom data (e.g., ScoreUpdate produced inside the runtime), ts_event and ts_init may legitimately be equal - both are self.clock.timestamp_ns() at the moment of computation.
- For UW-sourced data, ts_event should be the UW WebSocket payload’s vendor timestamp (in nanoseconds - UW gives milliseconds, so multiply by 1_000_000 using integer arithmetic); ts_init is self.clock.timestamp_ns() at the moment the DataClient deserializes the frame.
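The timestamp rules reduce to two small integer helpers. This is a sketch with hypothetical names (ms_to_ns, latency_ns) and an arbitrary example timestamp; the point it illustrates is that the ms → ns conversion should stay in integers, because a float multiplication cannot represent every nanosecond at this magnitude.

```python
NANOS_PER_MILLI = 1_000_000


def ms_to_ns(ts_ms: int) -> int:
    """Convert a vendor millisecond timestamp to UNIX nanoseconds, exactly."""
    return int(ts_ms) * NANOS_PER_MILLI


def latency_ns(ts_event: int, ts_init: int) -> int:
    """Estimated source-to-runtime latency; negative when the clocks disagree."""
    return ts_init - ts_event


vendor_ts_ms = 1_746_600_000_123  # arbitrary example value
ts_event = ms_to_ns(vendor_ts_ms)
ts_init = ts_event + 2_500_000  # pretend the frame was decoded 2.5 ms later

print(latency_ns(ts_event, ts_init))
# Float arithmetic rounds to the nearest representable double, so the
# float-based conversion does not reproduce the exact integer result.
print(int(vendor_ts_ms * 1e6) == ts_event)
```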
Backtest replay ordering
From nautilus-data.md: backtest data is ordered by ts_init using a
stable sort. This is the property MK3 needs for decisions.db replay.
Live mode processes data as it arrives.
Open question carryover #7 - nanosecond-tie ordering
The spike plan’s open question #7 asks: does Nautilus’s stable-sort
handle multiple events at the same ts_init nanosecond deterministically?
The custom_data doc does NOT explicitly address this. What we know:
- The architecture is single-threaded dispatch on the kernel core (nautilus-architecture.md) → event order on the bus = order they were placed on the bus.
- Backtest sort is documented as stable sort by ts_init (nautilus-data.md).
- Stable sort preserves insertion order within tie groups - meaning ordering among ties depends on the order data was fed into the BacktestEngine.
- BacktestDataConfig reads from the catalog in time order, but Parquet files don’t guarantee within-file row ordering across writes unless you explicitly sort on write.
Implication for MK3: ties are deterministic per-run if the input
ordering is deterministic, but cross-run determinism depends on
whether the catalog read produces identical row order every time. For
decisions.db replay where multiple ScoreUpdate events legitimately
share a millisecond (and therefore a ts_init after the ms→ns
conversion), this could drift if the Parquet writer doesn’t preserve
order.
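The dependence on input order is easy to see with Python's own stable sort. This sketch says nothing about the Nautilus implementation; it only demonstrates the general property that a stable sort keeps tied elements in the order they arrived, so a different read order yields a different replay order. The event names and replay_order are made up for the demonstration.

```python
events = [
    {"name": "a", "ts_init": 2_000_000},
    {"name": "b", "ts_init": 1_000_000},
    {"name": "c", "ts_init": 2_000_000},
    {"name": "d", "ts_init": 1_000_000},
]


def replay_order(rows):
    # sorted() is stable: rows with equal ts_init keep their input order.
    return [row["name"] for row in sorted(rows, key=lambda row: row["ts_init"])]


print(replay_order(events))                  # ties resolved as b, d and a, c
print(replay_order(list(reversed(events))))  # same data, ties flipped
```

Both outputs are valid stable sorts of the same data, which is why the catalog's row order on read is the thing to verify.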
Spike action item (Step 6): verify experimentally - write 100
custom-data events with identical ts_init, query the catalog, confirm
the read order matches the write order across multiple runs. If it
doesn’t, the migration path is to mint synthetic monotonic ts_init
during decisions.db → Parquet ingest (add an artificial nanosecond
offset per event within a millisecond bucket).
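The fallback described in the action item could look like the following. It is a sketch under assumptions: mint_monotonic_ts_init is a hypothetical helper for the decisions.db → Parquet ingest, and it expects its input already in the intended replay order (non-decreasing timestamps, ties in the order they should replay).

```python
from typing import Iterable, List


def mint_monotonic_ts_init(ts_inits: Iterable[int]) -> List[int]:
    """Bump repeated timestamps by 1 ns so the output is strictly increasing."""
    minted: List[int] = []
    previous = None
    for ts in ts_inits:
        if previous is not None and ts <= previous:
            # Tie (or overlap caused by an earlier bump): step one nanosecond forward.
            ts = previous + 1
        minted.append(ts)
        previous = ts
    return minted


same_millisecond = [1_000_000, 1_000_000, 1_000_000, 2_000_000]
print(mint_monotonic_ts_init(same_millisecond))
```

With unique ts_init values the replay order no longer depends on tie handling, at the cost of timestamps that are synthetic below the millisecond.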
Code-reading task: the actual sort lives in
crates/data/src/engine/mod.rs (or similar). Confirm during the spike.
Cortana MK3 implications - extension-point mapping per type
The 9-page sibling sweep established that five custom-data classes will live in MK3. Mapping each to the right extension point:
UWFlowAlert - UW WebSocket flow alert
Extension point: @customdataclass, emitted by a custom UW
LiveDataClient.
Why: structured event (strike, side, size, premium, sweep/block flags), needs to be persisted for replay (decisions.db is replaying these alerts indirectly via score updates today), needs to round-trip through the catalog so backtest replays the same UW event sequence.
Why NOT publish_signal: signal can carry one primitive. UW alerts
have ~10 fields. Hard requirement → structured.
Why NOT @customdataclass_pyo3: Python is fine at UW’s ~1-10
alerts/sec rate. Reach for the Rust-backed version only if profiling
shows Arrow encode is a bottleneck during catalog writes - not likely
at Cortana volume.
ScoreUpdate - composite scoring engine output
Extension point: @customdataclass, published by a ScoringActor.
Why: the load-bearing audit event (78 features + composite + bias +
conviction). Needs catalog persistence for postmortem replay; needs
strict ts_event/ts_init ordering for backtest determinism. A signal
would lose all the structure.
Schema awkwardness: the 78-feature vector is the hard part. Three
options (per nautilus-data.md):
- (a) Serialize as JSON string in one column. Easy. Loses query-ability via DataFusion where= predicates.
- (b) 78 separate Arrow columns. Clean. Rigid - adding a feature requires a schema migration.
- (c) Arrow Map type. Flexible. PyArrow backend only.
Recommendation for spike: (a) JSON string column. If we need DataFusion filtering on individual features post-spike, migrate to (b) with a versioned class name.
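Option (a) amounts to a pair of JSON helpers around the feature dict. A minimal sketch, assuming the features are a flat mapping of name to number; encode_features and decode_features are hypothetical names, and the feature keys below are placeholders rather than real Cortana features.

```python
import json
from typing import Dict


def encode_features(features: Dict[str, float]) -> str:
    # sort_keys makes the string independent of dict insertion order, so
    # identical feature sets always serialize to identical column values.
    return json.dumps(features, sort_keys=True, separators=(",", ":"))


def decode_features(features_json: str) -> Dict[str, float]:
    # An empty string (the field default) decodes to an empty dict.
    return json.loads(features_json) if features_json else {}


features_json = encode_features({"flow_b": 1.5, "flow_a": 2})
print(features_json)
print(decode_features(features_json))
```

Sorting keys is a design choice for byte-stable rows, which makes diffing two replays of the same day simpler.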
MetaProb - secondary classifier output
Extension point: @customdataclass, published by a MetaGateActor
that subscribes to ScoreUpdate.
Why separate event vs field on ScoreUpdate? Open question from
nautilus-strategies.md. Two factors decide:
- Cadence. If meta-prob runs synchronously inside the scoring actor, fold it as a field on ScoreUpdate. If it runs on a separate timer (e.g., heavy ML inference batched every 500ms), separate MetaProb event.
- Subscriber graph. If the strategy is the only consumer and always pairs ScoreUpdate + meta_prob 1:1, fold it. If the dashboard / logger / experiment harness wants meta-prob without carrying the 78-feature vector, separate it.
Recommendation for spike: start as a field on ScoreUpdate (one
event, simpler). Split out only if the meta-model adds latency and we
want to decouple cadence.
EmaDecayValue - EMA-decay flow value
Extension point: field on ScoreUpdate (default), OR
@customdataclass event if it updates independently of score
recomputation.
Why: EMA decay is derived state, recomputed on every flow event. If it’s recomputed in lock-step with scoring, fold the field. If a separate Actor updates it on an independent timer, separate event.
Recommendation for spike: field on ScoreUpdate. Promote to
EmaDecayUpdate(Data) only if the cadence diverges.
RegimeChange - chop ↔ trend ↔ power-hour reclassification
Extension point: @customdataclass, published by a RegimeDetector
Actor.
Why a separate event: regime changes are infrequent (a few per day),
asynchronous to scoring (driven by their own rolling-window detector),
and broadly subscribed (strategy, position-manager, dashboard, logger
all want to know). Folding into ScoreUpdate would force every score
event to re-publish a regime field most subscribers don’t need. Separate
event = clean subscriber graph.
Why NOT publish_signal: regime is not just a string label - it’s
the regime + the confidence + the prior regime + the trigger
classification. Signal can carry one of those, not the bundle.
Summary table
| Cortana type | Extension point | Why |
|---|---|---|
| UWFlowAlert | @customdataclass (via custom DataClient) | Structured UW WebSocket payload, needs catalog persistence |
| ScoreUpdate | @customdataclass | Audit-trail event, 78 features, bus + catalog + replay |
| MetaProb | Field on ScoreUpdate (default) OR @customdataclass if cadence diverges | Synchronous meta-prob = field; async batched = separate event |
| EmaDecayValue | Field on ScoreUpdate (default) OR @customdataclass if independent | Lock-step with score = field; independent timer = separate |
| RegimeChange | @customdataclass | Infrequent, asynchronous, broadly subscribed |
Worked example - UWFlowAlert lifecycle
End-to-end the way it’ll work in MK3:
    # In cortana/mk3/data/uw_types.py - shared module imported everywhere
    from nautilus_trader.core import Data
    from nautilus_trader.model import InstrumentId
    from nautilus_trader.model.custom import customdataclass


    @customdataclass
    class UWFlowAlert(Data):
        instrument_id: InstrumentId = InstrumentId.from_str("SPY.ARCA")
        strike: float = 0.0
        expiry: str = ""
        option_side: str = ""  # "CALL" | "PUT"
        aggressor_side: str = ""  # "BUY" | "SELL" | "MIXED"
        premium_usd: float = 0.0
        size_contracts: int = 0
        is_sweep: bool = False
        is_block: bool = False
        flow_score: float = 0.0
        underlying_price: float = 0.0
        raw_id: str = ""


    # In cortana/mk3/adapters/uw/data_client.py
    from nautilus_trader.live.data_client import LiveMarketDataClient


    class UWLiveDataClient(LiveMarketDataClient):
        async def _on_message(self, frame: dict) -> None:
            alert = UWFlowAlert(
                instrument_id=InstrumentId.from_str("SPY.ARCA"),
                strike=frame["strike"],
                expiry=frame["expiry"],
                option_side=frame["side"],
                aggressor_side=frame["aggressor"],
                premium_usd=frame["premium"],
                size_contracts=frame["size"],
                is_sweep=frame["is_sweep"],
                is_block=frame["is_block"],
                flow_score=frame["score"],
                underlying_price=frame["underlying"],
                raw_id=frame["alert_id"],
                ts_event=int(frame["ts_ms"]) * 1_000_000,  # ms → ns
                ts_init=self._clock.timestamp_ns(),
            )
            # _handle_data routes through DataEngine → Cache → MessageBus
            self._handle_data(alert)


    # In cortana/mk3/actors/scoring_actor.py
    import json

    from nautilus_trader.common.actor import Actor
    from nautilus_trader.model.data import DataType


    class ScoringActor(Actor):
        def on_start(self) -> None:
            self.subscribe_data(
                data_type=DataType(UWFlowAlert, metadata={"underlying": "SPY"}),
            )

        def on_data(self, data: Data) -> None:
            if isinstance(data, UWFlowAlert):
                self._update_flow_state(data)
                score = self._compute_composite()
                self.publish_data(
                    DataType(ScoreUpdate, metadata={"underlying": "SPY"}),
                    ScoreUpdate(
                        instrument_id=data.instrument_id,
                        composite_score=score.value,
                        bias=score.bias,
                        conviction=score.conviction,
                        features_json=json.dumps(score.features),
                        ts_event=data.ts_event,  # propagate source event time
                        ts_init=self.clock.timestamp_ns(),  # init = now
                    ),
                )


    # In cortana/mk3/strategies/cortana_strategy.py
    from nautilus_trader.trading.strategy import Strategy


    class CortanaStrategy(Strategy):
        def on_start(self) -> None:
            self.subscribe_data(
                data_type=DataType(ScoreUpdate, metadata={"underlying": "SPY"}),
            )

        def on_data(self, data: Data) -> None:
            if isinstance(data, ScoreUpdate):
                if data.composite_score >= 65 and data.bias == "BULL":
                    self._submit_bull_call(data)

The same UWFlowAlert and ScoreUpdate classes are imported by:
- the live UWLiveDataClient,
- the live ScoringActor,
- the live CortanaStrategy,
- the backtest harness,
- the catalog migration script (decisions.db → Parquet),
- the brain-logger Actor,
- any out-of-process dashboard subscriber.
One definition per type, seven consumers. The DataRegistry registration
happens once per process at module import time.
Anti-patterns to avoid
- Using publish_signal for anything you’d want to query later. Signals are not catalog-persisted. If the spike’s success criteria include “replay 2026-04-16 chop-day cluster”, a signal-based implementation cannot be replayed.
- Skipping the shared types module. If UWFlowAlert is defined in the producer’s module and re-defined in the consumer’s, the DataRegistry registers two distinct type_name values and the bus topic mismatch silently drops every message. Define once, import everywhere.
- Mutating a published CustomData instance. nautilus-architecture.md is explicit: messages are immutable post-publish. Derive new local state if you need a different shape.
- Putting the 78-feature vector in 78 separate cache writes. One ScoreUpdate per score event, not 78 partial updates. The doc warns: stable-sort by ts_init makes 78 simultaneous events ordering-ambiguous within ties (see open question #7).
- Using @customdataclass_pyo3 as a premature optimization. Reach for it only after profiling shows pure-Python @customdataclass is the bottleneck. At Cortana’s event rate (low double digits per second), it almost certainly isn’t.
- Forgetting ts_event for internally-generated events. Even a ScoreUpdate computed inside the runtime needs a sensible ts_event - propagate the source event’s ts_event (the UW alert that triggered the score), don’t just set both to “now”. Otherwise replay-ordered backtests lose the cause-effect timing.
- Custom topic strings via msgbus.publish for Cortana events. Use publish_data with a DataType. The string-topic path is typo-prone and bypasses replay determinism.
When this concept applies
- Defining any new structured event type for Cortana MK3 (scoring, gating, regime, premium-flow, position telemetry, exit decisions).
- Authoring the UW custom adapter - every UW-sourced object becomes a custom-data class.
- Migrating decisions.db rows to Parquet - each row class becomes a custom-data class with a registered Arrow schema.
- Backtest replay of historical Cortana decisions.
- Any out-of-band subscriber (dashboard, brain-logger, Telegram notifier) that consumes Cortana events.
When this concept does NOT apply
- Built-in market data (QuoteTick, TradeTick, Bar, OrderBookDelta, OptionGreeks). Those have hardcoded Rust schemas and don’t go through the DataRegistry custom path.
- Order/Position/Account events (OrderFilled, PositionOpened, AccountState). Those are Events on the MessageBus event topic, not custom data. See nautilus-events.md for the distinction - Events flow with their own dispatch ladder (on_order_filled → on_order_event → on_event); custom Data flows to on_data.
- Component lifecycle transitions (STARTING / RUNNING / DEGRADED). Those are FSM transitions on the Component, not bus events. Wrap them in a custom EngineHealth event if you want them in the audit trail.
- Logging messages. The logging subsystem has its own MPSC channel; don’t conflate.
- Cython @customdataclass. The custom_data doc explicitly separates this - the Cython decorator is a different system from the PyO3 CustomData architecture this page describes. If you find a Nautilus example using Cython @customdataclass, it’s older; the PyO3 path is the current architecture.
See Also
- Nautilus Data Model - broader data model, built-in types, ParquetDataCatalog, ts_event/ts_init basics.
- Nautilus Architecture - single-threaded dispatch, cache-then-publish, kernel topology.
- Nautilus Actors - actors publish custom data; decision rule for “Actor vs. DataClient vs. Strategy”.
- Nautilus Strategies - strategies subscribe to custom data; publish_signal vs publish_data discussion.
- Nautilus Events - Events vs custom Data distinction; audit-logger Actor sketch.
- Nautilus Developer Guide - custom adapter Phase 1-7, Cortana → Nautilus translation table.
- Spike plan: ~/conductor/workspaces/cortanaroi-mk2/belo-horizonte/plans/2026-05-09-nautilus-spike.md
- Source: https://nautilustrader.io/docs/latest/concepts/custom_data/
Timeline
- 2026-05-07 | Cody - Filed during pre-spike concept mastery sweep batch 2.