Nautilus Cache

The Nautilus Cache is a single, central, in-memory database that holds every piece of trading-related state a node needs in flight: built-in market data (quotes, trades, bars, books), the full order/position/account/instrument graph, and arbitrary user-defined objects shared across strategies. The DataEngine and ExecutionEngine are the only writers - the framework guarantees a cache-then-publish ordering for quotes/trades/bars so any subscriber handler can read the very value that triggered it. Strategies and Actors read via self.cache.*, never write trading state directly. Persistence is configurable per node: in-memory only (default, with capacity eviction), or backed by Redis via DatabaseConfig for crash-only restart, multi-process state sharing, and out-of-band consumers (dashboard, brain logger). For Cortana MK3 this collapses every hand-rolled cache (in-memory dicts, SQLite tables, Pickle blobs, lru_cache) into one runtime object with one ordering invariant - and directly addresses the failure modes from project_data_loss_april22 and the 2026-05-06 power-outage state divergence.

This page specializes Nautilus Concepts (which covers the Cache at the architectural level, alongside MessageBus, DataEngine, ExecutionEngine, RiskEngine) and is the parallel of Nautilus Architecture (which covers cache-then-publish ordering at high level). For data-specific patterns (custom Data subclasses, ParquetDataCatalog, ts_event/ts_init) see Nautilus Data. For who reads the cache (Strategy queries vs Actor queries, identical surface) see Nautilus Strategies and Nautilus Actors.

Why this page exists

nautilus-architecture.md describes cache-then-publish at the topology level. nautilus-data.md covers what lands in cache. This page is the API surface and operational reference: every read method, every write trigger, every persistence option, every ordering guarantee, and the multi-tenant question for the SaaS roadmap. It’s the saturation-level brain page Cody will keep open during Steps 4-7 of the 2026-05-09 spike.

Core claim

“The Cache is a central in-memory database that stores and manages all trading-related data, from market data to order history to custom calculations.”

One object. One read API. One write discipline. Three categories of content (market data, trading records, custom objects). Optional Redis externalization. Single-threaded write ordering with cache-then-publish for the data-stream path.

This is the single biggest architectural delta from MK2. Cortana today has at least seven state stores (in-memory dicts in app.py, SQLite in cortanaroi/db/decisions.py, Pickle blobs for cooldown, a separate Pickle for impulse history, an lru_cache on UW REST, IBKR’s internal cache, and the dashboard’s redundant SQL view). MK3 collapses all of them into one Cache instance with one consistent contract.

Scope - what the Cache stores

Three top-level categories:

1. Market data

“Stores recent market history (e.g., order books, quotes, trades, bars). Gives you access to both current and historical market data for your strategy.”

Specifically:

| Data type | Cache method (read) | Capacity-bounded? |
|---|---|---|
| OrderBook (live state) | order_book(instrument_id) | per-instrument |
| QuoteTick history | quote_ticks(instrument_id), quote_tick(id, index=0) | yes (tick_capacity, default 10,000) |
| TradeTick history | trade_ticks(instrument_id), trade_tick(id, index=0) | yes (tick_capacity, default 10,000) |
| Bar history | bars(bar_type), bar(bar_type, index=0) | yes (bar_capacity, default 10,000 per bar type) |
| Latest price | price(instrument_id, price_type=...) | |

Quote: “Each bar type maintains its own separate capacity. For example, if you’re using both 1-minute and 5-minute bars, each stores up to bar_capacity bars.”

Eviction: “When bar_capacity is reached, the Cache automatically removes the oldest data.” No warning logged. If you need full history, write to ParquetDataCatalog (see nautilus-data.md) or raise the cap explicitly via CacheConfig.

Reverse indexing convention (load-bearing): “All market data in the cache uses reverse indexing, so the most recent entry sits at index 0.” Calling self.cache.bar(bar_type) with no index is equivalent to bar(..., index=0) and returns the most recent bar.
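
Reverse indexing and silent eviction can be modeled with a bounded deque. This is a toy stand-in for intuition only, not Nautilus's implementation; ToyBarStore and its method names are hypothetical.

```python
from collections import deque


class ToyBarStore:
    """Toy model of a capacity-bounded, reverse-indexed history (not the Nautilus Cache)."""

    def __init__(self, capacity: int) -> None:
        # With maxlen set, a full deque drops items from the opposite end on insert.
        self._items: deque = deque(maxlen=capacity)

    def add(self, item) -> None:
        # Newest entry goes to the left, so index 0 is always the latest.
        self._items.appendleft(item)

    def get(self, index: int = 0):
        # Point query: None on miss, mirroring the cache's return shape.
        return self._items[index] if 0 <= index < len(self._items) else None

    def count(self) -> int:
        return len(self._items)


store = ToyBarStore(capacity=3)
for close in [100.0, 101.0, 102.0, 103.0]:
    store.add(close)  # the fourth add silently evicts 100.0
```

After the loop, store.get() is the newest value, store.get(1) the one before it, and the oldest value is gone without any signal. That is the same symptom as bars(bar_type) returning fewer entries than expected once bar_capacity is exceeded.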

2. Trading records

“Maintains complete Order history and current execution state. Tracks all Positions and Account information. Stores Instrument definitions and Currency information.”

These are unbounded by default - no LRU eviction on orders/positions because the lifecycle is finite per-instance.

| Record | Cache method |
|---|---|
| Order (any state) | order(client_order_id) |
| Order collections | orders(), orders(venue=...), orders(strategy_id=...), orders(instrument_id=...) |
| Order state filters | orders_open(), orders_closed(), orders_emulated(), orders_inflight(), orders_active_local() |
| Position | position(position_id) |
| Position collections | positions(), positions(venue=...), positions(side=PositionSide.LONG), etc. |
| Account | account(account_id), account_for_venue(venue), account_id(venue) |
| Instrument | instrument(instrument_id), instruments(venue=...), instruments(underlying="ES"), instrument_ids(...) |

3. Custom user data

“You can store any user-defined objects or data in the Cache for later use. Enables data sharing between different strategies.”

Two surfaces:

Raw bytes-by-key (low-level, fast):

self.cache.add(key="my_key", value=b"some binary data")
stored = self.cache.get("my_key")  # bytes | None

Custom Data subclasses (preferred for any structured object): flow through DataEngine if published as Data, get cached automatically by type+identifier. See nautilus-data.md for the @customdataclass pattern. The Greeks-style example uses add(...) / get(...) with a key-naming convention:

def greeks_key(instrument_id):
    return f"{instrument_id}_GREEKS"
 
def cache_greeks(self, greeks_data):
    self.cache.add(greeks_key(greeks_data.instrument_id),
                   greeks_data.to_bytes())
 
def greeks_from_cache(self, instrument_id):
    raw = self.cache.get(greeks_key(instrument_id))
    return GreeksData.from_bytes(raw) if raw else None
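
The example above assumes GreeksData exposes to_bytes() / from_bytes(). A minimal sketch of one way such a type could serialize itself follows. The field names are hypothetical, JSON is just one possible encoding, and a plain dict stands in for self.cache.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class GreeksData:
    # Hypothetical fields; the real GreeksData layout is not given on this page.
    instrument_id: str
    delta: float
    gamma: float

    def to_bytes(self) -> bytes:
        return json.dumps(asdict(self)).encode("utf-8")

    @classmethod
    def from_bytes(cls, raw: bytes) -> "GreeksData":
        return cls(**json.loads(raw.decode("utf-8")))


def greeks_key(instrument_id: str) -> str:
    return f"{instrument_id}_GREEKS"


# A plain dict stands in for the cache's bytes-by-key store.
fake_cache = {}
greeks = GreeksData("SPY.ARCA", delta=0.52, gamma=0.03)
fake_cache[greeks_key(greeks.instrument_id)] = greeks.to_bytes()

raw = fake_cache.get(greeks_key("SPY.ARCA"))
restored = GreeksData.from_bytes(raw) if raw else None
missing = fake_cache.get(greeks_key("QQQ.NASDAQ"))  # None: never stored
```

The point of the key helper is that writer and reader derive the key the same way; the None check on read keeps a missing entry observable.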

Hard caveat from the doc: “The Cache is not designed to be a full database replacement. For large datasets or complex querying needs, consider using a dedicated database system.”

For MK3: cache is the runtime hot store. Historical bulk lives in ParquetDataCatalog. The brain (~/brain) and any SQL audit table are out-of-band sinks - not Cache substitutes.

Read API - full surface exposed to Strategy / Actor

self.cache is the same object on Actors and Strategies. The full read surface (collected from the doc’s API listing):

Market data reads

# Bars
self.cache.bars(bar_type)                              # list[Bar], reverse-indexed
self.cache.bar(bar_type)                               # Bar | None  (latest)
self.cache.bar(bar_type, index=1)                      # Bar | None  (second-latest)
self.cache.bar_count(bar_type)                         # int
self.cache.has_bars(bar_type)                          # bool
 
# Quotes
self.cache.quote_ticks(instrument_id)
self.cache.quote_tick(instrument_id)
self.cache.quote_tick(instrument_id, index=1)
self.cache.quote_tick_count(instrument_id)
self.cache.has_quote_ticks(instrument_id)
 
# Trades
self.cache.trade_ticks(instrument_id)
self.cache.trade_tick(instrument_id)
self.cache.trade_tick(instrument_id, index=1)
self.cache.trade_tick_count(instrument_id)
self.cache.has_trade_ticks(instrument_id)
 
# Books
self.cache.order_book(instrument_id)                   # OrderBook | None
self.cache.has_order_book(instrument_id)
self.cache.book_update_count(instrument_id)
 
# Prices (derived; price-type aware)
self.cache.price(instrument_id, price_type=PriceType.MID)
 
# Bar types catalog
self.cache.bar_types(
    instrument_id=instrument_id,
    price_type=PriceType.LAST,
    aggregation_source=AggregationSource.EXTERNAL,
)

Order reads

# Identity
self.cache.order(client_order_id)                      # Order | None
self.cache.order_exists(client_order_id)               # bool
 
# Collections
self.cache.orders()
self.cache.orders(venue=venue)
self.cache.orders(strategy_id=strategy_id)
self.cache.orders(instrument_id=instrument_id)
 
# State filters
self.cache.orders_open()
self.cache.orders_closed()
self.cache.orders_emulated()
self.cache.orders_inflight()
self.cache.orders_active_local()
 
# State predicates
self.cache.is_order_open(client_order_id)
self.cache.is_order_closed(client_order_id)
self.cache.is_order_emulated(client_order_id)
self.cache.is_order_inflight(client_order_id)
self.cache.is_order_active_local(client_order_id)
 
# Counts
self.cache.orders_open_count()
self.cache.orders_closed_count()
self.cache.orders_emulated_count()
self.cache.orders_inflight_count()
self.cache.orders_active_local_count()
self.cache.orders_total_count()
self.cache.orders_open_count(side=OrderSide.BUY)
self.cache.orders_total_count(venue=venue)

Position reads

# Identity
self.cache.position(position_id)                       # Position | None
self.cache.position_exists(position_id)
 
# Collections
self.cache.positions()
self.cache.positions_open()
self.cache.positions_closed()
self.cache.positions(venue=venue)
self.cache.positions(instrument_id=instrument_id)
self.cache.positions(strategy_id=strategy_id)
self.cache.positions(side=PositionSide.LONG)
 
# State predicates
self.cache.is_position_open(position_id)
self.cache.is_position_closed(position_id)
 
# Cross-references
self.cache.orders_for_position(position_id)            # list[Order]
self.cache.position_for_order(client_order_id)         # Position | None
 
# Counts
self.cache.positions_open_count()
self.cache.positions_closed_count()
self.cache.positions_total_count()
self.cache.positions_open_count(side=PositionSide.LONG)
self.cache.positions_total_count(instrument_id=instrument_id)

Account & instrument reads

self.cache.account(account_id)                         # Account | None
self.cache.account_for_venue(venue)
self.cache.account_id(venue)
 
self.cache.instrument(instrument_id)
self.cache.instruments()
self.cache.instruments(venue=venue)
self.cache.instruments(underlying="ES")
self.cache.instrument_ids()
self.cache.instrument_ids(venue=venue)

Custom data reads

self.cache.get(key)                                    # bytes | None

(For typed custom data published as Data subclasses, the platform handles cache by type+id automatically - see nautilus-data.md.)

Universal return shape

All point queries return None when missing. All collection queries return empty lists when nothing matches. There are no “default values” - the absence of data is always observable. This matters because the 2026-05-06 incident was partly rooted in stale fallbacks: callers got old values when fresh ones should have been None.
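
A short sketch of what "absence is observable" means for calling code. StubCache and StubQuote are hypothetical stand-ins so the snippet runs without a node; only the quote_tick(instrument_id) call shape mirrors the read API listed above.

```python
from typing import Optional


class StubQuote:
    """Hypothetical stand-in for a quote, with bid/ask as plain floats."""

    def __init__(self, bid: float, ask: float) -> None:
        self.bid = bid
        self.ask = ask


class StubCache:
    """Hypothetical stand-in exposing only quote_tick(); returns None on miss."""

    def __init__(self, quotes: dict) -> None:
        self._quotes = quotes

    def quote_tick(self, instrument_id: str):
        return self._quotes.get(instrument_id)


def mid_or_none(cache, instrument_id: str) -> Optional[float]:
    quote = cache.quote_tick(instrument_id)
    if quote is None:
        # Surface the absence; do not substitute a remembered value.
        return None
    return (quote.bid + quote.ask) / 2.0


cache = StubCache({"SPY.ARCA": StubQuote(bid=500.0, ask=500.5)})
known = mid_or_none(cache, "SPY.ARCA")
unknown = mid_or_none(cache, "QQQ.NASDAQ")
```

The caller is forced to decide what an unpriced instrument means, rather than silently trading on an old number.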

Write semantics - who writes, when, in what order

The cache has two writers and one write discipline.

Writers

  1. DataEngine - writes market data (QuoteTick, TradeTick, Bar, OrderBook state via BookUpdater, custom Data).
  2. ExecutionEngine - writes orders, positions, accounts, fills.

That’s it. Strategies do not write trading state to the cache. Strategies submit orders via self.submit_order(...); the ExecutionEngine writes the resulting Order and Position records. Actors do not write at all on the trading-state side. For custom user data, self.cache.add(key, bytes) is permitted but should be used sparingly - typed custom Data subclasses are preferred because they ride the DataEngine’s ordering guarantees.

Cache-then-publish invariant (the load-bearing rule)

The Nautilus contract, verbatim:

“The DataEngine writes to the Cache before publishing to subscribers, so the latest value is available in the cache by the time your handler runs.”

And from the architecture page’s Life of a Quote Tick:

“Step 4: Cache stores the quote. handle_quote writes the quote into the Cache via cache.add_quote(quote), making it available to any component through self.cache.quote_tick(instrument_id).”

“Step 5: MessageBus publishes.”

The mechanical guarantee: inside on_quote_tick(tick), self.cache.quote_tick(instrument_id) returns the same tick that triggered the handler. There is no race window. The single-threaded kernel + cache-then-publish is what makes this true by construction - not by careful coding.

This applies to QuoteTick, TradeTick, Bar (and custom Data that flows through the DataEngine).
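
The ordering can be illustrated with a toy engine that writes before it dispatches. ToyCache and ToyDataEngine are hypothetical names, and this is a sketch of the invariant under a single-threaded assumption, not the Nautilus code path.

```python
class ToyCache:
    def __init__(self) -> None:
        self._quotes = {}

    def add_quote(self, instrument_id: str, quote: dict) -> None:
        # Newest first, matching the reverse-indexing convention.
        self._quotes.setdefault(instrument_id, []).insert(0, quote)

    def quote_tick(self, instrument_id: str, index: int = 0):
        history = self._quotes.get(instrument_id, [])
        return history[index] if index < len(history) else None


class ToyDataEngine:
    def __init__(self, cache: ToyCache) -> None:
        self._cache = cache
        self._handlers = []

    def subscribe(self, handler) -> None:
        self._handlers.append(handler)

    def process(self, instrument_id: str, quote: dict) -> None:
        self._cache.add_quote(instrument_id, quote)  # 1. write to cache
        for handler in self._handlers:               # 2. then publish
            handler(instrument_id, quote)


cache = ToyCache()
engine = ToyDataEngine(cache)
observed = []


def on_quote(instrument_id: str, quote: dict) -> None:
    # Record whether the cache already holds the triggering quote.
    observed.append(cache.quote_tick(instrument_id) is quote)


engine.subscribe(on_quote)
engine.process("SPY.ARCA", {"bid": 500.0, "ask": 500.5})
engine.process("SPY.ARCA", {"bid": 500.5, "ask": 501.0})
```

Swapping the two numbered steps in process() would make the handler see the previous quote (or None on the first tick), which is exactly the stale-read failure the invariant rules out.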

Order book exception

The doc is explicit:

“Order book deltas and depth snapshots are published directly without a cache write; book state is maintained separately through BookUpdater subscriptions.”

Reason: book state is incremental and high-frequency. The BookUpdater maintains the book object itself (subscribers query it directly via self.cache.order_book(instrument_id)); the deltas don’t get written to the cache as historical records. Implication: if you need a historical book delta stream, persist via ParquetDataCatalog, not the Cache.

Live caveat - async write delay

“In live contexts, the engine applies updates asynchronously, so you might see a brief delay between an event and its appearance in the Cache.”

This is the qualifier on cache-then-publish: for events that flow through the DataEngine’s main path (quotes/trades/bars), ordering holds. For events that arrive through other paths - e.g., an account snapshot update arriving while you’re handling a quote - the cache may show a brief lag. Strategy code should never assume cross-path synchrony: if you’re in on_quote_tick and you need the most recent OrderFilled, query the cache; do not assume in-process ordering between distinct event streams.
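
When events from distinct streams must be compared, their timestamps are the reference, not their arrival order. A minimal sketch, assuming each event carries ts_event and ts_init in nanoseconds; ToyEvent and order_across_paths are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyEvent:
    # Hypothetical stand-in; only the timestamp field names mirror Nautilus.
    kind: str
    ts_event: int  # nanoseconds when the event occurred
    ts_init: int   # nanoseconds when the object was created locally


def order_across_paths(events: list) -> list:
    # Sort by occurrence time, breaking ties by local initialization time.
    return sorted(events, key=lambda e: (e.ts_event, e.ts_init))


# The quote arrived first locally, but the fill happened earlier at the venue.
arrived = [
    ToyEvent("QuoteTick", ts_event=2_000, ts_init=2_050),
    ToyEvent("OrderFilled", ts_event=1_500, ts_init=2_100),
]
ordered = order_across_paths(arrived)
```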

Reads inside handlers - the safe pattern

Because of cache-then-publish, this is always correct:

def on_quote_tick(self, tick: QuoteTick) -> None:
    # The very tick that triggered this handler is already in cache.
    latest = self.cache.quote_tick(tick.instrument_id)
    assert latest is tick or latest == tick
    # Plus, prior history is already there:
    prev = self.cache.quote_tick(tick.instrument_id, index=1)

What you must NOT do:

# WRONG: writing to cache yourself before letting the engine see the
# data. Breaks ordering, breaks reconciliation, breaks backtest replay.
self.cache.add("my_quote", quote.to_bytes())
self.publish_data(...)

The right path for inserting custom data into the cache depends on where the code runs. From inside a DataClient (an adapter), call self._handle_data(event), which routes through the DataEngine and preserves cache-then-publish. From an Actor or Strategy outside an adapter, use self.publish_data(DataType(MyType, ...), event) and let the DataEngine + bus do the cache write for you.

Persistence - in-memory, file-backed, Redis-backed

Default - in-memory only

No database configured = pure in-memory. Cache state vanishes on process exit. For backtest this is correct (each run starts clean). For live this means: flush_on_start=False is meaningless without a database, and a crash means losing in-flight emulator state, custom cache entries, and any account/order knowledge that wasn’t already on the venue.

File-backed

Not a first-class option in the cache concept doc. The catalog (ParquetDataCatalog, see nautilus-data.md) is the file-backed historical store; the cache itself is either in-memory or DB-backed. Don’t conflate these - file-backed cache is not how Nautilus thinks.

Redis-backed (the recommended live story)

Configuration:

from nautilus_trader.cache.config import CacheConfig
from nautilus_trader.common.config import DatabaseConfig
 
cache_config = CacheConfig(
    database=DatabaseConfig(
        type="redis",
        host="localhost",
        port=6379,
        connection_timeout=2,
        response_timeout=2,
    ),
    encoding="msgpack",                # or "json"
    timestamps_as_iso8601=False,
    buffer_interval_ms=None,
    bulk_read_batch_size=None,
    use_trader_prefix=True,
    use_instance_id=False,
    flush_on_start=False,
    drop_instruments_on_reset=True,
    tick_capacity=10_000,
    bar_capacity=10_000,
)

Stated use cases for the database:

“Long-running systems: If you want your data to survive system restarts, upgrading, or unexpected failures.”

“Historical insights: When you need to preserve past trading data for detailed post-analysis or audits.”

“Multi-node or distributed setups: If multiple services or nodes need to access the same state.”

The third bullet is the dashboard / brain-logger story: an out-of-band Python (or Node, or Rust) process subscribes to the same Redis backing and reads cache state without being part of the kernel. Same pattern Nautilus uses for the optional MessageBus Redis externalization (see nautilus-concepts.md).

Persistence taxonomy - when to use what

| Use case | Storage |
|---|---|
| Backtest | In-memory (default) |
| Sandbox / paper / staging | In-memory or Redis (test the persistence path) |
| Live trading, single node, restartable | Redis recommended |
| Live trading, multi-node, dashboard-attached | Redis required |
| Bulk historical replay data | ParquetDataCatalog, NOT cache |
| Append-only audit log | Out-of-band sink (SQLite, brain, S3) |

For Cortana MK3: Redis is recommended for live, mandatory if the dashboard is co-deployed. The 2026-04-22 data loss class (externalized state) is solved IFF live runs against Redis.
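
The taxonomy can be encoded as a small guard at node-construction time. This is a sketch only: choose_backing, validate_flush, and the mode strings are hypothetical helpers for Cortana's own bootstrap code, not part of Nautilus.

```python
def choose_backing(mode: str, exercise_persistence: bool = False) -> str:
    """Map a deployment mode to 'memory' or 'redis' per the taxonomy above."""
    if mode == "backtest":
        return "memory"
    if mode == "sandbox":
        # Staging may opt in to Redis to test the persistence path.
        return "redis" if exercise_persistence else "memory"
    if mode == "live":
        # Recommended for one restartable node, required once a dashboard attaches.
        return "redis"
    raise ValueError(f"unknown mode: {mode!r}")


def validate_flush(mode: str, flush_on_start: bool) -> None:
    # Flushing in live discards the node's knowledge of what is at the venue.
    if mode == "live" and flush_on_start:
        raise ValueError("flush_on_start=True is almost always wrong in live")


backings = {m: choose_backing(m) for m in ("backtest", "sandbox", "live")}
```

The result of choose_backing would then decide whether a DatabaseConfig is passed to CacheConfig or database is left as None.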

Reconciliation - how cache rebuilds from venue truth on startup

The cache concept doc itself is sparse on reconciliation; the work happens in the LiveExecutionEngine (see nautilus-concepts.md and nautilus-architecture.md). The flow:

  1. Startup with Redis backing. The cache loads its prior state from Redis. Orders, positions, accounts, instruments, custom user data are all rehydrated.

  2. flush_on_start=True (config option) wipes Redis on startup - useful for clean-slate testing, dangerous in live (you’ve thrown away your knowledge of what’s at the venue).

  3. drop_instruments_on_reset=True (default) clears instruments when reset() is called - instruments are typically refetched from the venue at start.

  4. LiveExecutionEngine reconciliation fires next: pulls order status reports, fill reports, and position reports from the venue, then aligns cached state. Per the architecture doc:

    “With cached state, report data generates missing events to align the state. Without cached state, all orders and positions at the venue are generated from scratch.”

  5. Four reconciliation invariants (from nautilus-concepts.md): position quantity matches within instrument precision; average entry price aligns within tolerance; PnL integrity preserved through calculated pricing; synthetic identifiers deterministic across restarts.

  6. Continuous monitoring loop (live only): detects stale or lost in-flight messages and re-reconciles.

For Cortana, this is the answer to both the 2026-04-22 data loss incident and the 2026-05-06 power-outage state divergence: the cache is rebuilt from Redis (if persisted) AND reconciled against IBKR’s actual order/position state on startup. The “alert without action” P0 (project_pm_ibkr_exit_invariant) becomes structural - there is no scenario where the cache disagrees with the venue and nobody notices.
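
The "with cached state" versus "without cached state" distinction can be sketched as a quantity diff. This is a deliberately reduced toy: real reconciliation works from order status, fill, and position reports and generates events, whereas reconcile_positions (a hypothetical name) only compares signed quantities.

```python
def reconcile_positions(cached: dict, venue: dict) -> list:
    """Return (instrument, delta) adjustments that align cached quantities to venue truth."""
    adjustments = []
    for instrument in sorted(set(cached) | set(venue)):
        delta = venue.get(instrument, 0) - cached.get(instrument, 0)
        if delta != 0:
            adjustments.append((instrument, delta))
    return adjustments


venue_report = {"SPY.ARCA": 100, "QQQ.NASDAQ": -50}

# With cached state: only the missing difference is generated.
with_cache = reconcile_positions({"SPY.ARCA": 100, "QQQ.NASDAQ": -20}, venue_report)

# Without cached state: everything at the venue is generated from scratch.
from_scratch = reconcile_positions({}, venue_report)
```

Either way the end state matches the venue; the cached case just needs fewer synthetic adjustments to get there.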

Backtest vs live behavior

The cache surface is identical in both modes - same read API, same write disciplines. Differences:

| Aspect | Backtest | Live |
|---|---|---|
| Default persistence | In-memory only (correct - runs are independent) | In-memory by default; Redis recommended |
| Write timing | Synchronous within the deterministic kernel cycle | Synchronous within the kernel cycle, with brief async lag for non-DataEngine paths |
| Reconciliation | None - the simulator owns truth on both sides | Yes - LiveExecutionEngine reconciles against venue on startup + continuously |
| Capacity eviction | Same (10k bars / 10k ticks defaults) | Same |
| on_reset() | Called between backtest runs; clears mutable state, preserves data + instruments unless drop_instruments_on_reset=True | Not called - live engines do not reset |
| flush_on_start | Irrelevant without DB | Wipes Redis if True |
“Same Cache” is one of the seven structural points of the backtest-live parity claim (see nautilus-architecture.md). Cortana code reading self.cache.quote_tick(instrument_id) works identically in both contexts.

Multi-tenant scoping - can one process host multiple tenant caches?

The doc does not explicitly address multi-tenant cache scoping. Inferred answer: partial - one cache per TradingNode, one node per process, so isolation is process-level not cache-level.

Walking the evidence:

  1. The architecture is single-node-per-process. Per nautilus-architecture.md:

    “Running multiple TradingNode or BacktestNode instances concurrently in the same process is not supported due to global singleton state.”

    Singletons listed: _FORCE_STOP, logger mode/timestamps, global Tokio runtime, callback registries, other OnceLock instances.

  2. The cache is a per-Kernel object. One NautilusKernel per TradingNode, one cache per kernel. So a single process has one cache.

  3. Therefore, multi-tenant requires multi-process. Each tenant runs in its own process, with its own kernel, its own cache, its own Redis namespace (or its own Redis instance).

  4. What a single cache does support: queries already filter by strategy_id - orders(strategy_id=...), positions(strategy_id=...). So “many strategies in one node” is first-class. But the strategies share the same cache object, so they have read access to each other’s orders/positions. That is not tenant isolation.

  5. Redis namespacing knobs that help:

    • use_trader_prefix=True (default) prefixes keys with the trader ID.
    • use_instance_id=False - when True, also prefixes with the instance ID. For multi-tenant deployment, set use_instance_id=True so each process gets a distinct prefix even if trader IDs collide.
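
Why the instance ID matters when trader IDs collide can be shown with a toy key builder. The key layout below is hypothetical and chosen only for illustration; the actual Redis key format should be checked against a running node.

```python
def redis_key(
    trader_id: str,
    instance_id: str,
    suffix: str,
    use_trader_prefix: bool = True,
    use_instance_id: bool = False,
) -> str:
    # Hypothetical layout: the real key format is not documented on this page.
    parts = []
    if use_trader_prefix:
        parts.append(f"trader-{trader_id}")
    if use_instance_id:
        parts.append(instance_id)
    parts.append(suffix)
    return ":".join(parts)


# Two tenant processes that happen to share a trader ID.
a_default = redis_key("CORTANA-001", "inst-a", "orders")
b_default = redis_key("CORTANA-001", "inst-b", "orders")
a_scoped = redis_key("CORTANA-001", "inst-a", "orders", use_instance_id=True)
b_scoped = redis_key("CORTANA-001", "inst-b", "orders", use_instance_id=True)
```

With the defaults the two processes would write to the same key on a shared Redis; with use_instance_id=True their keys diverge.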

Verdict for the spike’s Step 7.5 question

Per-tenant cache scoping: PARTIAL.

  • Within one process / one tenant: clean. Multi-strategy is first-class via strategy_id filtering. One cache, multiple strategies, no cross-strategy interference if disciplined.
  • Across tenants: requires process-per-tenant deployment. Each tenant gets its own TradingNode, its own cache, its own Redis namespace (use_instance_id=True is the load-bearing config flag). Shared market data subscriptions (one UW WebSocket serving N tenants) still need a separate fan-out architecture - likely a shared “market data hub” process that publishes to N tenant nodes via Redis MessageBus. The Cache itself does NOT solve fan-out.

So for the SaaS roadmap: the cache is correctly scoped per tenant because it lives inside the per-tenant process. What it does not do is multiplex multiple tenants inside one process. That’s ultimately the right architecture (compliance, isolation, blast radius), but it changes the unit-economics conversation - Step 7.5’s “process-per-tenant likely” guess is now confirmed.

Cortana MK3 implications

Concrete mapping from MK2’s seven (or more) cache surfaces to Nautilus’s one.

MK2 → Nautilus cache mapping

| MK2 component | Where it lives today | Nautilus replacement |
|---|---|---|
| In-memory spy_price, last_score, etc. dicts in app.py | Process-local Python dicts | self.cache.quote_tick(instrument_id) + custom ScoringEvent cache |
| cortanaroi/db/decisions.py (SQLite) | Disk SQLite | Out-of-band sink (Cache for runtime; SQLite/audit table for forensics) |
| Cooldown state Pickle | Disk Pickle | Strategy.on_save() / on_load() (per nautilus-strategies.md) - backed by Cache database |
| Impulse history Pickle | Disk Pickle | self.cache.add(key, bytes) or custom Data |
| UW REST lru_cache | Function-local | UW custom DataClient writes via _handle_data → cache by type+id |
| IBKR adapter cache | IBKR adapter internal | Built-in - IBKR adapter feeds Nautilus cache directly |
| Dashboard’s redundant SQL view | Separate read replica | Out-of-band Redis subscriber to the same cache backing |

Result: one runtime cache, one ordering invariant, one place to look when state seems wrong.

How this addresses the 2026-04-22 data loss class

project_data_loss_april22 root cause: workspace archive lost disk-resident state (Pickle blobs, cooldown state, decisions.db contents). Nautilus’s answer:

  1. Externalize critical state to Redis via DatabaseConfig. The cache becomes the runtime store, Redis becomes the durable backing. A workspace archive that loses the local FS no longer destroys trading state - Redis (running on a separate host, or at least a separate volume) survives.
  2. Crash-only design (per nautilus-architecture.md) means restart is the only recovery path, AND it’s the same code path the system runs every day. There is no “graceful shutdown” off-ramp that exists only on the happy day; the cache rebuild from Redis + venue reconciliation is always what happens at boot.
  3. flush_on_start=False (the default) means accidental restarts don’t wipe the durable backing. Combined with externalized state, this is the structural fix to GH #26.

The remaining attack surface is “what if Redis itself lives on the same single volume the workspace archive lost?” Mitigation: run Redis on a separate host/volume (or use a managed Redis with cross-zone replication). This is operational discipline; the architecture is sound.

How this addresses the 2026-05-06 cache rewrite race

The 2026-05-06 power-outage post-mortem identified a cache rewrite race: components were updating their local view of spy_price at different times, and a strategy fired on a stale value. Nautilus’s answer is structural, not procedural:

  1. One cache. No “local view” exists. Every component reads self.cache.quote_tick(instrument_id) and gets the same answer.
  2. Single-threaded ordering. Per nautilus-architecture.md: “The kernel consumes and dispatches messages on a single thread… cache reads and writes.” No concurrent writers means no rewrite race - every write is serialized through the kernel.
  3. Cache-then-publish. The DataEngine writes the new quote before notifying any subscriber. By the time on_quote_tick runs in the strategy, the cache already reflects the new value. The “handler reads stale cache” failure mode is impossible by construction.

This is one of the strongest single-component arguments for MK3. The cache rewrite race is a class of bug, not an instance, and Nautilus eliminates the class.

Multi-tenant question (spike Step 7.5)

Per the analysis above:

  • Cache scopes cleanly to one tenant per process. Use use_instance_id=True to namespace Redis keys per tenant.
  • Process-per-tenant is the right deployment shape. This was the spike’s guess and it’s confirmed by the singleton constraint.
  • Shared market data fan-out requires a separate “hub” process that subscribes to UW once and republishes per-tenant via Redis MessageBus. The cache itself doesn’t multiplex this - but the MessageBus’s Redis externalization is designed for exactly this kind of out-of-band consumer. (See nautilus-message-bus.md when filed.)
  • Per-tenant venue credentials (each customer’s IBKR account) flow through the per-process ExecutionClient config. Each tenant’s process holds its own creds; no shared brokerage pool.

Bottom line for SaaS unit economics: one process per tenant, shared data hub, Redis-per-tenant (or shared Redis with namespace prefixes) is the correct architecture. The spike plan’s “process-per-tenant likely the right starting point” framing is right.

Anti-patterns to avoid

  • Strategies / Actors writing trading state to the cache directly. self.cache.add(...) is permitted for custom user data. It is NOT for orders, positions, accounts. The ExecutionEngine owns those.
  • Reading stale fallbacks instead of None. Every cache point query returns None on miss. Don’t substitute “old value” for missing - the doc enforces “explicit unpriceability” elsewhere (see Portfolio in nautilus-concepts.md) and the cache follows the same posture.
  • Treating the cache as a database. “The Cache is not designed to be a full database replacement.” If you need queries by custom predicates, time ranges, or joins, write to ParquetDataCatalog (historical) or an out-of-band SQL/audit sink.
  • Writing to cache yourself before letting the engine see the data. Use _handle_data from inside a DataClient, or publish_data from an Actor/Strategy, so cache-then-publish is preserved. Hand-rolled cache.add() followed by publish_data() is wrong.
  • Assuming cross-path synchrony. Cache-then-publish holds within the DataEngine’s path (quotes/trades/bars). It does NOT guarantee that an OrderFilled and a QuoteTick arriving at the same microsecond land in cache in arrival order. Use ts_event / ts_init for cross-path ordering.
  • flush_on_start=True in live. Wipes the Redis backing. Almost always wrong in live.
  • Ignoring tick_capacity / bar_capacity. Default 10,000 is fine for normal use, brutally tight for some research workloads. If you’re querying bars(bar_type) and getting fewer than expected, check if you’ve blown past capacity. No warning is logged on eviction.
  • Storing the 78-feature vector as 78 separate cache.add() calls per scoring tick. One ScoringEvent per score, see nautilus-data.md.
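
The last bullet is about write shape: one payload per score instead of 78 keyed writes. The page's preferred route is a ScoringEvent custom Data subclass (see nautilus-data.md); the sketch below only illustrates the "one object per score" idea at the bytes level, with a hypothetical layout of one int64 timestamp followed by 78 float64 features.

```python
import struct

N_FEATURES = 78
# Little-endian, no padding: int64 timestamp + 78 float64 features.
_LAYOUT = f"<q{N_FEATURES}d"


def pack_score(ts_event: int, features: list) -> bytes:
    if len(features) != N_FEATURES:
        raise ValueError(f"expected {N_FEATURES} features, got {len(features)}")
    return struct.pack(_LAYOUT, ts_event, *features)


def unpack_score(raw: bytes) -> tuple:
    ts_event, *features = struct.unpack(_LAYOUT, raw)
    return ts_event, features


features = [i / 10 for i in range(N_FEATURES)]
payload = pack_score(1_746_600_000_000_000_000, features)
ts_event, restored = unpack_score(payload)
```

A payload like this could be stored with a single self.cache.add(key, payload) call, so a reader never observes a half-written feature vector.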

Cache vs Portfolio vs strategy variables - which to use

The doc’s own decision rule:

Cache - for:

  • Trading-related data (orders, positions, instruments, market data).
  • Data shared across strategies.
  • Data that must persist across restarts.
  • State that must survive on_reset() (with the right config).

Portfolio - for:

  • Aggregated position, exposure, account information.
  • Current state (no history).
  • Pull-style queries against valuation.

Strategy/Actor instance variables - for:

  • Strategy-specific calculations.
  • Temporary values, intermediate results.
  • Data that only this one component needs.

For Cortana, the migration mostly moves what’s currently in instance variables and Pickle files into the Cache (with Redis backing). What stays in instance variables: short-lived per-tick scratch state, in-progress feature engineering buffers that are fully recoverable on restart, indicator state managed by registered indicators (lives on the Strategy/Actor, fed by the DataEngine).

Quick reference - config defaults

CacheConfig(
    database=None,                      # In-memory only
    encoding="msgpack",                 # or "json"
    timestamps_as_iso8601=False,
    buffer_interval_ms=None,
    bulk_read_batch_size=None,
    use_trader_prefix=True,             # Prefix Redis keys with trader ID
    use_instance_id=False,              # SET TRUE FOR MULTI-TENANT
    flush_on_start=False,               # Don't wipe Redis on boot
    drop_instruments_on_reset=True,
    tick_capacity=10_000,               # Per instrument
    bar_capacity=10_000,                # Per bar type
)
 
DatabaseConfig(
    type="redis",
    host="localhost",
    port=6379,
    connection_timeout=2,
    response_timeout=2,
)

Open questions for the 2026-05-09 spike

  1. Redis ops shape. What’s the recommended deployment topology for Redis in a single-tenant Cortana live deployment? Local Redis on the same host? Separate VM? Managed (ElastiCache, MemoryDB)? The cache concept doc doesn’t recommend; needs spike-time investigation.
  2. Cache key inspection in live. Is there a CLI or admin surface for inspecting cache contents at runtime, or do we always go through redis-cli against the backing store? Affects ops ergonomics for the dashboard.
  3. flush_on_start interaction with reconciliation. If we flush Redis but the venue still has orders/positions, does LiveExecutionEngine rebuild from venue truth alone? Per the architecture doc “all orders and positions at the venue are generated from scratch” - so yes - but verify the synthetic ID determinism claim holds.
  4. Custom data eviction policy. cache.add(key, bytes) - does the cache evict custom keys under memory pressure, or are they pinned? Affects whether we can use it for unbounded user data.
  5. Multi-tenant Redis sizing. If each tenant runs use_instance_id=True and we have 1000 tenants, what’s the memory footprint per tenant of a representative trading day? Affects SaaS unit economics.

See Also

  • Nautilus Architecture - runtime topology; cache-then-publish at the dispatch level
  • Nautilus Data - what flows into cache; custom Data subclasses; ParquetDataCatalog (the non-cache historical store); ts_event / ts_init semantics
  • Nautilus Strategies - self.cache queries from inside a Strategy; emulated-order gotcha; on_save / on_load cache-database persistence
  • Nautilus Actors - self.cache queries from inside an Actor; same surface, no order writes
  • Nautilus Concepts - full architecture canon including ExecutionEngine reconciliation invariants
  • nautilus-message-bus.md (parallel - pending) - Redis-backed bus for out-of-band consumers; multi-tenant fan-out pattern
  • Spike plan: ~/conductor/workspaces/cortanaroi-mk2/belo-horizonte/plans/2026-05-09-nautilus-spike.md
  • project_data_loss_april22 - the externalized-state failure class
  • project_pm_ibkr_exit_invariant (#46) - broker-truth alignment that reconciliation directly addresses
  • feedback_no_hwm_trailing_language - reduce-only / single-shot TP semantics live on Strategy, but their state lives in cache
  • 2026-05-06 power-outage state divergence postmortem (~/brain/writing/2026-05-06-power-outage-state-divergence.md)

Timeline

  • 2026-05-07 | Cody - Filed during pre-spike concept mastery sweep batch 2.