Nautilus Logging

Nautilus ships a high-performance logging subsystem implemented in Rust and fronted by the log-crate facade. The core logger runs on its own thread and receives messages over a multi-producer single-consumer (MPSC) channel, so string formatting and file I/O never block the trading core. Configuration happens through LoggingConfig in Python, with a parallel NAUTILUS_LOG env-var spec for Rust binaries and runtime overrides. Output supports stdout/stderr writers, file writers (plain text or JSON), per-component filtering, ANSI coloring, size- or date-based rotation, and optional display of external tracing-crate output. JSON is a first-class file format (log_file_format="json"), which is exactly what downstream aggregators (Vector, Loki, Datadog) want. For Cortana MK3 this replaces MK2's ad-hoc Python loggers (decision_logger, score_logger, gate_logger, position_logger, watchdog_logger, and others) with one structured stream. The “Telegram = trading-events-only” invariant from feedback_watchdog_to_telegram.md is preserved by NOT pointing Telegram at the log stream: Telegram subscribes to typed Events on the MessageBus (OrderFilled, custom TradeOpened/TradeClosed), not to log records.

Core claim

The logging subsystem and the MessageBus are two different audit surfaces with two different jobs. Logs are operator/debug telemetry - what the process did, at what verbosity, indexed by component + level. Events are the causal trading record - what state changed, with replay-grade ordering. Cortana MK2 conflates them (watchdog meta-events sometimes leak to Telegram, trading events sometimes only land in SQLite). Under MK3 the split is structural: logs flow over the Rust MPSC log channel; trading events flow over the MessageBus; Telegram subscribes only to the latter.

Architecture

From the docs verbatim:

“The platform provides logging for both backtesting and live trading using a high-performance logging subsystem implemented in Rust with a standardized facade from the log crate. The core logger operates in a separate thread and uses a multi-producer single-consumer (MPSC) channel to receive log messages. This design ensures that the main thread remains performant, avoiding potential bottlenecks caused by log string formatting or file I/O operations.”

Three input sources exist. The first two funnel into the dedicated logging thread; the third bypasses it:

  1. Python and Nautilus components - log directly through the Nautilus Logger.
  2. External log-crate users - filtered by log_level / log_level_file in LoggingConfig.
  3. External tracing-crate users - when enabled via use_tracing=True, output goes directly to stdout (separate from Nautilus logging), filtered by RUST_LOG. This is a parallel pipe, not a merge.

“All Nautilus log events are sent through an MPSC channel to a dedicated thread, ensuring the main thread isn’t blocked by I/O operations.”

Same single-thread / MPSC-offload pattern as the Redis stream writer (see Nautilus Message Bus §Backpressure). The logging thread and the kernel thread are decoupled - log calls return immediately once the message is on the channel.
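
The offload pattern can be illustrated with the Python standard library. This is only a toy model of the idea (many producers, one consumer thread that owns formatting and I/O), not the Nautilus implementation, which lives in Rust; the names drain, log, and written are made up for the sketch.

```python
import queue
import threading

def drain(channel: queue.Queue, sink: list) -> None:
    """Single consumer: format and 'write' records until the None sentinel arrives."""
    while True:
        record = channel.get()
        if record is None:
            break
        level, component, message = record
        # Formatting and I/O happen here, off the producers' threads.
        sink.append(f"[{level}] {component}: {message}")

channel = queue.Queue()  # unbounded; many producers, one consumer
written = []
logging_thread = threading.Thread(target=drain, args=(channel, written))
logging_thread.start()

def log(level: str, component: str, message: str) -> None:
    """Producer side: enqueue the raw fields and return immediately."""
    channel.put((level, component, message))

producers = [
    threading.Thread(target=log, args=("INFO", f"Actor-{i}", "tick"))
    for i in range(4)
]
for p in producers:
    p.start()
for p in producers:
    p.join()

channel.put(None)      # everything queued before the sentinel is drained first
logging_thread.join()
```

The producers only pay for the put call; the string is built on the consumer thread, which is the property the docs describe.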

Log levels

“Log level (LogLevel) values include the following (matching standard log level conventions).”

| Level | Meaning | Notes |
|---|---|---|
| OFF | Disable logging | - |
| TRACE | Most verbose | Rust components only - Python cannot emit TRACE directly. Setting TRACE as a filter still captures Rust trace output. |
| DEBUG | Detailed diagnostics | - |
| INFO | General operational messages | Default minimum for stdout. |
| WARNING | Potential issues that don’t prevent operation | Aliased Warn in env-var spec. |
| ERROR | Errors that may affect functionality | - |

Default behavior: INFO and higher to stdout/stderr; no file writer unless configured.

Configuration - LoggingConfig (Python)

LoggingConfig covers (verbatim from docs):

  • Minimum LogLevel for stdout/stderr.
  • Minimum LogLevel for log files.
  • Maximum size before rotating a log file.
  • Maximum number of backup log files to maintain when rotating.
  • Automatic log file naming with date or timestamp components, or custom log file name.
  • Directory for writing log files.
  • Plain text or JSON log file formatting.
  • Filtering of individual components by log level.
  • ANSI colors in log lines.
  • Bypass logging entirely.
  • Print Rust config to stdout at initialization.
  • Optionally initialize logging via the PyO3 bridge (use_pyo3) to capture log events emitted by Rust components.
  • Truncate existing log file on startup if it already exists (clear_log_file).

Canonical example (from docs)

from nautilus_trader.config import LoggingConfig
from nautilus_trader.config import TradingNodeConfig
 
config_node = TradingNodeConfig(
    trader_id="TESTER-001",
    logging=LoggingConfig(
        log_level="INFO",
        log_level_file="DEBUG",
        log_file_format="json",
        log_component_levels={"Portfolio": "INFO"},
    ),
    # ... omitted
)

For backtests, swap TradingNodeConfig for BacktestEngineConfig - same options, identical behavior. This is the live-vs-backtest parity that nautilus-architecture.md calls out, applied to logging.

File logging

Files are written to the current working directory by default. Override with log_directory and/or log_file_name.

Formats

  • None (default) - plain text, .log extension.
  • "json" - JSON-per-line, .json extension. “Useful for log aggregation tools.”

JSON output is the right default for any production deployment that pipes to Vector / Loki / Datadog / CloudWatch - every log record is parseable without fragile regex.
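
As a sketch of what "parseable without fragile regex" buys, the snippet below filters JSON-per-line records by level using only the standard library. The field names (timestamp, level, component, message) and the sample lines are assumptions for illustration; the real Nautilus record schema still needs to be confirmed.

```python
import json
from typing import Iterable, Iterator

LEVEL_ORDER = ["TRACE", "DEBUG", "INFO", "WARNING", "ERROR"]

def records_at_or_above(lines: Iterable[str], min_level: str = "WARNING") -> Iterator[dict]:
    """Yield parsed records whose level is at or above min_level."""
    threshold = LEVEL_ORDER.index(min_level)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # "level" is an assumed field name; unknown levels are skipped.
        level = str(record.get("level", "")).upper()
        if level in LEVEL_ORDER and LEVEL_ORDER.index(level) >= threshold:
            yield record

# Made-up sample lines, not real Nautilus output.
sample = [
    '{"timestamp": "2025-04-09T21:07:21.521Z", "level": "INFO", "component": "Portfolio", "message": "ready"}',
    '{"timestamp": "2025-04-09T21:07:22.004Z", "level": "ERROR", "component": "RiskEngine", "message": "order denied"}',
]
alerts = list(records_at_or_above(sample))
```

In practice the same loop would read from an open log file instead of the sample list.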

Rotation

| Mode | Trigger | When it applies |
|---|---|---|
| Size-based | log_file_max_size set (e.g. 100_000_000 for 100 MB). When the next write would exceed, file is closed and a new one is created. | Whenever set, takes precedence over date-based. |
| Date-based (default) | UTC midnight rollover, one file per UTC day. | Default when no log_file_max_size and no custom log_file_name. |
| No rotation | Logs append to the same file forever. | Custom log_file_name set, no log_file_max_size. |

Backups: log_file_max_backup_count (default 5) limits total rotated files; oldest deleted when exceeded.
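
The precedence rules in the table can be written down as a small decision function. This is a model of the documented behavior, not Nautilus code; rotation_mode and prune_backups are hypothetical helper names, and the pruning helper assumes the rotated files are listed oldest first.

```python
from typing import Optional

def rotation_mode(log_file_max_size: Optional[int], log_file_name: Optional[str]) -> str:
    """Return 'size', 'date', or 'none' following the precedence in the table."""
    if log_file_max_size is not None:
        return "size"   # size-based wins whenever it is set
    if log_file_name is None:
        return "date"   # default: one file per UTC day
    return "none"       # custom name, no size limit: append forever

def prune_backups(rotated_files: list, max_backup_count: int = 5) -> list:
    """Keep only the newest max_backup_count rotated files (input is oldest first)."""
    if max_backup_count <= 0:
        return []
    return rotated_files[-max_backup_count:]
```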

Naming

With size-based rotation enabled: {trader_id}_{%Y-%m-%d_%H%M%S:%3f}_{instance_id}.{log|json} e.g. TESTER-001_2025-04-09_210721:521_d7dc12c8-7008-4042-8ac4-017c3db0fc38.log

Without size-based rotation (the default, one file per UTC day): {trader_id}_{%Y-%m-%d}_{instance_id}.{log|json} e.g. TESTER-001_2025-04-09_d7dc12c8-7008-4042-8ac4-017c3db0fc38.log

The instance_id segment matters for multi-tenant deployments - each tenant TradingNode process gets a UUID and its log files are unambiguously identifiable. This dovetails with the multi-tenant producer/consumer pattern in Nautilus Message Bus.
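
To make the two naming patterns concrete, here is a sketch that reproduces the example file names above. The helper default_log_file_name is hypothetical; the %3f millisecond token is a Rust chrono format specifier, so the sketch emulates it by hand.

```python
from datetime import datetime, timezone

def default_log_file_name(trader_id: str, instance_id: str, ts: datetime,
                          size_rotation: bool, json_format: bool = False) -> str:
    """Build a file name following the documented naming patterns."""
    ext = "json" if json_format else "log"
    if size_rotation:
        # Emulates %Y-%m-%d_%H%M%S:%3f (milliseconds, zero-padded to 3 digits).
        stamp = ts.strftime("%Y-%m-%d_%H%M%S") + f":{ts.microsecond // 1000:03d}"
    else:
        stamp = ts.strftime("%Y-%m-%d")
    return f"{trader_id}_{stamp}_{instance_id}.{ext}"

ts = datetime(2025, 4, 9, 21, 7, 21, 521000, tzinfo=timezone.utc)
uuid = "d7dc12c8-7008-4042-8ac4-017c3db0fc38"
rotated_name = default_log_file_name("TESTER-001", uuid, ts, size_rotation=True)
daily_name = default_log_file_name("TESTER-001", uuid, ts, size_rotation=False)
```

A helper like this is mainly useful for writing collector globs or tests that need to predict file names per tenant.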

Component-scoped log filtering

“The log_component_levels parameter can be used to set log levels for each component individually. The input value should be a dictionary of component ID strings to log level strings: dict[str, str].”

LoggingConfig(
    log_level="INFO",
    log_component_levels={
        "RiskEngine": "DEBUG",
        "Portfolio": "INFO",
    },
    log_components_only=True,  # suppress everything not in the dict
)

log_components_only=True is the focus-mode toggle: only listed components emit; everything else is silenced regardless of global level.

“If log_components_only=True (or log_components_only is present in the spec string) and log_component_levels is empty, no log messages will be emitted to stdout/stderr or files. Add at least one component filter or disable components-only logging.”

This is exactly the surface MK2 needs for “show me only what the position manager is doing during this debugging session” without killing global logging.
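
The filter semantics described above can be modelled in a few lines. This is a sketch of the documented behavior, not the Nautilus implementation; should_emit is a hypothetical name, and the assumption that a component-level entry fully overrides the global level should be checked against real output.

```python
LEVELS = {"TRACE": 0, "DEBUG": 1, "INFO": 2, "WARNING": 3, "ERROR": 4, "OFF": 5}

def should_emit(component: str, level: str, global_level: str,
                component_levels: dict, components_only: bool = False) -> bool:
    """Decide whether a record from `component` at `level` passes the filters."""
    if component in component_levels:
        threshold = component_levels[component]   # exact component-name match
    elif components_only:
        return False                              # focus mode: unlisted components are silenced
    else:
        threshold = global_level
    return LEVELS[level] >= LEVELS[threshold]

filters = {"RiskEngine": "DEBUG", "Portfolio": "INFO"}
```

With an empty dict and components_only=True every call returns False, which is the "nothing logs" footgun the docs warn about.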

NAUTILUS_LOG environment variable

Alternative configuration via semicolon-separated spec string. Useful for Rust-only binaries and runtime overrides without touching code.

export NAUTILUS_LOG="stdout=Info;fileout=Debug;RiskEngine=Error;is_colored"

Supported keys (from docs):

| Key | Type | Description |
|---|---|---|
| stdout | Log level | Maximum level for stdout output |
| fileout | Log level | Maximum level for file output |
| is_colored | Flag | Enable ANSI colors (default: true) |
| print_config | Flag | Print config to stdout at startup |
| log_components_only | Flag | Only log components with explicit filters |
| <Component> | Log level | Component-specific level (exact match) |
| <module::path> | Log level | Module-specific level (prefix match, Rust only) |

Levels are case-insensitive: Off, Trace, Debug, Info, Warning (or Warn), Error. Flags activate by their presence - no value needed.

For Rust-only binaries, setting NAUTILUS_LOG enables lazy initialization of the logging subsystem on first use without an explicit init_logging() call.
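
For reasoning about what a given spec string means, a rough Python parser following the key table can help. This is an illustrative sketch only (parse_log_spec is a hypothetical name; the real parser is in the Rust core), and it deliberately rejects flag values such as is_colored=false because those semantics are not confirmed here.

```python
LEVELS = {"off", "trace", "debug", "info", "warning", "warn", "error"}
FLAGS = {"is_colored", "print_config", "log_components_only"}

def parse_log_spec(spec: str) -> dict:
    """Split a semicolon-separated spec into levels, flags, and filters."""
    parsed = {"stdout": None, "fileout": None, "flags": set(),
              "components": {}, "modules": {}}
    for part in spec.split(";"):
        part = part.strip()
        if not part:
            continue
        key, sep, value = part.partition("=")
        key = key.strip()
        if not sep:                       # bare key: a flag, active by presence
            if key not in FLAGS:
                raise ValueError(f"unknown flag: {key!r}")
            parsed["flags"].add(key)
            continue
        level = value.strip().lower()     # levels are case-insensitive
        if level not in LEVELS:
            raise ValueError(f"invalid level for {key!r}: {value!r}")
        level = "warning" if level == "warn" else level
        if key in ("stdout", "fileout"):
            parsed[key] = level
        elif "::" in key:                 # Rust module path filter
            parsed["modules"][key] = level
        else:                             # component filter, exact match
            parsed["components"][key] = level
    return parsed

spec = parse_log_spec("stdout=Info;fileout=Debug;RiskEngine=Error;is_colored")
```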

Module-path filtering (Rust only)

Keys containing :: are treated as Rust module path filters with prefix matching:

# All adapters at Warn, but OKX specifically at Debug
export NAUTILUS_LOG="stdout=Info;nautilus_okx=Warn;nautilus_okx::websocket=Debug"

“The longest matching prefix takes precedence. In the example above, nautilus_okx::websocket::handler would use the Debug level (longer prefix), while nautilus_okx::data would use Warn.”

This is NAUTILUS_LOG-only - Python log_component_levels does component-name exact-match, not module-path prefix-match. If Cortana wants sub-tree filtering on a Rust IBKR adapter (nautilus_interactive_brokers), use the env var.
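
Longest-prefix precedence is easy to get wrong when reading a long spec, so here is a minimal model of it. The function module_level is hypothetical, and it uses plain string-prefix matching as an assumption; whether the real matcher respects :: segment boundaries is not covered here.

```python
def module_level(module_path: str, module_filters: dict, default: str) -> str:
    """Return the level of the longest filter key that prefixes module_path."""
    best = None
    for prefix in module_filters:
        if module_path.startswith(prefix):
            if best is None or len(prefix) > len(best):
                best = prefix
    return module_filters[best] if best is not None else default

okx_filters = {"nautilus_okx": "Warn", "nautilus_okx::websocket": "Debug"}
handler_level = module_level("nautilus_okx::websocket::handler", okx_filters, "Info")
data_level = module_level("nautilus_okx::data", okx_filters, "Info")
```

This reproduces the docs example: the handler module resolves to Debug and the data module to Warn.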

ANSI colors

“ANSI color codes improve log readability in terminals. In environments that do not support ANSI color rendering (such as some cloud environments or text editors), these color codes may not be appropriate as they can appear as raw text.”

Disable with LoggingConfig(log_colors=False) or set NAUTILUS_LOG="...;is_colored=false" (presence of the flag without value toggles to true; explicit false disables - confirm exact env-var semantics during the spike). Set False for any deployment piping to file aggregators that don’t strip ANSI.
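
If colored output has already reached a file or collector, the color codes can be stripped after the fact. A minimal sketch, assuming only SGR color sequences (ESC [ ... m) need removing; strip_ansi is a hypothetical helper, not part of Nautilus.

```python
import re

# Matches SGR sequences such as ESC[0m or ESC[1;31m.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")

def strip_ansi(line: str) -> str:
    """Remove ANSI color codes from a log line."""
    return ANSI_SGR.sub("", line)

clean = strip_ansi("\x1b[1;31m[ERROR]\x1b[0m RiskEngine: order denied")
```

Disabling colors at the source remains the simpler option; this only covers output that was captured before the setting changed.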

Using a Logger directly

Nautilus exposes the Logger class for ad-hoc logging outside a TradingNode / BacktestEngine (e.g. tooling, scripts, custom adapters):

from nautilus_trader.common.component import init_logging
from nautilus_trader.common.component import Logger
 
log_guard = init_logging()  # bootstrap the subsystem
logger = Logger("MyLogger")
logger.info("hello world")

“Only one logging subsystem can be initialized per process with an init_logging call. Multiple LogGuard instances (up to 255) can exist concurrently, and the logging thread will remain active until all guards are dropped.”

LogGuard - managing the log lifecycle

Reference-counted handle that keeps the logging thread alive across multiple sequential engine runs in the same process. Required when running multiple backtests sequentially.

Mechanics (from docs):

  • Counter increments when a new LogGuard is created; decrements on drop.
  • When the counter hits zero, the logging thread is joined - all buffered messages flushed before the process terminates.
  • Maximum 255 concurrent guards per process; exceeding raises RuntimeError.

“LogGuard keeps the logging thread alive and flushes on drop; abrupt termination (crashes, kill signals) can still lose buffered logs.”

Without a LogGuard, sequential backtests in the same process see errors such as “Error sending log event: [INFO] ...”, because the channel is closed when the first engine is disposed.

Pattern (from docs):

log_guard = None
for i in range(number_of_backtests):
    engine = setup_engine(...)
    if log_guard is None:
        log_guard = engine.get_log_guard()
    actors = setup_actors(...)
    engine.add_actors(actors)
    engine.run()
    engine.dispose()  # safe - LogGuard keeps logging alive

For Cortana MK3 this matters during the batch-replay-postmortem workflow (re-run April 16 chop-day cluster across N parameter configurations in one process - see project_losses_april16_chop).
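
The reference-counting mechanics can be modelled with a toy counter. This is not the Nautilus LogGuard (which is a Rust-backed handle); GuardCounter and its methods are hypothetical and exist only to show why one guard held for the whole sweep stays far below the 255 limit.

```python
class GuardCounter:
    """Toy model: the logging thread lives while at least one guard exists."""

    MAX_GUARDS = 255

    def __init__(self) -> None:
        self.count = 0
        self.thread_alive = False

    def acquire(self) -> None:
        if self.count >= self.MAX_GUARDS:
            raise RuntimeError("maximum number of concurrent guards reached")
        self.count += 1
        self.thread_alive = True

    def release(self) -> None:
        if self.count == 0:
            raise RuntimeError("no guard to release")
        self.count -= 1
        if self.count == 0:
            # The real subsystem joins the logging thread here, flushing the buffer.
            self.thread_alive = False

sweep = GuardCounter()
sweep.acquire()                 # taken once, before the first engine run
alive_during_runs = [sweep.thread_alive for _ in range(30)]  # 30 runs reuse it
peak_guards = sweep.count
sweep.release()                 # dropped after the last run
```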

tracing subscriber for external Rust libraries

External Rust crates that use the tracing crate (instead of log) can have their output displayed by enabling the tracing subscriber:

LoggingConfig(
    log_level="INFO",
    use_tracing=True,
)

Or call directly:

from nautilus_trader.core import nautilus_pyo3
nautilus_pyo3.init_tracing()

Filtered by RUST_LOG:

RUST_LOG=my_feature_extractor=debug,hyper=warn python my_script.py

“If RUST_LOG is not set, the default filter level is warn.”

Critical caveats:

“Tracing output goes directly to stdout, not through the Nautilus logging thread.”

“Filtering is controlled exclusively by RUST_LOG, independent of LoggingConfig.”

“RUST_LOG only affects crates using tracing. For crates using log, configure verbosity via LoggingConfig or the NAUTILUS_LOG environment variable.”

“The tracing subscriber can only be initialized once per process. When using use_tracing=True in LoggingConfig, init happens during kernel/engine setup. Calling init_tracing() when already initialized will raise an error.”

So: tracing output is a parallel uncorrelated stream to stdout, not part of the JSON file output. If Cortana writes a Rust feature extractor that uses tracing, its logs will not appear in the JSON log file. They go to stdout only. Plan capture accordingly.

Performance characteristics

The doc does not publish benchmark numbers but commits to:

  • Off-thread I/O. MPSC channel; the main (kernel) thread never blocks on file writes or stdout flushes.
  • Rust core. Formatting and serialization happen in Rust, not Python.
  • Single logging thread drains the channel; no per-component thread fan-out.
  • JSON serialization is the same Rust path as plain text, just a different formatter on the writer side - no measurable Python cost difference.

The key cost is the work done on the kernel thread to enqueue the message: constructing the log event struct and pushing it to the channel. Keep DEBUG/TRACE filters off for hot paths in production - even with off-thread I/O, the enqueue cost compounds at million-messages-per-second rates. For a 0DTE single-contract trader at under 1k trading events per second this is irrelevant; for a tick-firehose adapter writing TRACE per quote it is the bottleneck.

Custom log handlers - Telegram-filter Actor pattern

The Nautilus logging subsystem does not expose a public “register a custom Python sink” API. The supported writers are stdout/stderr and file (text or JSON). Vector or similar log-aggregator infrastructure is the documented integration path for downstream forwarding.

For Cortana’s Telegram bot, do not write a custom log sink - that would conflate logs with trading events. Instead, the clean pattern is a Telegram-filter Actor subscribed to specific events on the MessageBus:

from nautilus_trader.common.actor import Actor
from nautilus_trader.model.data import DataType
from nautilus_trader.model.events import OrderFilled, OrderRejected

# Cortana-side names, NOT part of Nautilus (import them from wherever MK3
# defines them): TelegramBot is a thin Telegram client; SignalFired,
# TradeOpened and TradeClosed are custom Data subclasses published on the bus.

class TelegramAlerter(Actor):
    """
    Subscribes ONLY to trading events. Forwards a curated subset to
    Telegram. Watchdog/AI-meta noise NEVER touches this Actor.
 
    Maps to Cortana invariant: feedback_watchdog_to_telegram.md -
    'Telegram is the TRADING-system alert channel. It carries: signal
    fired, trade entered, trade exited (win/loss/PnL), engine errors,
    real operational alerts the user wants on their phone.'
    """
 
    def __init__(self, config):
        super().__init__(config)
        self._bot = TelegramBot(token=config.bot_token, chat_id=config.chat_id)
 
    def on_start(self):
        # Custom events the user explicitly asked for on Telegram:
        # signals, opens, exits. subscribe_data expects a DataType wrapper.
        self.subscribe_data(DataType(SignalFired))   # custom Cortana event
        self.subscribe_data(DataType(TradeOpened))   # custom Cortana event
        self.subscribe_data(DataType(TradeClosed))   # custom Cortana event
        # Order events are dispatched automatically to a Strategy for its own
        # orders; an Actor has to subscribe explicitly. The "events.order.*"
        # topic pattern is an assumption to confirm during the spike.
        self.msgbus.subscribe(topic="events.order.*", handler=self._on_order_event)
 
    def _on_order_event(self, event):
        # Forward only the two order events Telegram cares about.
        if isinstance(event, OrderFilled):
            self.on_order_filled(event)
        elif isinstance(event, OrderRejected):
            self.on_order_rejected(event)
 
    def on_order_filled(self, event: OrderFilled):
        # Filter: only emit if this is a real venue fill, not a
        # reconciliation synthesis (see nautilus-events.md).
        if event.reconciliation:
            return
        self._bot.send(f"FILL {event.instrument_id} "
                       f"{event.last_qty}@{event.last_px}")
 
    def on_order_rejected(self, event: OrderRejected):
        # Engine error worth waking the human for.
        self._bot.send(f"REJECT {event.instrument_id} {event.reason}")
 
    def on_data(self, data):
        # Custom Cortana trading events (signal/open/close).
        if isinstance(data, SignalFired):
            self._bot.send(f"SIGNAL {data.symbol} {data.side} "
                           f"score={data.score:.2f}")
        elif isinstance(data, TradeOpened):
            self._bot.send(f"OPEN {data.symbol} qty={data.qty} "
                           f"entry={data.entry_px}")
        elif isinstance(data, TradeClosed):
            self._bot.send(f"CLOSE {data.symbol} pnl={data.pnl:.2f} "
                           f"hold={data.duration_s}s")

Why this pattern preserves the Telegram invariant:

  1. No log-sink path. Telegram is wired to the MessageBus, not the logging thread. Watchdog DEBUG/INFO/WARNING records cannot leak to Telegram because the Actor doesn’t subscribe to log records - only to typed events.
  2. Curated subscription. The Actor explicitly lists the event types it forwards. Adding a new alert means adding a new subscribe_data(...) line - there’s no “subscribe to everything and filter” footgun.
  3. reconciliation flag check. On startup, the LiveExecutionEngine may synthesize fills to align cached state with broker truth. The Actor explicitly filters those out - no spurious “you got filled!” alerts at engine restart.
  4. Same-bus semantics. This Actor runs on the same single-threaded kernel as the strategy, with the same immutability guarantees. Telegram alerts are causally ordered with the events that triggered them.

For operator alerts (engine degraded, DataClient disconnected, etc.) - the watchdog AI-meta layer - those go to:

  • Logs (via the Nautilus logging subsystem, JSON file output, picked up by Vector/Loki).
  • In-chat surface when Cody is at the keyboard (per feedback_watchdog_to_telegram.md: “surface it IN CHAT - not Telegram”).

Never to Telegram. Telegram is the trader’s phone.

Cortana MK3 implications

MK2 today: many ad-hoc loggers, mixed sinks

MK2 has at least these distinct loggers, each with its own sink and format:

| MK2 logger | Sink | Format | Notes |
|---|---|---|---|
| decision_logger | SQLite (decisions.db) | rows | Conflates audit trail with debug |
| score_logger | SQLite + file | mixed | Per-bar scoring snapshots |
| gate_logger | SQLite | rows | Gate accept/reject reasons |
| position_logger | file + stdout | text | PM lifecycle events |
| watchdog_logger | file | text | AI-meta heartbeat / engine health |
| ibkr_logger | file | text | Broker comms |
| uw_logger | file | text | UW alert ingestion |
| app.py root logger | stdout | text | Engine boot / loop |

These have drifted in format, log-rotation policy, and verbosity defaults. Some write to Telegram (incorrectly - see feedback_watchdog_to_telegram.md). Some are missing rotation entirely (disk-fill risk).

MK3: one LoggingConfig, one stream

LoggingConfig(
    log_level="INFO",
    log_level_file="DEBUG",
    log_file_format="json",
    log_directory="/var/log/cortana",
    log_file_max_size=100_000_000,        # 100MB rotation
    log_file_max_backup_count=20,         # ~2GB ceiling
    log_colors=False,                     # disable for file consumers
    log_component_levels={
        "RiskEngine": "DEBUG",            # we want every gate decision
        "ScoringActor": "DEBUG",
        "CortanaStrategy": "INFO",
        "InteractiveBrokersClient": "INFO",
        "UWDataClient": "INFO",
    },
    use_pyo3=True,                        # capture Rust-side logs too
    print_config=True,                    # paper-trail the config
    clear_log_file=False,                 # never truncate; we replay
)
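
The "~2GB ceiling" comment can be checked with quick arithmetic. This sketch assumes the backup count excludes the file currently being written and that every rotated file is at most log_file_max_size bytes; disk_ceiling_bytes is a hypothetical helper.

```python
def disk_ceiling_bytes(max_size: int, backup_count: int) -> int:
    """Upper bound: every backup full, plus the file currently being written."""
    return max_size * (backup_count + 1)

ceiling = disk_ceiling_bytes(100_000_000, 20)
ceiling_gb = ceiling / 1e9   # 2.1 under the assumptions above
```

Each tenant process keeps its own set of rotated files, so the per-host bound scales with the number of TradingNode instances sharing the directory.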

What changes structurally:

  • One JSON-per-line file instead of N text/SQLite sinks. Vector or Loki ingests it directly.
  • Component filtering replaces module-by-module logger config.
  • Audit trail moves to MessageBus + Parquet (see Nautilus Events). Logs become operator/debug telemetry only - they are not the trading record.
  • Telegram is decoupled from logs. It subscribes to typed events on the MessageBus via the TelegramAlerter Actor pattern above.
  • Rotation is automatic. No more disk-fill scares.

Preserving the Telegram-events-only invariant

The hard rule from feedback_watchdog_to_telegram.md:

“Telegram is the TRADING-system alert channel. It carries: signal fired, trade entered, trade exited (win/loss/PnL), engine errors, real operational alerts the user wants on their phone. It is NOT the AI agent’s status feed.”

Mapping to MK3 plumbing:

| Event class | Path |
|---|---|
| Signal fired (custom Cortana event) | MessageBus → TelegramAlerter.on_data → Telegram |
| Trade opened (custom event) | MessageBus → TelegramAlerter.on_data → Telegram |
| Trade closed with PnL (custom event) | MessageBus → TelegramAlerter.on_data → Telegram |
| Order filled (built-in OrderFilled) | MessageBus → TelegramAlerter.on_order_filled → Telegram (skip if reconciliation=True) |
| Order rejected (built-in OrderRejected) | MessageBus → TelegramAlerter.on_order_rejected → Telegram |
| Engine degraded / faulted | Nautilus FSM transition + log record → Vector/Loki, NOT Telegram |
| Watchdog heartbeat | Log record → Vector/Loki, NOT Telegram |
| Cache miss / stale-data warning | Log record → Vector/Loki, NOT Telegram |
| AI-meta loop wakeup (“nothing actionable”) | NOT logged at INFO; suppressed entirely or DEBUG-only |

The architectural seam - logs vs MessageBus events - is what makes the invariant enforceable by construction rather than by review discipline. A new component physically cannot leak to Telegram unless someone explicitly subscribes the TelegramAlerter Actor to its event type.
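
The "enforceable by construction" point can be demonstrated with a toy typed pub/sub in plain Python. This is not the Nautilus MessageBus; TypedBus, WatchdogHeartbeat, and the dataclass fields are invented for the sketch. It only shows that a handler subscribed by type never sees types it did not list.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class TradeClosed:
    symbol: str
    pnl: float

@dataclass(frozen=True)
class WatchdogHeartbeat:
    note: str

class TypedBus:
    """Delivers an event only to handlers subscribed to its exact type."""

    def __init__(self) -> None:
        self._subs = defaultdict(list)

    def subscribe(self, event_type: type, handler) -> None:
        self._subs[event_type].append(handler)

    def publish(self, event) -> None:
        for handler in self._subs[type(event)]:
            handler(event)

telegram_outbox = []
bus = TypedBus()
# The alerter lists the types it forwards; nothing else can reach it.
bus.subscribe(TradeClosed, lambda e: telegram_outbox.append(f"CLOSE {e.symbol} pnl={e.pnl:.2f}"))

bus.publish(WatchdogHeartbeat("nothing actionable"))  # no subscriber: dropped
bus.publish(TradeClosed("SPY", 42.5))
```

After both publishes the outbox holds only the trade-close message; the heartbeat had no path to it.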

Open questions for the 2026-05-09 spike

  1. use_pyo3 flag - does enabling it incur measurable overhead on the hot path, or is it free? Need to confirm before turning it on in production.
  2. Multi-tenant log key - under the SaaS roadmap, each tenant TradingNode writes its own log file (since trader_id and instance_id differentiate names). Is one log directory per host with per-tenant filenames sufficient, or do we want log_directory=/var/log/cortana/<tenant_id>/?
  3. Log → Loki/Vector pipeline - is Vector the right collector, or does Promtail/Fluent Bit fit better? Spike Step 9 should pick.
  4. TRACE for IBKR adapter - when debugging an order-routing bug, we want module-path filtering on the Rust IBKR crate. Confirm the NAUTILUS_LOG=nautilus_interactive_brokers::orders=Trace pattern works in practice.
  5. tracing capture - if Cortana writes a custom Rust feature extractor using tracing, its output goes to stdout only (not the JSON file). Do we need tracing at all, or stick to log for custom Rust components?
  6. Log-record schema for downstream parsing - the JSON shape is Nautilus-defined. Confirm fields available (component, level, timestamp, message, trader_id, instance_id) match what our log queries need.
  7. Sequential backtest replay - confirm the LogGuard pattern in the docs handles a 30-config parameter sweep without leaking guards (255 limit).

Caveats and gotchas

  • TRACE level cannot be emitted from Python. It’s Rust-only output. Setting TRACE as a filter still captures Rust trace output, but Python logger.trace(...) doesn’t exist.
  • tracing and log are separate pipes. Output from Rust crates using tracing (gated by RUST_LOG) does NOT go through the Nautilus log file. It writes to stdout directly. Don’t expect a unified JSON file.
  • init_tracing() is once-per-process. Calling it twice raises. When LoggingConfig(use_tracing=True), the kernel calls it during setup - don’t call it again from user code.
  • Log files default to CWD. Without log_directory, you get log files wherever the process was launched. Production deployments must set log_directory explicitly.
  • log_components_only=True with empty filters silences everything. Easy footgun: you turn on components-only mode, forget to populate the dict, and now nothing logs. Always assert at least one component is listed.
  • clear_log_file=True deletes audit trail on restart. Never set for production. Useful for tests.
  • Custom Python log sinks are not a public API. If you need a custom downstream, it’s Vector/Loki/Datadog at the file boundary, or it’s an Actor on the MessageBus. There is no register_log_handler(callback) surface.
  • Abrupt termination loses buffered logs. SIGKILL bypasses the LogGuard flush. For crash-only design, accept that the most recently buffered log records may be lost (the docs give no bound on how many) - the audit trail of trading state is on the MessageBus + Redis, not in logs.
  • MPSC channel is unbounded. A runaway TRACE flood from a Rust crate can balloon memory if the logging thread can’t drain fast enough. Mitigate by setting tight component/module filters.
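
Several of the gotchas above are mechanical enough to check before boot. A minimal sketch, assuming the settings are held in a plain dict before being passed to LoggingConfig; check_logging_kwargs is a hypothetical helper, not a Nautilus API.

```python
def check_logging_kwargs(kwargs: dict) -> list:
    """Return a list of human-readable problems found in the logging settings."""
    problems = []
    if not kwargs.get("log_directory"):
        problems.append("log_directory is unset: files will land in the CWD")
    if kwargs.get("log_components_only") and not kwargs.get("log_component_levels"):
        problems.append("log_components_only=True with no component filters: nothing will log")
    if kwargs.get("clear_log_file"):
        problems.append("clear_log_file=True truncates the existing log file on startup")
    return problems

production = {
    "log_level": "INFO",
    "log_directory": "/var/log/cortana",
    "log_component_levels": {"RiskEngine": "DEBUG"},
    "clear_log_file": False,
}
risky = {"log_components_only": True, "log_component_levels": {}, "clear_log_file": True}
```

A boot script could refuse to start a live node when the returned list is non-empty and only then build LoggingConfig(**kwargs).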

When this concept applies

  • Configuring MK3’s logging at TradingNode boot.
  • Designing the Telegram alerter as a MessageBus Actor (not a log sink).
  • Setting up Vector / Loki / Datadog downstream forwarding.
  • Debugging an adapter with module-path filtering via NAUTILUS_LOG.
  • Running sequential backtests / postmortems with LogGuard.
  • Splitting operator telemetry (logs) from causal trading record (Events on the MessageBus).

When it breaks / does not apply

  • Trading audit trail. Logs are NOT the audit record. Use MessageBus events + Parquet persistence (see Nautilus Events).
  • Custom in-process log sink. Not a supported public API. Use external aggregator (Vector) or an Actor on the MessageBus.
  • Cross-language unified output stream. tracing (Rust) and log (everything else) are parallel pipes. They don’t merge into one file.
  • Sub-millisecond log delivery guarantees. The MPSC channel decouples by design; expect best-effort delivery that can lag the originating event by an unspecified amount (the docs publish no latency figures).

See Also

  • Nautilus Architecture - single-threaded kernel, MPSC offload pattern, crash-only design
  • Nautilus Events - typed events on the MessageBus, the audit-trail surface that Telegram subscribes to
  • Nautilus Message Bus - pub/sub spine that the TelegramAlerter Actor rides
  • 2026-05-09 Nautilus Spike Plan: ~/conductor/workspaces/cortanaroi-mk2/belo-horizonte/plans/2026-05-09-nautilus-spike.md
  • User memory: feedback_watchdog_to_telegram.md - Telegram is for trading events only; AI-meta watchdog noise stays out
  • User memory: feedback_loop_snapshots_noisy.md - silent checks for watchdog wakeups
  • Source: https://nautilustrader.io/docs/latest/concepts/logging/

Timeline

  • 2026-05-07 | Cody - Filed during pre-spike concept mastery sweep batch 3.