Architecture Overview
The claw-code architecture follows a dual-layer design: a Python orchestration layer that manages agent sessions, command routing, and LLM interaction, paired with a Rust performance layer that handles API communication, tool execution, terminal rendering, and security. This separation keeps the high-level agent logic expressive and easy to modify while pushing latency-sensitive operations into compiled, memory-safe code.
At the highest level, every user interaction flows through three stages: bootstrap (environment discovery, configuration loading, mode routing), query engine execution (turn loops with tool-calling and message compaction), and response rendering (streaming markdown with syntax highlighting in the terminal). Each stage is backed by well-defined modules in both layers.
User Terminal (REPL / stdin)
|
v
+------------------------------------------+
| Rust CLI Binary |
| rusty-claude-cli (crossterm, syntect, |
| pulldown_cmark, braille spinner) |
+------------------------------------------+
|
v
+------------------------------------------+
| Python Orchestration Layer |
| bootstrap_graph.py (7 stages) |
| runtime.py (route_prompt, run_turn) |
| query_engine.py (max_turns=8, stream) |
| commands.py | tools.py | models.py |
| context.py | session_store.py |
| transcript.py | execution_registry.py |
| tool_pool.py | parity_audit.py |
| + 50 more modules |
+------------------------------------------+
| | |
v v v
+-----------+ +-----------+ +-----------+
| Rust api | | Rust | | Rust |
| crate | | runtime | | tools |
| Anthropic | | 16 modules| | 19 specs |
| client, | | bash,file | | JSON |
| SSE,OAuth | | ops,mcp, | | schemas |
| retry | | oauth, | | |
| | | session, | | |
| | | prompt, | | |
| | | usage | | |
+-----------+ +-----------+ +-----------+
| |
v v
+-----------+ +---------------+
| Rust | | Rust |
| commands | | compat- |
| 15 slash | | harness |
| commands | | (bridge |
| | | layer) |
+-----------+ +---------------+
|
v
Anthropic API (api.anthropic.com)
Python Orchestration Layer (src/)
The Python workspace contains 60+ modules organized into tightly scoped subsystems. Each module owns a single responsibility, making the codebase easy to navigate and test in isolation. Below is a detailed breakdown of every major subsystem.
Bootstrap Graph — bootstrap_graph.py
The bootstrap sequence is the entry point for every claw-code session. It runs through 7 sequential stages, each one gated on the previous stage completing successfully:
- Prefetch — Pre-loads reference data and warms caches before anything else runs.
- Warning handler — Attaches global warning filters so noisy deprecation messages from third-party libraries do not leak into the user terminal.
- CLI parser — Parses command-line arguments and flags, establishing the execution mode early.
- Setup + Commands parallel load — Runs environment setup and command snapshot loading concurrently for faster startup.
- Deferred init — Initializes components that depend on the parsed CLI state and loaded configuration.
- Mode routing — Routes execution to one of six modes: local, remote, ssh, teleport, direct-connect, or deep-link.
- Query engine submit loop — Hands control to the query engine for the interactive turn loop.
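The gating behavior of the bootstrap sequence can be sketched in a few lines. The stage names mirror the seven stages above; the function and the boolean-returning stage callables are illustrative, not the real bootstrap_graph.py API.

```python
# Minimal sketch of a gated bootstrap: each stage runs only if the
# previous one succeeded. Stage names follow the doc; everything else
# is a hypothetical stand-in for bootstrap_graph.py.

def run_bootstrap(stages):
    """Run stages in order; abort on the first failure."""
    completed = []
    for name, stage in stages:
        if not stage():  # gate: stop the whole sequence on failure
            raise RuntimeError(f"bootstrap stage failed: {name}")
        completed.append(name)
    return completed

STAGES = [
    ("prefetch", lambda: True),
    ("warning_handler", lambda: True),
    ("cli_parser", lambda: True),
    ("setup_and_commands", lambda: True),
    ("deferred_init", lambda: True),
    ("mode_routing", lambda: True),
    ("submit_loop", lambda: True),
]
```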
Query Engine — query_engine.py
The query engine is the central orchestration hub. It manages the conversation loop between the user, the LLM, and the tool system. Key configuration values are defined in the QueryEngineConfig dataclass:
- max_turns = 8 — The maximum number of LLM round-trips per user query before the engine halts.
- max_budget_tokens = 2000 — Token budget cap for a single query session.
- compact_after_turns = 12 — After this many accumulated turns, the transcript is compacted to free context window space.
Each turn produces a TurnResult, and the QueryEnginePort class exposes session management, message compaction, and streaming. The streaming interface yields six distinct event types: message_start, command_match, tool_match, permission_denial, message_delta, and message_stop. This event-driven design allows the terminal renderer to paint output incrementally as tokens arrive.
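A consumer of this event stream might look like the sketch below. Only the six event names come from the engine description; the payload shape (a dict with "type" and "text" keys) is an assumption for illustration.

```python
# Hedged sketch of an incremental stream consumer. Event names are from
# the doc; the payload dicts are hypothetical.

EVENT_TYPES = {
    "message_start", "command_match", "tool_match",
    "permission_denial", "message_delta", "message_stop",
}

def render_stream(events):
    """Accumulate text deltas and stop at message_stop."""
    out = []
    for event in events:
        kind = event["type"]
        if kind not in EVENT_TYPES:
            continue  # ignore unknown event types
        if kind == "message_delta":
            out.append(event["text"])  # paint output incrementally
        elif kind == "message_stop":
            break
    return "".join(out)
```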
Runtime — runtime.py
The runtime module bridges raw user input and the query engine. It provides PortRuntime, which exposes two critical methods:
- route_prompt — Tokenizes user input and scores it against available commands and tools, producing RoutedMatch objects (each with a kind, name, source_hint, and score).
- bootstrap_session / run_turn_loop — Initializes a new agent session and enters the main execution loop.
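The routing step can be sketched as follows. The RoutedMatch fields come from the module description above; the exact-match scoring heuristic is invented purely for illustration.

```python
from dataclasses import dataclass

# Illustrative prompt routing. RoutedMatch fields are from the doc; the
# scoring logic here is a stand-in, not the real runtime.py heuristic.

@dataclass
class RoutedMatch:
    kind: str          # "command" or "tool"
    name: str
    source_hint: str
    score: float

def route_prompt(text, commands):
    """Match the first token against known slash commands."""
    stripped = text.strip()
    token = stripped.split()[0] if stripped else ""
    matches = []
    for name in commands:
        if token == f"/{name}":
            matches.append(RoutedMatch("command", name, "builtin", 1.0))
    return matches
```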
Commands — commands.py
The command inventory loads from reference_data/commands_snapshot.json at startup. It exposes the CommandExecution dataclass with five fields: name, source_hint, prompt, handled, and message. Four public functions provide the command API:
- load_command_snapshot() — Reads and parses the JSON snapshot file.
- get_command(name) — Returns a single command definition by name.
- find_commands(query) — Fuzzy-searches commands matching a query string.
- execute_command(cmd) — Dispatches a command for execution.
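A minimal sketch of the snapshot-backed inventory, assuming the snapshot is a JSON array of command objects (the actual file layout of reference_data/commands_snapshot.json is not specified here). The dataclass fields are from the doc; parsing from a string rather than a file keeps the sketch self-contained.

```python
import json
from dataclasses import dataclass

# Sketch of the command inventory. Fields come from the doc; the JSON
# shape and the string-based loader are assumptions.

@dataclass
class CommandExecution:
    name: str
    source_hint: str
    prompt: str
    handled: bool
    message: str

def load_command_snapshot(raw_json):
    """Parse a snapshot (here from a string) into a name-keyed dict."""
    return {c["name"]: CommandExecution(**c) for c in json.loads(raw_json)}
```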
Tools — tools.py
The tool inventory mirrors the command system but for tool-calling. It loads from reference_data/tools_snapshot.json and defines a ToolExecution dataclass. Key functions include:
- load_tool_snapshot() — Cached with lru_cache for zero-cost repeated access.
- build_tool_backlog() — Constructs the list of pending tool operations.
- get_tools(simple_mode) — When simple_mode is enabled, restricts the available tools to only BashTool, FileReadTool, and FileEditTool.
- filter_tools_by_permission_context() — Applies permission filtering based on the current session context.
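The simple-mode restriction is easy to illustrate. The three tool names are from the doc; the function body is a sketch, not the real tools.py implementation.

```python
# Sketch of simple-mode filtering: when enabled, only the three tools
# named in the doc remain visible to the LLM.

SIMPLE_MODE_TOOLS = {"BashTool", "FileReadTool", "FileEditTool"}

def get_tools(all_tools, simple_mode=False):
    if simple_mode:
        return [t for t in all_tools if t in SIMPLE_MODE_TOOLS]
    return list(all_tools)
```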
Core Data Models — models.py
The models.py module defines the shared data types that flow through every layer of the system:
| Dataclass | Fields | Purpose |
|---|---|---|
| Subsystem | name, path, file_count, notes | Represents a discovered subsystem in the workspace |
| PortingModule | name, responsibility, source_hint, status | Tracks porting progress for a single module |
| PermissionDenial | — | Records when a tool call is blocked by permissions |
| UsageSummary | input_tokens, output_tokens (word-count proxy) | Token usage tracking for cost management |
| PortingBacklog | — | Aggregates all pending porting work |
Supporting Python Modules
Beyond the core subsystems, the Python layer includes dozens of focused modules:
- context.py — Workspace discovery via PortContext, which locates source_root, tests_root, assets_root, archive_root, and counts Python files.
- session_store.py — JSON-based session persistence stored in the .port_sessions/ directory.
- transcript.py — In-memory conversation transcript with a compaction strategy that keeps the last 10 messages (keep_last=10).
- execution_registry.py — Unified registry exposing MirroredCommand and MirroredTool types that bridge the Python and Rust execution models.
- tool_pool.py — Filtered tool assembly via the ToolPool class, which caps visible tools at 15 to avoid overwhelming the LLM context.
- parity_audit.py — Compares the Python and TypeScript implementations: 18 root file mappings and 31 directory mappings to track porting completeness.
- cost_tracker.py — Accumulates token usage and dollar costs across turns.
- history.py — Session history storage and retrieval.
- ink.py — Markdown panel rendering for rich terminal output.
- permissions.py — Permission checks and policy enforcement.
- prefetch.py — Data prefetching for faster bootstrap.
- remote_runtime.py — Handles remote execution modes.
- direct_modes.py — Direct-connect and deep-link mode handling.
- command_graph.py — Command dependency graph resolution.
- query.py — Lower-level query construction helpers.
- deferred_init.py — Lazy initialization for heavy components.
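The keep-last-N compaction strategy used by transcript.py (keep_last=10, per the list above) amounts to a one-line slice. The function name is illustrative.

```python
# Sketch of keep-last-N transcript compaction: retain only the most
# recent keep_last messages, discarding older content.

def compact(messages, keep_last=10):
    if len(messages) <= keep_last:
        return list(messages)
    return list(messages[-keep_last:])
```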
Rust Performance Layer (rust/)
The Rust workspace is organized as a 6-crate Cargo workspace. Each crate is compiled independently, enabling incremental builds and clear dependency boundaries. Together, they provide the high-performance foundation that the Python layer calls into.
api Crate — Anthropic API Client
The api crate encapsulates all communication with the Anthropic API. Its AnthropicClient handles authentication, retries, and streaming:
- Retry logic — Automatically retries on HTTP status codes 408, 409, 429, 500, 502, 503, and 504.
- Authentication — The
AuthSourceenum supports four modes:None,ApiKey,BearerToken, andApiKeyAndBearer. - Streaming — Server-Sent Events via
MessageStreamandSseParserfor real-time token delivery. - OAuth — Full token exchange flow for enterprise authentication.
API Constants
DEFAULT_BASE_URL = "https://api.anthropic.com" • ANTHROPIC_VERSION = "2023-06-01" • DEFAULT_MAX_RETRIES = 2
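The retry policy above can be sketched as a small wrapper. The status-code set and DEFAULT_MAX_RETRIES come from the doc; send() is a hypothetical stand-in for the actual HTTP call, and backoff timing is omitted for brevity.

```python
# Hedged sketch of the api crate's retry policy: retry only on the
# listed status codes, up to DEFAULT_MAX_RETRIES additional attempts.

RETRYABLE = {408, 409, 429, 500, 502, 503, 504}  # from the doc
DEFAULT_MAX_RETRIES = 2

def request_with_retry(send, max_retries=DEFAULT_MAX_RETRIES):
    """send() is a stand-in returning (status_code, body)."""
    status, body = send()
    for _ in range(max_retries):
        if status not in RETRYABLE:
            break  # success or a non-retryable error
        status, body = send()
    return status, body
```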
runtime Crate — 16 Modules
The runtime crate is the largest in the Rust workspace with 16 modules covering everything from bash execution to OAuth flows:
Bootstrap (12 phases)
From CliEntry to MainRuntime through 12 sequential phases that progressively build the execution environment.
ConversationRuntime
Core turn loop with ApiClient + ToolExecutor traits. Max iterations capped at 16 per conversation round.
CompactionConfig
preserve_recent = 4 messages, max_estimated_tokens = 10000. Keeps context window lean without losing critical state.
Configuration (3 sources)
ConfigSources: User, Project, and Local. Discovers settings.json files at each level with cascading priority.
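Cascading resolution across the three sources can be sketched as successive dict merges. The User < Project < Local precedence order is an assumption inferred from "cascading priority"; the real discovery of settings.json files is not shown.

```python
# Sketch of cascading settings resolution: later (more specific) layers
# override earlier ones. Precedence order is an assumption.

def resolve_settings(user, project, local):
    merged = {}
    for layer in (user, project, local):  # later layers win
        merged.update(layer)
    return merged
```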
File Operations
Read, write, edit, glob, and grep, with a head_limit of 250 for search results and a truncation cap of 100 glob matches.
Permission System
PermissionMode: Allow, Deny, or Prompt. PermissionPolicy supports per-tool permission modes.
System Prompt Builder
MAX_INSTRUCTION_FILE_CHARS = 4000, MAX_TOTAL_INSTRUCTION_CHARS = 12000. Discovers and assembles CLAUDE.md files.
MCP Integration
Name normalization with mcp__{server}__{tool} convention. 6 transport types: Stdio, SSE, HTTP, WebSocket, SDK, and ClaudeAiProxy.
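The mcp__{server}__{tool} convention is simple enough to show directly. These helper names are illustrative; only the naming format itself comes from the doc.

```python
# Sketch of the MCP tool-name convention: mcp__{server}__{tool}.

def mcp_tool_name(server, tool):
    return f"mcp__{server}__{tool}"

def parse_mcp_tool_name(name):
    """Split a normalized name back into (server, tool)."""
    prefix, server, tool = name.split("__", 2)
    if prefix != "mcp":
        raise ValueError(f"not an MCP tool name: {name}")
    return server, tool
```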
Additional runtime modules include:
- bash — Sandboxed shell command execution.
- json — Zero-dependency JSON parser for minimal overhead.
- oauth — Full PKCE authorization code flow, storing credentials at
~/.claude/credentials.json. - remote — Upstream proxy configuration with
DEFAULT_REMOTE_BASE_URLand a list of 16 no-proxy hosts. - session —
MessageRoleenum (System, User, Assistant, Tool) andContentBlockvariants (Text, ToolUse, ToolResult). - sse — Incremental SSE parser for streaming API responses.
- usage —
ModelPricingwith per-model rates: Sonnet at $15/$75 per million tokens, Haiku at $1/$5, Opus at $15/$75. Includesformat_usdfor display.
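A cost calculation under the per-million-token rates listed above might look like the following. The PRICING table uses two of the listed rates; format_usd's four-decimal formatting is a guess at the display helper's behavior, not the crate's actual output.

```python
# Illustrative cost math using rates from the doc. format_usd's exact
# formatting is an assumption.

PRICING = {  # (input, output) USD per million tokens
    "haiku": (1.0, 5.0),
    "opus": (15.0, 75.0),
}

def cost_usd(model, input_tokens, output_tokens):
    inp, out = PRICING[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

def format_usd(amount):
    return f"${amount:.4f}"
```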
tools Crate — 19 Tool Specifications
The tools crate defines 19 tool specifications, each with a full JSON Schema for parameter validation. These schemas are what the LLM sees when deciding which tool to call:
| Tool | Category | Description |
|---|---|---|
| bash | Execution | Run shell commands in a sandboxed environment |
| read_file | File I/O | Read file contents with offset/limit support |
| write_file | File I/O | Write or overwrite file contents |
| edit_file | File I/O | Targeted string replacements within files |
| glob_search | Search | Pattern-based file discovery |
| grep_search | Search | Regex content search powered by ripgrep |
| WebFetch | Network | HTTP requests to external URLs |
| WebSearch | Network | Web search queries |
| TodoWrite | Planning | Structured task list management |
| Skill | Extension | Invoke registered skill modules |
| Agent | Multi-agent | Spawn sub-agent for parallel work |
| ToolSearch | Discovery | Search for deferred tools by keyword |
| NotebookEdit | Notebook | Edit Jupyter notebook cells |
| Sleep | Utility | Pause execution for a specified duration |
| SendUserMessage | Communication | Send messages back to the user |
| Config | Settings | Read/write configuration values |
| StructuredOutput | Output | Return structured JSON responses |
| REPL | Execution | Interactive language REPL sessions |
| PowerShell | Execution | Windows PowerShell command execution |
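To make "full JSON Schema for parameter validation" concrete, here is a hypothetical example of what the bash tool's spec might look like. The schema fields shown (name, description, input_schema) follow the common shape of LLM tool definitions; the actual Rust-side schemas may differ.

```python
# Hypothetical tool spec: the real schemas live in the Rust tools crate
# and may use different fields. Shown for illustration only.

BASH_TOOL_SCHEMA = {
    "name": "bash",
    "description": "Run shell commands in a sandboxed environment",
    "input_schema": {
        "type": "object",
        "properties": {
            "command": {"type": "string"},
            "timeout_ms": {"type": "integer"},
        },
        "required": ["command"],
    },
}
```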
commands Crate — 15 Slash Commands
The commands crate implements the slash command system with 15 built-in commands. Each command is classified by a CommandSource enum: Builtin, InternalOnly, or FeatureGated.
rusty-claude-cli — CLI Binary
The rusty-claude-cli crate is the user-facing binary. It defaults to claude-sonnet-4-20250514 as the model and provides a rich terminal experience:
- REPL — Interactive read-eval-print loop with a braille spinner animation (10 frames) for loading states.
- Syntax highlighting — Powered by
syntectwith thebase16-ocean.darktheme. - Markdown rendering — Uses
pulldown_cmarkto parse and render markdown directly in the terminal. - Raw mode line editor — Built on
crosstermfor cross-platform terminal input handling.
compat-harness Crate
The compat-harness crate serves as the compatibility bridge between the Python orchestration layer and the Rust performance layer. It ensures that data structures, function signatures, and calling conventions remain stable across the language boundary.
Data Flow: From Prompt to Response
Understanding the claw-code architecture means tracing how a single user prompt flows through the entire system:
- The user types a prompt in the rusty-claude-cli REPL.
- The CLI passes input to the Python layer, where runtime.py tokenizes it and calls route_prompt to determine whether it matches a slash command or should go to the LLM.
- If it is a slash command, the commands crate handles it directly. Otherwise, the query engine takes over.
- The query engine constructs the API request — including the system prompt (built by the Rust prompt module from discovered CLAUDE.md files), conversation history (managed by transcript.py), and available tools (filtered by tool_pool.py, max 15).
- The Rust api crate sends the request to Anthropic's API with streaming enabled, yielding SSE events back through the Python layer.
- If the LLM response includes a tool call, the Rust runtime crate executes it (bash commands, file operations, etc.), checking permissions via the PermissionPolicy.
- Tool results feed back into the next turn. This loop continues for up to 8 turns (Python query engine) or 16 iterations (Rust conversation runtime).
- When the conversation accumulates more than 12 turns, the transcript is compacted — the Rust compaction module preserves the 4 most recent messages and limits estimated tokens to 10,000.
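The flow above condenses into a small turn loop. The max_turns=8 cap is from the doc; llm() and run_tool() are stand-ins for the Python and Rust layers, and the message dicts are a simplified stand-in for the real session schema.

```python
# Condensed sketch of the prompt-to-response loop: call the model, run
# any tool call, feed the result back, stop at max_turns or a final
# answer. llm() and run_tool() are hypothetical stand-ins.

def run_query(llm, run_tool, prompt, max_turns=8):
    history = [{"role": "user", "content": prompt}]
    for _ in range(max_turns):
        reply = llm(history)
        history.append(reply)
        if "tool_call" not in reply:
            return reply["content"]  # final answer, loop ends
        result = run_tool(reply["tool_call"])
        history.append({"role": "tool", "content": result})
    return None  # turn budget exhausted
```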
Session & Transcript Management
Claw-code maintains session state through two complementary mechanisms:
- session_store.py persists full session data as JSON in the
.port_sessions/directory, enabling the/resumecommand to restore previous conversations. - transcript.py keeps an in-memory rolling transcript. Its compaction algorithm retains the last 10 messages (
keep_last=10), discarding older content to prevent context window overflow.
On the Rust side, the session module defines the message schema with MessageRole (System, User, Assistant, Tool) and ContentBlock (Text, ToolUse, ToolResult). This shared schema ensures both layers interpret conversation history identically.
Python-TypeScript Parity Audit
Since claw-code is a clean-room rewrite of the Claude Code architecture, the parity_audit.py module continuously tracks implementation completeness. It maintains 18 root file mappings and 31 directory mappings that map Python modules to their TypeScript counterparts, making it straightforward to identify gaps and prioritize porting work.