What is RLM?
RLM (Recursive Language Model) graphs handle massive context (1M+ tokens) that’s too large to fit in prompts. Instead of interpolating huge documents directly into LLM calls, RLM graphs:
- Materialize context to a searchable store
- Search via fast local tools (~1ms grep/search)
- Extract relevant snippets for LLM processing
- Synthesize the final answer from found information
When to use RLM
| Scenario | Use RLM? |
|---|---|
| Context < 100K tokens | No - use `graph_type: "policy"` |
| Context 100K-500K tokens | Maybe - depends on model limits |
| Context > 500K tokens | Yes - use `graph_type: "rlm"` |
| RAG over large corpus | Yes |
| Codebase analysis | Yes |
| Multi-document QA | Yes |
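The thresholds in the table above can be condensed into a small selection helper. This is a sketch, not an SDK function: `choose_graph_type` and the 200K-token default model window are assumptions for illustration.

```python
def choose_graph_type(context_tokens: int, model_limit: int = 200_000) -> str:
    """Pick a graph type from estimated context size (thresholds from the table)."""
    if context_tokens < 100_000:
        return "policy"  # small enough to fit in a prompt
    if context_tokens <= 500_000:
        # gray zone: RLM only if the context exceeds the model's window
        return "rlm" if context_tokens > model_limit else "policy"
    return "rlm"  # too large for any prompt
```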
Quick start
- Auto-detects when `documents` is too large for prompts
- Auto-adds RLM tools (`materialize_context`, `local_grep`, etc.)
- Auto-configures the proposer to use tool-based search patterns
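A minimal configuration might look like the sketch below. The dict schema (e.g. the `inputs` key) is an assumption for illustration; only `graph_type: "rlm"` and the `documents` field come from this page.

```python
# Illustrative graph config; the exact SDK schema may differ.
graph_config = {
    "graph_type": "rlm",  # triggers auto-detection and RLM tool injection
    "inputs": {
        "documents": {"type": "text"},  # large field: materialized for search
        "question": {"type": "text"},
    },
}
```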
Auto-added tools
When `graph_type: "rlm"`, these tools are automatically available:
| Tool | Latency | Description |
|---|---|---|
| `materialize_context` | ~1ms | Store input fields for searching |
| `local_grep` | ~1ms | Regex search on materialized content |
| `local_search` | ~1ms | Substring search |
| `query_lm` | ~100ms | Sub-LM calls for processing chunks |
| `codex_exec` | ~500ms | Shell execution (complex operations) |
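Conceptually, the materialize/search tools behave like the in-memory sketch below. The `ContextStore` class is hypothetical (the real tools are provided by the runtime), but the semantics — store once, then regex or substring search — match the table.

```python
import re

class ContextStore:
    """Sketch of materialize_context / local_grep / local_search semantics."""

    def __init__(self):
        self._fields: dict[str, str] = {}

    def materialize_context(self, name: str, text: str) -> None:
        self._fields[name] = text  # store once, search many times

    def local_grep(self, pattern: str, name: str) -> list[str]:
        # Every line of the materialized field that matches the regex.
        rx = re.compile(pattern)
        return [ln for ln in self._fields[name].splitlines() if rx.search(ln)]

    def local_search(self, needle: str, name: str) -> list[str]:
        # Plain substring search, no regex metacharacters.
        return [ln for ln in self._fields[name].splitlines() if needle in ln]

store = ContextStore()
store.materialize_context("documents", "alpha 1\nbeta 2\nalpha 3")
hits = store.local_grep(r"alpha \d", "documents")  # ["alpha 1", "alpha 3"]
```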
How it works
1. Materialize - store large input fields in a local, searchable store (`materialize_context`)
2. Search - locate relevant regions with `local_grep` / `local_search`
3. Process - run sub-LM calls (`query_lm`) over the extracted snippets and synthesize the answer
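The three steps can be sketched end to end in plain Python. Everything here is illustrative: `rlm_answer` is a hypothetical helper, the keyword filter stands in for real grep-based search, and `query_lm` is passed in as a stand-in for the sub-LM tool.

```python
def rlm_answer(question: str, documents: str, query_lm) -> str:
    """Sketch of the materialize -> search -> process loop."""
    # 1. Materialize: index the raw text instead of prompting with it.
    lines = documents.splitlines()
    # 2. Search: cheap local filtering narrows huge context to a few snippets.
    keywords = [w for w in question.lower().split() if len(w) > 3]
    snippets = [ln for ln in lines if any(k in ln.lower() for k in keywords)]
    # 3. Process: only the matching snippets ever reach the LLM.
    return query_lm(f"Answer {question!r} using:\n" + "\n".join(snippets[:20]))
```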
Performance
Local RLM tools run ~11,000x faster than equivalent sandbox operations:

| Operation | Sandbox | Local RLM |
|---|---|---|
| Write 45KB file | 300ms | 0.03ms |
| Grep file | 400ms | 0.1ms |
| Line count | 350ms | 0.05ms |
| Total | 1050ms | 0.2ms |
Dataset format
Large context fields are auto-detected, but you can explicitly mark them.
Inference
Inference works the same as regular graphs.
RLM as a Pattern
RLM is available both as a `graph_type` and as a pattern. This distinction matters:
- `graph_type: "rlm"` - The graph’s primary purpose is RLM-style search
- `patterns.required: ["rlm"]` - Apply the RLM pattern to ANY graph type
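As a sketch, the two forms might look like this in a graph config. The dict layout is an assumption for illustration; only the keys `graph_type` and `patterns.required` come from this page.

```python
# 1. The whole graph is RLM-first:
rlm_graph = {"graph_type": "rlm"}

# 2. Layer the RLM pattern onto another graph type:
policy_with_rlm = {
    "graph_type": "policy",
    "patterns": {"required": ["rlm"]},  # adds RLM tools without changing the graph type
}
```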
RLM Verifier
Train a verifier that uses tool-based search to analyze large traces:
- Auto-gets RLM tools (`materialize_context`, `local_grep`, etc.)
- Uses tool-based search instead of stuffing traces into prompts
- Outputs a score (0.0-1.0) like any verifier
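The scoring contract can be illustrated with a toy verifier. The grep heuristic below is invented for illustration (a real RLM verifier would search via the tools above); only the 0.0-1.0 output contract comes from this page.

```python
import re

def rlm_verifier(trace: str) -> float:
    """Toy scoring logic: grep the trace for failure markers and
    map the hit rate to a 0.0-1.0 score."""
    lines = trace.splitlines() or [""]
    bad = [ln for ln in lines if re.search(r"(?i)\b(error|traceback|failed)\b", ln)]
    return round(1.0 - len(bad) / len(lines), 2)  # fewer error lines, higher score
```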
Pattern Options
- `rlm` - Tool-based search for massive context
- `map_reduce` - Parallel processing for lists (common for verifiers)
- `single_shot` - Single LLM call
- `chain_of_thought` - Multi-step reasoning
- `digest_combine` - Two-stage: digest then combine
Next steps
- Judging: Learn how scoring works in `product/workflows/judging`
- API Reference: See the full API at `sdk/graphs/inference`