---
name: rlm
description: Process large codebases (>100 files) using the Recursive Language Model pattern. Treats code as an external environment, using parallel background agents to map-reduce complex tasks without context rot.
triggers:
  - "analyze codebase"
  - "scan all files"
  - "large repository"
  - "RLM"
  - "find usage of X across the project"
license: MIT
metadata:
  author: ClawFu
  version: 1.0.0
  mcp-server: "@clawfu/mcp-skills"
---

# Recursive Language Model (RLM) Skill

## Core Philosophy
**"Context is an external resource, not a local variable."**

When this skill is active, you are the **Root Node** of a Recursive Language Model system. Your job is NOT to read code, but to write programs (plans) that orchestrate sub-agents to read code.

## Protocol: The RLM Loop

### Phase 1: Choose Your Engine
Decide based on the nature of the data:

| Engine | Use Case | Tool |
|--------|----------|------|
| **Native Mode** | General codebase traversal, finding files, structure. | `find`, `grep`, `bash` |
| **Strict Mode** | Dense data analysis (logs, CSVs, massive single files). | `python3 ~/.claude/skills/rlm/rlm.py` |

### Phase 2: Index & Filter (The "Peeking" Phase)
**Goal**: Identify relevant data without loading it.
1.  **Native**: Use `find` or `grep -l`.
2.  **Strict**: Use `python3 .../rlm.py peek "query"`.
    *   *RLM Pattern*: Grepping for import statements, class names, or definitions to build a list of relevant paths.
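
The peeking step can be sketched in plain Python. This is a minimal illustration, not the actual `rlm.py peek` implementation: the function name `peek`, its parameters, and the default `*.ts` glob are all assumptions made for the example. It streams each file line by line and stops at the first match, so only paths come back to the caller.

```python
import re
from pathlib import Path


def peek(root, pattern, glob="*.ts"):
    """Return paths under `root` that contain a line matching `pattern`.

    Hypothetical helper: files are streamed line by line and scanning stops
    at the first hit, so file contents never reach the caller.
    """
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).rglob(glob)):
        if not path.is_file():
            continue
        with path.open(encoding="utf-8", errors="replace") as handle:
            if any(regex.search(line) for line in handle):
                hits.append(str(path))
    return hits
```

A call such as `peek("src", r"@Controller")` would then yield the list of candidate paths to hand to the map phase.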

### Phase 3: Parallel Map (The "Sub-Query" Phase)
**Goal**: Process chunks in parallel using fresh contexts.
1.  **Divide**: Split the work into atomic units.
    - **Strict Mode**: `python3 .../rlm.py chunk --pattern "*.log"` -> Returns JSON chunks.
2.  **Spawn**: Use `background_task` to launch parallel agents.
    *   *Constraint*: For broad tasks, launch at least 3 agents in parallel (3-5 is a typical starting point).
    *   *Prompting*: Give each background agent ONE specific chunk or file path.
    *   *Format*: `background_task(agent="explore", prompt="Analyze chunk #5 of big.log: {content}...")`
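
To make the divide step concrete, here is a rough sketch of line-based chunking. It is not the real `rlm.py chunk` command, whose output format is not documented here; `chunk_lines`, `build_prompt`, the dictionary keys, and the 200-line default are assumptions chosen for illustration. Each chunk is a plain dictionary, so the list can be serialized to JSON.

```python
def chunk_lines(lines, max_lines=200):
    """Split an iterable of lines into chunks of at most `max_lines` lines.

    Hypothetical helper; the chunk layout is an assumption, not rlm.py's format.
    """
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")
    chunks = []
    buffer = []
    for number, line in enumerate(lines, start=1):
        buffer.append(line)
        if len(buffer) == max_lines:
            chunks.append(_make_chunk(len(chunks), number, buffer))
            buffer = []
    if buffer:
        # Flush the final, possibly shorter, chunk.
        chunks.append(_make_chunk(len(chunks), number, buffer))
    return chunks


def _make_chunk(chunk_id, end_line, buffer):
    """Build one chunk record with 1-based, inclusive line bounds."""
    return {
        "id": chunk_id,
        "start_line": end_line - len(buffer) + 1,
        "end_line": end_line,
        "content": "".join(buffer),
    }


def build_prompt(chunk, source_name):
    """Format the prompt text for one sub-agent, covering exactly one chunk."""
    return (
        f"Analyze chunk #{chunk['id']} of {source_name} "
        f"(lines {chunk['start_line']}-{chunk['end_line']}):\n{chunk['content']}"
    )
```

Each prompt produced by `build_prompt` would be passed to a separate `background_task` call, keeping one chunk per agent.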

### Phase 4: Reduce & Synthesize (The "Aggregation" Phase)
**Goal**: Combine results into a coherent answer.
1.  **Collect**: Read the outputs from `background_task` (via `background_output`).
2.  **Synthesize**: Look for patterns, consensus, or specific answers in the aggregated data.
3.  **Refine**: If the answer is incomplete, perform a second RLM recursion on the specific missing pieces.
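
The collect and refine steps might look like the sketch below. The function `reduce_results` and the shape of `outputs` (a mapping from unit name to a list of findings, or `None` when an agent failed) are assumptions for the example, not a documented `background_output` format.

```python
def reduce_results(expected_units, outputs):
    """Merge per-agent findings and report units that need another pass.

    Hypothetical helper. `outputs` maps a unit (file path or chunk id) to a
    list of findings, or to None when the agent failed or returned nothing
    usable. An empty list means the agent ran and found nothing.
    """
    merged = []
    missing = []
    for unit in expected_units:
        findings = outputs.get(unit)
        if findings is None:
            missing.append(unit)  # candidate for a second RLM recursion
            continue
        for finding in findings:
            merged.append({"unit": unit, "finding": finding})
    return merged, missing
```

Anything returned in `missing` is a natural input for the next recursion, so only the gaps are re-queried rather than the whole set.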

## Critical Instructions

1.  **NEVER** use `cat *`, and never read more than 5 files into your main context at once.
2.  **ALWAYS** prefer `background_task` for reading/analyzing file contents when the file count > 1.
3.  **Use `rlm.py`** for programmatic slicing of large files that `grep` can't handle well.
4.  **Python is your Memory**: If you need to track state across 50 files, write a Python script (or use `rlm.py`) to scan them and output a summary.

## Example Workflow: "Find all API endpoints and check for Auth"

**Wrong Way (Monolithic)**:
- `read src/api/routes.ts`
- `read src/api/users.ts`
- ... (Context fills up, reasoning degrades)

**RLM Way (Recursive)**:
1.  **Filter**: `grep -rl "@Controller" --include="*.ts" src/` -> Returns 20 files.
2.  **Map**: 
    - `background_task(prompt="Read src/api/routes.ts. Extract all endpoints and their @Auth decorators.")`
    - `background_task(prompt="Read src/api/users.ts. Extract all endpoints and their @Auth decorators.")`
    - ... (Launch all 20)
3.  **Reduce**: 
    - Collect all 20 outputs.
    - Compile into a single table.
    - Identify missing auth.
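
The reduce step of this workflow could be sketched as follows, assuming each sub-agent's answer has already been parsed into records with `file`, `endpoint`, and `auth` keys. That record shape and the function name `compile_auth_table` are hypothetical, and table cells are not escaped.

```python
def compile_auth_table(records):
    """Render endpoint records as a Markdown table and list those lacking auth.

    Hypothetical helper; assumes `auth` holds the decorator text, or is
    None/empty when no auth decorator was found.
    """
    header = "| File | Endpoint | Auth |\n|------|----------|------|"
    rows = []
    missing = []
    for rec in sorted(records, key=lambda r: (r["file"], r["endpoint"])):
        auth = rec.get("auth")
        rows.append(f"| {rec['file']} | {rec['endpoint']} | {auth or 'MISSING'} |")
        if not auth:
            missing.append(rec)
    return "\n".join([header, *rows]), missing
```

The table goes into the final answer, while `missing` is the short list that needs human review.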

## Recovery Mode
If `background_task` is unavailable or fails:
1.  Fall back to **Iterative Python Scripting**.
2.  Write a Python script that loads each file, runs a regex/AST check, and prints the result to stdout.
3.  Read the script's stdout.
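
A fallback script along these lines might look like the sketch below. It is an illustration only: the function name `scan` and the tab-separated output format are assumptions. It emits one compact summary line per file, so the root context reads a short report instead of the files themselves.

```python
import re
from pathlib import Path


def scan(paths, pattern, max_lines_listed=5):
    """Return one summary line per file: path, match count, first line numbers.

    Hypothetical helper; each file is loaded, checked, and discarded, and only
    the summary survives.
    """
    regex = re.compile(pattern)
    report = []
    for path in paths:
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        hits = [
            number
            for number, line in enumerate(text.splitlines(), start=1)
            if regex.search(line)
        ]
        listed = ",".join(str(n) for n in hits[:max_lines_listed])
        report.append(f"{path}\t{len(hits)}\t{listed}")
    return report
```

Printing the result with `print("\n".join(scan(paths, r"@Auth")))` gives the stdout that step 3 reads.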

---

## What Claude Does vs What You Decide

| Claude handles | You provide |
|---------------|-------------|
| Orchestrating parallel agents | Initial query and success criteria |
| Chunking large files for processing | Judgment on result quality |
| Synthesizing results from subagents | Final interpretation and action |
| Writing filtering scripts | Validation of completeness |
| Managing context isolation | Decision on when to stop recursing |

---

## Skill Boundaries

### This skill excels for:
- Codebases with >100 files
- Finding patterns across many files
- Audit tasks (security, auth, logging)
- Large file analysis (logs, data dumps)

### This skill is NOT ideal for:
- Small projects (<50 files) → Direct reading is faster
- Single file analysis → Overkill
- Tasks requiring file modification → Use a different approach

---

## Skill Metadata

```yaml
name: rlm
category: meta
version: 2.0
author: GUIA
source_expert: Recursive Language Model pattern
difficulty: advanced
mode: cyborg
tags: [rlm, large-codebase, parallel-agents, map-reduce, context-management]
created: 2026-02-03
updated: 2026-02-03
```