---
name: decompose-mcp
description: Decompose any text into classified semantic units — authority, risk, attention, entities. No LLM. Deterministic.
metadata: {"clawdbot":{"emoji":"🧩","requires":{"anyBins":["python3","python"]},"install":[{"id":"pip","kind":"uv","pkg":"decompose-mcp","bins":["decompose"],"label":"Install decompose-mcp (pip/uv)"}]}}
---

# Decompose

Decompose any text or URL into classified semantic units. Each unit gets authority level, risk category, attention score, entity extraction, and irreducibility flags. No LLM required. Deterministic. Runs locally.

## Setup

### 1. Install

```bash
pip install decompose-mcp
```

### 2. Configure MCP Server

Add to your OpenClaw MCP config:

```json
{
  "mcpServers": {
    "decompose": {
      "command": "python3",
      "args": ["-m", "decompose", "--serve"]
    }
  }
}
```

### 3. Verify

```bash
python3 -m decompose --text "The contractor shall provide all materials per ASTM C150-20."
```

## Available Tools

### `decompose_text`

Decompose any text into classified semantic units.

**Parameters:**
- `text` (required) — The text to decompose
- `compact` (optional, default: false) — Omit zero-value fields for smaller output
- `chunk_size` (optional, default: 2000) — Max characters per unit

**Example prompt:** "Decompose this spec and tell me which sections are mandatory"

**Returns:** JSON with `units` array. Each unit contains:
- `authority` — mandatory, prohibitive, directive, permissive, conditional, informational
- `risk` — safety_critical, security, compliance, financial, contractual, advisory, informational
- `attention` — 0.0 to 10.0 priority score
- `actionable` — whether someone needs to act on this
- `irreducible` — whether content must be preserved verbatim
- `entities` — referenced standards and codes (ASTM, ASCE, IBC, OSHA, etc.)
- `dates` — extracted date references
- `financial` — extracted dollar amounts and percentages
- `heading_path` — document structure hierarchy

### `decompose_url`

Fetch a URL and decompose its content. Handles HTML, Markdown, and plain text.

**Parameters:**
- `url` (required) — URL to fetch and decompose
- `compact` (optional, default: false) — Omit zero-value fields

**Example prompt:** "Decompose https://spec.example.com/transport and show me the security requirements"

## What It Detects

- **Authority levels** — RFC 2119 keywords: "shall" = mandatory, "should" = directive, "may" = permissive
- **Risk categories** — safety-critical, security, compliance, financial, contractual
- **Attention scoring** — authority weight x risk multiplier, 0-10 scale
- **Standards references** — ASTM, ASCE, IBC, OSHA, ACI, AISC, AWS, ISO, EN
- **Financial values** — dollar amounts, percentages, retainage, liquidated damages
- **Dates** — deadlines, milestones, notice periods
- **Irreducibility** — legal mandates, threshold values, formulas that cannot be paraphrased

## Use Cases

- Pre-process documents before sending to your LLM — save 60-80% of context window
- Classify specs, contracts, policies, regulations by obligation level
- Extract standards references and compliance requirements
- Route high-attention content to specialized analysis chains
- Build structured training data from raw documents

## Library: `filter_for_llm()`

When using Decompose as a Python library (not MCP), `filter_for_llm()` pre-filters decompose output for LLM context windows:

```python
from decompose import decompose_text, filter_for_llm

result = decompose_text(document_text)
filtered = filter_for_llm(result, max_tokens=4000)
# filtered["text"] = high-value units only (mandatory, safety-critical, financial, compliance)
# filtered["meta"]["reduction_pct"] = typically 60-80% token reduction
```

Keeps mandatory, prohibitive, directive, conditional authority + safety-critical, compliance, financial, contractual risk + requirement, constraint, data, definition types. Configurable via keyword arguments.

## Performance

- ~14ms average per document on Apple Silicon
- 1,000+ chars/ms throughput
- Zero API calls, zero cost, works offline
- Deterministic — same input always produces same output

## Security & Trust

**Text classification is fully local.** The `decompose_text` tool performs all processing in-process with no network I/O. No data leaves your machine.

**URL fetching performs outbound HTTP requests.** The `decompose_url` tool fetches the target URL, which necessarily involves network I/O to the specified host. This is why the skill declares the `network` permission in `claw.json`. If you do not need URL fetching, you can use `decompose_text` exclusively with no network access required.

**SSRF protection.** URL fetching blocks private/internal IP ranges before connecting: `0.0.0.0/8`, `10.0.0.0/8`, `100.64.0.0/10`, `127.0.0.0/8`, `169.254.0.0/16`, `172.16.0.0/12`, `192.168.0.0/16`, `::1/128`, `fc00::/7`, `fe80::/10`. The implementation resolves the hostname via DNS *before* connecting and checks all returned addresses against the blocklist. See [`src/decompose/mcp_server.py` lines 19-49](https://github.com/echology-io/decompose/blob/main/src/decompose/mcp_server.py#L19-L49).

**No API keys or credentials required.** No external services are contacted except when using `decompose_url` to fetch user-specified URLs.

**Source code is fully auditable.** The complete source is published at [github.com/echology-io/decompose](https://github.com/echology-io/decompose). The PyPI package is built from this repo via GitHub Actions ([`publish.yml`](https://github.com/echology-io/decompose/blob/main/.github/workflows/publish.yml)) using PyPI Trusted Publishers (OIDC), so the published artifact is traceable to a specific commit.

## Resources

- [Source Code (GitHub)](https://github.com/echology-io/decompose) — full source, auditable
- [PyPI](https://pypi.org/project/decompose-mcp/) — published via Trusted Publishers
- [Documentation](https://echology.io/decompose)
- [Blog: When Regex Beats an LLM](https://echology.io/blog/regex-beats-llm)
- [Blog: Why Your Agent Needs a Cognitive Primitive](https://echology.io/blog/cognitive-primitive)