# rawq — Agent Usage Guide Context retrieval engine. Semantic + lexical hybrid search over codebases. Returns ranked code chunks with file paths, line ranges, scope labels, and confidence scores. ## When to use rawq Use rawq when you don't know where to look. Use grep/read when you already know the file or exact string. | Situation | Tool | |-----------|------| | "Where is retry logic implemented?" | `rawq search` | | "Find the function named `parse_config`" | `rawq search` or `grep` | | "Read line 42 of src/main.rs" | `read` | | "What does this codebase do?" | `rawq map` | | "What changed and how does it affect X?" | `rawq diff` | ## Query style matters rawq blends semantic (embedding) and lexical (BM25) search. **How you phrase the query changes which mode dominates.** **Use natural language for concepts** — this is where rawq beats grep: ```bash rawq search "how does the app handle authentication failures" . rawq search "database connection pooling and retry logic" . rawq search "where are environment variables validated" . ``` **Use identifiers only for exact symbol lookup:** ```bash rawq search "fn parse_config" . rawq search "class DatabaseClient" . ``` **Do NOT use grep-style keyword queries with rawq.** These produce worse results: ```bash # BAD — grep-style keywords, rawq can't infer intent rawq search "auth error" . rawq search "db pool" . # GOOD — natural language, rawq understands the concept rawq search "how does authentication error handling work" . rawq search "database connection pool management" . ``` The more descriptive your query, the better semantic search works. Single keywords trigger lexical-dominant mode which is just BM25 — no better than grep. ## Use filtering options Agents often search the entire codebase when they already know constraints. Use filters to narrow results and improve relevance: ```bash # Filter by language — skip irrelevant file types rawq search "parse config" . --lang rust rawq search "API endpoint" . --lang typescript # Exclude patterns — skip tests, generated code, vendored deps rawq search "database" . --exclude "*.test.*" --exclude "vendor/*" # Force search mode when you know what you need rawq search -e "reconnect" . # lexical only — exact keyword match rawq search -s "how does caching work" . # semantic only — concept search # Re-rank for better precision on ambiguous queries rawq search "error handling" . --rerank # Text weight — boost docs/comments when searching for explanations rawq search "how to configure" . --text-weight 1.0 # Token budget — control how much context is returned rawq search "auth" . --token-budget 2000 --json ``` ## Commands ### search — find relevant code ```bash rawq search "query" [path] # hybrid search (default) rawq search "query" [path] --json # structured JSON for parsing rawq search "query" [path] --lang rust # only Rust files rawq search "query" [path] --exclude "test*" # skip test files rawq search "query" [path] --top 5 # limit to 5 results rawq "query" [path] # shorthand (no subcommand needed) ``` Key flags: - `--top N` — number of results (default 10) - `--context N` — surrounding context lines (default 3) - `--json` — structured output with all fields - `--stream` — NDJSON streaming (one result per line) - `--lang X` — filter by language - `--exclude "glob"` — skip matching files - `-e` / `-s` — force lexical / semantic mode - `--rerank` — two-pass keyword overlap re-ranking - `--text-weight F` — weight for text/markdown chunks (default 0.5, use 1.0 for docs) - `--token-budget N` — max tokens in results - `--full-file` — include full file content in results ### map — codebase structure ```bash rawq map . # definitions with hierarchy rawq map . --depth 3 # deeper nesting rawq map . --lang rust # only Rust files rawq map . --exclude "test*" # skip test directories rawq map . --json # structured output ``` Use to orient in an unfamiliar codebase before searching. **Filter with `--lang` and `--exclude`** to avoid noise from irrelevant files. ### diff — search within changes ```bash rawq diff "query" . # unstaged changes rawq diff "query" . --staged # staged changes rawq diff "query" . --base main # diff vs branch ``` ## JSON output format ```json { "schema_version": 1, "model": "snowflake-arctic-embed-s", "results": [ { "file": "src/db.rs", "lines": [23, 41], "display_start_line": 23, "language": "rust", "scope": "DatabaseClient.reconnect", "confidence": 0.91, "content": "...", "context_before": "...", "context_after": "...", "token_count": 45 } ], "query_ms": 8, "total_tokens": 45 } ``` ## Workflow 1. `rawq map .` — understand the structure 2. `rawq search "descriptive query" . --json` — find relevant code 3. Read the top results' files for full context 4. Act on what you found rawq narrows down which files matter. Read those files, not everything.