# FUNCTION_DOC_WORKFLOW_V5_BATCH

Orchestrate parallel function documentation using subagents. Each subagent follows [FUNCTION_DOC_WORKFLOW_V5.md](FUNCTION_DOC_WORKFLOW_V5.md) independently. This document covers target selection, dispatch, and result collection only.

## Dispatch Pattern

```
Task(
  subagent_type: "general-purpose",
  model: "sonnet",  // or "opus" for Public API / complex Init functions
  description: "Document FunctionName",
  prompt: "Follow docs/prompts/FUNCTION_DOC_WORKFLOW_V5.md to document the function
  at address 0xADDRESS (currently named 'FUN_XXXXXXXX').
  Skip get_current_selection() — the address is provided above.
  Apply all changes directly in Ghidra using MCP tools.
  Return the DONE output when complete."
)
```

**Concurrency**: Max 3 subagents at once. MCP tools serialize at the Ghidra HTTP layer — more than 3 risks timeouts without speed benefit.

**Model selection**: `sonnet` for Worker/Leaf/Getter functions. `opus` for Public API, complex Init/Cleanup, and functions requiring deep algorithm analysis.

## Target Selection

### By completeness score
1. Run `analyze_function_completeness` on candidates
2. Filter to score < 70%, sort ascending (worst first)
3. Dispatch subagents for each

### By call graph (callees first)
1. `get_function_call_graph` on the target
2. Topological sort: leaves first, then callers
3. Dispatch leaves in parallel, then next tier

### By undocumented functions
1. `list_functions` filtered to `FUN_*` prefix, or `find_next_undefined_function` repeatedly
2. Dispatch in batches of 3

### By neighborhood (address-adjacent)
1. Pick a documented function as anchor
2. `list_functions` to find adjacent `FUN_*` entries
3. Useful after orphaned code discovery — process newly created functions in the same region

## Practical Notes

These issues come up repeatedly when running V5 at scale:

- **`get_function_variables` returns empty after prototype changes**: Register-only variables lose Ghidra symbols. Call `force_decompile` first to refresh, then retry. Even if still empty, `rename_variables` works by matching names from decompiled output.
- **`set_local_variable_type` "No HighVariable found"**: Common for stack arrays (e.g., `ushort[6]`) and decompiler-inferred composites. Skip on first failure — note in plate comment Special Cases. Do not retry.
- **Storage still `undefined4` despite resolved display type**: The decompiler shows `int`/`dword`/`FILE*` but storage remains `undefined4`. Explicitly calling `set_local_variable_type` with the same type resolves it. Critical for reaching 100%.
- **Unfixable deductions** (do not retry or flag for manual review):
  - `this` void* in `__thiscall` — convention keyword, can't rename or type further
  - HighVariable-unmappable arrays — decompiler limitation
  - API-mandated void* params (e.g., `DllMain pvReserved`)
  - Phantom variables (`extraout_*`, `in_*`)
- **Trivial getters** (6 bytes, 2 instructions): 3 tool calls total — rename+prototype, plate comment, verify. Subagent overhead may not be worth it; consider documenting inline.

## Error Handling

- Subagent timeout/connection error: retry once, then skip and log
- Score < 50%: flag for manual review, do not re-dispatch
- Score 50-70% with only unfixable deductions (check `all_deductions_unfixable`): accept as complete

## Output

```
BATCH COMPLETE: N functions documented
Scores: FuncA=100%, FuncB=95%, FuncC=89% (unfixable: this void*)
Skipped: FuncD (timeout), FuncE (45% - manual review)
```