---
namespace: aiwg
platforms: [all]
name: research-gap-detect
description: Build the mutual citation graph, find connected components, identify isolated clusters, and optionally search for bridge candidates and file gap issues.
commandHint:
  argumentHint: "[--clusters-only] [--file-issues] [--search-bridges] [--format full|summary|json]"
  allowedTools: Read, Write, Glob, Grep, Bash, Agent, WebSearch
  model: sonnet
  category: research-analysis
---

# Research Gap Detect

Analyze the research corpus citation graph to find disconnected clusters, isolated papers, and gap opportunities. Optionally searches for bridge paper candidates and files gap issues.

## Triggers

- "find research gaps"
- "detect clusters"
- "cluster analysis"
- "find isolated papers"
- "bridge candidate search"
- `/research-gap-detect`

## Parameters

### `--clusters-only` (optional)

Only run cluster detection; skip bridge search and issue filing.

### `--file-issues` (optional)

Auto-file gap issues for each disconnected cluster pair.

### `--search-bridges` (optional)

Search external databases for papers that could bridge disconnected clusters.

### `--min-cluster-size N` (optional)

Minimum papers in a cluster to report. Default: 2.

### `--format` (optional)

Output format: `full` (default), `summary`, or `json`.

## Execution Flow

### Phase 1: Build Citation Graph

1. Read the citation-network index (from `/corpus-index-build --graph citation-network`)
   - If stale or missing: run `/corpus-index-build --graph citation-network` first
2. Build an adjacency list from outgoing + incoming edges
3. Treat as undirected for cluster detection (A cites B ≡ A connected to B)

### Phase 2: Connected Components (BFS)

Run BFS/connected-components on the undirected citation graph:

1. Initialize: all nodes unvisited
2. For each unvisited node: BFS to find its connected component
3. Collect components sorted by size (largest first)

**Output**:

```
Connected Components: 9

Cluster 1: "Agentic Workflows" (124 papers)
  Hub: REF-016 (34 connections)
  Topics: agentic-workflows, multi-agent, orchestration
  Sample: REF-001, REF-016, REF-024, REF-121 ...

Cluster 2: "GUI Agents" (31 papers)
  Hub: REF-198 (12 connections)
  Topics: gui-agents, web-agents, screen-understanding
  Sample: REF-198, REF-201, REF-215 ...

...

Cluster 9: "Isolated" (3 papers)
  No hub (all degree 1)
  REF-299, REF-312, REF-350
```

### Phase 3: Gap Analysis

For each pair of clusters, assess the gap:

1. **Topic overlap**: do the clusters share any tags?
2. **Temporal overlap**: do they cover the same years?
3. **Author overlap**: do any authors appear in both clusters?
4. **Bridgeability**: could a single paper connect them?

Prioritize gaps by:

- **Size product**: larger clusters disconnected = higher priority
- **Topic proximity**: clusters with related but not identical topics
- **Recency**: newer clusters may simply be missing recent cross-citations

**Output**:

```
Gap Analysis: 12 cluster pairs

Priority 1: "Agentic Workflows" ↔ "GUI Agents"
  Gap: 124 × 31 = 3,844 (size product)
  Topic overlap: agent, llm (2 shared tags)
  Bridge opportunity: HIGH
  Suggested search: "LLM agent GUI interaction orchestration"

Priority 2: "Evaluation" ↔ "Reproducibility"
  Gap: 45 × 28 = 1,260
  Topic overlap: evaluation, benchmark (2 shared tags)
  Bridge opportunity: MEDIUM
  Suggested search: "reproducible LLM evaluation benchmarks"

...
```

### Phase 4: Bridge Search (if --search-bridges)

For each high-priority gap:

1. Generate search queries from cluster topic overlap
2. Search external databases (Semantic Scholar, arXiv, Google Scholar)
3. Filter candidates by:
   - Cites papers from BOTH clusters
   - Published in overlapping time range
   - High citation count (likely to be connecting work)
4. Rank candidates by bridge potential

**Output**:

```
Bridge Candidates Found: 8

For gap "Agentic Workflows" ↔ "GUI Agents":
  1. "WebAgent: World-Centric Web Navigation" (2024)
     Cites: REF-016 (Cluster 1), REF-198 (Cluster 2)
     Citations: 87
     Bridge potential: HIGH

  2. "Agent-E: Vision-Language Planning for Web Tasks" (2024)
     Cites: REF-024 (Cluster 1), REF-201 (Cluster 2)
     Citations: 45
     Bridge potential: MEDIUM
```

### Phase 5: File Issues (if --file-issues)

For each gap with bridge candidates, file a research induction issue:

```markdown
## Research Gap: [Cluster A] ↔ [Cluster B]

**Gap Size**: [N × M papers disconnected]
**Bridge Candidates**: [list]
**Suggested Action**: Induct [top candidate] to connect clusters

### Bridge Papers to Induct
- [ ] "WebAgent: World-Centric Web Navigation" - arxiv:2401.XXXXX
- [ ] "Agent-E: Vision-Language Planning" - arxiv:2403.XXXXX
```

### Phase 6: Report

```
Research Gap Detection
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Graph: 372 nodes, 1,247 edges
Connected components: 9
Largest cluster: 124 papers ("Agentic Workflows")
Isolated papers: 3

Gap analysis: 12 cluster pairs
  HIGH priority: 4 (bridge candidates available)
  MEDIUM priority: 5
  LOW priority: 3

Bridge candidates found: 8 papers
Issues filed: 4
Papers recommended for induction: 8
```

## Distinction from research-gap

| Tool | Approach | Output |
|------|----------|--------|
| `research-gap` | **Intellectual**: topic coverage, missing areas, GRADE gaps | Gap report with search queries |
| `research-gap-detect` | **Structural**: citation graph topology, disconnected components | Cluster map, bridge candidates, filed issues |

`research-gap` answers "what topics are we missing?" while `research-gap-detect` answers "which existing papers don't cite each other but should?"

## Examples

```bash
# Full analysis with bridge search
/research-gap-detect --search-bridges

# Just show clusters
/research-gap-detect --clusters-only

# Detect and auto-file issues
/research-gap-detect --file-issues

# Combined: search + file
/research-gap-detect --search-bridges --file-issues

# JSON for visualization
/research-gap-detect --format json
```

## References

- @$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/corpus-index-build/SKILL.md: Builds the citation-network graph
- @$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/citation-backfill/SKILL.md: Prerequisite: complete bidirectional edges
- @$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/research-gap/SKILL.md: Complementary intellectual gap analysis
- @$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/induct-research/SKILL.md: Inducts bridge candidates
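
## Appendix: Illustrative Sketch (Phases 1-4)

The graph steps in the Execution Flow (undirected adjacency list, BFS connected components sorted by size, size-product gap ranking, and the "cites papers from BOTH clusters" bridge filter) can be sketched in Python as below. This is a minimal illustration, not the skill's actual implementation. The function names, the `(citing, cited)` edge-pair input, and the toy paper IDs are all assumptions made for the example, and the real citation-network index format may differ.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_adjacency(edges):
    """Phase 1: undirected adjacency map from (citing, cited) pairs (assumed input shape)."""
    adj = defaultdict(set)
    for citing, cited in edges:
        adj[citing].add(cited)
        adj[cited].add(citing)  # A cites B is treated as A connected to B
    return adj


def connected_components(adj, nodes=()):
    """Phase 2: BFS over every unvisited node; returns components, largest first.

    `nodes` lets callers include papers that have no edges at all.
    """
    visited = set()
    components = []
    for start in sorted(set(adj) | set(nodes)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in sorted(adj.get(node, ())):
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    components.sort(key=len, reverse=True)
    return components


def rank_gaps(components, min_cluster_size=2):
    """Phase 3 (size product only): (product, cluster_i, cluster_j), highest first."""
    clusters = [c for c in components if len(c) >= min_cluster_size]
    pairs = [
        (len(a) * len(b), i, j)
        for (i, a), (j, b) in combinations(enumerate(clusters, start=1), 2)
    ]
    return sorted(pairs, key=lambda p: p[0], reverse=True)


def is_bridge(cited_ids, cluster_a, cluster_b):
    """Phase 4 filter: candidate cites at least one paper in each cluster."""
    cited = set(cited_ids)
    return bool(cited & set(cluster_a)) and bool(cited & set(cluster_b))


if __name__ == "__main__":
    # Toy data, not taken from any real corpus.
    toy_edges = [("A", "B"), ("B", "C"), ("D", "E")]
    graph = build_adjacency(toy_edges)
    clusters = connected_components(graph, nodes=["F"])
    print(clusters)
    print(rank_gaps(clusters))
    print(is_bridge(["A", "D"], clusters[0], clusters[1]))
```

The sketch ranks gaps by size product alone. Topic proximity and recency, the other two prioritization criteria, depend on tag and date metadata that this example does not model.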