---
name: research-storage
description: "Research file storage conventions and templates for dokhak agents. Use when: (1) saving research results from research-collector or researcher agents, (2) reading cached research files, (3) checking if research exists for a section. Provides directory structure, file format templates, and naming conventions."
allowed-tools: Read, Write, Glob
---

# Research Storage Skill

This skill defines conventions for storing and retrieving research data collected by dokhak agents. Research files are cached to enable reuse and reduce redundant web searches.

## Quick Reference for Agents

| Agent | Uses This Skill For |
|-------|---------------------|
| researcher | Directory resolution, research.md writing, multi-tier lookup |
| research-collector | summary.md, sources.md writing to `.research/init/` |
| writer | Reading research files (read-only) |
| structure-designer | Reading init research (read-only) |

### Standard Loading Pattern

All agents should reference this skill for:

- **Normalization functions**: normalizeChapter, normalizeSection, generateSlug
- **Multi-tier directory resolution**: Handling legacy naming inconsistencies
- **File format templates**: research.md, sources.md, summary.md

```
Read("skills/research-storage/SKILL.md")
```

## Directory Structure

```
project-root/
├── .research/                    # Research cache directory
│   ├── init/                     # /init command research
│   │   ├── summary.md            # Structured research summary
│   │   └── sources.md            # Source registry with reliability
│   │
│   └── sections/                 # /write command section research
│       ├── 01-1-introduction/
│       │   ├── research.md       # Section research results
│       │   └── sources.md        # Section sources
│       ├── 01-2-core-concepts/
│       │   ├── research.md
│       │   └── sources.md
│       └── {chapter}-{section}-{slug}/
│           ├── research.md
│           └── sources.md
```

## Naming Convention

### Section Directory Pattern (CANONICAL)

Format: `{chapter}-{section}-{slug}`

| Component | Format | Canonical Example | Non-canonical (avoid) |
| --------- | ----------------------- | ----------------- | --------------------- |
| chapter | Zero-padded 2 digits | `01`, `02`, `10` | `1`, `2` |
| section | Single digit (NO padding) | `1`, `2`, `3` | `01`, `02` |
| slug | Kebab-case lowercase | `core-concepts` | `Core-Concepts` |

**Canonical Examples**:

- Chapter 1, Section 2, "Core Concepts" → `01-2-core-concepts` ✓
- Chapter 3, Section 1, "Getting Started" → `03-1-getting-started` ✓
- Chapter 10, Section 3, "Advanced Patterns" → `10-3-advanced-patterns` ✓

**Non-canonical (may exist from legacy/inconsistency)**:

- `1-2-core-concepts` (chapter not padded)
- `01-02-core-concepts` (section padded)
- `01-2-Core-Concepts` (slug not lowercase)

## Normalization Functions

**CRITICAL**: All agents MUST use these normalization functions to ensure consistency.

### normalizeChapter(chapter)

Converts any chapter format to canonical 2-digit zero-padded string.

```
Input:  "1" or "01" or 1 or "001"
Output: "01" (always 2-digit zero-padded string)

Process:
1. Convert to integer: parseInt(chapter, 10)
2. Zero-pad to 2 digits: String(n).padStart(2, '0')

Examples:
- "1"   → "01"
- "01"  → "01"
- "10"  → "10"
- 1     → "01"
- "001" → "01"
```

### normalizeSection(section)

Converts any section format to canonical single-digit string (no padding).

```
Input:  "1" or "01" or 1
Output: "1" (single digit, no padding)

Process:
1. Convert to integer: parseInt(section, 10)
2. Convert to string: String(n)

Examples:
- "1"  → "1"
- "01" → "1"
- "3"  → "3"
- "03" → "3"
```

### generateSlug(title)

Converts title to canonical kebab-case slug.

```
Input:  Any title string
Output: Lowercase kebab-case slug

Process:
1. Convert to lowercase: title.toLowerCase()
2. Replace spaces with hyphens: replace(/\s+/g, '-')
3. Remove special characters (keep a-z, 0-9, -): replace(/[^a-z0-9-]/g, '')
4. Collapse multiple hyphens: replace(/-+/g, '-')
5. Trim leading/trailing hyphens: replace(/^-|-$/g, '')

Examples:
- "Core Concepts"        → "core-concepts"
- "What is React?"       → "what-is-react"
- "Setup & Installation" → "setup-installation"
- " Multiple Spaces "    → "multiple-spaces"
- "C++ Programming"      → "c-programming"
```

### buildCanonicalPath(chapter, section, title)

Builds the canonical directory path.

```
Input:  chapter, section, title
Output: ".research/sections/{canonical_chapter}-{canonical_section}-{canonical_slug}/"

Process:
1. canonical_chapter = normalizeChapter(chapter)
2. canonical_section = normalizeSection(section)
3. canonical_slug = generateSlug(title)
4. return ".research/sections/{canonical_chapter}-{canonical_section}-{canonical_slug}/"

Example:
- buildCanonicalPath("1", "02", "Core Concepts")
- → ".research/sections/01-2-core-concepts/"
```

## File Path Generation

### For /init Research

```
.research/init/summary.md
.research/init/sources.md
```

### For Section Research

```
.research/sections/{chapter}-{section}-{slug}/research.md
.research/sections/{chapter}-{section}-{slug}/sources.md
```

**Example**: Section 1.2 "Core Concepts"

```
.research/sections/01-2-core-concepts/research.md
.research/sections/01-2-core-concepts/sources.md
```

## File Format Templates

### summary.md (for /init)

```markdown
# Research Summary

> Generated: {YYYY-MM-DD}
> Topic: {topic}
> Domain: {domain}

## Key Concepts

### {Concept 1}

- **Definition**: {clear definition}
- **Importance**: {why it matters}
- **Source**: [{source name}]({url})

### {Concept 2}

- **Definition**: {clear definition}
- **Importance**: {why it matters}
- **Source**: [{source name}]({url})

## Learning Path

1. **Prerequisites**: {comma-separated list}
2. **Fundamentals**: {comma-separated list}
3. **Core Skills**: {comma-separated list}
4. **Advanced Topics**: {comma-separated list}

## Current Trends ({current_year})

- {trend 1 with source link}
- {trend 2 with source link}

## Domain-Specific Information

{domain-specific sections based on domain-profiles skill}
```

### sources.md (for both /init and sections)

```markdown
# Source Registry

> Section: {section_id or "init"}
> Generated: {YYYY-MM-DD}

## Primary Sources (High Reliability)

| Source | URL   | Type          | Last Verified |
| ------ | ----- | ------------- | ------------- |
| {name} | {url} | Official Docs | {YYYY-MM-DD}  |
| {name} | {url} | Official Docs | {YYYY-MM-DD}  |

## Secondary Sources (Medium Reliability)

| Source | URL   | Type     | Notes   |
| ------ | ----- | -------- | ------- |
| {name} | {url} | Tutorial | {notes} |
| {name} | {url} | Blog     | {notes} |

## Rejected Sources

| Source | Reason            |
| ------ | ----------------- |
| {name} | Outdated (year)   |
| {name} | Unreliable author |
```

### research.md (for sections)

````markdown
# Research: {Section Title}

> Section: {chapter}.{section} {title}
> Target Pages: {N}p
> Generated: {YYYY-MM-DD}

## Scope

{Brief description of what this section covers}

## Key Concepts

### {Concept 1}

- **Definition**: {definition}
- **Source**: [{name}]({url})

### {Concept 2}

- **Definition**: {definition}
- **Source**: [{name}]({url})

## Code Examples

### {Example Title}

```{language}
{code}
```

> Source: [{name}]({url})

## Common Pitfalls

1. **{Pitfall 1}**: {description}
   - **Cause**: {why it happens}
   - **Solution**: {how to avoid}

2. **{Pitfall 2}**: {description}
   - **Cause**: {why it happens}
   - **Solution**: {how to avoid}

## Practical Insights

- {insight 1 with source link}
- {insight 2 with source link}

## Subtopic Coverage

| Subtopic | Status   | Source            |
| -------- | -------- | ----------------- |
| {name}   | Complete | [{source}]({url}) |
| {name}   | Partial  | [{source}]({url}) |
| {name}   | Missing  | -                 |
````

## Directory Resolution Strategy

When locating research directories, use multi-tier search to handle naming inconsistencies from legacy data or different generation sources.

### Why Multi-Tier Search?

Research directories may have been created with inconsistent naming:

| Inconsistency Type     | Example Mismatch                             |
| ---------------------- | -------------------------------------------- |
| Chapter padding        | `1-2-intro` vs `01-2-intro`                  |
| Section padding        | `01-02-intro` vs `01-2-intro`                |
| Slug case              | `01-2-Core-Concepts` vs `01-2-core-concepts` |
| Slug special chars     | `01-2-whats-new?` vs `01-2-whats-new`        |
| Combined inconsistency | `1-02-What's New?` vs `01-2-whats-new`       |

### Multi-Tier Search Algorithm

> **⚠️ CRITICAL: Glob Returns Files Only**
>
> Glob does NOT return directories. All patterns MUST end with a filename (e.g., `/research.md`).
>
> | Pattern | Result |
> |---------|--------|
> | `.research/sections/*9-1*` | ❌ Empty (matches directory, not returned) |
> | `.research/sections/*9-1*/research.md` | ✅ Returns file path |
> | `.research/sections/*9-1*/*` | ✅ Returns all files in matching dirs |

Execute tiers in order. Stop at first successful match.

#### Tier 1: Canonical Exact Match (Primary)

Search using fully normalized canonical path.
```
canonical_chapter = normalizeChapter(chapter)   # "1" → "01"
canonical_section = normalizeSection(section)   # "02" → "2"
canonical_slug = generateSlug(title)            # "Core Concepts" → "core-concepts"

Glob(".research/sections/{canonical_chapter}-{canonical_section}-{canonical_slug}/research.md")

Example: Glob(".research/sections/01-2-core-concepts/research.md")
```

#### Tier 2: Canonical Chapter-Section, Any Slug

If Tier 1 fails, search with canonical chapter-section but wildcard slug.

```
Glob(".research/sections/{canonical_chapter}-{canonical_section}-*/research.md")

Example: Glob(".research/sections/01-2-*/research.md")
```

This catches slug variations like `Core-Concepts`, `core_concepts`, etc.

#### Tier 3: Non-Padded Chapter Variation

If Tier 2 fails, try without chapter zero-padding (legacy compatibility).

```
raw_chapter = String(parseInt(chapter, 10))   # "01" → "1"

Glob(".research/sections/{raw_chapter}-{canonical_section}-*/research.md")

Example: Glob(".research/sections/1-2-*/research.md")
```

#### Tier 4: Flexible Pattern Match (Last Resort)

If all above fail, use section number and first slug keyword.

```
first_keyword = generateSlug(title).split('-')[0]   # "core-concepts" → "core"

Glob(".research/sections/*-{canonical_section}-*{first_keyword}*/research.md")

Example: Glob(".research/sections/*-2-*core*/research.md")
```

### Resolution Output Format

Return resolution result in XML format:

```xml
{original_chapter}
{original_section}
{original_title} {normalized_chapter}
{normalized_section}
{normalized_slug} {canonical_path}
{matched_path OR canonical_path} {true|false} {1|2|3|4|new}
```

### Resolution Logic Summary

```
function resolveResearchDirectory(chapter, section, title):
    # Normalize inputs
    c = normalizeChapter(chapter)
    s = normalizeSection(section)
    slug = generateSlug(title)
    canonical = ".research/sections/{c}-{s}-{slug}/"

    # Tier 1: Exact canonical
    result = Glob("{canonical}research.md")
    if result: return { path: canonical, existing: true, tier: 1 }

    # Tier 2: Canonical chapter-section, any slug
    result = Glob(".research/sections/{c}-{s}-*/research.md")
    if result: return { path: parent(result[0]), existing: true, tier: 2 }

    # Tier 3: Non-padded chapter
    raw_c = String(parseInt(chapter, 10))
    result = Glob(".research/sections/{raw_c}-{s}-*/research.md")
    if result: return { path: parent(result[0]), existing: true, tier: 3 }

    # Tier 4: Flexible pattern
    keyword = slug.split('-')[0]
    result = Glob(".research/sections/*-{s}-*{keyword}*/research.md")
    if result: return { path: parent(result[0]), existing: true, tier: 4 }

    # No match - use canonical for new directory
    return { path: canonical, existing: false, tier: "new" }
```

## Usage Patterns

### Checking Existing Research (UPDATED)

**IMPORTANT**: Do NOT use simple Glob. Use the multi-tier resolution strategy above.

```
# OLD (may miss existing research due to naming inconsistency)
Glob(".research/sections/{chapter}-{section}-{slug}/research.md")

# NEW (handles all variations)
resolution = resolveResearchDirectory(chapter, section, title)
existing_research = resolution.existing
research_dir = resolution.path   # field name as returned by resolveResearchDirectory
```

The call returns the resolved directory path, whether research already exists there, and the tier that matched.

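The resolution flow above can be sketched in JavaScript. This is a minimal illustration, not the skill's actual implementation: the `glob` parameter is an assumed stand-in for the Glob tool (a function from a pattern to an array of matching file paths), and `parentDir` and `makeFakeGlob` are hypothetical helpers introduced only so the example runs on its own.

```javascript
// Normalization helpers, following the steps listed under "Normalization Functions".
function normalizeChapter(chapter) {
  return String(parseInt(chapter, 10)).padStart(2, "0");
}

function normalizeSection(section) {
  return String(parseInt(section, 10));
}

function generateSlug(title) {
  return title
    .toLowerCase()
    .replace(/\s+/g, "-")        // spaces to hyphens
    .replace(/[^a-z0-9-]/g, "")  // drop everything except a-z, 0-9, -
    .replace(/-+/g, "-")         // collapse repeated hyphens
    .replace(/^-|-$/g, "");      // trim leading/trailing hyphen
}

// Hypothetical helper: directory part of a file path, with trailing slash.
function parentDir(filePath) {
  return filePath.slice(0, filePath.lastIndexOf("/") + 1);
}

// `glob` is an assumption: any function (pattern) => array of file paths.
function resolveResearchDirectory(chapter, section, title, glob) {
  const c = normalizeChapter(chapter);
  const s = normalizeSection(section);
  const slug = generateSlug(title);
  const base = ".research/sections/";
  const canonical = `${base}${c}-${s}-${slug}/`;
  const rawChapter = String(parseInt(chapter, 10));
  const keyword = slug.split("-")[0];

  // Tier patterns in priority order; every pattern ends with a filename.
  const tiers = [
    `${canonical}research.md`,
    `${base}${c}-${s}-*/research.md`,
    `${base}${rawChapter}-${s}-*/research.md`,
    `${base}*-${s}-*${keyword}*/research.md`,
  ];

  for (let i = 0; i < tiers.length; i++) {
    const matches = glob(tiers[i]);
    if (matches.length > 0) {
      const path = i === 0 ? canonical : parentDir(matches[0]);
      return { path, existing: true, tier: i + 1 };
    }
  }
  // No match: fall back to the canonical path for a new directory.
  return { path: canonical, existing: false, tier: "new" };
}

// Hypothetical in-memory glob for demonstration; `*` matches within one path segment.
function makeFakeGlob(files) {
  return (pattern) => {
    const source = pattern
      .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
      .replace(/\*/g, "[^/]*");
    const regex = new RegExp(`^${source}$`);
    return files.filter((file) => regex.test(file));
  };
}

// A legacy directory with an unpadded chapter and a capitalized slug.
const legacyGlob = makeFakeGlob([".research/sections/1-2-Core-Concepts/research.md"]);
const resolution = resolveResearchDirectory("01", "02", "Core Concepts", legacyGlob);
console.log(resolution); // matched at tier 3 (non-padded chapter)
```

Passing `glob` in as a parameter keeps the tier logic testable without touching the file system. Note that the sketch's fake glob is case-sensitive, so a Tier 4 keyword such as `core` would not match a directory named `Core-Concepts`; whether the real Glob tool behaves the same way is not stated in this document.
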
### Reading Research Files

When consuming research, read files directly in agent context:

```
Read(".research/init/summary.md")
Read(".research/sections/01-2-core-concepts/research.md")
```

### Saving Research Results

Agents should Write files following the templates above:

```
Write(".research/init/summary.md", content)
Write(".research/sections/01-2-core-concepts/research.md", content)
```

## Agent-Specific Guidelines

### research-collector Agent

- Outputs to: `.research/init/summary.md`, `.research/init/sources.md`
- Creates the directory if it does not exist
- Returns confirmation only: `research_saved:.research/init/`

### researcher Agent

- Outputs to: `.research/sections/{id}/research.md`, `.research/sections/{id}/sources.md`
- Checks existing research via the Subtopic Coverage table
- Appends to the existing file if coverage is partial
- Returns confirmation only: `research_saved:{output_dir}`

### Consumer Agents (structure-designer, writer)

- Receive file paths in prompt
- Read files directly in their own context
- Do not modify research files

## XML Output Schemas

Standardized XML schemas for agent communication. All agents should use these formats for consistency.

### Research Result Schema

Used by `research-collector` and `researcher` agents:

```xml
{N} {N} {path} {YYYY-MM-DD}
- [Source Name](url) - {reliability: high|medium}
- **{Term}**: {Definition}
1. Prerequisites: {list}
2. Fundamentals: {list}
3. Core Skills: {list}
4. Advanced: {list}
```

### Directory Resolution Schema

Used by `researcher` agent for path resolution:

```xml
{original_chapter}
{original_section}
{original_title} {normalized_chapter}
{normalized_section}
{normalized_slug} {canonical_path}
{matched_path OR canonical_path} {true|false} {1|2|3|4|new}
```

### Subtopic Coverage Schema

Used within research files to track coverage:

```xml
{url or "pending"}
```

### Status Values Reference

| Status | Context | Meaning |
|--------|---------|---------|
| OK | research_result | All subtopics covered, sufficient sources |
| PARTIAL | research_result | Some subtopics missing or incomplete |
| ERROR | research_result | Critical failure (e.g., no sources found) |
| FOUND | directory_resolution | Existing research directory located |
| NEW | directory_resolution | No existing research, use canonical path |

---

## Error Handling

| Scenario                       | Action                              |
| ------------------------------ | ----------------------------------- |
| `.research/` directory missing | Auto-create on first write          |
| Research file not found        | Conduct fresh research              |
| Read failure                   | Log warning, conduct fresh research |
| Write failure                  | Report error, do not update task.md |

## .gitignore Recommendation

Research files are regenerable and should typically be ignored:

```gitignore
# Research cache (regenerable)
.research/
```