---
name: researching-github
description: Use when researching GitHub repositories, code patterns, issues, discussions, and implementations - provides systematic 6-phase workflow with gh CLI (primary) and web fallback, quality indicators for repository assessment, and structured synthesis with citations
allowed-tools: Bash, WebSearch, WebFetch, Read, Write, TodoWrite, AskUserQuestion
---

# Researching GitHub

**Systematic GitHub research methodology for discovering repositories, code patterns, issues, discussions, and implementations with dual-tool strategy (gh CLI + web fallback).**

## When to Use

Use this skill when:

- Searching for open-source implementations of techniques
- Finding example repositories for a technology/pattern
- Discovering GitHub issues/discussions about specific problems
- Locating code snippets across public repositories
- Researching project adoption/community health
- Comparing implementations across multiple repositories

**DO NOT use when:**

- npm/library docs needed → use `researching-context7`
- Academic papers needed → use `researching-arxiv`
- Local codebase patterns → use `researching-codebase`
- General web/blog content → use `researching-web`

**You MUST use TodoWrite** before starting to track all workflow phases.

## Quick Reference

| Phase                        | Purpose                       | Primary Tool | Fallback  | Time    |
| ---------------------------- | ----------------------------- | ------------ | --------- | ------- |
| 1. Environment & Query       | Detect gh CLI, formulate      | Bash         | -         | 2 min   |
| 2. Search Execution          | Execute searches              | gh CLI       | WebSearch | 3-5 min |
| 3. Quality Assessment        | Filter by indicators          | -            | -         | 2 min   |
| 4. Deep Inspection           | Fetch details for top results | gh CLI       | WebFetch  | 5 min   |
| 5. Code Extraction (if code) | Extract patterns              | gh CLI       | WebFetch  | 3 min   |
| 6. Synthesis                 | Structured findings           | Write        | -         | 3 min   |

**Total time**: 15-20 minutes

---

## Tool Strategy (CRITICAL)

### Primary: gh CLI

**Advantages:**

- Authenticated access (no rate limits with `gh auth login`)
- JSON output for structured parsing
- Advanced filtering (language, stars, license, date)
- Direct repository content access

**Availability check:**

```bash
gh auth status
```

### Fallback: WebSearch + WebFetch

**When to use:**

- `gh auth status` fails (not authenticated)
- User prefers web-based research
- gh CLI timeout/errors

**Pattern:**

```
WebSearch('site:github.com "query" language:go stars:>100')
WebFetch('https://github.com/search?q={URL_ENCODED_QUERY}&type=repositories', 'Extract top 10 repository names, owners, descriptions, star counts, last updated dates, and URLs')
WebFetch('https://github.com/{owner}/{repo}', 'Extract README, license, star count, open issues, and technology stack')
```

---

## Workflow Phases

### Phase 1: Environment Detection & Query Formulation

#### 1.1 Check gh CLI Availability

```bash
gh auth status 2>&1
```

**Parse output:**

- Success → use gh CLI (primary)
- `not logged in` or `gh: command not found` → use web fallback

**Record tool choice** - affects all subsequent phases.

#### 1.2 Formulate Search Queries

**Pattern:** Broad → Specific → Niche

**Query 1 (Broad):** Core technology or domain

- Example: `rate limiting`
- Purpose: Survey landscape

**Query 2 (Specific):** Technology + language or framework

- Example: `rate limiting golang`
- Purpose: Target implementation stack

**Query 3 (Niche):** Problem-specific with context

- Example: `rate limiting redis golang middleware`
- Purpose: Exact use case

**Advanced filters (gh CLI):**

- `language:go`, `language:python`, `language:rust`
- `stars:>1000`, `stars:100..1000`
- `license:mit`, `license:apache-2.0`
- `pushed:>2024-01-01` (recent activity)

**For complete query formulation patterns, see:** [references/query-formulation.md](references/query-formulation.md)

---

### Phase 2: Search Execution

#### 2.1 Repository Search

**gh CLI:**

```bash
gh search repos "rate limiting golang" --limit 20 --json name,owner,description,stargazersCount,updatedAt,url,licenseInfo
```

**Web fallback:**

```
WebSearch('site:github.com "rate limiting golang" stars:>100')
WebFetch('https://github.com/search?q=rate+limiting+golang&type=repositories', 'Extract the top 10 repository names, owners, descriptions, star counts, and URLs')
```

#### 2.2 Code Search (requires gh auth)

**gh CLI:**

```bash
gh search code "func RateLimit" --language=go --limit 20 --json repository,path,textMatches
```

**Web fallback:**

```
WebFetch('https://github.com/search?q=func+RateLimit+language:go&type=code', 'Extract code snippets showing function signatures and usage patterns')
```

#### 2.3 Issues/Discussions Search

**gh CLI:**

```bash
gh search issues "rate limiting bug" --state=open --limit 20 --json title,repository,state,url,createdAt,comments
```

**Web fallback:**

```
WebSearch('site:github.com "rate limiting bug" type:issue state:open')
```

**For complete search command reference, see:** [references/gh-cli-commands.md](references/gh-cli-commands.md)

---

### Phase 3: Result Filtering & Quality Assessment

Apply quality indicators to filter results:

| Indicator       | Criteria                                                     | Action       |
| --------------- | ------------------------------------------------------------ | ------------ |
| ✅ High Quality | >1000 stars, active in last 6 months, has license, good docs | Prioritize   |
| ⚠️ Medium       | 100-1000 stars, updated in last year                         | Consider     |
| 🔍 Evaluate     | <100 stars but recent/relevant                               | Inspect more |
| ❌ Skip         | Archived, no updates >2 years, no license for production use | Deprioritize |

**Assessment criteria:**

1. **Stars** - Community validation (>1000 = high, 100-1000 = medium, <100 = evaluate)
2. **Recent activity** - Last commit within 6 months = active, 1 year = maintained, >2 years = stale
3. **License** - OSI-approved license present (MIT, Apache-2.0, BSD)
4. **Documentation** - README with examples, architecture docs, API reference

**Filter output:** Top 3-5 repositories that meet ✅ or ⚠️ criteria.

**For complete quality indicators and assessment criteria, see:** [references/quality-indicators.md](references/quality-indicators.md)

#### 3.1 Prioritization Algorithm

After filtering by quality tiers, rank repositories using weighted scoring:

| Factor       | Weight | Metric                       | Source (gh CLI)   |
| ------------ | ------ | ---------------------------- | ----------------- |
| Popularity   | 40%    | Stars                        | `stargazersCount` |
| Maintenance  | 35%    | Days since last commit       | `updatedAt`       |
| Forks        | 15%    | Fork count                   | `forkCount`       |
| Issue Health | 10%    | Open issues (fewer = better) | `openIssues`      |

**Scoring formula:**

```
Score = (popularity × 0.40) + (maintenance × 0.35) + (forks × 0.15) + (issue_health × 0.10)

Where each component is normalized to 0-1:
- popularity = min(stars / 10000, 1.0)
- maintenance = max(0, 1 - (days_since_commit / 365))
- forks = min(forks / 1000, 1.0)
- issue_health = 1 / (1 + log10(open_issues + 1))
```

**Research basis:** Weights informed by npms.io (maintenance 35%, popularity 35%), GitHub trending (velocity-based), and Snyk Advisor. Stars weighted higher (40%) since GitHub lacks download metrics.

**Sort order:** Within quality tiers, sort by score descending.

**For detailed calculation methodology, edge cases, and customization guidance, see:** [references/prioritization-algorithm.md](references/prioritization-algorithm.md)

---

### Phase 4: Deep Inspection

For each of the top 3-5 filtered results, fetch detailed information.

#### 4.1 Repository Details

**gh CLI:**

```bash
gh repo view {owner}/{repo} --json name,description,readme,licenseInfo,stargazerCount,forkCount,openIssues,defaultBranch
```

**Web fallback:**

```
WebFetch('https://github.com/{owner}/{repo}', 'Extract: full README content, license name, star count, fork count, last commit date, open issues count, programming languages used, and topics/tags')
```

#### 4.2 README Content

**gh CLI:**

```bash
gh api repos/{owner}/{repo}/contents/README.md | jq -r '.content' | base64 --decode
```

**Web fallback:**

```
WebFetch('https://raw.githubusercontent.com/{owner}/{repo}/main/README.md', 'Extract full README markdown content')
```

**Alternative branch fallback:** Try `master` if `main` fails.

#### 4.3 Extract Key Information

For each repository, document:

- **Name & URL**: `{owner}/{repo}` and GitHub URL
- **Description**: One-line summary
- **Stars/Forks**: Community metrics
- **License**: OSI license name
- **Last Updated**: Last commit date
- **Key Features**: From README (installation, usage, examples)
- **Technology Stack**: Languages, frameworks, dependencies
- **Relevance**: Why this matters for current research task

**For complete deep inspection patterns, see:** [references/deep-inspection.md](references/deep-inspection.md)

---

### Phase 5: Code Example Extraction (Conditional)

**Only for code searches** - skip if researching repositories/issues.

#### 5.1 Identify Code Patterns

From code search results, extract:

- **File path**: Relative to repository root
- **Function/class signature**: Use durable patterns (NOT line numbers)
- **Usage context**: How is this pattern used?
- **Repository attribution**: `{owner}/{repo}` URL

#### 5.2 Durable Code References (MANDATORY)

❌ **NEVER use line numbers** - they change with every commit:

```markdown
❌ BAD: github.com/foo/bar/file.go:123-127
❌ BAD: See line 154 in rate_limiter.go
```

✅ **USE function signatures** - stable across refactors:

```markdown
✅ GOOD: github.com/foo/bar/rate_limiter.go - `func (r *RateLimiter) Allow()`
✅ GOOD: github.com/foo/bar/middleware.go (between NewMiddleware() and ServeHTTP() methods)
```

**For complete code reference patterns, see:** [Code Reference Patterns](../../../../skills/managing-skills/references/patterns/code-reference-patterns.md)

---

### Phase 6: Synthesis

#### 6.1 Structured Output Format

Create research findings document:

```markdown
## GitHub Research: {topic}

**Date:** {current-date}
**Tool Used:** gh CLI / Web Fallback
**Purpose:** Research for {task-description}

### Search Summary

- **Queries Executed:**
  1. {query1} - {N} results
  2. {query2} - {N} results
  3. {query3} - {N} results
- **Total Results:** {count}
- **Filtered Results:** {count after quality assessment}

### Top Repositories

**1. {owner}/{repo-name}** ✅ High Quality — Score: 0.87

- **Stars:** {count} | **Forks:** {count} | **License:** {license}
- **Last Updated:** {date}
- **URL:** https://github.com/{owner}/{repo}
- **Description:** {one-line summary}
- **Relevance:** {why this matters for the task}
- **Key Features:**
  - {feature1}
  - {feature2}
  - {feature3}
- **Key Files:** {relevant files to examine}

**2. {owner}/{repo-name}** ✅ High Quality — Score: 0.72

... (repeat pattern, sorted by score within tier)

### Code Patterns Found (if code search)

| Pattern            | Repository  | File            | Signature               |
| ------------------ | ----------- | --------------- | ----------------------- |
| Rate limiter       | owner/repo1 | rate_limiter.go | `func (r *RL) Allow()`  |
| Middleware wrapper | owner/repo2 | middleware.go   | `func NewRateLimiter()` |

### Issues/Discussions (if searched)

| Title              | Repository | Status | Comments | URL   |
| ------------------ | ---------- | ------ | -------- | ----- |
| {issue-title}      | owner/repo | open   | {count}  | {url} |
| {discussion-title} | owner/repo | closed | {count}  | {url} |

### Recommendations

1. **Primary Choice:** {owner}/{repo} - {why it's recommended}
2. **Alternative Approaches:** {list other viable options}
3. **Caveats:** {known limitations, dependencies, compatibility}
4. **Next Steps:** {what to do with findings}

### Implementation Guidance

Based on research findings:

1. **Pattern:** {common pattern across repositories}
2. **Dependencies:** {shared dependencies}
3. **Best Practices:** {validated approaches from high-quality repos}
4. **Anti-Patterns:** {what to avoid based on closed issues}
```

#### 6.2 Save Synthesis

**Output location depends on invocation mode:**

**Mode 1: Standalone (invoked directly)**

```bash
ROOT="$(git rev-parse --show-superproject-working-tree --show-toplevel | head -1)"
TIMESTAMP=$(date +"%Y-%m-%d-%H%M%S")
TOPIC="{semantic-topic-name}"
mkdir -p "$ROOT/.claude/.output/research/${TIMESTAMP}-${TOPIC}-github"
# Write synthesis to: $ROOT/.claude/.output/research/${TIMESTAMP}-${TOPIC}-github/SYNTHESIS.md
```

**Mode 2: Orchestrated (invoked by orchestrating-research)**

When parent skill provides `OUTPUT_DIR`:

- Write synthesis to: `${OUTPUT_DIR}/github.md`
- Do NOT create directory (parent already created it)

**Detection logic:** If parent skill passed an output directory path, use Mode 2. Otherwise use Mode 1.

**For complete output format templates, see:** [references/output-format.md](references/output-format.md)

---

## GitHub Search URL Patterns (Web Fallback)

| Search Type  | URL Pattern                                             |
| ------------ | ------------------------------------------------------- |
| Repositories | `https://github.com/search?q={query}&type=repositories` |
| Code         | `https://github.com/search?q={query}&type=code`         |
| Issues       | `https://github.com/search?q={query}&type=issues`       |
| Discussions  | `https://github.com/search?q={query}&type=discussions`  |
| Commits      | `https://github.com/search?q={query}&type=commits`      |
| Users        | `https://github.com/search?q={query}&type=users`        |

**Advanced filters:**

- `language:go`, `language:python`, `language:rust`
- `stars:>1000`, `stars:100..1000`
- `pushed:>2024-01-01` (commits after date)
- `license:mit`, `license:apache-2.0`
- `archived:false` (exclude archived repos)
- `is:public` (only public repos)

**For complete search filter syntax, see:** [references/search-filters.md](references/search-filters.md)

---

## Common Rationalizations (DO NOT SKIP)

| Rationalization                     | Why It's Wrong                                                |
| ----------------------------------- | ------------------------------------------------------------- |
| "I'll just WebSearch GitHub"        | gh CLI has better filtering, JSON output, no rate limits      |
| "gh isn't set up, skip to guessing" | Web fallback exists - use it systematically                   |
| "First result looks good enough"    | Quality indicators exist for a reason - evaluate properly     |
| "Stars don't matter"                | Stars + recent activity = community validation                |
| "I know the popular repos"          | New repos emerge constantly, search to discover               |
| "No time for quality assessment"    | 2 min assessment prevents hours with unmaintained/archived    |
| "Skip code extraction"              | Function signatures provide implementation guidance           |
| "This is just web research"         | GitHub-specific workflow (gh CLI, quality indicators) differs |

---

## Difference from Sibling Skills

| researching-github    | researching-web           | researching-codebase      |
| --------------------- | ------------------------- | ------------------------- |
| GitHub-specific       | General web               | Local codebases           |
| gh CLI + web fallback | WebSearch + WebFetch only | Grep/Glob/Read            |
| Quality indicators    | Source validation         | Convention analysis       |
| Repo/code/issues      | Blogs/docs/Stack Overflow | Patterns/naming/structure |
| Open-source discovery | Supplemental sources      | Internal implementation   |

---

## Integration with Research Orchestration

This skill is invoked during research orchestration by `orchestrating-research` (LIBRARY):

```
Read(".claude/skill-library/research/orchestrating-research/SKILL.md")
```

Research orchestration delegates to:

1. **Codebase research** → researching-codebase
2. **Context7 research** → researching-context7
3. **GitHub research** → researching-github (THIS SKILL)
4. **arxiv research** → researching-arxiv
5. **Web research** → researching-web

**Routing logic:**

- "find GitHub examples" → researching-github
- "open-source implementations" → researching-github
- "repository with X feature" → researching-github
- "issues about X problem" → researching-github

---

## Key Principles

1. **Dual-Tool Strategy** - gh CLI primary, web fallback always available
2. **Quality Assessment** - Filter by stars, activity, license, docs
3. **Durable References** - Function signatures, not line numbers
4. **Structured Synthesis** - Consistent output format with citations
5. **Progressive Phases** - 6 phases from detection to synthesis
6. **TodoWrite Tracking** - Track all 6 phases for visibility
7. **Evidence-Based** - Every finding must have GitHub URL

---

## Related Skills

| Skill                              | Access Method                                                                              | Purpose                                |
| ---------------------------------- | ------------------------------------------------------------------------------------------ | -------------------------------------- |
| `orchestrating-research` (LIBRARY) | `Read(".claude/skill-library/research/orchestrating-research/SKILL.md")`                   | Orchestrator for all research types    |
| **researching-codebase**           | `Read(".claude/skill-library/research/researching-codebase/SKILL.md")` (LIBRARY)           | Local codebase pattern discovery       |
| **researching-context7**           | `Read(".claude/skill-library/research/researching-context7/SKILL.md")` (LIBRARY)           | npm/library documentation via Context7 |
| **researching-arxiv**              | `Read(".claude/skill-library/research/researching-arxiv/SKILL.md")` (LIBRARY)              | Academic papers and research           |
| **researching-web**                | `Read(".claude/skill-library/research/researching-web/SKILL.md")` (LIBRARY)                | General web research (fallback)        |
| **creating-skills**                | `Read(".claude/skill-library/claude/skill-management/creating-skills/SKILL.md")` (LIBRARY) | Skill creation workflow                |
| **updating-skills**                | `Read(".claude/skill-library/claude/skill-management/updating-skills/SKILL.md")` (LIBRARY) | Skill update workflow                  |

---

## Changelog

See `.history/CHANGELOG` for historical changes.

Initial creation: 2025-12-30