---
name: github-tech-scanner
description: >
  Scans all active GitHub repositories using a Personal Access Token (PAT) and
  produces a full inventory of every programming language and framework in use,
  with repo counts, byte-share percentages, and version info where available.
  Use this skill whenever the user wants to audit their GitHub tech stack, see
  what languages or frameworks they use across repos, understand technology
  spread across an org, generate a tech inventory, or answer questions like
  "what frameworks do I use on GitHub?", "what languages appear in my repos?",
  "scan my GitHub for technologies", or "give me a breakdown of my GitHub stack".
  Trigger even when the user just mentions a GitHub PAT alongside any technology
  or repo question.
---

# GitHub Tech Scanner

This skill takes a GitHub Personal Access Token (PAT) and produces a clean report showing every **language** and **framework** in use across all active repositories the token can access.

---

## What you'll need from the user

Ask for these if not already provided:

1. **GitHub PAT**: a classic token needs the `repo` scope to read private repos; a token with no scopes at all is enough for public-only scanning. If they don't have one, direct them to: GitHub → Settings → Developer settings → Personal access tokens.
2. **Active window** *(optional, default 365 days)*: how far back to look when deciding whether a repo is "active". Repos that haven't been pushed to within this window, or that are archived or disabled, are skipped.
3. **Show repo breakdown** *(optional)*: whether to list which repos use each language/framework (off by default to keep output concise).

---

## How to run it

The skill bundles a Python script that does all the heavy lifting via the GitHub REST API. Run it from the shell:

```bash
# Install the one dependency, then run the scan.
pip install requests --break-system-packages --quiet

# Replace <YOUR_PAT> with the token the user provided.
python /path/to/github-tech-scanner/scripts/scan_repos.py \
  --token "<YOUR_PAT>" \
  --active-days 365 \
  --verbose
```

Use the skill's own directory path to find the script.
The script is at `scripts/scan_repos.py` relative to this SKILL.md.

**Useful flags:**

- `--active-days N`: change the activity cutoff (e.g. 180 for 6 months)
- `--org NAME`: limit the scan to a specific GitHub organization (e.g. `--org my-company`)
- `--show-repos`: include per-language/framework repo lists
- `--json`: emit raw JSON (useful for further processing)
- `--verbose` / `-v`: show progress on stderr while scanning

---

## What the script detects

### Languages

The script uses GitHub's native `/repos/{owner}/{repo}/languages` endpoint. It is accurate and fast, and it returns byte counts per language, so you get both a raw byte count and a percentage share.

### Frameworks (via manifest files in the repo root)

The script looks for these files in each repo's root directory and parses them:

| File | Frameworks and libraries detected (examples) |
|------|----------------------------------------------|
| `package.json` | React, Vue, Angular, Next.js, Express, NestJS, Vite, … |
| `requirements.txt` / `pyproject.toml` | Django, Flask, FastAPI, PyTorch, … |
| `Gemfile` | Rails, Sinatra, RSpec, … |
| `go.mod` | Gin, Echo, Fiber, GORM, … |
| `pom.xml` / `build.gradle` | Spring, Quarkus, Hibernate, … |
| `composer.json` | Laravel, Symfony, … |
| `Cargo.toml` | Actix, Axum, Tokio, Diesel, Tauri, … |
| `pubspec.yaml` | Flutter, Riverpod, Firebase, … |

Because only the repo root is checked, manifests that live in subdirectories (for example in a monorepo) are not picked up.

---

## Output format

The default output is a human-readable report printed to stdout:

```
============================================================
  GitHub Tech Stack Report — @username
============================================================
Repos scanned: 42 active (of 87 total, active = pushed within 365 days)

── Languages ────────────────────────────────────────────
  TypeScript   41.2%  ████████  (18 repos)
  Python       28.5%  █████     (12 repos)
  JavaScript   15.3%  ███       (9 repos)
  ...

── Frameworks & Libraries ───────────────────────────────
  React    (11 repos)  [18.1.0, 18.2.0]
  Django   (6 repos)   [4.2.0]
  Express  (5 repos)
  ...
============================================================
```

Present this output clearly in your response. If `--show-repos` was used, the per-repo lists will appear indented under each entry.

---

## Presenting the results

After running the script, present a clean summary to the user:

1. **Top languages**: highlight the top 3–5 by percentage
2. **Framework highlights**: call out the most widely used frameworks
3. **Observations**: note interesting patterns (e.g. mixed frontend stacks, a heavy ML footprint, a polyglot backend)
4. Offer to re-run with `--show-repos` if they want to know *which* repos use each technology
5. Offer to re-run with a different `--active-days` if the active window seems off

---

## Handling common issues

**Rate limiting**: The GitHub API allows 5,000 requests/hour for authenticated requests. Each repo needs several API calls, so large accounts (100+ active repos) may approach this limit. If you get a `403` or `429` with a rate-limit message, wait and retry, or suggest the user narrow the scope with `--active-days 90`.

**No PAT / wrong scope**: If you get a `401`, the token is invalid or expired. If private repos show 0 languages, the token might lack the `repo` scope.

**Org-owned repos**: The script fetches repos via `/user/repos` with `affiliation=owner,collaborator,organization_member`, so it picks up repos the user contributes to across orgs, not just those in their personal account.

**Empty results**: If a repo has no languages, it may be empty, contain only binaries, or have no code files on its default branch.

---

## Security note

The PAT is only used during the scan and is never stored or logged. Remind the user not to paste their token into shared documents or public chats. Fine-grained tokens with read-only `Contents` and `Metadata` repository permissions are sufficient and are recommended over classic tokens with the full `repo` scope.
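
## Sketch: how the numbers in the report can be derived

To make the report easier to interpret, here is a minimal sketch of three steps described above: the active-window filter, the byte-share aggregation from `/repos/{owner}/{repo}/languages` responses, and framework detection from a `package.json`. It is an illustration under stated assumptions, not the bundled script: the function names (`is_active`, `language_shares`, `detect_js_frameworks`) and the `JS_FRAMEWORKS` mapping are hypothetical, and `scripts/scan_repos.py` may compute things differently. The repo fields `pushed_at`, `archived`, and `disabled` are the ones GitHub returns on repository objects.

```python
from __future__ import annotations

import json
from datetime import datetime, timedelta, timezone

# Hypothetical, partial mapping from npm package names to display names.
JS_FRAMEWORKS = {
    "react": "React",
    "vue": "Vue",
    "@angular/core": "Angular",
    "next": "Next.js",
    "express": "Express",
    "@nestjs/core": "NestJS",
    "vite": "Vite",
}


def is_active(repo: dict, active_days: int, now: datetime) -> bool:
    """True if the repo is not archived/disabled and was pushed to within the window."""
    if repo.get("archived") or repo.get("disabled"):
        return False
    pushed_at = repo.get("pushed_at")
    if not pushed_at:
        return False
    # GitHub timestamps are UTC and look like "2024-05-01T12:00:00Z".
    pushed = datetime.strptime(pushed_at, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return now - pushed <= timedelta(days=active_days)


def language_shares(per_repo_bytes: dict[str, dict[str, int]]) -> list[tuple[str, float, int]]:
    """Aggregate per-repo byte counts into (language, percent, repo_count), largest share first."""
    totals: dict[str, int] = {}
    repo_counts: dict[str, int] = {}
    for languages in per_repo_bytes.values():
        for lang, n_bytes in languages.items():
            totals[lang] = totals.get(lang, 0) + n_bytes
            repo_counts[lang] = repo_counts.get(lang, 0) + 1
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    rows = [(lang, 100.0 * n / grand_total, repo_counts[lang]) for lang, n in totals.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def detect_js_frameworks(package_json_text: str) -> dict[str, str]:
    """Map framework display name to its declared version spec in a package.json string."""
    manifest = json.loads(package_json_text)
    deps = {**manifest.get("dependencies", {}), **manifest.get("devDependencies", {})}
    # Strip the leading range operators "^" and "~" so "^18.2.0" is reported as "18.2.0".
    return {
        JS_FRAMEWORKS[name]: spec.lstrip("^~")
        for name, spec in deps.items()
        if name in JS_FRAMEWORKS
    }


if __name__ == "__main__":
    # Made-up sample data, shaped like GitHub API responses.
    now = datetime(2024, 6, 1, tzinfo=timezone.utc)
    repos = [
        {"full_name": "octo/web", "pushed_at": "2024-05-01T12:00:00Z", "archived": False},
        {"full_name": "octo/old", "pushed_at": "2021-01-01T00:00:00Z", "archived": False},
    ]
    active = [r["full_name"] for r in repos if is_active(r, 365, now)]
    print("active:", active)

    sample_languages = {"octo/web": {"TypeScript": 3000, "CSS": 1000}}
    for lang, pct, count in language_shares(sample_languages):
        print(f"{lang:<12} {pct:5.1f}%  ({count} repos)")

    print(detect_js_frameworks('{"dependencies": {"react": "^18.2.0"}}'))
```

One consequence of aggregating by bytes is that a single large repo can dominate the percentages even when a language appears in only a few repos, which is why the report shows the repo count next to each share.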