--- name: agent-skills-authoring description: >- Guides agents through creating, validating, and optimizing Agent Skills (SKILL.md files) from user requests. Covers all YAML frontmatter fields, directory layout, progressive disclosure, scripts, quality patterns, and evaluation workflows. Use when a user asks you to create, edit, audit, or improve an Agent Skill, even if they don't use the word "skill" — any request about reusable agent instructions, workflow packaging, or capability extension should trigger this skill. license: MIT compatibility: >- Any AI agent generating SKILL.md files. Compatible with the Agent Skills open standard (agentskills.io), Claude Code, Claude.ai, and the Claude API. metadata: author: "zebbern" version: "1.4" sources: "agentskills.io/specification, docs.anthropic.com, github.com/anthropics/skills, github.com/zebbern/agent-skills-authoring" audience: "ai-agents" --- # Agent Skills Authoring Guide This is your authoritative reference for producing high-quality SKILL.md files. When a user asks you to create, edit, or improve a skill, follow these instructions to produce the best possible result — one that any consuming agent can discover, activate, and apply effectively. --- ## 1. Core Principles These principles come from Anthropic's official best practices. Apply them to every skill. **Claude is already very smart.** Only add context Claude doesn't already have. Challenge each piece of content: "Does Claude really need this explanation?" "Can I assume Claude knows this?" "Does this paragraph justify its token cost?" Concise skills perform better because the context window is shared with conversation history, system prompts, and other skills' metadata. **Explain WHY, not just WHAT.** Anthropic's guidance: "Explain to the model why things are important in lieu of heavy-handed musty MUSTs." When agents understand reasoning, they generalize correctly across novel situations. Rigid imperative-only rules cause brittle, over-literal behavior. **Match specificity to fragility.** Use high freedom (general text guidance) when multiple approaches are valid. Use medium freedom (pseudocode/parameterized scripts) when a preferred pattern exists. Use low freedom (exact scripts, no modifications) when operations are fragile and consistency is critical. Think of it as: narrow bridge (one safe path, give exact steps) vs. open field (many valid routes, give direction and trust the agent). **Use consistent terminology.** Pick one term and use it throughout. Don't mix "API endpoint" / "URL" / "path" or "extract" / "pull" / "get" / "retrieve" in the same skill. --- ## 2. Directory Structure ``` skill-name/ ├── SKILL.md # REQUIRED — frontmatter + instructions ├── scripts/ # Optional — executable code (runs without loading into context) │ └── validate.py ├── references/ # Optional — docs loaded on demand │ ├── REFERENCE.md │ └── domain.md └── assets/ # Optional — templates, data files, schemas └── template.json ``` - Directory name must match the `name` field in frontmatter exactly - `scripts/` — self-contained executables. Handle errors explicitly ("solve, don't punt"). Scripts execute via bash and only their output enters context, not their code - `references/` — additional docs loaded on demand. Keep files focused. Add a table of contents at the top of any reference file over 100 lines - `assets/` — static resources: templates, configuration files, schemas, lookup tables - Use forward slashes in all paths, even on Windows: `scripts/validate.py` not `scripts\validate.py` **Domain organization:** For skills supporting multiple variants, split into references: ``` cloud-deploy/ ├── SKILL.md # Overview + variant selection └── references/ ├── aws.md ├── gcp.md └── azure.md ``` The agent loads only the relevant reference, saving context. --- ## 3. YAML Frontmatter — Complete Field Reference ### Base Spec Fields (agentskills.io) The open standard defines exactly **six fields**: #### `name` (REQUIRED) Unique identifier. 1–64 chars. Lowercase alphanumeric + hyphens only. No leading/trailing/consecutive hyphens. No XML tags. Cannot contain reserved words "anthropic" or "claude". Must match the parent directory name. **Prefer gerund naming** (verb + -ing): `processing-pdfs`, `analyzing-spreadsheets`, `testing-code`. Also acceptable: noun phrases (`pdf-processing`) or action-oriented (`process-pdfs`). Avoid vague names (`helper`, `utils`, `tools`). ```yaml name: processing-pdfs ``` #### `description` (REQUIRED) Natural-language summary. 1–1,024 chars. Plain text only. No XML tags. This is the primary triggering mechanism — agents use it to decide which skills to load. **Always write in third person.** "Processes Excel files..." not "I can help you..." or "You can use this..." (inconsistent POV causes discovery problems). **Make descriptions slightly "pushy."** Claude tends to undertrigger skills. Include both what the skill does AND specific trigger contexts. List keywords users would say. Add "even if they don't explicitly ask for X" where appropriate. ```yaml description: >- Extracts text and tables from PDF files, fills forms, and merges documents. Use when working with PDF files or when the user mentions PDFs, forms, or document extraction, even if they don't explicitly ask for "PDF processing." ``` #### `license` (OPTIONAL) SPDX identifier or bundled file reference. Omit if unknown. #### `compatibility` (OPTIONAL) Environment requirements. 1–500 chars. Plain text. Only include if the skill has genuine requirements — most skills don't need this field. #### `metadata` (OPTIONAL) String key-value map. Use reasonably unique key names to avoid collisions. ```yaml metadata: author: "your-org" version: "1.0" category: "document-processing" ``` #### `allowed-tools` (OPTIONAL, EXPERIMENTAL) Space-delimited list. Use scoped format where the platform supports it. ```yaml allowed-tools: Bash(git:*) Bash(jq:*) Read Write ``` ### Claude Code Extended Fields Claude Code extends the open standard with additional fields. These are platform-specific and not part of the base agentskills.io spec: | Field | Purpose | | -------------------------- | ------------------------------------------------------------ | | `argument-hint` | Autocomplete hint: `[issue-number]` or `[filename] [format]` | | `disable-model-invocation` | `true` to prevent auto-loading (manual `/name` only) | | `user-invocable` | `false` to hide from / menu (background knowledge skills) | | `model` | Model override when this skill is active | | `context` | `fork` to run in a forked subagent context | | `agent` | Subagent type to use when `context: fork` is set | | `hooks` | Hooks scoped to this skill's lifecycle | In Claude Code, `name` falls back to directory name if omitted, and `description` falls back to the first paragraph of the markdown body. --- ## 4. Body Content Everything after the closing `---` is free-form Markdown. Keep the body under **500 lines**. If approaching this limit, move detail into `references/` and add clear pointers. ### Recommended Structure ```markdown # [Skill Name] [1-3 sentence overview] ## When to use this skill [Trigger conditions — what user requests or contexts activate this] ## How to [primary action] [Step-by-step instructions] ## Examples [Input/output pairs] ## Common edge cases [Boundary conditions and handling] ## Guidelines [Key principles with reasoning] ``` ### Progressive Disclosure | Level | Content | Budget | When Loaded | | ----- | -------------------------------- | ------------------------- | -------------------------- | | 1 | Frontmatter (name + description) | ~100 tokens | Always — at agent startup | | 2 | SKILL.md body | < 500 lines / < 5K tokens | When skill activates | | 3 | scripts/, references/, assets/ | Unlimited | On demand during execution | Level 3 scripts execute without loading into context — only their output enters the context window. Reference files are read on demand. This means you can bundle comprehensive resources with no context penalty until they're accessed. ### File References Use standard Markdown links with relative paths: ```markdown See [the reference guide](references/REFERENCE.md) for detailed API documentation. Run the extraction script: `python scripts/extract.py` For TypeScript specifics, see [TypeScript patterns](references/typescript.md). ``` Keep references **one level deep** from SKILL.md. Avoid nested reference chains — Claude may partially read files referenced from other referenced files. ### Script Guidance When referencing scripts, be explicit about intent: - **Execute** (most common): "Run `scripts/analyze.py` to extract fields" - **Read as reference** (for complex logic): "See `scripts/analyze.py` for the algorithm" Scripts should handle errors explicitly. Don't let scripts fail and expect Claude to figure it out — provide fallbacks, helpful error messages, and document dependencies. For MCP tools, use fully-qualified references: `ServerName:tool_name` (e.g., `BigQuery:bigquery_schema`, `GitHub:create_issue`). --- ## 5. Creating a Skill from a User Request ### Step 1: Capture Intent Understand what the user needs before writing anything: - What should this skill enable an agent to do? - When should it trigger? (phrases, tasks, contexts) - What's the expected output format? - Are there edge cases or constraints? - Does this need testing? (Verifiable outputs benefit from evals; subjective outputs like writing style often don't) If the conversation already contains a workflow ("turn this into a skill"), extract answers from context first — tools used, step sequence, corrections the user made. ### Step 2: Choose the Name Prefer gerund form: `{verb-ing}-{noun}` → `processing-pdfs`, `testing-code`. Validate: lowercase, hyphens only, 1-64 chars, no leading/trailing/consecutive hyphens, no XML tags, no "anthropic" or "claude" in the name. ### Step 3: Write the Description Third person always. Start with action verb. Include keyword triggers. State "Use when..." Make it slightly pushy to prevent undertriggering. Aim for 200-400 characters. ### Step 4: Set Optional Fields - **license:** User-specified, or `MIT`/`Apache-2.0`, or omit - **compatibility:** Only if genuine environment requirements exist - **metadata:** Add `author`, `version`, `category` at minimum - **allowed-tools:** Use scoped format: `Bash(git:*) Read` - **Claude Code fields:** Set `disable-model-invocation`, `argument-hint`, etc. if relevant ### Step 5: Write the Body Start with a draft. Then review with fresh eyes and improve. - 1-3 sentence overview - "When to use this skill" with trigger conditions - Step-by-step instructions explaining WHY, not just WHAT - Input/output examples (crucial for quality) - Edge cases - For complex workflows: provide a copyable checklist Claude can check off as it progresses Provide a **default approach with an escape hatch** rather than listing many options. "Use pdfplumber for text extraction. For scanned PDFs requiring OCR, use pdf2image instead." ### Step 6: Iterate (Eval-First) Anthropic's recommended pattern uses two Claude instances: 1. **Establish baseline** — run Claude on representative tasks WITHOUT the skill 2. **Write minimal draft** — only enough to address observed gaps 3. **Test with Claude B** — a fresh instance with the skill loaded on 2-3 realistic prompts 4. **Observe Claude B's behavior** — note where it struggles or misses instructions 5. **Return to Claude A** to refine — share what you observed, ask for improvements 6. **Repeat** until satisfied, then expand the test set This is evaluation-driven development: create evals BEFORE writing extensive documentation. Test without the skill first, then with the skill, and measure the improvement. ### Step 7: Validate - Directory name matches `name` field exactly - Description: 1-1,024 chars, third person, action verb, keyword triggers, "Use when..." - No XML tags in name or description - All Markdown links resolve to files that exist - SKILL.md body under 500 lines - Reference files over 100 lines have a table of contents - Run `skills-ref validate ./skill-directory` if available - Run `skills-ref to-prompt ./skill-directory` to preview the XML agents see --- ## 6. Quality Patterns ### Phased Workflows ```markdown ## Phase 1: Analyze [Understand the domain, gather context] ## Phase 2: Implement [Execute the primary task] ## Phase 3: Validate [Check quality, run validation] ``` ### Feedback Loops The run-validate-fix pattern dramatically improves output quality: ```markdown 1. Make changes 2. Run: `python scripts/validate.py` 3. If validation fails: fix and return to step 1 4. Only proceed when validation passes ``` ### Copyable Checklists For complex multi-step tasks: ```markdown Copy this checklist and track progress: - [ ] Step 1: Analyze input - [ ] Step 2: Create plan - [ ] Step 3: Validate plan - [ ] Step 4: Execute - [ ] Step 5: Verify output ``` ### Verifiable Intermediate Outputs For high-stakes operations: plan → validate plan → execute → verify. The plan file is machine-verifiable before changes are applied. ### Decision Tables | Scenario | Approach | Why | | ------------- | ----------------- | ---------------------- | | < 100 items | Inline array | No pagination overhead | | 100-10K items | Cursor pagination | Stable ordering | | > 10K items | Keyset pagination | O(1) page fetch | ### Before/After Examples ```markdown **Before:** `app.get('/getUser', ...)` **After:** `app.get('/users/:id', ...)` ``` --- ## 7. Anti-Patterns | Anti-Pattern | Why It Fails | Fix | | ------------------------------- | ----------------------------------- | ------------------------------------------- | | Vague description | Agents can't match to requests | Keyword triggers + "Use when..." + be pushy | | First/second person description | Discovery problems | Third person: "Processes..." not "I can..." | | Over-explaining basics | Wastes tokens; Claude already knows | Only add what Claude doesn't have | | Too many options | Confusing; agent may pick wrong one | Default approach + escape hatch | | Rigid MUST-only rules | Brittle; agents over-literalize | Explain reasoning so agent generalizes | | Body over 500 lines | Context waste, agent may truncate | Move detail to `references/` | | Code in description | Violates spec, breaks parsers | Code in body only | | Nested file references | Claude partially reads deeper files | One level deep from SKILL.md | | Name with "anthropic"/"claude" | Reserved words; validation fails | Choose different name | | Backslash paths | Break on Unix systems | Forward slashes always | | Time-sensitive info | Becomes wrong; no auto-update | Use "old patterns" section | | No examples | Agent guesses output format | 2-3 input/output pairs minimum | --- ## 8. Security - Never include secrets, API keys, or credentials in any skill file - Never instruct agents to disable security controls - Scripts should handle errors explicitly with fallbacks - Include input validation guidance for skills handling user data - Include sanitization guidance for skills generating browser-rendered output - Flag destructive tools in `allowed-tools` so users can review --- ## 9. Testing 1. **Syntax:** `skills-ref validate ./skill-name` 2. **Prompt preview:** `skills-ref to-prompt ./skill-name` — generates XML agents see 3. **Activation test:** Prompt that SHOULD trigger → confirm it fires 4. **Negative test:** Prompt that should NOT trigger → confirm it stays dormant 5. **Quality test:** Agent executes skill → review output quality 6. **Multi-model test:** Test with Haiku, Sonnet, AND Opus — what works for Opus may need more detail for Haiku For verifiable outputs, create structured evaluations comparing with-skill vs without-skill. --- ## 10. Platform Usage - **Claude Code:** `/plugin marketplace add anthropics/skills` → `/plugin install @anthropic-agent-skills` - Skills at: `~/.claude/skills/` (personal) or `.claude/skills/` (project) - **Claude.ai:** Pre-built skills on paid plans. Custom skills uploaded via settings - **Claude API:** Container parameter with `skills-2025-10-02` beta flag --- ## 11. Resources | Resource | URL | | ------------------------------------ | ------------------------------------------------------------------------------- | | Agent Skills Specification | https://agentskills.io/specification | | Anthropic Skills Repo (79.5K+ stars) | https://github.com/anthropics/skills | | **Anthropic Best Practices** | https://docs.anthropic.com/en/docs/agents-and-tools/agent-skills/best-practices | | Claude Code Skills Docs | https://docs.anthropic.com/en/docs/claude-code/skills | | skills-ref CLI | https://agentskills.io/tools | --- ## 12. Pre-Delivery Checklist ### Frontmatter - [ ] `name`: 1-64 chars, lowercase alphanum + hyphens, gerund preferred, matches directory - [ ] `name`: no XML tags, no "anthropic"/"claude" - [ ] `description`: 1-1,024 chars, third person, action verb, trigger keywords, "Use when..." - [ ] `description`: slightly pushy to prevent undertriggering - [ ] Optional fields set where relevant — no invented fields ### Body - [ ] 1-3 sentence overview - [ ] Trigger section explaining when this skill activates - [ ] Instructions explain WHY, not just WHAT - [ ] 2-3 input/output examples minimum - [ ] Edge cases covered - [ ] Under 500 lines, pure Markdown - [ ] Consistent terminology throughout ### Files - [ ] All Markdown links resolve to real files, one level deep - [ ] Reference files >100 lines have table of contents - [ ] Scripts handle errors explicitly with helpful messages - [ ] No secrets in any file - [ ] Forward slashes in all paths ### Quality - [ ] Description alone determines relevance - [ ] Body alone enables execution without reading references - [ ] Tested with representative prompts (activation + negative + quality) - [ ] A real user would find the output useful, not just spec-compliant