---
name: quality-audit
description: >-
  Meta-skill for auditing and validating skill quality. Use when reviewing
  skills for consistency, completeness, accuracy, and adherence to standards.
  Provides structured rubrics, scoring frameworks, and actionable
  recommendations.
author: cortex team
version: 1.0.0
license: MIT
tags: [meta, quality, validation, review, standards]
created: 2026-01-05
updated: 2026-01-05
triggers:
  - audit skill
  - review skill quality
  - validate skill
  - skill quality check
  - rubric assessment
dependencies:
  skills: []
  tools: []
token_estimate: ~2000
---

# Quality Audit Skill

Systematic framework for evaluating skill quality across four dimensions: **Clarity**, **Completeness**, **Accuracy**, and **Usefulness**.

## When to Use This Skill

- Reviewing a new skill before adding it to the registry
- Auditing existing skills for quality improvements
- Creating quality rubrics for skill validation
- Standardizing skill quality across the library
- Preparing skills for production use

## Core Principles

### The Four Quality Dimensions

| Dimension | Weight | Focus |
|-----------|--------|-------|
| **Clarity** | 25% | Structure, readability, progressive disclosure |
| **Completeness** | 25% | Coverage, examples, edge cases, anti-patterns |
| **Accuracy** | 30% | Correctness, best practices, security |
| **Usefulness** | 20% | Real-world applicability, production-readiness |

### Scoring Scale (1-5)

| Score | Label | Meaning |
|-------|-------|---------|
| 1 | Unacceptable | Fundamentally broken, dangerous, or unusable |
| 2 | Needs Work | Major issues requiring significant revision |
| 3 | Acceptable | Meets minimum standards, functional |
| 4 | Good | High quality, minor improvements possible |
| 5 | Excellent | Exemplary, production-ready, best-in-class |

### Passing Criteria

- **Minimum**: 3.0 weighted average (acceptable)
- **Target**: 4.0 weighted average (good)
- **Exceptional**: 4.5+ weighted average (excellent)
- **Blocking**: Accuracy must be ≥3.0 (no dangerous advice)

## Audit Workflow

### Phase 1: Structure Check

```yaml
checklist:
  structure:
    - [ ] Has valid YAML frontmatter
    - [ ] Contains required metadata (name, description)
    - [ ] Follows progressive disclosure (Tier 1 → 2 → 3)
    - [ ] Sections are logically ordered
    - [ ] Token estimate is reasonable (<5000 for core)
```

### Phase 2: Content Evaluation

```yaml
checklist:
  content:
    - [ ] "When to Use" section is clear
    - [ ] Core principles are well-defined
    - [ ] Code examples are complete and runnable
    - [ ] Anti-patterns are documented
    - [ ] Troubleshooting guidance exists
```

### Phase 3: Dimension Scoring

For each dimension, evaluate against specific criteria:

**Clarity Criteria:**
- Well-organized sections with logical flow
- Concise explanations without jargon overload
- Code examples are readable and well-commented
- Progressive disclosure from simple to complex

**Completeness Criteria:**
- Covers core concepts thoroughly
- Includes edge cases and error handling
- Provides both do's and don'ts
- Has working examples for main use cases

**Accuracy Criteria:**
- Code examples compile/run without errors
- Follows current best practices (not deprecated)
- Security considerations are correct
- Performance claims are verifiable

**Usefulness Criteria:**
- Examples solve real-world problems
- Can be applied immediately
- Scales to production use cases
- Includes troubleshooting guidance

### Phase 4: Report Generation

```markdown
## Audit Report: {skill_name}

**Date**: {date}
**Auditor**: {auditor}
**Status**: {PASS|FAIL|NEEDS_REVIEW}

### Scores

| Dimension | Score | Weight | Weighted |
|-----------|-------|--------|----------|
| Clarity | {x}/5 | 25% | {x*0.25} |
| Completeness | {x}/5 | 25% | {x*0.25} |
| Accuracy | {x}/5 | 30% | {x*0.30} |
| Usefulness | {x}/5 | 20% | {x*0.20} |
| **Total** | | | **{sum}/5** |

### Issues Found

- [CRITICAL] {issue description}
- [MAJOR] {issue description}
- [MINOR] {issue description}

### Recommendations

1. {actionable recommendation}
2. {actionable recommendation}
```

## Implementation Patterns

### Pattern 1: Quick Audit (5-minute review)

Use for rapid assessment of skill quality:

```bash
# Run automated structure checks
cortex skills audit --quick

# Output: Pass/Fail with basic metrics
```

**Quick Audit Checks:**
1. YAML frontmatter valid?
2. Required sections present?
3. Code blocks have language tags?
4. No TODO/FIXME markers?
5. Token count reasonable?

### Pattern 2: Full Audit (15-30 minute review)

Comprehensive evaluation with human review:

```bash
# Generate full audit report
cortex skills audit --full

# Interactive mode for scoring
cortex skills audit --interactive
```

**Full Audit Process:**
1. Run automated checks
2. Read through content manually
3. Test code examples
4. Score each dimension
5. Document issues and recommendations
6. Generate report

### Pattern 3: Comparative Audit

Compare a skill against a reference implementation:

```bash
# Compare against template-skill-enhanced
cortex skills audit --compare template-skill-enhanced
```

### Pattern 4: Batch Audit

Audit multiple skills for registry health:

```bash
# Audit all skills in a category
cortex skills audit --category security

# Audit skills below threshold
cortex skills audit --below-score 3.5
```

## CLI Commands

```bash
# Basic audit
cortex skills audit

# Options
--quick              Quick structural check only
--full               Full audit with all dimensions
--interactive        Interactive scoring mode
--output FILE        Write report to file
--format FORMAT      Output format (markdown|json|yaml)
--compare SKILL      Compare against reference skill
--fix                Auto-fix simple issues (formatting)
```

## Creating Custom Rubrics

Skills can define custom rubrics in `validation/rubric.yaml`:

```yaml
# validation/rubric.yaml
version: "1.0.0"
skill_name: my-skill

dimensions:
  clarity:
    weight: 25
    criteria:
      - "API examples use realistic data"
      - "Error handling is shown for each operation"
  completeness:
    weight: 25
    criteria:
      - "Covers all HTTP methods"
      - "Includes pagination patterns"
  accuracy:
    weight: 30
    criteria:
      - "Follows REST conventions"
      - "Security headers documented"
  usefulness:
    weight: 20
    criteria:
      - "Examples work with common frameworks"

passing_criteria:
  minimum_score: 3.5  # Higher bar for this skill
  required_dimensions:
    - accuracy
    - completeness
```

## Best Practices

### Do

- **Be specific** - "Line 45: SQL query vulnerable to injection" not "has security issues"
- **Be actionable** - Include how to fix each issue
- **Be fair** - Use the same standards consistently
- **Document evidence** - Quote specific content for each score
- **Prioritize** - Critical issues first, suggestions last

### Don't

- Score based on personal style preferences
- Mark deprecated patterns without suggesting alternatives
- Fail skills for missing optional sections
- Ignore security issues regardless of other scores
- Rush through audits for complex skills

## Anti-Patterns

### The Rubber Stamp

**Problem**: Approving skills without thorough review

**Why it's bad**: Low-quality skills erode trust in the library

**Fix**: Use the full audit checklist, test code examples

### The Perfectionist Block

**Problem**: Failing skills for minor issues

**Why it's bad**: Prevents useful skills from being available

**Fix**: Distinguish between blocking issues and suggestions

### Score Inflation

**Problem**: Giving high scores without justification

**Why it's bad**: Makes scores meaningless

**Fix**: Document specific evidence for each score

## Integration with CI/CD

```yaml
# .github/workflows/skill-quality.yml
name: Skill Quality Gate

on:
  pull_request:
    paths:
      - 'skills/**'

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 2  # default depth of 1 has no HEAD~1 to diff against
      - name: Install cortex
        run: pip install cortex
      - name: Audit changed skills
        run: |
          for skill in $(git diff --name-only HEAD~1 | grep 'skills/' | cut -d'/' -f2 | uniq); do
            cortex skills audit "$skill" --quick --fail-under 3.0
          done
```

## Troubleshooting

### "Audit fails but skill looks fine"

1. Check YAML frontmatter syntax
2. Verify all required sections exist
3. Ensure code blocks have language tags
4. Check for hidden characters (copy/paste issues)

### "Scores seem inconsistent"

1. Review the scoring guide for each dimension
2. Calibrate by auditing template-skill-enhanced first
3. Use `--interactive` mode for clearer criteria

## External Resources

- [Skill Template Reference](../template-skill-enhanced/SKILL.md)
- [Rubric Schema](../rubric.schema.yaml)
- [Skill Creator Guide](../skill-creator/SKILL.md)

## Changelog

### 1.0.0 (2026-01-05)

- Initial release
- Four-dimension scoring framework
- CLI integration
- CI/CD workflow example
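
## Worked Example: Weighted Score

The weighted average and the blocking rule from the Passing Criteria can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming the default weights from the Four Quality Dimensions table. The names `WEIGHTS`, `weighted_score`, and `passes` are hypothetical and are not part of the `cortex` CLI.

```python
from typing import Mapping

# Default weights from the "Four Quality Dimensions" table (assumed, sum to 1.0).
WEIGHTS = {
    "clarity": 0.25,
    "completeness": 0.25,
    "accuracy": 0.30,
    "usefulness": 0.20,
}


def weighted_score(scores: Mapping[str, float]) -> float:
    """Return the weighted average of per-dimension scores (each 1-5)."""
    for name in WEIGHTS:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"{name} score must be between 1 and 5")
    total = sum(scores[name] * weight for name, weight in WEIGHTS.items())
    return round(total, 2)  # rounding hides floating-point noise


def passes(scores: Mapping[str, float], minimum: float = 3.0) -> bool:
    """Apply the passing criteria: accuracy >= 3.0 and weighted average >= minimum."""
    return scores["accuracy"] >= 3.0 and weighted_score(scores) >= minimum


if __name__ == "__main__":
    example = {"clarity": 4, "completeness": 3, "accuracy": 4, "usefulness": 3}
    print(weighted_score(example), passes(example))
```

Note that a skill scoring 5 on everything except Accuracy (2) has a weighted average above 4.0 but still fails, because the Accuracy floor is checked separately from the average.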