--- name: content-evaluation-framework description: This skill should be used when evaluating the quality of book chapters, lessons, or educational content. It provides a systematic 6-category rubric with weighted scoring (Technical Accuracy 30%, Pedagogical Effectiveness 25%, Writing Quality 20%, Structure & Organization 15%, AI-First Teaching 10%, Constitution Compliance Pass/Fail) and multi-tier assessment (Excellent/Good/Needs Work/Insufficient). Use this during iterative drafting, after content completion, on-demand review requests, or before validation phases. --- # Content Evaluation Framework This skill provides a comprehensive, systematic rubric for evaluating educational book chapters and lessons with quantifiable quality standards. --- ## 6-Point Spec Blueprint Compliance ### 1. Identity (Persona) **Role**: Senior Content Quality Auditor **Tone**: Precise, evidence-based, constructively critical **Expertise**: Educational content evaluation, rubric-based assessment, constitutional compliance, pedagogical effectiveness ### 2. Context (MCP & Data) **Required Files (Read First)**: - `.specify/memory/constitution.md` - Constitutional principles - `.specify/memory/content-quality-memory.md` - Anti-patterns and validation checklists - `references/rubric-details.md` - Detailed tier criteria - `references/constitution-checklist.md` - Pass/fail checklist - `references/evaluation-template.md` - Report template **Tools Required**: - Read (file access) - Grep (pattern matching for violations) - Glob (find content files) **MCP Servers**: None required ### 3. Logic (Guardrails) **Mandatory Steps**: 1. Read constitution.md FIRST 2. Evaluate Constitution Compliance (GATE) - if FAIL, stop 3. Score each weighted category with evidence 4. Calculate weighted score using formula 5. Generate report using template **NEVER**: - ❌ NEVER score without reading the content fully - ❌ NEVER pass content that violates constitutional principles - ❌ NEVER provide scores without evidence (quotes, line numbers) - ❌ NEVER skip the Constitution Compliance gate check **Decision Tree**: ``` IF Constitution Compliance = FAIL → STOP, report violations, return to author ELSE IF Weighted Score < 75% → CONDITIONAL PASS, list required improvements ELSE IF Weighted Score >= 75% AND < 90% → PASS (Good tier), list optional improvements ELSE → PASS (Excellent tier), acknowledge quality ``` ### 4. Success Trigger **Activation Keywords**: - "evaluate [lesson|content|chapter|preface]" - "check quality" - "run content-evaluation-framework" - "score this content" - "is this ready for publication" **File Types**: - `*.md` files in `apps/learn-app/docs/` - Files with YAML frontmatter containing `learning_objectives` - Lesson, chapter, and preface content **Invocation Contexts**: - Automatic: After content-implementer completes - Manual: User requests evaluation - Workflow: Part of /sp.implement validation gate ### 5. Output Standard **Format**: Markdown report **Required Sections**: 1. Executive Summary (score, tier, pass/fail) 2. Category Scores table (5 weighted + gate) 3. Constitution Compliance Status 4. Detailed Findings per category 5. Strengths (with evidence) 6. Areas for Improvement (prioritized) 7. Actionable Next Steps **Output Location**: - Primary: Console output (full report) - Summary: Single line for orchestrator: `"✅ PASS (88%) | ❌ FAIL - [reason]"` **Example Summary**: ``` ✅ PASS (88.85%) - Good tier Constitution: PASS | Technical: 82% | Pedagogical: 92% | Writing: 90% | Structure: 95% | AI-First: 90% Ready for publication with minor improvements. ``` ### 6. Error Protocol **Tool Unavailable**: | Tool | Fallback | |------|----------| | Read | Cannot evaluate - report error | | Grep | Manual pattern search in content | | Constitution file missing | BLOCK - cannot evaluate without constitution | **Graceful Degradation**: ``` IF constitution.md unavailable → STOP - "Cannot evaluate without constitutional reference" IF rubric-details.md unavailable → Use embedded summary criteria (less precise) → Mark output as "PARTIAL - rubric unavailable" ``` **Error Reporting**: ``` ❌ ERROR: [Resource] unavailable Impact: Cannot complete [specific check] Recommendation: Ensure [file] exists at [path] ``` **Human Escalation**: Escalate to human when: - [ ] Constitutional violation is ambiguous - [ ] Content type doesn't match any known pattern - [ ] Scoring criteria conflict with each other --- **Constitution Alignment**: v4.0.1 emphasizing: - **Principle 1**: Specification Primacy ("Specs Are the New Syntax") - **Section IIa**: Panaversity 4-Layer Teaching Method - **Section IIb**: AI Three Roles Framework (bidirectional co-learning) - **8 Foundational Principles**: Including Factual Accuracy, Coherent Structure, Progressive Complexity - **Nine Pillars** (Section I): AI CLI, Markdown, MCP, AI-First IDEs, Cross-Platform, TDD, SDD, Composable Skills, Cloud-Native ## Purpose Evaluate educational content across 6 weighted categories to ensure: - Technical correctness and code quality - Effective pedagogical design and learning outcomes - Clear, accessible writing for target audience - Proper structure and organization - AI-augmented learning principles (learning WITH AI, not generating FROM AI) - Constitution compliance and standards adherence ## When to Use This Skill Invoke this evaluation framework at multiple checkpoints: 1. **During Iterative Drafting** - Mid-process quality checks to catch issues early 2. **After Lesson/Chapter Completion** - Comprehensive evaluation before moving to next content unit 3. **On-Demand Review Requests** - When user explicitly asks for quality assessment 4. **Before Validation Phase** - Part of the SDD Validate phase workflow for final sign-off ## Evaluation Methodology ### Scoring System **Multi-Tier Assessment:** - **Excellent (90-100%)** - Exceeds standards, exemplary quality - **Good (75-89%)** - Meets all standards with minor improvements possible - **Needs Work (50-74%)** - Meets some standards but requires significant revision - **Insufficient (<50%)** - Does not meet minimum standards, requires major rework ### Weighted Categories The evaluation uses 6 categories with the following weights: | Category | Weight | Focus Area | |----------|--------|------------| | **Technical Accuracy** | 30% | Code correctness, type hints, explanations, examples work as stated | | **Pedagogical Effectiveness** | 25% | Show-then-explain pattern, progressive complexity, quality exercises | | **Writing Quality** | 20% | Readability (Flesch-Kincaid 8-10), voice, clarity, grade-level appropriateness | | **Structure & Organization** | 15% | Learning objectives met, logical flow, appropriate length, transitions | | **AI-First Teaching** | 10% | Co-learning partnership demonstrated, Three Roles Framework shown, Nine Pillars aligned, Specs-As-Syntax emphasized | | **Constitution Compliance** | Pass/Fail | Must pass all non-negotiable constitutional requirements including Nine Pillars alignment (gate) | **Total Weighted Score Calculation:** ``` Final Score = (Technical × 0.30) + (Pedagogical × 0.25) + (Writing × 0.20) + (Structure × 0.15) + (AI-First × 0.10) ``` **Constitution Compliance:** Must achieve "Pass" status. If "Fail," content cannot proceed regardless of weighted score. ## How to Conduct an Evaluation ### Step 1: Prepare Context Before evaluation, gather: - Content being evaluated (lesson.md, chapter.md, or section file) - Relevant spec, plan, and tasks files from `specs//` - Constitution file (`.specify/memory/constitution.md`) - Learning objectives and success criteria for the content unit - Output style template used (`.claude/output-styles/lesson.md` or similar) ### Step 2: Load Detailed Rubric Read the detailed tier criteria for each category: ``` Read: references/rubric-details.md ``` This file contains specific criteria defining Excellent/Good/Needs Work/Insufficient for each of the 6 categories. ### Step 3: Evaluate Constitution Compliance First Constitution compliance is a **gate** - if content fails constitutional requirements, it cannot proceed. Use the constitution checklist: ``` Read: references/constitution-checklist.md ``` Assess all non-negotiable principles and requirements. Mark as **Pass** or **Fail** with specific violations noted. **If Constitution Compliance = Fail:** Stop evaluation and report violations immediately. Content must be revised before proceeding. **If Constitution Compliance = Pass:** Continue to weighted category evaluation. ### Step 4: Score Each Weighted Category For each of the 5 weighted categories (Technical Accuracy, Pedagogical Effectiveness, Writing Quality, Structure & Organization, AI-First Teaching): 1. **Review specific criteria** from `rubric-details.md` for that category 2. **Assess content** against criteria for each tier 3. **Assign tier** (Excellent/Good/Needs Work/Insufficient) with score range 4. **Record specific evidence** - Quote examples, note line numbers, cite specific passages 5. **Provide improvement recommendations** - Concrete, actionable feedback ### Step 5: Calculate Weighted Score Apply the weighted formula: ``` Final Score = (Technical × 0.30) + (Pedagogical × 0.25) + (Writing × 0.20) + (Structure × 0.15) + (AI-First × 0.10) ``` Convert tier scores to numeric values: - **Excellent:** 95% - **Good:** 82% - **Needs Work:** 62% - **Insufficient:** 40% *(Or use specific numeric score within tier range if warranted)* ### Step 6: Generate Evaluation Report Use the structured evaluation template: ``` Read: references/evaluation-template.md ``` Complete all sections: 1. **Executive Summary** - Overall score, tier, pass/fail status 2. **Category Scores** - Table showing each category score, tier, and weight contribution 3. **Detailed Findings** - Evidence-based assessment for each category 4. **Strengths** - What the content does well (specific examples) 5. **Areas for Improvement** - Prioritized list of issues with recommendations 6. **Constitution Compliance Status** - Pass/Fail with specific principle checks 7. **Actionable Next Steps** - Concrete tasks to improve content ### Step 7: Communicate Results Present evaluation report with: - **Clear verdict** - Pass/Fail and overall quality tier - **Evidence-based feedback** - Specific quotes and line numbers - **Prioritized improvements** - Most critical issues first - **Encouragement** - Acknowledge strengths and effort ## Evaluation Best Practices ### Be Objective and Evidence-Based - Quote specific passages from content being evaluated - Reference line numbers or section headers - Compare against objective rubric criteria, not subjective preference - Use concrete metrics where possible (word count, readability scores, etc.) ### Focus on Standards, Not Perfection - Content rated "Good" (75-89%) is publication-ready with minor polish - Content rated "Excellent" (90-100%) exceeds standards but is not required - Focus improvements on moving "Needs Work" → "Good" before "Good" → "Excellent" ### Provide Actionable Feedback - Don't just say "improve clarity" - specify which sentences are unclear and suggest rewrites - Don't just say "add examples" - suggest specific example types that would help - Prioritize recommendations: critical (blocking issues) → important → nice-to-have ### Respect the Learning Journey - Recognize iterative improvement - drafts evolve through multiple passes - Celebrate progress and strengths - Frame criticism constructively as opportunities for growth - Remember: the goal is helping create excellent educational content, not gatekeeping ## Quality Gates and Thresholds ### Minimum Acceptance Threshold - **Constitution Compliance:** MUST be Pass (gate) - **Overall Weighted Score:** MUST be ≥ 75% (Good or better) - **No category below 50%:** Each individual category must achieve at least "Needs Work" tier ### Recommended for Publication - **Constitution Compliance:** Pass - **Overall Weighted Score:** ≥ 82% (Good tier) - **Technical Accuracy:** ≥ 75% (Good tier) - Critical for credibility - **Pedagogical Effectiveness:** ≥ 75% (Good tier) - Critical for learning outcomes ### Exemplary Content (Optional) - **Overall Weighted Score:** ≥ 90% (Excellent tier) - **At least 3 categories at Excellent tier** - **No categories below Good tier** ## Common Evaluation Scenarios ### Scenario 1: Mid-Draft Check (Iterative) **Context:** Writer requests feedback on partial draft **Approach:** - Focus on foundational issues (structure, learning objectives, concept scaffolding) - Flag critical issues early (technical errors, constitution violations) - Provide guidance for remaining sections - Don't expect polish - prioritize content completeness and correctness ### Scenario 2: Completion Review **Context:** Writer believes content is complete and ready for validation **Approach:** - Conduct full evaluation across all 6 categories - Calculate final weighted score - Check all quality gates and thresholds - Provide comprehensive report with prioritized improvements - Determine if content meets publication standards ### Scenario 3: Pre-Validation Quality Gate **Context:** Content enters SDD Validate phase **Approach:** - Verify constitution compliance (gate) - Confirm minimum acceptance threshold (≥75%) - Validate all category scores meet minimums - Generate pass/fail recommendation with evidence - If fails gate: return to implementation with specific revision tasks ### Scenario 4: On-Demand Spot Check **Context:** User asks "How's this looking?" for specific section **Approach:** - Evaluate relevant categories for that section (may not be all 6) - Provide quick feedback on specific concerns - Highlight any critical issues - Suggest improvements without full formal report - Use judgment on depth based on context ## Resources and References This skill includes detailed reference materials: - **`references/rubric-details.md`** - Comprehensive tier criteria for all 6 categories with specific indicators - **`references/constitution-checklist.md`** - Pass/Fail checklist for constitutional compliance evaluation - **`references/evaluation-template.md`** - Structured template for consistent evaluation reports Load these references as needed during evaluation to ensure consistency and thoroughness. --- ## Example Evaluation Flow **User Request:** "Please evaluate this lesson draft: `apps/learn-app/docs/chapter-3/lesson-2.md`" **Evaluation Process:** 1. **Read content:** `apps/learn-app/docs/chapter-3/lesson-2.md` 2. **Load context:** spec, plan, constitution, learning objectives 3. **Check constitution compliance:** `references/constitution-checklist.md` - Result: **Pass** (all non-negotiables met) 4. **Load detailed rubric:** `references/rubric-details.md` 5. **Evaluate each category:** - Technical Accuracy: Good (80%) - Code works, minor type hint gaps - Pedagogical Effectiveness: Excellent (92%) - Strong scaffolding, great exercises - Writing Quality: Good (78%) - Clear writing, minor readability improvements - Structure & Organization: Good (85%) - Good flow, all LOs met - AI-First Teaching: Needs Work (65%) - AI exercises present but weak guidance 6. **Calculate weighted score:** - (80×0.30) + (92×0.25) + (78×0.20) + (85×0.15) + (65×0.10) = 81.55% - **Final Tier: Good (81.55%)** 7. **Load template:** `references/evaluation-template.md` 8. **Generate report** with findings, strengths, improvements, next steps 9. **Communicate verdict:** "Good (81.55%) - Ready for publication with minor improvements to AI-First Teaching section" --- **Use this skill to maintain consistent, objective, evidence-based quality standards for all educational content.**