---
name: code-quality-fix-all
description: Fix code quality issues identified in a code quality review stored in agent_artefacts/code_quality/<topic>/. Systematically addresses issues found by the code-quality-review-all skill for ANY code quality topic, with validation and testing at each step. Use when user asks to fix issues from a code quality review, or asks to fix issues from agent_artefacts/code_quality/.
---

# Code Quality Fix All

Fix code quality issues identified in a code quality review. This skill systematically addresses issues found by the `code-quality-review-all` skill for ANY code quality topic, with validation and testing at each step.

## Expected Arguments

When invoked, this skill expects the path to a code quality topic as an argument (e.g., `agent_artefacts/code_quality/private_api_imports`). If not provided, the skill will ask the user for the topic path.

Within the topic path, there are several files:

- README.md - contains description of the issue and examples of how to fix it
- results.json - contains list of all identified issues
- SUMMARY.md - contains summary of the identified issues

**Filters and options** are specified interactively after the skill starts by using the `AskUserQuestion` tool to present options, unless specified otherwise in arguments:

- Which issue types to target:
  - all
  - specific types
- Fix complexity level (easy only, medium and below, or all)
- Which evaluations to fix (all, specific ones, evaluations with a small number of issues)
- Maximum number of issues to fix in this run

## Workflow

### Phase 1: Understanding the Topic and Planning

1. **Read topic documentation**
   - Read the README.md to understand:
     - What code quality issue this topic addresses
     - Why it matters (stability, maintainability, etc.)
     - How to detect the issue
     - How to fix the issue (fix patterns, examples)
   - Read results.json to get all identified issues
   - Identify which issues are in scope based on arguments

2. **Analyze and categorize issues**
   - Analyze fix complexity based on:
     - issue_description
     - suggested_fix from results.json
     - Fix examples in README.md
   - Classify as:
     - **Easy**: Single-line changes, clear fix pattern in README
     - **Medium**: Multi-line changes, well-documented fix approach
     - **Hard**: No clear fix pattern, requires research or copying code
   - Group issues by evaluation and issue type
   - Generate statistics for presenting to user (see the sketch at the end of this phase)

3. **Ask user for filtering preferences**
   - Use `AskUserQuestion` tool to ask:
     - Which evaluations to fix? (all / specific ones / most affected)
     - Which issue types to target? (all / specific types)
     - Fix complexity level? (easy only / easy+medium / all)
     - Max issues per run? (all / limit to a specific number)
   - Apply filters based on user responses
   - Present the filtered plan with:
     - Number of issues to fix
     - Breakdown by evaluation and issue type
     - Complexity distribution
   - Ask for final confirmation to proceed

4. **Validate understanding of fixes**
   - For each unique issue type in scope:
     - Check if README.md documents how to fix it
     - Look for "Good Examples" and "Bad Examples" sections
     - Check the "suggested_fix" field in results.json
   - If the fix approach is unclear for any issue type:
     - Research the correct approach
     - Update `<topic>/README.md` with findings
     - Ask user for guidance if still uncertain
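A minimal sketch of the categorization in step 2, assuming results.json is a flat list of issue objects with `file`, `evaluation`, `issue_type`, `issue_description`, and `suggested_fix` fields (the exact schema is owned by the code-quality-review-all skill; adjust the field names to whatever the review actually emits):

```python
import json
from collections import Counter
from pathlib import Path

# Hypothetical topic path; the real one arrives as the skill argument.
topic = Path("agent_artefacts/code_quality/private_api_imports")
issues = json.loads((topic / "results.json").read_text())

def classify(issue: dict) -> str:
    """Rough complexity heuristic; refine it against the fix patterns in README.md."""
    fix = issue.get("suggested_fix", "")
    if not fix:
        return "hard"    # no documented fix pattern -> needs research
    if "\n" not in fix:
        return "easy"    # single-line change with a clear pattern
    return "medium"      # multi-line but documented

by_complexity = Counter(classify(i) for i in issues)
by_evaluation = Counter(i.get("evaluation", "unknown") for i in issues)
print(f"{len(issues)} issues: {dict(by_complexity)} across {len(by_evaluation)} evaluations")
```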
### Phase 2: Pre-Fix Validation

For each issue to be fixed:

1. **Read and understand context**
   - Read the entire file containing the issue (not just the line)
   - Understand how the problematic code is used
   - Look for related issues in the same file
   - Check for patterns that might affect the fix (e.g., multiple occurrences)
   - Identify any cascading changes needed (related imports, type hints, etc.)

2. **Validate the suggested fix**
   - Review the "suggested_fix" from results.json
   - Check against fix patterns in README.md
   - Verify the fix won't break functionality
   - For complex fixes:
     - Check if dependencies/alternatives actually exist
     - Validate that the replacement code follows the same patterns
     - Consider edge cases

3. **Estimate change scope**
   - Count how many lines will change for this fix
   - Identify if cascading changes are needed
   - Determine if multiple files need updating
   - **If changes exceed 100 lines for a single issue:**
     - Alert user with:
       - Issue details
       - Why the change is large
       - What will change
     - Get explicit approval before proceeding

### Phase 3: Applying Fixes

1. **Apply fixes systematically**
   - Process one evaluation at a time
   - Within each evaluation, group by issue type
   - For each fix:
     - Use the Edit tool to apply the change
     - Follow the suggested_fix guidance
     - Apply fix patterns from README.md
     - Handle related issues in the same file together
     - Add comments if the fix requires it (e.g., copied code attribution)
     - Track what was fixed

2. **Verify changes compile/parse**
   - After fixing each file, validate:
     - File is syntactically valid (Python can parse it; see the parse-check sketch in Phase 4 below)
     - No obvious import errors introduced
     - Code follows repository patterns
   - If validation fails:
     - Investigate the issue
     - Attempt to fix the validation error
     - Roll back the change if it cannot be resolved

3. **Track progress**
   - Maintain lists of:
     - Issues successfully fixed (file, line, issue type)
     - Issues that couldn't be fixed (with reasons)
     - Evaluations that have been modified
     - Files that were changed

### Phase 4: Testing and Validation

1. **Run linting**
   - Run the repository's linter on modified files (ruff, flake8, mypy, etc.)
   - Check for:
     - Import errors
     - Type checking errors
     - Style violations introduced
   - Fix any linting issues that result from changes
   - If linting issues can't be fixed, document them

2. **Run unit tests**
   - Identify test files for each modified evaluation
   - Run unit tests for affected evaluations using pytest:

     **Basic test commands:**

     ```bash
     # Run tests for a specific evaluation
     uv run pytest tests/<evaluation>/

     # Run a specific test file
     uv run pytest tests/test_file.py

     # Run a specific test
     uv run pytest tests/test_file.py::TestClass::test_method

     # Run slow tests (excluded by default)
     uv run pytest --runslow tests/

     # Skip dataset download tests
     uv run pytest -m 'not dataset_download' tests/

     # Run only slow tests
     uv run pytest -m slow tests/
     ```

     **Test markers to be aware of:**

     - `@pytest.mark.slow` - Tests taking >10 seconds
     - `@pytest.mark.dataset_download` - Tests that download datasets
     - `@pytest.mark.docker` - Tests using Docker
     - `@pytest.mark.huggingface` - HuggingFace-related tests

   - Focus on tests for the specific evaluation
   - Look for test failures or errors
   - **IMPORTANT**: Do NOT run full evaluations (they take too long) unless user explicitly requests it
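Before running the tests above, the parse check from Phase 3 (step 2) catches syntax errors cheaply. A minimal sketch, assuming modified files are tracked as a list of paths; `ast.parse` verifies syntax only, not imports or behavior:

```python
import ast
from pathlib import Path

def parses_cleanly(path: Path) -> bool:
    """Return True if the file is still syntactically valid Python."""
    try:
        ast.parse(path.read_text(), filename=str(path))
        return True
    except SyntaxError as exc:
        print(f"{path}:{exc.lineno}: {exc.msg}")
        return False

# Hypothetical list of files touched in this run.
for modified in [Path("evals/example/scorer.py")]:
    if not parses_cleanly(modified):
        ...  # investigate, attempt a fix, or roll back (Phase 3, step 2)
```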
3. **Handle test failures**
   - For each test failure:
     - Read test output carefully
     - Determine if the failure is caused by the fix
     - Check if it's a pre-existing failure
   - If caused by the fix:
     - Try to adjust the fix to make tests pass
     - If it cannot be resolved, roll back the change
     - Document the issue for user review
   - If pre-existing:
     - Note it but don't block on it
     - Inform the user

### Phase 5: Re-Review and Handle Remaining Issues

1. **Update results.json with fix status**
   - For each issue that was fixed, add a `"fix_status"` field after `"suggested_fix"`:

     ```json
     {
       ...
       "suggested_fix": "...",
       "fix_status": "fixed - please review"
     }
     ```

   - For issues that couldn't be fixed, add an explanation:

     ```json
     "fix_status": "not fixed - reason: ..."
     ```

   - **IMPORTANT**: Do NOT remove any entries from results.json - only add/update "fix_status"
   - The code-quality-review-all skill owns results.json and is responsible for removing entries

2. **Re-run code quality review**
   - **IMPORTANT**: Use the Task tool to spawn a subagent running the code-quality-review-all skill
   - Pass the same topic path
   - This will update results.json with the current state
   - Compare results before and after to identify (see the sketch at the end of this phase):
     - Issues that are now resolved (no longer appear)
     - New issues that may have been introduced
     - Issues that still remain despite fix attempts

3. **Fix remaining issues if in scope**
   - For each new or remaining in-scope issue:
     - Investigate why the previous fix didn't work
     - Attempt an alternative fix approach
     - Update "fix_status" with the attempt results
   - Repeat this process until no more in-scope issues can be fixed

4. **Update the topic's README.md**
   - Add any knowledge you have discovered that will be useful in detecting or fixing topic-related issues in the future
   - Do not remove examples of bad code or patterns that were fixed - they will be useful in future reviews and fixes of future evaluations

5. **Update SUMMARY.md**
   - Add a "Recent Fixes" section with:
     - Date of the fix run
     - Number of issues fixed
     - Which evaluations were updated
   - Keep historical data (don't remove past information)
   - Update recommendations to reflect remaining work

6. **Run markdown linters**
   - Use `uv run pre-commit run markdownlint-fix` to fix markdown linting issues
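A minimal sketch of the before/after comparison in step 2, assuming issues can be keyed by file and issue type (line numbers shift after edits, so they are deliberately left out of the key) and that a pre-fix snapshot of results.json was saved under a hypothetical name:

```python
import json
from pathlib import Path

def issue_keys(path: Path) -> set[tuple[str, str]]:
    """Key each issue by (file, issue_type); assumes those fields exist."""
    return {(i["file"], i["issue_type"]) for i in json.loads(path.read_text())}

before = issue_keys(Path("results.before.json"))  # hypothetical pre-fix snapshot
after = issue_keys(Path("agent_artefacts/code_quality/private_api_imports/results.json"))

resolved = before - after     # no longer reported by the re-review
introduced = after - before   # new issues -- investigate immediately
remaining = before & after    # fix attempts that didn't take
```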
### Phase 6: Create PR Description and Present Results

1. **Create/Update PR description (cumulative)**
   - Read the existing `PR_DESCRIPTION.md` if it exists (from previous runs)
   - **Cumulative tracking**: the PR description represents ALL changes from the branch base, not just this run (a minimal append sketch follows step 2 below)
   - If PR_DESCRIPTION.md exists:
     - Parse the existing content to extract previous runs' data
     - Append information from this run
     - Update cumulative statistics
   - If PR_DESCRIPTION.md doesn't exist (first run):
     - Create a new file
   - Format for a GitHub/GitLab pull request with:
     - **Summary**: Brief overview of the code quality topic and total fixes (2-3 sentences)
     - **Overall Changes** (cumulative from all runs):
       - Total issues fixed across all runs by type
       - Total evaluations affected
       - Total files modified
     - **Fix Sessions**: List each run session with:
       - Date/time of the run
       - Issues fixed in that session
       - Complexity level targeted (easy/medium/all)
     - **Fixed Issues** (cumulative): Table or list with all file paths and issue types from all runs
     - **Testing** (from the latest run):
       - Which tests were run
       - Pass/fail status
       - Any test issues encountered
     - **Remaining Issues** (current state):
       - Count of issues still open
       - Brief note on what remains
     - **Review Notes** (cumulative):
       - Any complications or special considerations from any run
       - Areas that need extra attention during review
   - Use proper markdown formatting for PR readability
   - **IMPORTANT**: Do NOT commit PR_DESCRIPTION.md - it's only for creating the PR
   - Example structure:

     ```markdown
     ## Summary

     Fix private API imports code quality issues across evaluations.

     ## Overall Changes

     - Total issues fixed: 25
     - Evaluations affected: 8

     ## Fix Sessions

     ### Session 1: 2026-01-18 10:30 (Easy issues)

     - Fixed 10 easy issues
     - Targeted: Easy complexity, All evaluations

     ### Session 2: 2026-01-18 14:15 (Medium issues)

     - Fixed 15 medium issues
     - Targeted: Medium complexity, Specific evaluations

     ## Fixed Issues

     [Table of all fixed issues from all sessions]

     ## Testing

     [Latest test results]

     ## Remaining Issues

     6 issues remain (4 hard, 2 require investigation)

     ## Review Notes

     - Session 1: All tests passed
     - Session 2: One test required adjustment in fortress/scorer.py
     ```

2. **Present results to user**
   - Show a high-level summary for **this run**:
     - X issues fixed in this session
     - Y issues remain
     - Z tests passed
   - Show **cumulative progress** from PR_DESCRIPTION.md:
     - Total issues fixed across all runs
     - Number of fix sessions completed
   - Show before/after statistics from SUMMARY.md
   - List modified files from this run
   - Display the content of PR_DESCRIPTION.md for user review
   - **Do NOT automatically commit** - let the user review changes
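A minimal sketch of the append-only session tracking from step 1, assuming each run adds a `### Session N: <timestamp>` heading under `## Fix Sessions`; parsing cumulative statistics back out of the existing file is more involved and omitted here:

```python
from datetime import datetime
from pathlib import Path

PR_FILE = Path("PR_DESCRIPTION.md")  # used only to create the PR; never committed

def append_session(issues_fixed: int, complexity: str) -> None:
    """Append this run's session entry without touching earlier sessions."""
    text = PR_FILE.read_text() if PR_FILE.exists() else "## Fix Sessions\n"
    session_no = text.count("### Session ") + 1
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M")
    text += (
        f"\n### Session {session_no}: {stamp} ({complexity.capitalize()} issues)\n\n"
        f"- Fixed {issues_fixed} {complexity} issues\n"
    )
    PR_FILE.write_text(text)
```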
3. **Offer next steps**
   - **Create commit and PR**: Offer to:
     - Commit all changes (source files, results.json, SUMMARY.md)
     - Create a pull request with the description from PR_DESCRIPTION.md
     - Note: PR_DESCRIPTION.md itself is NOT committed (it's just for the PR description)
     - The PR description includes **cumulative changes from all fix sessions** on this branch
   - **Run more fixes**: If issues remain, suggest running the skill again with different filters
     - Running again will **append** to PR_DESCRIPTION.md, creating cumulative tracking
     - This allows iterative fixing: easy issues first, then medium, then hard
   - **Manual review needed**: List any issues that require manual attention

## Important Guidelines

### Safety First

- **Never batch all fixes blindly** - Validate each fix type before applying en masse
- **Always read before editing** - Understand context before changing code
- **Verify fixes don't break functionality** - Run tests incrementally
- **Be conservative** - Skip fixes you're uncertain about rather than risk breaking code
- **Get approval for large changes** - Alert the user when fixes exceed 100 lines
- **Have a rollback strategy** - Be able to revert if fixes cause problems

### Context is Critical

- **Understand the quality issue** - Read README.md thoroughly
- **Understand why the code was written that way** - There might be good reasons
- **Look for patterns** - Similar issues often need similar fixes
- **Check related code** - Fixes might require updating nearby code
- **Read existing comments** - Developers might have documented why they used certain patterns

### Validation at Every Step

- **Verify fix patterns from README** - Don't guess how to fix
- **Check suggested_fix in results.json** - Use the provided guidance
- **Validate changes compile** - Ensure code parses after changes
- **Run linters** - Catch style and import issues
- **Run tests** - Detect regressions immediately
- **Re-run the review** - Verify fixes actually resolve issues

### Communication

- **Show the plan before executing** - Let the user see what will be fixed
- **Alert for large changes** - Get approval for fixes exceeding 100 lines
- **Report uncertainties** - Flag issues where the fix approach is unclear
- **Show progress** - Keep the user informed during fixes
- **Explain failures** - Document why certain issues couldn't be fixed
- **Provide detailed reports** - Create comprehensive fix reports

### What NOT to Do

- **Don't assume you know how to fix** - Always consult README.md and results.json
- **Don't remove entries from results.json** - Only add/update the "fix_status" field
- **Don't replace PR_DESCRIPTION.md** - Append to it to maintain cumulative history across runs
- **Don't run full evaluations** - Only run unit tests (evaluations are slow)
- **Don't commit automatically** - Let the user review changes first
- **Don't commit PR_DESCRIPTION.md** - It's only for creating the PR
- **Don't fix issues you can't validate** - Skip rather than risk breaking
- **Don't ignore test failures** - Investigate or roll back
- **Don't make unrelated changes** - Only fix the specific quality issues
- **Don't assume all issues of the same type are identical** - Context matters