---
name: quality-reflective-questions
description: |
  Provides a reflective questioning framework to challenge assumptions about work
  completeness, catching incomplete implementations before they're marked "done".
  Use before claiming features complete, before moving ADRs to completed status,
  during self-review, or when declaring work finished. Triggers on "is this really
  done", "self-review my work", "challenge my assumptions", "verify completeness",
  or proactively before marking tasks complete. Works with any type of
  implementation work. Enforces critical thinking about integration, testing, and
  execution proof.
---

# Reflective Questions for Work Completeness

## Quick Start

Before marking ANYTHING as "done", ask yourself these questions and provide HONEST answers:

### The Four Mandatory Questions

1. **How do I trigger this?** (What's the entry point?)
2. **What connects it to the system?** (Where's the wiring?)
3. **What evidence proves it runs?** (Show me the logs)
4. **What shows it works correctly?** (What's the outcome?)

If you **cannot answer ALL FOUR** with specific, concrete details, **the work is NOT complete**.

### The Honesty Test

Replace vague answers with specific evidence:

- ❌ **Bad (vague):** "It's integrated" → ✅ **Good (specific):** "Imported in builder.py line 45"
- ❌ **Bad (vague):** "It works" → ✅ **Good (specific):** "Logs show execution at 10:30:45"
- ❌ **Bad (vague):** "Tests pass" → ✅ **Good (specific):** "46 unit tests + 2 integration tests pass"

## Table of Contents

1. When to Use This Skill
2. What This Skill Does
3. The Four Mandatory Questions (Deep Dive)
4. Category-Specific Questions
5. Red Flag Questions
6. The Honesty Checklist
7. Common Self-Deception Patterns
8. Usage
9. Expected Outcomes
10. Requirements
11. Red Flags to Avoid
12. Notes

## When to Use This Skill

### Explicit Triggers

- "Challenge my assumptions about completeness"
- "Ask me reflective questions about my work"
- "Self-review my implementation"
- "Is this really done?"
- "Verify my work is complete"
- "Question my completion claims"

### Implicit Triggers (PROACTIVE)

- **Before marking any task complete** (every single time)
- **Before moving an ADR from in_progress to completed**
- **Before claiming "feature works"**
- **Before self-approving work**
- **After implementing any feature**
- **When about to say "all tests passing ✅"**

### Debugging Triggers

- "Why do I feel uncertain about this?"
- "Something seems incomplete but I can't pinpoint it"
- "I want to mark this done but have doubts"
- "Am I missing something?"

## What This Skill Does

This skill provides a **structured framework of reflective questions** that:

1. **Challenges assumptions** about what "done" means
2. **Exposes gaps** between claimed completion and actual completion
3. **Forces specificity** instead of vague assurances
4. **Prevents premature completion** by requiring evidence
5. **Catches integration failures** before they become incidents

**This skill complements `quality-verify-implementation-complete`** by providing the mental framework for self-questioning BEFORE running technical verification.

## The Four Mandatory Questions (Deep Dive)

These questions MUST be answered for EVERY piece of work before claiming "done".

### Question 1: How do I trigger this?

**Purpose:** Verify the feature has a reachable entry point

**What it really asks:**

- Can a user/system actually invoke this code?
- Is there a documented way to make this execute?
- Could someone else trigger this without asking me?

**Good Answers (Specific):**

- ✅ "Run: `uv run temet-run -a talky -p 'analyze code'`"
- ✅ "Call: `curl -X POST /api/endpoint -d '{...}'`"
- ✅ "Import: `from myapp import MyService; MyService().method()`"
- ✅ "Event: Coordinator triggers when `should_review_architecture()` returns True"

**Bad Answers (Vague):**

- ❌ "Run the system"
- ❌ "It's automatic"
- ❌ "The coordinator calls it"
- ❌ "When needed"

**Follow-up Questions:**

- "Can you show me the EXACT command right now?"
- "What arguments/parameters are required?"
- "Under what conditions does this trigger?"
- "Could you trigger this in the next 30 seconds if asked?"

**If you cannot answer specifically:** The feature has no entry point → NOT COMPLETE

### Question 2: What connects it to the system?

**Purpose:** Verify the artifact is actually wired into the codebase

**What it really asks:**

- Where is the import statement?
- Where is the registration/initialization?
- Where is the configuration that enables this?
- Can you show me the LINE NUMBER where this is connected?

**Good Answers (Specific):**

- ✅ "builder.py line 45: `from .architecture_nodes import create_review_node`"
- ✅ "main.py line 12: `app.add_command(my_command)`"
- ✅ "container.py line 67: `container.register(MyService, scope=Scope.SINGLETON)`"
- ✅ "routes.py line 23: `router.add_route('/endpoint', handler)`"

**Bad Answers (Vague):**

- ❌ "It's imported"
- ❌ "It's in the builder"
- ❌ "It's registered"
- ❌ "It's wired up"

**Follow-up Questions:**

- "Can you paste the EXACT import line?"
- "What FILE and LINE NUMBER has the registration?"
- "Can you show me with grep output?"
- "Could I find this connection in 60 seconds if I looked?"

**If you cannot answer specifically:** The artifact is orphaned → NOT COMPLETE
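If you want to answer Question 2 with grep output on the spot, here is a minimal sketch, reusing the module and file names from the examples above (the `src/` path is an assumption — adjust to your own layout):

```bash
# Find the exact import and registration lines, with line numbers you can quote.
# "architecture_nodes" and src/builder.py come from the examples above.
grep -n "architecture_nodes" src/builder.py
# Illustrative output:
#   12:from .architecture_nodes import create_architecture_review_node
#   146:graph.add_node("architecture_review", review_node)
```

An empty result here is the strongest possible signal: the module is orphaned and Question 2 has no answer.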
### Question 3: What evidence proves it runs?

**Purpose:** Verify the code actually executes at runtime

**What it really asks:**

- Have you ACTUALLY triggered this and observed execution?
- What logs/traces show this code path was hit?
- Can you show me timestamped evidence of execution?
- Did you observe this with your own eyes (or grep)?

**Good Answers (Specific):**

- ✅ "Logs: `[2025-12-07 10:30:45] INFO architecture_review_triggered agent=talky`"
- ✅ "Output: `✓ Task completed successfully` (from CLI run at 10:30)"
- ✅ "Trace: OpenTelemetry span `architecture_review` with duration 1.2s"
- ✅ "Debug: Added print statement, saw output 'Node executed'"

**Bad Answers (Vague):**

- ❌ "It should run"
- ❌ "Tests pass"
- ❌ "No errors when I ran it"
- ❌ "The system works"

**Follow-up Questions:**

- "Can you paste the ACTUAL log line showing execution?"
- "What TIMESTAMP did this execute?"
- "Did you observe this directly or are you assuming?"
- "Could you trigger this RIGHT NOW and show me the logs?"

**If you cannot answer specifically:** No execution proof → NOT COMPLETE
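To turn Question 3's follow-ups into a concrete habit, a sketch of pulling timestamped proof from logs (the log file path is an assumption; the event name mirrors the example above):

```bash
# Grep the application log for the event that proves the code path ran.
# logs/app.log is an assumed path; the event name comes from the example above.
grep "architecture_review_triggered" logs/app.log
# Illustrative output -- a timestamped line you can paste as evidence:
#   [2025-12-07 10:30:45] INFO architecture_review_triggered agent=talky
```

No matching line after a fresh trigger means no execution proof, whatever the tests say.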
### Question 4: What shows it works correctly?

**Purpose:** Verify the code produces the expected outcome

**What it really asks:**

- What observable outcome proves correct behavior?
- What state changed as a result of execution?
- What output/artifact was created?
- How do you KNOW it did the right thing?

**Good Answers (Specific):**

- ✅ "State: `result.architecture_review = ArchitectureReviewResult(status=APPROVED, violations=[])`"
- ✅ "Database: Row inserted with ID 123, verified with query"
- ✅ "File: Created `output.txt` with expected contents (see: cat output.txt)"
- ✅ "API: Returned HTTP 200 with JSON body containing expected fields"

**Bad Answers (Vague):**

- ❌ "It works"
- ❌ "No errors"
- ❌ "Tests pass"
- ❌ "Everything looks good"

**Follow-up Questions:**

- "Can you show me the EXACT output/state change?"
- "What VALUE did this produce?"
- "How do you KNOW this is correct vs just 'no errors'?"
- "Could you demonstrate correct behavior RIGHT NOW?"

**If you cannot answer specifically:** No outcome proof → NOT COMPLETE

## Category-Specific Questions

Apply the Four Questions framework to specific implementation types. For detailed questions by category, see [references/category-specific-questions.md](./references/category-specific-questions.md). A sketch of quick spot-checks follows this list.

**Categories covered:**

- **Modules/Files**: Import verification, call-site validation
- **LangGraph Nodes**: Graph registration, edge connectivity
- **CLI Commands**: Registration, --help visibility, execution
- **Service Classes (DI)**: Container registration, injection points
- **API Endpoints**: Route registration, response validation
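As a rough guide, one spot-check per category, reusing identifiers from the examples in this document (all paths and names are illustrative assumptions, not a prescribed layout):

```bash
# Modules/Files: is the module imported anywhere in production code?
grep -rn "from myapp" src/ --include="*.py"

# LangGraph Nodes: is the node actually added to the graph?
grep -n "add_node(\"architecture_review\"" src/builder.py

# CLI Commands: does the command appear in the help output at all?
uv run temet-run --help

# Service Classes (DI): is the service registered in the container?
grep -n "container.register(MyService" src/container.py

# API Endpoints: is the route registered?
grep -n "add_route('/endpoint'" src/routes.py
```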
**"Am I self-approving without external review?"** - If YES: You might miss blind spots - Action: Request reviewer agent or peer review ## The Honesty Checklist Before marking ANYTHING complete, answer these honestly: ### Evidence Requirements - [ ] **I can paste the exact command to trigger this feature** (Not "run the system" - the EXACT command with args) - [ ] **I can show the file and line number where this is imported/registered** (Not "it's in builder.py" - the EXACT line number) - [ ] **I have actual logs showing this code executed** (Not "it should log" - actual timestamped log lines) - [ ] **I can show the specific output/state change this produced** (Not "it works" - the EXACT output/data) - [ ] **I triggered this manually and observed it work** (Not "tests pass" - I personally ran it) ### Integration Requirements - [ ] **This code is imported in at least one production file** (grep output shows import, not just tests) - [ ] **This code has call-sites in production paths** (grep output shows calls, not just definitions) - [ ] **This code is registered/wired where it needs to be** (container, graph, router, CLI - verified) - [ ] **Integration tests verify this component is in the system** (Not just unit tests - integration/E2E tests exist) ### Outcome Requirements - [ ] **I can demonstrate this works to someone else right now** (Could walk someone through triggering and observing) - [ ] **The behavior matches the specification** (Not just "no errors" - correct behavior observed) - [ ] **I would bet money this works end-to-end** (Confident enough to stake reputation on it) - [ ] **I have answered all Four Questions with specific details** (No vague answers, all concrete) ### If ANY checkbox is unchecked: **NOT COMPLETE** ## Common Self-Deception Patterns Be aware of these patterns that lead to premature completion claims. For detailed analysis and fixes, see [references/self-deception-patterns.md](./references/self-deception-patterns.md). **Common Patterns:** 1. **"Tests Pass" Syndrome** - Unit tests pass but integration untested 2. **"Should Work" Fallacy** - Using assumptions instead of evidence 3. **"No Errors" Confusion** - Equating silence with correctness 4. **"File Exists" Completion** - Code written but not integrated 5. **"Looks Good" Approval** - Vague approval without specifics 6. **"I Remember Doing It"** - Trusting memory over verification 7. **"Later Will Be Fine"** - Deferring critical verification steps ## Usage 1. **Before marking work complete**, run through the Four Questions 2. **Check the Honesty Checklist** - all boxes must be checked 3. **Verify no Red Flags** are present 4. **If uncertain**, review [references/category-specific-questions.md](./references/category-specific-questions.md) for your implementation type **Supporting Files:** - [references/category-specific-questions.md](./references/category-specific-questions.md) - Detailed questions by category - [references/self-deception-patterns.md](./references/self-deception-patterns.md) - Pattern recognition and fixes ## Expected Outcomes ### Successful Self-Review ``` Reflective Questions Self-Review Feature: ArchitectureReview Node Date: 2025-12-07 FOUR MANDATORY QUESTIONS: 1. How do I trigger this? ✅ SPECIFIC: uv run temet-run -a talky -p "Write a function" When should_review_architecture() returns True (when code_changes detected) 2. What connects it to the system? 
## Common Self-Deception Patterns

Be aware of these patterns that lead to premature completion claims. For detailed analysis and fixes, see [references/self-deception-patterns.md](./references/self-deception-patterns.md).

**Common Patterns:**

1. **"Tests Pass" Syndrome** - Unit tests pass but integration untested
2. **"Should Work" Fallacy** - Using assumptions instead of evidence
3. **"No Errors" Confusion** - Equating silence with correctness
4. **"File Exists" Completion** - Code written but not integrated
5. **"Looks Good" Approval** - Vague approval without specifics
6. **"I Remember Doing It"** - Trusting memory over verification
7. **"Later Will Be Fine"** - Deferring critical verification steps

## Usage

1. **Before marking work complete**, run through the Four Questions
2. **Check the Honesty Checklist** - all boxes must be checked
3. **Verify no Red Flags** are present
4. **If uncertain**, review [references/category-specific-questions.md](./references/category-specific-questions.md) for your implementation type

**Supporting Files:**

- [references/category-specific-questions.md](./references/category-specific-questions.md) - Detailed questions by category
- [references/self-deception-patterns.md](./references/self-deception-patterns.md) - Pattern recognition and fixes

## Expected Outcomes

### Successful Self-Review

```
Reflective Questions Self-Review
Feature: ArchitectureReview Node
Date: 2025-12-07

FOUR MANDATORY QUESTIONS:

1. How do I trigger this?
   ✅ SPECIFIC: uv run temet-run -a talky -p "Write a function"
   When should_review_architecture() returns True (when code_changes detected)

2. What connects it to the system?
   ✅ SPECIFIC:
   builder.py line 12: from .architecture_nodes import create_architecture_review_node
   builder.py line 146: graph.add_node("architecture_review", review_node)
   builder.py line 189: Conditional edge from "query_claude"

3. What evidence proves it runs?
   ✅ SPECIFIC: Logs from execution at 2025-12-07 10:30:45:
   [INFO] architecture_review_triggered agent=talky session=abc123
   [INFO] architecture_review_complete status=approved violations=0

4. What shows it works correctly?
   ✅ SPECIFIC:
   state.architecture_review = ArchitectureReviewResult(
       status=ReviewStatus.APPROVED,
       violations=[],
       recommendations=["Code follows Clean Architecture"]
   )

HONESTY CHECKLIST:
✅ All evidence specific, not vague
✅ All connections verified with grep
✅ Execution observed directly
✅ Outcome matches specification

SELF-DECEPTION CHECK:
✅ Not relying on "tests pass" alone
✅ Not using "should" or "probably"
✅ Not assuming - all verified
✅ Would bet $1000 this works

DECISION: ✅ WORK IS COMPLETE
Ready to mark as done.
```

### Failed Self-Review (Catches Incompleteness)

```
Reflective Questions Self-Review
Feature: ArchitectureReview Node
Date: 2025-12-05 (BEFORE FIX)

FOUR MANDATORY QUESTIONS:

1. How do I trigger this?
   ⚠️ VAGUE: "Run the coordinator"
   FOLLOW-UP: What's the EXACT command?
   RE-ANSWER: uv run temet-run -a talky -p "..."
   ⚠️ STILL VAGUE: What prompts trigger the node?

2. What connects it to the system?
   ❌ VAGUE: "It should be in builder.py"
   FOLLOW-UP: Can you show me the line number?
   RE-CHECK: grep "architecture_nodes" builder.py
   RESULT: (empty)
   ❌ CRITICAL: MODULE IS NOT IMPORTED

3. What evidence proves it runs?
   ❌ ASSUMPTION: "Tests pass so it should run"
   FOLLOW-UP: Have you actually run it and seen logs?
   RE-ANSWER: "No, just ran unit tests"
   CRITICAL: NO EXECUTION PROOF

4. What shows it works correctly?
   ❌ ASSUMPTION: "Tests verify behavior"
   FOLLOW-UP: What actual output did you observe?
   RE-ANSWER: "Just the test assertions passing"
   CRITICAL: NO RUNTIME OUTCOME PROOF

HONESTY CHECKLIST:
❌ Using vague language ("should", "I think")
❌ No specific line numbers or imports shown
❌ No execution logs captured
❌ Relying on tests, not runtime verification

SELF-DECEPTION CHECK:
❌ Relying on "tests pass" only
❌ Using "should" repeatedly
❌ Assuming instead of verifying
❌ Would NOT bet $1000 (honest answer: no)

DECISION: ❌ WORK IS NOT COMPLETE
Critical issues found:
1. Module not imported in builder.py
2. No runtime execution proof
3. No integration test

DO NOT mark as done. Fix integration first.
```
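The failed review above is exactly the failure mode worth automating. A minimal CI guard sketch, assuming the module name and `src/` layout from the example:

```bash
# Fail the build if the node module is never imported by production code.
# Module name and paths are assumptions taken from the example above.
if ! grep -rq "from .architecture_nodes import" src/ --include="*.py" --exclude-dir=tests; then
  echo "ERROR: architecture_nodes is orphaned (no production import)" >&2
  exit 1
fi
```

A guard like this turns "Module not imported in builder.py" from a self-review discovery into a build failure.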
## Requirements

### Tools Required

- None (this is a mental framework)

### Knowledge Required

- Understanding of what "done" means in your domain
- Willingness to be honest with yourself
- Ability to distinguish vague from specific answers

### Mindset Required

- **Intellectual honesty** - Admit when you don't know
- **Rigor** - Don't accept vague answers from yourself
- **Patience** - Take time to verify properly
- **Courage** - Admit incompleteness instead of rushing to "done"

## Red Flags to Avoid

### Do Not

- ❌ Accept vague answers from yourself
- ❌ Use "should", "probably", "I think" language
- ❌ Rush through questions to mark done faster
- ❌ Skip questions that feel uncomfortable
- ❌ Trust memory instead of current verification
- ❌ Assume connection without grep proof
- ❌ Claim execution without logs
- ❌ Rely on unit tests alone for integration work

### Do

- ✅ Answer all Four Questions with specific details
- ✅ Replace assumptions with evidence
- ✅ Be honest about gaps and uncertainties
- ✅ Verify current state, don't trust memory
- ✅ Show concrete proof (line numbers, logs, output)
- ✅ Admit incompleteness when found
- ✅ Fix gaps before marking complete
- ✅ Use this framework for EVERY completion claim

## Notes

- This skill was created in response to ADR-013 (2025-12-07)
- The pattern: self-deception about completeness led to orphaned code
- This skill provides the mental framework BEFORE technical verification
- Pair it with `quality-verify-implementation-complete` for full coverage
- The Four Questions are the MINIMUM bar, not the complete verification
- Honesty with yourself is the foundation of quality work

**Remember:** The person you're most likely to deceive is yourself. These questions force honesty.