--- name: classify description: Assign labels or categories to items based on characteristics. Use when categorizing entities, tagging content, identifying types, or labeling data according to a taxonomy. argument-hint: "[item] [taxonomy] [multi-label]" disable-model-invocation: false user-invocable: true allowed-tools: Read, Grep context: fork agent: explore layer: UNDERSTAND --- ## Intent Assign one or more labels from a defined taxonomy to items based on their observed characteristics. This capability bridges detection and reasoning by providing semantic categorization. **Success criteria:** - Item assigned at least one label from taxonomy - Label assignment supported by evidence - Confidence scores reflect classification certainty - Ambiguous cases explicitly flagged **Compatible schemas:** - `schemas/output_schema.yaml` ## Inputs | Parameter | Required | Type | Description | |-----------|----------|------|-------------| | `item` | Yes | any | The item to classify (entity, document, code, data) | | `taxonomy` | No | string\|array | Classification scheme or list of valid labels | | `multi_label` | No | boolean | Whether multiple labels can be assigned (default: false) | | `context` | No | object | Additional context to inform classification | ## Procedure 1) **Examine the item**: Gather characteristics relevant to classification - Identify distinguishing features - Note structural patterns, content type, metadata - Collect evidence for each observed characteristic 2) **Understand the taxonomy**: Clarify the classification scheme - If taxonomy provided, use those labels exclusively - If no taxonomy, infer appropriate categories from context - Define clear boundaries between categories 3) **Match characteristics to labels**: Evaluate fit for each potential label - Score how well item characteristics match each category - Consider edge cases and borderline classifications - Note which features drive each potential classification 4) **Assign labels**: Select the most appropriate label(s) - For single-label: choose highest confidence match - For multi-label: include all labels above confidence threshold - Flag if no label is a strong match 5) **Ground classification**: Document evidence supporting each label - Reference specific characteristics that drove classification - Note any characteristics that contradict the assignment ## Output Contract Return a structured object: ```yaml labels: - label: string # Assigned category/label confidence: number # 0.0-1.0 for this specific label evidence: array[string] # Characteristics supporting this label probabilities: # Optional: full probability distribution label_1: number label_2: number primary_label: string # Highest confidence label ambiguous: boolean # True if classification is uncertain confidence: 0..1 # Overall classification confidence evidence_anchors: ["file:line", "characteristic:value"] assumptions: [] ``` ### Field Definitions | Field | Type | Description | |-------|------|-------------| | `labels` | array | Assigned labels with confidence and evidence | | `probabilities` | object | Probability for each possible label | | `primary_label` | string | Single best classification | | `ambiguous` | boolean | Whether classification is uncertain | | `confidence` | number | 0.0-1.0 overall confidence | ## Examples ### Example 1: Classify Code File **Input:** ```yaml item: "src/services/payment_processor.rb" taxonomy: ["service", "model", "controller", "helper", "concern"] ``` **Output:** ```yaml labels: - label: "service" confidence: 0.92 evidence: - "Located in services/ directory" - "Class name ends with 'Processor'" - "Contains business logic methods without persistence" probabilities: service: 0.92 model: 0.03 controller: 0.02 helper: 0.02 concern: 0.01 primary_label: "service" ambiguous: false confidence: 0.92 evidence_anchors: - "src/services/payment_processor.rb:1" - "src/services/payment_processor.rb:15-45" assumptions: - "Directory structure follows Rails conventions" ``` ### Example 2: Classify with Ambiguity **Input:** ```yaml item: "User reported issue: 'App crashes on login'" taxonomy: ["bug", "feature-request", "question", "documentation"] multi_label: true ``` **Output:** ```yaml labels: - label: "bug" confidence: 0.75 evidence: - "Reports crash behavior" - "Describes unexpected failure" - label: "question" confidence: 0.40 evidence: - "Lacks reproduction steps" - "May be user error or configuration" probabilities: bug: 0.75 question: 0.40 feature-request: 0.05 documentation: 0.10 primary_label: "bug" ambiguous: true confidence: 0.65 evidence_anchors: - "issue:title" - "issue:body" assumptions: - "Crash is not expected behavior" - "User has attempted normal login flow" ``` ## Verification - [ ] At least one label assigned with confidence > 0.3 - [ ] Evidence exists for each assigned label - [ ] Labels are from specified taxonomy (if provided) - [ ] Ambiguous flag set when confidence < 0.7 - [ ] Probabilities sum to ~1.0 (if provided) **Verification tools:** Read (to verify evidence references) ## Safety Constraints - `mutation`: false - `requires_checkpoint`: false - `requires_approval`: false - `risk`: low **Capability-specific rules:** - Do not invent labels outside the provided taxonomy - Flag uncertainty rather than forcing low-confidence classifications - Do not access data beyond what's needed for classification - Note when item characteristics are insufficient for reliable classification ## Composition Patterns **Commonly follows:** - `detect` - Classify items after detecting their presence - `observe` - Classify based on observed characteristics - `retrieve` - Classify retrieved items **Commonly precedes:** - `compare` - Classification enables comparison within categories - `plan` - Classified items inform planning decisions - `generate` - Classification guides content generation **Anti-patterns:** - Never use classify for binary detection (use `detect`) - Avoid classify when precise measurement needed (use `measure`) **Workflow references:** - See `reference/workflow_catalog.yaml#capability_gap_analysis` for classification usage