---
name: traceability-matrix-generator
description: >
  Builds traceability matrices connecting requirements to design documents to
  source code implementation, tracking the complete development lifecycle. Use
  when you need to verify implementation completeness, ensure all requirements
  are implemented in code, generate compliance documentation, audit requirement
  coverage, identify orphaned code, or create traceability reports for
  stakeholders. Supports parsing requirements from Markdown, Word, and PDFs;
  extracting design from architecture docs and API specs; and scanning source
  code for implementations. Outputs to Markdown tables, Excel/CSV, and HTML
  visualizations.
---

# Traceability Matrix Generator

Build comprehensive traceability matrices linking requirements → design → implementation across the software development lifecycle.

## What is a Traceability Matrix?

A traceability matrix documents relationships between:

- **Requirements**: What the system must do (user stories, specs, features)
- **Design**: How the system will be structured (architecture, APIs, components)
- **Implementation**: Where requirements are coded (functions, classes, modules)

**Benefits:**

- Ensure all requirements are implemented
- Identify missing implementations or tests
- Support compliance and auditing
- Track impact of requirement changes
- Find orphaned code without requirements

## Workflow

### Step 1: Identify and Collect Artifacts

Gather all traceability sources from the project.

**Requirements Sources:**

- `requirements.md`, `REQUIREMENTS.txt`
- User story documents
- Issue tracker exports (Jira, GitHub Issues)
- Product requirement documents (PRDs)
- Feature specifications

**Design Sources:**

- `DESIGN.md`, architecture documents
- API specifications (OpenAPI, Swagger)
- Database schemas
- UML diagrams, architecture diagrams
- Architecture decision records (ADRs)

**Implementation Sources:**

- Source code files (`*.py`, `*.java`, `*.js`, etc.)
- Module docstrings
- Function/class comments with requirement IDs
- Configuration files

**Checklist:**

- [ ] Locate requirements documents
- [ ] Find design documentation
- [ ] Identify source code directories
- [ ] Check for existing ID/tagging conventions
- [ ] Verify file access and permissions

### Step 2: Extract Requirements

Parse requirements and assign unique identifiers.

**Common Requirement Formats:**

**Markdown with IDs:**

```markdown
## REQ-001: User Authentication
The system shall allow users to log in with email and password.

## REQ-002: Password Reset
Users shall be able to reset forgotten passwords via email.
```

**User Stories:**

```markdown
### US-123: As a user, I want to search products
So that I can find items quickly

**Acceptance Criteria:**
- Search box on homepage
- Results display in < 1 second
- Filter by category
```

**Numbered Lists:**

```markdown
1. **REQ-AUTH-001**: System must support OAuth 2.0
2. **REQ-AUTH-002**: Sessions expire after 24 hours
3. **REQ-DATA-001**: Data must be encrypted at rest
```

**Extraction Script (Python):**

```python
import re
from pathlib import Path

def extract_requirements(file_path):
    """Extract requirements with IDs from markdown file."""
    requirements = []

    with open(file_path, 'r') as f:
        content = f.read()

    # Pattern: REQ-XXX or US-XXX or similar
    pattern = r'^#+\s*([A-Z]+-[A-Z0-9-]+):\s*(.+?)$'

    for match in re.finditer(pattern, content, re.MULTILINE):
        req_id = match.group(1)
        req_title = match.group(2)

        requirements.append({
            'id': req_id,
            'title': req_title,
            'source': file_path.name,
            'type': 'requirement'
        })

    return requirements

# Usage
reqs = extract_requirements(Path('requirements.md'))
for req in reqs:
    print(f"{req['id']}: {req['title']}")
```

**Manual Extraction:**

If documents lack IDs, assign them:

```
Original: "Users can filter search results"
→ Assign: REQ-SEARCH-001: Users can filter search results
```

For detailed requirement extraction patterns, see
[references/extraction_patterns.md](references/extraction_patterns.md).

### Step 3: Extract Design Artifacts

Identify design elements and link to requirements.

**Design Linking Patterns:**

**Explicit References in Design Docs:**

```markdown
## Authentication Service (REQ-001, REQ-002)

**Architecture:**
- OAuth 2.0 provider integration (REQ-AUTH-001)
- Session management module (REQ-AUTH-002)
- Password reset workflow (REQ-002)

**API Endpoints:**
- `POST /auth/login` - Implements REQ-001
- `POST /auth/reset` - Implements REQ-002
```

**API Specifications:**

```yaml
# openapi.yaml
paths:
  /auth/login:
    post:
      summary: User login endpoint
      x-requirements: [REQ-001, REQ-AUTH-001]
      description: Implements user authentication
```

**Architecture Diagrams:**

```markdown
[Component Diagram]
- AuthService → Implements REQ-001, REQ-002
- UserDatabase → Supports REQ-DATA-001
- EmailService → Enables REQ-002
```

**Extraction Example:**

```python
import re
from pathlib import Path

def extract_design_links(design_file):
    """Extract design artifacts and linked requirements.

    Expects design_file to be a pathlib.Path.
    """
    design_artifacts = []

    with open(design_file, 'r') as f:
        content = f.read()

    # Find headers with requirement references, e.g. "## Name (REQ-001, REQ-002)"
    pattern = r'^#+\s*(.+?)\s*\((.+?)\)$'

    for match in re.finditer(pattern, content, re.MULTILINE):
        artifact_name = match.group(1)
        req_refs = match.group(2)

        # Parse requirement IDs
        req_ids = re.findall(r'[A-Z]+-[A-Z0-9-]+', req_refs)

        design_artifacts.append({
            'name': artifact_name,
            'requirements': req_ids,
            'source': design_file.name,
            'type': 'design'
        })

    return design_artifacts
```

### Step 4: Scan Implementation

Search source code for requirement references.

**Code Annotation Patterns:**

**Docstrings (Python):**

```python
def authenticate_user(email, password):
    """Authenticate user credentials.

    Implements: REQ-001, REQ-AUTH-001

    Args:
        email: User email address
        password: User password

    Returns:
        Authentication token if successful
    """
    # Implementation...
```

**Comments (Java):**

```java
/**
 * User authentication service
 * @implements REQ-001 User login
 * @implements REQ-AUTH-001 OAuth support
 */
public class AuthenticationService {
    // Implementation...
}
```

**Comments (JavaScript):**

```javascript
/**
 * Password reset functionality
 * Implements: REQ-002
 */
function resetPassword(email) {
    // Implementation...
}
```

**Scanning Script:**

```python
import re
from pathlib import Path

def scan_code_for_requirements(code_dir):
    """Scan Python source files for requirement references."""
    implementations = []

    for file_path in Path(code_dir).rglob('*.py'):
        with open(file_path, 'r') as f:
            content = f.read()

        # Find requirement references in comments/docstrings.
        # Only the keyword is case-insensitive; IDs must be uppercase.
        matches = re.finditer(
            r'(?i:Implements?|Satisfies|Covers):\s*([A-Z]+-[A-Z0-9-]+(?:,\s*[A-Z]+-[A-Z0-9-]+)*)',
            content
        )

        for match in matches:
            req_ids = [r.strip() for r in match.group(1).split(',')]

            # Find containing function/class by walking backwards
            lines_before = content[:match.start()].split('\n')
            for i in range(len(lines_before) - 1, -1, -1):
                if 'def ' in lines_before[i] or 'class ' in lines_before[i]:
                    code_element = lines_before[i].strip()
                    break
            else:
                # No enclosing def/class found above the reference
                code_element = "Unknown"

            implementations.append({
                'file': str(file_path),
                'element': code_element,
                'requirements': req_ids,
                'type': 'implementation'
            })

    return implementations
```

For comprehensive code scanning patterns, see [references/code_scanning.md](references/code_scanning.md).

### Step 5: Build the Traceability Matrix

Combine all extracted data into a structured matrix.

**Data Structure:**

```python
traceability_matrix = {
    'REQ-001': {
        'requirement': {
            'id': 'REQ-001',
            'title': 'User Authentication',
            'source': 'requirements.md'
        },
        'design': [
            {
                'name': 'Authentication Service',
                'source': 'design.md'
            }
        ],
        'implementation': [
            {
                'file': 'auth/service.py',
                'element': 'def authenticate_user()'
            }
        ]
    },
    # ... more requirements
}
```

**Building Script:**

```python
def build_traceability_matrix(requirements, design_artifacts, implementations):
    """Build complete traceability matrix."""
    matrix = {}

    # Initialize with requirements
    for req in requirements:
        matrix[req['id']] = {
            'requirement': req,
            'design': [],
            'implementation': []
        }

    # Link design artifacts (references to unknown IDs are skipped)
    for design in design_artifacts:
        for req_id in design.get('requirements', []):
            if req_id in matrix:
                matrix[req_id]['design'].append(design)

    # Link implementations (references to unknown IDs are skipped)
    for impl in implementations:
        for req_id in impl.get('requirements', []):
            if req_id in matrix:
                matrix[req_id]['implementation'].append(impl)

    return matrix
```

### Step 6: Generate Output Formats

Export the matrix in multiple formats for different audiences.

**Markdown Table:**

```markdown
# Traceability Matrix

| Requirement | Title | Design | Implementation | Status |
|-------------|-------|--------|----------------|--------|
| REQ-001 | User Authentication | Authentication Service | auth/service.py::authenticate_user() | ✓ Complete |
| REQ-002 | Password Reset | Authentication Service | auth/service.py::reset_password() | ✓ Complete |
| REQ-003 | Data Encryption | - | - | ❌ Missing |
```

**Generation Script:**

```python
def generate_markdown_table(matrix):
    """Generate markdown traceability table."""
    lines = [
        "# Traceability Matrix\n",
        "| Requirement | Title | Design | Implementation | Status |",
        "|-------------|-------|--------|----------------|--------|"
    ]

    for req_id, data in sorted(matrix.items()):
        req = data['requirement']
        design = ', '.join([d['name'] for d in data['design']]) or '-'
        impl = ', '.join([f"{i['file']}::{i['element']}" for i in data['implementation']]) or '-'

        # Determine status
        if data['design'] and data['implementation']:
            status = '✓ Complete'
        elif data['design'] or data['implementation']:
            status = '⚠ Partial'
        else:
            status = '❌ Missing'

        lines.append(f"| {req_id} | {req['title']} | {design} | {impl} | {status} |")

    return '\n'.join(lines)
```

**CSV Export:**
```python
import csv

def generate_csv(matrix, output_file):
    """Generate CSV traceability matrix."""
    with open(output_file, 'w', newline='') as f:
        writer = csv.writer(f)

        # Header
        writer.writerow([
            'Requirement ID',
            'Title',
            'Source',
            'Design Artifacts',
            'Implementation Files',
            'Status'
        ])

        # Data rows
        for req_id, data in sorted(matrix.items()):
            req = data['requirement']
            design_str = '; '.join([d['name'] for d in data['design']])
            impl_str = '; '.join([f"{i['file']}" for i in data['implementation']])

            if data['design'] and data['implementation']:
                status = 'Complete'
            elif data['design'] or data['implementation']:
                status = 'Partial'
            else:
                status = 'Missing'

            writer.writerow([
                req_id,
                req['title'],
                req['source'],
                design_str,
                impl_str,
                status
            ])
```

**HTML Interactive Visualization:**

```python
def generate_html_visualization(matrix, output_file):
    """Generate interactive HTML traceability matrix."""
    html = """
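<table>
<tr><th>Requirement</th><th>Title</th><th>Design</th><th>Implementation</th><th>Status</th></tr>
"""

    # One table row per requirement; status rules match the Markdown and CSV generators
    for req_id, data in sorted(matrix.items()):
        req = data['requirement']
        design = ', '.join(d['name'] for d in data['design']) or '-'
        impl = ', '.join(f"{i['file']}::{i['element']}" for i in data['implementation']) or '-'

        if data['design'] and data['implementation']:
            status_text = 'Complete'
        elif data['design'] or data['implementation']:
            status_text = 'Partial'
        else:
            status_text = 'Missing'

        html += (
            f"<tr><td>{req_id}</td><td>{req['title']}</td><td>{design}</td>"
            f"<td>{impl}</td><td>{status_text}</td></tr>\n"
        )

    html += "</table>\n"

    with open(output_file, 'w', encoding='utf-8') as f:
        f.write(html)
```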