---
name: code-analysis
description: Deep code analysis for pplx-sdk — parse Python AST, build dependency graphs, extract knowledge graphs, detect patterns, and generate actionable insights about code structure, complexity, and relationships. Use when analyzing code quality, mapping dependencies, or building understanding of the codebase.
context: fork
agent: codegraph
---

# Code Analysis — AST, Dependency Graphs & Knowledge Graphs

Parse, analyze, and visualize code structure through AST analysis, dependency graphing, and knowledge extraction. Supports **Python** (via `ast` module) and **JavaScript/TypeScript** (via `grep`-based import parsing and optional `@babel/parser` / `ts-morph`).

## When to use

Use this skill when:

- Building or updating a dependency graph of the codebase
- Analyzing imports to detect circular dependencies or layer violations
- Parsing Python AST to extract class hierarchies, function signatures, or call graphs
- Parsing JavaScript/TypeScript source to extract React component trees, ESM imports, and hook usage
- Generating a knowledge graph of code entities and their relationships
- Measuring code complexity (cyclomatic, cognitive, LOC) per module
- Identifying dead code, unused imports, or orphan modules
- Mapping how data flows through the SDK layers or SPA component hierarchy
- Understanding coupling between modules before a refactor
- Analyzing a SPA's source code structure (component graph, barrel exports, route tree)

## Instructions

### Step 1: AST Parsing

Parse Python source files to extract structured representations of code entities.
```python
import ast
from pathlib import Path


def parse_module(filepath: str) -> dict:
    """Extract entities from a Python module via AST."""
    source = Path(filepath).read_text(encoding="utf-8")
    tree = ast.parse(source, filename=filepath)

    entities = {
        "module": filepath,
        "classes": [],
        "functions": [],
        "imports": [],
        "constants": [],
    }

    func_types = (ast.FunctionDef, ast.AsyncFunctionDef)

    # Collect functions defined directly in a class body so they are reported
    # as methods only, not again as standalone functions.
    method_ids = {
        id(member)
        for cls in ast.walk(tree)
        if isinstance(cls, ast.ClassDef)
        for member in cls.body
        if isinstance(member, func_types)
    }

    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            entities["classes"].append({
                "name": node.name,
                "bases": [ast.dump(b) for b in node.bases],
                "methods": [n.name for n in node.body if isinstance(n, func_types)],
                "decorators": [ast.dump(d) for d in node.decorator_list],
                "lineno": node.lineno,
            })
        elif isinstance(node, func_types):
            if id(node) not in method_ids:
                entities["functions"].append({
                    "name": node.name,
                    "args": [arg.arg for arg in node.args.args],
                    "returns": ast.dump(node.returns) if node.returns else None,
                    "is_async": isinstance(node, ast.AsyncFunctionDef),
                    "lineno": node.lineno,
                })
        elif isinstance(node, ast.Import):
            for alias in node.names:
                entities["imports"].append({"module": alias.name, "alias": alias.asname})
        elif isinstance(node, ast.ImportFrom):
            entities["imports"].append({
                "module": node.module,
                "names": [alias.name for alias in node.names],
                "level": node.level,
            })

    # Module-level UPPER_CASE assignments are treated as constants.
    for stmt in tree.body:
        if isinstance(stmt, ast.Assign):
            for target in stmt.targets:
                if isinstance(target, ast.Name) and target.id.isupper():
                    entities["constants"].append({"name": target.id, "lineno": stmt.lineno})

    return entities
```

### Step 2: Dependency Graph Construction

Build a directed graph of module-to-module dependencies.
```bash
# Quick import graph using grep
# (-n is omitted so the second colon-separated field is the import statement)
grep -r "from pplx_sdk" pplx_sdk/ --include="*.py" | \
  awk -F: '{print $1 " -> " $2}' | \
  sed 's|pplx_sdk/||g' | sort -u

# Or using Python AST for precision
python3 -c "
import ast, os, json

graph = {}
for root, dirs, files in os.walk('pplx_sdk'):
    for f in files:
        if f.endswith('.py'):
            path = os.path.join(root, f)
            # Strip the .py suffix, then turn the path into a dotted module name
            module = path[:-3].replace(os.sep, '.')
            with open(path, encoding='utf-8') as fh:
                tree = ast.parse(fh.read())
            deps = set()
            for node in ast.walk(tree):
                if isinstance(node, ast.ImportFrom) and node.module:
                    if node.module.startswith('pplx_sdk'):
                        deps.add(node.module)
                elif isinstance(node, ast.Import):
                    for alias in node.names:
                        if alias.name.startswith('pplx_sdk'):
                            deps.add(alias.name)
            if deps:
                graph[module] = sorted(deps)
print(json.dumps(graph, indent=2))
"
```

#### Expected Layer Dependencies

```mermaid
graph TD
    subgraph Valid["✅ Valid Dependencies"]
        client --> domain
        client --> transport
        client --> shared
        domain --> transport
        domain --> shared
        domain --> core
        transport --> shared
        transport --> core
        shared --> core
    end
    subgraph Invalid["❌ Layer Violations"]
        core -.->|VIOLATION| shared
        core -.->|VIOLATION| transport
        shared -.->|VIOLATION| transport
        transport -.->|VIOLATION| domain
    end
    style Valid fill:#e8f5e9
    style Invalid fill:#ffebee
```

### Step 3: Knowledge Graph Extraction

Build a knowledge graph connecting code entities with typed relationships.
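As a rough illustration of what "typed relationships" can look like in practice, the sketch below turns one module's AST into `(subject, relationship, object)` triples. The function name `extract_triples`, the module name `pplx_sdk.demo`, and the sample source are hypothetical and exist only for this example; only a handful of relationship types are covered, and `ast.unparse` assumes Python 3.9 or newer.

```python
import ast


def extract_triples(module_name: str, source: str) -> list[tuple[str, str, str]]:
    """Return (subject, relationship, object) triples for one module.

    Hypothetical helper: only top-level statements are inspected.
    """
    tree = ast.parse(source)
    triples = []
    for node in tree.body:
        if isinstance(node, ast.ImportFrom) and node.module:
            triples.append((module_name, "IMPORTS", node.module))
        elif isinstance(node, ast.Import):
            for alias in node.names:
                triples.append((module_name, "IMPORTS", alias.name))
        elif isinstance(node, ast.ClassDef):
            triples.append((module_name, "DEFINES", node.name))
            for base in node.bases:
                triples.append((node.name, "INHERITS", ast.unparse(base)))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            triples.append((module_name, "DEFINES", node.name))
            if node.returns is not None:
                triples.append((node.name, "RETURNS", ast.unparse(node.returns)))
            # Record the exception class named in each `raise` statement.
            for inner in ast.walk(node):
                if isinstance(inner, ast.Raise) and inner.exc is not None:
                    exc = inner.exc.func if isinstance(inner.exc, ast.Call) else inner.exc
                    triples.append((node.name, "RAISES", ast.unparse(exc)))
    return triples


# Hypothetical sample source; it is parsed, never executed.
sample = '''
from pplx_sdk.core.exceptions import TransportError

class AuthenticationError(TransportError):
    pass

def stream_ask(query: str) -> Iterator[StreamChunk]:
    raise AuthenticationError("no token")
'''

for triple in extract_triples("pplx_sdk.demo", sample):
    print(triple)
```

Triples in this shape can be rendered as Mermaid edges (`subject -->|RELATIONSHIP| object`) or loaded into any graph store; relationships such as `IMPLEMENTS` and `CALLS` would need additional analysis that this sketch does not attempt.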
#### Entity Types

| Entity | Source | Example |
|--------|--------|---------|
| `Module` | File path | `pplx_sdk.transport.sse` |
| `Class` | AST ClassDef | `SSETransport`, `PerplexityClient` |
| `Function` | AST FunctionDef | `stream_ask`, `retry_with_backoff` |
| `Protocol` | typing.Protocol | `Transport`, `StreamParser` |
| `Exception` | Exception subclass | `TransportError`, `RateLimitError` |
| `Type` | TypeAlias | `Headers`, `JSONData`, `Mode` |
| `Constant` | Module-level assign | `SSE_ENDPOINT`, `DEFAULT_TIMEOUT` |

#### Relationship Types

| Relationship | Meaning | Example |
|-------------|---------|---------|
| `IMPORTS` | Module imports another | `transport.sse IMPORTS core.protocols` |
| `DEFINES` | Module defines entity | `core.exceptions DEFINES TransportError` |
| `INHERITS` | Class extends another | `AuthenticationError INHERITS TransportError` |
| `IMPLEMENTS` | Class implements protocol | `SSETransport IMPLEMENTS Transport` |
| `CALLS` | Function calls another | `stream_ask CALLS retry_with_backoff` |
| `RETURNS` | Function returns type | `stream_ask RETURNS Iterator[StreamChunk]` |
| `RAISES` | Function raises exception | `request RAISES AuthenticationError` |
| `USES_TYPE` | Function uses type hint | `request USES_TYPE Headers` |
| `BELONGS_TO` | Entity belongs to layer | `SSETransport BELONGS_TO transport` |

#### Knowledge Graph as Mermaid

```mermaid
graph LR
    subgraph core["core/"]
        Transport[/"Transport (Protocol)"/]
        PerplexitySDKError["PerplexitySDKError"]
        TransportError["TransportError"]
    end
    subgraph transport["transport/"]
        SSETransport["SSETransport"]
        HttpTransport["HttpTransport"]
    end
    subgraph shared["shared/"]
        retry["retry_with_backoff()"]
    end
    SSETransport -->|IMPLEMENTS| Transport
    HttpTransport -->|IMPLEMENTS| Transport
    TransportError -->|INHERITS| PerplexitySDKError
    SSETransport -->|RAISES| TransportError
    HttpTransport -->|CALLS| retry
    style core fill:#e1f5fe
    style transport fill:#fff3e0
    style shared fill:#f3e5f5
```

### Step 4: Code Complexity Analysis

```bash
# Lines of code per module
find pplx_sdk -name "*.py" -exec wc -l {} + | sort -n

# Cyclomatic complexity (if radon is available)
pip install radon 2>/dev/null && radon cc pplx_sdk/ -s -a

# Function count per module
# (find avoids relying on shell globstar for **; the anchored pattern skips
# "def " inside comments and strings)
find pplx_sdk -name "*.py" -exec grep -cE '^[[:space:]]*(async[[:space:]]+)?def ' {} +

# Class count per module
find pplx_sdk -name "*.py" -exec grep -cE '^[[:space:]]*class ' {} +
```

### Step 5: Pattern Detection

Detect common patterns and anti-patterns in the codebase:

| Check | Command | What to Look For |
|-------|---------|-----------------|
| Circular imports | AST import graph cycle detection | Cycles in the dependency graph |
| Layer violations | Import direction analysis | Lower layers importing higher layers |
| Unused imports | `ruff check --select F401` | Imports that are never used |
| Dead code | `vulture pplx_sdk/` (if available) | Functions/classes never called |
| Missing types | `mypy pplx_sdk/ --strict` | Untyped functions or `Any` usage |
| Large functions | AST line count per function | Functions > 50 lines |
| Deep nesting | AST indent depth analysis | Nesting > 4 levels |
| Protocol conformance | Compare class methods vs Protocol | Missing protocol method implementations |

### Step 6: JavaScript/TypeScript Code Graph (SPA)

When analyzing a SPA codebase (React, Next.js, Vite), build a code graph from JavaScript/TypeScript source files.
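When the edges need to end up in the same Python data structures as the SDK graph, line-based matching in the spirit of `grep` can also be done in Python. The following is a minimal sketch under that assumption: `extract_edges`, the `RE_EXPORTS` label, the file name `src/App.tsx`, and the sample source are all hypothetical, and only single-line statements are recognized.

```python
import re

# Single-line patterns that mirror grep-style matching. Multi-line import
# statements and imports inside comments are not handled by this sketch.
PATTERNS = {
    "IMPORTS": re.compile(r"""^\s*import\s.*?\bfrom\s+['"]([^'"]+)['"]"""),
    "RE_EXPORTS": re.compile(r"""^\s*export\s.*?\bfrom\s+['"]([^'"]+)['"]"""),
    "LAZY_LOADS": re.compile(r"""\bimport\(\s*['"]([^'"]+)['"]\s*\)"""),
}


def extract_edges(module: str, source: str) -> list[tuple[str, str, str]]:
    """Return (module, relationship, target) edges found in JS/TS source text."""
    edges = []
    for line in source.splitlines():
        for relation, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                edges.append((module, relation, match.group(1)))
    return edges


# Hypothetical sample source used only to exercise the patterns.
sample = """\
import React, { lazy } from 'react';
import { SearchBar } from './components/SearchBar';
export { useQuery } from './hooks/useQuery';
const SettingsPage = lazy(() => import('./pages/SettingsPage'));
"""

for edge in extract_edges("src/App.tsx", sample):
    print(edge)
```

A parser such as `@babel/parser` or `ts-morph` is the more reliable route when multi-line imports or type-only imports matter; the regex version trades accuracy for having no dependencies.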
#### Import Graph Extraction

```bash
# ESM imports (import ... from '...')
grep -rn "import .* from " src/ --include="*.ts" --include="*.tsx" --include="*.js" --include="*.jsx" | \
  sed "s/:/ → /" | sort -u

# Re-exports / barrel files
grep -rn "export .* from " src/ --include="*.ts" --include="*.tsx" | sort -u

# Dynamic imports (lazy loading / code splitting)
grep -rn "import(" src/ --include="*.ts" --include="*.tsx" | sort -u

# CommonJS requires (legacy)
grep -rn "require(" src/ --include="*.js" | sort -u
```

#### React Component Tree

```bash
# Find all React components (function components)
grep -rn "export \(default \)\?function \|export const .* = (" src/ --include="*.tsx" --include="*.jsx"

# Find component usage (JSX self-closing or opening tags, including tags at end of line)
grep -rnE "<[A-Z][a-zA-Z]*([ />]|$)" src/ --include="*.tsx" --include="*.jsx" | \
  grep -oE '<[A-Z][a-zA-Z]*' | sort | uniq -c | sort -rn

# Find hooks usage
grep -rn "use[A-Z][a-zA-Z]*(" src/ --include="*.ts" --include="*.tsx" | \
  grep -oE 'use[A-Z][a-zA-Z]*' | sort | uniq -c | sort -rn

# Find context providers
grep -rn "createContext\|\.Provider" src/ --include="*.tsx" --include="*.ts"
```

#### Route Tree (Next.js / React Router)

```bash
# Next.js App Router pages
find app/ -name "page.tsx" -o -name "page.jsx" -o -name "layout.tsx" 2>/dev/null

# Next.js Pages Router
find pages/ -name "*.tsx" -o -name "*.jsx" 2>/dev/null

# React Router route definitions
grep -rn "Route\|createBrowserRouter\|path:" src/ --include="*.tsx" --include="*.ts"
```

#### SPA Dependency Graph as Mermaid

```mermaid
graph TD
    subgraph pages["Pages / Routes"]
        SearchPage["SearchPage"]
        ThreadPage["ThreadPage"]
    end
    subgraph components["Components"]
        SearchBar["SearchBar"]
        ResponseView["ResponseView"]
        SourceCard["SourceCard"]
    end
    subgraph hooks["Hooks"]
        useQuery["useQuery()"]
        useStreaming["useStreaming()"]
        useAuth["useAuth()"]
    end
    subgraph services["Services / API"]
        apiClient["apiClient"]
        sseHandler["sseHandler"]
    end
    SearchPage --> SearchBar
    SearchPage --> useQuery
    ThreadPage --> ResponseView
    ThreadPage --> useStreaming
    ResponseView --> SourceCard
    useQuery --> apiClient
    useStreaming --> sseHandler
    SearchBar --> useAuth
    style pages fill:#e1f5fe
    style components fill:#fff3e0
    style hooks fill:#f3e5f5
    style services fill:#e8f5e9
```

#### SPA Entity Types

| Entity | Source | Example |
|--------|--------|---------|
| `Component` | Function returning JSX | `SearchBar`, `ResponseView` |
| `Hook` | `use*` function | `useQuery`, `useAuth` |
| `Context` | `createContext()` | `AuthContext`, `ThemeContext` |
| `Route` | Page/layout file | `/search`, `/thread/[id]` |
| `Service` | API client module | `apiClient`, `sseHandler` |
| `Store` | State management | Zustand store, Redux slice |
| `Type` | TypeScript interface/type | `SearchResult`, `ThreadData` |

#### SPA Relationship Types

| Relationship | Meaning | Example |
|-------------|---------|---------|
| `RENDERS` | Component renders another | `SearchPage RENDERS SearchBar` |
| `USES_HOOK` | Component uses a hook | `SearchPage USES_HOOK useQuery` |
| `PROVIDES` | Component provides context | `AuthProvider PROVIDES AuthContext` |
| `CONSUMES` | Component consumes context | `SearchBar CONSUMES AuthContext` |
| `CALLS_API` | Hook/service calls API endpoint | `useQuery CALLS_API /rest/search` |
| `IMPORTS` | Module imports another | `SearchPage IMPORTS SearchBar` |
| `LAZY_LOADS` | Dynamic import for code splitting | `App LAZY_LOADS SettingsPage` |
| `EXTENDS_TYPE` | Type extends another | `ThreadResponse EXTENDS_TYPE BaseResponse` |

### Step 7: Output Insights Report

Generate a structured report combining all analyses:

```markdown
## Code Analysis Report: pplx-sdk

### Module Summary
| Module | Classes | Functions | Lines | Complexity |
|--------|---------|-----------|-------|------------|
| core/protocols.py | 2 | 0 | 45 | A |
| transport/sse.py | 1 | 5 | 180 | B |
| ... | ... | ... | ... | ... |

### SPA Component Summary (when analyzing JS/TS)
| Component | Props | Hooks Used | Children | Lines |
|-----------|-------|------------|----------|-------|
| SearchPage | 2 | useQuery, useAuth | SearchBar, ResultList | 120 |
| ... | ... | ... | ... | ... |

### Dependency Graph
[Mermaid diagram]

### Knowledge Graph
- N entities, M relationships
- [Mermaid diagram]

### Layer Compliance
- ✅ No circular dependencies
- ✅ No upward layer violations
- ⚠️ 2 unused imports detected

### Complexity Hotspots
| Function | Module | CC | Lines | Recommendation |
|----------|--------|----|-------|----------------|
| `_parse_event` | transport/sse.py | 8 | 45 | Consider splitting |

### Dead Code
| Entity | Module | Last Referenced |
|--------|--------|----------------|
| ... | ... | ... |
```

## Documentation Discovery

When analyzing dependencies or researching libraries, use these discovery methods to find LLM-optimized documentation:

### llms.txt / llms-full.txt

The `llms.txt` standard provides LLM-optimized documentation at known URLs:

```bash
# Check if a dependency publishes llms.txt
curl -sf https://docs.pydantic.dev/llms.txt | head -20
curl -sf https://www.python-httpx.org/llms.txt | head -20

# Check for the full version (entire docs in one file)
curl -sf https://docs.pydantic.dev/llms-full.txt | head -20

# Use the llms-txt MCP server for indexed search
# Tools: list_llm_txt, get_llm_txt, search_llm_txt
```

### .well-known/agentskills.io

Discover agent skills published by libraries and frameworks:

```bash
# Check if a site publishes agent skills
curl -sf https://example.com/.well-known/agentskills.io/skills/ | head -20

# Look for specific SKILL.md files
curl -sf https://example.com/.well-known/agentskills.io/skills/default/SKILL.md
```

### MCP Documentation Servers

| MCP Server | Purpose | Key Tools |
|-----------|---------|-----------|
| `context7` | Library docs lookup | Context-aware search by library name |
| `deepwiki` | GitHub repo documentation | `read_wiki_structure`, `read_wiki_contents`, `ask_question` |
| `llms-txt` | llms.txt file search | `list_llm_txt`, `get_llm_txt`, `search_llm_txt` |
| `fetch` | Any URL as markdown | General-purpose URL fetching |

### Discovery Workflow

```
1. Check llms.txt at dependency's docs URL
2. Check .well-known/agentskills.io for skills
3. Query deepwiki for the dependency's GitHub repo
4. Query context7 for library-specific context
5. Fall back to fetch for raw documentation URLs
```

## Integration with Other Skills

| When code-analysis finds... | Delegate to... | Action |
|---------------------------|----------------|--------|
| Layer violation | `architect` | Produce corrected dependency diagram |
| Circular import | `code-reviewer` | Review and suggest refactor |
| Missing protocol method | `scaffolder` | Scaffold missing implementation |
| Dead code | `code-reviewer` | Confirm and remove |
| High complexity | `code-reviewer` | Review for refactor opportunity |
| New entity relationships | `architect` | Update architecture diagrams |
| SPA component tree | `spa-expert` | Cross-reference with runtime fiber tree |
| SPA API endpoints in source | `reverse-engineer` | Validate against live traffic captures |
| SPA hook dependencies | `architect` | Visualize hook → service → API chain |
| SPA barrel file cycles | `code-reviewer` | Review circular re-exports |
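
The layer-violation and circular-import rows above depend on two checks against the Step 2 dependency graph. A minimal sketch of both follows, assuming the graph maps dotted module names to lists of dotted dependencies and that layers are ranked `core < shared < transport < domain < client`, as the Expected Layer Dependencies diagram suggests. The helper names and the sample graph (including `pplx_sdk.shared.retry`) are hypothetical; the type hints assume Python 3.9 or newer.

```python
from typing import Optional

# Assumed ranking derived from the layer diagram: a module may import from its
# own layer or from any lower-ranked layer, never from a higher-ranked one.
LAYER_RANK = {"core": 0, "shared": 1, "transport": 2, "domain": 3, "client": 4}


def layer_of(module: str) -> Optional[str]:
    """Return the layer segment of a path like 'pplx_sdk.transport.sse'."""
    parts = module.split(".")
    return parts[1] if len(parts) > 1 and parts[1] in LAYER_RANK else None


def find_layer_violations(graph: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (importer, imported) pairs where a lower layer imports a higher one."""
    violations = []
    for module, deps in graph.items():
        src = layer_of(module)
        for dep in deps:
            dst = layer_of(dep)
            if src and dst and LAYER_RANK[dst] > LAYER_RANK[src]:
                violations.append((module, dep))
    return violations


def find_cycles(graph: dict[str, list[str]]) -> list[list[str]]:
    """Depth-first search that records the path of every back edge it meets."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color: dict[str, int] = {}
    stack: list[str] = []
    cycles: list[list[str]] = []

    def visit(node: str) -> None:
        color[node] = GRAY
        stack.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so this edge closes a cycle.
                cycles.append(stack[stack.index(dep):] + [dep])
            elif state == WHITE:
                visit(dep)
        stack.pop()
        color[node] = BLACK

    for node in graph:
        if color.get(node, WHITE) == WHITE:
            visit(node)
    return cycles


# Hypothetical graph: shared imports transport, which is both an upward
# layer violation and one half of a circular import.
graph = {
    "pplx_sdk.transport.sse": ["pplx_sdk.core.protocols", "pplx_sdk.shared.retry"],
    "pplx_sdk.shared.retry": ["pplx_sdk.transport.sse"],
    "pplx_sdk.core.protocols": [],
}

print(find_layer_violations(graph))
print(find_cycles(graph))
```

Non-empty results from either function are the findings that would be handed to `architect` or `code-reviewer`. The recursive search is adequate for module-level graphs; a very deep graph would call for an iterative version.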