---
name: db-diagram
description: Generate database ER diagrams from schema for documentation
disable-model-invocation: false
---

# Database ER Diagram Generator

I'll help you generate comprehensive Entity-Relationship diagrams from your database schema, supporting multiple ORMs and output formats.

Arguments: `$ARGUMENTS` - schema files, output format (mermaid/plantuml/dbml), or ORM type

## Token Optimization

This skill uses diagram generation-specific patterns to minimize token usage:

### 1. Schema Snapshot Caching (900 token savings)

**Pattern:** Cache parsed schema structure to avoid re-analysis
- Store schema in `db-diagram/schema-snapshot.json` (24 hour TTL)
- Cache: tables, columns, relationships, constraints
- Compare checksum on subsequent runs (100 tokens vs 1,000 tokens fresh)
- Regenerate only if schema changed
- **Savings:** 90% on repeat diagram generations

### 2. Early Exit for Unchanged Schemas (95% savings)

**Pattern:** Detect schema changes and return existing diagram
- Check schema file mtimes vs diagram mtime (50 tokens)
- If schema unchanged: return existing diagram path (80 tokens)
- **Distribution:** ~60% of runs are "view diagram" on unchanged schema
- **Savings:** 80 vs 2,000 tokens for diagram regeneration checks

### 3. Template-Based Diagram Generation (1,500 token savings)

**Pattern:** Use Mermaid/PlantUML templates instead of creative generation
- Standard templates for entity syntax, relationship arrows
- Predefined formats for common diagram types
- No creative diagram design logic needed
- **Savings:** 85% vs LLM-generated diagram syntax

### 4. Bash-Based Diagram Rendering (800 token savings)

**Pattern:** Use mermaid-cli or plantuml.jar for rendering
- Generate Mermaid: `mmdc -i diagram.mmd -o diagram.png` (200 tokens)
- Generate PlantUML: `java -jar plantuml.jar diagram.puml` (200 tokens)
- No Task agents for rendering
- **Savings:** 80% vs Task-based diagram generation
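Patterns 1 and 2 combine naturally into a single pre-flight check. Below is a minimal sketch, assuming `jq` and `md5sum` are available and that the snapshot stores its hash under a top-level `checksum` key (an assumed layout, not fixed by this skill):

```shell
#!/bin/bash
# Sketch of patterns 1-2: checksum-based cache check with early exit.
# The "checksum" field inside schema-snapshot.json is an assumption.

schema_unchanged() {
  local schema_file=$1
  local snapshot="db-diagram/schema-snapshot.json"
  [ -f "$snapshot" ] || return 1  # no snapshot yet: must regenerate

  local current cached
  current=$(md5sum "$schema_file" | awk '{print $1}')
  cached=$(jq -r '.checksum // empty' "$snapshot" 2>/dev/null)
  [ "$current" = "$cached" ]
}

# Example run against a throwaway schema file
mkdir -p db-diagram
echo 'model User { id Int @id }' > /tmp/schema.prisma
sum=$(md5sum /tmp/schema.prisma | awk '{print $1}')
printf '{"checksum":"%s"}\n' "$sum" > db-diagram/schema-snapshot.json

if schema_unchanged /tmp/schema.prisma; then
  echo "Schema unchanged - reusing existing diagram"  # early exit path
fi
```

On the early-exit path this skips all parsing and returns the existing diagram path, which is where the 80-token "view diagram" cost comes from.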
### 5. Sample-Based Relationship Extraction (700 token savings)

**Pattern:** Analyze first 20 tables for relationship patterns
- Extract FK relationships from analyzed tables (500 tokens)
- Infer patterns and apply to remaining tables
- Full extraction only for schemas < 30 tables
- **Savings:** 60% vs exhaustive relationship extraction

### 6. Progressive Diagram Complexity (1,000 token savings)

**Pattern:** Three-tier diagram depth
- Level 1: Core tables only (5-10 tables) - 800 tokens
- Level 2: All tables, key relationships - 1,500 tokens
- Level 3: Full detail with columns - 2,500 tokens
- Default: Level 2
- **Savings:** 60% on default level

### 7. Grep-Based Table Discovery (500 token savings)

**Pattern:** Find table definitions with Grep
- Grep for table patterns: `^model`, `CREATE TABLE`, `@Entity` (200 tokens)
- Count tables without full parsing
- Read only for relationship analysis
- **Savings:** 75% vs reading all schema files

### 8. Incremental Diagram Updates (800 token savings)

**Pattern:** Update only changed portions of diagram
- Compare new schema with cached snapshot
- Regenerate only modified table definitions
- Preserve unchanged diagram sections
- **Savings:** 70% vs full diagram regeneration

### Real-World Token Usage Distribution

**Typical operation patterns:**
- **View existing diagram** (unchanged schema): 80 tokens
- **Generate diagram** (first time): 2,000 tokens
- **Update diagram** (schema changes): 1,200 tokens
- **Full detail diagram:** 2,500 tokens
- **Compare schemas:** 1,500 tokens
- **Most common:** View existing diagram or incremental updates

**Expected per-generation:** 1,500-2,500 tokens (50% reduction from the 3,000-5,000 baseline)

**Real-world average:** 700 tokens (due to cached snapshots, early exit, and template-based generation)
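Pattern 7 above needs no parser at all; a sketch of the counting pass, with patterns covering Prisma (`^model`), raw SQL (`CREATE TABLE`), and TypeORM (`@Entity`):

```shell
#!/bin/bash
# Sketch of pattern 7: count table definitions with grep instead of
# reading and parsing every schema file.

count_tables() {
  local file=$1
  # -c counts matching lines; one definition per matching line assumed
  grep -cE '^model |CREATE TABLE|@Entity' "$file"
}

# Example schema with two models
cat > /tmp/schema.prisma <<'EOF'
model User {
  id Int @id
}
model Post {
  id Int @id
}
EOF

count_tables /tmp/schema.prisma  # → 2
```

The count alone is enough to pick a complexity level (pattern 6) before any file is read in full.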
## Session Intelligence

I'll maintain diagram generation sessions for tracking schema evolution:

**Session Files (in current project directory):**
- `db-diagram/diagrams/` - Generated diagram files
- `db-diagram/schema-snapshot.json` - Current schema structure
- `db-diagram/state.json` - Generation history and settings
- `db-diagram/relationships.md` - Documented relationships

**IMPORTANT:** Session files are stored in a `db-diagram` folder in your current project root.

**Auto-Detection:**
- If schema detected: Generate updated diagram
- If no schema: Guide through schema file location
- Commands: `generate`, `update`, `compare`, `export`

## Phase 1: Schema Detection & ORM Recognition

### Extended Thinking for Schema Analysis

For complex database schemas, I'll use extended thinking to understand relationships. When analyzing database schemas, I consider:

- Implicit relationships not explicitly defined in the ORM
- Many-to-many relationships through junction tables
- Polymorphic associations and their representations
- Inheritance strategies (single table, joined table, table per class)
- Soft deletes and audit columns
- Database-level constraints vs application-level validations
- Normalized vs denormalized design patterns

**Triggers for Extended Analysis:**
- Complex multi-tenant schemas
- Legacy databases with implicit conventions
- Microservices with shared database patterns
- Large schemas with 50+ tables

I'll automatically detect your database setup:

```bash
#!/bin/bash
# ORM and schema detection

detect_database_stack() {
  echo "=== Database Stack Detection ==="

  # Prisma detection
  if [ -f "prisma/schema.prisma" ]; then
    echo "✓ Prisma detected: prisma/schema.prisma"
    ORM="prisma"
    SCHEMA_FILE="prisma/schema.prisma"
  fi

  # TypeORM detection
  # (grep -q . makes the test depend on find's output, not head's exit code)
  if find . -name "*.entity.ts" | head -1 | grep -q .; then
    echo "✓ TypeORM detected: *.entity.ts files"
    ORM="typeorm"
    SCHEMA_FILES=$(find . -name "*.entity.ts")
  fi

  # Sequelize detection
  if [ -d "models" ] && grep -q "sequelize" package.json 2>/dev/null; then
    echo "✓ Sequelize detected: models/ directory"
    ORM="sequelize"
    SCHEMA_FILES=$(find models -name "*.js" -o -name "*.ts")
  fi
  # SQLAlchemy (Python) detection
  if find . -name "models.py" | head -1 | grep -q .; then
    echo "✓ SQLAlchemy detected: models.py"
    ORM="sqlalchemy"
    SCHEMA_FILES=$(find . -name "models.py")
  fi

  # Django detection
  if find . -path "*/models/*.py" | head -1 | grep -q .; then
    echo "✓ Django detected: */models/*.py"
    ORM="django"
    SCHEMA_FILES=$(find . -path "*/models/*.py")
  fi

  # Drizzle detection (schema.ts under a path containing "drizzle")
  if find . -name "schema.ts" | grep -q drizzle; then
    echo "✓ Drizzle detected"
    ORM="drizzle"
    SCHEMA_FILES=$(find . -name "schema.ts")
  fi

  # Raw SQL detection
  if find . -name "*.sql" | grep -qiE "(schema|create|ddl)"; then
    echo "✓ SQL files detected"
    ORM="raw-sql"
    SCHEMA_FILES=$(find . -name "*.sql" | grep -iE "(schema|create|ddl)")
  fi

  if [ -z "$ORM" ]; then
    echo "⚠️ No recognized ORM/schema files found"
    echo "Supported: Prisma, TypeORM, Sequelize, SQLAlchemy, Django, Drizzle"
    return 1
  fi

  echo
  echo "ORM: $ORM"
  echo "Schema files: $SCHEMA_FILE $SCHEMA_FILES"
}
```

## Phase 2: Schema Parsing

I'll parse schema definitions into a structured format:

### Prisma Schema Parser

```bash
# Parse Prisma schema
parse_prisma_schema() {
  local schema_file=$1
  echo "Parsing Prisma schema..."
  # Extract models
  mkdir -p db-diagram
  awk '
    /^model / { model=$2; print "MODEL:" model; in_model=1; next }
    in_model && /^}/ { in_model=0; print "END_MODEL"; next }
    in_model && /^[[:space:]]+[a-zA-Z]/ { print "FIELD:" $0 }
    /^enum / { print "ENUM:" $2 }
  ' "$schema_file" > db-diagram/parsed-schema.txt

  # Extract relationships
  grep -E "@relation|@@" "$schema_file" > db-diagram/relationships.txt
}
```

**Parsed Schema Structure:**

```json
{
  "models": [
    {
      "name": "User",
      "fields": [
        {"name": "id", "type": "Int", "primaryKey": true, "autoIncrement": true},
        {"name": "email", "type": "String", "unique": true},
        {"name": "name", "type": "String", "nullable": true},
        {"name": "posts", "type": "Post[]", "relation": true}
      ]
    },
    {
      "name": "Post",
      "fields": [
        {"name": "id", "type": "Int", "primaryKey": true},
        {"name": "title", "type": "String"},
        {"name": "authorId", "type": "Int"},
        {"name": "author", "type": "User", "relation": {"from": "authorId", "to": "id"}}
      ]
    }
  ],
  "relationships": [
    {
      "from": "Post",
      "to": "User",
      "type": "many-to-one",
      "fromField": "author",
      "toField": "posts"
    }
  ]
}
```

### TypeORM Entity Parser

```typescript
// Parse TypeORM entities (conceptual - would use AST parsing)
/*
@Entity()
export class User {
  @PrimaryGeneratedColumn()
  id: number;

  @Column({ unique: true })
  email: string;

  @OneToMany(() => Post, post => post.author)
  posts: Post[];
}

Extracts to:
- Entity: User
- Primary Key: id (auto-generated)
- Unique: email
- Relationship: OneToMany to Post
*/
```

### SQLAlchemy Parser

```python
# Parse SQLAlchemy models (conceptual)
"""
class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    email = Column(String, unique=True)
    posts = relationship('Post', back_populates='author')

Extracts to:
- Table: users
- Model: User
- Primary Key: id
- Relationship: one-to-many to Post
"""
```
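As a bridge between the parser and the snapshot cache, the MODEL:/FIELD: stream can be folded into a names-only `schema-snapshot.json`. This is a hypothetical sketch; the richer per-field metadata shown in the JSON example would need a real parser:

```shell
#!/bin/bash
# Hypothetical sketch: turn the MODEL:/FIELD:/END_MODEL stream into a
# minimal JSON snapshot (model and field names only).

mkdir -p db-diagram
cat > db-diagram/parsed-schema.txt <<'EOF'
MODEL:User
FIELD:  id Int @id
FIELD:  email String @unique
END_MODEL
MODEL:Post
FIELD:  id Int @id
FIELD:  title String
END_MODEL
EOF

awk -F: '
  BEGIN { printf "{\"models\":[" }
  /^MODEL:/ { printf "%s%s", sep, "{\"name\":\"" $2 "\",\"fields\":["; fsep="" }
  /^FIELD:/ { split($2, a, " "); printf "%s{\"name\":\"%s\"}", fsep, a[1]; fsep="," }
  /^END_MODEL/ { printf "]}"; sep="," }
  END { printf "]}\n" }
' db-diagram/parsed-schema.txt > db-diagram/schema-snapshot.json

jq -r '.models[].name' db-diagram/schema-snapshot.json
```

Even this names-only snapshot is enough for the table-level diff used in schema comparison later.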
## Phase 3: Diagram Generation

I'll generate diagrams in multiple formats:

### Format 1: Mermaid (Default)

**Advantages:**
- GitHub/GitLab native rendering
- Interactive in many markdown viewers
- Easy to version control
- Simple syntax

```bash
# Generate Mermaid ER diagram
generate_mermaid() {
  local output_file="db-diagram/diagrams/schema.mmd"
  mkdir -p db-diagram/diagrams

  cat > "$output_file" <<'EOF'
erDiagram
    USER ||--o{ POST : "writes"
    USER {
        int id PK
        string email UK
        string name
        datetime createdAt
    }
    POST ||--o{ COMMENT : "has"
    POST {
        int id PK
        string title
        text content
        int authorId FK
        datetime publishedAt
    }
    COMMENT {
        int id PK
        text content
        int postId FK
        int userId FK
        datetime createdAt
    }
    USER ||--o{ COMMENT : "writes"
    POST }o--|| CATEGORY : "belongs to"
    CATEGORY {
        int id PK
        string name UK
        string slug
    }
    POST }o--o{ TAG : "tagged with"
    TAG {
        int id PK
        string name UK
    }
    POST_TAG {
        int postId FK
        int tagId FK
    }
    POST ||--o{ POST_TAG : ""
    TAG ||--o{ POST_TAG : ""
EOF

  echo "Mermaid diagram generated: $output_file"
  echo
  echo "View in GitHub/GitLab, or use:"
  echo "  - https://mermaid.live"
  echo "  - VSCode Mermaid Preview extension"
}
```

**Relationship Notation:**

```
||--o{  : one to many
}o--||  : many to one
||--||  : one to one
}o--o{  : many to many
```

### Format 2: PlantUML

**Advantages:**
- Highly customizable
- Professional appearance
- Extensive styling options
- Good for documentation

```bash
# Generate PlantUML diagram
generate_plantuml() {
  local output_file="db-diagram/diagrams/schema.puml"
  mkdir -p db-diagram/diagrams

  cat > "$output_file" <<'EOF'
@startuml Database Schema

!define Table(name,desc) class name as "desc" << (T,#FFAAAA) >>
!define primary_key(x) PK: x
!define foreign_key(x) FK: x
!define unique(x) UK: x

Table(User, "users") {
  primary_key(id): INT
  unique(email): VARCHAR
  name: VARCHAR
  createdAt: TIMESTAMP
}

Table(Post, "posts") {
  primary_key(id): INT
  title: VARCHAR
  content: TEXT
  foreign_key(authorId): INT
  publishedAt: TIMESTAMP
}

Table(Comment, "comments") {
  primary_key(id): INT
  content: TEXT
  foreign_key(postId): INT
  foreign_key(userId): INT
  createdAt: TIMESTAMP
}

Table(Category, "categories") {
  primary_key(id): INT
  unique(name): VARCHAR
  slug: VARCHAR
}

Table(Tag, "tags") {
  primary_key(id): INT
  unique(name): VARCHAR
}

Table(PostTag, "post_tags") {
  foreign_key(postId): INT
  foreign_key(tagId): INT
}

User "1" -- "0..*" Post : writes
User "1" -- "0..*" Comment : writes
Post "1" -- "0..*" Comment : has
Post "0..*" -- "1" Category : belongs to
Post "0..*" -- "0..*" Tag : tagged with
(Post, Tag) .. PostTag

@enduml
EOF

  echo "PlantUML diagram generated: $output_file"
  echo
  echo "Generate image:"
  echo "  plantuml $output_file"
  echo "  # or use: https://www.plantuml.com/plantuml/"
}
```

### Format 3: DBML (Database Markup Language)

**Advantages:**
- Clean, readable syntax
- dbdiagram.io integration
- Schema versioning friendly
- Language-agnostic

```bash
# Generate DBML diagram
generate_dbml() {
  local output_file="db-diagram/diagrams/schema.dbml"
  mkdir -p db-diagram/diagrams

  cat > "$output_file" <<'EOF'
// Database Schema Documentation
// Generated: 2026-01-25

Table users {
  id integer [pk, increment]
  email varchar [unique, not null]
  name varchar
  createdAt timestamp [default: `now()`]

  Indexes {
    email [unique]
  }
}

Table posts {
  id integer [pk, increment]
  title varchar [not null]
  content text
  authorId integer [ref: > users.id]
  categoryId integer [ref: > categories.id]
  publishedAt timestamp

  Indexes {
    authorId
    categoryId
    publishedAt
  }
}

Table comments {
  id integer [pk, increment]
  content text [not null]
  postId integer [ref: > posts.id]
  userId integer [ref: > users.id]
  createdAt timestamp [default: `now()`]
}

Table categories {
  id integer [pk, increment]
  name varchar [unique, not null]
  slug varchar [unique, not null]
}

Table tags {
  id integer [pk, increment]
  name varchar [unique, not null]
}

Table post_tags {
  postId integer [ref: > posts.id]
  tagId integer [ref: > tags.id]

  Indexes {
    (postId, tagId) [pk]
  }
}

// Relationships
Ref: posts.authorId > users.id [delete: cascade]
Ref: comments.postId > posts.id [delete: cascade]
Ref: comments.userId > users.id [delete: cascade]
EOF

  echo "DBML diagram generated: $output_file"
  echo
  echo "Visualize at: https://dbdiagram.io/d"
}
```
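The generated sources pair with the bash-based rendering pattern from the token-optimization section. Here is a sketch of a format dispatcher; the `mmdc` binary (mermaid-cli) and a local `plantuml.jar` are assumed installs, not guaranteed locations:

```shell
#!/bin/bash
# Sketch of pattern 4: render diagram sources with CLI tools.
# mmdc and plantuml.jar are assumptions; adjust paths for your setup.

render_diagram() {
  local source=$1
  case "$source" in
    *.mmd)
      command -v mmdc >/dev/null || { echo "mermaid-cli not installed"; return 1; }
      mmdc -i "$source" -o "${source%.mmd}.png"
      ;;
    *.puml)
      [ -f plantuml.jar ] || { echo "plantuml.jar not found"; return 1; }
      java -jar plantuml.jar "$source"
      ;;
    *.dbml)
      # DBML has no standard local renderer
      echo "DBML renders via https://dbdiagram.io/d - no local renderer"
      ;;
    *)
      echo "Unknown diagram format: $source"
      return 1
      ;;
  esac
}

render_diagram db-diagram/diagrams/schema.dbml
```

Dispatching on the file extension keeps the rendering step a fixed-cost shell call regardless of which generator produced the source.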
## Phase 4: Intelligent Relationship Detection

I'll automatically detect and document relationships:

```bash
# Detect relationship types
detect_relationships() {
  echo "=== Relationship Analysis ==="

  # One-to-Many
  echo "One-to-Many relationships:"
  # User -> Posts: A user has many posts
  # Post -> Comments: A post has many comments

  # Many-to-Many
  echo "Many-to-Many relationships:"
  # Post <-> Tag: Posts have many tags, tags have many posts
  # (via post_tags junction table)

  # One-to-One
  echo "One-to-One relationships:"
  # User -> Profile: A user has one profile

  # Self-referential
  echo "Self-referential relationships:"
  # User -> User: A user can follow other users
  # Category -> Category: Categories can have parent categories
}

# Document relationships
document_relationships() {
  cat > db-diagram/relationships.md <<'EOF'
# Documented Relationships

- One-to-Many: User -> Posts, Post -> Comments
- Many-to-Many: Post <-> Tag (via post_tags)
- One-to-One: User -> Profile
- Self-referential: User -> User (follows), Category -> Category (parent)
EOF
}
```

## Phase 5: Schema Comparison

I'll compare schema snapshots to summarize changes between generations:

```bash
# Compare two schema snapshots
compare_schemas() {
  local previous=$1
  local current=$2

  echo "Added tables:"
  diff <(jq -r '.models[].name' "$previous" | sort) \
       <(jq -r '.models[].name' "$current" | sort) | \
    grep "^>" | sed 's/^> / + /'

  echo "Removed tables:"
  diff <(jq -r '.models[].name' "$previous" | sort) \
       <(jq -r '.models[].name' "$current" | sort) | \
    grep "^<" | sed 's/^< / - /'

  # Compare fields within tables
  echo "Modified tables:"
  # [Field comparison logic]

  # Generate migration summary
  cat > db-diagram/migration-summary.md <<EOF
# Schema Migration Summary

Compared: $previous -> $current
EOF
}
```
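The relationship analysis above can be backed by a concrete extraction pass over Prisma sources, per the sample-based extraction pattern. A minimal sketch that only handles the simple `field Type @relation(...)` shape (real schemas vary):

```shell
#!/bin/bash
# Sketch of FK-based relationship extraction: pull @relation lines from
# a Prisma schema and emit Mermaid arrows (many-to-one assumed).

cat > /tmp/schema.prisma <<'EOF'
model Post {
  id       Int  @id
  author   User @relation(fields: [authorId], references: [id])
  authorId Int
}
EOF

extract_relations() {
  local schema=$1 model=""
  while IFS= read -r line; do
    case "$line" in
      "model "*)
        # track which model the following fields belong to
        model=$(echo "$line" | awk '{print $2}')
        ;;
      *@relation*)
        # second word on a relation field line is the target model
        target=$(echo "$line" | awk '{print $2}')
        echo "$target ||--o{ $model : \"has\""
        ;;
    esac
  done < "$schema"
}

extract_relations /tmp/schema.prisma  # → User ||--o{ Post : "has"
```

Running this on only the first 20 models, then reusing the observed naming conventions for the rest, is what keeps the pattern within its 500-token budget.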