---
name: file-converter
description: This skill handles file format conversions across documents (PDF, DOCX,
  Markdown, HTML, TXT), data files (JSON, CSV, YAML, XML, TOML), and images (PNG,
  JPG, WebP, SVG, GIF). Use when the user requests converting, transforming, or exporting
  files between formats. Generates conversion code dynamically based on the specific
  request.
author: Joseph OBrien
status: unpublished
updated: '2025-12-23'
version: 1.0.1
tag: skill
type: skill
---

# File Converter

## Overview

Convert files between formats across three categories: documents, data files, and images. Generate Python code dynamically for each conversion request, selecting appropriate libraries and handling edge cases.

## Conversion Categories

### Documents

| From | To | Recommended Library |
|------|-----|---------------------|
| Markdown | HTML | `markdown` or `mistune` |
| HTML | Markdown | `markdownify` or `html2text` |
| HTML | PDF | `weasyprint` or `pdfkit` (requires wkhtmltopdf) |
| PDF | Text | `pypdf` or `pdfplumber` |
| DOCX | Markdown | `mammoth` |
| DOCX | PDF | `docx2pdf` (Windows/macOS) or LibreOffice CLI |
| Markdown | PDF | Convert via HTML first, then to PDF |

### Data Files

| From | To | Recommended Library |
|------|-----|---------------------|
| JSON | YAML | `pyyaml` |
| YAML | JSON | `pyyaml` |
| JSON | CSV | `pandas` or stdlib `csv` + `json` |
| CSV | JSON | `pandas` or stdlib `csv` + `json` |
| JSON | TOML | `tomli`/`tomllib` (read) + `tomli-w` (write) |
| XML | JSON | `xmltodict` |
| JSON | XML | `dicttoxml` or `xmltodict.unparse` |

### Images

| From | To | Recommended Library |
|------|-----|---------------------|
| PNG/JPG/WebP/GIF | Any raster | `Pillow` (PIL) |
| SVG | PNG/JPG | `cairosvg` or `svglib` + `reportlab` |
| PNG | SVG | `potrace` (CLI) for tracing, limited fidelity |

## Workflow

1. Identify source format (from file extension or user statement)
2. Identify target format
3. Check `references/` for format-specific guidance
4. Generate conversion code using recommended library
5. Handle edge cases (encoding, transparency, nested structures)
6. Execute conversion and report results

## Quick Patterns

### Data: JSON to YAML

```python
import json
import yaml

with open("input.json") as f:
    data = json.load(f)

with open("output.yaml", "w") as f:
    yaml.dump(data, f, default_flow_style=False, allow_unicode=True)
```

### Data: CSV to JSON

```python
import csv
import json

with open("input.csv") as f:
    reader = csv.DictReader(f)
    data = list(reader)

with open("output.json", "w") as f:
    json.dump(data, f, indent=2)
```

### Document: Markdown to HTML

```python
import markdown

with open("input.md") as f:
    md_content = f.read()

html = markdown.markdown(md_content, extensions=["tables", "fenced_code"])

with open("output.html", "w") as f:
    f.write(html)
```

### Image: PNG to WebP

```python
from PIL import Image

img = Image.open("input.png")
img.save("output.webp", "WEBP", quality=85)
```

### Image: SVG to PNG

```python
import cairosvg

cairosvg.svg2png(url="input.svg", write_to="output.png", scale=2)
```

## Resources

Detailed guidance for complex conversions is in `references/`:

- `references/document-conversions.md` - PDF handling, encoding issues, styling preservation
- `references/data-conversions.md` - Schema handling, type coercion, nested structures
- `references/image-conversions.md` - Quality settings, transparency, color profiles

Consult these references when handling edge cases or when the user has specific quality/fidelity requirements.