---
name: files-buddy
description: >-
  Safe filesystem organization, deduplication, renaming, and cleanup with cloud
  drive support. Delegates to best-in-class CLI tools. Use for file management.
  NOT for shell scripts (shell-scripter).
argument-hint: " [options]"
model: sonnet
license: MIT
metadata:
  author: wyattowalsh
  version: "1.0.0"
---

# Files Buddy

Safe filesystem organization and cleanup. Delegates to best-in-class CLI tools for deduplication, renaming, archiving, and analysis. Cloud drives (iCloud, Google Drive, Dropbox, OneDrive) are first-class citizens with auto-detection and adjusted safety.

**Scope:** File organization, cleanup, renaming, deduplication, archiving, and analysis. NOT for shell script generation (shell-scripter), CI/CD pipelines (devops-engineer), or database work (database-architect).

## Canonical Vocabulary

| Term | Definition |
|------|-----------|
| **dry-run** | Preview via tool's native mode (fclones group, f2 default, organize sim, detox -n) |
| **manifest** | JSON log at `~/.files-buddy/manifests/` enabling undo and discovery |
| **blast radius** | Total file count and size affected by an operation |
| **protected path** | Hard-blocked (system) or escalated-confirmation directory |
| **trash** | Reversible deletion via gomi / OS trash / `.files-buddy-trash/` |
| **scope pin** | Hard boundary = user-referenced directory only |
| **material drift** | Filesystem changed >10% between preview and execution |
| **batch** | Non-overlapping operations; rollback unit |
| **intent contract** | User-confirmed description (not individual file list) |
| **hardlink cluster** | Files sharing same inode — NOT duplicates |
| **tool delegation** | Invoking CLI tool via subprocess; prefer over reimplementation |
| **fallback** | Python stdlib when CLI tool is not installed |
| **evicted file** | Cloud placeholder; must materialize before size/hash analysis |
| **cloud-safe** | Operations adjusted for sync implications |
| **conflict copy** | Duplicate from sync conflict (e.g., `file (1).txt`) |
| **dashboard** | Self-contained HTML visualization opened in browser |

## Dispatch

| $ARGUMENTS | Mode | Destructive? |
|------------|------|-------------|
| `organize ` | organize | Yes (moves) |
| `clean ` | clean | Yes (trash) |
| `audit ` | audit | No (read-only) |
| `rename ` | rename | Yes (renames) |
| `flatten ` | flatten | Yes (moves) |
| `archive ` | archive | Yes (moves) |
| `sanitize ` | sanitize | Yes (renames) |
| `find ` | find | No (read-only) |
| `watch [rules]` | watch | Yes (moves) |
| `undo ` | undo | Yes (restores) |
| `dashboard [path]` | dashboard | No (writes report) |
| Empty or unrecognized | — | Show mode menu |

### Auto-Detection Heuristic

1. "sort", "organize", "tidy" + path -> **organize**
2. "duplicates", "clean", "dedup", "lint" + path -> **clean**
3. "how big", "usage", "analyze", "scan" + path -> **audit**
4. "rename", "batch rename" + pattern -> **rename**
5. "flatten", "collapse" + path -> **flatten**
6. "archive", "compress", "old files" + path -> **archive**
7. "fix names", "sanitize", "encoding" + path -> **sanitize**
8. "find", "search", "where is" + query -> **find**
9. "watch", "auto-organize", "monitor" + path -> **watch**
10. "undo", "reverse", "restore" + manifest -> **undo**
11. "dashboard", "visualize", "report" -> **dashboard**
12. Ambiguous -> ask which mode

## Structural Constraints

1. **Operation whitelist:** move, rename, copy, trash, mkdir. NEVER `rm`, `chmod`, `chown`.
2. **Scope pinning:** boundary = user-referenced directory only.
3. **Hard-blocked paths:** root/system directories from `references/protected-paths.md`, including `/`, `/System`, `/Library`, `/Applications`, `/usr`, `/bin`, `/sbin`, `/var`, `/etc`, `/private`, `/boot`, `/dev`, `/proc`, `/sys`, `/run`, `/lib`, `/lib64`
4. **Symlink resolution:** `os.path.realpath()` before plan gen. Cycles detected (max 40 hops).
5. **`.git/` always excluded.**
6. **Cloud-safe:** NEVER auto-delete cloud files. Materialize evicted files before analysis.

## Escalated-Confirmation Paths

`~/.ssh`, `~/.gnupg`, `~/.aws`, `~/.config`, `~/.kube`, `~/.local/share/keyrings`, any directory containing `.env`

Require full preview + explicit path naming + warning before any operation.

## Tiered Friction Model

| Tier | Trigger | Friction |
|------|---------|----------|
| **Low** | Rename, move within parent, <10 files | Inline plan, `[y/N]` |
| **Medium** | Cross-dir move, 10-100 files, archive | Summary preview, confirmation |
| **High** | Any trash, recursive, 100+ files, 1 GB+ | Full preview, blast radius, type "yes" |
| **Critical** | Escalated paths, cloud directories | Full preview + path naming + warning |

AI-initiated ops bias one tier higher. Cloud ops always at least Medium.

## Pre-Flight Checks

Run before every mode:

1. **Path resolution** — reject hard-blocked and escalated paths (or escalate friction)
2. **Scope boundary** — confirm scope pin to user-referenced directory
3. **Cloud detection** — check `~/Library/CloudStorage/*`, `~/Library/Mobile Documents/com~apple~CloudDocs`; tag cloud-synced dirs, adjust behavior
4. **Symlink inventory** — flag escapes outside scope, detect cycles
5. **Tool availability** — `command -v fd fclones rmlint f2 dust erd gomi ouch zstd b3sum detox convmv rclone pueue bat watchexec organize 2>/dev/null`; report missing, suggest install
6. **Permission check** — flag restricted files (`stat` for read/write access)
7. **Disk space** — verify free space >= estimated operation size
8. **Git awareness** — detect `.gitignore`, warn before moving tracked files
9. **Case sensitivity** — detect APFS case-insensitive volumes, flag rename collisions
10. **Eviction check** — materialize placeholder files before analysis (`brctl download` for iCloud, access for GDrive stream)

## Mode: organize

Sort files by type, date, project, or custom rules using organize-tool.

1. Generate organize-tool YAML config from user intent. Read `references/organization-strategies.md`
2. Run `organize sim ` (dry-run) — parse output, present preview
3. Show blast radius: file count, size, destination structure as tree
4. On confirmation, run `organize run ` with manifest logging
5. Report: operations completed, manifest path, `open ` to verify
6. **Fallback:** `shutil.move` with manual rule matching if organize-tool not installed

## Mode: clean

Remove duplicates, lint filesystem, trash temp files.

1. Read `references/duplicate-detection.md`. Run `fclones group --format json ` for duplicates
2. Run `rmlint -o json: ` for empty dirs, broken symlinks, orphan files
3. For cloud dirs: use `rclone dedupe --dry-run :path` (Google Drive duplicate filenames)
4. Present grouped findings: duplicates (with sizes), lint issues, reclaimable space
5. On confirmation, trash selected items via `gomi` (never `rm`). Log to manifest
6. **Fallback:** `hashlib` + `os.walk` for dedup; `os.listdir` for empty dirs

## Mode: audit

Analyze disk usage and find issues. Strictly read-only.

1. Run `dust -j ` for disk usage summary (JSON output)
2. Run `erd -l -s rsize ` for directory tree with sizes
3. Run `rmlint -o json: ` for lint issues (empty dirs, broken symlinks)
4. For cloud dirs: `rclone size :path` + `rclone lsjson` (materialize evicted files first)
5. Present: top space consumers, file type distribution, stale files (>1yr untouched), issues
6. Offer transitions: "Clean duplicates?" / "Archive old files?"
7. **Fallback:** `os.stat` + `os.walk` for sizes; `pathlib` for file listing

## Mode: rename

Batch rename with regex, EXIF, ID3 templates using f2.

1. Read `references/rename-patterns.md`. Construct `f2` command from user pattern
2. Run `f2 ` (dry-run by default) — parse rename table
3. Present before/after diff:
   ```diff
   - old-ugly-name_FINAL_v2.pdf
   + 2024-01-project-report.pdf
   ```
4. On confirmation, run `f2 -x ` (execute). Log to manifest
5. For undo: `f2 -u` using f2's native undo support
6. **Fallback:** `re.sub` + `os.rename` with collision detection

## Mode: flatten

Collapse nested directories to a single level.

1. Inventory target dir — build tree, count files, detect naming collisions
2. Plan flat destination with collision resolution (append `-1`, `-2`, etc.)
3. Present preview: tree before vs flat list after, collision resolutions
4. On confirmation, `shutil.move` each file. Log to manifest
5. Clean up empty directories (bottom-up traversal)
6. No CLI tool dependency — Python stdlib only

## Mode: archive

Compress old or unused files.

1. Identify candidates: files untouched >N days (user-specified or default 365)
2. Group by directory or type for archive bundles
3. Run `ouch compress ` (or `zstd` for single files)
4. Present preview: files to archive, compressed size estimate, destination
5. On confirmation, compress and optionally move originals to trash. Log to manifest
6. For cloud: `rclone move :archive/` for cloud archiving
7. **Fallback:** `tarfile` + `gzip` from Python stdlib

## Mode: sanitize

Fix filenames: remove special characters, fix encoding, normalize Unicode.

1. Run `detox -n ` (dry-run) for character cleanup preview
2. Run `convmv -f -t utf-8 --nfc ` for encoding normalization preview
3. Present before/after rename table with changes highlighted
4. On confirmation, run `detox ` and `convmv --notest -f -t utf-8 --nfc `. Log to manifest
5. **Fallback:** `re.sub` for character cleanup; `unicodedata.normalize` for NFC/NFD

## Mode: find

Smart file search with rich output. Strictly read-only.

1. Translate natural language to `fd` flags: "large PDFs" -> `fd -e pdf --size +10m `
2. Run `fd` command, pipe matches through `bat` for syntax-highlighted preview
3. Present results as markdown table: name, size, modified date, path
4. Offer transitions: "Found 42 PDFs -> organize them?" / "Found duplicates -> clean them?"
5. For cloud dirs: `rclone lsjson :path --recursive` with `--include`/`--exclude`
6. **Fallback:** `os.walk` with `fnmatch` filtering

## Mode: watch

Auto-organize files on creation using watchexec + organize-tool.

1. Generate organize-tool YAML config from user rules. Read `references/organization-strategies.md`
2. Start watcher: `watchexec -e jpg,png,pdf -- organize run ` (background)
3. Register in `~/.files-buddy/watchers.json` for persistence
4. Log all auto-organized files to manifest
5. Subcommands: `watch start `, `watch stop `, `watch list`, `watch status`
6. **Requires:** watchexec + organize-tool (no fallback — notify user to install)

## Mode: undo

Reverse a previous operation using manifest records.

1. If no manifest specified, run `manifest-manager.py list` — show recent operations
2. Load manifest, validate paths are within `~/.files-buddy/manifests/`
3. Verify completed file ops include recorded BLAKE3 hashes and that current files match manifest records — abort on mismatch or missing integrity metadata
4. Reverse operations in reverse order. Restore cloud-tagged manifests from `.files-buddy-trash/`; otherwise restore from gomi / OS trash or `.files-buddy-trash/` fallback
5. Mark manifest status = `undone`. Report restored files
6. **Fallback:** `mv` from `.files-buddy-trash/` if gomi unavailable

## Mode: dashboard

Open a visual HTML dashboard. Analysis stays read-only, but rendering writes a local report file.

1. Run audit analysis (dust, rmlint, fd) to collect data for target path
2. Generate JSON: disk usage treemap, file type distribution, duplicates, large files, stale files, operation history from manifests
3. Run `uv run python skills/files-buddy/scripts/dashboard-renderer.py --data --output --open`
4. Default dashboard path is `~/.claude/files-buddy/{YYYY-MM-DD}-dashboard.html` unless `--output` is provided
5. Opens in default browser. Includes a self-contained treemap, sortable tables, dark/light theme
6. **Fallback:** Print summary tables in terminal if browser unavailable

## Dry-Run Preview Protocol

Every destructive mode produces a preview before execution:

1. **Tool native dry-run:** `fclones group` (no `--delete`), `f2` (default), `organize sim`, `detox -n`, `ouch compress --dry-run`
2. **Preview table:** Source, destination, operation type, size
3. **Blast radius summary:** Total files, total size, directories affected
4. **Risk indicators:** `🔴` Critical, `🟠` High, `🟡` Medium, `🟢` Low
5. **Confirmation gate:** Tier-appropriate friction (see Tiered Friction Model)

## Execution Protocol

Transaction-batched execution for all destructive modes:

1. **Manifest** — atomic write (`.tmp` -> `os.rename()`) at `~/.files-buddy/manifests/{ts}-{uuid8}-{mode}.json` via `manifest-manager.py create`
2. **Batches** — dirs (parallel) -> files (parallel non-overlapping, `pueue` for cloud) -> empty dir cleanup (sequential)
3. **TOCTOU check** — if material drift >10% between preview and execution, halt and re-preview
4. **State tracking** — runtime may track preview / in-progress state internally, but manifest rows are appended once per finalized operation: `completed` or `failed`
5. **Trash** — local paths: gomi -> OS trash -> `.files-buddy-trash/`; cloud-tagged paths: `.files-buddy-trash/` only. NEVER `rm`
6. **Metadata** — finalized file ops record: source, dest, type, timestamp, BLAKE3 hash (b3sum), size, st_mode, st_ino
7. **Cloud batch** — `pueue` with `parallel 2` + delays to avoid API rate limiting
8. **Failure** — roll back current batch, preserve prior completed batches, manifest status = `partial`
9. **Report** — operations completed, manifest path, undo command
10. **Notify** — desktop notification on completion if operation took >10s (`osascript` / `notify-send`)

## Scaling Strategy

| Scope | Strategy |
|-------|----------|
| <100 files | Direct operation, full preview |
| 100-1,000 files | Batched preview (summary + sample), pueue for parallelism |
| 1,000-10,000 files | Sampling preview (10%), batched execution, progress tracking |
| 10,000+ files | Sampling preview (1%), pueue queued batches, parallel subagents |

Cloud directories: halve the parallelism, double the batch delays.

## Reference Files

Load ONE reference at a time. Do not preload all references into context.

| File | Content | Read When |
|------|---------|-----------|
| `references/tool-integrations.md` | CLI interfaces for 20+ tools: fd, fclones, rmlint, f2, dust, erdtree, gomi, ouch, zstd, b3sum, detox, convmv, rclone, pueue, bat, watchexec, organize-tool, czkawka_cli. Install commands per platform. Output parsing. | Pre-flight tool detection |
| `references/cloud-drives.md` | Detection paths (macOS CloudStorage, iCloud Mobile Documents), brctl/fileproviderctl, evicted file handling, rclone backends, Google Drive dedupe, conflict copies, rate limiting | Pre-flight cloud detection |
| `references/organization-strategies.md` | organize-tool YAML templates, extension-to-category mapping, date grouping, project detection, collision handling | Organize mode, watch mode |
| `references/protected-paths.md` | Hard-blocked paths (macOS + Linux), escalated paths, validation algorithm, `.git/` exclusion | Pre-flight checks |
| `references/duplicate-detection.md` | fclones JSON parsing, rmlint lint types, rclone dedupe modes, hardlink detection, zero-byte exclusion, NFC/NFD gotchas | Clean mode |
| `references/rename-patterns.md` | f2 patterns (regex, EXIF `{xt.make}`, ID3 `{id3.artist}`, hash `{hash.blake3}`), CSV batch, conflicts, f2 undo | Rename mode |
| `references/safety-workflow.md` | Manifest schema, atomic writes, trash hierarchy, TOCTOU drift, cloud safety rules, permission restoration, corruption recovery | Undo mode, execution protocol |

| Script | When to Run |
|--------|-------------|
| `scripts/manifest-manager.py` | Create, list, search, validate, and close manifests — all destructive modes |
| `scripts/dashboard-renderer.py` | Inject JSON data into HTML template, open browser — dashboard mode |

| Template | When to Render |
|----------|----------------|
| `templates/dashboard.html` | After audit analysis — inject data JSON into `