` to confirm | Use a different identity context, not a different port |
| `pd begin` says session already active | Stale session for this PID | `pd whoami` | `pd done` then re-`begin`, or continue in the existing one |
| `pd session files claim` says conflict | Another agent already claimed it | `pd swarm_awareness` or read the warning | `examples/02-two-agents-same-file.md` |
| Fleet agent never fires on commit | `fleet.name != basename(projectDir)` (most common) | `scripts/fleet-validate.sh` | Fix `fleet.name`; re-`pd fleet up` |
| Fleet agent fires but does nothing | `prompt` is too vague, or `allowedTools` blocks needed action | `pd spawned` shows it ran briefly; check its log | Tighten the prompt; widen the allowlist |
| `pd salvage` shows huge backlog | Stale entries from agents that crashed long ago | `pd salvage --project ` | Triage with `scripts/salvage-triage.sh`; dismiss obviously-dead entries |
| `pd tuple rd` returns nothing | Wrong harbor or pattern shape | `pd tuple scan --harbor ` | `schemas/tuple-shape.md` for grammar |
| `pd note` writes succeed but `pd notes` returns empty | Notes encrypted but key changed | `ls -la ~/.port-daddy/master.key` perms must be `0600` | Restore key from backup; otherwise the notes are unreadable |
| Pheromone always reads near-zero | Decay outpacing spray cadence | `pd pheromone show --table --id ` | Either spray more often or raise `--strength`; pheromones are not a database |
| Two agents both `tuple in` succeed on same tuple | Bug — file feedback | `.spark/feedback/$(date +%F)-tuple-double-take.md` | The daemon should atomic-take; this is a regression |
## L3 Index — Where the Deep Knowledge Lives
This file is the L1/L2 entry. Don't paste deep references inline — load on demand:
| Need | Read |
|---|---|
| Worked end-to-end scenario | `examples/INDEX.md` (8 scenarios, including bootstrap, conflict, salvage, fleet, swarm, daemon-down, and port-collision) |
| Authoritative contract / data shape | `schemas/INDEX.md` (semantic-identity, pd-fleet schema, tuple, note, pheromone, salvage, MCP tools) |
| One-shot helper to run | `scripts/` (preflight, session-resume, salvage-triage, fleet-validate, agent-handshake) |
| Template to copy and edit | `assets/` (pd-fleet.starter.yml, .portdaddyrc.starter, session-note.template.md) |
| Full HTTP API (93+ endpoints) | `references/api-reference.md` |
| JavaScript SDK | `references/sdk-reference.md` |
| Advanced multi-agent patterns | `references/multi-agent-patterns.md` |
| `.portdaddyrc` per-project config | `references/portdaddyrc-spec.md` |
## Quick Start (Do This First)
```bash
# 1. Start your session — ALWAYS do this first
pd begin "Building auth module"
# 2. Claim a port — deterministic, never conflicts
PORT=$(pd claim myapp:api:main -q)
# 3. Leave breadcrumbs for other agents
pd note "JWT validation working, moving to refresh tokens"
# 4. Check for crashed agents whose work you can pick up
pd salvage --project myapp # Any dead agents to rescue?
# 5. End cleanly
pd done
```
## Why This Matters
Without Port Daddy:
- Port conflicts every time two agents run dev servers
- No record of what agents did or decided
- Crashed agents leave orphaned work nobody finds
- No way for agents to signal each other
- File edit collisions destroy work silently
With Port Daddy:
- Deterministic ports — same identity always gets the same port
- Immutable notes — full audit trail of every decision
- Salvage queue — dead agent work is preserved and claimable
- Pub/sub + file claims — agents coordinate without stepping on each other
- Background fleet — QA, docs, testing run automatically on every commit
- Binary IPC — microsecond-scale heartbeats and pheromone sprays over a Unix socket
- Pheromone trails — ambient numeric signals that decay over time for contention detection
- Tuple space — shared typed memory for swarm coordination
- Semantic trie — O(k) identity lookups replacing SQL LIKE scans
## Shibboleths
- If the task only needs one temporary port and no shared session state, launching fleets and pheromone trails is theater, not coordination.
- If you need hard exclusion for a critical section, use a lock; advisory file claims are for negotiation, not safety.
- If the failure is in container ingress, DNS, or production service discovery, Port Daddy is the wrong layer.
## MCP Tools Available
**Start here (high-level, one call does many things):**
| Tool | What It Does |
|------|-------------|
| `begin_session` | Register as an agent + start a session atomically |
| `end_session_full` | End session + unregister atomically |
| `whoami` | What agent am I? What session? What files do I own? |
| `catch_me_up` | What happened while I was away? Recent activity, notes, dead agents |
| `swarm_awareness` | Who else is working here? All agents, sessions, file claims |
| `file_heat` | Which files are agents fighting over? Pheromone-based contention map |
| `talk_to_agent` | Send a direct message to a specific fleet agent by name |
| `claim_port` | Get a deterministic port for a service identity |
| `add_note` | Leave an immutable breadcrumb (notes can never be deleted) |
| `acquire_lock` | Distributed lock for critical sections |
| `spawn_agent` | Launch a background AI agent with a task |
| `fleet_init` | Set up a background agent fleet with git hooks and pd-fleet.yml |
| `pd_discover` | Find additional tools by category |
**Tuple space tools (shared swarm memory):**
| Tool | What It Does |
|------|-------------|
| `tuple_out` | Write a typed tuple to the shared space (harbor-scoped) |
| `tuple_read` | Read tuples matching a pattern (non-destructive) |
| `tuple_take` | Atomically read + remove tuples matching a pattern |
| `tuple_scan` | List all tuples in a harbor or global space |
| `tuple_count` | Count tuples matching a pattern |
**Discover more tools by category:**
Call `pd_discover` with a category name: `magic`, `session-lifecycle`, `ports`, `sessions`, `notes`, `locks`, `messaging`, `agents`, `inbox`, `webhooks`, `integration`, `dns`, `briefing`, `tunnels`, `projects`, `changelog`, `activity`, `system`, `tuples`, `pheromone`
**Integration signals:** Use `integration ready` and `integration needs` to coordinate service dependencies. When your service is ready, signal it so other agents can proceed.
## Core Concepts
### Semantic Identities: `project:stack:context`
Every service gets a semantic name. The name IS the port — deterministic hashing means the same identity always maps to the same port. Identities are indexed in an in-memory **Adaptive Radix Tree** for O(k) lookups (where k is key length), replacing SQL LIKE scans.
```bash
pd claim myapp:api:main # Same port on every run (e.g. 3142; the exact value comes from the hash)
pd claim myapp:api:feature-auth # Different port, same project
pd find 'myapp:*' # Prefix search — resolves through the trie, not SQL
pd find 'myapp:*:main' # Wildcard — all stacks with context "main"
```
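To make "the name is the port" concrete, here is a minimal sketch of deterministic hashing. The hash function (SHA-256), the port range, and the name `identity_to_port` are illustrative assumptions, not Port Daddy's actual algorithm. A real daemon also has to resolve hash collisions, which this sketch ignores.

```python
import hashlib

# Assumed port range, for illustration only; Port Daddy's real range may differ.
PORT_MIN, PORT_MAX = 3100, 9999


def identity_to_port(identity: str) -> int:
    """Map a project:stack:context identity to a stable port (hypothetical scheme)."""
    digest = hashlib.sha256(identity.encode("utf-8")).digest()
    # Interpret the first 4 bytes as an unsigned integer, then fold into the range.
    value = int.from_bytes(digest[:4], "big")
    return PORT_MIN + value % (PORT_MAX - PORT_MIN + 1)


# The same identity yields the same port on every run and on every machine.
print(identity_to_port("myapp:api:main"))
print(identity_to_port("myapp:api:feature-auth"))
```

Because the port is derived from the name alone, two agents that agree on an identity agree on the port without talking to each other first.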
### Sessions & Notes
Sessions track what each agent is doing. Notes are **immutable** — once written, they can never be edited or deleted. This creates an audit trail that agents and humans can trust. Notes are **encrypted at rest** with AES-256-GCM (master key at `~/.port-daddy/master.key`, auto-generated on first boot).
```bash
pd begin --identity myapp:api --purpose "Building auth"
pd note "Found SQL injection in token validation"
pd note "Patched. Tests green."
pd done
```
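The immutability guarantee can be pictured as an append-only log. The sketch below is a conceptual model only: `Note` and `NoteLog` are hypothetical names, storage is in memory, and the AES-256-GCM encryption described above is deliberately left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: attribute assignment raises after creation
class Note:
    session_id: str
    text: str
    created_at: datetime


class NoteLog:
    """Append-only store: notes can be added and read, never edited or removed."""

    def __init__(self) -> None:
        self._notes: list[Note] = []

    def add(self, session_id: str, text: str) -> Note:
        note = Note(session_id, text, datetime.now(timezone.utc))
        self._notes.append(note)
        return note

    def for_session(self, session_id: str) -> tuple[Note, ...]:
        # Return a tuple so callers cannot mutate the underlying list.
        return tuple(n for n in self._notes if n.session_id == session_id)
```

The log exposes no update or delete method, which is what makes it usable as an audit trail.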
### Salvage (Dead Agent Recovery)
When an agent crashes, its session enters the salvage queue. Another agent can claim and continue the work:
```bash
pd salvage --project myapp # See dead agents' context
pd salvage claim dead-agent-42 # Pick up their work
```
**IMPORTANT:** Always check `pd salvage` at the start of a session. You might be able to continue where a crashed agent left off instead of starting from scratch.
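One way to picture the salvage queue: an agent that stops sending heartbeats is declared dead, and its session becomes claimable exactly once. The sketch below assumes a 30-second heartbeat timeout and uses hypothetical names (`SalvageTracker`, `sweep`, `claim`); the daemon's real detection rules may differ.

```python
import time

HEARTBEAT_TIMEOUT_S = 30.0  # assumed value, purely illustrative


class SalvageTracker:
    def __init__(self, timeout_s=HEARTBEAT_TIMEOUT_S):
        self.timeout_s = timeout_s
        self.last_seen = {}  # agent_id -> timestamp of last heartbeat
        self.salvage = {}    # dead agent_id -> claimer (None while unclaimed)

    def heartbeat(self, agent_id, now=None):
        self.last_seen[agent_id] = time.monotonic() if now is None else now

    def sweep(self, now=None):
        """Move agents whose heartbeat is stale into the salvage queue."""
        now = time.monotonic() if now is None else now
        for agent_id, seen in list(self.last_seen.items()):
            if now - seen > self.timeout_s:
                del self.last_seen[agent_id]
                self.salvage.setdefault(agent_id, None)

    def claim(self, dead_agent_id, claimer):
        """Claim a dead agent's work; False if it is absent or already claimed."""
        if self.salvage.get(dead_agent_id, "missing") is not None:
            return False
        self.salvage[dead_agent_id] = claimer
        return True
```

The single-claim rule matters: two agents resuming the same crashed session would recreate the collision that salvage exists to prevent.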
### File Claims (Advisory)
```bash
pd session files claim src/auth/*.ts
# Another agent tries the same file:
pd session files claim src/auth/login.ts
# → CONFLICT: claimed by agent 'myapp:api'
```
Claims are advisory: they warn but do not block. Hard locks on files invite deadlocks between agents; advisory claims start a conversation instead.
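The difference between an advisory claim and a lock is easiest to see in code. In this sketch (hypothetical `ClaimRegistry`, with glob matching via the standard `fnmatch` module as an assumed stand-in for whatever matching the daemon does), a conflicting claim is still recorded; the caller only gets a list of owners to talk to.

```python
from fnmatch import fnmatchcase


class ClaimRegistry:
    def __init__(self):
        self.claims = []  # list of (path_or_glob, agent)

    def claim(self, agent, path_or_glob):
        """Record the claim and return conflicting owners; never refuses."""
        conflicts = [
            owner
            for pattern, owner in self.claims
            if owner != agent
            # Overlap if either side, read as a glob, matches the other.
            and (fnmatchcase(path_or_glob, pattern) or fnmatchcase(pattern, path_or_glob))
        ]
        self.claims.append((path_or_glob, agent))
        return conflicts


registry = ClaimRegistry()
registry.claim("myapp:api", "src/auth/*.ts")
print(registry.claim("myapp:web", "src/auth/login.ts"))  # owners to negotiate with
```

Nothing here stops the second agent from editing the file, which is why critical sections need a real lock.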
### Pub/Sub Messaging
Agents signal each other through channels:
```bash
# Agent A finishes database setup
pd pub myapp:events "database-ready"
# Agent B was watching
pd watch myapp:events --exec "npm run migrate"
```
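The publish/watch pattern above reduces to a channel-to-callbacks map. This in-process sketch uses a hypothetical `Bus` class; the real system routes messages between processes through the daemon, which is not modeled here.

```python
from collections import defaultdict


class Bus:
    def __init__(self):
        self.subscribers = defaultdict(list)  # channel -> list of callbacks

    def watch(self, channel, callback):
        self.subscribers[channel].append(callback)

    def pub(self, channel, message):
        """Deliver to every watcher of the channel; return how many received it."""
        callbacks = self.subscribers.get(channel, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


bus = Bus()
received = []
bus.watch("myapp:events", received.append)  # Agent B starts watching
bus.pub("myapp:events", "database-ready")   # Agent A signals completion
```

A message published to a channel with no watchers is simply dropped in this sketch; whether the daemon buffers such messages is not stated in this document.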
### Distributed Locks
For operations that truly must be exclusive:
```bash
pd with-lock deployment -- npm run deploy
# Or manually:
pd lock db-migration --ttl 300
pd unlock db-migration
```
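A TTL is what keeps a crashed lock holder from blocking everyone forever. The sketch below models that with a hypothetical `LockTable`; it assumes the `--ttl` value is in seconds and that an expired lock can be taken by anyone, neither of which is confirmed by this document.

```python
import time


class LockTable:
    def __init__(self):
        self.locks = {}  # name -> (owner, expires_at)

    def acquire(self, name, owner, ttl_s, now=None):
        """Take the lock if it is free, expired, or already ours."""
        now = time.monotonic() if now is None else now
        held = self.locks.get(name)
        if held is not None and held[1] > now and held[0] != owner:
            return False  # someone else holds a live lock
        self.locks[name] = (owner, now + ttl_s)
        return True

    def release(self, name, owner):
        """Only the current owner may release."""
        held = self.locks.get(name)
        if held is not None and held[0] == owner:
            del self.locks[name]
            return True
        return False
```

Unlike the advisory file claims, `acquire` actually refuses the second caller, so the caller must handle a `False` result.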
## Binary IPC Protocol (v3.8.2)
High-frequency agent communication over a Unix domain socket with MessagePack encoding. The IPC channel sits alongside the HTTP API — agents that need low-latency communication (heartbeats, pheromone sprays, pub/sub publish) use IPC automatically when the daemon is running.
**Key properties:**
- **7-byte header**: `[type:1][conv_id:4][payload_len:2]` + MessagePack payload
- **70-80% bandwidth reduction** vs HTTP JSON
- **~3 µs latency** for fire-and-forget operations (vs ~200 µs over HTTP)
- **13 FIPA performatives**: INFORM, REQUEST, QUERY_REF, REFUSE, FAILURE, NOT_UNDERSTOOD, SUBSCRIBE, UNSUBSCRIBE, etc.
- **Fire-and-forget**: heartbeats, pheromone sprays, pub/sub publish (conv_id=0)
- **Request-response**: claims, locks, sessions (conv_id for correlation)
- **Pub/sub subscriptions**: with dead-man cleanup on disconnect
- **Auto-reconnect**: client reconnects with subscription replay on socket drop
- **SDK fast paths**: `heartbeat()`, `pheromoneSpray()`, `publish()` auto-use IPC when available
**Socket location:** `~/.port-daddy/daemon.ipc`
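The 7-byte header layout above can be sketched with Python's `struct` module. Byte order (big-endian here), the numeric type code, and the function names are assumptions for illustration; the payload is treated as opaque bytes rather than real MessagePack.

```python
import struct

# [type:1][conv_id:4][payload_len:2]; ">" means big-endian, no padding (assumed).
HEADER = struct.Struct(">BIH")
assert HEADER.size == 7


def encode_frame(msg_type: int, conv_id: int, payload: bytes) -> bytes:
    if len(payload) > 0xFFFF:
        raise ValueError("payload exceeds the 2-byte length field")
    return HEADER.pack(msg_type, conv_id, len(payload)) + payload


def decode_frame(buf: bytes):
    """Return (msg_type, conv_id, payload, rest), or None if the frame is incomplete."""
    if len(buf) < HEADER.size:
        return None
    msg_type, conv_id, length = HEADER.unpack_from(buf)
    end = HEADER.size + length
    if len(buf) < end:
        return None
    return msg_type, conv_id, buf[HEADER.size:end], buf[end:]


# conv_id=0 marks a fire-and-forget frame; type code 1 is a made-up example value.
frame = encode_frame(1, 0, b"\x01\x02")
print(len(frame), decode_frame(frame))
```

Returning the unconsumed `rest` lets a reader loop over a stream buffer that holds several frames or a partial one.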
**Security hardening:**
- Rate limiting: 500 frames/sec per connection
- Connection limit: 256 max (REFUSE for excess)
- 3-strike protocol violation budget (malformed frames disconnect)
- Backpressure via write queue + drain events
- Lock release on IPC disconnect
You don't need to use IPC directly. The SDK and CLI use it transparently for hot-path operations.
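For intuition about the hardening limits, the per-connection rate limit and strike budget can be modeled as below. The token-bucket mechanism, the `ConnectionGuard` name, and the choice to disconnect on the third violation are assumptions; the document states only the limits, not how they are enforced.

```python
class ConnectionGuard:
    """Per-connection limits: 500 frames/sec and a 3-strike violation budget."""

    def __init__(self, rate=500.0, max_strikes=3, now=0.0):
        self.rate = rate        # tokens refilled per second, also the burst size
        self.tokens = rate
        self.last = now
        self.strikes = 0
        self.max_strikes = max_strikes

    def allow_frame(self, now):
        """Consume one token if available; refill based on elapsed time."""
        elapsed = max(0.0, now - self.last)
        self.last = now
        self.tokens = min(self.rate, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def record_violation(self):
        """Count a malformed frame; True means the connection should be dropped."""
        self.strikes += 1
        return self.strikes >= self.max_strikes
```

A burst of 600 frames in the same instant would see 500 accepted and 100 rejected under this model, with capacity restored one second later.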
## Fleet: Background Agents (v3.8.0)
Declare agents in YAML. They fire on git commits, cron schedules, or pub/sub messages. Auto-respawn on crash with circuit breaker.
```bash
pd fleet init # Creates pd-fleet.yml + git hook
pd fleet up # Starts the fleet
git commit -m "fix auth" # QA, docs, cartographer fire automatically
pd fleet status # What is the fleet doing?
pd fleet down # Stop the fleet
```
The starter fleet includes: **QA** (bug hunting), **Documentarian** (docs sync), **Cartographer** (roadmap tracking), **Spark** (idea generation), **Spider** (cross-feature connections).
```yaml
# pd-fleet.yml
fleet:
  name: myapp
  harbor: "{project}:fleet"
agents:
  qa:
    trigger: git:committed        # React to pub/sub events
    respawn: true                 # Auto-restart on crash
    max_respawns: 3               # Circuit breaker
    backend: claude-cli
    allowedTools: "Read,Grep,Glob,Bash(npm test*)"
    prompt: "Review the last commit for bugs..."
  gardener:
    schedule: "*/10 * * * *"      # Or run on a cron schedule
    backend: custom
    prompt: "git status --porcelain"
    on_success: publish git:status  # Chain agents via channels
channels:
  git:committed:
    description: "Fired after a successful commit"
    consumers: [qa]
```
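The `respawn` / `max_respawns` pair in the YAML above behaves like a supervisor loop with a circuit breaker. This sketch assumes `max_respawns: 3` means three restarts after the initial run (four attempts in total); the function name `supervise` and the return shape are hypothetical, and real supervision would also add backoff between attempts.

```python
def supervise(run_agent, max_respawns=3):
    """Run an agent, respawning after crashes until the breaker trips.

    Returns (status, detail, respawns_used).
    """
    respawns = 0
    while True:
        try:
            return ("ok", run_agent(), respawns)
        except Exception as exc:
            if respawns >= max_respawns:
                # Breaker open: stop restarting a persistently failing agent.
                return ("circuit-open", repr(exc), respawns)
            respawns += 1
```

Without the cap, an agent that crashes on startup would restart forever and burn LLM calls on every loop.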
**Key features:**
- Works with any LLM backend: `claude-cli`, `ollama`, `gemini`, `aider`, `custom`
- Template variables (`{project}`) resolve from the YAML context
- `on_success: publish ` chains agents via pub/sub (DAG topology validated at startup)
- Fleet harbor auto-created on `pd fleet up` — all agents share a semantic namespace
- Each agent gets full PD coordination: registration, sessions, heartbeats, salvage on crash
- Auto-respawn with `respawn: true` and `max_respawns` circuit breaker
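Validating the DAG topology amounts to checking that `on_success` chains never loop back on themselves. The sketch below is one way to do that with a depth-first search; the input shape (`trigger` and `publishes` keys) and the name `find_cycle` are assumptions, not the fleet loader's real data model.

```python
def find_cycle(agents):
    """agents: name -> {"trigger": channel or None, "publishes": channel or None}.

    Returns a list of agent names forming a cycle, or None if the graph is a DAG.
    """
    # Edge A -> B when A publishes the channel that triggers B.
    edges = {
        a: [b for b, cfg_b in agents.items()
            if cfg_a.get("publishes") and cfg_b.get("trigger") == cfg_a["publishes"]]
        for a, cfg_a in agents.items()
    }
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {a: WHITE for a in agents}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in edges[node]:
            if color[nxt] == GREY:  # back edge: nxt is already on the path
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for a in agents:
        if color[a] == WHITE:
            found = visit(a)
            if found:
                return found
    return None
```

Rejecting a cycle at startup is cheaper than discovering it at runtime, where two agents would trigger each other indefinitely.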
## Tuple Space: Shared Swarm Memory (v3.8.0)
Agents write typed tuples to a shared space. Other agents query by pattern. Based on Linda (Gelernter, 1985). Harbor-scoped for fleet isolation. TTL for auto-expiry.
```bash
# Spider writes a connection it discovered
pd tuple out '["connection", "trie+pubsub=routing", "spider", 0.9]' --harbor myapp:fleet
# Spark reads all connections with confidence > 0.7
pd tuple rd '["connection", "*", "*", ">0.7"]' --harbor myapp:fleet
# Take (atomically remove) a pending task from the space
pd tuple in '["task", "build-auth", "pending"]'
# Scan all tuples in a harbor
pd tuple scan --harbor myapp:fleet
# Count tuples
pd tuple count --harbor myapp:fleet
```
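The patterns used in the commands above can be modeled with a small matcher. This sketch covers only what those commands show: exact values, the `*` wildcard, and `>N` numeric comparison. The equal-arity rule, taking only the first match, and the names `TupleSpace`, `rd`, and `take` are assumptions; harbors and TTLs are not modeled.

```python
def field_matches(pattern, value):
    if pattern == "*":
        return True
    if isinstance(pattern, str) and pattern.startswith(">"):
        try:
            threshold = float(pattern[1:])
        except ValueError:
            return pattern == value  # not a numeric comparison; compare literally
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        return is_number and value > threshold
    return pattern == value


def matches(pattern, tup):
    # Assumed rule: a pattern only matches tuples of the same length.
    return len(pattern) == len(tup) and all(
        field_matches(p, v) for p, v in zip(pattern, tup)
    )


class TupleSpace:
    def __init__(self):
        self.tuples = []

    def out(self, tup):
        self.tuples.append(list(tup))

    def rd(self, pattern):
        """Non-destructive read of every matching tuple."""
        return [t for t in self.tuples if matches(pattern, t)]

    def take(self, pattern):
        """Remove and return the first match (Linda's `in`, a Python keyword)."""
        for i, t in enumerate(self.tuples):
            if matches(pattern, t):
                return self.tuples.pop(i)
        return None
```

Because `take` removes the tuple it returns, a second worker asking for the same task gets `None`, which is the property the troubleshooting table expects the daemon to uphold.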
Pattern matching: exact values, `*` wildcard, `>N`/`