# pi-taskflow vs the Broader Agent-Orchestration Landscape (June 2026) > Cross-ecosystem comparison (beyond Pi). For the Pi-internal comparison see > [`PI-ECOSYSTEM.md`](./PI-ECOSYSTEM.md); for strategy see [`STRATEGY.md`](./STRATEGY.md). > Researched via live web search; `UNVERIFIED` = not confirmed in fetched sources. > Values: `YES` confirmed · `PARTIAL` exists with limits · `NO` absent · `PLANNED` roadmap. ## Full Matrix | Framework | Orchestration model | Authoring | Durable / resume | Dynamic fan-out | Human-in-loop | Quality gate | Static verification | Cross-run memoization | Observability | Deps | Uniquely good at | |---|---|---|---|:--:|:--:|:--:|:--:|:--:|---|:--:|---| | **▼ LOCAL / EMBEDDED DAG DSL** ||||||||||| | **pi-taskflow** | Declarative JSON-DSL DAG (10 phase types) | JSON data | phase-level input-hash cache + cross-session resume | **YES** (`map`) | YES (approval) | YES (verdict) | NO (planned) | **YES** | basic (live DAG render; no OTel) | **ZERO** | zero-dep declarative subagent DAG; phase-hash caching; when-guards; budget caps; loop·tournament·cross-run cache | | **▼ GRAPH / STATE-MACHINE** ||||||||||| | LangGraph | StateGraph nodes/edges + Pregel super-steps | Python/JS code | checkpointer every super-step; time travel | YES (`Send`) | YES (`interrupt()` any node) | NO builtin | YES (`.compile()` + typed state) | within-session only | YES (LangSmith) | heavy | checkpoint durability + time travel; largest community | | Google ADK | Directed graph + agent-tree hierarchy | Python/JS/Go/Java/Kotlin | durable memory; event-driven dormancy | YES | YES (any-node) | YES (safety framework) | YES (inferred from graph compile) | UNVERIFIED | YES (builtin logs/metrics/traces) | medium | multi-language agent-tree hierarchy; first-class delegation | | **▼ ROLE / TEAM-BASED** ||||||||||| | CrewAI | role-goal-backstory sequential/hierarchical | Python SDK | PARTIAL (no auto-recovery per Diagrid) | PARTIAL | YES (builtin HITL) | UNVERIFIED | UNVERIFIED | UNVERIFIED | YES (rich tracing + OTel) | light | fastest prototyping; 44K★; 60% Fortune-500 adoption | | AutoGen / AG2 | async actor + group-chat | Python SDK | NO (maintenance mode) | PARTIAL | PARTIAL (user proxy) | UNVERIFIED | UNVERIFIED | UNVERIFIED | NO (prototype-grade) | medium | research-style agent conversations; 58K★ | | MS Agent Framework | graph superstep/Pregel | .NET/Python SDK | PARTIAL (superstep snapshot) | YES | YES (RequestPort + resume) | UNVERIFIED | PARTIAL (topology SHA-256) | UNVERIFIED | YES (Azure Foundry + OTel) | heavy | unified AutoGen+SK successor; A2A/AG-UI/MCP; GA Apr 2026 | | **▼ DURABLE EXECUTION** ||||||||||| | Temporal | event-sourced Workflow + Activity | TS/Py/Go/Java/Ruby/C#/PHP | **GOLD STANDARD** — replay from snapshot; exactly-once | YES (child workflows) | YES (Signals, awaits indefinitely) | UNVERIFIED | PARTIAL (determinism checks) | UNVERIFIED | YES (UI + OpenMetrics + OTel) | heavy | industrial durability at scale (OpenAI Codex, Replit, Cursor) | | Inngest AgentKit | event-driven durable functions + Network/Router | TypeScript | YES (each step persisted; resume from failed step) | YES (code router + LLM routing) | UNVERIFIED | UNVERIFIED | minimal (Zod) | UNVERIFIED | YES (dashboard + traces) | medium-light | TS-native agent networking; React streaming; MCP first-class | | Mastra | step-engine + autonomous agents | TypeScript (Vercel AI SDK) | YES (`suspend()`/`resume()`; snapshot; time-travel) | YES | **first-class** (`suspend`/`resume`/`bail` typed) | **YES** (`@mastra/evals` scorers) | PARTIAL (Zod) | UNVERIFIED | YES (Studio UI) | light | batteries-included TS framework; first-class evals; ~25K★, 300K+ wk dl | | **▼ PLATFORM SDKs / LOW-CODE** ||||||||||| | OpenAI Agents SDK | agent handoff (tool delegation) | Python SDK | YES (session backends) | UNVERIFIED (1:1 handoff) | YES | PARTIAL (guardrails/tripwires) | UNVERIFIED | UNVERIFIED | YES (builtin tracing) | light | first-party handoffs; tight OpenAI integration; guardrails | | Dify | visual builder; single-agent + RAG (multi-agent not GA) | low-code canvas | UNVERIFIED | YES (branches) | PARTIAL | UNVERIFIED | UNVERIFIED | UNVERIFIED | YES (dashboard + cost) | heavy | visual AI-app builder + integrated RAG | | n8n AI Agents | event-driven automation; AI = 1 node of 400+ | low-code nodes | YES (state persisted; retry from failure) | YES (Split-in-Batches) | YES (Wait node) | UNVERIFIED | UNVERIFIED | PARTIAL (data pinning + cache node) | YES (execution history) | moderate | general automation; 400+ integrations | | AWS Bedrock multi-agent | hierarchical supervisor-collaborator | console + IaC | PARTIAL (managed state) | YES (supervisor→collaborators) | YES (confirmation prompts) | **YES** (per-agent guardrails) | UNVERIFIED | UNVERIFIED | YES (CloudWatch + X-Ray) | heavy | fully-managed enterprise multi-agent; native guardrails | ## Where pi-taskflow is already differentiated TODAY 1. **Zero dependencies.** Every competitor needs at minimum an SDK package; most need a server/DB/cloud. 2. **Declarative-as-data authoring.** The entire DAG is data → programmatic generation, diffing, and static reasoning without execution. Competitors are code-first or visual. 3. **Phase-level input-hash caching.** Precise invalidation: unchanged inputs → cached outputs. LangGraph checkpoints super-steps but doesn't content-address phases. 4. **Quality gate verdict as a first-class DAG phase** (not an eval add-on or a safety filter). 5. **Budget caps per phase** with hard enforcement (others only *track* usage). ## Where competitors currently beat pi-taskflow | Capability | Winner | Our gap | |---|---|---| | Gold-standard durability (event-sourcing, crash recovery, exactly-once) | Temporal | session resume only; mid-phase crash may re-execute | | Node-level checkpoint + time-travel | LangGraph | phase-level only | | Evaluation/scoring framework | Mastra (`@mastra/evals`) | verdicts but no eval framework to produce them | | Visual low-code builder | Dify, n8n | JSON-DSL only | | Fast prototyping with role metaphor | CrewAI | JSON authoring, no role abstraction | | Built-in guardrails | OpenAI Agents SDK, Bedrock | gate but no inline guardrails | | Multi-language agents | Google ADK, Temporal | TS/Node only | | Managed enterprise deployment | AWS Bedrock | self-hosted | | Production observability (OTel/Prometheus) | most | live DAG render only | | Community & ecosystem | CrewAI/AutoGen/Mastra | niche project | ## White space nobody owns yet (surpass targets) | # | White space | State across all competitors | Our opportunity | |---|-------------|------------------------------|-----------------| | 1 | **Zero-token static DAG verification** | none has a dead-phase/unreachable/ref analyzer that runs without an LLM | ship `verify` — structural correctness for 0 tokens | | 2 | **Cross-run memoization keyed on phase input hash** | nobody (Temporal=within-run, LangGraph=within-session) | `cache` — **✅ shipped** (git/glob/file/env fingerprints, TTL, LRU eviction) | | 3 | **Declarative-as-data multi-target compilation** | nobody (all runtime-coupled) | "LLVM of agent orchestration" — compile one DSL to many runtimes | | 4 | **Typed human-approval verdict schemas** | most have generic pause/approve | formalize verdict outcomes + auto-routing | | 5 | **Budget-aware DAG with hard enforcement** | all track, none enforce | budget pools + pre-flight cost estimation | | 6 | **Subagent-native orchestration** | none targets a coding agent's internal subagent pipeline | defensible specialization for Pi's AGENTS.md routing | | 7 | **Worktree-isolated phase execution** | none isolates per-phase filesystem | worktree-per-phase with explicit merge | | 8 | **Tournament/bracket pattern** | none has rank-and-promote | `tournament` phase type — **✅ shipped** | | 9 | **Loop-until-done with convergence detection** | LangGraph has cycles, none has declarative convergence loop | `loop` phase type — **✅ shipped** (until+convergence+maxIterations) | ## Key Insights - **Nobody owns cross-run memoization** — pi-taskflow shipped it (`cache` → persistent store with fingerprint guards); nobody else has equivalent. - **Temporal is the only true durability** — we don't need to match it; Mastra-level suspend/resume + phase-hash caching is enough for subagent orchestration. - **The visual-builder gap is real but may not matter** for our target users (Pi subagent operators). - **Multi-target compilation is the deepest moat** — competitors are runtime-coupled and structurally can't do it. --- *All cells grounded in public docs/blogs as of 2026-06-08. No private docs or live testing. Diagrid's March 2026 durability analysis applies: only Temporal offers true Durable Execution.*