# Dagu

Dagu is a self-contained workflow orchestration engine for running DAGs defined in YAML. It runs as a single binary without requiring an external database or message broker. It stores state locally by default and supports local, queued, and distributed execution modes.

Use this compact reference when authoring, validating, or troubleshooting Dagu workflows with an AI agent. For the full human documentation, see https://docs.dagu.sh/.

---

# DAG Authoring

Load only the reference file that matches the task.

## Default Approach

- Prefer `type: graph` for new DAGs. It supports both sequential flow via `depends:` and parallel flow.
- Prefer `id` on every step. Omit `name` unless the display label must differ from the step ID.
- Prefer `dagu enqueue` over `dagu start` for agent-run workflows.
- Prefer `dagu schema ...` and `dagu validate ...` over guessing field names or shapes.
- Prefer `action: template.render` when generating text files, prompts, or artifacts instead of assembling them with shell `echo` or heredocs.
- Prefer `file.*` actions for local file operations such as stat, read, write, copy, move, delete, mkdir, and list instead of shelling out to `cp`, `mv`, `rm`, or `mkdir`.
- Prefer `stdout.artifact` / `stderr.artifact` when a command stream should become a DAG-run artifact, especially for large reports, JSON, Markdown, logs, or generated files.
- Prefer `artifact.*` actions for explicit artifact reads/writes/lists. Use `DAG_RUN_ARTIFACTS_DIR` only when a tool truly needs a filesystem path inside the step.
- Prefer string-form `output: VAR_NAME` for capturing small stdout values into flat variables.
- Prefer object-form `output:` when downstream steps need structured values via `${step_id.output.*}`.
- Prefer `stdout.outputs` or `action: outputs.write` when a DAG or remote action needs to return caller-visible values via `${step_id.outputs.*}`.
- Prefer `state.*` actions for small persistent JSON state across DAG runs, such as cursors, checkpoints, and previous-value comparisons.
- Prefer temporary files in the artifacts dir only when downstream steps need file paths; otherwise let commands write large artifact content to stdout and attach it with `stdout.artifact`.
- Declare portable external CLI dependencies in top-level `tools` using aqua shorthand when the binary version affects reproducibility, for example `tools: ["jqlang/jq@jq-1.7.1"]`.
- For remote actions, put `tools` in the referenced action DAG file, not in `dagu-action.yaml`; caller DAG tools are not inherited across the action boundary.
- Do not add `tools` for CLIs that intentionally depend on user or worker preconfiguration, login state, local profiles, plugins, or credentials, such as `gcloud` and AI agent CLIs.
- Use remote action packages (`dagu-action.yaml`) when reusable logic needs helper files, its own DAG, versioning, or an input/output schema contract.

## High-Signal Rules

- `output:` has two modes:
  - string form captures trimmed stdout into a flat variable such as `${VERSION}`
  - object form publishes structured step-scoped output for `${step_id.output.*}` access
- `stdout.artifact` / `stderr.artifact` store command stdout/stderr directly at relative artifact paths, for example `stdout: {artifact: reports/report.md}`. Artifact outputs auto-enable artifacts. Combining them with an explicit `artifacts.enabled: false` is invalid.
- `${step_id.stdout}` is a log file path, not stdout content.
- `env:` should use list-of-maps when values depend on earlier env vars.
- `params:` values arrive as strings. The `params:` field supports JSON Schema-like types and validation; check the schema to see how to specify types and validation rules.
- Do not assume `bash` for `run:` steps.
  If a script depends on a specific interpreter, add a shebang such as `#!/bin/sh` or `#!/usr/bin/env bash` only after checking that shell exists on the target host or container. Otherwise keep the script portable or set `with.shell:` explicitly.
- `parallel:` currently requires `action: dag.run` targeting a child DAG.
- Sub-DAGs do not inherit parent env vars; pass what you need via `params:`.
- For arbitrary text inside shell steps, prefer `printenv VAR_NAME` or `action: template.render` over `${VAR}` interpolation.
- DAG/action outputs are collected from string-form `output: VAR_NAME`, `stdout.outputs`, and `action: outputs.write`. Object-form `output:` stays step-scoped for `${step_id.output.*}` unless the workflow explicitly republishes values through `stdout.outputs` or `outputs.write`.
- `state.get`, `state.set`, `state.delete`, `state.list`, and `state.diff` persist small JSON values across DAG runs. State scopes are `dag`, `root_dag`, `global`, and `custom`; use artifacts or external storage for large payloads.
- Remote action packages define `dagu-action.yaml` with `apiVersion: v1alpha1`, `name`, `dag`, and optional `inputs`/`outputs` JSON Schemas. `inputs` validates caller `with:` before the action DAG starts; `outputs` validates the final action output object after the action DAG returns.
- Remote action manifests do not support `tools`. Declare external CLI tools in the action DAG itself so local and distributed workers prepare the right binaries for that action run.
- In remote action examples, prefer `dag: workflow.yaml` for the action DAG filename. The `dag` field accepts any safe relative file path, but `workflow.yaml` avoids confusing the executable DAG with the `dagu-action.yaml` manifest.
- Object-form `output:` with `decode: json` or `decode: yaml` can act as lightweight runtime validation. Malformed data or an unresolved `select:` path fails the step, so normal `retry_policy` applies.
- Use `dagu schema dag` to check the full list of available fields and their shapes.
- Use `dagu example` to see different DAG patterns and how to express them in YAML.

## Example of Params, Template Step, and Artifacts

```yaml
params:
  type: object
  properties:
    name:
      type: string
      maxLength: 50
    age:
      type: integer
      minimum: 0
      maximum: 120
    favorite_color:
      type: string
  required: [name, age]

steps:
  - id: render
    action: template.render
    with:
      data:
        name: ${name}
        age: ${age}
        favorite_color: ${favorite_color}
      template: |
        Hello, {{ .name }}!
        You are {{ .age }} years old.
        {{- if .favorite_color }}
        Your favorite color is {{ .favorite_color }}.
        {{- end }}
    stdout:
      artifact: greeting.txt
```

## Example of Large Command Output as Artifact

```yaml
steps:
  - id: report
    run: ./generate-report --format markdown
    stdout:
      artifact: reports/report.md
```

## Example of Reproducible External CLI

```yaml
tools:
  - jqlang/jq@jq-1.7.1

steps:
  - id: inspect
    run: jq --version
```

## Example of Object-Form Output

```yaml
steps:
  - id: inspect_build
    run: echo '{"version":"v1.2.3","artifact":{"url":"https://example.test/app.tgz"}}'
    output:
      # decode + select act as a lightweight contract check:
      # malformed JSON or a missing selected field fails the step.
      version:
        from: stdout
        decode: json
        select: .version
      artifact:
        from: stdout
        decode: json
        select: .artifact

  - id: publish
    depends: [inspect_build]
    output:
      versionLabel: "ver - ${inspect_build.output.version}"
      artifactUrl: "${inspect_build.output.artifact.url}"
```

## Example of Action Outputs

```yaml
steps:
  - id: classify
    run: ./classify.sh "${INPUT}"
    stdout:
      outputs:
        fields:
          label:
            decode: json
            select: .label
          confidence:
            decode: json
            select: .confidence

  - id: publish
    depends: [classify]
    action: outputs.write
    with:
      values:
        label: ${classify.outputs.label}
        reviewed: false
```

## Reference Guide

Load only the file you need:

- `references/steptypes.md` when choosing an action or checking executor-specific caveats such as `dag.run`, `parallel`, `jq.filter`, `file.*`, `state.*`, or `template.render`
- `references/dagu-action.md` when creating a reusable `dagu-action.yaml` package or checking action input/output schema behavior
- `references/cli.md` when you need command flags or lookup commands such as `dagu schema`, `dagu config`, or `dagu history`
- `references/env.md` when execution environment variables, `DAGU_*` config vars, or `params:`/`env:` resolution order matters
- `references/codingagent.md` only when the DAG itself runs AI coding agents as steps

---

# Actions

## run: Shell Commands And Scripts

Use top-level `run:` for local shell commands and scripts.

```yaml
steps:
  - id: hello
    run: echo "hello"

  - id: multi_line
    run: |
      echo "step 1"
      echo "step 2"

  - id: custom_shell
    run: |
      set -euo pipefail
      echo "running in bash"
    with:
      shell: /bin/bash
```

Fields:

- `run` - command string or multi-line shell script
- `with.shell` - shell interpreter, for example `/bin/bash`
- `with.shell_args` - shell interpreter arguments
- `with.shell_packages` - optional packages to install before execution

Notes:

- Dagu expands `${VAR}` before the shell runs. For large or arbitrary text, prefer `printenv VAR_NAME`, reading `${step_id.stdout}` as a file, or `action: template.render`.
- When large command output should become an artifact, write it to stdout/stderr and attach the stream directly instead of redirecting inside shell:

  ```yaml
  steps:
    - id: report
      run: ./generate-report --format markdown
      stdout:
        artifact: reports/report.md
  ```

- Use string-form `output: VAR_NAME` only for small stdout values. Large reports, JSON dumps, Markdown summaries, and logs belong in `stdout.artifact` / `stderr.artifact`.

## docker.run / container.run

Run commands in Docker containers.

```yaml
steps:
  - id: build
    action: docker.run
    with:
      image: golang:1.23
      pull: always
      auto_remove: true
      working_dir: /app
      volumes:
        - /local/src:/app
      command: go build ./...
```

`with` fields: `image`, `container_name`, `pull`, `auto_remove`, `working_dir`, `volumes`, `network`, `platform`, `command`.

## dag.run

Execute another DAG as a child DAG.

```yaml
steps:
  - id: child
    action: dag.run
    with:
      dag: child-workflow
      params:
        input: /data/file.csv
```

Sub-DAGs do not inherit parent env vars. Pass values explicitly via `with.params`.

## outputs.write

Publish DAG or remote action outputs assembled from literals, parameters, or prior step values.

```yaml
steps:
  - id: send
    run: ./scripts/notify.sh "${text}"
    output:
      response:
        from: stdout
        decode: json

  - id: publish
    depends: [send]
    action: outputs.write
    with:
      values:
        messageId: ${send.output.response.id}
        status: sent
```

Published values are available as `${publish.outputs.messageId}` in the same DAG. When the step runs inside a remote action DAG, the parent action caller reads the final action outputs as `${action_step.outputs.messageId}`.

Notes:

- `values` must be a non-empty object.
- Keep values small and JSON-compatible; use artifacts for files, reports, logs, screenshots, or large JSON payloads.
- If the remote action manifest declares an `outputs` schema, Dagu validates the final collected action output object after the action DAG returns. `outputs.write` itself does not validate the manifest.
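
As a minimal sketch of the same-DAG read path described above, a later step can reference a published value through the publishing step's ID. The step IDs and the literal `msg-123` are hypothetical; only the `outputs.write` shape and the `${publish.outputs.*}` reference form come from this reference.

```yaml
steps:
  - id: publish
    action: outputs.write
    with:
      values:
        messageId: msg-123   # hypothetical literal value
        status: sent

  - id: log_result
    depends: [publish]
    # Reads the value published by the `publish` step above.
    run: echo "published ${publish.outputs.messageId}"
```

Keeping the consumer behind `depends: [publish]` makes the ordering explicit, so the reference is only expanded after the value exists.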
## state.get / state.set / state.delete / state.list / state.diff

Read and write persistent JSON state that survives across DAG runs. Use state actions for cursors, checkpoints, and comparing the current result with the previous run. Use artifacts or external storage for large files.

```yaml
# `fetch` is assumed to be an earlier step (not shown) that publishes
# object-form output with `nextCursor` and `items`.
steps:
  - id: load_cursor
    action: state.get
    with:
      key: cursors/feed
      default: null

  - id: save_cursor
    action: state.set
    with:
      key: cursors/feed
      value: ${fetch.output.nextCursor}

  - id: detect_change
    action: state.diff
    with:
      key: snapshots/feed
      value: ${fetch.output.items}
      update: true
```

Scope fields:

- `scope` - state scope: `dag` (default), `root_dag`, `global`, or `custom`
- `namespace` - namespace override. For `custom` scope, this is required.

Default namespaces:

- `dag` - current DAG name
- `root_dag` - root DAG name for nested DAG runs
- `global` - `_`
- `custom` - no default; set `namespace`

Operation fields:

- `state.get`: `key`, optional `default`, `required`
- `state.set`: `key`, `value`, optional `expected_version`, `create_only`
- `state.delete`: `key`
- `state.list`: optional `prefix`, `limit`, `include_values`
- `state.diff`: `key`, `value`, optional `expected_version`, `update`

All state actions write JSON to stdout. Common output fields include `operation`, `scope`, `namespace`, and key or prefix information.

- `state.get` returns `found`, and when found, `value`, `version`, and `hash`. If not found and `default` is set, `value` contains the default.
- `state.set` returns `version`, `hash`, and `created`.
- `state.delete` returns `deleted`.
- `state.list` returns `entries`; entry values are omitted unless `include_values` is true.
- `state.diff` returns `changed`, `foundPrevious`, `current`, optional `previous`, and `version` / `hash` when the stored value was written or already exists.

Values must be JSON-serializable. Dagu normalizes state values before storing them and enforces the state payload size limit after normalization.
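
Because state actions write JSON to stdout, one plausible way to act on a `state.diff` result is to decode its `changed` field with object-form `output:`. The sketch below assumes that object-form `output:` can be attached to a state action step the same way it is attached to `run:` steps; the `fetch` step and its payload are hypothetical stand-ins.

```yaml
steps:
  # Hypothetical producer: emits a small JSON document on stdout.
  - id: fetch
    run: echo '{"items":["a","b"]}'
    output:
      items:
        from: stdout
        decode: json
        select: .items

  # Compares the fetched items with the stored snapshot and updates it.
  - id: detect_change
    depends: [fetch]
    action: state.diff
    with:
      key: snapshots/feed
      value: ${fetch.output.items}
      update: true
    output:
      # Assumption: the action's stdout JSON can be decoded like command stdout.
      changed:
        from: stdout
        decode: json
        select: .changed

  - id: report
    depends: [detect_change]
    run: echo "feed changed = ${detect_change.output.changed}"
```

If `select: .changed` cannot be resolved, the step fails under the object-form `output:` rules, which surfaces a malformed state response early.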
## parallel

`parallel:` currently works only with `action: dag.run`.

```yaml
steps:
  - id: fan_out
    action: dag.run
    with:
      dag: process-item
    parallel:
      items:
        - item1
        - item2
        - item3
      max_concurrent: 5

  - id: fan_out_dynamic
    action: dag.run
    with:
      dag: process-item
    parallel: ${ITEMS}
```

Each child invocation receives the current item as `ITEM`.

## ssh.run / sftp.upload / sftp.download

Remote command execution and file transfer over SSH.

```yaml
steps:
  - id: remote
    action: ssh.run
    with:
      user: deploy
      host: server.example.com
      key: ~/.ssh/id_rsa
      timeout: 60s
      command: systemctl restart app

  - id: upload
    action: sftp.upload
    with:
      user: deploy
      host: server.example.com
      key: ~/.ssh/id_rsa
      source: /local/file.tar.gz
      destination: /remote/file.tar.gz
```

Shared SSH fields: `user`, `host`, `port`, `key`, `password`, `timeout`, `strict_host_key`, `known_host_file`, `shell`, `shell_args`, `bastion`.

## http.request

HTTP requests.

```yaml
steps:
  - id: api_call
    action: http.request
    with:
      method: POST
      url: https://api.example.com/data
      headers:
        Authorization: "Bearer ${TOKEN}"
        Content-Type: application/json
      body: '{"key": "value"}'
      json: true
      timeout: 30
```

`with` fields: `method`, `url`, `timeout`, `headers`, `query`, `body`, `silent`, `debug`, `json`, `skip_tls_verify`.

## jq.filter

JSON processing.

```yaml
steps:
  - id: transform
    action: jq.filter
    with:
      filter: ".items[] | {name: .name, count: .quantity}"
      data:
        items:
          - name: a
            quantity: 1

  # `fetch_json` is assumed to be an earlier step (not shown);
  # `${fetch_json.stdout}` is the path to its stdout log file.
  - id: transform_file
    action: jq.filter
    with:
      filter: .name
      input: ${fetch_json.stdout}
```

Use `with.data` for inline JSON or `with.input` for a JSON file path. Do not set both.

## template.render

Render text using Go `text/template`.

```yaml
steps:
  - id: render
    action: template.render
    with:
      data:
        name: Alice
      template: |
        Hello, {{ .name }}!
    output: RESULT
```

`with.template` is required and is rendered as a template, not executed as shell. `with.output` writes rendered content to a file; top-level `output:` captures or publishes step output.
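
To illustrate the difference between the two `output` placements, here is a minimal sketch that writes the rendered text to a file through `with.output` instead of capturing it. The target path and data values are hypothetical, and this reference does not state whether missing parent directories are created, so the example uses a directory that normally exists.

```yaml
steps:
  - id: render_config
    action: template.render
    with:
      data:
        service: api
        replicas: 3
      template: |
        service={{ .service }}
        replicas={{ .replicas }}
      # with.output is a file path for the rendered content,
      # unlike top-level `output:`, which names a variable.
      output: /tmp/app.conf
```

Use the top-level `output:` form when a later step needs the rendered text as a value, and `with.output` when a tool needs a file on disk.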
## file.stat / file.read / file.write / file.copy / file.move / file.delete / file.mkdir / file.list

Local filesystem operations.

```yaml
steps:
  - id: ensure_output_dir
    action: file.mkdir
    with:
      path: ${DAG_RUN_ARTIFACTS_DIR}/reports

  - id: write_report
    action: file.write
    with:
      path: ${DAG_RUN_ARTIFACTS_DIR}/reports/summary.txt
      content: "status=ok\n"
      overwrite: true

  - id: copy_report
    action: file.copy
    with:
      source: ${DAG_RUN_ARTIFACTS_DIR}/reports/summary.txt
      destination: ${DAG_RUN_ARTIFACTS_DIR}/reports/latest.txt
      overwrite: true

  - id: list_reports
    action: file.list
    with:
      path: ${DAG_RUN_ARTIFACTS_DIR}/reports
      pattern: "*.txt"
```

Use `path` for `file.stat`, `file.read`, `file.write`, `file.delete`, `file.mkdir`, and `file.list`. Use `source` and `destination` for `file.copy` and `file.move`. `file.write` also requires `content`.

`with` fields: `path`, `source`, `destination`, `content`, `mode`, `format`, `pattern`, `overwrite`, `create_dirs`, `atomic`, `recursive`, `missing_ok`, `dry_run`, `include_dirs`, `follow_symlinks`, `max_bytes`.

Safety defaults:

- `overwrite` defaults to false for write, copy, and move.
- `atomic` defaults to true for file writes.
- `recursive` is required for directory copy and directory delete.
- `file.delete` refuses to delete the filesystem root.
- Copy and move reject the same source and destination, and directory copy rejects destinations inside the source tree.

## postgres.query / sqlite.query / postgres.import / sqlite.import

SQL database queries and imports.

```yaml
steps:
  - id: query
    action: postgres.query
    with:
      dsn: "postgres://user:pass@localhost:5432/db"
      query: "SELECT * FROM users WHERE active = true"
      output_format: json
      timeout: 120
      transaction: true
```

`with` fields include `dsn`, `query`, `params`, `timeout`, `transaction`, `isolation_level`, `output_format`, `headers`, `null_string`, `max_rows`, `streaming`, `output_file`, and `import`.

## redis.

Redis operations use the operation in the action name.
```yaml
steps:
  - id: cache_set
    action: redis.set
    with:
      url: "redis://localhost:6379"
      key: mykey
      value: myvalue
      ttl: 3600
```

Connection fields: `url`, `host`, `port`, `password`, `username`, `db`, TLS fields, `mode`, `timeout`, `max_retries`.

## s3.upload / s3.download / s3.list / s3.delete

S3 object operations.

```yaml
steps:
  - id: upload
    action: s3.upload
    with:
      region: us-east-1
      bucket: my-bucket
      key: data/output.csv
      source: /local/output.csv
```

Connection fields: `region`, `endpoint`, `access_key_id`, `secret_access_key`, `session_token`, `profile`, `force_path_style`.

## mail.send

Send email.

```yaml
steps:
  - id: notify
    action: mail.send
    with:
      from: noreply@example.com
      to: team@example.com
      subject: "Build Complete"
      message: "The build finished successfully."
```

SMTP server settings come from global configuration.

## archive.create / archive.extract / archive.list

Archive operations.

```yaml
steps:
  - id: compress
    action: archive.create
    with:
      source: /data/output
      destination: /data/output.tar.gz
      format: tar.gz
      exclude:
        - "*.tmp"
```

`with` fields: `source`, `destination`, `format`, `compression_level`, `password`, `overwrite`, `strip_components`, `include`, `exclude`.

## agent.run

AI agent loop with tools.

```yaml
steps:
  - id: research
    action: agent.run
    with:
      task: "Begin research on ${TOPIC}"
      model: claude-sonnet-4-20250514
      tools:
        enabled:
          - web_search
          - bash
      skills:
        - my-skill-id
      max_iterations: 50
      safe_mode: true
```

Use `with.task`, `with.prompt`, or `with.messages` for the user input.

## harness.run

Run coding agent CLIs such as Claude Code, Codex, Copilot, OpenCode, and Pi.

```yaml
harnesses:
  gemini:
    binary: gemini
    prefix_args: ["run"]
    prompt_mode: flag
    prompt_flag: --prompt

harness:
  provider: gemini
  model: gemini-2.5-pro
  fallback:
    - provider: claude
      model: sonnet

steps:
  - id: generate_tests
    action: harness.run
    with:
      prompt: "Write unit tests for the auth module"
      yolo: true
    output: RESULT
```

`with.prompt` is the prompt.
`with.stdin` is piped to stdin as supplementary context. `with.provider` can reference a built-in provider or a top-level `harnesses:` entry.

## router.route

Conditional routing based on expression value. Routes reference existing step IDs.

```yaml
steps:
  - id: check_status
    run: "curl -s -o /dev/null -w '%{http_code}' https://example.com"
    output: STATUS

  - id: route
    action: router.route
    with:
      value: ${STATUS}
      routes:
        "200":
          - handle_ok
        "re:5\\d{2}":
          - handle_error
          - send_alert
    depends: [check_status]

  - id: handle_ok
    run: echo "success"

  - id: handle_error
    run: echo "server error occurred"

  - id: send_alert
    run: echo "alerting on-call"
```

Routes are evaluated in priority order: exact matches first, then regex, then catch-all.

---

# Remote Action Packages

Use this reference when creating a reusable package-style action with `dagu-action.yaml`.

Remote actions are different from DAG-local `actions:` templates:

- DAG-local `actions:` are inline wrappers around built-in actions.
- Remote actions are directories or Git repositories that contain a manifest, a DAG entrypoint, and any helper files the action needs.
- Callers use them with `action: owner/repo@version`, `action: name@version`, or `action: source:target@version`.

## Package Layout

```text
dagu-action-notify/
├── dagu-action.yaml
├── workflow.yaml
└── scripts/
    └── notify.sh
```

This reference uses `workflow.yaml` as the recommended entrypoint DAG filename to keep it visually distinct from the `dagu-action.yaml` manifest. The `dag` field can point to any safe relative file path inside the package.

`dagu-action.yaml` supports exactly these fields:

- `apiVersion` - required, currently `v1alpha1`
- `name` - required action name
- `dag` - required relative path to the action DAG file
- `inputs` - optional JSON Schema object for the caller's `with:`
- `outputs` - optional JSON Schema object for the action output object

Unknown manifest keys are rejected. The `dag` path must resolve to a file inside the package.
## Manifest Example

```yaml
apiVersion: v1alpha1
name: notify
dag: workflow.yaml
inputs:
  type: object
  additionalProperties: false
  required: [text]
  properties:
    text:
      type: string
outputs:
  type: object
  additionalProperties: false
  required: [messageId]
  properties:
    messageId:
      type: string
    status:
      type: string
```

`inputs` validates the caller's `with:` object before the action DAG starts. JSON Schema `default` values are validated as schema defaults, but they are not applied to the caller's `with:` object before parameters are passed.

## Action DAG

The action DAG is a normal Dagu workflow. Do not set `working_dir` in the action DAG or local sub-DAGs inside the package; Dagu runs them in the materialized action workspace so relative package files are available.

```yaml
tools:
  - jqlang/jq@jq-1.7.1

params:
  - text

steps:
  - id: send
    run: ./scripts/notify.sh "${text}"
    stdout:
      outputs:
        fields:
          messageId:
            decode: json
            select: .id
          status:
            decode: json
            select: .status
```

Scalar `with:` fields are passed as runtime parameters and can be read as `${text}`. For structured input, pass an explicit JSON string and decode it in the action DAG; do not assume nested YAML/JSON input objects arrive as structured params.

## Tools

If the action DAG invokes portable external CLIs, declare them with top-level `tools` in the action DAG file. Do not put `tools` in `dagu-action.yaml`; unknown manifest keys are rejected.

Caller DAG tools are not inherited by remote actions. The action DAG is a separate DAG run, and the worker running it prepares that DAG's tools in the worker-local tools cache. Built-in-only actions do not need `tools`, but reusable action packages that call binaries such as `jq`, `yq`, or release helpers should pin those dependencies inside the action DAG.
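
As a rough sketch of the pinning advice above, an action DAG can declare `jq` in its own `tools` list and then call it directly. The step ID, the output field names, and the jq program are hypothetical; the `tools`, `params`, and `stdout.outputs` shapes follow the examples in this reference.

```yaml
# workflow.yaml inside the action package (not dagu-action.yaml)
tools:
  - jqlang/jq@jq-1.7.1

params:
  - text

steps:
  - id: wrap
    # Block scalar keeps the jq program, which contains ": ", valid as YAML.
    run: |
      jq -c -n --arg text "${text}" '{text: $text, length: ($text | length)}'
    stdout:
      outputs:
        fields:
          text:
            decode: json
            select: .text
          length:
            decode: json
            select: .length
```

Because the tool is declared in the action DAG, the worker that runs this action prepares the pinned `jq` itself, regardless of what the caller DAG declares.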
## Returning Outputs

Use `stdout.outputs` when a command emits the action result on stdout:

```yaml
steps:
  - id: classify
    run: ./classify.sh "${text}"
    stdout:
      outputs:
        fields:
          label:
            decode: json
            select: .label
          confidence:
            decode: json
            select: .confidence
```

Use `outputs.write` when the result is assembled from parameters, previous step output, or literals:

```yaml
steps:
  - id: send
    run: ./scripts/notify.sh "${text}"
    output:
      response:
        from: stdout
        decode: json

  - id: publish
    depends: [send]
    action: outputs.write
    with:
      values:
        messageId: ${send.output.response.id}
        status: sent
```

Do not use object-form `output:` to return data to the parent DAG. Object-form `output:` is step-scoped inside the action DAG and is read as `${step_id.output.*}`. To cross the action boundary, republish values with `stdout.outputs` or `outputs.write`.

If the manifest declares `outputs`, Dagu validates the final collected action output object after the action DAG returns a run result. Validation failure fails the parent action step. The action executor also writes compact output JSON to the action step stdout for compatibility, but callers should read structured values through `${step.outputs.}`.

Compatibility note: if an action DAG publishes no typed outputs, legacy string-form run outputs from `output: NAME` can be carried as action outputs. New action packages should prefer `stdout.outputs` or `outputs.write` because those define the action boundary explicitly.

## Caller Example

```yaml
steps:
  - id: notify
    action: acme/dagu-action-notify@v1.2.0
    with:
      text: "Build ${BUILD_ID} finished"

  - id: audit
    depends: [notify]
    # Single-quoted so the ": " inside the command does not break YAML parsing.
    run: 'echo "Message ID: ${notify.outputs.messageId}"'
```

## References And Workers

Reference formats:

- `name@version` - official Dagu action, resolved as `dagucloud/name`
- `owner/repo@version` - GitHub repository
- `source:target@version` - explicit local path, `file://` path, or Git source

Use immutable tags or commit SHAs for production.
Local `source:` paths are useful for development and shared-volume workers only when the worker executing the action can read the same path. For shared-nothing or heterogeneous workers, prefer GitHub or explicit Git `source:` refs. After resolution, Dagu packages the action workspace and can send that bundle to the worker running the child action DAG.

---

# Dagu CLI Reference

Global flags on all commands: `--config/-c`, `--dagu-home`, `--quiet/-q`, `--cpu-profile`

Advanced and deprecated flags below remain implemented in `internal/cmd/start.go`, `internal/cmd/enqueue.go`, `internal/cmd/exec.go`, and `internal/cmd/migrate.go`, so this reference keeps them documented even when they are mainly used by automation or backward-compatibility paths.

## Core Commands

### dagu start

Execute a DAG.

```sh
dagu start [flags] [-- params...]
```

Flags:

- `--params/-p` - Parameters (key=value or positional)
- `--name/-N` - Override DAG name
- `--run-id/-r` - Custom run ID
- `--from-run-id` - Historic dag-run ID to use as the template for a new run
- `--labels` - Additional labels (comma-separated key=value or key-only)
- `--tags` - Deprecated alias for `--labels`
- `--default-working-dir` - Default working directory for DAGs without explicit workingDir
- `--worker-id` - Worker ID executing this DAG run; auto-set in distributed mode and defaults to `local`
- `--trigger-type` - Trigger source (`scheduler`, `manual`, `webhook`, `subdag`, `retry`, `catchup`); defaults to `manual`

### dagu enqueue

Enqueue a DAG run for later execution.

```sh
dagu enqueue [flags] [-- params...]
```

Flags:

- `--params/-p` - Parameters (key=value or positional)
- `--name/-N` - Override DAG name
- `--run-id/-r` - Custom run ID
- `--queue/-u` - Override the DAG-level queue definition
- `--labels` - Additional labels (comma-separated key=value or key-only)
- `--tags` - Deprecated alias for `--labels`
- `--default-working-dir` - Default working directory for DAGs without explicit workingDir
- `--trigger-type` - Trigger source (`scheduler`, `manual`, `webhook`, `subdag`, `retry`, `catchup`); defaults to `manual`

### dagu exec

Execute a one-off command as a DAG run without a DAG YAML file.

```sh
dagu exec [flags] -- [args...]
```

Flags:

- `--run-id/-r` - Custom run ID
- `--name/-N` - Override DAG name
- `--workdir` - Working directory for the command (defaults to the current directory)
- `--shell` - Override shell binary for the command
- `--base` - Path to a base DAG YAML whose defaults are applied before inline overrides
- `--env/-E` - Environment variable (`KEY=VALUE`) to include in the run; repeatable
- `--dotenv` - Path to a dotenv file to load before execution; repeatable
- `--worker-label` - Worker label selector (`key=value`) for distributed execution; repeatable

### dagu dequeue

Dequeue a DAG run from a queue (marks it as aborted): `dagu dequeue [--dag-run/-d ]`

### dagu stop

Stop an active DAG run: `dagu stop [--run-id/-r ]`

### dagu restart

Stop and restart a DAG run: `dagu restart [--run-id/-r ]`

### dagu retry

Retry a previous DAG run using the same run ID.

```sh
dagu retry --run-id/-r [--step ] [--worker-id ]
```

### dagu dry

Dry-run a DAG without executing commands: `dagu dry [--params/-p] [--name/-N] [-- params...]`

### dagu validate

Validate DAG YAML without executing: `dagu validate `

### dagu status

Show DAG run status: `dagu status [--run-id/-r ] [--sub-run-id/-s ]`

### dagu history

Show DAG run history.
```sh
dagu history [dag-name]
```

Flags:

- `--from` - Start date/time in UTC (format: `2006-01-02` or `2006-01-02T15:04:05Z`)
- `--to` - End date/time in UTC (same formats as `--from`)
- `--last` - Relative time period (e.g. `7d`, `24h`, `1w`). Cannot combine with `--from`/`--to`
- `--status` - Filter by status: `running`, `succeeded`, `failed`, `aborted`, `queued`, `waiting`, `rejected`, `not_started`, `partially_succeeded`
- `--run-id` - Filter by run ID (partial match supported)
- `--labels` - Filter by labels (comma-separated key=value or key-only, AND logic)
- `--tags` - Deprecated alias for `--labels`
- `--format/-f` - Output format: `table` (default), `json`, `csv`
- `--limit/-l` - Max results (default 100, max 1000)

Default: shows runs from the last 30 days, newest first.

### dagu cleanup

Remove old DAG run history. Active runs are never deleted.

```sh
dagu cleanup [--retention-days ] [--dry-run] [--yes/-y]
```

### dagu schema

Show JSON schema documentation. Use a dot-separated path to drill into nested sections.

```sh
dagu schema [path]
```

Examples:

- `dagu schema dag` - All DAG root-level fields
- `dagu schema dag steps` - Step definition structure
- `dagu schema dag steps.container` - Container configuration
- `dagu schema dag steps.retry_policy` - Retry policy fields
- `dagu schema dag steps.agent` - Agent step configuration
- `dagu schema dag handler_on` - Lifecycle event hooks
- `dagu schema config` - All config root-level fields
- `dagu schema config auth` - Authentication configuration

### dagu config

Show resolved configuration paths.

```sh
dagu config
```

## Server & Scheduling

### dagu start-all

Start server + scheduler + optionally coordinator in one process. Coordinator enabled by default (disable with `DAGU_COORDINATOR_ENABLED=false`).

```sh
dagu start-all [--host/-s ] [--port/-p ] [--dags/-d ]
```

Also accepts `--coordinator.*` and `--peer.*` flags for distributed setup.

### dagu server

Start web UI + REST API.
```sh
dagu server [--host/-s ] [--port/-p ] [--dags/-d ] [--tunnel/-t]
```

### dagu scheduler

Start cron scheduler. Monitors DAGs and triggers runs on schedule; also processes queued runs.

```sh
dagu scheduler [--dags/-d ]
```

## Distributed Execution

### dagu coordinator

Start gRPC coordinator: `dagu coordinator [--coordinator.host/-H ] [--coordinator.port/-P ] [--peer.*]`

### dagu worker

Start distributed worker: `dagu worker [--worker.id/-w ] [--worker.max-active-runs/-m ] [--worker.labels/-l ] [--worker.coordinators ] [--peer.*]`

## Git Sync

`dagu sync ` - Git sync operations for DAG definitions.

| Subcommand | Description |
| ---------- | ----------- |
| `sync status` | Show sync status (repository, branch, per-DAG status) |
| `sync pull` | Pull changes from remote |
| `sync publish [dag] [--message/-m] [--all] [--force/-f]` | Publish local changes to remote |
| `sync discard [--yes/-y]` | Discard local changes, restore remote version |
| `sync forget ... [--yes/-y]` | Remove state entries for missing/untracked items |
| `sync cleanup [--dry-run] [--yes/-y]` | Remove all missing entries from sync state |
| `sync delete [--message/-m] [--force] [--all-missing] [--dry-run] [--yes/-y]` | Delete from remote, local, and sync state |
| `sync mv [--message/-m] [--force] [--dry-run] [--yes/-y]` | Rename across local, remote, and sync state |

## Other Commands

- `dagu agent [--model ] [--soul ]` - Start an interactive Dagu agent chat using the current CLI context
- `dagu agent -p [--model ] [--soul ]` - Send one non-interactive prompt to the Dagu agent
- `dagu agent history [--limit ]` - List Dagu agent sessions
- `dagu agent resume [-p ] [--model ] [--soul ]` - Resume interactively or send one non-interactive prompt to a Dagu agent session
- `dagu example [id]` - Show built-in example DAGs
- `dagu migrate history` - Migrate legacy DAG run history from the v1.16 layout to the v1.17+ format and archive the old data
- `dagu version` - Show version
- `dagu upgrade [--check] [--version/-v ] [--dry-run] [--yes/-y]` - Self-update binary
- `dagu license ` - Manage license

---

# Environment Variables

## Execution Variables

Set automatically during DAG execution. Defined in `internal/core/exec/env.go`.

### Always Available (set for every step)

| Variable | Description |
| -------- | ----------- |
| `DAG_NAME` | Name of the executing DAG |
| `DAG_RUN_ID` | Unique run identifier |
| `DAG_RUN_LOG_FILE` | Path to the main log file for the DAG run |
| `DAG_RUN_STEP_NAME` | Name of the currently executing step |
| `DAG_RUN_STEP_STDOUT_FILE` | Path to the step's stdout log file |
| `DAG_RUN_STEP_STDERR_FILE` | Path to the step's stderr log file |

### Conditionally Set

| Variable | Condition | Description |
| -------- | --------- | ----------- |
| `DAG_RUN_WORK_DIR` | Only if a per-run working directory is configured | Path to the per-DAG-run working directory |
| `DAG_DOCS_DIR` | Only if `paths.docs_dir` is configured | Per-DAG docs directory (`{docs_dir}/{dag_name}`) |
| `DAGU_PARAMS_JSON` | Only if the DAG has parameters | Resolved parameters encoded as JSON |

### Handler-Only Variables

These are only available inside lifecycle handler steps, not during normal step execution.

| Variable | Handler Scope | Description |
| -------- | ------------- | ----------- |
| `DAG_RUN_STATUS` | `onSuccess`, `onFailure`, `onAbort`, `onExit`, `onWait` | Current DAG run status (e.g., `success`, `failed`) |
| `DAG_WAITING_STEPS` | `onWait` only | Comma-separated list of step names that are waiting for approval |

## Param and Env Resolution

- `params:` values are exposed as strings. Pass structured data as JSON strings if a downstream step needs objects or arrays.
- `env:` values can reference `params:` values because parameter resolution happens first.
- Use list-of-maps for `env:` when one env var depends on another.
Go maps do not preserve evaluation order.

```yaml
params:
  base: /tmp
env:
  - ROOT: "${base}"
  - OUTPUT_DIR: "${ROOT}/out"
```

## Configuration Variables

All configuration environment variables use the `DAGU_` prefix. They map to config keys via viper bindings in `internal/cmn/config/loader.go`.

### Paths

| Variable | Default | Description |
| -------- | ------- | ----------- |
| `DAGU_HOME` | XDG dirs | Base directory for all Dagu data. When set, all paths use unified structure under this directory |
| `DAGU_DAGS_DIR` | `$DAGU_HOME/dags` | DAG YAML files directory |
| `DAGU_LOG_DIR` | `$DAGU_HOME/logs` | Log files directory |
| `DAGU_DATA_DIR` | `$DAGU_HOME/data` | Data storage directory |
| `DAGU_DAG_STATE_DIR` | `$DAGU_DATA_DIR/dag-state` | Persistent DAG state files directory |
| `DAGU_TOOLS_DIR` | `$DAGU_DATA_DIR/tools` | Managed DAG tool cache directory |
| `DAGU_DOCS_DIR` | — | Documentation directory |

### Server

| Variable | Default | Description |
| -------- | ------- | ----------- |
| `DAGU_HOST` | `127.0.0.1` | Server bind address |
| `DAGU_PORT` | `8080` | Server port |
| `DAGU_BASE_PATH` | `""` (empty) | URL base path for reverse proxy setups |
| `DAGU_TZ` | system | Timezone for schedules |

### Core

| Variable | Default | Description |
| -------- | ------- | ----------- |
| `DAGU_DEFAULT_SHELL` | — | Default shell for commands |
| `DAGU_SKIP_EXAMPLES` | `false` | Skip creating example DAGs |
| `DAGU_DEFAULT_EXECUTION_MODE` | `local` | Execution mode: `local` or `distributed` |

### Features

| Variable | Default | Description |
| -------- | ------- | ----------- |
| `DAGU_TERMINAL_ENABLED` | `false` | Enable web terminal feature |
| `DAGU_QUEUE_ENABLED` | `true` | Enable queue system |

### Authentication

| Variable | Default | Description |
| -------- | ------- | ----------- |
| `DAGU_AUTH_MODE` | `builtin` | Auth mode: `none`, `basic`, `builtin` |
| `DAGU_AUTH_BASIC_USERNAME` | — | Basic auth username (requires `auth.mode=basic`) |
| `DAGU_AUTH_BASIC_PASSWORD` | — | Basic auth password (requires `auth.mode=basic`) |

OIDC settings are available under the `DAGU_AUTH_OIDC_*` prefix (client ID, secret, issuer, scopes, role mappings, etc.).

### TLS

| Variable | Description |
| -------- | ----------- |
| `DAGU_CERT_FILE` | TLS certificate file path |
| `DAGU_KEY_FILE` | TLS key file path |

### Distributed Mode

Coordinator settings use the `DAGU_COORDINATOR_*` prefix (host, port, advertise address). Worker settings use the `DAGU_WORKER_*` prefix (worker ID, max active runs, labels, coordinator addresses, PostgreSQL pool settings).

### Other Configuration Prefixes

- **Git Sync**: `DAGU_GITSYNC_*` — repository sync settings (repo URL, branch, auth, auto-sync interval)
- **Tunnel**: `DAGU_TUNNEL_*` — Tailscale tunnel settings
- **Peer TLS**: `DAGU_PEER_*` — gRPC peer TLS settings

## Path Resolution

When `DAGU_HOME` is set, all paths use a **unified structure** under that directory:

```sh
$DAGU_HOME/
├── dags/       # DAG definitions
├── data/       # Application data
├── logs/       # Logs
│   └── admin/  # Admin logs
├── suspend/    # Suspend flags
└── base.yaml   # Base configuration
```

When `DAGU_HOME` is not set, XDG-compliant paths are used (`$XDG_CONFIG_HOME/dagu/`, `$XDG_DATA_HOME/dagu/`).

Individual path variables (e.g., `DAGU_DAGS_DIR`) override the defaults regardless of which resolution mode is active.

---

# Coding Agent Integration

Use `action: harness.run` to run AI coding agent CLIs as DAG steps. The harness executor spawns the CLI as a subprocess in non-interactive mode.

## Supported Providers

| Provider | Binary | CLI invocation |
|----------|--------|----------------|
| `claude` | `claude` | `claude -p "" [flags]` |
| `codex` | `codex` | `codex exec "" [flags]` |
| `copilot` | `copilot` | `copilot -p "" [flags]` |
| `opencode` | `opencode` | `opencode run "" [flags]` |
| `pi` | `pi` | `pi -p "" [flags]` |

The selected attempt's binary must be resolvable when it runs.
Built-in providers use `PATH`; custom harnesses can use a binary name or an explicit path resolved from the step working directory.

## How `with` Works

Harness supports built-in providers and named custom harness definitions:

- `with.provider` selects a built-in provider such as `claude`, `codex`, or `copilot`
- top-level `harnesses.` defines how to invoke a custom harness CLI
- `with.provider` can point at either a built-in provider or a custom `harnesses:` entry

All non-reserved `with` keys are passed directly as CLI flags:

- `key: "value"` → `--key value`
- `key: true` → `--key`
- `key: false` → omitted
- `key: 123` → `--key 123`
- built-in providers also normalize `snake_case` keys to kebab-case flags, so `max_turns` becomes `--max-turns`

Reserved keys are `prompt`, `stdin`, `provider`, and `fallback`. `provider` may be parameterized with `${...}` and is resolved at runtime after interpolation.

## Custom Harness Registry

Define reusable custom harness adapters once at the DAG level:

```yaml
harnesses:
  gemini:
    binary: gemini
    prefix_args: ["run"]
    prompt_mode: flag
    prompt_flag: --prompt
    option_flags:
      model: --model

steps:
  - id: review
    action: harness.run
    with:
      prompt: "Review the current branch"
      provider: gemini
      model: gemini-2.5-pro
```

Custom harness definition fields:

- `binary` — CLI binary or path
- `prefix_args` — args that always appear before prompt placement and runtime flags
- `prompt_mode` — `arg`, `flag`, or `stdin`
- `prompt_flag` — required when `prompt_mode: flag`
- `prompt_position` — `before_flags` or `after_flags`
- `flag_style` — `gnu_long` or `single_dash`
- `option_flags` — per-option override from `with` key to exact flag token

## DAG-Level Defaults and Fallback

Use top-level `harness:` to define shared defaults for every harness step in the DAG.
```yaml
harness:
  provider: claude
  model: sonnet
  bare: true
  fallback:
    - provider: codex
      full-auto: true
    - provider: copilot
      yolo: true
      silent: true

steps:
  - id: step1
    action: harness.run
    with:
      prompt: "Write tests"

  - id: step2
    action: harness.run
    with:
      prompt: "Fix bugs"
      model: opus
      effort: high

  - id: step3
    action: harness.run
    with:
      prompt: "Generate docs"
      provider: copilot
      fallback:
        - provider: claude
          model: haiku
```

Merge rules:

- DAG-level primary harness config is the base
- Step-level `with` overlays it
- Step-level `with.fallback` replaces DAG-level `fallback`
- New DAGs should use `action: harness.run`; legacy `type: harness` inference exists only for backward compatibility.

## Pattern 1: Single Agent Step

```yaml
params:
  - PROMPT: "Explain the main function in this project"

harness:
  provider: claude
  model: sonnet
  bare: true

steps:
  - id: run_agent
    action: harness.run
    with:
      prompt: "${PROMPT}"
    output: RESULT
```

## Pattern 2: Multi-Agent Pipeline

Chain agents, passing output between steps via `output:` variables or `${step_id.stdout}` file references.

```yaml
type: graph
params:
  - topic: ""

steps:
  - id: research
    action: harness.run
    with:
      prompt: "Research every approach to: ${topic}. List all approaches with pros, cons, and when to use each."
      provider: claude
      model: sonnet
      bare: true
    output: RESEARCH

  - id: review
    action: harness.run
    with:
      prompt: "Review the research provided on stdin for completeness and gaps"
      # Interpolated before execution, then piped to the harness CLI on stdin.
      stdin: |
        Review this research for completeness and gaps:
        ${RESEARCH}
      provider: codex
      full-auto: true
      skip-git-repo-check: true
    depends: [research]
    output: REVIEW

  - id: refine
    action: harness.run
    with:
      prompt: "Refine this research incorporating the review feedback provided via stdin."
      stdin: |
        === Research ===
        ${RESEARCH}
        === Review Feedback ===
        ${REVIEW}
      provider: claude
      model: sonnet
      bare: true
    depends: [review]
    output: REFINED
```

`with.prompt` is the prompt.
For built-in providers and custom `arg`/`flag` harnesses, `with.stdin` is piped to stdin as supplementary context. For custom `stdin` harnesses, stdin receives the prompt, then a blank line, then `with.stdin` when both are present.

## Pattern 3: Parameterized

```yaml
params:
  - PROVIDER: claude
  - MODEL: sonnet
  - PROMPT: "Analyze this codebase"

steps:
  - id: agent
    action: harness.run
    with:
      prompt: "${PROMPT}"
      provider: "${PROVIDER}"
      model: "${MODEL}"
    output: RESULT
```

## Provider Examples

### Claude Code

```yaml
steps:
  - id: task
    action: harness.run
    with:
      prompt: "Write tests for the auth module"
      provider: claude
      model: sonnet
      effort: high
      max-turns: 20
      max-budget-usd: 2.00
      permission-mode: auto
      allowed-tools: "Bash,Read,Edit"
      bare: true
    timeout_sec: 300
    output: RESULT
```

### Codex

```yaml
steps:
  - id: task
    action: harness.run
    with:
      prompt: "Fix failing tests in src/"
      provider: codex
      full-auto: true
      sandbox: workspace-write
      ephemeral: true
      skip-git-repo-check: true
    timeout_sec: 300
```

### Copilot

```yaml
steps:
  - id: task
    action: harness.run
    with:
      prompt: "Refactor the authentication middleware"
      provider: copilot
      autopilot: true
      yolo: true
      silent: true
      no-ask-user: true
      no-auto-update: true
    timeout_sec: 300
```

### OpenCode

```yaml
steps:
  - id: task
    action: harness.run
    with:
      prompt: "Refactor the database layer"
      provider: opencode
      format: json
    timeout_sec: 300
```

### Pi

```yaml
steps:
  - id: task
    action: harness.run
    with:
      prompt: "Design a rate limiting middleware"
      provider: pi
      thinking: high
      tools: read,bash
    timeout_sec: 300
```

## Notes

1. **Model names** — Look up current model names from each provider's documentation. Do not rely on hardcoded names; they change frequently.
2. **Prompt as a parameter** — Expose the prompt via `params:` so users can customize from UI/CLI without editing the DAG.
3. **Timeouts** — Set `timeout_sec:` (300-600s+) on agent steps. Agent CLIs can run for minutes.
4. **Retry on transient failures** — Add `retry_policy: { limit: 3, interval_sec: 30 }` to handle rate limits and network errors.
5. **Working directory** — Use `working_dir:` on the step. The CLI operates relative to this directory.
6. **Output capture** — Use string-form `output: VAR_NAME` for small flat values, object-form `output:` for structured `${step_id.output.*}` access, and `stdout.artifact` / `stderr.artifact` when large agent output, reports, JSON, Markdown, or logs should be stored as DAG-run artifacts. Use `${step_id.stdout}` only when a downstream step needs the stdout log file path.
7. **Exit codes** — 0 = success, 1 = CLI error, 124 = step timed out. Last 1KB of stderr is included in the error message on failure.
8. **Fallback behavior** — If the primary harness config fails and the context is still active, fallback entries are tried in order. Failed-attempt stdout is discarded; stderr remains visible in logs.
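
As a rough illustration of how notes 2 through 6 might combine on a single agent step, here is a minimal sketch. The step ID, prompt text, working directory, and artifact path are hypothetical placeholders, and the field placement (step-level `timeout_sec`, `retry_policy`, `working_dir`, and `stdout.artifact` next to `with`) is an assumption inferred from the examples in this reference. Confirm it with `dagu schema ...` and `dagu validate ...` before relying on it.

```yaml
# Sketch only: all names, paths, and values below are illustrative assumptions.
params:
  - PROMPT: "Summarize the open TODO comments in this repository"  # note 2: prompt exposed as a parameter

steps:
  - id: summarize                        # hypothetical step ID
    action: harness.run
    with:
      prompt: "${PROMPT}"
      provider: claude                   # no model pinned here, per note 1
      bare: true
    working_dir: /path/to/repo           # note 5: hypothetical path; the CLI runs relative to it
    timeout_sec: 600                     # note 3: agent CLIs can run for minutes
    retry_policy:                        # note 4: retry on rate limits and network errors
      limit: 3
      interval_sec: 30
    stdout:
      artifact: reports/todo-summary.md  # note 6: keep large agent output as a DAG-run artifact
```

Sending stdout to an artifact rather than a string-form `output:` variable suits this case because agent responses can be large; a short `output: VAR_NAME` capture is the better fit when a downstream step only needs a small flat value.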