---
name: adk-deploy-guide
description: >
  MUST READ before deploying any ADK agent.
  ADK deployment guide: Agent Engine, Cloud Run, GKE, CI/CD pipelines,
  secrets, observability, and production workflows.
  Use when deploying agents to Google Cloud or troubleshooting deployments.
  Do NOT use for API code patterns (use adk-cheatsheet),
  evaluation (use adk-eval-guide), or project scaffolding (use adk-scaffold).
metadata:
  license: Apache-2.0
  author: Google
---

# ADK Deployment Guide

> **Scaffolded project?** Use the `make` commands throughout this guide; they wrap Terraform, Docker, and deployment into a tested pipeline.
>
> **No scaffold?** See [Quick Deploy](#quick-deploy-adk-cli) below, or the [ADK deployment docs](https://google.github.io/adk-docs/deploy/).
> For production infrastructure, scaffold with `/adk-scaffold`.

### Reference Files

For deeper details, consult these reference files in `references/`:

- **`cloud-run.md`**: Scaling defaults, Dockerfile, session types, networking
- **`agent-engine.md`**: deploy.py CLI, AdkApp pattern, Terraform resource, deployment metadata, CI/CD differences
- **`terraform-patterns.md`**: Custom infrastructure, IAM, state management, importing resources
- **`event-driven.md`**: Pub/Sub, Eventarc, BigQuery Remote Function triggers via custom `fast_api_app.py` endpoints

> **Observability:** See the **adk-observability-guide** skill for Cloud Trace, prompt-response logging, BigQuery Analytics, and third-party integrations.

---

## Deployment Target Decision Matrix

Choose the right deployment target based on your requirements:

| Criteria | Agent Engine | Cloud Run | GKE |
|----------|-------------|-----------|-----|
| **Languages** | Python | Python | Python (+ others via custom containers) |
| **Scaling** | Managed auto-scaling (configurable min/max, concurrency) | Fully configurable (min/max instances, concurrency, CPU allocation) | Full Kubernetes scaling (HPA, VPA, node auto-provisioning) |
| **Networking** | VPC-SC and PSC supported | Full VPC support, direct VPC egress, IAP, ingress rules | Full Kubernetes networking |
| **Session state** | Native `VertexAiSessionService` (persistent, managed) | In-memory (dev), Cloud SQL, or Agent Engine session backend | Custom (any Kubernetes-compatible store) |
| **Batch/event processing** | Not supported | `/invoke` endpoint for Pub/Sub, Eventarc, BigQuery | Custom (Kubernetes Jobs, Pub/Sub) |
| **Cost model** | vCPU-hours + memory-hours (not billed when idle) | Per-instance-second + min instance costs | Node pool costs (always-on or auto-provisioned) |
| **Setup complexity** | Lower (managed, purpose-built for agents) | Medium (Dockerfile, Terraform, networking) | Higher (Kubernetes expertise required) |
| **Best for** | Managed infrastructure, minimal ops | Custom infra, event-driven workloads | Full control, open models, GPU workloads |

**Ask the user** which deployment target fits their needs. Each is a valid production choice with different trade-offs.

---

## Quick Deploy (ADK CLI)

For projects without Agent Starter Pack scaffolding. No Makefile, Terraform, or Dockerfile required.

```bash
# Cloud Run
adk deploy cloud_run --project=PROJECT --region=REGION path/to/agent/

# Agent Engine
adk deploy agent_engine --project=PROJECT --region=REGION path/to/agent/

# GKE (requires existing cluster)
adk deploy gke --project=PROJECT --cluster_name=CLUSTER --region=REGION path/to/agent/
```

All commands support `--with_ui` to deploy the ADK dev UI.
Cloud Run also accepts extra `gcloud` flags after `--` (e.g., `-- --no-allow-unauthenticated`).

See `adk deploy --help` or the [ADK deployment docs](https://google.github.io/adk-docs/deploy/) for the full flag reference.

> For CI/CD, observability, or production infrastructure, scaffold with `/adk-scaffold` and use the sections below.

---

## Dev Environment Setup & Deploy (Scaffolded Projects)

### Setting Up Dev Infrastructure (Optional)

`make setup-dev-env` runs `terraform apply` in `deployment/terraform/dev/`. This provisions supporting infrastructure:

- Service accounts (`app_sa` for the agent, used for runtime permissions)
- Artifact Registry repository (for container images)
- IAM bindings (granting the app SA necessary roles)
- Telemetry resources (Cloud Logging bucket, BigQuery dataset)
- Any custom resources defined in `deployment/terraform/dev/`

This step is **optional**: `make deploy` works without it (Cloud Run creates the service on the fly via `gcloud run deploy --source .`). However, running it gives you proper service accounts, observability, and IAM setup.

```bash
make setup-dev-env
```

### Deploying

1. **Notify the human**: "Eval scores meet thresholds and tests pass. Ready to deploy to dev?"
2. **Wait for explicit approval**
3. Once approved: `make deploy`

**IMPORTANT**: Never run `make deploy` without explicit human approval.

---

## Production Deployment — CI/CD Pipeline

**Best for:** Production applications, teams requiring staging → production promotion.

**Prerequisites:**

1. Project must NOT be in a gitignored folder
2. User must provide staging and production GCP project IDs
3. User must provide the GitHub repository name and owner

**Steps:**

1. If the project is a prototype, first add Terraform/CI-CD files using the Agent Starter Pack CLI (see `/adk-scaffold` for full options):

   ```bash
   uvx agent-starter-pack enhance . --cicd-runner github_actions -y -s
   ```

2. Ensure you're logged in to GitHub CLI:

   ```bash
   gh auth login  # (skip if already authenticated)
   ```

3. Run setup-cicd:

   ```bash
   uvx agent-starter-pack setup-cicd \
     --staging-project YOUR_STAGING_PROJECT \
     --prod-project YOUR_PROD_PROJECT \
     --repository-name YOUR_REPO_NAME \
     --repository-owner YOUR_GITHUB_USERNAME \
     --auto-approve \
     --create-repository
   ```

4. Push code to trigger deployments

#### Key `setup-cicd` Flags

| Flag | Description |
|------|-------------|
| `--staging-project` | GCP project ID for staging environment |
| `--prod-project` | GCP project ID for production environment |
| `--repository-name` / `--repository-owner` | GitHub repository name and owner |
| `--auto-approve` | Skip Terraform plan confirmation prompts |
| `--create-repository` | Create the GitHub repo if it doesn't exist |
| `--cicd-project` | Separate GCP project for CI/CD infrastructure. Defaults to prod project |
| `--local-state` | Store Terraform state locally instead of in GCS (see `references/terraform-patterns.md`) |

Run `uvx agent-starter-pack setup-cicd --help` for the full flag reference (Cloud Build options, dev project, region, etc.).

### Choosing a CI/CD Runner

| Runner | Pros | Cons |
|--------|------|------|
| **github_actions** (Default) | No PAT needed, uses `gh auth`, WIF-based, fully automated | Requires GitHub CLI authentication |
| **google_cloud_build** | Native GCP integration | Requires interactive browser authorization (or PAT + app installation ID for programmatic mode) |

### How Authentication Works (WIF)

Both runners use **Workload Identity Federation (WIF)**: GitHub/Cloud Build OIDC tokens are trusted by a GCP Workload Identity Pool, which grants `cicd_runner_sa` impersonation. No long-lived service account keys are needed. Terraform in `setup-cicd` creates the pool, provider, and SA bindings automatically. If auth fails, re-run `terraform apply` in the CI/CD Terraform directory.

### CI/CD Pipeline Stages

The pipeline has three stages:

1. **CI (PR checks)**: Triggered on pull request. Runs unit and integration tests.
2. **Staging CD**: Triggered on merge to `main`. Builds container, deploys to staging, runs load tests.
3. **Production CD**: Triggered after successful staging deploy. Requires **manual approval** before deploying to production.

**IMPORTANT**: `setup-cicd` creates infrastructure but doesn't deploy automatically. Terraform configures all required GitHub secrets and variables (WIF credentials, project IDs, service accounts). Push code to trigger the pipeline:

```bash
git add . && git commit -m "Initial agent implementation"
git push origin main
```

To approve production deployment:

```bash
# GitHub Actions: Approve via repository Actions tab (environment protection rules)

# Cloud Build: Find pending build and approve
gcloud builds list --project=PROD_PROJECT --region=REGION --filter="status=PENDING"
gcloud builds approve BUILD_ID --project=PROD_PROJECT
```

---

## Cloud Run Specifics

For detailed infrastructure configuration (scaling defaults, Dockerfile, FastAPI endpoints, session types, networking), see `references/cloud-run.md`.

For ADK docs on Cloud Run deployment, fetch `https://google.github.io/adk-docs/deploy/cloud-run/index.md` via WebFetch.

---

## Agent Engine Specifics

Agent Engine is a managed Vertex AI service for deploying Python ADK agents. It uses source-based deployment (no Dockerfile) via `deploy.py` and the `AdkApp` class.

> **No `gcloud` CLI exists for Agent Engine.** Deploy via `deploy.py` or `adk deploy agent_engine`. Query via the Python `vertexai.Client` SDK.

Deployments can take 5-10 minutes. If `make deploy` times out, check if the engine was created and manually populate `deployment_metadata.json` with the engine resource ID (see reference for details).

For detailed infrastructure configuration (deploy.py flags, AdkApp pattern, Terraform resource, deployment metadata, session/artifact services, CI/CD differences), see `references/agent-engine.md`.
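
The timeout recovery step described above (manually populating `deployment_metadata.json`) might look roughly like the sketch below. The key `remote_agent_engine_id` is the one read by the testing script in this guide; the helper name `write_deployment_metadata` and the placeholder resource name are hypothetical, and the real file may carry additional fields documented in `references/agent-engine.md`.

```python
import json
from pathlib import Path


def write_deployment_metadata(engine_id: str, path: str = "deployment_metadata.json") -> dict:
    """Record the engine resource ID, keeping any fields already in the file.

    Hypothetical helper for illustration; not part of the scaffold.
    """
    metadata_path = Path(path)
    # Preserve whatever a partial deploy may already have written.
    metadata = json.loads(metadata_path.read_text()) if metadata_path.exists() else {}
    metadata["remote_agent_engine_id"] = engine_id
    metadata_path.write_text(json.dumps(metadata, indent=2) + "\n")
    return metadata


if __name__ == "__main__":
    # Placeholder value: substitute the resource ID of the engine that was created.
    write_deployment_metadata(
        "projects/PROJECT_NUMBER/locations/REGION/reasoningEngines/ENGINE_ID"
    )
```

Merging into the existing file instead of overwriting it is a deliberate choice here, so that any other metadata already present is not lost.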

For ADK docs on Agent Engine deployment, fetch `https://google.github.io/adk-docs/deploy/agent-engine/index.md` via WebFetch.

---

## Service Account Architecture

Scaffolded projects use two service accounts:

- **`app_sa`** (per environment): Runtime identity for the deployed agent. Roles defined in `deployment/terraform/iam.tf`.
- **`cicd_runner_sa`** (CI/CD project): CI/CD pipeline identity (GitHub Actions / Cloud Build). Lives in the CI/CD project (defaults to prod project), needs permissions in **both** staging and prod projects.

Check `deployment/terraform/iam.tf` for exact role bindings. Cross-project permissions (Cloud Run service agents, Artifact Registry access) are also configured there.

**Common 403 errors:**

- "Permission denied on Cloud Run" → `cicd_runner_sa` missing deployment role in the target project
- "Cannot act as service account" → Missing `iam.serviceAccountUser` binding on `app_sa`
- "Secret access denied" → `app_sa` missing `secretmanager.secretAccessor`
- "Artifact Registry read denied" → Cloud Run service agent missing read access in CI/CD project

---

## Secret Manager (for API Credentials)

Instead of passing sensitive keys as environment variables, use GCP Secret Manager.

```bash
# Create a secret
echo -n "YOUR_API_KEY" | gcloud secrets create MY_SECRET_NAME --data-file=-

# Update an existing secret
echo -n "NEW_API_KEY" | gcloud secrets versions add MY_SECRET_NAME --data-file=-
```

**Grant access:** For Cloud Run, grant `secretmanager.secretAccessor` to `app_sa`. For Agent Engine, grant it to the platform-managed SA (`service-PROJECT_NUMBER@gcp-sa-aiplatform-re.iam.gserviceaccount.com`).

**Pass secrets at deploy time (Agent Engine):**

```bash
make deploy SECRETS="API_KEY=my-api-key,DB_PASS=db-password:2"
```

Format: `ENV_VAR=SECRET_ID` or `ENV_VAR=SECRET_ID:VERSION` (defaults to latest). Access in code via `os.environ.get("API_KEY")`.
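
To make the `SECRETS` format above concrete, here is a minimal sketch of how such a string breaks down, plus a fail-fast wrapper around `os.environ.get`. Both `parse_secrets` and `require_env` are hypothetical helpers written for illustration; this is not how `make deploy` parses the value internally.

```python
import os


def parse_secrets(spec: str) -> dict[str, tuple[str, str]]:
    """Parse 'ENV_VAR=SECRET_ID[:VERSION],...' into {ENV_VAR: (SECRET_ID, VERSION)}.

    Illustrative only; a missing version is reported as "latest".
    """
    parsed: dict[str, tuple[str, str]] = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        env_var, sep, secret_ref = item.partition("=")
        if not sep or not env_var or not secret_ref:
            raise ValueError(f"expected ENV_VAR=SECRET_ID[:VERSION], got {item!r}")
        secret_id, _, version = secret_ref.partition(":")
        parsed[env_var] = (secret_id, version or "latest")
    return parsed


def require_env(name: str) -> str:
    """Read a secret exposed as an environment variable, failing loudly if absent."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; was the secret passed at deploy time?")
    return value


if __name__ == "__main__":
    print(parse_secrets("API_KEY=my-api-key,DB_PASS=db-password:2"))
```

Failing at startup when a variable is missing tends to surface a forgotten `SECRETS` entry or a missing `secretAccessor` grant earlier than a `None` value would.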
---

## Observability

See the **adk-observability-guide** skill for observability configuration (Cloud Trace, prompt-response logging, BigQuery Analytics, third-party integrations).

---

## Testing Your Deployed Agent

### Agent Engine Deployment

**Option 1: Testing Notebook**

```bash
jupyter notebook notebooks/adk_app_testing.ipynb
```

**Option 2: Python Script**

```python
import asyncio
import json

import vertexai

# Read the engine resource ID recorded at deploy time
with open("deployment_metadata.json") as f:
    engine_id = json.load(f)["remote_agent_engine_id"]

client = vertexai.Client(location="us-central1")
agent = client.agent_engines.get(name=engine_id)

async def main() -> None:
    # `async for` must run inside a coroutine when executed as a script
    async for event in agent.async_stream_query(message="Hello!", user_id="test"):
        print(event)

asyncio.run(main())
```

**Option 3: Playground**

```bash
make playground
```

### Cloud Run Deployment

> **Auth required by default.** Cloud Run deploys with `--no-allow-unauthenticated`, so all requests need an `Authorization: Bearer` header with an identity token. Getting a 403? You're likely missing this header. To allow public access, redeploy with `--allow-unauthenticated`.

```bash
# Test health endpoint
curl -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  https://SERVICE_NAME-PROJECT_NUMBER.REGION.run.app/health

# Test SSE streaming endpoint (ADK HTTP mode)
curl -X POST "https://SERVICE_NAME-PROJECT_NUMBER.REGION.run.app/run_sse" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  -d '{"message": "Hello!", "user_id": "test", "session_id": "test-session"}'
```

### Load Tests

```bash
make load-test
```

See `tests/load_test/README.md` for configuration, default settings, and CI/CD integration details.
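
The authenticated Cloud Run requests shown earlier in this section can also be issued from Python, which may be handy in smoke-test scripts. The sketch below uses only the standard library and assumes the `gcloud` CLI is installed and authenticated; `identity_token` and `build_request` are hypothetical helper names, and the URL is the same placeholder used in the curl examples.

```python
import json
import subprocess
import urllib.request
from typing import Optional


def identity_token() -> str:
    """Fetch an identity token via the gcloud CLI, as the curl examples do."""
    result = subprocess.run(
        ["gcloud", "auth", "print-identity-token"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def build_request(url: str, token: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build a request carrying the Bearer header; POSTs JSON when a payload is given."""
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if payload is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers)


if __name__ == "__main__":
    # Placeholder URL from this guide; replace with the deployed service URL.
    base = "https://SERVICE_NAME-PROJECT_NUMBER.REGION.run.app"
    request = build_request(f"{base}/health", identity_token())
    with urllib.request.urlopen(request, timeout=30) as response:
        print(response.status)
```

A 403 from this script points to the same cause as with curl: a missing or invalid identity token.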
---

## Deploying with a UI (IAP)

To expose your agent with a web UI protected by Google identity authentication:

```bash
# Deploy with IAP (built-in framework UI)
make deploy IAP=true

# Deploy with custom frontend on a different port
make deploy IAP=true PORT=5173
```

IAP (Identity-Aware Proxy) secures the Cloud Run service: only authorized Google accounts can access it. After deploying, grant user access via the [Cloud Console IAP settings](https://cloud.google.com/run/docs/securing/identity-aware-proxy-cloud-run#manage_user_or_group_access).

For Agent Engine with a custom frontend, use a **decoupled deployment**: deploy the frontend separately to Cloud Run or Cloud Storage, connecting to the Agent Engine backend API.

---

## Rollback & Recovery

The primary rollback mechanism is **git-based**: fix the issue, commit, and push to `main`. The CI/CD pipeline will automatically build and deploy the new version through staging → production.

For immediate Cloud Run rollback without a new commit, use revision traffic shifting:

```bash
gcloud run revisions list --service=SERVICE_NAME --region=REGION
gcloud run services update-traffic SERVICE_NAME \
  --to-revisions=REVISION_NAME=100 --region=REGION
```

Agent Engine doesn't support revision-based rollback; fix and redeploy via `make deploy`.
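
For Cloud Run, the two-step traffic shift above could be scripted along these lines. This is a sketch under stated assumptions: it presumes the revision list was obtained with `--format=json` and that each entry exposes `metadata.name` and `metadata.creationTimestamp`; it also treats the second-newest revision as the rollback target, which may not be the revision that was actually serving before. Both function names are hypothetical.

```python
from typing import Any


def pick_previous_revision(revisions: list[dict[str, Any]]) -> str:
    """Return the name of the second-newest revision by creation timestamp."""
    if len(revisions) < 2:
        raise ValueError("need at least two revisions to roll back")
    # Assumes uniformly formatted ISO-8601 UTC timestamps, which sort as strings.
    ordered = sorted(
        revisions,
        key=lambda rev: rev["metadata"]["creationTimestamp"],
        reverse=True,
    )
    return ordered[1]["metadata"]["name"]


def update_traffic_command(service: str, revision: str, region: str) -> list[str]:
    """Build the argv for the `gcloud run services update-traffic` call shown above."""
    return [
        "gcloud", "run", "services", "update-traffic", service,
        f"--to-revisions={revision}=100",
        f"--region={region}",
    ]


if __name__ == "__main__":
    # Hypothetical sample data standing in for parsed `gcloud ... --format=json` output.
    sample = [
        {"metadata": {"name": "svc-00001", "creationTimestamp": "2024-01-01T00:00:00Z"}},
        {"metadata": {"name": "svc-00002", "creationTimestamp": "2024-02-01T00:00:00Z"}},
    ]
    target = pick_previous_revision(sample)
    print(" ".join(update_traffic_command("SERVICE_NAME", target, "REGION")))
```

Printing the command instead of executing it keeps a human in the loop, in line with the approval steps elsewhere in this guide.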
---

## Custom Infrastructure (Terraform)

For custom infrastructure patterns (Pub/Sub, BigQuery, Eventarc, Cloud SQL, IAM), consult `references/terraform-patterns.md` for:

- Where to put custom Terraform files (dev vs CI/CD)
- Resource examples (Pub/Sub, BigQuery, Eventarc triggers)
- IAM bindings for custom resources
- Terraform state management (remote vs local, importing resources)
- Common infrastructure patterns

---

## Troubleshooting

| Issue | Solution |
|-------|----------|
| Terraform state locked | `terraform force-unlock -force LOCK_ID` in `deployment/terraform/` |
| GitHub Actions auth failed | Re-run `terraform apply` in CI/CD terraform dir; verify WIF pool/provider |
| Cloud Build authorization pending | Use `github_actions` runner instead |
| Resource already exists | `terraform import` (see `references/terraform-patterns.md`) |
| Agent Engine deploy timeout / hangs | Deployments take 5-10 min; check if engine was created (see Agent Engine Specifics) |
| Secret not available | Verify `secretAccessor` granted to `app_sa` (not the default compute SA) |
| 403 on deploy | Check `deployment/terraform/iam.tf`: `cicd_runner_sa` needs deployment + SA impersonation roles in the target project |
| 403 when testing Cloud Run | Default is `--no-allow-unauthenticated`; include `Authorization: Bearer $(gcloud auth print-identity-token)` header |
| Cold starts too slow | Set `min_instance_count > 0` in Cloud Run Terraform config |
| Cloud Run 503 errors | Check resource limits (memory/CPU), increase `max_instance_count`, or check container crash logs |