--- # Licensed to the Apache Software Foundation (ASF) under one # or more contributor license agreements. See the NOTICE file # distributed with this work for additional information # regarding copyright ownership. The ASF licenses this file # to you under the Apache License, Version 2.0 (the # "License"); you may not use this file except in compliance # with the License. You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, # software distributed under the License is distributed on an # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY # KIND, either express or implied. See the License for the # specific language governing permissions and limitations # under the License. name: ci-cd description: Guides understanding and working with Apache Beam's CI/CD system using GitHub Actions. Use when debugging CI failures, understanding test workflows, or modifying CI configuration. --- # CI/CD in Apache Beam ## Overview Apache Beam uses GitHub Actions for CI/CD. Workflows are located in `.github/workflows/`. ## Workflow Types ### PreCommit Workflows - Run on PRs and merges - Validate code changes before merge - Naming: `beam_PreCommit_*.yml` ### PostCommit Workflows - Run after merge and on schedule - More comprehensive testing - Naming: `beam_PostCommit_*.yml` ### Scheduled Workflows - Run nightly on master - Check for external dependency impacts - Tag master with `nightly-master` ## Key Workflows ### PreCommit | Workflow | Description | |----------|-------------| | `beam_PreCommit_Java.yml` | Java build and tests | | `beam_PreCommit_Python.yml` | Python tests | | `beam_PreCommit_Go.yml` | Go tests | | `beam_PreCommit_RAT.yml` | License header checks | | `beam_PreCommit_Spotless.yml` | Code formatting | ### PostCommit - Java | Workflow | Description | |----------|-------------| | `beam_PostCommit_Java.yml` | Full Java test suite | | `beam_PostCommit_Java_ValidatesRunner_*.yml` | Runner validation tests | | `beam_PostCommit_Java_Examples_*.yml` | Example pipeline tests | ### PostCommit - Python | Workflow | Description | |----------|-------------| | `beam_PostCommit_Python.yml` | Full Python test suite | | `beam_PostCommit_Python_ValidatesRunner_*.yml` | Runner validation | | `beam_PostCommit_Python_Examples_*.yml` | Examples | ### Load & Performance Tests | Workflow | Description | |----------|-------------| | `beam_LoadTests_*.yml` | Load testing | | `beam_PerformanceTests_*.yml` | I/O performance | ## Triggering Tests ### Automatic - PRs trigger PreCommit tests - Merges trigger PostCommit tests ### Triggering Specific Workflows Use [trigger files](https://github.com/apache/beam/blob/master/.github/workflows/README.md#running-workflows-manually) to run specific workflows. ### Workflow Dispatch Most workflows support manual triggering via GitHub UI. ## Understanding Test Results ### Finding Logs 1. Go to PR → Checks tab 2. Click on failed workflow 3. Expand failed job 4. View step logs ### Common Failure Patterns #### Flaky Tests - Random failures unrelated to change - Solution: Use [trigger files](https://github.com/apache/beam/blob/master/.github/workflows/README.md#running-workflows-manually) to re-run the specific workflow. #### Timeout - Increase timeout in workflow if justified - Or optimize test #### Resource Exhaustion - GCP quota issues - Check project settings ## GCP Credentials Workflows requiring GCP access use these secrets: - `GCP_PROJECT_ID` - Project ID (e.g., `apache-beam-testing`) - `GCP_REGION` - Region (e.g., `us-central1`) - `GCP_TESTING_BUCKET` - Temp storage bucket - `GCP_PYTHON_WHEELS_BUCKET` - Python wheels bucket - `GCP_SA_EMAIL` - Service account email - `GCP_SA_KEY` - Base64-encoded service account key Required IAM roles: - Storage Admin - Dataflow Admin - Artifact Registry Writer - BigQuery Data Editor - Service Account User ## Self-hosted vs GitHub-hosted Runners ### Self-hosted (majority of workflows) - Pre-configured with dependencies - GCP credentials pre-configured - Naming: `beam_*.yml` ### GitHub-hosted - Used for cross-platform testing (Linux, macOS, Windows) - May need explicit credential setup ## Workflow Structure ```yaml name: Workflow Name on: push: branches: [master] pull_request: branches: [master] schedule: - cron: '0 0 * * *' workflow_dispatch: jobs: build: runs-on: [self-hosted, ...] steps: - uses: actions/checkout@v4 - name: Run Gradle run: ./gradlew :task:name ``` ## Local Debugging ### Run Same Commands as CI Check workflow file's `run` commands: ```bash ./gradlew :sdks:java:core:test ./gradlew :sdks:python:test ``` ### Common Issues - Clean gradle cache: `rm -rf ~/.gradle .gradle` - Remove build directory: `rm -rf build` - Check Java version matches CI ## Snapshot Builds ### Locations - Java SDK: https://repository.apache.org/content/groups/snapshots/org/apache/beam/ - SDK Containers: https://gcr.io/apache-beam-testing/beam-sdk - Portable Runners: https://gcr.io/apache-beam-testing/beam_portability - Python SDK: gs://beam-python-nightly-snapshots ## Release Workflows | Workflow | Purpose | |----------|---------| | `cut_release_branch.yml` | Create release branch | | `build_release_candidate.yml` | Build RC | | `finalize_release.yml` | Finalize release | | `publish_github_release_notes.yml` | Publish notes |