--- name: analyze-test-run description: "Analyze a GitHub Actions integration test run and produce a skill invocation report with failure root-cause issues. TRIGGERS: analyze test run, skill invocation rate, test run report, compare test runs, skill invocation summary, test failure analysis, run report, test results, action run report" license: MIT metadata: author: Microsoft version: "1.0.0" --- # Analyze Test Run Downloads artifacts from a GitHub Actions integration test run, generates a summarized skill invocation report, and files GitHub issues for each test failure with root-cause analysis. ## When to Use - Summarize results of a GitHub Actions integration test run - Calculate skill invocation rates for the skill under test - For azure-deploy tests: track the full deployment chain (azure-prepare → azure-validate → azure-deploy) - Compare skill invocation across two runs - File issues for test failures with root-cause context ## Input | Parameter | Required | Description | |-----------|----------|-------------| | **Run ID or URL** | Yes | GitHub Actions run ID (e.g. `22373768875`) or full URL | | **Comparison Run** | No | Second run ID/URL for side-by-side comparison | ## Workflow ### Phase 1 — Download & Parse 1. Extract the numeric run ID from the input (strip URL prefix if needed) 2. Fetch run metadata: ```bash gh run view --repo microsoft/GitHub-Copilot-for-Azure --json jobs,status,conclusion,name ``` 3. Download artifacts to a temp directory: ```bash gh run download --repo microsoft/GitHub-Copilot-for-Azure --dir "$TMPDIR/gh-run-" ``` 4. Locate these files in the downloaded artifacts: - `junit.xml` — test pass/fail/skip/error results - `*-SKILL-REPORT.md` — generated skill report with per-test details - `agent-metadata-*.md` files — raw agent session logs per test ### Phase 2 — Build Summary Report Produce a markdown report with four sections. See [report-format.md](references/report-format.md) for the exact template. **Section 1 — Test Results Overview** Parse `junit.xml` to build: | Metric | Value | |--------|-------| | Total tests | count from `` | | Executed | total − skipped | | Skipped | count of `` elements | | Passed | executed − failures − errors | | Failed | count of `` elements | | Test Pass Rate | passed / executed as % | Include a per-test table with name, duration (from `time` attribute, convert seconds to `Xm Ys`), and Pass/Fail result. **Section 2 — Skill Invocation Rate** Read the SKILL-REPORT.md "Per-Test Case Results" sections. For each executed test determine whether the skill under test was invoked. The skills to track depend on which integration test suite the run belongs to: **azure-deploy integration tests** — track the full deployment chain: | Skill | How to detect | |-------|---------------| | `azure-prepare` | Mentioned as invoked in the narrative or agent-metadata | | `azure-validate` | Mentioned as invoked in the narrative or agent-metadata | | `azure-deploy` | Mentioned as invoked in the narrative or agent-metadata | Build a per-test invocation matrix (Yes/No for each skill) and compute rates: | Skill | Invocation Rate | |-------|----------------| | azure-deploy | X% (n/total) | | azure-prepare | X% (n/total) | | azure-validate | X% (n/total) | | Full skill chain (P→V→D) | X% (n/total) | > The azure-deploy integration tests exercise the full deployment workflow where the agent is expected to invoke azure-prepare, azure-validate, and azure-deploy in sequence. This three-skill chain tracking is **specific to azure-deploy tests only**. **All other integration tests** — track only the skill under test: | Skill | Invocation Rate | |-------|----------------| | {skill-under-test} | X% (n/total) | For non-deploy tests (e.g. azure-prepare, azure-ai, azure-kusto), only track whether the primary skill under test was invoked. Do not include azure-prepare/azure-validate/azure-deploy chain columns. **Section 3 — Report Confidence & Pass Rate** Extract from SKILL-REPORT.md: - Overall Test Pass Rate (from the report's statistics section) - Average Confidence (from the report's statistics section) **Section 4 — Comparison** (only when a second run is provided) Repeat Phase 1–3 for the second run, then produce a side-by-side delta table. See [report-format.md](references/report-format.md) § Comparison. ### Phase 3 — File Issues for Failures For every test with a `` element in `junit.xml`: 1. Read the failure message and file:line from the XML 2. Read the actual line of code from the test file at that location 3. Read the `agent-metadata-*.md` for that test from the artifacts 4. Read the corresponding section in the SKILL-REPORT.md for context on what the agent did 5. Determine root cause category: - **Skill not invoked** — agent bypassed skills and used manual commands - **Deployment failure** — infrastructure or RBAC error during deployment - **Timeout** — test exceeded time limit - **Assertion mismatch** — expected files/links not found - **Quota exhaustion** — Azure region quota prevented deployment 6. Create a GitHub issue: ``` gh issue create --repo microsoft/GitHub-Copilot-for-Azure \ --title "Integration test failure: – []" \ --label "bug,integration-test" \ --body "" ``` **Title format:** `Integration test failure: {skill} – {keywords} [{root-cause-category}]` - `{keywords}`: 2-4 words from the test name — app type (function app, static web app) + IaC type (Terraform, Bicep) + trigger if relevant - `{root-cause-category}`: one of the categories from step 5 in brackets Issue body template — see [issue-template.md](references/issue-template.md). > ⚠️ **Note:** Do NOT include the Error Details (JUnit XML) or Agent Metadata sections in the issue body. Keep issues concise with the diagnosis, prompt context, skill report context, and environment sections only. > For azure-deploy integration tests, include an "azure-deploy Skill Invocation" section showing whether azure-deploy was invoked (Yes/No), with a note that the full chain is azure-prepare → azure-validate → azure-deploy. For all other integration tests, include a "{skill} Skill Invocation" section showing only whether the primary skill under test was invoked. ## Error Handling | Error | Cause | Fix | |-------|-------|-----| | `gh: command not found` | GitHub CLI not installed | Install with `winget install GitHub.cli` or `brew install gh` | | `no artifacts found` | Run has no uploadable reports | Verify the run completed the "Export report" step | | `HTTP 404` on run view | Invalid run ID or no access | Check the run ID and ensure `gh auth status` is authenticated | | `rate limit exceeded` | Too many GitHub API calls | Wait and retry, or use `--limit` on searches | ## References - [report-format.md](references/report-format.md) — Output report template - [issue-template.md](references/issue-template.md) — GitHub issue body template