# Confusion Matrix — ABIGAIL v3 (Variant B) — action_classification

- Run date: `2026-03-21T04:26:26.591Z`
- Total cases: `82`
- Unparseable: `0`
- Quarantined: `0`
- Source run file: `data\benchmark_results_sonnet.json`
- Source SHA-256: `5920174466f3e2b8848dc8472f820f135479fdaf8f266563f26f6e8f58e34fe0`
- Ground truth file: `data\ground_truth\action_classification.json`
- Ground truth SHA-256: `346acbe7427f2c3ee071c50945d3d5e1001400c5a83f81254646e0948aaee40f`
- Schema version: `1`
- Verifier version: `1.0.0`
- Generated at: `2026-04-20T11:17:07+00:00`

## Axis convention

- Rows: **ground truth** (what the USPTO PEDS record or verified truth says).
- Columns: **predicted** (what the system under test emitted).
- Cell `[row=A, col=B]` counts the cases where truth was A and the model predicted B.
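
The axis convention above can be sketched in Python as follows. This is a minimal illustration, assuming truth and predicted labels arrive as parallel lists; `build_confusion` and the toy data are hypothetical and are not part of the `patentbench` tooling.

```python
from collections import Counter

def build_confusion(truth, predicted, labels):
    """Return matrix[i][j] = cases with truth labels[i] and prediction labels[j]."""
    counts = Counter(zip(truth, predicted))
    # Outer loop walks truth labels (rows), inner loop walks predicted labels (columns).
    return [[counts[(t, p)] for p in labels] for t in labels]

# Hypothetical toy data, not drawn from the benchmark run.
labels = ["NF0-F0-A0", "NF1-F0-A0"]
truth = ["NF0-F0-A0", "NF1-F0-A0", "NF1-F0-A0"]
predicted = ["NF0-F0-A0", "NF1-F0-A0", "NF0-F0-A0"]

matrix = build_confusion(truth, predicted, labels)
# matrix[1][0] counts cases where truth was NF1-F0-A0 and the prediction was NF0-F0-A0.
```

In this toy example the single misclassified case lands in row `NF1-F0-A0`, column `NF0-F0-A0`, which is the off-diagonal cell the convention describes.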

## Matrix

| truth \ predicted | `NF0-F0-A0` | `NF1-F0-A0` | `NF1-F0-A1` | `NF1-F1-A0` | `NF1-F1-A1` |
|---|---|---|---|---|---|
| `NF0-F0-A0` | **1** | 0 | 0 | 0 | 0 |
| `NF1-F0-A0` | 0 | **30** | 0 | 0 | 0 |
| `NF1-F0-A1` | 0 | 0 | **7** | 0 | 0 |
| `NF1-F1-A0` | 0 | 0 | 0 | **33** | 0 |
| `NF1-F1-A1` | 0 | 0 | 0 | 0 | **11** |

## Per-class precision, recall, F1

| label | support | TP | FP | FN | precision | recall | F1 | hallucinated |
|---|---|---|---|---|---|---|---|---|
| `NF0-F0-A0` | 1 | 1 | 0 | 0 | 1.000 | 1.000 | 1.000 | no |
| `NF1-F0-A0` | 30 | 30 | 0 | 0 | 1.000 | 1.000 | 1.000 | no |
| `NF1-F0-A1` | 7 | 7 | 0 | 0 | 1.000 | 1.000 | 1.000 | no |
| `NF1-F1-A0` | 33 | 33 | 0 | 0 | 1.000 | 1.000 | 1.000 | no |
| `NF1-F1-A1` | 11 | 11 | 0 | 0 | 1.000 | 1.000 | 1.000 | no |
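
The per-class figures above can be reproduced from the matrix with a short sketch like the one below. The helper name `per_class_metrics` is hypothetical, and returning `0.0` when a denominator is zero is an assumption of this sketch, not a documented behavior of the verifier.

```python
def per_class_metrics(matrix, labels):
    """Derive support, TP, FP, FN, precision, recall, and F1 per label.

    Rows of `matrix` are ground truth and columns are predictions.
    """
    results = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                  # truth was this label, predicted otherwise
        fp = sum(row[i] for row in matrix) - tp   # predicted this label, truth was otherwise
        # Assumption: report 0.0 when a denominator is zero.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results[label] = {
            "support": tp + fn, "TP": tp, "FP": fp, "FN": fn,
            "precision": precision, "recall": recall, "F1": f1,
        }
    return results

labels = ["NF0-F0-A0", "NF1-F0-A0", "NF1-F0-A1", "NF1-F1-A0", "NF1-F1-A1"]
matrix = [
    [1, 0, 0, 0, 0],
    [0, 30, 0, 0, 0],
    [0, 0, 7, 0, 0],
    [0, 0, 0, 33, 0],
    [0, 0, 0, 0, 11],
]
metrics = per_class_metrics(matrix, labels)
```

Because every off-diagonal cell is zero, FP and FN are zero for each label, so precision, recall, and F1 all come out to 1.000.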

## Off-diagonal traces

None. All 82 cases lie on the diagonal, so there are no misclassifications to trace.

## Verification

To verify this artifact, run: `python -m patentbench.reports.verify_confusion reports\confusion_matrices\abigail\action_classification.json`
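
Independently of the verifier, the recorded SHA-256 values can be checked by hand. The following is a minimal sketch using only the Python standard library; `sha256_of` and `matches_recorded` are hypothetical helpers, and this is not necessarily how `verify_confusion` performs its checks.

```python
import hashlib

def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_recorded(path, recorded_hex):
    """Compare a file's digest with a recorded hex digest, ignoring case and whitespace."""
    return sha256_of(path) == recorded_hex.strip().lower()

# Example call (the path is relative to the repository root and must exist locally):
# matches_recorded(
#     "data/benchmark_results_sonnet.json",
#     "5920174466f3e2b8848dc8472f820f135479fdaf8f266563f26f6e8f58e34fe0",
# )
```

A mismatch would indicate that the source run file or the ground truth file has changed since this report was generated.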