# @qulib/mcp

**@qulib/mcp** is an MCP server that exposes Qulib so AI clients can analyze a deployed URL for release confidence, accessibility, broken links, console noise, and prioritized gaps (CLI entry `qulib-mcp`).

## Setup

To enable LLM-powered scenario generation, add your Anthropic API key to the
`env` block in your MCP host config (Claude Desktop, Claude Code, Cursor, etc.):

```json
{
  "mcpServers": {
    "qulib": {
      "command": "npx",
      "args": ["@qulib/mcp"],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}
```

Without this key, qulib still runs but uses built-in template scenarios only.
Your key is never stored by qulib — it is read from your local config at runtime.

After updating this config, restart your MCP host (Claude Desktop / Claude Code / Cursor) so the new environment variables are picked up.

For verbose server-side stderr logs while troubleshooting host wiring, add:

```json
{
  "mcpServers": {
    "qulib": {
      "command": "npx",
      "args": ["@qulib/mcp"],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-...",
        "QULIB_DEBUG": "1"
      }
    }
  }
}
```

## MCP tools

| Tool | Purpose |
|---|---|
| **`qulib_score_confidence`** | **Flagship.** Fuses evidence from `qulib_analyze_app`, `qulib_score_automation`, and `qulib_score_api` into one verdict: **ship / caution / hold / block** with a 0–100 confidence score, L1–L5 level, per-source contributions, honesty notes, and recommended next checks. Pass `url` and/or `repoPath`. |
| `qulib_analyze_app` | Live-app quality scan: release confidence (0–100), axe-core a11y, broken links, console errors, prioritized gaps. Default payload is summary-first; pass `includeFullReport: true` for all scenarios. Optional form-login / storage-state auth. *(Canonical form; legacy alias `analyze_app` kept for backwards compatibility.)* |
| `qulib_score_automation` | Score a local repo's test-automation maturity across six dimensions (test coverage breadth, framework adoption, test-id hygiene, CI integration, auth test coverage, component test ratio) — plus a conditional 7th dimension (API coverage) when API endpoints are detected. Returns overall 0–100, level (L1–L5), and top recommendations. Each dimension carries `applicability`; score normalizes over applicable dimensions only. |
| `qulib_score_api` | Discover API endpoints in a repo and score their test coverage. Tier1=OpenAPI specs, Tier2=framework routes (Next.js, Express, Fastify, NestJS), Tier3=heuristic opt-in (tRPC). Returns an api-test-coverage dimension score with per-endpoint evidence. |
| `qulib_scaffold_tests` | Generate a ready-to-run test scaffold (Cypress config + spec files) by crawling a deployed URL. Returns `generatedTests` and `projectConfig` so an agent can write files directly. Pass `recipes` (e.g. `["auth","a11y"]`) to append proven test patterns. Supported framework: `cypress-e2e` (default); `playwright` is not yet implemented. |
| **`qulib_score_bug_report`** | LLM-as-judge of a learner bug report against a planted-bug target. Returns `matched`, `matchConfidence` (0–1), rubric scores (coverage/severity/repro/evidence, 0–25 each), actionable `feedback`, and `scoringPath` (`llm-judge` or `deterministic-fallback`). Learner report is untrusted input with prompt-injection hardening. Read-only. |
| **`qulib_score_decisions`** | Pivotal-decision evaluation: scores whether an agent made the senior-correct call at decision forks (block/pass, stop/continue, escalate/proceed). Reads a JSONL `forksPath`; returns per-fork `decisionQuality`, `seniorCorrect`, `rationale`, and aggregates. Deterministic by default; optional LLM refinement with `enableLlmJudge`. Fork log text is untrusted. Read-only. |
| `qulib_explore_auth` | List all sign-in paths (OAuth, SSO, forms, magic link) and what the agent must collect before `qulib_analyze_app`. Prefer on unfamiliar apps. *(Canonical form; legacy alias `explore_auth` kept for backwards compatibility.)* |
| `qulib_detect_auth` | Single-pass auth pattern guess with a recommendation. Lighter than `qulib_explore_auth`. *(Canonical form; legacy alias `detect_auth` kept for backwards compatibility.)* |
| `analyze_app` | Legacy alias for `qulib_analyze_app`. Identical behavior; kept for backwards compatibility through v1.0. |
| `explore_auth` | Legacy alias for `qulib_explore_auth`. Identical behavior; kept for backwards compatibility through v1.0. |
| `detect_auth` | Legacy alias for `qulib_detect_auth`. Identical behavior; kept for backwards compatibility through v1.0. |

**Example — flagship confidence call:**

```
qulib_score_confidence({ url: "https://example.com", repoPath: "/path/to/repo" })
```

Returns a verdict like:

```json
{
  "releaseConfidence": {
    "verdict": "caution",
    "confidenceScore": 54,
    "level": 3,
    "label": "Moderate confidence — proceed with known risks",
    "topRisks": ["Low crawl coverage (2 pages)", "No CI integration detected"],
    "recommendedNextChecks": ["Add CI pipeline", "Increase crawl depth"],
    "honestyNotes": ["API coverage: not_applicable (no API endpoints found — excluded from score)"]
  }
}
```

### `analyze_app` detail

- **Default payload:** `summary`, `topGaps`, `costIntelligenceSummary`, `nextDeterministicChecks`, small previews.
- **`includeFullReport: true`** — full `gapAnalysis` (all scenarios) and full `repoInventory`.
- **`agentSummary: true`** — compact gate-decision payload (`pass`/`warn`/`fail`) for CI orchestrators.
- Optional harness overrides: **`llmMaxOutputTokensPerCall`**, **`llmTokenBudget`** (legacy), **`testGenerationLimit`**, **`enableLlmScenarios`**.

Returns: release confidence score (0–100), accessibility violations (axe-core, WCAG 2 A/AA), broken links, console errors and coverage warnings, prioritized gaps with severity.

Supports optional form-login auth for scanning authenticated pages. If auth is required but not configured, the scan can stop early with `mode: auth-required` and guidance in `detectedAuth` / the decision log.

## Install for Claude Code

```bash
claude mcp add qulib --scope user npx -y @qulib/mcp
```

## Install for Claude Desktop / Cursor

Add this under `mcpServers` in `claude_desktop_config.json` (Claude Desktop) or your editor MCP settings (Cursor), adjusting paths if your client uses a different layout:

```json
{
  "mcpServers": {
    "qulib": {
      "command": "npx",
      "args": ["-y", "@qulib/mcp"]
    }
  }
}
```

## One-time browser setup

qulib uses Playwright under the hood. After your MCP host first runs the qulib server, you'll need to install Chromium:

```bash
npx playwright install chromium
```

This is a one-time step. You'll only need to do it again if Playwright's browser version is bumped in a future qulib release.

If you skip this step, the first tool call will return a clear error telling you to run the command.

## Agentic auth exploration (`explore_auth`)

On unfamiliar apps, call **`explore_auth`** before **`analyze_app`**. The response lists each sign-in path (curated public OAuth/SSO, password forms, magic-link wording, and **heuristic** unknown buttons such as tenant-specific SSO). Each path includes **`requirements`** (e.g. storage-state vs credentials) and **`suggestedAgentBehavior`**.

When the model sees **`unrecognizedButtons`**, it can ask the user to register a label on the **MCP host** with the CLI:

`qulib auth providers add --id <kebab-id> --label "..." --pattern "..."` — patterns are saved under **`~/.qulib/providers.json`** and merged with the built-in list on the next `explore_auth` / `explore-auth`. Nothing is auto-written without an explicit `providers add`.

## Compact vs full `analyze_app` response

| | Default (`includeFullReport` omitted or false) | `includeFullReport: true` |
|--|--|--|
| Size | Small: top gaps, cost summary, next checks, `repoInventorySummary` (counts only) | Full `gapAnalysis` (all scenarios) and full `repoInventory` (test files, missing test IDs) |
| When to use | Routine agent turns, chat context limits | Deep dives, exporting full scenario JSON |

Example (full):

```json
{ "url": "https://example.com", "includeFullReport": true }
```

Example (tighter LLM envelope from MCP):

```json
{
  "url": "https://example.com",
  "llmMaxOutputTokensPerCall": 2048,
  "testGenerationLimit": 5,
  "enableLlmScenarios": true
}
```

## Example usage

Ask Claude:

> "Use Qulib to analyze https://example.com and tell me if it's ready to ship."

Claude will call `analyze_app({ url: "https://example.com" })` and reason about the result.

## Authenticated scanning

### Form login (automated)

> "Use Qulib to scan my staging app at https://staging.example.com. Log in as user@example.com with password Test123, the login form is at /login with selectors [data-testid='email'], [data-testid='password'], and [data-testid='submit']."

Claude will pass auth credentials to `analyze_app`; Qulib signs in, then scans.

### OAuth, SSO, magic link, or anything that cannot be scripted

OAuth and similar flows need human consent on the provider domain; Qulib does not automate them. Use the **CLI** (same machine as the browser):

```bash
qulib auth init --base-url https://app.example.com
```

Log in manually in the opened window, press ENTER in the terminal, then reuse the saved JSON with:

```bash
qulib analyze --url https://app.example.com --auth-storage-state ./qulib-storage-state.json
```

For MCP-driven workflows, run `auth init` on the machine where the MCP server runs, then pass `auth: { type: 'storage-state', path: '/absolute/path/to/qulib-storage-state.json' }` to `analyze_app`.

### Detecting auth before you configure anything

> "Use qulib's `detect_auth` tool on https://app.example.com — what auth pattern does it use and what should I do next?"

The tool returns `type`, `oauthButtons`, `recommendation`, and related fields so the agent can explain options honestly.

## Known limitations

Qulib discovers routes by following **same-site** links from pages it visits; it is not a full multi-site crawler (no sitemap-first mode, no unbounded domain expansion). Treat the route list as a sample of what was reachable within `maxPagesToScan` and `maxDepth`.

## Repository

Source and issues: **[github.com/TapeshN/qulib](https://github.com/TapeshN/qulib)**.

## License

MIT