# Prompt Hardener

Prompt Hardener analyzes **prompt-injection risk in LLM-based agents and applications**.

It gives you a single workflow to:

- describe your system in `agent_spec.yaml`
- run deterministic security analysis across prompt, tool, and architecture layers
- generate mitigations
- validate defenses with adversarial attack simulation
- export results as Markdown, HTML, or JSON

Prompt Hardener is designed for developers and security engineers who want to understand **how prompt injection can affect an agent** and **how to reduce that risk**.

## Why use it?

- **One spec for multiple agent types**: chatbot, RAG, tool-calling agent, and MCP agent
- **Deterministic first**: `init`, `validate`, `analyze`, `report`, and `diff` do not require an LLM API key
- **Layered security view**: inspect prompt, tool, and architecture risks separately
- **Practical remediation**: get recommended fixes for prompts, policies, tools, and trust boundaries
- **Attack validation**: test your spec against built-in adversarial scenarios
- **CI-friendly output**: export Markdown, HTML, or JSON
- **Interactive UI**: explore the workflow from a local Gradio app

## Installation

Choose the installation method that fits how you want to use Prompt Hardener.

### Using [pipx](https://pipx.pypa.io/)

Recommended when you want to install Prompt Hardener as an isolated CLI tool.

```bash
pipx install https://github.com/cybozu/prompt-hardener/releases/download/v0.6.0/prompt_hardener-0.6.0-py3-none-any.whl
```

### Using [uv](https://docs.astral.sh/uv/)

Recommended if you already use `uv` for Python tooling.

```bash
uv tool install https://github.com/cybozu/prompt-hardener/releases/download/v0.6.0/prompt_hardener-0.6.0-py3-none-any.whl
```

### Using pip

Use this if you prefer a standard Python environment.

```bash
pip install https://github.com/cybozu/prompt-hardener/releases/download/v0.6.0/prompt_hardener-0.6.0-py3-none-any.whl

# Or install the latest code from main
pip install git+https://github.com/cybozu/prompt-hardener.git
```

> **Note**
> If you see an `externally-managed-environment` error, use `pipx` or `uv`, or create a virtual environment first.

### For development

```bash
git clone https://github.com/cybozu/prompt-hardener.git
cd prompt-hardener
uv sync --extra dev --frozen

# Fallback if uv is unavailable
python3 -m venv .venv
source .venv/bin/activate
pip install -e .[dev]
```

## Quick start

The fastest way to try Prompt Hardener is to use one of the included example specs.

### 1) Install

```bash
# Install with uv
uv tool install https://github.com/cybozu/prompt-hardener/releases/download/v0.6.0/prompt_hardener-0.6.0-py3-none-any.whl
```

If you prefer `pipx`, `pip`, or a development install, see [Installation](#installation).

### 2) Analyze an example spec

```bash
cp examples/chatbot-minimal/agent_spec.yaml ./agent_spec.yaml

prompt-hardener validate agent_spec.yaml
prompt-hardener analyze agent_spec.yaml --format markdown
```

That gives you a complete static analysis run without setting any API credentials.

### 3) Export a shareable report

```bash
prompt-hardener analyze agent_spec.yaml -o analyze.json
prompt-hardener report analyze.json -f html -o report.html
```

## When do you need LLM API keys?

Most of the workflow is deterministic.

| Command | API key required? | Notes |
|---|---|---|
| `init` | No | Generate a starter `agent_spec.yaml` |
| `validate` | No | Schema + semantic validation |
| `analyze` | No | Static rule-based analysis |
| `report` | No | Render JSON results as Markdown / HTML / JSON |
| `diff` | No | Compare two specs |
| `remediate` | Sometimes | Prompt-layer remediation needs an LLM; `--layers tool architecture` stays deterministic |
| `simulate` | Yes | Attack simulation is LLM-backed |

Prompt Hardener supports these providers for LLM-backed commands:

- OpenAI
- Anthropic Claude
- AWS Bedrock (Claude v3 or newer)

### Example environment variables

```bash
# OpenAI
export OPENAI_API_KEY=...

# Anthropic Claude
export ANTHROPIC_API_KEY=...

# AWS Bedrock (alternative: use an AWS profile)
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_SESSION_TOKEN=...
```

## Your first real workflow

Once you are ready to analyze your own system, the standard workflow is:

```text
init -> validate -> analyze -> remediate -> simulate -> report -> diff
```

### Create a starter spec

```bash
prompt-hardener init --type chatbot -o agent_spec.yaml
prompt-hardener init --type rag -o agent_spec.yaml
prompt-hardener init --type agent -o agent_spec.yaml
prompt-hardener init --type mcp-agent -o agent_spec.yaml
```

Edit the generated file to match your system, then run the rest of the pipeline.

### Validate

```bash
prompt-hardener validate agent_spec.yaml
```

### Analyze

```bash
prompt-hardener analyze agent_spec.yaml --format markdown
```

To limit analysis to specific layers:

```bash
prompt-hardener analyze agent_spec.yaml --layers prompt tool architecture
```

### Remediate

```bash
prompt-hardener remediate agent_spec.yaml \
  -ea openai -em gpt-4o-mini \
  -o hardened.yaml \
  -rd ./reports
```

If you only want deterministic remediation suggestions for non-prompt layers:

```bash
prompt-hardener remediate agent_spec.yaml \
  --layers tool architecture \
  -ea openai -em gpt-4o-mini
```

### Simulate attacks

```bash
prompt-hardener simulate hardened.yaml \
  -ea openai -em gpt-4o-mini \
  -o simulation.json
```

Filter simulation by category or layer when you want a focused test:

```bash
prompt-hardener simulate hardened.yaml \
  -ea openai -em gpt-4o-mini \
  --categories "persona_switch,prompt_leaking" \
  --layers "prompt,tool" \
  -o simulation.json
```

### Compare before and after

```bash
prompt-hardener diff agent_spec.yaml hardened.yaml
```

## Supported agent types

| Type | Description | Analyzed layers |
|---|---|---|
| `chatbot` | Simple conversational bots | prompt |
| `rag` | Retrieval-augmented generation systems | prompt, architecture |
| `agent` | Tool-calling agents | prompt, tool, architecture |
| `mcp-agent` | MCP-connected agents | prompt, tool, architecture |

## What goes into `agent_spec.yaml`?

`agent_spec.yaml` is the single input to the main workflow.

At minimum, you describe:

- the agent type
- the system prompt
- the provider/model
- optional tools, policies, data sources, or MCP servers depending on agent type

If writing the spec from scratch is the bottleneck, we also provide [`agent-spec-builder`](./.agents/skills/agent-spec-builder/SKILL.md), an Agent Skill. It helps create a first draft that you can refine into `agent_spec.yaml` and then validate with Prompt Hardener.

For complete field definitions and full examples, see [docs/agent-spec.md](./docs/agent-spec.md).

## Included examples

Use these to understand the spec format and try the tool quickly:

- `examples/chatbot-minimal`
- `examples/rag-internal-assistant`
- `examples/agent-basic`

## Hardening techniques

Prompt-layer remediation can apply these techniques:

- `spotlighting`
- `random_sequence_enclosure`
- `instruction_defense`
- `role_consistency`
- `secrets_exclusion`

See [docs/techniques.md](./docs/techniques.md) for details and examples.

## Attack simulation

The simulator runs adversarial scenarios against your spec so you can validate defenses, not just lint configuration.

See [docs/attack-simulation.md](./docs/attack-simulation.md) for the built-in catalog and custom scenario format.

## Reports

After `analyze`, `remediate`, or `simulate`, you can render JSON results into a friendlier format:

```bash
prompt-hardener report results.json -f markdown
prompt-hardener report results.json -f html -o report.html
```

## Web UI

```bash
prompt-hardener webui
```

Then open `http://localhost:7860` in your browser.

## Documentations

- Rule catalog: [docs/analysis-rules.md](./docs/analysis-rules.md)
- Agent spec reference: [docs/agent-spec.md](./docs/agent-spec.md)
- Attack simulation: [docs/attack-simulation.md](./docs/attack-simulation.md)
- Techniques: [docs/techniques.md](./docs/techniques.md)
- Tutorials: [docs/tutorials.md](./docs/tutorials.md)

## Legacy commands

`evaluate` and `improve` are still available for backward compatibility with v0.4.0 workflows.

New users should start with the spec-based workflow above. For legacy usage details, run:

```bash
prompt-hardener evaluate --help
prompt-hardener improve --help
```

## License

Apache-2.0