# ATR Rule Schema -- Agent Threat Rules
#
# Machine-readable form of the rule structure defined in SPEC.md
# (Section 5). When the two disagree, SPEC.md is normative.
#
# Status: Draft (tracks SPEC.md v1.0)
# License: MIT
# Canonical reference: https://github.com/Agent-Threat-Rule/agent-threat-rules/blob/main/SPEC.md

$schema: "https://json-schema.org/draft/2020-12/schema"
title: ATR Rule Schema
description: Schema for Agent Threat Rules (ATR) detection rules. Tracks SPEC.md v1.0.
version: "1.0.0"
type: object

required:
  - schema_version
  - title
  - id
  - status
  - description
  - author
  - date
  - severity
  - maturity
  - tags
  - agent_source
  - detection
  - response
  # Note (v1.1): detection_tier is now OPTIONAL. It was required by the
  # pre-1.0 spec drafts but is superseded by detection.method (atr-method-v1.1.md §4).
  # Rules MAY still set detection_tier for backward compatibility with older engines.

properties:
  # === Metadata ===
  schema_version:
    type: string
    description: "ATR schema version this rule conforms to (e.g., \"0.1\")"
  title:
    type: string
    description: Human-readable rule name
  id:
    type: string
    pattern: "^ATR-\\d{4}-\\d{5}$"
    description: "Unique rule identifier. Format: ATR-YYYY-NNNNN (e.g., ATR-2026-00001)"
  status:
    type: string
    enum: [draft, experimental, stable, deprecated]
    description: Rule maturity status
  description:
    type: string
    description: Detailed description of the attack this rule detects
  author:
    type: string
    description: Rule author or organization
  date:
    type: string
    pattern: "^\\d{4}/\\d{2}/\\d{2}$"
    description: "Creation date in YYYY/MM/DD format"
  modified:
    type: string
    pattern: "^\\d{4}/\\d{2}/\\d{2}$"
    description: "Last modification date in YYYY/MM/DD format"
  rule_version:
    type: integer
    minimum: 1
    description: "Rule version number. Bump when detection logic changes. Starts at 1."

  # === Classification ===
  detection_tier:
    type: string
    enum: [pattern, signature, semantic, behavioral, protocol, trace]
    description: >
      Detection approach used by this rule.
      OPTIONAL (v1.1: superseded by detection.method). Kept for backward
      compatibility with older engines. Aligned with the 5 method values in
      atr-method-v1.1.md plus the legacy "protocol" value for v1.0 conformance.
  maturity:
    type: string
    enum: [experimental, test, stable, deprecated]
    description: Maturity level of this rule

  # === Severity ===
  severity:
    type: string
    enum: [critical, high, medium, low, informational]
    description: Severity level of the detected threat

  # === References (alignment with existing frameworks) ===
  references:
    type: object
    description: Mappings to established security frameworks
    properties:
      owasp_llm:
        type: array
        items:
          type: string
        description: "OWASP LLM Top 10 references (e.g., LLM01:2025)"
      mitre_atlas:
        type: array
        items:
          type: string
        description: "MITRE ATLAS technique IDs (e.g., AML.T0054)"
      mitre_attack:
        type: array
        items:
          type: string
        description: "MITRE ATT&CK technique IDs (if applicable)"
      cve:
        type: array
        items:
          type: string
        description: Related CVE identifiers
      owasp_agentic:
        type: array
        items:
          type: string
        description: "OWASP Agentic Top 10 references (e.g., ASI01, ASI02)"
      owasp_ast:
        type: array
        items:
          type: string
        description: "OWASP Agentic Skills Top 10 references (e.g., AST01)"
      safe_mcp:
        type: array
        items:
          type: string
        description: "SAFE-MCP technique IDs (e.g., SMCP-T001)"
      oscal_assessment_objective:
        type: array
        items:
          type: string
        description: >
          OSCAL Assessment Plan/Result objective IDs or component-definition
          UUIDs this Rule supplies evidence for. Lets the rule act as an
          evidence source beneath an OSCAL-driven assessment.
          See atr-method-v1.1.md §9.
      nist_csf:
        type: array
        items:
          type: string
        description: >
          NIST CSF 2.0 subcategory identifiers (e.g., DE.CM-09, PR.IR-01).
          Required for citation in NIST IR 8596 Cyber AI Profile Informative
          References.
      etsi_ts_104223:
        type: array
        items:
          type: string
        description: >
          ETSI TS 104 223 principle / sub-principle identifiers (e.g., P4.3).
          The ETSI standard upstreamed UK NCSC's AI Cyber Code of Practice
          (Jan 2025); maps ATR Rules to the 13 principles / 72 sub-principles.
      probe_id:
        type: array
        items:
          type: string
        description: >
          Identifier of the adversarial probe (red-team generator) whose
          output this Rule is designed to detect. Format: tool name and probe
          name joined by ":", e.g. "pyrit:indirect_pi_v2" or
          "garak:promptinject.HijackHateHumans". Lets a Rule pair with its
          generating probe so detection coverage can be measured end-to-end
          against adversarial test suites. See atr-method-v1.1.md §9.2.
      external_references:
        type: object
        description: >
          Cross-references to detection rules in other vendor / community rule
          registries that cover the same or related threats. Lets ATR act as a
          taxonomy bridge across rule formats without claiming authority over
          the other registry's rule IDs. See atr-method-v1.1.md §9.4.
          Each property is an array of opaque identifiers in the target
          registry's native format. ATR engines MUST NOT execute these
          identifiers; they are evidence only. Downstream tooling MAY use them
          to enrich SIEM events, correlate detections, or generate OSCAL
          assessment results that span rule formats.
        properties:
          cccs_yara:
            type: array
            items:
              type: string
            description: >
              CCCS-Yara rule names (e.g., "APT_CN_BEACON_2024"). Per the
              2026-05-26 CCCS-Yara#100 closing comment, cross-reference
              ownership lives on the ATR side.
              See spec/external-registries/cccs-yara.md.
          sigma:
            type: array
            items:
              type: string
            description: >
              Sigma rule UUIDs (e.g., "12345678-1234-1234-1234-123456789abc")
              that cover the same or correlated threat. Lets ATR rules bridge
              into the wider Sigma ecosystem.
          yara:
            type: array
            items:
              type: string
            description: >
              Generic YARA rule names from public corpora (YARA-Forge,
              Florian Roth's signature-base, etc.) covering related artifacts.
          misp_taxonomy:
            type: array
            items:
              type: string
            description: >
              MISP taxonomy entries (e.g., "atr:category=prompt-injection" or
              "misp-taxonomies:dark-web=...") referencing this Rule.
          stix_pattern:
            type: array
            items:
              type: string
            description: >
              STIX 2.1 indicator pattern IDs covering the same Indicator of
              Compromise.
      research:
        type: array
        items:
          type: string
        description: "Research paper references or URLs"

  # === Tags (ATR classification) ===
  tags:
    type: object
    required: [category]
    properties:
      category:
        type: string
        enum:
          - prompt-injection
          - tool-poisoning
          - context-exfiltration
          - agent-manipulation
          - privilege-escalation
          - excessive-autonomy
          - data-poisoning
          - model-abuse
          - skill-compromise
        description: Primary attack category
      subcategory:
        type: string
        description: More specific classification within the category
      confidence:
        type: string
        enum: [high, medium, low]
        description: Expected accuracy of this rule (high = low false positive rate)
      scan_target:
        type: string
        enum: [mcp, skill, both, runtime]
        description: "Which scan path this rule belongs to. mcp=runtime events, skill=SKILL.md static scan, both=fires in both paths, runtime=behavior monitoring."

  # === Agent Source (analogous to Sigma's logsource) ===
  agent_source:
    type: object
    required: [type]
    description: >
      Defines what kind of agent data this rule inspects.
      Analogous to Sigma's logsource, but for agent behaviors.
    properties:
      type:
        type: string
        enum:
          - llm_io            # LLM input/output (prompts and completions)
          - tool_call         # Function/tool call requests
          - mcp_exchange      # MCP protocol messages
          - agent_behavior    # Agent behavioral metrics and patterns
          - multi_agent_comm  # Inter-agent communication
          - context_window    # Context window contents
          - memory_access     # Agent memory read/write operations
          - skill_lifecycle   # MCP skill registration, update, removal events
          - skill_permission  # Skill permission requests and boundary checks
          - skill_chain       # Multi-skill invocation sequences
          - agent_trace       # Agent execution trace (OpenInference/OTel GenAI spans); see atr-method-v1.1.md
        description: Type of agent data stream to monitor
      framework:
        type: array
        items:
          type: string
        description: >
          Applicable AI frameworks (e.g., langchain, crewai, autogen,
          openai, anthropic, custom, any)
      provider:
        type: array
        items:
          type: string
        description: >
          Applicable LLM providers (e.g., ollama, openai, anthropic, any)

  # === Detection Logic ===
  detection:
    type: object
    required: [conditions, condition]
    properties:
      method:
        type: string
        enum: [pattern, signature, semantic, behavioral, trace]
        default: pattern
        description: >
          Detection method this rule uses. Defaults to "pattern" (regex/string
          match on text fields) for backward compatibility with v1.0 rules.
          Other methods require additional fields documented in
          spec/atr-method-v1.1.md:
            - signature: exact-match on hash / package_name / registry_url (see §5)
            - semantic: LLM-as-judge intent classification (see §6)
            - behavioral: metric threshold over a time window (see §7)
            - trace: declarative assertions over agent execution traces (see §8)
          Engines that do not implement a given method MUST skip rules using
          it rather than fail closed on unknown method values.
      signature:
        type: object
        description: >
          REQUIRED when method=signature. See atr-method-v1.1.md §5.
        required: [indicators]
        properties:
          indicators:
            type: array
            minItems: 1
            description: "Non-empty list of indicator objects per §5.2.1"
            items:
              type: object
              required: [type, value, target_field]
              properties:
                type:
                  type: string
                  enum: [sha256, sha512, blake2b-256, package_name, registry_url, skill_id]
                  description: "Indicator type. Hash types require hex-encoded value (lowercase)."
                value:
                  type: string
                  description: "Indicator value (hex hash or string identifier)"
                target_field:
                  type: string
                  description: "Source field on the Input to match against (e.g., skill.content, skill.manifest.name)"
                provenance:
                  type: object
                  description: "OPTIONAL forensic provenance metadata"
                  properties:
                    first_observed:
                      type: string
                      description: "ISO 8601 date when indicator was first attributed"
                    source:
                      type: string
                    attribution:
                      type: string
          match_logic:
            type: string
            enum: [any, all]
            default: any
            description: "any = match if any indicator matches; all = match only if every indicator matches"
      semantic:
        type: object
        description: >
          REQUIRED when method=semantic. See atr-method-v1.1.md §6.
        properties:
          judge_model_class:
            type: string
            description: "Class of judge model (e.g., gpt-4-class, llama-prompt-guard, claude-haiku)"
          prompt_template:
            type: string
            description: "Prompt template with {{input}} placeholder"
          output_schema:
            type: object
            description: "Expected JSON shape of judge output (category, confidence, evidence)"
          threshold:
            type: number
            minimum: 0.0
            maximum: 1.0
            description: "Minimum confidence to trigger match"
          cache_ttl:
            type: integer
            description: "Cache TTL in seconds for identical inputs"
          judge_prompt_hash:
            type: string
            description: "SHA-256 hash of the canonical judge prompt for regression testing"
          fallback_method:
            type: string
            enum: [pattern, none]
            description: "Method to fall back to if judge is unavailable"
      trace:
        type: object
        description: >
          REQUIRED when method=trace. See atr-method-v1.1.md §8.
        properties:
          ingest_format:
            type: string
            enum: [openinference, otel_gen_ai]
            default: openinference
            description: "Trace ingest format the rule expects"
          forbid:
            type: array
            description: "Span shapes that MUST NOT appear in the trace"
            items: {type: object}
          require:
            type: array
            description: "Span shapes that MUST appear (optionally with ordering constraints)"
            items: {type: object}
          invariant:
            type: array
            description: "Attributes that MUST hold across a set of spans"
            items: {type: object}
      behavioral:
        type: object
        description: >
          REQUIRED when method=behavioral. See atr-method-v1.1.md §7.
        required: [metric, aggregation, window, operator, threshold]
        properties:
          metric:
            type: string
            description: "Name of the metric being observed (e.g., tool_calls_per_session, token_spend_usd)"
          aggregation:
            type: string
            enum: [count, sum, avg, max, distinct_count, rate]
            description: "How event values aggregate into a single metric value over the window"
          window:
            type: string
            description: "ISO 8601 duration (e.g., PT5M, PT1H) or shorthand (5m, 1h)"
          operator:
            type: string
            enum: [gt, lt, gte, lte, eq, deviation_from_baseline]
            description: "Comparison operator between aggregated metric and threshold"
          threshold:
            type: number
            description: "Numeric value compared against the aggregated metric. For deviation_from_baseline, expressed as stddev multiplier or fractional change."
          group_by:
            type: array
            items: {type: string}
            description: "Dimensions to partition the aggregation over (e.g., session.id, user.id)"
          filter:
            type: object
            description: "Pre-aggregation event filter using §8.3 predicate vocabulary"
          baseline:
            type: object
            description: "Required only when operator=deviation_from_baseline"
            properties:
              source:
                type: string
                enum: [rolling_mean, historical_percentile, fixed]
              lookback:
                type: string
                description: "Duration to compute baseline over (e.g., P7D)"
              percentile:
                type: number
                minimum: 0
                maximum: 100
              value:
                type: number
              deviation_unit:
                type: string
                enum: [stddev, fraction]
          min_events:
            type: integer
            minimum: 1
            description: "Minimum event count in window before rule may fire"
          cooldown:
            type: string
            description: "ISO 8601 duration the rule must not re-fire on same group_by partition after Match"
      conditions:
        description: >
          Detection conditions. Supports two formats:
            1. Array format (recommended): List of {field, operator, value} objects
            2. Named-map format: Named condition blocks for complex detection logic
        oneOf:
          # -- Array format (used by most rules) --
          - type: array
            items:
              type: object
              required: [field, operator, value]
              properties:
                field:
                  type: string
                  description: >
                    Field to inspect (e.g., user_input, agent_output,
                    tool_response, tool_name, tool_args, content)
                operator:
                  type: string
                  enum: [regex, contains, exact, starts_with]
                  description: How the value is matched against the field
                value:
                  type: string
                  description: Pattern to match (regex string if operator is regex)
                description:
                  type: string
                  description: Human-readable description of what this condition detects
                language:
                  type: string
                  enum: [en, zh-Hant, zh-Hans, ja, es, ar]
                  default: en
                  description: >
                    BCP-47 language tag this condition targets. Optional;
                    default 'en'. Engine applies NFKC normalization at match
                    time. Per-language conditions on the same rule are combined
                    under condition: any. Adopted v3.0.0 (2026-05-18).
          # -- Named-map format (for complex/behavioral detection) --
          - type: object
            description: Named condition blocks (referenced by the condition expression)
            additionalProperties:
              type: object
              properties:
                field:
                  type: string
                  description: Field to inspect
                patterns:
                  type: array
                  items:
                    type: string
                  description: Patterns to match against the field value
                match_type:
                  type: string
                  enum: [contains, regex, exact, starts_with]
                  description: How patterns are matched
                case_sensitive:
                  type: boolean
                  default: false
                metric:
                  type: string
                  description: Behavioral metric to evaluate (v0.2+)
                operator:
                  type: string
                  enum: [gt, lt, eq, gte, lte, deviation_from_baseline]
                  description: Comparison operator for behavioral thresholds
                threshold:
                  type: number
                  description: Numeric threshold for the metric
                window:
                  type: string
                  description: "Time window for behavioral analysis (e.g., 5m, 1h, 30s)"
                ordered:
                  type: boolean
                  description: Whether steps must occur in order
                within:
                  type: string
                  description: Maximum time span for the full sequence
                steps:
                  type: array
                  items:
                    type: object
                  description: Ordered list of conditions that form the attack sequence
      condition:
        type: string
        description: >
          How to combine conditions. Use "any" or "or" for match-any,
          "all" or "and" for match-all.
          Example: "pattern_match AND behavioral"
      false_positives:
        type: array
        items:
          type: string
        description: Known scenarios that may trigger false positives

  # === Response Actions (ATR-specific, not in Sigma) ===
  response:
    type: object
    required: [actions]
    properties:
      actions:
        type: array
        items:
          type: string
          enum:
            # v1.0 vocabulary
            - block_input           # Reject the user/agent input
            - block_output          # Suppress the agent output
            - block_tool            # Prevent the tool call from executing
            - quarantine_session    # Isolate the entire session
            - reset_context         # Clear agent context/memory
            - alert                 # Send alert to security team
            - snapshot              # Capture full session state for forensics
            - escalate              # Escalate to human reviewer
            - reduce_permissions    # Reduce agent's available tools/capabilities
            - kill_agent            # Terminate the agent process
            # SPEC.md Appendix A canonical action vocabulary (v1.0+)
            - block_request         # Reject the originating request (generic)
            - log_alert             # Emit a structured alert event without blocking
            - quarantine_artifact   # Isolate a specific artifact (skill, tool, context blob)
            - require_human_review  # Pause the action pending operator approval
            - redact_match          # Hash or truncate matched substring in output
            - rate_limit_source     # Apply rate limit to the source agent/user/session
            - revoke_credential     # Revoke an active credential identified in the match
            - notify_operator       # Out-of-band notification (paging, email, chat)
        description: Actions to take when the rule triggers
      auto_response_threshold:
        type: string
        enum:
          - low
          - medium
          - high
          - critical
        description: >
          Severity threshold for automatic response.
          Below this threshold, only alert; above, execute response actions.
      message_template:
        type: string
        description: >
          Template for alert messages.
          Supports placeholders: {matched_pattern}, {truncated_input},
          {truncated_output}, {source_ip_or_user}, {tool_name},
          {mcp_server_url}, {rule_id}, {severity}

  # === Test Cases ===
  test_cases:
    type: object
    description: Validation test cases shipped with the rule
    properties:
      true_positives:
        type: array
        items:
          type: object
          properties:
            input:
              type: string
            tool_response:
              type: string
            agent_output:
              type: string
            expected:
              type: string
              enum: [triggered]
            description:
              type: string
        description: Inputs that SHOULD trigger this rule
      true_negatives:
        type: array
        items:
          type: object
          properties:
            input:
              type: string
            tool_response:
              type: string
            agent_output:
              type: string
            expected:
              type: string
              enum: [not_triggered]
            description:
              type: string
        description: Inputs that should NOT trigger this rule

  # === Evasion Tests ===
  evasion_tests:
    type: array
    description: Optional test cases for known evasion/bypass techniques
    items:
      type: object
      properties:
        input:
          type: string
          description: The evasion attempt input
        expected:
          type: string
          description: Expected detection outcome
        bypass_technique:
          type: string
          description: Name or description of the bypass technique used
        notes:
          type: string
          description: Additional notes about the evasion test
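
# === Illustrative example (not part of the schema) ===
# A minimal sketch of a rule document that supplies every field listed under
# `required`, using the array format for detection.conditions and leaving
# detection.method at its default ("pattern"). The id and schema_version
# reuse the examples given in the property descriptions; the title, author,
# date, regex, and description text are hypothetical placeholders and do not
# correspond to a shipped rule. Kept as comments so this file remains a
# valid schema.
#
#   schema_version: "0.1"
#   title: Example instruction-override phrase
#   id: ATR-2026-00001
#   status: draft
#   description: Flags user input that asks the agent to ignore earlier instructions.
#   author: Example Author
#   date: "2026/01/01"
#   severity: medium
#   maturity: experimental
#   tags:
#     category: prompt-injection
#   agent_source:
#     type: llm_io
#   detection:
#     conditions:
#       - field: user_input
#         operator: regex
#         value: "(?i)ignore (all )?previous instructions"
#     condition: any
#   response:
#     actions: [alert]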