# Policy Authoring Guide

Write custom OPA Rego policies to control Dockerfile generation, Kubernetes manifests, and security enforcement in containerization-assist.

## Overview

### What Are Policies?

Policies in containerization-assist are OPA Rego modules that control and customize the containerization workflow. They enable you to:

- **Pre-configure** tool behavior before generation
- **Filter and prioritize** knowledge recommendations
- **Inject** organization-specific templates
- **Validate** generated artifacts against compliance rules
- **Customize** behavior by environment, language, cloud provider, etc.

### Policy Lifecycle

```
┌─────────────────┐
│ Input (Tool     │
│ + Context)      │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ 1. Pre-Gen      │  generation_config
│    Configuration│  (Set defaults, constraints)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ 2. Generation   │  knowledge_filtering, templates
│    Time         │  (Filter/inject recommendations)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ 3. Post-Gen     │  validation_rules
│    Validation   │  (Check compliance)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Output          │
│ (Validated)     │
└─────────────────┘
```

### File Structure

```
my-policy.rego         # Policy implementation
my-policy_test.rego    # OPA test suite (required)
```

---

## Template Injection

Template injection lets you automatically inject organizational standards into generated artifacts.

### Quick Start

1. **Create a template policy** (`policies/my-templates.rego`):

   ```rego
   package containerization.templates

   import rego.v1

   ca_cert_template := {
       "id": "org-ca-certs",
       "section": "security",
       "description": "Install organization CA certificates",
       "content": "COPY certs/ca.crt /usr/local/share/ca-certificates/\nRUN update-ca-certificates",
       "priority": 100
   }

   dockerfile_templates contains ca_cert_template

   templates := {
       "dockerfile": [template | template := dockerfile_templates[_]],
       "kubernetes": []
   }
   ```

2. **Use the policy**:

   ```bash
   export CUSTOM_POLICY_PATH=policies/my-templates.rego
   containerization-assist generate-dockerfile --language node --environment production
   ```

3. **See templates in output**:
   - Templates appear in recommendations with `policyDriven: true`
   - Templates are injected automatically, without user intervention

For complete examples, see:

- [Template Injection Examples](../examples/template-injection-example.md)

---

## Policy Architecture

### Basic Structure

Every policy follows this template:

```rego
# Policy header with metadata
package containerassist.my_policy

import rego.v1

# ============================================================================
# Configuration (Phase 1: Pre-Generation)
# ============================================================================

generation_config contains config if {
    input.tool == "generate-dockerfile"

    # Your pre-generation configuration logic
    config := {
        "baseImage": "node:20-alpine",
        "requireNonRoot": true,
    }
}

# ============================================================================
# Knowledge Filtering (Phase 2: Generation-Time)
# ============================================================================

knowledge_filtering contains filter if {
    # Filter knowledge recommendations
    filter := {
        "action": "exclude",
        "pattern": "*-deprecated-*",
        "reason": "Exclude deprecated patterns",
    }
}

# ============================================================================
# Templates (Phase 2: Generation-Time)
# ============================================================================

templates contains template if {
    # Inject organization-specific templates
    template := {
        "id": "my-org-template",
        "category": "security",
        "recommendation": "Add company CA certificates",
        "code_snippet": "COPY ca-certs.pem /etc/ssl/certs/",
        "policyDriven": true,
    }
}

# ============================================================================
# Validation (Phase 3: Post-Generation)
# ============================================================================

validation_rules contains rule if {
    # Validate generated content
    input.content != null

    # Your validation logic
    rule := {
        "level": "error", # or "warning", "info"
        "message": "Validation failed",
        "suggestion": "How to fix it",
    }
}

# ============================================================================
# Metadata
# ============================================================================

metadata := {
    "name": "My Policy",
    "version": "1.0.0",
    "description": "Policy description",
}
```

---

## Phase-by-Phase Guide

### Phase 1: Pre-Generation Configuration

- **When:** Before tool execution starts
- **Purpose:** Set defaults, constraints, and configuration
- **Returns:** Set of configuration objects

#### Example: Dockerfile Generation Config

```rego
generation_config contains config if {
    input.tool == "generate-dockerfile"
    input.environment == "production"

    config := {
        "baseImage": "gcr.io/distroless/nodejs20-debian12",
        "requireNonRoot": true,
        "requireHealthCheck": true,
        "enableMultiStage": true,
        "optimizationLevel": "aggressive",
    }
}
```

#### Example: Kubernetes Generation Config

```rego
generation_config contains config if {
    input.tool == "generate-k8s-manifests"

    # Calculate resource limits based on tier
    tier_cpu := tier_cpu_limits[input.tier]
    tier_memory := tier_memory_limits[input.tier]

    config := {
        "resources": {
            "requests": {
                "cpu": sprintf("%dm", [tier_cpu * 0.5]),
                "memory": sprintf("%dMi", [tier_memory * 0.75]),
            },
            "limits": {
                "cpu": sprintf("%dm", [tier_cpu]),
                "memory": sprintf("%dMi", [tier_memory]),
            },
        },
        "replicas": tier_replicas[input.tier],
        "enableHPA": input.tier != "starter",
    }
}

# Helper data structures (CPU in millicores, memory in MiB),
# keyed by the tier names listed in the Input Schema
tier_cpu_limits := {"starter": 500, "professional": 2000, "enterprise": 8000}
tier_memory_limits := {"starter": 512, "professional": 2048, "enterprise": 8192}
tier_replicas := {"starter": 1, "professional": 3, "enterprise": 5}
```

#### Available Configuration Keys

**Dockerfile:**

- `baseImage`: Override default base image
- `requireNonRoot`: Enforce non-root user
- `requireHealthCheck`: Mandate HEALTHCHECK directive
- `enableMultiStage`: Force multi-stage builds
- `optimizationLevel`: "aggressive", "balanced", "quality"
- `includeDevTools`: Include development tools
- `includeBuildTools`: Include build-time dependencies

**Kubernetes:**

- `resources.requests`: CPU/memory requests
- `resources.limits`: CPU/memory limits
- `replicas`: Number of pod replicas
- `enableHPA`: Enable HorizontalPodAutoscaler
- `securityContext`: Pod security context
- `networkPolicy`: "required", "recommended", "optional"
- `podSecurityStandard`: "privileged", "baseline", "restricted"

---

### Phase 2a: Knowledge Filtering

- **When:** During tool execution
- **Purpose:** Filter/prioritize knowledge recommendations
- **Returns:** Set of filter rules

#### Exclude Patterns

```rego
# Block deprecated recommendations
knowledge_filtering contains filter if {
    filter := {
        "action": "exclude",
        "pattern": "*-deprecated-*",
        "reason": "Deprecated patterns not allowed",
    }
}

# Environment-specific exclusions
knowledge_filtering contains filter if {
    input.environment == "production"

    filter := {
        "action": "exclude",
        "pattern": "debug-*",
        "reason": "Debug tools not allowed in production",
    }
}
```

#### Prioritize Patterns

```rego
# Boost security recommendations
knowledge_filtering contains filter if {
    filter := {
        "action": "prioritize",
        "tags": ["security", "hardening"],
        "weight": 2.0, # 2x priority
        "reason": "Security is top priority",
    }
}

# Cloud-specific prioritization
knowledge_filtering contains filter if {
    input.cloudProvider == "aws"

    filter := {
        "action": "prioritize",
        "tags": ["ecr", "aws"],
        "weight": 1.5,
        "reason": "Prefer AWS-native solutions",
    }
}
```

#### Filter Actions

- `exclude`: Remove matching knowledge entries
- `prioritize`: Boost weight of matching entries
- `deprioritize`: Reduce weight of matching entries

#### Pattern Matching

- `*` wildcard: `"node-*"` matches `"node-security-scan"`
- Tag matching: `["security", "dockerfile"]`
- ID matching: `"dockerfile-user-root"`

---

### Phase 2b: Template Injection

- **When:** During tool execution
- **Purpose:** Add organization-specific recommendations
- **Returns:** Set of templates to inject

#### Basic Template

```rego
templates contains template if {
    input.tool == "generate-dockerfile"

    template := {
        "id": "org-ca-certificates",
        "category": "security",
        "recommendation": "Install company CA certificates",
        "code_snippet": `# Company CA certificates
COPY certificates/ca-bundle.crt /etc/ssl/certs/company-ca.crt
ENV SSL_CERT_FILE=/etc/ssl/certs/company-ca.crt`,
        "policyDriven": true,
        "priority": "high",
    }
}
```

#### Conditional Templates

```rego
# Only for production Java apps
templates contains template if {
    input.tool == "generate-dockerfile"
    input.environment == "production"
    lower(input.language) == "java"

    # The agent is downloaded to the same path that JAVA_TOOL_OPTIONS references
    template := {
        "id": "org-java-observability",
        "category": "monitoring",
        "recommendation": "Add Datadog APM agent",
        "code_snippet": `# Datadog APM
RUN wget -O /app/dd-java-agent.jar https://dtdg.co/latest-java-tracer
ENV JAVA_TOOL_OPTIONS=-javaagent:/app/dd-java-agent.jar`,
        "policyDriven": true,
    }
}
```

#### Template Structure

Required fields:

- `id`: Unique identifier
- `category`: "security", "optimization", "monitoring", etc.
- `recommendation`: Human-readable description
- `code_snippet`: Code to inject
- `policyDriven`: Always `true` for policy-injected templates

Optional fields:

- `priority`: "critical", "high", "medium", "low"
- `tags`: `["production", "java"]`
- `documentation`: Link to internal docs

---

### Phase 3: Post-Generation Validation

- **When:** After content is generated
- **Purpose:** Validate against compliance rules
- **Returns:** Set of validation rules (violations/warnings)

#### Error Rules (Blocking)

```rego
validation_rules contains rule if {
    input.tool == "generate-dockerfile"
    input.content != null

    # Check for root user
    contains(lower(input.content), "user root")

    rule := {
        "level": "error", # Blocks generation
        "message": "Root user detected in Dockerfile",
        "suggestion": "Add USER directive with non-root user (e.g., USER 65534)",
    }
}
```

#### Warning Rules (Non-blocking)

```rego
validation_rules contains rule if {
    input.tool == "generate-k8s-manifests"
    input.content != null

    # Check resource limits (parse_cpu is the helper defined under "Use Helper Functions")
    cpu_limit := parse_cpu(input.content.resources.limits.cpu)
    cpu_limit > 4000 # > 4 CPU

    rule := {
        "level": "warning", # Doesn't block
        "message": sprintf("High CPU limit: %dm", [cpu_limit]),
        "suggestion": "Consider reducing CPU limit to save costs",
    }
}
```

#### Info Rules (Advisory)

```rego
validation_rules contains rule if {
    input.environment == "development"

    rule := {
        "level": "info",
        "message": "Running in development mode",
        "suggestion": "Remember to use production policy before deploying",
    }
}
```

---

## Schema Reference

The blocks in this section are schema notation rather than runnable Rego: `|` separates the allowed values and `...` marks omitted entries.

### Input Schema

The `input` object contains tool context:

```rego
input := {
    # Required
    "tool": "generate-dockerfile" | "generate-k8s-manifests" | ...,

    # Common
    "environment": "development" | "staging" | "production",
    "language": "node" | "python" | "java" | "go" | ...,

    # Tool-specific
    "repositoryPath": "/path/to/repo",
    "targetPlatform": "linux/amd64",
    "name": "my-app",
    "version": "1.0.0",

    # Custom (your organization)
    "tier": "starter" | "professional" | "enterprise",
    "cloudProvider": "aws" | "gcp" | "azure",
    "region": "us-east-1",
    "teamId": "platform-team",

    # Post-generation only
    "content": "..." | {...}, # Generated artifact
}
```

### Output Schema

#### generation_config

```rego
config := {
    # Any key-value pairs
    "baseImage": "node:20",
    "resources": {...},
    ...
}
```

#### knowledge_filtering

```rego
filter := {
    "action": "exclude" | "prioritize" | "deprioritize",
    "pattern": "*-pattern-*", # For pattern matching
    "tags": ["tag1", "tag2"], # For tag matching
    "weight": 2.0, # For prioritize/deprioritize
    "reason": "Why this filter",
}
```

#### templates

```rego
template := {
    "id": "unique-id",
    "category": "security" | "optimization" | ...,
    "recommendation": "Human description",
    "code_snippet": "Code to inject",
    "policyDriven": true,
    "priority": "critical" | "high" | "medium" | "low", # Optional
    "tags": ["tag1"], # Optional
    "documentation": "https://...", # Optional
}
```

#### validation_rules

```rego
rule := {
    "level": "error" | "warning" | "info",
    "message": "What went wrong",
    "suggestion": "How to fix it",
}
```

---

## Best Practices

### 1. Use Descriptive IDs

```rego
# ✅ Good
"id": "org-security-ca-certificates"

# ❌ Bad
"id": "template1"
```

### 2. Provide Helpful Messages

```rego
# ✅ Good
rule := {
    "level": "error",
    "message": "CPU limit (8000m) exceeds starter tier allowance (500m)",
    "suggestion": "Reduce CPU limit to 500m or upgrade to Professional tier"
}

# ❌ Bad
rule := {
    "level": "error",
    "message": "CPU too high",
    "suggestion": "Fix it"
}
```

### 3. Test Everything

Every policy should have comprehensive tests:

```rego
# my-policy_test.rego
package containerassist.my_policy_test

import rego.v1

import data.containerassist.my_policy

test_production_uses_distroless if {
    # generation_config is a set of config objects, so evaluate it with the
    # mocked input first and then check its members
    configs := my_policy.generation_config with input as {
        "tool": "generate-dockerfile",
        "environment": "production",
    }

    some config in configs
    contains(config.baseImage, "distroless")
}
```

Run tests:

```bash
opa test my-policy.rego my-policy_test.rego -v
```

### 4. Use Helper Functions

```rego
# Extract repeated logic

# "500m" -> 500 (value already in millicores)
parse_cpu(cpu_str) := millicores if {
    endswith(cpu_str, "m")
    trimmed := trim_suffix(cpu_str, "m")
    millicores := to_number(trimmed)
}

# "2" -> 2000 (whole cores converted to millicores)
parse_cpu(cpu_str) := millicores if {
    not endswith(cpu_str, "m")
    cores := to_number(cpu_str)
    millicores := cores * 1000
}
```

### 5. Environment-Aware Rules

```rego
# Strict in production
validation_rules contains rule if {
    input.environment == "production"
    has_issue(input.content)
    rule := {"level": "error", ...}
}

# Lenient in development
validation_rules contains rule if {
    input.environment == "development"
    has_issue(input.content)
    rule := {"level": "warning", ...}
}
```

---

## Debugging

### Policy Simulation Tool (Recommended)

The **policy simulation tool** shows how your custom policy combines with the built-in system by running tools with and without your policy:

```bash
# Simulate your policy
npm run policy:simulate -- \
  --policy policies.user.examples/my-policy.rego \
  --tool generate-dockerfile \
  --input '{"language": "node", "environment": "production", "teamTier": "starter"}'
```

**What it shows:**

- ✅ Generation configuration changes
- ✅ Before/After output comparison
- ✅ Policy-driven recommendations highlighted
- ✅ Validation rules triggered

**Example output:**

```
================================================================================
📈 SIMULATION RESULTS
================================================================================

📊 Impact Summary:
   • Generation Config: ✅ Modified
   • Output Changed: ✅ Yes

📦 Output Comparison:

   WITHOUT Policy:
      Summary: Standard Dockerfile recommendations
      Recommendations: 10 total

   WITH Policy:
      Summary: Policy-customized Dockerfile
      Recommendations: 15 total
      Policy-Driven: 5 recommendations
         • org-ca-certificates: Install company CA certificates
         • tier-resource-limits: Apply tier-based resource limits
```

**Use cases:**

- Preview policy impact before deployment
- Understand how custom policy combines with built-in policies
- Debug unexpected policy behavior
- Validate policy changes

### Test Policy in Isolation

To test individual policy rules in isolation (this does not show how they integrate with the built-in policies), pipe the input document to `opa eval`. The `--stdin-input` flag is required; without it `opa eval` ignores stdin and evaluates with no input.

```bash
# Test generation_config
echo '{"tool": "generate-dockerfile", "environment": "production"}' | \
  opa eval --stdin-input --data my-policy.rego \
  'data.containerassist.my_policy.generation_config'

# Test templates
echo '{"tool": "generate-k8s-manifests", "language": "java"}' | \
  opa eval --stdin-input --data my-policy.rego \
  'data.containerassist.my_policy.templates'
```

### Enable Debug Logging

Set the `LOG_LEVEL` environment variable:

```bash
export LOG_LEVEL=debug
```

### Check Policy Syntax

```bash
opa check my-policy.rego
```

### Run with Coverage

```bash
opa test --coverage my-policy.rego my-policy_test.rego
```

### Trace Policy Evaluation

Add `trace` calls inside a rule body. The notes are printed when you evaluate with `opa eval --explain=notes`.

```rego
# Add trace statements
trace(sprintf("Config: %v", [config]))
```

---

## Common Pitfalls

### 1. Forgetting `import rego.v1`

```rego
# ❌ On OPA versions before 1.0, the `if` and `contains` keywords fail to parse
package containerassist.my_policy

# ✅ Always import
package containerassist.my_policy

import rego.v1
```

### 2. Missing Conditionals

```rego
# ❌ Fires for all tools
generation_config contains config if {
    config := {"baseImage": "node:20"}
}

# ✅ Tool-specific
generation_config contains config if {
    input.tool == "generate-dockerfile"
    config := {"baseImage": "node:20"}
}
```

### 3. Not Handling Null/Missing Values

```rego
# ❌ Undefined if input.tier is missing, so the enclosing rule silently produces nothing
tier_cpu_limits[input.tier]

# ✅ Safe with default
tier := object.get(input, "tier", "starter")
tier_cpu_limits[tier]
```

### 4. Inefficient Validation

```rego
# ❌ Calls contains() even when content is missing or not a string
validation_rules contains rule if {
    contains(input.content, "USER root") # Undefined or a type error; no rule is emitted
    rule := {"level": "error", "message": "Root user detected in Dockerfile"}
}

# ✅ Guard with null and type checks
validation_rules contains rule if {
    input.content != null
    is_string(input.content)
    contains(input.content, "USER root")
    rule := {"level": "error", "message": "Root user detected in Dockerfile"}
}
```

### 5. Overly Broad Patterns

```rego
# ❌ Blocks too much
knowledge_filtering contains filter if {
    filter := {"action": "exclude", "pattern": "*"}
}

# ✅ Specific patterns
knowledge_filtering contains filter if {
    filter := {"action": "exclude", "pattern": "*-deprecated-*"}
}
```

---

## Additional Resources

- [OPA Documentation](https://www.openpolicyagent.org/docs/latest/)
- [Rego Policy Language Reference](https://www.openpolicyagent.org/docs/latest/policy-language/)
- [Policy Examples](https://github.com/Azure/containerization-assist/tree/main/policies.user.examples)
- [Migration Guide](./policy-migration-v3.md)

---

## Support

- GitHub Issues: [Report bugs](https://github.com/Azure/containerization-assist/issues)
- Discussions: [Ask questions](https://github.com/Azure/containerization-assist/discussions)
- Internal: See your organization's internal wiki for organization-specific policy guidance, if available