---
name: artifact-collector
description: Use this skill when users need to collect, manage, or analyze forensic artifacts such as files, memory dumps, Windows Event Logs, Mac Unified Logs, or network packet captures (PCAP) from endpoints.
---

# LimaCharlie Artifact Collector

This skill provides comprehensive guidance for collecting and managing forensic artifacts from endpoints using LimaCharlie. Use this skill when users need to gather evidence, collect files, capture memory, stream logs, or perform forensic investigations.

## Quick Navigation

- **SKILL.md** (this file): Overview, quick start, common workflows
- **[REFERENCE.md](REFERENCE.md)**: Complete command syntax and parameters
- **[EXAMPLES.md](EXAMPLES.md)**: 10 detailed investigation scenarios
- **[TROUBLESHOOTING.md](TROUBLESHOOTING.md)**: Common issues and solutions

---

## Artifact Collection Overview

### What are Artifacts?

Artifacts are pieces of forensic evidence collected from endpoints during security investigations or incident response. In LimaCharlie, artifacts can include:

- Files from disk
- Memory dumps (full system or process-specific)
- Windows Event Log (WEL) data
- Mac Unified Log (MUL) data
- Network packet captures (PCAP)
- File system metadata (MFT)

### Why Collect Artifacts?

Artifact collection is critical for:

- **Incident Response**: Gathering evidence during security incidents
- **Forensic Analysis**: Conducting detailed investigations
- **Threat Hunting**: Searching for indicators of compromise
- **Compliance**: Meeting regulatory evidence preservation requirements
- **Malware Analysis**: Collecting suspicious files and memory for analysis

### Prerequisites

To use artifact collection features, you must:

1. Enable the **Artifact Extension** in your organization
2. Enable the **Reliable Tasking Extension** (required dependency)
3. Configure artifact collection rules (optional, for automated collection)

**Note on Billing**: While the Artifact extension is free to enable, ingested artifacts incur charges. Refer to LimaCharlie pricing for artifact ingestion and retention costs.

---

## Artifact Types Overview

### 1. Files

Collect files from endpoints for analysis or preservation.

**Use Cases**: Retrieve suspicious executables, collect log files, gather configuration files

**Collection Pattern**: Use file paths with wildcards or exact paths

- `C:\Users\*\Downloads\*.exe`
- `/var/log/auth.log`

### 2. Memory Dumps

Capture volatile memory for forensic analysis.

**Types**: Full Memory Dumps, Process Memory, Memory Strings

**Use Cases**: Detect in-memory malware, analyze running processes, extract encryption keys

### 3. Windows Event Logs (WEL)

Stream or collect Windows Event Log data.

**Collection Modes**:

- **Real-time streaming**: Use `wel://` pattern (included in sensor flat rate)
- **File collection**: Collect `.evtx` files (incurs artifact costs)

**Common Logs**: `wel://Security:*`, `wel://System:*`, `wel://Application:*`

### 4. Mac Unified Logs (MUL)

Stream or collect macOS Unified Logging data.

**Collection Pattern**: Use `mul://` prefix for real-time streaming

### 5. Network Packet Captures (PCAP)

Capture network traffic for analysis (Linux only).

**Use Cases**: Network forensics, protocol analysis, data exfiltration detection

### 6. File System Metadata (MFT)

Collect Master File Table data from Windows systems.

**Use Cases**: Timeline analysis, file system forensics, deleted file recovery

---

## Working with Timestamps

**IMPORTANT**: When users provide relative time offsets (e.g., "last hour", "past 24 hours", "last week"), you MUST dynamically compute the current epoch timestamp based on the actual current time. Never use hardcoded or placeholder timestamps.

### Computing Current Epoch

```python
import time

# Compute current time dynamically
current_epoch_seconds = int(time.time())
current_epoch_milliseconds = int(time.time() * 1000)
```

**The granularity (seconds vs milliseconds) depends on the specific API or MCP tool**. Always check the tool signature or API documentation to determine which unit to use.

### Common Relative Time Calculations

**Example: "List artifacts from the last 24 hours"**

```python
import time

end_time = int(time.time())    # Current time
start_time = end_time - 86400  # 24 hours ago
```

**Common offsets (in seconds)**:

- 1 hour = 3600
- 24 hours = 86400
- 7 days = 604800
- 30 days = 2592000

**For millisecond-based APIs, multiply by 1000**.

### Critical Rules

**NEVER**:

- Use hardcoded timestamps
- Use placeholder values like `1234567890`
- Assume a specific current time

**ALWAYS**:

- Compute dynamically using `time.time()`
- Check the API/tool signature for correct granularity
- Verify the time range is valid (start < end)

---

## Quick Start

### Most Common Collection: Suspicious File

```bash
# 1. Hash the file first (verify without downloading)
file_hash --path C:\Users\alice\Downloads\suspicious.exe

# 2. Collect the file
artifact_get C:\Users\alice\Downloads\suspicious.exe

# 3. Get file metadata
file_info --path C:\Users\alice\Downloads\suspicious.exe
```

### Quick Memory Investigation

```bash
# 1. List processes
os_processes

# 2. Get memory map for suspicious process
mem_map --pid 1234

# 3. Extract strings from memory
mem_strings --pid 1234

# 4. Search for specific indicator
mem_find_string --pid 1234 --string "malicious-domain.com"
```

### Quick Log Collection

```bash
# Windows Event Logs
log_get Security
log_get System

# Or collect the .evtx files
artifact_get C:\Windows\System32\winevt\Logs\Security.evtx
```

---

## Common Workflows

### Workflow 1: Malware File Investigation

**When to use**: Suspicious file detected on endpoint

**Steps**:

1. **Hash first** to verify without collecting:
   ```bash
   file_hash --path C:\path\to\suspicious.exe
   ```

2. **Get file details**:
   ```bash
   file_info --path C:\path\to\suspicious.exe
   ```

3. **Collect the file**:
   ```bash
   artifact_get C:\path\to\suspicious.exe
   ```

4. **Check surrounding context**:
   ```bash
   dir_list --path C:\path\to
   ```

### Workflow 2: Process Memory Analysis

**When to use**: Investigating suspicious running process

**Steps**:

1. **Identify the process**:
   ```bash
   os_processes
   ```

2. **Map the memory**:
   ```bash
   mem_map --pid <pid>
   ```

3. **Extract strings**:
   ```bash
   mem_strings --pid <pid>
   ```

4. **Search for IOCs**:
   ```bash
   mem_find_string --pid <pid> --string "suspicious-indicator"
   ```

5. **Full memory dump** (if needed via Dumper extension):
   ```yaml
   extension request: {target: "memory", sid: "sensor-id", retention: 7}
   ```

### Workflow 3: Authentication Investigation

**When to use**: Investigating suspicious login activity

**Steps**:

1. **Collect Security logs**:
   ```bash
   log_get Security
   ```

2. **Check current users**:
   ```bash
   os_users
   ```

3. **Check network connections**:
   ```bash
   netstat
   ```

4. **Get current processes**:
   ```bash
   os_processes
   ```

### Workflow 4: Automated Collection on Detection

**When to use**: Set up proactive evidence collection

**Example D&R Rule**:

```yaml
detect:
  event: NEW_DOCUMENT
  op: and
  rules:
    - op: matches
      path: event/FILE_PATH
      re: .*\.(exe|dll|scr)$
      case sensitive: false
    - op: contains
      path: event/FILE_PATH
      value: \Downloads\
respond:
  - action: report
    name: suspicious-file-written
  - action: task
    command: artifact_get {{ .event.FILE_PATH }}
    investigation: auto-collection
    suppression:
      max_count: 1
      period: 1h
      is_global: false
      keys:
        - '{{ .event.FILE_PATH }}'
```

### Workflow 5: Comprehensive Incident Response

**When to use**: Active security incident requiring full investigation

**Priority Order**:

1. **Volatile data first** (disappears when system powers off):
   ```bash
   os_processes
   netstat
   mem_map --pid <pid>
   mem_strings --pid <pid>
   ```

2. **Critical files**:
   ```bash
   artifact_get <path_1>
   artifact_get <path_2>
   ```

3. **System artifacts**:
   ```bash
   log_get Security
   log_get System
   history_dump
   ```

4. **Full dumps** (via Dumper extension):
   - Memory dump
   - MFT dump

---

## Reliable Tasking Overview

### What is Reliable Tasking?

Reliable Tasking allows you to queue artifact collection commands for sensors that are currently offline. Tasks are automatically delivered when the sensor comes online.

### When to Use Reliable Tasking

- Sensors with intermittent connectivity
- Collecting from remote/mobile devices
- Ensuring collection happens on next check-in
- Large-scale deployments

### Creating a Reliable Task

**Via REST API**:

```bash
# Double quotes let the shell expand $JWT and $YOUR_OID;
# curl joins the separate --data fields with '&'.
curl --location 'https://api.limacharlie.io/v1/extension/request/ext-reliable-tasking' \
--header "Authorization: Bearer $JWT" \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data "oid=$YOUR_OID" \
--data 'action=task' \
--data-urlencode 'data={"context":"incident-response","selector":"tag==offline-hosts","task":"artifact_get C:\\Windows\\System32\\malware.exe","ttl":86400}'
```

**Key Parameters**:

- `context`: Identifier for grouping related tasks
- `selector`: Target criteria (sensor ID, tag, platform)
- `task`: The command to execute
- `ttl`: Time-to-live in seconds (default: 1 week)

**Targeting Options**:

- `sid`: Specific Sensor ID
- `tag`: All sensors with a specific tag
- `plat`: All sensors of a platform (windows, linux, macos)
- `selector`: Advanced selector expression

### Tracking Responses

Use the `context` parameter to track responses via D&R rules:

```yaml
detect:
  op: contains
  event: RECEIPT
  path: routing/investigation_id
  value: incident-response
respond:
  - action: report
    name: collection-completed
  - action: output
    name: artifact-responses
```

For complete Reliable Tasking details, see [REFERENCE.md](REFERENCE.md#reliable-tasking).

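
The request body from "Creating a Reliable Task" can also be assembled in code, which avoids hand-escaping the JSON inside `data`. The following is a minimal sketch using only the Python standard library. `build_reliable_task_body` is a hypothetical helper name, the field names are copied from the curl example, and sending the request with a valid JWT is deliberately left out.

```python
import json
from urllib.parse import urlencode


def build_reliable_task_body(oid, context, selector, task, ttl_seconds):
    """Return a form-encoded body for a Reliable Tasking 'task' request.

    Hypothetical helper: field names mirror the curl example in this
    document; it is not part of any LimaCharlie SDK.
    """
    data = {
        "context": context,    # groups related tasks
        "selector": selector,  # which sensors to target
        "task": task,          # sensor command to run
        "ttl": ttl_seconds,    # time-to-live in seconds
    }
    # json.dumps escapes backslashes; urlencode percent-encodes the result.
    return urlencode({"oid": oid, "action": "task", "data": json.dumps(data)})


body = build_reliable_task_body(
    oid="YOUR_OID",  # placeholder organization ID
    context="incident-response",
    selector="tag==offline-hosts",
    task=r"artifact_get C:\Windows\System32\malware.exe",
    ttl_seconds=86400,
)
print(body)
```

The resulting string is suitable as the body of a POST with `Content-Type: application/x-www-form-urlencoded`, matching the curl invocation.
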

---

## Storage and Access

### Where Artifacts Are Stored

Collected artifacts are stored in LimaCharlie's artifact storage with:

- Configurable retention periods (default: 30 days)
- Secure, encrypted storage
- Access controls based on organization permissions
- Unique artifact identifiers

### Accessing Collected Artifacts

**Via Web UI**:

1. Navigate to **Sensors > Artifact Collection**
2. View collected artifacts list
3. Click on artifact to view details
4. Download artifact for analysis

**Via REST API**:

```bash
# List artifacts
GET https://api.limacharlie.io/v1/orgs/{oid}/artifacts

# Download specific artifact
GET https://api.limacharlie.io/v1/orgs/{oid}/artifacts/{artifact_id}
```

### Cost Optimization

**Reduce Costs**:

- Use `wel://` for logs instead of `.evtx` files
- Set minimal necessary retention
- Implement collection suppression
- Filter events before collection
- Use file hashes to avoid duplicate collection

**Monitor Usage**:

- Track artifact ingestion volumes
- Review billing regularly
- Set usage alerts (via Usage Alerts extension)

---

## Quick Command Reference

For complete command syntax and parameters, see [REFERENCE.md](REFERENCE.md).

### File Commands

- `artifact_get <path>` - Collect file to artifact storage
- `file_get --path <path>` - Get file content in response
- `file_hash --path <path>` - Calculate file hash
- `file_info --path <path>` - Get file metadata
- `dir_list --path <path>` - List directory contents

### Memory Commands

- `mem_read --pid <pid> --base <base> --size <size>` - Read process memory
- `mem_map --pid <pid>` - Get process memory map
- `mem_strings --pid <pid>` - Extract strings from memory
- `mem_find_string --pid <pid> --string <string>` - Search memory for string

### Log Commands

- `log_get <log_name>` - Get Windows Event Log (Windows only)
- `history_dump` - Dump sensor's cached events

### Analysis Commands

- `os_processes` - List running processes
- `netstat` - Show network connections
- `os_users` - List user accounts
- `os_services` - List system services
- `os_autoruns` - List autostart programs

---

## Best Practices

1. **Collect Volatile Data First**: Memory, processes, and network connections disappear when systems power off
2. **Use Suppression**: Prevent resource exhaustion with `max_count: 1` and `period: 1h` in automated rules
3. **Hash Before Collecting**: Use `file_hash` to verify files without downloading
4. **Use Investigation IDs**: Track related artifacts with the `investigation` parameter
5. **Set Appropriate Retention**: Balance compliance needs (7-90 days) with costs
6. **Monitor Costs**: Track artifact volumes, set alerts, review rules regularly

See [TROUBLESHOOTING.md](TROUBLESHOOTING.md#cost-management) for detailed cost optimization strategies.

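
The "Hash Before Collecting" practice can be paired with a simple duplicate check so the same file content is not ingested twice. The sketch below assumes you keep your own record of hashes that were already collected and have already obtained `(path, hash)` pairs from `file_hash`. `plan_collection` is a hypothetical helper, the hash values are made up for illustration, and the real `file_hash` response format is not modeled.

```python
def plan_collection(candidates, already_collected):
    """Return artifact_get commands only for content not yet collected.

    candidates: iterable of (path, sha256_hex) pairs (illustrative input).
    already_collected: iterable of sha256_hex strings from earlier runs.
    """
    seen = {digest.lower() for digest in already_collected}
    commands = []
    for path, sha256_hex in candidates:
        digest = sha256_hex.lower()
        if digest in seen:
            continue  # same content already stored or already queued
        seen.add(digest)
        commands.append(f"artifact_get {path}")
    return commands


# Made-up hashes purely for illustration.
HASH_A, HASH_B, HASH_C = "a" * 64, "b" * 64, "c" * 64

commands = plan_collection(
    candidates=[
        (r"C:\Users\alice\Downloads\a.exe", HASH_A),
        (r"C:\Users\bob\Downloads\copy-of-a.exe", HASH_A),  # duplicate content
        ("/tmp/b.bin", HASH_B),
        ("/tmp/old.bin", HASH_C),  # collected in an earlier run
    ],
    already_collected=[HASH_C.upper()],
)
for command in commands:
    print(command)
```

Only the first copy of each unseen hash yields an `artifact_get` command, which keeps ingestion volume (and cost) down.
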

---

## Next Steps

### For Complete Command Reference

See [REFERENCE.md](REFERENCE.md) for:

- Complete command syntax
- All parameters and options
- Platform compatibility
- Response event types
- Dumper extension details

### For Investigation Examples

See [EXAMPLES.md](EXAMPLES.md) for 10 detailed scenarios:

- Malware incident response
- Ransomware response
- Data exfiltration detection
- Lateral movement detection
- Linux server compromise
- Memory-only malware
- Compliance evidence collection
- And more...

### For Troubleshooting

See [TROUBLESHOOTING.md](TROUBLESHOOTING.md) for:

- Collection failures
- Permission issues
- Storage problems
- Cost management
- Performance optimization

---

## Summary

Artifact collection is critical for incident response, forensics, threat hunting, and compliance.

**Key Takeaways**:

1. Enable Artifact and Reliable Tasking extensions
2. Collect volatile data first (memory, processes, network)
3. Use appropriate collection method (manual, automated, reliable)
4. Implement suppression to prevent resource exhaustion
5. Monitor costs and retention
6. Follow evidence preservation best practices
7. Test collection rules before production deployment

**Remember**:

- Artifacts incur storage costs
- Use templating in D&R rules for dynamic collection
- Leverage Reliable Tasking for offline sensors
- Preserve chain of custody
- Collect only what's needed

For more information, refer to:

- LimaCharlie Artifact Extension documentation
- Reliable Tasking Extension documentation
- Endpoint Agent Commands reference
- Detection & Response rules guide