---
name: performing-log-source-onboarding-in-siem
description: Perform structured log source onboarding into SIEM platforms by configuring collectors, parsers, normalization,
  and validation for complete security visibility.
domain: cybersecurity
subdomain: soc-operations
tags:
- siem
- log-onboarding
- log-management
- data-ingestion
- parsing
- normalization
- soc
version: '1.0'
author: mahipal
license: Apache-2.0
nist_csf:
- DE.CM-01
- DE.AE-02
- RS.MA-01
- DE.AE-06
---

# Performing Log Source Onboarding in SIEM

## Overview

Log source onboarding is the systematic process of integrating new data sources into a SIEM platform to enable security monitoring and detection. Proper onboarding requires planning data sources, configuring collection agents, building parsers, normalizing fields to a common schema, and validating data quality. According to the UK NCSC, onboarding should prioritize log sources that provide the highest security value relative to their ingestion cost.


## When to Use

- When adding a new system, application, or cloud service whose logs the SOC needs for monitoring and detection
- When an incident response or post-incident review reveals a visibility gap in current log coverage
- When a scheduled audit or detection coverage review finds missing or poorly parsed log sources
- When validating that a security control actually delivers usable telemetry to the SIEM

## Prerequisites

- SIEM platform deployed (Splunk, Elastic, Sentinel, QRadar, or similar)
- Network access from source systems to SIEM collectors
- Administrative access on source systems for agent installation
- Common Information Model (CIM) or equivalent schema documentation
- Change management approval for production system modifications

## Log Source Priority Framework

### Tier 1 - Critical (Onboard First)

| Source | Log Type | Security Value |
|---|---|---|
| Active Directory | Security Event Logs | Authentication, privilege escalation |
| Firewalls | Traffic logs | Network access, C2 detection |
| EDR/AV | Endpoint alerts | Malware, process execution |
| VPN/Remote Access | Connection logs | Unauthorized access |
| DNS Servers | Query logs | C2 beaconing, data exfiltration |
| Email Gateway | Email security logs | Phishing, BEC |

### Tier 2 - High Priority

| Source | Log Type | Security Value |
|---|---|---|
| Web Proxy | HTTP/HTTPS logs | Web-based attacks, data exfiltration |
| Cloud platforms (AWS/Azure/GCP) | Audit logs | Cloud security posture |
| Database servers | Audit/query logs | Data access, SQL injection |
| DHCP/IPAM | Address allocation | Asset tracking |
| File servers | Access logs | Data access monitoring |

### Tier 3 - Standard

| Source | Log Type | Security Value |
|---|---|---|
| Application servers | App logs | Application-level attacks |
| Print servers | Print logs | Data loss prevention |
| Badge/physical access | Access logs | Physical security correlation |
| Network devices (switches/routers) | Syslog | Network anomalies |

## Onboarding Process

### Step 1: Discovery and Assessment

```
1. Identify the log source:
   - System type and version
   - Log format (syslog, CEF, JSON, Windows Events, etc.)
   - Log volume estimate (EPS - events per second)
   - Network location and firewall requirements

2. Assess security value:
   - What threats can this source help detect?
   - Which MITRE ATT&CK techniques does it cover?
   - Is there an existing SIEM parser?

3. Estimate ingestion cost:
   - Daily volume in GB
   - License impact (per-GB or per-EPS pricing)
   - Storage retention requirements
```
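
The volume and cost estimate in item 3 is simple arithmetic once EPS and an average event size are known. The following is a minimal sketch, assuming a hypothetical 600-byte average event and decimal gigabytes (10^9 bytes); the function names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def daily_volume_gb(eps: float, avg_event_bytes: float) -> float:
    """Estimate raw daily ingestion in GB (10^9 bytes) from events per second."""
    return eps * avg_event_bytes * SECONDS_PER_DAY / 1e9


def retention_storage_gb(daily_gb: float, retention_days: int,
                         on_disk_ratio: float = 1.0) -> float:
    """Estimate storage for the retention window.

    on_disk_ratio is an assumption: on-disk size divided by raw size
    (below 1.0 if the platform compresses, above 1.0 if indexes add overhead).
    """
    return daily_gb * retention_days * on_disk_ratio


# Example: 500 EPS at an assumed 600 bytes per event, kept for 90 days.
per_day = daily_volume_gb(500, 600)
print(f"{per_day:.2f} GB/day, {retention_storage_gb(per_day, 90):.1f} GB retained")
```

Measure the real average event size from a sample of the source's logs before relying on the estimate, since it varies widely between formats.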

### Step 2: Configure Log Collection

#### Syslog-Based Collection (Firewalls, Network Devices)

```conf
# rsyslog configuration for receiving syslog
# /etc/rsyslog.d/10-siem-collection.conf

# UDP reception
module(load="imudp")
input(type="imudp" port="514" ruleset="siem_forwarding")

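# TCP reception (the imtcp module can only be loaded once)
module(load="imtcp")
input(type="imtcp" port="514" ruleset="siem_forwarding")

# TLS reception: per-input stream driver settings need a recent rsyslog
# (8.2106 or later); on older versions set them on the single
# module(load="imtcp" ...) line, which applies TLS to every imtcp listener.
# CA and certificate files must also be set via global(), and x509/name
# authentication requires PermittedPeer (not shown).
input(type="imtcp" port="6514" ruleset="siem_forwarding"
      StreamDriver.Name="gtls" StreamDriver.Mode="1"
      StreamDriver.AuthMode="x509/name")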

ruleset(name="siem_forwarding") {
    # Forward to SIEM
    action(type="omfwd" target="siem.company.com" port="9514"
           protocol="tcp" queue.type="LinkedList"
           queue.filename="siem_fwd" queue.maxdiskspace="1g"
           queue.saveonshutdown="on" action.resumeRetryCount="-1")
}
```
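
To confirm the collector is listening before pointing production devices at it, a test message can be sent by hand. The sketch below builds an RFC 5424-style line and sends it over plain TCP with newline framing; the hostname `onboarding-test`, the app name `siem-check`, and the collector address in the usage comment are placeholders, and the TLS listener on 6514 would additionally need an `ssl`-wrapped socket.

```python
import socket
from datetime import datetime, timezone


def build_syslog_message(msg, hostname="onboarding-test", app="siem-check",
                         facility=16, severity=6, now=None):
    """Format an RFC 5424-style message (facility 16 = local0, severity 6 = info)."""
    pri = facility * 8 + severity
    timestamp = (now or datetime.now(timezone.utc)).isoformat(timespec="seconds")
    # PROCID, MSGID and STRUCTURED-DATA are left as the nil value "-".
    return f"<{pri}>1 {timestamp} {hostname} {app} - - - {msg}"


def send_tcp(message, host, port=514, timeout=5.0):
    """Send one newline-terminated message to a TCP syslog listener."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(message.encode("utf-8") + b"\n")


# Usage (collector.example.com is a placeholder for your rsyslog host):
# send_tcp(build_syslog_message("onboarding connectivity test"), "collector.example.com")
```

Searching the SIEM for the test string afterwards confirms the whole path, including the forwarding ruleset.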

#### Windows Event Log Collection (Splunk Universal Forwarder)

```conf
# inputs.conf on Splunk Universal Forwarder
[WinEventLog://Security]
disabled = 0
index = wineventlog
sourcetype = WinEventLog:Security
evt_resolve_ad_obj = 1
checkpointInterval = 5

[WinEventLog://System]
disabled = 0
index = wineventlog
sourcetype = WinEventLog:System

[WinEventLog://Microsoft-Windows-Sysmon/Operational]
disabled = 0
index = wineventlog
sourcetype = XmlWinEventLog:Microsoft-Windows-Sysmon/Operational
renderXml = true

[WinEventLog://Microsoft-Windows-PowerShell/Operational]
disabled = 0
index = wineventlog
sourcetype = XmlWinEventLog:Microsoft-Windows-PowerShell/Operational
renderXml = true
```
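
Before deploying an `inputs.conf`, it can help to check that every stanza sets an explicit `index` and `sourcetype`, since events otherwise land in the default index. This is a rough sketch that reads simple stanzas with Python's `configparser`; it is not a full Splunk `.conf` parser (no line continuations or `[default]` inheritance), and the `WinEventLog://Application` stanza is a made-up example of an incomplete entry.

```python
import configparser

SAMPLE = """
[WinEventLog://Security]
disabled = 0
index = wineventlog
sourcetype = WinEventLog:Security

[WinEventLog://Application]
disabled = 0
"""


def find_incomplete_stanzas(conf_text, required=("index", "sourcetype")):
    """Return {stanza: [missing keys]} for stanzas lacking required settings."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None,
                                       strict=False)
    parser.optionxform = str  # keep key case, e.g. renderXml
    parser.read_string(conf_text)
    problems = {}
    for stanza in parser.sections():
        missing = [key for key in required if key not in parser[stanza]]
        if missing:
            problems[stanza] = missing
    return problems


print(find_incomplete_stanzas(SAMPLE))
```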

#### Cloud Log Collection (AWS CloudTrail)

```json
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Resources": {
    "CloudTrailToSIEM": {
      "Type": "AWS::CloudTrail::Trail",
      "Properties": {
        "TrailName": "siem-cloudtrail",
        "S3BucketName": "company-cloudtrail-logs",
        "IsLogging": true,
        "IsMultiRegionTrail": true,
        "IncludeGlobalServiceEvents": true,
        "EnableLogFileValidation": true,
        "EventSelectors": [
          {
            "ReadWriteType": "All",
            "IncludeManagementEvents": true,
            "DataResources": [
              {
                "Type": "AWS::S3::Object",
                "Values": ["arn:aws:s3"]
              }
            ]
          }
        ]
      }
    }
  }
}
```
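
CloudTrail delivers log files to the bucket as JSON documents with a top-level `Records` array. The sketch below flattens a few commonly used record fields for a quick look at what the trail is producing; the output key names (`service`, `action_name`, and so on) are illustrative rather than a SIEM schema, and real log files are gzip-compressed, so they need decompressing first.

```python
import json

# A trimmed, hypothetical log file body with one record.
SAMPLE_LOG_FILE = json.dumps({
    "Records": [{
        "eventTime": "2024-05-01T12:00:00Z",
        "eventSource": "s3.amazonaws.com",
        "eventName": "GetObject",
        "awsRegion": "us-east-1",
        "sourceIPAddress": "203.0.113.10",
        "userIdentity": {"type": "IAMUser",
                         "arn": "arn:aws:iam::123456789012:user/alice"},
    }]
})


def flatten_cloudtrail(log_file_text):
    """Extract a few key fields from each CloudTrail record."""
    rows = []
    for record in json.loads(log_file_text).get("Records", []):
        rows.append({
            "time": record.get("eventTime"),
            "service": record.get("eventSource"),
            "action_name": record.get("eventName"),
            "region": record.get("awsRegion"),
            "src_ip": record.get("sourceIPAddress"),
            "user": record.get("userIdentity", {}).get("arn"),
            "error": record.get("errorCode"),  # absent on successful calls
        })
    return rows


for row in flatten_cloudtrail(SAMPLE_LOG_FILE):
    print(row)
```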

### Step 3: Parse and Normalize

#### Custom Parser Example (Splunk props.conf/transforms.conf)

```conf
# props.conf
[custom:firewall:logs]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%dT%H:%M:%S%z
MAX_TIMESTAMP_LOOKAHEAD = 30
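# REPORT- applies the transform at search time; TRANSFORMS- runs at index time
# and would need WRITE_META = true for these fields to be created
REPORT-firewall = firewall_extract_fields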
FIELDALIAS-src = src_addr AS src_ip
FIELDALIAS-dst = dst_addr AS dest_ip dst_port AS dest_port
EVAL-action = case(fw_action=="allow", "allowed", fw_action=="deny", "blocked", true(), "unknown")
EVAL-vendor_product = "Custom Firewall"
LOOKUP-geo = geo_ip_lookup ip AS dest_ip OUTPUT country city latitude longitude

# transforms.conf
[firewall_extract_fields]
REGEX = ^(\S+)\s+(\S+)\s+action=(\w+)\s+src=(\S+):(\d+)\s+dst=(\S+):(\d+)\s+proto=(\w+)\s+bytes=(\d+)
FORMAT = timestamp::$1 hostname::$2 fw_action::$3 src_addr::$4 src_port::$5 dst_addr::$6 dst_port::$7 protocol::$8 bytes::$9
```
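
A parser regex is easier to debug outside the SIEM first. The sketch below applies the same pattern as `firewall_extract_fields` to a sample line and checks that the timestamp matches `TIME_FORMAT`; the sample log line is hypothetical, and Python's `re` is close to but not identical to the PCRE engine Splunk uses.

```python
import re
from datetime import datetime

# Same pattern and field order as the transforms.conf stanza above.
FIREWALL_RE = re.compile(
    r"^(\S+)\s+(\S+)\s+action=(\w+)\s+src=(\S+):(\d+)\s+dst=(\S+):(\d+)"
    r"\s+proto=(\w+)\s+bytes=(\d+)"
)
FIELDS = ("timestamp", "hostname", "fw_action", "src_addr", "src_port",
          "dst_addr", "dst_port", "protocol", "bytes")

# Hypothetical sample event in the format the regex expects.
SAMPLE = ("2024-05-01T12:00:00+0000 fw01 action=allow src=10.0.0.5:51514 "
          "dst=198.51.100.7:443 proto=tcp bytes=1420")


def extract(line):
    """Return a dict of extracted fields, or None if the line does not match."""
    match = FIREWALL_RE.match(line)
    return dict(zip(FIELDS, match.groups())) if match else None


def parse_timestamp(value):
    """Parse the timestamp using the same format string as TIME_FORMAT."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S%z")


fields = extract(SAMPLE)
print(fields)
print(parse_timestamp(fields["timestamp"]))
```

Running a few hundred real lines through `extract` and counting the `None` results gives an early estimate of parser coverage.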

#### CIM Field Mapping

| Raw Field | CIM Field | Data Model |
|---|---|---|
| src_addr | src_ip | Network_Traffic |
| dst_addr | dest_ip | Network_Traffic |
| dst_port | dest_port | Network_Traffic |
| fw_action | action | Network_Traffic |
| bytes_sent + bytes_recv | bytes | Network_Traffic |
| user_name | user | Authentication |
| login_result | action | Authentication |
| process_path | process | Endpoint |

### Step 4: Validate Data Quality

```spl
# Verify events are arriving
index=new_source earliest=-1h
| stats count by sourcetype, host, source

# Check field extraction quality
index=new_source earliest=-1h
| stats count(src_ip) as has_src count(dest_ip) as has_dest count(action) as has_action count by sourcetype
| eval src_coverage=round(has_src/count*100,1)
| eval dest_coverage=round(has_dest/count*100,1)
| eval action_coverage=round(has_action/count*100,1)

# Verify CIM compliance (the datamodel command needs the dataset name, here All_Traffic)
| datamodel Network_Traffic All_Traffic search
| search sourcetype=new_sourcetype
| stats count by source, sourcetype

# Check for timestamp parsing issues
index=new_source earliest=-1h
| eval time_diff=abs(_time - _indextime)
| stats avg(time_diff) as avg_lag max(time_diff) as max_lag by host
| where avg_lag > 300
```
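
The coverage and lag checks above can also be run against an exported sample of events, for example during parser development before data reaches the index. The sketch below mirrors the SPL logic; the event dicts are hypothetical, and the 300-second threshold is the same one used in the `where avg_lag > 300` clause.

```python
def field_coverage(events, fields=("src_ip", "dest_ip", "action")):
    """Percentage of events in which each field is present and non-empty."""
    total = len(events)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: round(
            sum(1 for e in events if e.get(field) not in (None, "")) / total * 100, 1
        )
        for field in fields
    }


def hosts_with_lag(events, threshold=300):
    """Average |_time - _indextime| per host, keeping hosts above the threshold."""
    lags = {}
    for e in events:
        lags.setdefault(e["host"], []).append(abs(e["_time"] - e["_indextime"]))
    averages = {host: sum(v) / len(v) for host, v in lags.items()}
    return {host: avg for host, avg in averages.items() if avg > threshold}


# Hypothetical sample: fw02 is an hour off, which often indicates a timezone issue.
events = [
    {"host": "fw01", "_time": 1000, "_indextime": 1005,
     "src_ip": "10.0.0.5", "dest_ip": "198.51.100.7", "action": "allowed"},
    {"host": "fw01", "_time": 1000, "_indextime": 1015,
     "src_ip": "10.0.0.6", "dest_ip": "198.51.100.7", "action": "blocked"},
    {"host": "fw02", "_time": 1000, "_indextime": 4600,
     "src_ip": "10.0.0.7", "dest_ip": None, "action": "allowed"},
    {"host": "fw02", "_time": 1000, "_indextime": 4600,
     "dest_ip": "198.51.100.9", "action": ""},
]

print(field_coverage(events))
print(hosts_with_lag(events))
```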

### Step 5: Enable Detection Coverage

```spl
# Verify the new source populates the Authentication data model
# (tstats is a generating command, so it must be the first command in the search)
| tstats count from datamodel=Authentication where sourcetype=new_sourcetype by _time span=1h
| timechart span=1h sum(count) as count

# Create source-specific detection rule (savedsearches.conf stanza)
[New Source - Authentication Anomaly]
search = index=new_source sourcetype=new_sourcetype action=failure \
| stats count by src_ip, user \
| where count > 10
```
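
The threshold logic of the detection rule can be prototyped against sample events to sanity-check the threshold before enabling the saved search. This is a minimal sketch: the function name and the sample events are hypothetical, and the default threshold of 10 matches the `where count > 10` clause.

```python
from collections import Counter


def failed_auth_outliers(events, threshold=10):
    """Count failed logins per (src_ip, user) and keep pairs above the threshold."""
    counts = Counter(
        (e.get("src_ip"), e.get("user"))
        for e in events
        if e.get("action") == "failure"
    )
    return {pair: n for pair, n in counts.items() if n > threshold}


# Hypothetical sample: 11 failures for one pair, 3 for another, 1 success.
sample = (
    [{"src_ip": "203.0.113.10", "user": "alice", "action": "failure"}] * 11
    + [{"src_ip": "10.0.0.8", "user": "bob", "action": "failure"}] * 3
    + [{"src_ip": "10.0.0.8", "user": "bob", "action": "success"}]
)

print(failed_auth_outliers(sample))
```

Replaying a day of historical events through a function like this shows roughly how many alerts a given threshold would have generated.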

## Onboarding Checklist

- [ ] Log source assessed and approved
- [ ] Network connectivity verified
- [ ] Collection agent/method configured
- [ ] Log forwarding confirmed
- [ ] Parser/field extraction configured
- [ ] CIM compliance validated
- [ ] Data model acceleration enabled
- [ ] Volume within license budget
- [ ] Retention policy configured
- [ ] Detection rules enabled/created
- [ ] Dashboard updated
- [ ] Documentation completed
- [ ] SOC team notified

## References

- [UK NCSC - Onboarding Systems and Log Sources](https://www.ncsc.gov.uk/collection/building-a-security-operations-centre/onboarding-systems-and-log-sources)
- [Sumo Logic - Cloud SIEM Onboarding Checklist](https://help.sumologic.com/docs/cse/get-started-with-cloud-siem/onboarding-checklist-cse/)
- [Coralogix - SIEM Logging Best Practices](https://coralogix.com/guides/siem/siem-logging/)
- [Huntress - SIEM Implementation Guide](https://www.huntress.com/siem-guide/siem-implementation-guide)